June 6, 2011 Knowing Things
By Jeff Schenk
There was way too much heckling, and too many comments from one person who kept taking him off topic, so I asked Jeff not to take so many questions. That did not go over well with the hecklers, but asking about it on the mailing list uncovered a serious problem in the user group. Now things are better.
- Kinda knowing things is easy
- Really knowing with certainty a lot of complex things is maybe harder
- Knowing things with precision is really important
- Integers (how many people clicked)
- Bling (money)
Do you want to misplace money and get fired? No? Use Decimal:
    from decimal import Decimal

    moneys = Decimal('100.01')
Decimal has oddities in rounding, so use quantize:
    >>> moneys / 2
    Decimal('50.005')
    >>> # quantize takes a Decimal exponent, not a string
    >>> (moneys / 2).quantize(Decimal('0.01'))
    Decimal('50.00')
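The rounding oddity is easier to see with the rounding mode spelled out. This is a minimal sketch, not something shown in the talk: the `CENT` constant and the `to_cents` helper are hypothetical names. By default Decimal rounds halves to the nearest even digit (banker's rounding), which is why half of 100.01 comes out as 50.00 rather than 50.01.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Hypothetical constant: the exponent that means "two decimal places".
CENT = Decimal("0.01")


def to_cents(amount, rounding=ROUND_HALF_EVEN):
    """Round a Decimal amount to whole cents using an explicit rounding mode."""
    return amount.quantize(CENT, rounding=rounding)


half = Decimal("100.01") / 2              # exactly Decimal('50.005')
bankers = to_cents(half)                  # half-even: the tie goes to the even digit
half_up = to_cents(half, ROUND_HALF_UP)   # the rounding most people learned in school

print(half, bankers, half_up)
```

Whichever mode the business rules call for, passing it explicitly keeps the choice visible in the code instead of hidden in the context default.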
- Timezones suck
- Computers like integers
- So they use hours since epoch
- All the work is done in about 10 lines of Python code.
- Joins are death
- If you join, you will die
- Intelligent indexes are super
- If you’re going to group by it or filter on it, you probably want it indexed.
- When you’re working with a lot of data, you need to aggregate chunks as you go.
- My Guess: A lot of Celery tasks!
- Spooned into a single report table that breaks normalization
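The hours-since-epoch idea and the aggregate-as-you-go idea fit together in a few lines. The sketch below is an assumption about how it might look, not the code from the talk: `hours_since_epoch` and `aggregate_clicks` are hypothetical names, and a `Counter` stands in for the denormalized report table.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

SECONDS_PER_HOUR = 3600


def hours_since_epoch(moment):
    """Return the whole hours between the Unix epoch and an aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone first")
    # timestamp() is timezone-independent for aware datetimes.
    return int(moment.timestamp() // SECONDS_PER_HOUR)


def aggregate_clicks(click_times):
    """Count clicks per hour bucket; a stand-in for rows in a report table."""
    return Counter(hours_since_epoch(t) for t in click_times)


clicks = [
    datetime(2011, 6, 6, 12, 30, tzinfo=timezone.utc),
    # 08:45 at UTC-4 is the same hour as 12:45 UTC
    datetime(2011, 6, 6, 8, 45, tzinfo=timezone(timedelta(hours=-4))),
    datetime(2011, 6, 6, 13, 5, tzinfo=timezone.utc),
]
report = aggregate_clicks(clicks)
```

Because each bucket is a plain integer, it can be indexed, grouped, and filtered without any timezone logic; conversion back to local time only needs to happen when the report is displayed.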
They use itertools a lot!
- itertools.chain is your friend
- itertools.tee is also your friend
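A small illustration of both helpers, assuming a hypothetical `clicks_from` generator in place of a real lazy query. `chain` stitches several iterables into one stream, and `tee` splits one stream into independent iterators.

```python
import itertools


def clicks_from(source, count):
    """Hypothetical stand-in for a lazy query: yields (source, n) rows."""
    for n in range(count):
        yield (source, n)


# chain: treat several result sets as a single stream without building a list.
combined = itertools.chain(clicks_from("web", 2), clicks_from("mobile", 3))

# tee: get two independent iterators over the same stream.
# After calling tee, use only the copies, never the original iterator.
for_count, for_rows = itertools.tee(combined, 2)

total = sum(1 for _ in for_count)
rows = list(for_rows)
```

One caveat: `tee` buffers whatever one copy has consumed and the other has not, so consuming one copy fully before touching the other (as above) holds the whole stream in memory.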
Algorithms make merging iterables really powerful:
    import heapq

    # query1 and query2 must each already be sorted
    for result in heapq.merge(query1, query2):
        # results come out merged and in order
        print(result)
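A runnable version with made-up data shows the one condition that matters: `heapq.merge` only produces ordered output when every input is already sorted. The lists and dicts here are hypothetical stand-ins for query results, and the `key` argument requires Python 3.5 or later.

```python
import heapq

# Hypothetical stand-ins for two query results, each already sorted.
query1 = [(1, "a"), (4, "d"), (9, "i")]
query2 = [(2, "b"), (3, "c"), (10, "j")]

# merge is lazy: it pulls one item at a time from each input.
merged = list(heapq.merge(query1, query2))

# When rows are not directly comparable, merge on a key instead.
web = [{"ts": 1, "src": "web"}, {"ts": 5, "src": "web"}]
mobile = [{"ts": 2, "src": "mobile"}, {"ts": 3, "src": "mobile"}]
by_time = list(heapq.merge(web, mobile, key=lambda row: row["ts"]))
```

In a database setting this usually means each query carries its own ORDER BY on the merge column, so the merge step never has to sort anything itself.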
Caching is key!
- They need flexibility to slice and dice the data
- Once it's been sliced, they want to be able to view, page, and sort the data
- Redis gives the speed of cache with the power to sort and page
- They use redis-py as their library
- Test coverage?
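One way Redis can sort and page cached results is a sorted set, where each member carries a score and range queries return members in score order. The sketch below is an assumption, not the speaker's code: `cache_report` and `fetch_page` are hypothetical helpers, the `clicks` field is invented for illustration, and `client` is expected to be a redis-py client (version 3.0 or later, where `zadd` takes a mapping).

```python
import json


def cache_report(client, key, rows, ttl_seconds=300):
    """Store rows in a sorted set, scored by each row's 'clicks' value."""
    mapping = {json.dumps(row, sort_keys=True): row["clicks"] for row in rows}
    if mapping:
        client.zadd(key, mapping)
        client.expire(key, ttl_seconds)  # let stale slices fall out of the cache


def fetch_page(client, key, page, per_page=10):
    """Return one page of rows, highest clicks first (page is 1-based)."""
    start = (page - 1) * per_page
    stop = start + per_page - 1  # ZREVRANGE bounds are inclusive
    members = client.zrevrange(key, start, stop)
    return [json.loads(member) for member in members]


# Usage, assuming redis-py is installed and a server runs on localhost:
#   import redis
#   client = redis.Redis()
#   cache_report(client, "report:2011-06-06", rows)
#   first_page = fetch_page(client, "report:2011-06-06", page=1)
```

Taking the client as a parameter also speaks to the test coverage question: the helpers can be exercised against a small fake object without a running Redis server.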