May 26, 2011 - Python tricks Fun with Python’s newer tools¶
by Raymond Hettinger
You should be on Python 2.7!¶
- To be supported for 5-10 years
- Earlier you convert the better
collections.Counter¶
Tool for making rapid tallies or counts
Modeled after:
- Multisets in C++
- Bags in smalltalk and Objective C
Very flexible, unrestricted implementation as a directory
You can put anything in it:
- Count with positive and negative numbers
- Count with decimals, floats, or fractions
- it is just a dictionary
Simple design as a dict subclass!¶
With __missing__() that returns zero:
c[x] += 1 # easy to tally
Has __delitem__() to match __missing__():
del c[x] # easy to delete
Convienance methods¶
most_common(n)
- returns sorted list of the n highest counts
- reduces the code to a simple one-liner
- implemented using either sorted() or heapq.nlargest()
elements
- lists all the contents individually
- if an element has a count of three, it is emitted three times
- Differs from __iter__ which returns pairs
- Done this way because otherwise you change the dict API.
Multiset use case
- With multisets, the counts are always positive
- Math operations with omit zero or negative counts from the result
- Operations are: + - & |
- The subtraction operation is said to be saturating
- When the counts are all one, works just like regular sets
collections.namedtuple()¶
Works just like a regular tuple and lets you assign names to each field.
- Makes the code self-documenting
- Makes the printed __repr__ intelligible
- Let’s you change tuple order without affecting client code
- No extra memory cost over tuple
One of the best single best changes you can make to existing code.
Doing an enum in python (not needed but its kind of cool):
Color = namedtuple('Color', 'red orange yellow')._make(range(3))
Caching¶
Simple unbounded caches can grow without bound.
How to do it:
from functools import lru_cache
@lru_cache(maxsize=100)
def big_computation(*args):
...
Fibonacci example:
@lru_cache()
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n-1) + fibonacci(n-2)
print(fibonacci(5))
Grotesque Hacks¶
Fun things to do with Python
Not so bad¶
The iter() function has a two argument form that takes a sentinal value:
for block in iter(partial(f.read,20), "")
...
Awesome way to stop a loop!
Getting grotesque¶
Make dict sparse to speed up a dictionary:
d.update(dict(d)) # doubles space in hash table
Wrap it up in a function:
def sparser(d):
return d.update(dict(d)) # doubles space in hash table
Turn-off thread-switching (cheap locking):
ci = getcheckinterval(0)
sys.setcheckinterval(0) # switching off
value = max(tasklist)
tasklist.remove(value)
sys.setcheckinterval(ci) # switching on
Atomic actions (no locks required):
v = d.pop(key) # find and remove in one-step
d.setdefault(key, []).append(v) # one-step init
Speed-up builtin access:
from __builtin__ import *
Slow constant-function:
>>> def make_constant_function(x):
... return lambda x=x: x
>>> f = make_constant_function(3)
>>> f()
3
Fast constant functions:
>>> def make_constant_function(x):
... return itertools.repeat(x).next