July 20, 2012

Redis and Python: It’s PB and J time

Who am I?

  • Python user of 12 years

  • Former python-dev bike-shedder

  • Former maintainer of Python async sockets libraries

  • Author of a few small OS projects

    • rpqueue, parse-crontab, async_http, timezone-utels, PyPE
    • PyPE is his own text editor
  • Worked at some cool places we’ve never heard of:

    • Networks in Motion
    • Ad.ly
  • Worked at some cool places you have heard of:

    • Google
    • YouTube
  • And cool places will hear of: ChowNow

  • Very heavy user of Redis

  • Author of upcoming Redis in Action

What is Redis?

  • In-memory database/data structure server

    • Limited to main memory; vm and diskstore defunct
  • Persistence via snapshot or append-only file

  • Support for master/slave replication (multiple slaves and slave chaining supported)

    • No master-master, don’t even try. In other words, if you have two masters running on the same system, you’ll have serious, major, painful issues.
    • Client-side sharding
    • Cluster is in-progress
  • Supports data structures + publish.subscribe

    • Strings, Lists, Sets, Hashes, Sorted Sets (ZSETs)
  • Server-side scripting with Lua in Redis 2.6

Comparing Redis to other databases/caches

  • Memcached

    • in-memory, no-persistence, counters, strings, very fast, multi-threaded
  • Redis

    • in-memory, optionally persisted, data structures, very fast, server-side scripting, single-threaded
  • MongoDB

    • On-disk, speed inversely related to data integrity, bson, master/slave, sharding, multi-master, server-side map-reduce, database-level locking
  • Riak

    • on-disk, pluggable data stores, multi-master sharding, RESTful API, server side map-reduce
  • MySQL/PostgreSQL

    • On-disk/in-memory, pluggable data stores, master/slave, sharding, stored procedures, …

Redis Strings

Note

Apparent other types in Redis are really just strings. Hrm.

  • Really just scalars of a few different types

  • Character strings

    • concatenante values to the end
    • get/set individual bits
    • get/set byte ranges
  • Integers (platform long int)

    • increment/decrement
    • auto “casting”
  • Floats (IEEE 754 FP Double)

    • increment/decrement
    • auto “casting”

Redis Lists

  • Doubly-linked list of character strings

    • push/pop from both ends
    • blocking pop from multiple lists
    • blocking

Redis Sets

  • Unique unordered sequence of character strings
  • Backed by a hash table
  • add, remove. check membership, pop, random pop

Redis Hashes

  • kind of like Python dict
  • Key-value mapping inside a key
  • get/set/delete single/multiple
  • increment values by ints/floats

Sorted Sets (ZETS)

  • Like a hash, with ‘members’ and ‘scores’, which are limited to numeric values

Note

Maybe good for search generation?

Publish/Subscribe

  • Readers subscribe to “channels”, exact strings or patterns
  • Writers publish to channels, broadcasting to all subscribers
  • Messages are transient so they can be missed

Why Redis with Python?

  • Python has reasonable sane syntax/semantics

  • Python allows you to easy manipulation of data and data structures

  • Python has a large and growing community

  • Redis has reasonable sane syntax/semantics

  • Redis allows you to easy manipulation of data and data structures

  • Redis has a medium-sized and growing community

  • Available as remote server

    • Like a remote IPython, only for data
Doesn’t that sound like a good match?
from itertools import imap
import redis
def process_lines(prefix, logfile):
    conn = redis.Redis()
    for log in imap(parse_line, open(logfile, 'rb')):
        time = log.timestamp.isoformat()
        ...
        conn.zincrby(prefix + hour, )
        ...

Note

code examples coming too fast. :P

Note

TODO: Download the slides later and try to get the code examples out.

Cool stuff you can build with Redis + Python

  • Reddit
  • Caching
  • Cookies
  • Analytics
  • Configuration management
  • Autocomplete
  • Distributed Locks
  • Counting semaphores
  • Task queues
  • Publish/Subscribe
  • Messaging
  • Search engines
  • Ad targeting
  • Twitter
  • Cat rooms
  • Job searching

Note

Ahem. You can do all of this in a relational database. Or an XML filestore. Or anything.