Tuesday, July 20, 2010

Faster Python Programs through Optimization

Presenter: Mike Mueller (Python Academy)

Sample code and exercises available here. Slides have not been posted yet; I'll update when available.

Premature optimization is the root of all evil – C.A.R. Hoare

Knowing when optimization is premature defines the difference between the master engineer and the apprentice – via Twitter

Guidelines:

    • make sure your program is actually slow

    • don't optimize as you go

    • only realistic use cases should be considered

    • architecture can be essential to performance

    • check for bugs that slow the program down

    • find bottlenecks using profiling

    • always check the results of optimization with unit testing

The greatest speedup is usually achieved by optimization of algorithms

External causes

    • network

    • database

    • system calls

Measurement

    • Measuring in stones

      • test.pystone

      • a single core of my C2D laptop gets ~24k stones/second

    • Profiling CPU usage:

      • cProfile

      • pstats

    • Profiling memory usage:

      • Guppy_pe

Algorithms and Anti-patterns

  • String concatenation

    • use lists and ''.join(list) to create large strings

  • list generators

    • l = [x * x for x in range(10)]

    • sum(x * x for x in range(10)

    • g = (x * x for x in xrange(10))

      • g.next()

      • list(g)

  • data structures

    • sets vs lists

      • use sets instead of lists if performing membership tests

      • if order matters, create a set from your list

        • this has a cost, so weigh set_creation_cost vs list_search_cost * number_of_searches

      • use set.intersection to for intersection tests

    • deque vs lists

      • use deque (doubly-linked list) instead of lists for faster inserts

      • deque has a higher index access cost

    • dict vs defaultdict

      • defaultdict is faster for larger data sets

  • Python big-O

    • O(1)

      • len(list)

      • len(dict)

      • list[i]

      • del dict[i]

      • x in dict

      • x in set

      • list.append(i)

    • O(n)

      • loops on lists, strings, sets

      • string methods

      • x in list

    • O(n log n)

      • list.sort()

    • O(n^2)

      • nested loops

    • O(n^3)

      • nested nested loops

  • Caching

    • Reuse before you recalculate

    • memoization

    • deterministic caching

      • eg numerical results that are always the same for the given inputs

    • non-deterministic caching

      • expiring cache

      • eg caching of database queries

    • Memcached

  • JIT compilation

  • Numeric libraires

  • Parallel

    • Python 2.6 – import multiprocessing

    • Python <>

      • Using both cores, my laptop gets ~45k stones/second

No comments:

Post a Comment