Table of Contents

1 Using the cProfile Module

1.1 A simple decorator to measure the time spent in a function.

import time
from functools import wraps

def timefn(fn):
    @wraps(fn)
    def measure_time(*args, **kwargs):
        t1 = time.time()
        result = fn(*args, **kwargs)
        t2 = time.time()
        # fn.__name__ works on Python 2 and 3; fn.func_name is Python 2 only
        print("@timefn:" + fn.__name__ + " took " + str(t2 - t1) + " seconds")
        return result
    return measure_time

@timefn
def calculate_z_serial_purepython(maxiter, zs, cs):
    ...  # body omitted; the wall-clock time is printed on every call

1.2 Inside IPython:

%timeit calc_pure_python(desired_width=1000, max_iterations=300)
or, for a single run:
%time calc_pure_python(desired_width=1000, max_iterations=300)
(%%time is the cell-magic form and must sit alone on the first line of a cell.)

1.3 python -m cProfile -s cumulative julia1_nopil.py (sort the report by cumulative time)

1.4 python -m cProfile -o profile.stats julia1.py

In [1]: import pstats
In [2]: p = pstats.Stats("profile.stats")
In [3]: p.sort_stats("cumulative")
Out[3]: <pstats.Stats instance at 0x177dcf8>
In [4]: p.print_stats()
In [5]: p.print_callers()  # for each function, show who calls it
In [6]: p.print_callees()  # flipped around: for each function, show what it calls

1.5 Using line_profiler for line-by-line measurements

pip install line_profiler
kernprof -l -v julia1_lineprofiler.py
Add the @profile decorator to each function you want measured line by line.
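
A minimal sketch of the decorated file (the Julia-set inner loop used throughout these notes, abbreviated; @profile is injected by kernprof at run time, so the name is undefined under a plain python run):

# julia1_lineprofiler.py (sketch)
@profile
def calculate_z_serial_purepython(maxiter, zs, cs):
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z, c = zs[i], cs[i]
        while abs(z) < 2 and n < maxiter:
            z = z * z + c
            n += 1
        output[i] = n
    return output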

1.6 Using memory_profiler to Diagnose Memory Usage

1.6.1 Could we use less RAM by rewriting this function to work more efficiently?

1.6.2 Could we use more RAM and save CPU cycles by caching?
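
Usage mirrors line_profiler: decorate the functions of interest with @profile and run the script under memory_profiler (a minimal sketch; the script name is hypothetical):

pip install memory_profiler
python -m memory_profiler julia1_memoryprofiler.py

The per-line memory deltas in the report help answer both questions above.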

1.7 Unit testing during optimization to maintain correctness

# ex.py
import unittest

# When not running under line_profiler/memory_profiler, @profile is
# undefined; fall back to a no-op decorator so the tests still pass.
if 'profile' not in dir():
    def profile(func):
        return func

@profile
def some_fn(nbr):
    return nbr * 2

class TestCase(unittest.TestCase):
    def test(self):
        result = some_fn(2)
        self.assertEqual(result, 4)  # assertEquals is deprecated

$ nosetests ex.py

1.8 use snakeviz (Python 3) to visualize profile output in the browser

pip install snakeviz

snakeviz profile.stats

2 list and tuple

2.1 characteristics

These differences outline the philosophical difference between the two: tuples are for describing multiple properties of one unchanging thing, while lists can store collections of data about completely disparate objects. Even if we create a list without append (and thus without the extra headroom an append operation introduces), it will still be larger in memory than a tuple holding the same data.
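
A quick way to see this (exact byte counts vary by Python version and platform):

import sys

# same five values: the list carries extra allocation overhead
print(sys.getsizeof([0, 1, 2, 3, 4]))  # larger
print(sys.getsizeof((0, 1, 2, 3, 4)))  # smaller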

2.2 binary search on a sorted list

Once a list has been sorted, we can find our desired element using a binary search (Example 3-3), which has an average-case complexity of O(log n). It achieves this by first looking at the middle of the list and comparing this value with the desired value. If the midpoint's value is less than our desired value, we consider the right half of the list, and we continue halving like this until the value is found or is known not to occur in the sorted list. As a result, we do not need to read all values in the list, as a linear search must; we read only a small subset of them.
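
The standard-library bisect module implements this; a minimal sketch:

import bisect

def index_of(sorted_list, value):
    # binary search: O(log n) comparisons in a sorted list
    i = bisect.bisect_left(sorted_list, value)
    if i != len(sorted_list) and sorted_list[i] == value:
        return i
    return -1  # not present

print(index_of([1, 3, 5, 7, 9], 7))  # 3
print(index_of([1, 3, 5, 7, 9], 4))  # -1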

3 dictionary and set

3.1 What are dictionaries and sets good for?

Sets and dictionaries are ideal data structures to be used when your data has no intrinsic order, but does have a unique object that can be used to reference it (the reference object is normally a string, but can be any hashable type). This reference object is called the “key,” while the data is the “value.”

3.2 How are dictionaries and sets the same?

A set is simply a collection of unique keys with no associated values.
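
A small illustration (hypothetical data):

phonebook = {
    "John Doe": "555-555-5555",        # key -> value
    "Albert Einstein": "212-555-5555",
}
unique_names = {"John Doe", "Albert Einstein"}  # keys only, no values

print(phonebook["John Doe"])        # O(1) average lookup by key
print("John Doe" in unique_names)   # O(1) average membership test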

3.3 What is the overhead when using a dictionary?

Each insert and lookup must compute the key's hash, and the underlying table is kept deliberately sparse, so dictionaries trade hashing time and extra memory for O(1) average lookups.

3.4 How can I optimize the performance of a dictionary?

# version 1: try/except — raising and catching KeyError for every
# new word is expensive
wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1

# version 2: dict.get with a default avoids the exception machinery,
# and binding the method to a local name skips repeated attribute lookups
wdict = {}
get = wdict.get
for word in words:
    wdict[word] = get(word, 0) + 1

3.5 How does Python use dictionaries to keep track of namespaces?

Name lookup searches, in order:

1. local variables (fastest: locals are stored in an array, not a dictionary)
2. the module's global dictionary
3. the __builtin__ module (renamed builtins in Python 3)
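
Because each step in that chain is another lookup, hot code sometimes binds a global or builtin to a local name once, up front. A hedged micro-optimization sketch:

import math

def norm_slow(vec):
    # math.sqrt costs a global lookup plus an attribute lookup per call
    return math.sqrt(sum(x * x for x in vec))

def norm_fast(vec, sqrt=math.sqrt):
    # bound once as a default argument, sqrt is now a fast local lookup
    return sqrt(sum(x * x for x in vec))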

4 Iterators and Generators

4.1 How do generators save memory?

Since xrange is already lazy, calling iter on it is a trivial operation that simply hands back an iterator over the original object. range, however, returns a full list, so every value must be materialized in memory before a list iterator can walk over it. (In Python 3, range itself is lazy, like the old xrange.)
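
A quick check in Python 3, where range is lazy:

import sys

big = range(10 ** 12)      # constant-size object; no list is built
print(sys.getsizeof(big))  # small, and independent of the range length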

4.2 When is the best time to use a generator?

def fibonacci():
    # lazily yield Fibonacci numbers forever; only two integers
    # of state are held in memory at any time
    i, j = 0, 1
    while True:
        yield j
        i, j = j, i + j

def fibonacci_transform():
    # consume the stream until values exceed 5000, counting the odd terms
    count = 0
    for f in fibonacci():
        if f > 5000:
            break
        if f % 2:
            count += 1
    return count

4.3 How can I use itertools to create complex generator workflows?
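
One hedged sketch, reusing fibonacci() from above: takewhile ends the infinite stream lazily, so no intermediate list is ever built.

from itertools import takewhile

def fibonacci_transform_itertools():
    # keep values while <= 5000, counting the odd ones
    return sum(1 for f in takewhile(lambda f: f <= 5000, fibonacci())
               if f % 2)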

4.4 When is lazy evaluation beneficial, and when is it not?

5 Matrix and vector computation

5.1 how to use perf stat to understand CPU performance

5.2 how efficiently the CPU's caches are utilized.
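
A hedged example invocation (event names vary by CPU; run perf list to see what your machine supports):

$ perf stat -e cycles,instructions,cache-references,cache-misses python julia1.py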

6 Compiling to C

6.1 How can I have my Python code run as lower-level code?

6.2 What is the difference between a JIT compiler and an AOT compiler?

6.3 What tasks can compiled Python code perform faster than native Python?

6.4 Why do type annotations speed up compiled Python code?
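
For example, in Cython (one ahead-of-time compiler for Python), cdef annotations let a loop run on C ints and doubles instead of boxed Python objects. A hedged sketch; the file and function names are hypothetical:

# cythonfn.pyx
def sum_squares(int n):
    # cdef gives the compiler fixed C types, so no Python objects
    # are created inside the loop
    cdef double total = 0
    cdef int i
    for i in range(n):
        total += i * i
    return total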

6.5 How can I write modules for Python using C or Fortran?

6.6 How can I use libraries from C or Fortran in Python?

7 RAM

7.1 memory_profiler for tracking RAM usage.

7.2 Why should I use less RAM?

7.3 Why are numpy and array better for storing lots of numbers?
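
A brief illustration: the array module stores primitive values in one contiguous block, while a list holds pointers to separately allocated objects.

import array

arr = array.array('d', range(1000))    # 1000 C doubles, one block of RAM
lst = [float(i) for i in range(1000)]  # 1000 boxed float objects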

7.4 How can lots of text be efficiently stored in RAM?

7.5 How could I count (approximately!) to 1e77 using just 1 byte?
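
The trick is a probabilistic (Morris) counter: store only an exponent, increment it with probability 2**-exponent, and estimate the true count as roughly 2**exponent. One byte holds exponents up to 255, and 2**255 is about 1e77. A minimal sketch:

import random

class MorrisCounter:
    # approximate counting with a single small integer of state
    def __init__(self):
        self.exponent = 0

    def add(self):
        # increment with probability 2**-exponent
        if random.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def __len__(self):
        # rough estimate of the number of add() calls so far
        return 2 ** self.exponent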

7.6 What are Bloom filters and why might I need them?
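
A Bloom filter answers "have I seen this item?" with a fixed-size bit array and k hash functions; it can return false positives but never false negatives. A toy sketch (the sizes and seeded-hash scheme here are illustrative choices, not a production design):

class BloomFilter:
    def __init__(self, num_bits=2 ** 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _indexes(self, item):
        # derive num_hashes bit positions from the item
        for seed in range(self.num_hashes):
            yield hash((seed, item)) % self.num_bits

    def add(self, item):
        for i in self._indexes(item):
            self.bits[i // 8] |= 1 << (i % 8)

    def __contains__(self, item):
        # every bit set -> "probably seen"; any bit clear -> definitely not
        return all(self.bits[i // 8] & (1 << (i % 8))
                   for i in self._indexes(item))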

8 Using the dis Module to Examine CPython Bytecode
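
A minimal example of inspecting bytecode:

import dis

def add(a, b):
    return a + b

dis.dis(add)  # prints the compiled opcodes (names vary across Python versions)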