A little Python benchmarking tool.
Why do I need this?
The built-in timeit module for Python is great, but the typical usage for micro-benchmarks is to run a small script like this:
python -m timeit "a = 1; b = 2; a * b"
The problem with this approach is that the code is compiled as a module, so any variables at the top level are globals. The compiled bytecode differs from that of the same statements inside a local scope (e.g. a method or function), and the two behave and perform very differently in CPython.
richbench encourages you to write benchmarks inside functions to properly simulate the closures and scope of production code.
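To make the scope difference concrete, the following sketch compiles the same three statements once as top-level module code and once inside a function, then compares the opcodes with the standard dis module. The function name local_scope is illustrative only, and exact opcode names vary between CPython versions, so the check looks only for the NAME and FAST families.

```python
import dis

SNIPPET = "a = 1; b = 2; a * b"

# Compile the statements as top-level (module) code.
module_code = compile(SNIPPET, "<top-level>", "exec")


def local_scope():
    # The same statements, but inside a function body.
    a = 1
    b = 2
    a * b


module_ops = {ins.opname for ins in dis.get_instructions(module_code)}
local_ops = {ins.opname for ins in dis.get_instructions(local_scope)}

# Top-level variables are stored by name (a namespace dictionary lookup).
print("top-level:", sorted(op for op in module_ops if "NAME" in op))
# Function locals use the indexed "fast locals" array instead.
print("function: ", sorted(op for op in local_ops if "FAST" in op))
```

The top-level version stores and loads its variables by name, while the function version uses the FAST opcodes, which is why the two are not interchangeable for timing purposes.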
Requires Python 3.6+ and can be installed using pip:
pip install richbench
Write your benchmark functions in a directory and run
richbench with that target to get the results:
$ richbench my_benchmarks/
Results are displayed in a table.
The available command-line options are:
$ richbench --help
usage: richbench [-h] [--profile] [--percentage] [--markdown] [--benchmark [BENCHMARK]] [--repeat REPEAT] [--times TIMES] target [target ...]

positional arguments:
  target

options:
  -h, --help            show this help message and exit
  --profile             Profile the benchmarks and store in .profiles/
  --percentage          Show percentage of improvement instead of multiplier
  --markdown            Prints a markdown friendly table
  --benchmark [BENCHMARK]
                        Run specific benchmark
  --repeat REPEAT       Repeat benchmark this many times
  --times TIMES         Run benchmark this many times
Benchmarks should be in a directory and must have the filename
The last statement in the benchmark file should be a list called
__benchmarks__, where each item is a tuple containing:
- function a
- function b
- the name of the benchmark
def sort_seven():
    """Sort a list of seven items"""
    for _ in range(10_000):
        sorted([3, 2, 4, 5, 1, 5, 3])

def sort_three():
    """Sort a list of three items"""
    for _ in range(10_000):
        sorted([3, 2, 4])

__benchmarks__ = [
    (sort_seven, sort_three, "Sorting 3 items instead of 7")
]
Tips for benchmarking
Inside your benchmark function try to:
- Run setup commands once
- Repeat the code you want to benchmark as many times as possible to create a stable benchmark
For micro-benchmarks, this can be achieved by calling the target code 1,000 to 100,000 times.
If your benchmark code completes within 0.001s, you are likely to get unstable results because the CPU is also busy with other activities. Increase the number of times the target code is run within the function.
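One way to check whether a benchmark function is long enough is to time a single call by hand. The helper below, seconds_per_call, is a hypothetical utility written for this example and is not part of richbench. It assumes time.perf_counter is a suitable clock, and actual timings depend on your machine.

```python
import time

THRESHOLD = 0.001  # seconds, the lower bound mentioned above


def seconds_per_call(func, calls=5):
    """Return the fastest wall-clock duration of `calls` runs of func."""
    best = float("inf")
    for _ in range(calls):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def single_sort():
    # Runs the target code once: far too short to measure reliably.
    sorted([3, 2, 4])


def repeated_sort():
    # Runs the target code many times inside the function.
    for _ in range(10_000):
        sorted([3, 2, 4])


for func in (single_sort, repeated_sort):
    elapsed = seconds_per_call(func)
    verdict = "ok" if elapsed >= THRESHOLD else "increase the loop count"
    print(f"{func.__name__}: {elapsed:.6f}s ({verdict})")
```

If a function is reported as too short, raise the range() count inside it until a single call comfortably exceeds the threshold.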
Adding the
--profile flag to the command line generates a subdirectory
.profiles with HTML profile data of your target functions.