By Dat Nguyen | Tech Lead, AnyTag Team
This article mainly focus on Python and assumes you are already familiar with Python.
Why do we need Python Profilers?
Nowadays, our program is getting more complex in term of both business and coding level which can contain even thousands of code lines. Finding out why the program is slow by looking at its source code is a time consuming task and not easy to do.
Moreover, when the program runs on cloud service which costs more money if the program spends more usage on computing resources.
That’s why code profiler has essential role to find bottlenecks and help to optimize our program performance.
Recently there are several ways to profile code, however in this post, we will be focusing on a basic method using a built-in profilers in Python which is CProfile.
How to use Python cProfile?
cProfile is a built-in python module and is the most commonly used profiler currently, using it we can
- Get the total run time taken by the entire code.
- Show the time taken by each individual step which allows to compare and find which parts need optimization.
- Understand how many times certain functions are being called
- View stats nicely using other library such as
snakeviz
Let’s look at the first simple example
> import cProfile
> cProfile.run("sorted([1, 2, 3])")
4 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.sorted}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
- ncalls is the number of calls made
- tottime is a total of the time spent in the given function
- percall refers to the quotient of tottime divided by ncalls
- cumtime is the cumulative time spent in this and all subfunctions
- filename:lineno(function) provides the respective data of each function
There is no bottleneck for the example, let’s look at more interesting example below
import random
def sort_letters(number_letters=10**6):
random_string = ''
for i in range(number_letters):
random_string += random.choice('abcdefghijklmnopqrstuvwxyz')
return sorted(random_string)
We can also call cProfile on the command line
> python -m cProfile profile_test.py
3000053 function calls in 1.041 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 __future__.py:48(<module>)
1 0.000 0.000 0.000 0.000 __future__.py:74(_Feature)
7 0.000 0.000 0.000 0.000 __future__.py:75(__init__)
6 0.000 0.000 0.000 0.000 hashlib.py:100(__get_openssl_constructor)
1 0.001 0.001 0.001 0.001 hashlib.py:56(<module>)
1 0.003 0.003 1.041 1.041 profile_test.py:1(<module>)
1 0.289 0.289 1.034 1.034 profile_test.py:3(sort_letters)
1 0.000 0.000 0.001 0.001 random.py:100(seed)
1000000 0.455 0.000 0.562 0.000 random.py:275(choice)
1 0.002 0.002 0.004 0.004 random.py:40(<module>)
...
By looking at the output, it looks like the choice function from random module takes lot of time, we can start optimizing it from this code
How to visualize cProfile reports?
The common tool for visualizing data obtained by cProfile module is SnakeViz
, let explore how to use it
Firstly, we can export the output of cProfile in to the dump file
python -m cProfile -o stats.dump profile_test.py
Next step
snakeviz stats.dump
Start optimizing previous code
From previous example, we know that random.choice
function is slowing us down, it’s time to replace it with other similar function
from numpy import random
def sort_letters(number_letters=10**6):
letters = tuple('abcdefghijklmnopqrstuvwxyz')
random_letters = random.choice(letters, number_letters)
random_letters.sort()
return random_letters.tostring()
Now our program runs much more faster than before
Conclusion
Apart from cProfile, there are some other python module to support profiling code, but cProfile includes most of its features and is recommended to use. Stay tuned for similar topic, for example server-side profiler