This repository contains examples of how to profile Python applications using the cProfile, snakeviz, line_profiler, and memory_profiler modules.
To reproduce the same environment, I suggest using Conda as your package manager. If you have it installed, you can create the environment from environment.yml with

```shell
conda env create -f environment.yml
```

and activate it with

```shell
conda activate profiling
```

Time-based profiling allows you to see how much time your application spends in each one of its components.
We use the cProfile module to profile an entire Python script. In each example folder, you will find a time/app-overview folder that contains the relevant code, along with a profile.sh script that runs the Python code under cProfile. This script generates a file, example.prof, that contains the profiling data.
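The profile.sh scripts boil down to running the interpreter with cProfile enabled (typically `python -m cProfile -o example.prof app.py`). The same thing can be done programmatically; below is a minimal sketch where `slow_sum` is a toy stand-in for the repository's example apps:

```python
import cProfile
import pstats

def slow_sum(n):
    """Toy workload standing in for the repository's example apps."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Equivalent in spirit to what profile.sh does:
#   python -m cProfile -o example.prof app.py
profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()
profiler.dump_stats("example.prof")  # the file snakeviz opens later
```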
Even though it is possible to get statistics directly from cProfile, a great way to visualize the profiling results is with snakeviz. It's very easy to use. For each example, you will find a visualize.sh script that, when run, will launch snakeviz in a browser tab. Below is how a typical result looks:
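If you prefer to stay in the terminal, the same data can also be inspected with the standard-library pstats module. A self-contained sketch (the function and file names here are illustrative, not from the repository):

```python
import cProfile
import pstats

def hot_loop(n):
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
hot_loop(200_000)
profiler.disable()
profiler.dump_stats("example.prof")

# Print the five entries with the largest cumulative time, which is
# roughly what snakeviz ranks visually in its sunburst/icicle view.
stats = pstats.Stats("example.prof")
stats.sort_stats("cumulative").print_stats(5)
```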
Once you have spotted which functions, methods, or routines consume most of the time in your application, you may want to dig deeper to see exactly which instructions under each of them are the hot ones. For each example, in time/line-by-line, we use line_profiler for that, which requires decorating the target function with @profile. The profile.sh script calls the relevant binary (kernprof) to generate the profiling data, which can then be visualized with the visualize.sh script. A typical output is:
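A minimal sketch of a script prepared for line_profiler (the function name is made up). When executed via `kernprof -l script.py`, kernprof injects @profile as a builtin; the NameError fallback keeps the same file runnable with plain Python too:

```python
# Under "kernprof -l", @profile is injected as a builtin; this fallback
# makes the script also runnable with a plain Python interpreter.
try:
    profile
except NameError:
    def profile(func):
        return func

@profile
def build_squares(n):
    result = []
    for i in range(n):        # line_profiler reports time spent on each line
        result.append(i * i)
    return result

if __name__ == "__main__":
    build_squares(100_000)
```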
Understanding your Python application in terms of time is definitely an important step, but to characterize your application workload better, we also need to understand how it uses memory.
We use the memory_profiler module to get an overview of how much memory a Python script is using as a function of time. For each example, the memory/app-overview folder contains the code to be profiled and a profile.sh script that uses the relevant binary (mprof) to generate the profiling data, which can be visualized using the visualize.sh script. A typical output is:
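mprof samples the whole process's memory use over time, so any script can be a target. A toy sketch (the function name is made up) that you could profile with `mprof run app.py` and then visualize with `mprof plot`:

```python
import time

def allocate_in_steps(steps=8, chunk_mb=8):
    """Allocate memory gradually so mprof's memory-vs-time plot shows a ramp."""
    chunks = []
    for _ in range(steps):
        chunks.append(bytearray(chunk_mb * 1024 * 1024))
        time.sleep(0.05)  # give mprof's sampler time to notice each step
    return chunks

if __name__ == "__main__":
    allocate_in_steps()
```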
We can also target individual functions with the @profile decorator. memory_profiler will then show, line by line, the amount of memory that the Python interpreter's process is using as your code runs. For each example, under memory/line-by-line, the profile.sh script runs the profiler and shows the results. A typical output is:
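A sketch of a function instrumented for memory_profiler (the function name is made up); unlike kernprof, here the decorator is imported explicitly, and the ImportError fallback lets the file run even where the package is not installed:

```python
# memory_profiler's decorator is imported explicitly; the fallback keeps
# this sketch runnable without the package installed.
try:
    from memory_profiler import profile
except ImportError:
    def profile(func):
        return func

@profile
def grow_and_shrink(n):
    data = [0] * n                       # reported as a memory increment
    squares = [i * i for i in range(n)]  # another increment, line by line
    del data                             # released memory shows up too
    return squares

if __name__ == "__main__":
    grow_and_shrink(100_000)
```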
Profiling Jupyter notebooks directly involves jumping through some hoops. The simplest alternative is to copy the content of your cells into a Python script. It is possible to get the same effect with the nbconvert module:

```shell
jupyter nbconvert <YourNB>.ipynb --to script
```

which will generate a <YourNB>.py script. Sometimes the result looks quite ugly, though.



