Update to always use the latest glibc's bench-malloc-thread program #4

Open: wants to merge 9 commits into base: master
33 changes: 20 additions & 13 deletions Makefile
@@ -1,17 +1,22 @@
#
# This makefile will build a small benchmarking utility for 'malloc' implementations and will
# run it with different implementations saving results into JSON files.
# run it with different implementations, first saving results into JSON files, and then plotting
# them graphically.
#
# Specifically this makefile downloads, configure and compiles 3 different software packages:
# - GNU libc
# - Google perftools (tcmalloc)
# - jemalloc
# Specifically, this makefile downloads, configures and compiles these different software packages:
# 1. GNU libc
# 2. Google perftools (tcmalloc)
# 3. jemalloc
#
# Tested with versions:
# - GNU libc 2.26
# - Google perftools (tcmalloc) 2.6.3
# - jemalloc 5.0.1
# First tested with these versions:
# 1. GNU libc 2.26
# 2. Google perftools (tcmalloc) 2.6.3
# 3. jemalloc 5.0.1
#
# Most-recently tested on Ubuntu 20.04 with these versions:
# 1. GNU libc 2.31
# 2. Google perftools (tcmalloc) 2.9.1
# 3. jemalloc 5.2.1-742
#

#
@@ -35,8 +40,8 @@ endif
ifdef NUMPROC
parallel_flags := -j$(NUMPROC)
else
# default value
parallel_flags := -j4
# default value: the number of available processing units, as printed by the `nproc` command; ex: 8
parallel_flags := -j$(shell nproc)
endif

ifdef POSTFIX
Expand Down Expand Up @@ -76,6 +81,7 @@ glibc_url := git://sourceware.org/git/glibc.git
tcmalloc_url := https://github.com/gperftools/gperftools.git
jemalloc_url := https://github.com/jemalloc/jemalloc.git

# Alternate download version and source if not using the git repo above
glibc_version := 2.26
glibc_alt_wget_url := https://ftpmirror.gnu.org/libc/glibc-$(glibc_version).tar.xz

Expand Down Expand Up @@ -125,6 +131,7 @@ $(glibc_install_dir)/lib/libc.so.6:
cd $(glibc_build_dir) && \
../glibc/configure --prefix=$(glibc_install_dir) && \
make $(parallel_flags) && \
make bench-build $(parallel_flags) && \
make install
[ -x $(glibc_build_dir)/benchtests/bench-malloc-thread ] && echo "GNU libc benchmarking utility is ready!" || echo "Cannot find GNU libc benchmarking utility! Cannot collect benchmark results"

Expand All @@ -143,7 +150,6 @@ $(jemalloc_install_dir)/lib/libjemalloc.so:
( make install || true )

build:
$(MAKE) -C benchmark-src
ifeq ($(findstring glibc,$(implem_list)),glibc)
$(MAKE) $(glibc_install_dir)/lib/libc.so.6
endif
Expand All @@ -163,6 +169,7 @@ collect_results:
@sudo lshw -short -class memory -class processor > $(results_dir)/hardware-inventory.txt
@echo -n "Number of CPU cores: " >>$(results_dir)/hardware-inventory.txt
@grep "processor" /proc/cpuinfo | wc -l >>$(results_dir)/hardware-inventory.txt
# NB: you may need to install `numactl` first with `sudo apt install numactl`.
@(which numactl >/dev/null 2>&1) && echo "NUMA informations:" >>$(results_dir)/hardware-inventory.txt
@(which numactl >/dev/null 2>&1) && numactl -H >>$(results_dir)/hardware-inventory.txt

Expand All @@ -173,5 +180,5 @@ plot_results:
upload_results:
git add -f $(results_dir)/*$(benchmark_result_json) $(results_dir)/$(benchmark_result_png) $(results_dir)/hardware-inventory.txt
git commit -m "Adding results from folder $(results_dir) to the GIT repository"
@echo "Run 'git push' to push online your results (required GIT repo write access)"
@echo "Run 'git push' to push online your results (requires GIT repo write access)"

84 changes: 62 additions & 22 deletions README.md
@@ -1,49 +1,89 @@
For more glibc source code and build information, see [glibc-benchmark-info.md](glibc-benchmark-info.md).


# malloc-benchmarks

Simple benchmarking scripts to run on any machine to compare different C/C++ malloc implementations.
The scripts are not meant to face any possible problem, quite the opposite.
Simple benchmarking and plotting scripts to run on any machine to compare different C/C++ malloc implementations.
These scripts are not meant to handle every possible problem; quite the opposite.
They will:
- download and build [GNU libc](https://www.gnu.org/software/libc/), [Google perftools](https://github.com/gperftools/gperftools), [Jemalloc](http://jemalloc.net/)
- use GNU libc malloc multi-thread benchmarking utility to generate JSON results for different combinations
of malloc implementation and number of threads
- use [Python matplotlib](https://matplotlib.org/) to produce a summary figure
1. Download and build [GNU libc](https://www.gnu.org/software/libc/), [Google perftools](https://github.com/gperftools/gperftools), [Jemalloc](http://jemalloc.net/)
1. Use the GNU libc malloc multi-threaded benchmarking utility to generate JSON results for different combinations
of malloc implementations and numbers of threads
1. Use [Python matplotlib](https://matplotlib.org/) to produce a plot of the results


## Dependencies

If `make` below fails, you may need to install one or more of the following packages (via `sudo apt install`). Alternatively, just run the installation commands below before you begin. Last tested on Ubuntu 20.04.

```bash
sudo apt update && sudo apt install \
numactl g++ clang llvm-dev unzip dos2unix linuxinfo bc libgmp-dev wget \
cmake python python3 ruby ninja-build libtool autoconf
# For Python
pip3 install matplotlib
```


## How to collect benchmark results and view them

```bash
git clone https://github.com/f18m/malloc-benchmarks.git
cd malloc-benchmarks
make
# OR, time the process too to help you set expectations for how long it will take
time make
```

Once you have run `make`, the plot will display. To re-plot the results without rerunning the tests, run:
```bash
make plot_results
```
git clone https://github.com/f18m/malloc-benchmarks.git
cd malloc-benchmarks
make

Note that each time you run `make`, all of the benchmark results will be stored in a folder for your computer within the `results` dir, overwriting all previous results. So, if you wish to save previous benchmarking runs, be sure to rename your computer's folder in the `results` dir prior to running `make` again.

You can customize the runs by setting environment variables as you call `make`. See the top of the `Makefile` for details, including the default values for `benchmark_nthreads` and `implem_list`.

Examples:
```bash
# Run only 1 and 2 threads, testing only malloc implementations jemalloc and tcmalloc:
NTHREADS="1 2" IMPLEMENTATIONS="jemalloc tcmalloc" time make
```


## How to collect benchmark results on a machine and plot them from another one

On the machine where you want to collect benchmark results:

```
git clone https://github.com/f18m/malloc-benchmarks.git
cd malloc-benchmarks
make download build collect_results
scp -r results IP_OF_OTHER_MACHINE:
```bash
git clone https://github.com/f18m/malloc-benchmarks.git
cd malloc-benchmarks
make download build collect_results
scp -r results IP_OF_OTHER_MACHINE:
```

On the other machine where you want to plot results:

```
git clone https://github.com/f18m/malloc-benchmarks.git
cd malloc-benchmarks
mv ../results .
make plot_results
```bash
git clone https://github.com/f18m/malloc-benchmarks.git
cd malloc-benchmarks
mv ../results .
make plot_results
```


## Example benchmarks

The following are some pictures obtained on different HW systems using however the same benchmarking utility written by
GNU libc developers. They give an idea on how much performances can be different on different CPU/memory HW and varying the number of threads.
Of course the closer the curves are to zero, the better they are (the lower the better!).
The following are some plots of results obtained on different hardware systems using the same benchmarking utility written by the
GNU libc developers. They give an idea of how much performance can differ on different CPU/memory hardware and with a varying number of threads.
Of course, the closer the curves are to zero, the better they are (the lower the better!).

**To verify the version numbers for your benchmarks, look in the following places after running `make`:**
1. **system_default:** run `apt show libc6` to see your system glibc version ([source: "Determining the Installed glibc Version"](https://www.linode.com/docs/guides/patching-glibc-for-the-ghost-vulnerability/)). Ex: `Version: 2.31-0ubuntu9.2`
1. **glibc:** See this file: `malloc-benchmarks/glibc/version.h`
1. **tcmalloc:** See the `TC_VERSION_STRING` value inside `malloc-benchmarks/tcmalloc-install/include/gperftools/tcmalloc.h`
1. **jemalloc:** See the `JEMALLOC_VERSION` value inside `malloc-benchmarks/jemalloc-install/include/jemalloc/jemalloc.h`

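The header lookups in the list above can also be scripted. The following is a minimal sketch and not part of this PR: `read_define` is a hypothetical helper, and the sample header text is made up for illustration. Only the macro names `TC_VERSION_STRING` and `JEMALLOC_VERSION` come from the list above.

```python
import re


def read_define(header_text, macro):
    """Return the value of a `#define MACRO value` line, or None if absent."""
    pattern = r'^\s*#\s*define\s+' + re.escape(macro) + r'\s+(.+?)\s*$'
    match = re.search(pattern, header_text, flags=re.MULTILINE)
    if match is None:
        return None
    # Drop surrounding double quotes from string-valued macros.
    return match.group(1).strip('"')


# Made-up header contents, for illustration only.
sample_header = '#define JEMALLOC_VERSION "x.y.z-example"\n#define OTHER 1\n'
print(read_define(sample_header, "JEMALLOC_VERSION"))
```

To use it on a real installation, pass the contents of one of the header files named above, for example `read_define(open(path).read(), "TC_VERSION_STRING")`, where `path` points at the installed `tcmalloc.h`.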

<table cellpadding="5" width="100%">
<tbody>
16 changes: 8 additions & 8 deletions bench_collect_results.py
@@ -1,4 +1,4 @@
#!/usr/bin/python
#!/usr/bin/python3
"""Generate benchmarking results in JSON form, using GNU libc benchmarking utility;
different allocators are injected into that utility by using LD_PRELOAD trick.
"""
@@ -11,7 +11,7 @@
# Constants
#

internal_benchmark_util = 'benchmark-src/bench-malloc-thread'
internal_benchmark_util = 'glibc-build/benchtests/bench-malloc-thread'

glibc_install_dir = 'glibc-install'
tcmalloc_install_dir = 'tcmalloc-install'
@@ -43,7 +43,7 @@

def find(name, paths):
    for path in paths:
        #print "Searching into: ", path
        #print("Searching into: ", path)
        for root, dirs, files in os.walk(path, followlinks=False):
            if name in files:
                return os.path.join(root, name)
@@ -107,7 +107,7 @@ def run_benchmark(outfile,thread_values,impl_name):
os.environ["LD_PRELOAD"] = impl_preload_libs[impl_name]
if len(os.environ["LD_PRELOAD"])>0:
    # the tcmalloc/jemalloc shared libs require in turn C++ libs:
    #print "preload_required_libs_fullpaths is:", preload_required_libs_fullpaths
    #print("preload_required_libs_fullpaths is:", preload_required_libs_fullpaths)
    for lib in preload_required_libs_fullpaths:
        os.environ["LD_PRELOAD"] = os.environ["LD_PRELOAD"] + ':' + lib

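The hunk above builds `LD_PRELOAD` by mutating `os.environ` in place. As a hedged alternative sketch (not what the script does), the same colon-separated value can be assembled in a copy of the environment and handed only to the child process. `build_preload_env` and the library paths below are hypothetical.

```python
import os


def build_preload_env(impl_lib, required_libs, base_env=None):
    """Return a copy of the environment with LD_PRELOAD set for one allocator."""
    env = dict(os.environ if base_env is None else base_env)
    if impl_lib:
        # Preload the allocator first, then the C++ libs it depends on.
        env["LD_PRELOAD"] = ":".join([impl_lib] + list(required_libs))
    else:
        # No allocator library requested: make sure nothing is preloaded.
        env.pop("LD_PRELOAD", None)
    return env


# Hypothetical paths, for illustration only.
env = build_preload_env("/opt/example/libjemalloc.so",
                        ["/opt/example/libstdc++.so.6"],
                        base_env={})
print(env["LD_PRELOAD"])
```

The returned dictionary can be passed as the `env=` argument of `subprocess.run`, which keeps the parent process's own environment untouched between runs. Unlike the script, this sketch removes `LD_PRELOAD` entirely when no library is given instead of setting it to an empty string.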
Expand Down Expand Up @@ -156,13 +156,13 @@ def main(args):
sys.exit(3)

outfile = os.path.join(outfile_path_prefix, implementations[idx] + '-' + outfile_postfix)
print "----------------------------------------------------------------------------------------------"
print "Testing implementation '{}'. Saving results into '{}'".format(implementations[idx],outfile)
print("----------------------------------------------------------------------------------------------")
print("Testing implementation '{}'. Saving results into '{}'".format(implementations[idx],outfile))

print "Will run tests for {} different number of threads".format(len(thread_values))
print("Will run tests for {} different number of threads".format(len(thread_values)))
success = success + run_benchmark(outfile,thread_values,implementations[idx])

print "----------------------------------------------------------------------------------------------"
print("----------------------------------------------------------------------------------------------")
return success

if __name__ == '__main__':
18 changes: 14 additions & 4 deletions bench_plot_results.py
@@ -1,4 +1,4 @@
#!/usr/bin/python
#!/usr/bin/python3
"""Generates a figure that shows all benchmarking results
"""
import sys
@@ -8,7 +8,7 @@

import matplotlib.pyplot as plotlib

BenchmarkPoint = collections.namedtuple('BenchmarkPoint', ['threads', 'time_per_iteration'], verbose=False)
BenchmarkPoint = collections.namedtuple('BenchmarkPoint', ['threads', 'time_per_iteration'])
filled_markers = ('o', 'v', '^', '<', '>', '8', 's', 'p', '*', 'h', 'H', 'D', 'd', 'P', 'X')
colours = ('r', 'g', 'b', 'black', 'yellow', 'purple')

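The `verbose` keyword was removed from `collections.namedtuple` in Python 3.7, which is why the new line drops it. A small sketch of how the resulting type behaves follows; the sample values are made up and do not come from any benchmark run.

```python
import collections

BenchmarkPoint = collections.namedtuple('BenchmarkPoint', ['threads', 'time_per_iteration'])

# Made-up sample values, for illustration only.
points = [BenchmarkPoint(threads=1, time_per_iteration=50.0),
          BenchmarkPoint(threads=2, time_per_iteration=75.0)]

# Fields are accessible by name, and each point still unpacks like a tuple.
X = [p.threads for p in points]
Y = [p.time_per_iteration for p in points]
threads, time_per_iteration = points[0]
print(X, Y, threads)
```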
@@ -32,8 +32,12 @@ def plot_graphs(outfilename, benchmark_dict):
plotlib.setp(lines, 'color', colours[nmarker])

# remember max X/Y
max_x.append(max(X))
max_y.append(max(Y))
# In case you only ran some of the tests, don't call `max()` on an empty list (i.e., for a
# benchmark you didn't run). Only operate if the lists aren't empty.
if X:
    max_x.append(max(X))
if Y:
    max_y.append(max(Y))

nmarker=nmarker+1

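The guards above skip empty series, but if every series were empty, the later `max(max_y)` call would presumably still raise `ValueError`. The following is a hedged sketch of an axis-limit computation that tolerates both cases. `axis_limit` and its `fallback` parameter are hypothetical; the 1.3 headroom factor is taken from the existing `ylim` call.

```python
def axis_limit(series_list, headroom=1.3, fallback=1.0):
    """Upper axis limit: largest value across non-empty series, times headroom."""
    peaks = [max(series) for series in series_list if series]
    if not peaks:
        # Nothing was measured at all: fall back to an arbitrary limit.
        return fallback
    return max(peaks) * headroom


print(axis_limit([[1, 2, 4], [], [3]]))
print(axis_limit([[], []]))
```

Collecting the per-series peaks in a single comprehension keeps the empty-list check in one place instead of repeating it for each axis.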
@@ -42,6 +46,12 @@ def plot_graphs(outfilename, benchmark_dict):
plotlib.ylim(0, max(max_y)*1.3)

print("Writing plot into '%s'" % outfilename)
print("- - -\n" +
"Close the plot to terminate the program. Run `make plot_results` to plot the results\n" +
"again. Be sure to manually make a copy of the \"results\" folder (if you wish to\n" +
"save your results) before running `make` again, or else these results will be\n" +
"overwritten.\n" +
"- - -")
plotlib.legend(loc='upper left')
plotlib.savefig(outfilename)
plotlib.show()
25 changes: 0 additions & 25 deletions benchmark-src/Makefile

This file was deleted.

4 changes: 0 additions & 4 deletions benchmark-src/README.md

This file was deleted.
