Merged
Changes from 39 commits
Commits
61 commits
47ff19f
[COPILOT] refactor extraction code to separate module
patricktnast Jan 6, 2026
3222987
format
patricktnast Jan 6, 2026
bb99066
[COPILOT] consolidate benchmark and phase configs.
patricktnast Jan 6, 2026
ee4c080
refactor extraction to create a 'configuration'
patricktnast Jan 7, 2026
8eacffe
remove unused imports
patricktnast Jan 7, 2026
be56673
fix method sig
patricktnast Jan 7, 2026
725cfb4
minor fixes
patricktnast Jan 7, 2026
ba57664
cleanup
patricktnast Jan 7, 2026
0ba03a9
add basic unit tests
patricktnast Jan 7, 2026
d740d88
add back result summary columns
patricktnast Jan 7, 2026
b98b547
make callpattern more ergonomic
patricktnast Jan 7, 2026
b6d861a
condense
patricktnast Jan 7, 2026
cd0520d
add cli for summarization
patricktnast Jan 7, 2026
12fe5fb
[COPILOT] Add tests
patricktnast Jan 8, 2026
95c86cc
edits for readability
patricktnast Jan 8, 2026
9ffd7d5
change nan check to warning
patricktnast Jan 8, 2026
266ffae
format
patricktnast Jan 8, 2026
ac6e2c2
add summarize run at the end of the run_benchmark loop
patricktnast Jan 8, 2026
881154e
[COPILOT] extract plotting functions to new module
patricktnast Jan 8, 2026
0b43719
[COPILOT] refactor plots
patricktnast Jan 8, 2026
d353529
adjust so that we only create fractions for bottleneck patterns, whic…
patricktnast Jan 8, 2026
708d155
make bottleneck patterns more strict
patricktnast Jan 8, 2026
305a9a1
Merge branch 'pnast/feature/mic-6519-agg' into pnast/refactor/mic-637…
patricktnast Jan 8, 2026
e391d47
[COPILOT] add nb generation
patricktnast Jan 8, 2026
6bd4573
adjust organization
patricktnast Jan 8, 2026
2643cb6
format
patricktnast Jan 8, 2026
13dd0e5
Merge branch 'pnast/refactor/mic-6373-plotting' into pnast/feature/mi…
patricktnast Jan 8, 2026
4f069fa
[COPILOT] add explicit extraction configuration
patricktnast Jan 8, 2026
e5a4ecd
remove preset
patricktnast Jan 9, 2026
70268a8
remove other examples
patricktnast Jan 9, 2026
5f07a2c
consolidate tests
patricktnast Jan 9, 2026
13c22fb
format
patricktnast Jan 9, 2026
0cb59be
add context manager
patricktnast Jan 9, 2026
4e9a02f
use a fixture instead
patricktnast Jan 12, 2026
5ae07b4
fix typo
patricktnast Jan 12, 2026
eb620cb
[COPILOT] generate documentation to readme
patricktnast Jan 12, 2026
d4506e8
trim down
patricktnast Jan 12, 2026
613f280
remove previous section
patricktnast Jan 12, 2026
5ee0995
remove duplicate param
patricktnast Jan 13, 2026
72d5704
nits
patricktnast Jan 14, 2026
7f1c7e5
rename callpattern
patricktnast Jan 14, 2026
232da87
add line number to extraction
patricktnast Jan 15, 2026
ab1a103
add test to ensure we can select correct line
patricktnast Jan 15, 2026
f68c810
use pipeline call as ex instead
patricktnast Jan 15, 2026
2a7244a
Merge branch 'pnast/feature/mic-6518-extraction' into pnast/feature/m…
patricktnast Jan 15, 2026
2a53ee9
Merge remote-tracking branch 'origin/main' into pnast/feature/mic-651…
patricktnast Jan 15, 2026
7a2705a
format
patricktnast Jan 15, 2026
5a103d7
updates
patricktnast Jan 15, 2026
134c980
format
patricktnast Jan 8, 2026
1b1b672
Merge branch 'pnast/feature/mic-6519-agg' into pnast/refactor/mic-637…
patricktnast Jan 15, 2026
b55733a
Merge branch 'pnast/refactor/mic-6373-plotting' of https://github.com…
patricktnast Jan 15, 2026
1b8da72
Merge remote-tracking branch 'origin/main' into pnast/refactor/mic-63…
patricktnast Jan 15, 2026
ff6e733
Merge branch 'pnast/refactor/mic-6373-plotting' into pnast/feature/mi…
patricktnast Jan 15, 2026
c16265e
Merge remote-tracking branch 'origin/main' into pnast/feature/mic-637…
patricktnast Jan 15, 2026
572005a
Merge branch 'pnast/feature/mic-6373-nb' into pnast/feature/pattern-c…
patricktnast Jan 15, 2026
08e82ff
Merge remote-tracking branch 'origin/main' into pnast/feature/pattern…
patricktnast Jan 15, 2026
9263d06
rename callpattern
patricktnast Jan 15, 2026
31eb746
adjust test
patricktnast Jan 15, 2026
b9efede
add name to tests
patricktnast Jan 16, 2026
57e1eba
Merge branch 'pnast/feature/pattern-config' into pnast/docs/mic-6522-…
patricktnast Jan 16, 2026
eac41ea
Merge remote-tracking branch 'origin/main' into pnast/docs/mic-6522-docs
patricktnast Jan 16, 2026
144 changes: 132 additions & 12 deletions README.rst
@@ -101,19 +101,139 @@ You'll find six directories inside the main
data or process outputs.


Profiling and Benchmarking
---------------------------

This repository provides tools for profiling and benchmarking Vivarium simulations
to analyze their performance characteristics. See the tutorials at
https://vivarium.readthedocs.io/en/latest/tutorials/running_a_simulation/index.html
and https://vivarium.readthedocs.io/en/latest/tutorials/exploration.html for general instructions
on running simulations with Vivarium.

Configuring Scaling Simulations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This repository includes a custom ``MultiComponentParser`` plugin that allows you to
easily create scaling simulations by defining multiple instances of diseases and risks
using a simplified YAML syntax.

To use the parser, add it to your model specification::

    plugins:
        required:
            component_configuration_parser:
                controller: "vivarium_profiling.plugins.parser.MultiComponentParser"

Then use the ``causes`` and ``risks`` multi-config blocks:

**Causes Configuration**

Define multiple disease instances with automatic numbering::

    components:
        causes:
            lower_respiratory_infections:
                number: 4        # Creates 4 disease instances
                duration: 28     # Disease duration in days
                observers: True  # Auto-create DiseaseObserver components

This creates components named ``lower_respiratory_infections_1``,
``lower_respiratory_infections_2``, etc., each with its own observer if enabled.
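
The numbering scheme can be sketched with a small Python snippet (a hypothetical helper for illustration, not the actual parser code):

```python
def expand_component_names(name: str, number: int) -> list[str]:
    # One component name per instance, suffixed _1 through _number,
    # mirroring the auto-numbering described above.
    return [f"{name}_{i}" for i in range(1, number + 1)]


# A 'number: 4' block under a cause yields four suffixed components.
names = expand_component_names("lower_respiratory_infections", 4)
print(names[0], names[-1])
# → lower_respiratory_infections_1 lower_respiratory_infections_4
```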

**Risks Configuration**

Define multiple risk instances and their effects on causes::

    components:
        risks:
            high_systolic_blood_pressure:
                number: 2
                observers: False  # Set False for continuous risks
                affected_causes:
                    lower_respiratory_infections:
                        effect_type: nonloglinear
                        measure: incidence_rate
                        number: 2  # Affects first 2 LRI instances

            unsafe_water_source:
                number: 2
                observers: True  # Set True for categorical risks
                affected_causes:
                    lower_respiratory_infections:
                        effect_type: loglinear
                        number: 2

See ``model_specifications/model_spec_scaling.yaml`` for a complete working example
of a scaling simulation configuration.


Running Benchmark Simulations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``profile_sim`` command profiles runtime and memory usage for a single run of a
vivarium model, given a model specification file. The model can be any vivarium-based
model, including the scaling simulations described above as well as models from a
separate repository. In addition to the standard simulation outputs, the command generates
profiling data from the selected profiler backend: by default, runtime profiling is performed
with ``cProfile``, but you can also use ``scalene`` for more detailed call stack analysis.

The ``run_benchmark`` command runs multiple iterations of one or more model specifications
and compares the results. It requires a baseline model (specified as ``model_spec_baseline.yaml``)
for comparison, plus any number of 'experiment' models to benchmark against the baseline, which can be passed via glob patterns.
The sample sizes for the baseline and experiment runs can be configured separately. The command aggregates
the profiling results and generates summary statistics and visualizations for a default set of important function calls
to help identify performance bottlenecks.
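
A typical invocation, taken from the usage notes in ``extraction_config_example.yaml``
(the model glob, run counts, and config path are illustrative)::

    run_benchmark -m models/*.yaml -r 10 -b 20 --extraction-config extraction_config.yaml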

The command creates a timestamped directory containing:

- ``benchmark_results.csv``: Raw profiling data for each run
- ``summary.csv``: Aggregated statistics (automatically generated)
- ``performance_analysis.png``: Performance charts (automatically generated)
- Additional analysis plots for runtime phases and bottlenecks


Analyzing Benchmark Results
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``summarize`` command processes benchmark results and creates visualizations.
This runs automatically after ``run_benchmark``, but can also be run manually
for custom analysis after the fact.

By default, this creates the following files in the specified output directory:

- ``summary.csv``: Aggregated statistics with mean, median, std, min, max
for all metrics, plus percent differences from baseline
- ``performance_analysis.png``: Runtime and memory usage comparison charts
- ``runtime_analysis_*.png``: Individual phase runtime charts (setup, run, etc.)
- ``bottleneck_fraction_*.png``: Bottleneck fraction scaling analysis

Passing the ``--nb`` flag also generates an interactive Jupyter notebook containing the
same default plots and summary dataframe, written to an ``analysis.ipynb`` file in the
output directory.
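
For example, to re-summarize existing results and generate the notebook (the paths are
illustrative; the positional argument follows the usage note in ``extraction_config_example.yaml``)::

    summarize results/profile_*/benchmark_results.csv --nb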


Customizing Result Extraction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, the benchmarking tools extract standard profiling metrics:

- Simulation phases: setup, initialize_simulants, run, finalize, report
- Common bottlenecks: gather_results, pipeline calls, population views
- Memory usage and total runtime

You can customize which metrics to extract by creating an extraction config YAML file.
See ``extraction_config_example.yaml`` for a complete annotated example.

**Basic Pattern Structure**::

    patterns:
        - name: my_function           # Logical name for the metric
          filename: my_module.py      # Source file containing the function
          function_name: my_function  # Function name to match
          extract_cumtime: true       # Extract cumulative time (default: true)
          extract_percall: false      # Extract time per call (default: false)
          extract_ncalls: false       # Extract number of calls (default: false)

This YAML file can then be passed to the ``run_benchmark`` and ``summarize`` commands
via the ``--extraction-config`` flag. ``summarize`` will automatically create runtime
analysis plots for the specified functions.
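
The column naming amounts to simple template substitution. A minimal sketch (hypothetical
helper, assuming the default templates listed in ``extraction_config_example.yaml``):

```python
# Default column-name templates, per the field reference in
# extraction_config_example.yaml.
DEFAULT_TEMPLATES = {
    "cumtime": "{name}_cumtime",
    "percall": "{name}_percall",
    "ncalls": "{name}_ncalls",
}


def column_name(name: str, metric: str, template: str = None) -> str:
    # Use the pattern-specific template if given, else the default
    # for that metric; '{name}' is replaced by the pattern's name.
    return (template or DEFAULT_TEMPLATES[metric]).format(name=name)


print(column_name("my_function", "cumtime"))           # my_function_cumtime
print(column_name("setup", "cumtime", "rt_{name}_s"))  # rt_setup_s
```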
145 changes: 145 additions & 0 deletions extraction_config_example.yaml
@@ -0,0 +1,145 @@
# Example Extraction Configuration
#
# This file demonstrates the YAML syntax for configuring extraction patterns
# that define which profiling metrics to extract from benchmark results.
#
# Usage:
# run_benchmark -m models/*.yaml -r 10 -b 20 --extraction-config extraction_config.yaml
# summarize results/profile_*/benchmark_results.csv --extraction-config extraction_config.yaml
#
# Each pattern defines:
# - What function to match in cProfile output (filename + function_name)
# - Which metrics to extract (cumtime, percall, ncalls) - all optional with defaults
# - How to name the resulting columns (templates) - optional with defaults
#
# Common use cases:
#
# 1. Bottlenecks: Extract all 3 metrics (cumtime, percall, ncalls)
# Use for performance hotspots you want to track in detail
#
# 2. Simulation phases: Extract only cumtime with "rt_{name}_s" column naming
# Use for high-level simulation phases
#
# 3. Custom patterns: Mix and match metrics and templates as needed

patterns:
  # ===========================================================================
  # BOTTLENECK PATTERNS
  # ===========================================================================
  # These extract cumtime, percall, and ncalls for detailed bottleneck analysis

  - name: gather_results
    filename: results/manager.py
    function_name: gather_results
    extract_cumtime: true
    extract_percall: true
    extract_ncalls: true
    # Generates columns: gather_results_cumtime, gather_results_percall, gather_results_ncalls

  - name: pipeline_call
    filename: values/pipeline.py
    function_name: __call__
    extract_cumtime: true
    extract_percall: true
    extract_ncalls: true
    # Generates columns: pipeline_call_cumtime, pipeline_call_percall, pipeline_call_ncalls

  - name: population_get
    filename: population/population_view.py
    function_name: get
    extract_cumtime: true
    extract_percall: true
    extract_ncalls: true
    # Generates columns: population_get_cumtime, population_get_percall, population_get_ncalls

  # ===========================================================================
  # SIMULATION PHASE PATTERNS
  # ===========================================================================
  # These extract only cumtime for high-level simulation phases
  # Uses a custom template for runtime column naming: rt_{name}_s

  - name: setup
    filename: /vivarium/framework/engine.py
    function_name: setup
    extract_cumtime: true
    cumtime_template: "rt_{name}_s"
    # extract_percall and extract_ncalls default to false
    # Generates column: rt_setup_s

  - name: initialize_simulants
    filename: /vivarium/framework/engine.py
    function_name: initialize_simulants
    extract_cumtime: true
    cumtime_template: "rt_{name}_s"
    # Generates column: rt_initialize_simulants_s

  - name: run
    filename: /vivarium/framework/engine.py
    function_name: run
    extract_cumtime: true
    cumtime_template: "rt_{name}_s"
    # Generates column: rt_run_s

  - name: finalize
    filename: /vivarium/framework/engine.py
    function_name: finalize
    extract_cumtime: true
    cumtime_template: "rt_{name}_s"
    # Generates column: rt_finalize_s

  - name: report
    filename: /vivarium/framework/engine.py
    function_name: report
    extract_cumtime: true
    cumtime_template: "rt_{name}_s"
    # Generates column: rt_report_s

  # ===========================================================================
  # CUSTOM PATTERN EXAMPLES
  # ===========================================================================

  # Example: Custom function with selective metric extraction and custom templates
  # - name: my_bottleneck
  #   filename: my/module.py
  #   function_name: my_function
  #   extract_cumtime: true
  #   extract_percall: true
  #   extract_ncalls: false
  #   cumtime_template: "{name}_total_time"
  #   percall_template: "{name}_avg_time"
  #   # Generates columns: my_bottleneck_total_time, my_bottleneck_avg_time

  # Example: Minimal pattern (only cumtime with default template)
  # - name: simple_func
  #   filename: simple/module.py
  #   function_name: simple_function
  #   # extract_cumtime defaults to true
  #   # extract_percall and extract_ncalls default to false
  #   # cumtime_template defaults to "{name}_cumtime"
  #   # Generates column: simple_func_cumtime

# ===========================================================================
# FIELD REFERENCE
# ===========================================================================
#
# Required fields:
# name: Logical name used in column name templates (e.g., "setup", "gather_results")
# filename: Path pattern to match the source file (can be partial path)
# function_name: Name of the function to match in cProfile output
#
# Optional fields (with defaults):
# extract_cumtime: Extract cumulative time (default: true)
# extract_percall: Extract time per call (default: false)
# extract_ncalls: Extract number of calls (default: false)
# cumtime_template: Column name template for cumtime (default: "{name}_cumtime")
# percall_template: Column name template for percall (default: "{name}_percall")
# ncalls_template: Column name template for ncalls (default: "{name}_ncalls")
#
# Template variables:
# {name}: Replaced with the pattern's name field
#
# File path matching:
# - Patterns match any path ending with the specified filename
# - Use forward slashes even on Windows
# - Special regex characters (., *, etc.) are automatically escaped
# - Example: "results/manager.py" matches "/full/path/to/results/manager.py"
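
The path-matching rules above can be approximated in Python (a sketch of the described
behavior, not the package's actual implementation):

```python
import re


def compile_filename_pattern(filename: str):
    # Escape regex metacharacters (e.g. '.') in the configured filename,
    # then anchor at end-of-string so any path ending with it matches.
    return re.compile(re.escape(filename) + r"$")


pattern = compile_filename_pattern("results/manager.py")
print(bool(pattern.search("/full/path/to/results/manager.py")))  # True
# Escaping keeps the '.' literal, so this near-miss does not match:
print(bool(pattern.search("/full/path/to/results/manager_py")))  # False
```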
2 changes: 2 additions & 0 deletions setup.py
@@ -54,6 +54,7 @@
"matplotlib",
"seaborn",
"scalene",
"nbformat>=5.0",
]

setup_requires = ["setuptools_scm"]
@@ -98,5 +99,6 @@
make_artifacts=vivarium_profiling.tools.cli:make_artifacts
run_benchmark=vivarium_profiling.tools.cli:run_benchmark
profile_sim=vivarium_profiling.tools.cli:profile_sim
summarize=vivarium_profiling.tools.cli:summarize
""",
)
6 changes: 6 additions & 0 deletions src/vivarium_profiling/templates/__init__.py
@@ -0,0 +1,6 @@
"""Templates for vivarium_profiling."""

from pathlib import Path

TEMPLATES_DIR = Path(__file__).parent
ANALYSIS_NOTEBOOK_TEMPLATE = TEMPLATES_DIR / "analysis_template.ipynb"