`src/apps/` contains BitFly-managed application overlays that are synced into `ara/apps/`. This tree is the software-side definition of the workload matrix used for correctness checks and paper-style benchmarking.
Use this directory to answer three questions:
- which apps represent the proposed BMPMM path versus the RVV baseline
- how one app maps to one workload slice
- where shared generator and template logic should live
| Category | Directories | Role |
|---|---|---|
| proposed benchmark path | `bmpmm_*` | BMPMM-based implementation under evaluation |
| baseline benchmark path | `rvv_*` | RVV implementation used for comparison |
| correctness regression | `bmpu_verify` | focused validation of BMPU packing and low-bit execution behavior |
| shared infrastructure | `common` | generators, case definitions, templates, and common helpers |
| separate inference experiment | `llama2` | exploratory inference flow outside the main benchmark matrix |
For model-split evaluation, one benchmark app corresponds to one workload slice:
`<implementation>_<precision>_<model>`
Examples:
- `bmpmm_binary_gemma3_270m`
- `bmpmm_INT2_opt_13b`
- `rvv_INT4_qwen25_15b`
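As a sketch of how downstream tooling might consume that naming contract, the fields decompose mechanically. The `parse_app_name` helper below is hypothetical, not a repository function, and it assumes that only the model field may itself contain underscores (e.g. `gemma3_270m`):

```python
# Hypothetical helper: split a model-split app name into its three fields.
# Assumes <implementation>_<precision>_<model>, where only the model part
# may contain further underscores.

def parse_app_name(name: str) -> dict:
    implementation, precision, model = name.split("_", 2)
    return {"implementation": implementation, "precision": precision, "model": model}

print(parse_app_name("bmpmm_INT2_gemma3_270m"))
# {'implementation': 'bmpmm', 'precision': 'INT2', 'model': 'gemma3_270m'}
```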
This structure keeps:
- generated tensors scoped to one workload slice
- simulator logs scoped to one app
- runtime summaries straightforward to aggregate into paper figures
The main comparison is always `bmpmm_*` as the proposed path versus `rvv_*` as the baseline path, under the same model-derived shape set.
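Because both paths share the `<precision>_<model>` suffix, the baseline counterpart of any proposed app can be derived by swapping the implementation prefix. A minimal sketch under that assumption; `baseline_counterpart` is an illustrative name, not a repository helper:

```python
# Hypothetical: map a proposed-path app to its baseline counterpart by
# swapping the implementation prefix and keeping precision and model fixed.

def baseline_counterpart(app: str) -> str:
    assert app.startswith("bmpmm_"), "expected a proposed-path app"
    return "rvv_" + app[len("bmpmm_"):]

print(baseline_counterpart("bmpmm_INT4_qwen25_15b"))  # rvv_INT4_qwen25_15b
```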
| Form | Example | Intended Use |
|---|---|---|
| generic | `bmpmm_INT2`, `rvv_binary` | short regression, bring-up, or local debugging |
| model-split | `bmpmm_INT2_gemma3_270m` | formal benchmark campaigns and reported comparisons |
The batch runner primarily targets the model-split apps.
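How the runner actually selects those apps is defined by its own script; the following is a hedged sketch of one possible filter, assuming any `bmpmm_*` or `rvv_*` directory whose name carries a model suffix beyond `<implementation>_<precision>` counts as model-split:

```python
from pathlib import Path

# Hypothetical filter: treat an app as model-split if its name has a model
# suffix after the <implementation>_<precision> prefix. The two-underscore
# heuristic is an assumption, not a rule enforced by the repository.

def model_split_apps(apps_root: str) -> list[str]:
    apps = []
    for entry in sorted(Path(apps_root).iterdir()):
        if not entry.is_dir():
            continue
        name = entry.name
        if name.startswith(("bmpmm_", "rvv_")) and name.count("_") >= 2:
            apps.append(name)
    return apps
```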
For model-split apps, the directory name itself is part of the experiment contract:
`<implementation>_<precision>_<model>`
That name should stay aligned with:
- the generator inputs
- the emitted tensors under `kernel/`
- the reporting keys written into benchmark summaries
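One way to keep that alignment from eroding is a small lint over the summaries. In the sketch below, the `summary.json` filename and its `implementation`/`precision`/`model` keys are assumptions about the reporting schema, shown only to illustrate the check:

```python
import json
from pathlib import Path

# Hypothetical lint: verify an app's summary keys agree with its directory
# name. The summary path and key names are assumed for illustration, and the
# split mirrors the <implementation>_<precision>_<model> naming contract.

def check_name_alignment(app_dir: str) -> bool:
    name = Path(app_dir).name
    implementation, precision, model = name.split("_", 2)
    summary = json.loads((Path(app_dir) / "summary.json").read_text())
    return (summary.get("implementation") == implementation
            and summary.get("precision") == precision
            and summary.get("model") == model)
```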
Most benchmark app directories contain:
- `main.c`: app entry point and case-level logging
- `kernel/`: implementation code plus generated tensors and case metadata
- `tests.c` / `tests.h`: local helpers or validation logic where needed
- `script/gen_data.py`: app-specific data generator
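A minimal layout lint against that list might look as follows; the helper itself is hypothetical, and `tests.c`/`tests.h` are treated as optional because the text describes them as present only where needed:

```python
from pathlib import Path

# Hypothetical layout check for a benchmark app directory. tests.c/tests.h
# are skipped because they exist only where an app needs them.

def check_app_layout(app_dir: str) -> list[str]:
    root = Path(app_dir)
    missing = []
    if not (root / "main.c").is_file():
        missing.append("main.c")
    if not (root / "kernel").is_dir():
        missing.append("kernel/")
    if not (root / "script" / "gen_data.py").is_file():
        missing.append("script/gen_data.py")
    return missing  # empty list means the expected layout is present
```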
| Change Type | Edit Location |
|---|---|
| shared case-selection or generator policy | `common/` |
| one app's workload tensor generation | that app's `script/gen_data.py` |
| one app's kernel implementation | that app's `kernel/` |
| one app's logging or control flow | that app's `main.c` |
| correctness-oriented checks for BMPU behavior | `bmpu_verify/` |
| local inference experiments outside the benchmark matrix | `llama2/` |
- `bmpu_verify` is for correctness, not paper-performance reporting.
- `bmpmm_*` versus `rvv_*` is the main reported comparison.
- `llama2/` is not the same thing as the benchmark matrix used by `run_model_split_apps.sh`.
- `common/` is the right place to factor out behavior shared across multiple apps.
When you introduce a new maintained workflow under `src/apps/`, document it at the shared-tree level first unless the workflow is truly local to one app directory; this keeps the many repetitive benchmark directories from drifting out of sync.
Those repetitive benchmark app directories are deliberately covered by the shared rules here rather than by duplicating boilerplate README files per app.