Define stdlib component coverage matrix for unit, e2e, and benchmarks #1088

## Problem

Ongoing atomic runtime benchmark work keeps surfacing the same structural question: how should Neva split responsibility between `unit`, `e2e`, and `benchmark` coverage for stdlib components and Go-backed runtime funcs?

Right now the pressure comes from a good place:

- we want broad atomic benchmark coverage for runtime hot paths;
- we also want behavior assertions, not only perf numbers;
- and for Go-backed runtime funcs we often need lower-level invariant checks that are awkward or too indirect at full e2e level.

Without an explicit policy, it is too easy either to duplicate effort mechanically or to leave important gaps between correctness coverage and performance coverage.

## Goal

Define a practical coverage matrix for stdlib components and runtime funcs that answers:

- what must be covered with `e2e`;
- what must be covered with focused Go `unit` tests;
- what should have dedicated `benchmarks`;
- where fixture/build/run helpers may be shared without collapsing distinct test types into one artifact.

## Working direction

### 1. Keep `benchmarks` and `e2e` separate

They can and should reuse common helpers or fixture shapes where useful, but they should **not** be merged 1:1 into a single artifact by default.

Reason:

- `e2e` exists to assert behavior and invariants.
- `benchmarks` exist to produce stable perf signal with minimal assertion noise.
- Combining them mechanically makes both signals worse.
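
As a rough illustration of sharing a helper without merging the two artifacts, here is a minimal Go sketch. Everything in it is hypothetical: `fixture`, `newSumFixture`, and `run` are stand-ins invented for this example (a real harness would presumably build and run a Neva program), and none of these names come from `pkg/e2e`.

```go
package main

import "testing"

// fixture is a hypothetical description of one scenario; in a real harness it
// would more likely point at a Neva program to build and run.
type fixture struct {
	name  string
	input []int
}

// newSumFixture is the shared helper: both test types build the same scenario.
func newSumFixture() fixture {
	return fixture{name: "sum", input: []int{1, 2, 3}}
}

// run is a hypothetical stand-in for "build and run the program".
func run(f fixture) int {
	total := 0
	for _, v := range f.input {
		total += v
	}
	return total
}

// TestSum is the e2e-style artifact: it asserts behavior.
func TestSum(t *testing.T) {
	if got := run(newSumFixture()); got != 6 {
		t.Fatalf("sum: got %d, want 6", got)
	}
}

// sink keeps the compiler from discarding the benchmarked call.
var sink int

// BenchmarkSum is the benchmark artifact: setup happens outside the timed
// loop, and there are no assertions inside it.
func BenchmarkSum(b *testing.B) {
	f := newSumFixture()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = run(f)
	}
}
```

The two functions share only the fixture constructor, so either can change its scenario set or its measurement setup without affecting the other.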

### 2. Aim for eventual e2e coverage for every stdlib component

Target direction:

- every stdlib component should eventually have e2e coverage;
- start breadth-first with stateless / builtin-like components;
- then expand toward more stateful, IO-heavy, and runtime-backed components.

These tests should focus on user-visible invariants and representative scenarios, not Go-internal implementation details.

### 3. Require unit tests for Go-backed runtime funcs

For components implemented as runtime funcs in Go, focused unit tests should be the default place for:

- edge cases;
- normalization / bounds behavior;
- copy semantics;
- error semantics;
- concurrency-sensitive invariants where it is feasible to test them directly.

This gives faster and more precise feedback than forcing every such check through compiler+runtime e2e overhead.
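
A table-driven test is one plausible shape for these checks. The sketch below uses a hypothetical `sliceInts` func written only for this example; its normalization rules (negative indices count from the end, out-of-range indices are clamped, an inverted range is an error) are assumptions and do not describe the actual Neva slice builtin.

```go
package main

import (
	"errors"
	"reflect"
	"testing"
)

// errInvertedRange is a hypothetical error for from > to after normalization.
var errInvertedRange = errors.New("slice: from > to")

// sliceInts is a hypothetical runtime func: it normalizes both bounds and
// returns a copy of src[from:to], never a view into src.
func sliceInts(src []int, from, to int) ([]int, error) {
	n := len(src)
	norm := func(i int) int {
		if i < 0 {
			i += n // negative indices count from the end
		}
		if i < 0 {
			return 0
		}
		if i > n {
			return n
		}
		return i
	}
	from, to = norm(from), norm(to)
	if from > to {
		return nil, errInvertedRange
	}
	out := make([]int, to-from)
	copy(out, src[from:to])
	return out, nil
}

func TestSliceInts(t *testing.T) {
	src := []int{10, 20, 30, 40}
	cases := []struct {
		name     string
		from, to int
		want     []int
		wantErr  error
	}{
		{"plain", 1, 3, []int{20, 30}, nil},
		{"negative from", -2, 4, []int{30, 40}, nil},
		{"clamped to", 0, 99, []int{10, 20, 30, 40}, nil},
		{"empty", 2, 2, []int{}, nil},
		{"inverted", 3, 1, nil, errInvertedRange},
	}
	for _, tc := range cases {
		tc := tc
		t.Run(tc.name, func(t *testing.T) {
			got, err := sliceInts(src, tc.from, tc.to)
			if !errors.Is(err, tc.wantErr) {
				t.Fatalf("err: got %v, want %v", err, tc.wantErr)
			}
			if !reflect.DeepEqual(got, tc.want) {
				t.Fatalf("got %v, want %v", got, tc.want)
			}
		})
	}

	// Copy semantics: mutating the result must not touch the input.
	got, _ := sliceInts(src, 0, 2)
	got[0] = -1
	if src[0] != 10 {
		t.Fatalf("input was mutated: %v", src)
	}
}
```

Each row covers one category from the list above (bounds, normalization, error semantics), and the trailing check covers copy semantics, which would be hard to observe through a full e2e run.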

### 4. Keep benchmarks selective

Benchmarks should be required only where they provide real signal:

- atomic hot paths;
- runtime-sensitive stdlib components;
- perf baselines needed for optimization work.

They should **not** be treated as mandatory duplicates of every e2e scenario.

## Deliverables

- A small coverage matrix by component/runtime-func category.
- Clear defaults for when to add `unit`, `e2e`, and `benchmark` coverage.
- Guidance for helper reuse between `pkg/e2e` and benchmark harnesses.
- An initial rollout order for the backlog.
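
To make the intended defaults concrete, the working direction above could be encoded roughly as follows. This is only a sketch of one possible reading: the `Component` fields, `RequiredCoverage`, and `RolloutOrder` are hypothetical names, and the real matrix may need finer categories.

```go
package main

import "sort"

// Component is a hypothetical description of a stdlib component.
type Component struct {
	Name     string
	GoBacked bool // implemented as a runtime func in Go
	HotPath  bool // atomic hot path or otherwise runtime-sensitive
	Stateful bool // stateful or IO-heavy
}

// Coverage lists which test types a component should have by default.
type Coverage struct {
	Unit, E2E, Benchmark bool
}

// RequiredCoverage applies the defaults from the working direction:
// e2e for everything (eventually), unit for Go-backed runtime funcs,
// benchmarks only where they provide real signal.
func RequiredCoverage(c Component) Coverage {
	return Coverage{
		E2E:       true,
		Unit:      c.GoBacked,
		Benchmark: c.HotPath,
	}
}

// RolloutOrder returns a copy sorted so that stateless components come first,
// keeping the original relative order within each group.
func RolloutOrder(cs []Component) []Component {
	out := append([]Component(nil), cs...)
	sort.SliceStable(out, func(i, j int) bool {
		return !out[i].Stateful && out[j].Stateful
	})
	return out
}
```

A table generated from `RequiredCoverage` over the component list would give the matrix, and `RolloutOrder` would give a first cut of the backlog order.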

## Related work

- #1015 Performance observability gaps (benchmarks, profiling, leak/race detection)
- #1023 Add runtime benchmark baseline before Msg redesign
- #1004 Redesign runtime `Msg`
- #1067 Improve Neva performance using goperf.dev as a reference
- #1086 runtime: add missing builtin neg/slice funcs and complete atomic builtin breadth

