Skip to content

Draft: nrf_edgeai_obsv — input-feature stream + mel descriptor metrics#113

Open
rrusak wants to merge 11 commits into
mainfrom
feature/new_nrf_edgeai_obsv_metrics
Open

Draft: nrf_edgeai_obsv — input-feature stream + mel descriptor metrics#113
rrusak wants to merge 11 commits into
mainfrom
feature/new_nrf_edgeai_obsv_metrics

Conversation

@rrusak

@rrusak rrusak commented Jun 19, 2026

Copy link
Copy Markdown
Member

This PR extends the observability library from a single output (class-probability) stream to a two-stream design, so metrics can also be computed over the extracted model input features. Each metric declares a source (PROBS or FEATURES) in its descriptor, and the core routes an update only to metrics matching the fed stream (nrf_edgeai_obsv_update vs nrf_edgeai_obsv_update_features), keeping the two cadences independent. On top of this it adds two feature metrics for ww/kws mel spectrograms: mel_energy_desc (mean/max energy, dynamic range, floor ratio, normalized against configured [p01, p99] percentiles) and mel_spectral_desc (eight scale-invariant spectral-shape rows: band ratios, centroid, spread, entropy, flatness, contrast). All metrics remain self-contained descriptors with caller-provided storage and emit a row-major uint32 histogram snapshot for the existing CDR transport. Distribution-style metrics that live in a uniform [0, 1] range no longer store bin edges and bin values in O(1) via a shared helper, while the probability-distribution metric keeps custom edges. The change also hardens the hot path: a runtime cap prevents scratch[] overflow in release builds where assert() is compiled out, and the spectral metric clamps negative inputs to keep logf/sqrtf well-defined.

Raman Rusak added 7 commits June 18, 2026 17:14
…e a source; the core routes updates by it. Features go

via nrf_edgeai_obsv_update_features() (SOURCE_FEATURES), probabilities
via update() (SOURCE_PROBS). Add mel_energy_desc: mean/max/dynamic-range
/floor histograms over the mel vector, scaled by a configured [p01,p99].
@rrusak rrusak requested review from Szynkaa, grochu and mbarchie June 19, 2026 14:58
Comment thread tests/obsv/src/suite_probs_entropy_dist.c Fixed
Comment thread tests/obsv/src/suite_probs_top2_margin_dist.c Fixed
@github-actions

Copy link
Copy Markdown

You can find the documentation preview for this PR here.

rrusak and others added 2 commits June 22, 2026 12:26
… type with wide type in loop condition'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
… type with wide type in loop condition'

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@@ -100,6 +100,20 @@ int nrf_edgeai_obsv_reset(nrf_edgeai_obsv_ctx_t *ctx);
*/
int nrf_edgeai_obsv_update(nrf_edgeai_obsv_ctx_t *ctx, const float *probs);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename this function - e.g. nrf_edgeai_obsv_update_probs - while it's still not "production ready" and API stability is not crucial at this point. IMO it's not intuitive why regular "update" function updates probabilities, not features.

Comment on lines 169 to 177
enum nrf_edgeai_obsv_metric_id {
NRF_EDGEAI_OBSV_METRIC_ID_TRANSITION_MATRIX = 2,
NRF_EDGEAI_OBSV_METRIC_ID_PROBS_DISTRIBUTION = 3,
NRF_EDGEAI_OBSV_METRIC_ID_PREDICTION_SWITCHING_RATE = 4,
NRF_EDGEAI_OBSV_METRIC_ID_PROBS_ENTROPY_DIST = 5,
NRF_EDGEAI_OBSV_METRIC_ID_PROBS_TOP2_MARGIN_DIST = 6,
NRF_EDGEAI_OBSV_METRIC_ID_MEL_ENERGY_DESC = 7,
NRF_EDGEAI_OBSV_METRIC_ID_MEL_SPECTRAL_DESC = 8,
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you think about making some separation between probs-based and feature-based metrics? Like feature-based starting from e.g. 1000? Just a thought for your consideration.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on this matter, but since you will be the main maintainer of this library, I would rely more on your opinion :) .

@mbarchie mbarchie Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thoughts - I think this may be not the best idea ;) Starting from e.g. 1000 will consume 2 more bytes in CBOR. Also, we actually do not know how many metrics we'll end up with. And if end up with more than 2 types of streams (shot in the dark).

I think we can keep it as it is.

uint32_t *counts = msd_counts(hdr);
const uint8_t bins = hdr->bin_num;

assert(n <= hdr->num_features);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSD has only an assert (which will be compiled out in release builds) while MED limits n to num_features. I think behavior could be aligned across all the metrics if no clear reason for that.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MED limit features to n = CONFIG_NRF_EDGEAI_OBSV_MEL_ENERGY_DESC_MAX_FEATURES because it's create buffer on stack "float scratch[CONFIG_NRF_EDGEAI_OBSV_MEL_ENERGY_DESC_MAX_FEATURES];" so that it doesn't happen that we go beyond the limits of this buffer, and for MSD there is no such issue. In Kconfig there is CONFIG_NRF_EDGEAI_OBSV_MEL_ENERGY_DESC_MAX_FEATURES but no similar thing for MSD, for MSD only number of bins in distribution.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, then I'm fine with it

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every other nrf_obsv_metric_* file in this folder is an actual metric (has a *_create(), ops table, and metric_id), but this one is just shared binning helpers. The metric_ prefix reads like "this is a metric," and dist is easy to confuse with the real nrf_obsv_metric_probs_distribution.c.

Could we rename it to something like nrf_obsv_dist_binning (keeps the tie to the DIST_BINNING Kconfig) or nrf_obsv_binning? Location in metrics/ is fine since it's a private helper for the metrics.

Comment on lines +155 to +161
if ((p_ctx == NULL) || (p_feats == NULL)) {
return -EINVAL;
}

dispatch(p_ctx, NRF_EDGEAI_OBSV_SOURCE_FEATURES, p_feats, n);

return 0;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we need to keep number of feature extractions too? (like p_ctx->num_inferences)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we need this info in nrf_edgeai_obsv_model_info_t , for consistency and similarity with num_classes, I will change CDDL schema, what do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's a good idea

Comment on lines +125 to +129
if (vn < 0.0f) {
vn = 0.0f;
} else if (vn > 1.0f) {
vn = 1.0f;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for not using clip01() like in MSD?

Comment on lines +64 to +68
if (h_norm < 0.0f) {
h_norm = 0.0f;
} else if (h_norm > 1.0f) {
h_norm = 1.0f;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for not using clip01() like in MSD?

* mel descriptor metrics. The core dispatches an update to this metric only
* when the fed stream matches this value.
*/
uint8_t source;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field holds an enum value but is stored as a raw uint8_t, and dispatch() compares it as a plain uint8_t. Not a big deal, but so we lose the enum's "type safety", and there's no comment explaining the narrowing. Could we either declare it enum nrf_edgeai_obsv_source source or state that narrowing to uint8_t is intentional? (e.g. for struct packing - but here I think it does not help anyway)

Comment on lines +20 to +24
for (uint8_t b = 0; b < bin_num - 1; b++) {
if (val < edges[b]) {
return b;
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit - b is used for indexing, while _dist_uniform_edges uses i for the same thing. I know it was already like this, it just caught my eye now when moved to new file :)


(void)p_cfg;

memset(counts, 0, sizeof(uint32_t) * NRF_EDGEAI_OBSV_MED_NUM_ROWS * hdr->bin_num);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics in their init() functions could call clear() functions to de-duplicate "clearing" code. I think it applies to all the metrics currently implemented.

Raman Rusak added 2 commits June 23, 2026 14:46
…e_probs and update references, rename nrf_obsv_metric_dist to nrf_obsv_dist_binning
* @param priv Opaque per-instance storage pointer.
*/
void (*update)(const float *p_probs, uint16_t n, void *priv);
void (*update)(const float *p_data, uint16_t n, void *priv);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should replace single update interface with update_probs and update_features, then remove the source field. This lib was not released yet and it's experimental so no restrictions on breaking API

#. Register the metrics with the context using the :c:func:`nrf_edgeai_obsv_register` function.
#. Bind the Memfault transport once at application startup using the :c:func:`nrf_edgeai_obsv_memfault_init` function.
#. Call the :c:func:`nrf_edgeai_obsv_update` function after every inference.
#. Call the :c:func:`nrf_edgeai_obsv_update_probs` function with the output probability vector after every inference.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understood this is mandatory event if using only feature-source metrics. If that's true emphasize it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants