Agent primer — HDS data-model

This file orients future agents (Claude or others) working on the data-model repo. Read README.md for what the repo is; read this file for the design principles and conventions you need to hold in mind before adding or modifying items, streams, or eventTypes.

Always also read:

documentation/DESIGN-NOTES.md — item design principles (now includes scale hook placement).
documentation/SYMPTOMS.md, MOOD.md, CERVICAL-POSITION.md, MENSTRUAL-CYCLE.md, PHYSICAL-ACTIVITY.md, SKIN.md — per-domain design decisions and cross-system mappings.

Core architectural principles

1. Items are the vocabulary — domain-named, source-agnostic, reusable

Every item (body-weight, symptom-pain-headache, wellbeing-mood, nutrition-appetite) is a reusable clinical concept. Item keys use kebab-case domain/body-system names — never prefixed with an app, bridge, or questionnaire.

✅ function-mobility, symptom-pain-severity, wellbeing-mental-distress
❌ <bridge>-appetite, <pro>-mobility, <vendor>-pain, <questionnaire>-anxiety

The same clinical concept should have one HDS item, regardless of which app / bridge / form / PRO introduced it. Creating questionnaire-prefixed or source-prefixed copies forks the vocabulary and makes cross-source analytics impossible.
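
The naming rule can be approximated by a small lint. This is a hypothetical sketch, not code from the repo: `isDomainNamedKey`, `KNOWN_DOMAINS`, and `KEBAB_CASE` are illustrative names, and the domain list is an assumption assembled from the stream prefixes named in this primer.

```javascript
// Hypothetical lint, not part of the repo. The domain list is an assumption
// assembled from the stream prefixes named in this primer.
const KNOWN_DOMAINS = new Set([
  'body', 'symptom', 'wellbeing', 'activity', 'fertility',
  'nutrition', 'medication', 'profile', 'family', 'function'
]);

// Lowercase kebab-case with at least two segments: <domain>-<concept>.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)+$/;

function isDomainNamedKey(key) {
  // The first segment must be a clinical domain, never an app or questionnaire name.
  return KEBAB_CASE.test(key) && KNOWN_DOMAINS.has(key.split('-')[0]);
}
```

A key such as `function-mobility` passes, while a questionnaire-prefixed key is rejected because its first segment is not a clinical domain.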

2. One item = one `streamId` + one `eventType`

Enforced by the item loader (src/items.js:103). The pair streamId:eventType is the storage identity of an item and cannot collide across items.

Items may define variations.eventType (e.g. kg vs lb for body weight) — these are rendered as unit pickers, not as alternative items.
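
As an illustration of the storage-identity rule, a collision check might look like the sketch below. It is not the actual code in src/items.js: `findStorageCollisions` is a hypothetical helper, each item is assumed to expose `key`, `streamId`, and `eventType`, and the sample stream ids are placeholders.

```javascript
// Hypothetical sketch, not the loader's real implementation.
// Each item is assumed to expose { key, streamId, eventType }.
function findStorageCollisions(items) {
  const seen = new Map(); // "streamId:eventType" -> key of the first item seen
  const collisions = [];
  for (const item of items) {
    const id = `${item.streamId}:${item.eventType}`;
    if (seen.has(id)) {
      collisions.push({ id, keys: [seen.get(id), item.key] });
    } else {
      seen.set(id, item.key);
    }
  }
  return collisions;
}

// Placeholder items: the third one reuses the storage identity of the first.
const sampleItems = [
  { key: 'body-weight', streamId: 'body-weight', eventType: 'mass/kg' },
  { key: 'nutrition-appetite', streamId: 'nutrition-appetite', eventType: 'ratio/proportion' },
  { key: 'duplicate-weight', streamId: 'body-weight', eventType: 'mass/kg' }
];
const storageCollisions = findStorageCollisions(sampleItems);
```

Here `storageCollisions` holds one entry for `body-weight:mass/kg`, naming both offending item keys.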

3. EventTypes describe shape, not questions

Examples: ratio/proportion (0..1), activity/plain (boolean presence with duration), mass/kg, temperature/c, mood/5d-vectors, cervix-position/3d-vectors. Each describes units and a JSON Schema shape, not a clinical question.

The same eventType is reused by many items. Adding a new eventType is rare and needs a clear reason.

4. Streams are a clinical-domain tree

Top-level domains: body-*, symptom-*, wellbeing-*, activity-*, fertility-*, nutrition-*, medication-*, profile-*, family-*. The tree mirrors body systems and function domains, close to the SNOMED CT and ICF categorisations.

Do not create questionnaire-branded streams (e.g. questionnaire-eq5d5l). Each data point lands in its clinical-domain stream; the questionnaire's identity lives in the form template (a CollectorRequest constructed via hds-lib-js's appTemplates.CollectorRequest / CollectorSection), not in data-model.

5. Apps and health data live in different dimensions

App-contextual content (notes, chat, service messages) is declared in definitions/appStreams.yaml and attaches under each bridge's {appStreamId} at runtime. Health data — observations, measurements, PRO responses — lives in the main stream tree.

A bridge that captures health data (for example, a fertility-tracker bridge importing appetite → nutrition-appetite) does not get a copy of the item under its app stream. The appetite observation lands in nutrition-appetite, full stop.

6. Cross-method convertibility uses converter engines

When multiple observation methods measure the same underlying construct (e.g. 15 different cervical-fluid charting methods, multiple mood taxonomies), use type: convertible with a converter-engine block. Existing engines: converters/cervical-fluid/, converters/mood/. Both use a weighted N-dimensional vector space with Euclidean-distance matching.

This is not about questionnaires. It is about sources / methods of observation converging on a single normalized representation.
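
To make the matching idea concrete, here is a minimal sketch of weighted Euclidean-distance matching over named dimensions. It is not the code of the converter engines: `weightedDistance` and `closestLabel` are hypothetical helpers, and the weights and vocabulary vectors are made up for illustration (only the dimension names valence and arousal come from the mood design).

```javascript
// Sketch only: weights and vectors below are invented for illustration.
function weightedDistance(a, b, weights) {
  let sum = 0;
  for (const dim of Object.keys(weights)) {
    const diff = (a[dim] ?? 0) - (b[dim] ?? 0);
    sum += weights[dim] * diff * diff; // each dimension contributes w * diff^2
  }
  return Math.sqrt(sum);
}

// Returns the vocabulary entry closest to `vector`, with its distance.
function closestLabel(vector, vocabulary, weights) {
  let best = null;
  for (const [label, candidate] of Object.entries(vocabulary)) {
    const distance = weightedDistance(vector, candidate, weights);
    if (best === null || distance < best.distance) best = { label, distance };
  }
  return best;
}

const demoWeights = { valence: 2, arousal: 1 };
const demoVocabulary = {
  calm: { valence: 0.6, arousal: 0.2 },
  excited: { valence: 0.8, arousal: 0.9 },
  sad: { valence: 0.1, arousal: 0.3 }
};
const moodMatch = closestLabel({ valence: 0.7, arousal: 0.8 }, demoVocabulary, demoWeights);
```

With these made-up numbers the nearest entry is `excited`. A source method's label is converted by looking up its vector, then finding the nearest label in the target method's vocabulary.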

7. Wording lives in the item — form-level overrides are layered, not stored in data-model

item.label, item.description, and each option.label are the canonical, generic, reusable wording. They are rendered directly by readers and by hds-forms-js when no override is provided.

Form-specific wording (e.g. EuroQol's first-person sentences for EQ-5D-5L) lives in the form template, not in data-model. Templates carry per-itemKey overrides on the CollectorRequest.sections[].itemCustomizations[itemKey].labels bag (defined by appTemplates.ItemLabels in hds-lib). Renderers prefer the override; in their absence they fall back to the item's canonical wording.

This means: do not add questionnaire-specific wording to items. Add the generic wording to items here, and have the form template override it for that questionnaire's licensed phrasing.
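
The layering can be sketched as a small resolver. This is an assumption-laden illustration, not the hds-forms-js implementation: `resolveLabel` is a hypothetical helper, and the shape of the labels bag (a localized `label` object) is assumed rather than taken from `appTemplates.ItemLabels`.

```javascript
// Sketch: the shape of the labels bag is assumed, not taken from hds-lib.
function resolveLabel(item, section, lang = 'en') {
  // Prefer the form-level override for this itemKey when the template provides one.
  const override = section.itemCustomizations?.[item.key]?.labels?.label;
  const localized = override ?? item.label; // otherwise use the item's canonical wording
  return localized[lang] ?? localized.en;
}

const mobilityItem = { key: 'function-mobility', label: { en: 'Mobility', fr: 'Mobilité' } };
const plainSection = { itemCustomizations: {} };
const customSection = {
  itemCustomizations: {
    'function-mobility': { labels: { label: { en: 'Form-specific wording for mobility' } } }
  }
};

const canonicalWording = resolveLabel(mobilityItem, plainSection);
const overriddenWording = resolveLabel(mobilityItem, customSection);
```

The item definition is never modified: the override lives only in the section object that the template supplies.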

The `ratio/*` scale hook placement rule

When an item uses ratio/proportion (or any ratio/* eventType) with discrete select options, hook values must be chosen to align with the semantic anchors of established competing scales — not by evenly distributing N points across [0, 1].

See documentation/DESIGN-NOTES.md → "Scale hook placement" for the full rule, canonical placements, and anti-patterns.

Headline placements (memorise these):

Scale levels	Hooks	Covers
5-level severity	`0.0 / 0.25 / 0.5 / 0.75 / 1.0`	EQ-5D-5L, PROMIS 5-level, VRS-5, ICF, WHODAS
4-level severity	`0.0 / 0.25 / 0.5 / 1.0` (subset of 5-level, no `0.75`)	Apple HealthKit `HKCategoryValueSeverity`
3-level absolute	`0.25 / 0.5 / 1.0`	Mild/Moderate/Severe intake screeners
3-level relative	`0.25 / 0.5 / 0.75`	`wellbeing-sex-drive` (deviation from baseline, not absolute severity)
11-level NRS	`0.0 / 0.1 / 0.2 / … / 1.0`	Pain NRS; only 0.0, 0.5 and 1.0 coincide exactly with 5-level hooks

Anti-pattern: never use 0.0 / 0.33 / 0.66 / 1.0 for a 4-level scale. "Moderate" belongs at 0.5, not 0.66.
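
The headline placements can be written down as plain arrays, with a check that flags the evenly-distributed anti-pattern. This is a hypothetical lint derived only from the table above (`HOOKS` and `usesSemanticAnchors` are illustrative names); the full rule in documentation/DESIGN-NOTES.md may allow placements this check would reject, and it does not apply to the 11-level NRS.

```javascript
// Headline placements from the table above, as plain arrays.
const HOOKS = {
  severity5: [0.0, 0.25, 0.5, 0.75, 1.0],
  severity4: [0.0, 0.25, 0.5, 1.0],
  absolute3: [0.25, 0.5, 1.0],
  relative3: [0.25, 0.5, 0.75]
};

// Hypothetical lint for scales of up to 5 levels: every hook should be
// one of the 5-level anchors, so shorter scales stay subsets of it.
function usesSemanticAnchors(hooks) {
  return hooks.every((hook) => HOOKS.severity5.includes(hook));
}

const fourLevelOk = usesSemanticAnchors(HOOKS.severity4);
const evenlySpreadOk = usesSemanticAnchors([0.0, 0.33, 0.66, 1.0]);
```

`fourLevelOk` is true and `evenlySpreadOk` is false, because 0.33 and 0.66 are not anchors of the 5-level scale.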

Items, eventTypes, streams — when to add what

When adding a new item

Does the clinical concept already exist somewhere in definitions/items/? If yes, extend or reuse — do not duplicate.
Survey competing scales for the construct (PROMIS, ICF, SNOMED, LOINC, Apple HealthKit, FHIR questionnaires). Place your hooks at semantic anchors.
Pick the kebab-case domain key (<stream>-<concept>). No app / source / questionnaire prefixes.
Localise label, description, option labels with at minimum { en, fr }.
Add SNOMED CT / LOINC / ICF / FHIR references — per the cross-system compatibility principle. Use the existing hl7fhir: block pattern for FHIR mapping (see documentation/DESIGN-NOTES.md body-weight example).
Place the item in the correct clinical-domain stream. If no suitable stream exists, prefer a new child stream under an existing parent over a new top-level domain.
Run npm run build to verify validation passes.

When adding a new eventType

Default answer: don't. Try hard to reuse an existing eventType with item-level constraints on options / hooks. The eventType registry is intentionally small.

Reasons to add one:

A genuinely new shape (new JSON Schema structure) — e.g. multi-dimensional vector, composite object, new unit family.
Existing types cannot encode the data without semantic distortion.

Reasons not to add one:

"We want a different label" (labels belong on items/options, not eventTypes).
"We want a different range" (ranges belong on items via option hooks or min/max constraints).
"This questionnaire/app has its own concept" (the data still has a shape, and the shape is probably already defined).

When adding a new stream

New streams are cheap; adding children under existing parents is cheaper still.

New top-level stream → only for a genuinely new clinical domain (e.g. function-* for ICF Activities & Participation). Document its scope in a new documentation/<DOMAIN>.md.
New child under existing parent → normal case. Match the hierarchy style (flat vs nested) of siblings.

Reuse-first architecture — examples

EQ-5D-5L (Plan 44)

The EQ-5D-5L questionnaire has 5 dimensions + a VAS. None of them are EQ-5D-5L-specific:

EQ-5D-5L dimension	HDS item (reusable)	Clinical construct	Shared with
Mobility	`function-mobility`	ICF `d450` walking function	WHODAS, SF-36 PF, PROMIS Physical Function, Barthel
Self-care	`function-self-care`	ICF `d510`/`d540` personal care	Katz ADL, Barthel, SF-36, WHODAS
Usual activities	`function-usual-activities`	ICF `d630`/`d845`/`d920` role/participation	SF-36 Role, PROMIS Social Roles, WHODAS
Pain/Discomfort	`symptom-pain-severity`	Overall pain intensity (VRS-5 / NRS)	PROMIS Pain Intensity, SF-36 BP, BPI, NRS
Anxiety/Depression	`wellbeing-mental-distress`	Summary distress severity	K6/K10, HADS-Total, PROMIS Emotional Distress
EQ VAS	`wellbeing-self-rated-health`	Self-rated health (SRH)	EQ VAS, SF-36 GH, PROMIS Global01, NHIS/MEPS

The questionnaire's identity lives in the form template (a CollectorRequest built with appTemplates.CollectorRequest plus per-section itemCustomizations.labels for the EuroQol-licensed wording), not in the data model. When PROMIS Short Forms or SF-36 are added later, they reuse the same items via new templates.

External-source bridges (general pattern)

A bridge that ingests health data from a third-party app or device maps its source observations to generic HDS items, not to bridge-prefixed copies. Typical mappings:

Source's appetite / dietary observation → nutrition-appetite.
Source's cervical-fluid observation → body-vulva-mucus-inspect (via the cervical-fluid converter, which normalises across charting methods).
Source's mood / affect observation → wellbeing-mood (via the mood converter, which normalises across emotion vocabularies).

The bridge owns only its app streams (notes, chat, service messages) under its {appStreamId} — app-contextual content that has no HDS-native home. It does not own the health-data items themselves. See definitions/appStreams.yaml.
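
A bridge's mapping step might be sketched as a lookup from source fields to generic item keys. The source field names (`appetite`, `cervicalFluid`, `mood`) and the helper `hdsItemKeyFor` are hypothetical; only the HDS item keys come from the mappings listed above. Converter calls are left out.

```javascript
// Hypothetical source field names; the HDS item keys are the generic ones listed above.
const SOURCE_TO_HDS_ITEM = new Map([
  ['appetite', 'nutrition-appetite'],
  ['cervicalFluid', 'body-vulva-mucus-inspect'],
  ['mood', 'wellbeing-mood']
]);

function hdsItemKeyFor(sourceField) {
  const key = SOURCE_TO_HDS_ITEM.get(sourceField);
  if (key === undefined) {
    // Fail loudly instead of inventing a bridge-prefixed item.
    throw new Error(`No HDS mapping for source field "${sourceField}"`);
  }
  return key;
}

const appetiteItemKey = hdsItemKeyFor('appetite');
```

An unmapped field raises an error, which prompts a decision about which generic item it belongs to rather than silently creating a source-specific copy.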

Existing 5-level / 4-level scale interop

Apple HealthKit's 4-level HKCategoryValueSeverity maps as a subset of HDS 5-level:

HDS 5-level:   0.0 ─── 0.25 ─── 0.5 ─── 0.75 ─── 1.0
               None    Slight   Moderate  Severe   Very severe
Apple 4-level: 0.0 ─── 0.25 ─── 0.5 ─────────── 1.0
               NotPresent Mild   Moderate         Severe
                                                  (= Apple's ceiling)

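Closest-value matching is exact for all 4 Apple values and for 4 of the 5 HDS values. Only HDS "Severe" (0.75) has no Apple equivalent: it sits equidistant between Apple Moderate (0.5) and Apple Severe (1.0), so closest-value matching is ambiguous there. Pick a tie-break direction (recommended: upward, to Apple Severe 1.0) and document it in the bridge.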
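
The HDS-to-Apple direction can be sketched in two ways: nearest-hook matching with an explicit upward tie-break, and an explicit lookup table that encodes the same decision. `toAppleHook` and `HDS_TO_APPLE_LABEL` are hypothetical names, not part of any bridge; the labels are the Apple level names from the diagram above.

```javascript
const APPLE_HOOKS = [0.0, 0.25, 0.5, 1.0];

// Nearest Apple hook; on an exact tie, prefer the higher hook (upward tie-break).
function toAppleHook(value) {
  let best = APPLE_HOOKS[0];
  for (const hook of APPLE_HOOKS) {
    const distance = Math.abs(hook - value);
    const bestDistance = Math.abs(best - value);
    if (distance < bestDistance || (distance === bestDistance && hook > best)) {
      best = hook;
    }
  }
  return best;
}

// Explicit table for the 5 HDS hooks: the tie-break is visible, not implied.
const HDS_TO_APPLE_LABEL = new Map([
  [0.0, 'NotPresent'],
  [0.25, 'Mild'],
  [0.5, 'Moderate'],
  [0.75, 'Severe'], // tie-break: HDS "Severe" goes up to Apple's ceiling
  [1.0, 'Severe']
]);

const appleForSevere = toAppleHook(0.75);
```

`appleForSevere` is 1.0. The table variant is easier to review in a bridge, since the one lossy mapping is spelled out on its own line.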

Anti-patterns — do not

Do not create questionnaire-prefixed or source-prefixed items (e.g. <questionnaire>-mobility, <bridge>-appetite, <vendor>-pain). See §1.
Do not evenly distribute hook values for ratio/* selects (0 / 0.33 / 0.66 / 1.0 for 4 levels). Use semantic-anchor placement.
Do not rely on naive closest-value matching for cross-scale conversion at the event-type boundary. Use explicit label-lookup tables in bridges when the source scheme is known.
Do not create "administration record" / "questionnaire envelope" eventTypes for grouping events from a single form filling. The events' shared time property is the grouping key. Derived scores (like EQ-5D state code, utility index) are recomputed on the reader side.
Do not add a new eventType when item-level constraints on an existing type would do the job (see §"When adding a new eventType").
Do not stream-nest by questionnaire. Data points belong in their clinical-domain streams.
Do not bake questionnaire-specific wording (e.g. EuroQol's first-person sentences) into items. Items carry generic, reusable labels; form templates layer per-form overrides via itemCustomizations.labels (see §7).
Do not create a new top-level stream when a child under an existing parent would serve.
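
As a sketch of using the shared time property as the grouping key, a reader could regroup the events of one form filling as below. `groupByTime` is a hypothetical helper; events are assumed to be plain objects with a numeric `time`, and the stream ids and values are placeholders.

```javascript
// Sketch: events are assumed to be plain objects with a numeric `time`.
function groupByTime(events) {
  const groups = new Map(); // time -> events recorded at that time
  for (const event of events) {
    if (!groups.has(event.time)) groups.set(event.time, []);
    groups.get(event.time).push(event);
  }
  return groups;
}

// Placeholder events: the first two come from the same form filling.
const demoEvents = [
  { streamId: 'function-mobility', type: 'ratio/proportion', time: 1700000000, content: 0.25 },
  { streamId: 'symptom-pain-severity', type: 'ratio/proportion', time: 1700000000, content: 0.5 },
  { streamId: 'body-weight', type: 'mass/kg', time: 1700003600, content: 70 }
];
const eventGroups = groupByTime(demoEvents);
```

No envelope event is needed: derived scores can be computed from each group on the reader side.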

File / folder orientation

data-model/data-model/
├── README.md                              # What this repo is
├── AGENTS.md                              # (this file) primer for agents
├── CHANGELOG.md
├── package.json, eslint.config.mjs, etc.
│
├── definitions/                           # THE SOURCE OF TRUTH
│   ├── items/*.yaml                       # Health data point definitions (~73 items across ~11 YAML files)
│   ├── streams/*.yaml                     # Clinical-domain tree (~36 streams)
│   ├── eventTypes/
│   │   ├── eventTypes-hds.json            # Custom HDS event type JSON Schemas (~33)
│   │   └── eventTypes-legacy.json         # Standard Pryv measurement types (~200)
│   ├── converters/
│   │   ├── cervical-fluid/                # 9D vector converter (15+ charting methods)
│   │   └── mood/                          # 5D vector converter (5 methods)
│   ├── appStreams.yaml                    # App-contextual sub-streams (notes, chat)
│   ├── datasources/                       # External data source references
│   ├── conversions/                       # Unit conversions (mass, length, temperature)
│   ├── settings/settings.yaml             # HDS user-settings definitions
│   ├── hl7-defaults/category.yaml         # FHIR default categories
│   └── inputs.yaml                        # Input-type coercion map
│
├── src/
│   ├── items.js                           # Loader + checkItemVsEvenType validator
│   ├── streams.js                         # Stream loader
│   ├── eventTypes.js                      # EventType registry loader
│   ├── appStreams.js                      # App-stream loader
│   ├── converters.js                      # Converter loader
│   ├── datasources.js                     # Datasource loader
│   ├── conversions.js                     # Unit conversion loader
│   ├── settings.js                        # Settings loader
│   ├── build.js                           # Pack generation (writes dist/pack.json)
│   └── schemas/items.js                   # AJV JSON Schema for item YAML structure
│
├── documentation/                         # Design notes, per-domain references
│   ├── DESIGN-NOTES.md                    # Item design principles (+ scale hook placement)
│   ├── SYMPTOMS.md                        # Symptom domain, Apple HealthKit mapping, decisions
│   ├── MOOD.md                            # Mood converter design, circumplex/PAD model
│   ├── CERVICAL-POSITION.md               # 3D vector design
│   ├── MENSTRUAL-CYCLE.md                 # Cycle modeling
│   ├── PHYSICAL-ACTIVITY.md               # Activity items
│   └── SKIN.md                            # Skin observations
│
├── scripts/                               # setup / deploy shell scripts
├── tests/                                 # Vitest test suite
└── dist/                                  # Generated pack.json + gh-pages clone (git-managed)

Cross-system reference library

Commonly-cited scales / terminologies for mapping new items:

Severity / intensity

ICF qualifier (WHO) — 5 levels (0–4): No problem / Mild / Moderate / Severe / Complete.
Apple HealthKit HKCategoryValueSeverity — 4 levels: NotPresent / Mild / Moderate / Severe.
PROMIS Short Forms — many 5-level Likerts (Physical Function, Pain Intensity, Emotional Distress – Anxiety, Emotional Distress – Depression, etc.).
VRS-5 / VRS-4 — verbal rating scales for pain and other severities.
NRS 0–10 — numeric rating scale, clinical pain gold standard (LOINC 72514-3).

Function / disability

ICF d-codes — d450 walking, d510 washing, d540 dressing, d630-d649 household, d845-d859 work, d920 recreation.
WHODAS 2.0 — 6 domains, 5-level Likert.
Katz ADL, Barthel Index — binary/ordinal per-item ADL assessment.
SF-36 PF / RE / RP — physical function and role functioning subscales.

Mental health / distress

Kessler K6 / K10 — 6/10 items, combined psychological distress.
HADS — 14 items, split anxiety (7) + depression (7).
PHQ-2 / PHQ-9 — depression (LOINC 75275-8 for PHQ-9 total).
GAD-2 / GAD-7 — anxiety (LOINC 69737-5 for GAD-7 total).
WHO-5 — positive wellbeing.
DASS-21 — depression / anxiety / stress.
Apple HealthKit HKStateOfMind (iOS 17+) — valence + 38 emotion labels.

Self-rated health

EQ VAS — 0–100 mm (EuroQol).
SF-1 / SF-36 GH item 1 / PROMIS Global01 — 5-level (Excellent / Very good / Good / Fair / Poor).
NHIS / MEPS / ESS / SILC — national health surveys, 5-level SRH question.
SNOMED 425058002; LOINC 72006-0, 32622-6, 82589-3.

Terminologies (codes to include in item references)

SNOMED CT — https://bioportal.bioontology.org/ontologies/SNOMEDCT
LOINC — https://loinc.org/
ICD-10 / ICD-11 — WHO
ICF — WHO
ATC (medications) — WHO

`deprecated: true` on items

An item may carry deprecated: true (boolean, optional) to signal that it is kept in the data-model — so existing events keep validating and rendering — but is not used to create new data points. Plan 50 formalizes this contract.

Rules for authors:

Set deprecated: true on the item to mark it. Do not delete the item: existing events on disk still need its streamId / eventType / type to be resolvable by readers.
Add a description: line explaining why deprecated and what to use instead. (Failure mode: silently-deprecated items leave consumers guessing.)
Bump the item's version if you also adjust other fields, but the deprecated: flag itself is independent of version.

Rules for consumers (hds-lib, hds-forms-js, hds-webapp, doctor-dashboard, app-data-model-browser):

Default listings hide deprecated items. Form-builder item browsers, item pickers, ribbon "+" sheets, and the data-model browser must filter deprecated:true out of their UI lists by default (opt-in "show deprecated" toggle is acceptable for inspector tools).
Reading still works. getHDSModel().itemsDefs.forKey(key) and forEvent(event) MUST still return deprecated items so existing events render correctly. The filter is at the discovery / picker layer only, never at the resolution layer.
Writing is discouraged but not blocked. Bridges still actively writing to deprecated streams keep working until they migrate. The validator does not reject events.create against deprecated items — that's a migration concern, not a data-integrity one.

Schema: the deprecated field is recognised in src/schemas/items.js and round-trips through pack.json.
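
The split between discovery and resolution in the consumer rules can be sketched as two separate functions. These are hypothetical helpers over a plain array, not the `getHDSModel()` API, and the item keys are placeholders.

```javascript
// Placeholder items; `deprecated` is optional and defaults to "not deprecated".
const demoItemDefs = [
  { key: 'old-item', deprecated: true, description: 'Deprecated: use new-item instead.' },
  { key: 'new-item' }
];

// Discovery layer: hide deprecated items unless the caller opts in.
function listForPicker(items, { showDeprecated = false } = {}) {
  return items.filter((item) => showDeprecated || item.deprecated !== true);
}

// Resolution layer: never filters, so existing events still render.
function resolveByKey(items, key) {
  return items.find((item) => item.key === key);
}

const pickerKeys = listForPicker(demoItemDefs).map((item) => item.key);
const resolvedOld = resolveByKey(demoItemDefs, 'old-item');
```

`pickerKeys` contains only `new-item`, while `resolvedOld` is still found, which is the behaviour the consumer rules require.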

Existing design decisions (load these before proposing related changes)

From documentation/SYMPTOMS.md (2026-03-17):

Acne dual-source — when a bridge exposes acne in both a skin field and a generic symptoms field, both map to the existing body-skin-acne item (reuse). Single item regardless of source.
Anxiety/Stress in symptom arrays — wellbeing-mood (5D vectors) captures transient mood tags. Validated distress severity ratings from PROs (K6/K10, HADS, GAD/PHQ, etc.) live in wellbeing-mental-distress / -anxiety / -depression (added in v1.3.0 alongside EQ-5D-5L). The two are different constructs and coexist.
Increased appetite — modeled as nutrition-appetite (ratio/proportion, 3 hooks).
Weight fluctuations — not a symptom; derived from body-weight.
Stream hierarchy — full clinical sub-categories under symptom/.

From documentation/MOOD.md (Plan 24, 2026-03-23):

5D vector space: valence / arousal / dominance / socialOrientation / temporalFocus.
Converter engine: weighted Euclidean distance.
Methods supported: multiple external vocabularies (Apple HealthKit, Daylio, How We Feel, third-party bridges) plus a direct _raw virtual method.

Workflow expectations

Work on a feature branch (feature/<desc> or similar), never directly on main.
Run npm run build after YAML changes — the loader validates schema, stream/eventType references, and item↔eventType compatibility.
Before committing changes touching package.json: grep '"file:' package.json (local-file dependencies must be replaced with git URLs before committing).
CHANGELOG.md lives at the root; update under [Unreleased] when adding items / streams / eventTypes.
The dist/ folder is a clone of the gh-pages branch, which deploys to model.datasafe.dev.

Cross-repo relationships

This repo is the source of the HDS vocabulary. Other public repos consume pack.json:

hds-lib-js — fetches pack.json via the platform service-info assets.hds-model URL, exposes items / streams / eventTypes / converters as HDSModel. See its AGENTS.md.
hds-forms-js — React renderers for HDSItemDef. Implements the slider, select, composite, convertible, datasource-search, etc. field types declared on items here.
app-data-model-browser — small public viewer for the deployed pack.json.

When a change here breaks a consumer's typechecks (e.g. removing an option, renaming an item key), bump version in package.json, document under [Unreleased] in CHANGELOG.md, and notify the consumer repos.

This is a living document: extend it whenever you uncover a non-obvious convention or an architectural decision worth memorising. The most recent entry in CHANGELOG.md is the canonical record of what shipped; this file captures the why behind the rules.
