This file orients future agents (Claude or others) working on the data-model repo. Read README.md for what the repo is; read this file for the design principles and conventions you need to hold in mind before adding or modifying items, streams, or eventTypes.
Always also read:
documentation/DESIGN-NOTES.md— item design principles (now includes scale hook placement).documentation/SYMPTOMS.md,MOOD.md,CERVICAL-POSITION.md,MENSTRUAL-CYCLE.md,PHYSICAL-ACTIVITY.md,SKIN.md— per-domain design decisions and cross-system mappings.
Every item (body-weight, symptom-pain-headache, wellbeing-mood, nutrition-appetite) is a reusable clinical concept. Item keys use kebab-case domain/body-system names — never prefixed with an app, bridge, or questionnaire.
- ✅
function-mobility,symptom-pain-severity,wellbeing-mental-distress - ❌
<bridge>-appetite,<pro>-mobility,<vendor>-pain,<questionnaire>-anxiety
The same clinical concept should have one HDS item, regardless of which app / bridge / form / PRO introduced it. Creating questionnaire-prefixed or source-prefixed copies forks the vocabulary and makes cross-source analytics impossible.
Enforced by the item loader (src/items.js:103). The pair streamId:eventType is the storage identity of an item and cannot collide across items.
Items may define variations.eventType (e.g. kg vs lb for body weight) — these are rendered as unit pickers, not as alternative items.
ratio/proportion (0..1), activity/plain (boolean presence with duration), mass/kg, temperature/c, mood/5d-vectors, cervix-position/3d-vectors. They describe units and JSON Schema shape — not clinical questions.
The same eventType is reused by many items. Adding a new eventType is rare and needs a clear reason.
body-*, symptom-*, wellbeing-*, activity-*, fertility-*, nutrition-*, medication-*, profile-*, family-*. Mirrors body systems / function domains, close to SNOMED CT and ICF categorisations.
Do not create questionnaire-branded streams (e.g. questionnaire-eq5d5l). Each data point lands in its clinical-domain stream; the questionnaire's identity lives in the form template (a CollectorRequest constructed via hds-lib-js's appTemplates.CollectorRequest / CollectorSection), not in data-model.
App-contextual content (notes, chat, service messages) is declared in definitions/appStreams.yaml and attaches under each bridge's {appStreamId} at runtime. Health data — observations, measurements, PRO responses — lives in the main stream tree.
A bridge that captures health data (for example, a fertility-tracker bridge importing appetite → nutrition-appetite) does not get a copy of the item under its app stream. The appetite observation lands in nutrition-appetite, full stop.
When multiple observation methods measure the same underlying construct (e.g. 15 different cervical-fluid charting methods, multiple mood taxonomies), use type: convertible with a converter-engine block. Existing engines: converters/cervical-fluid/, converters/mood/. Both use a weighted N-dimensional vector space with Euclidean-distance matching.
This is not about questionnaires. It is about sources / methods of observation converging on a single normalized representation.
item.label, item.description, and each option.label are the canonical, generic, reusable wording. They are rendered directly by readers and by hds-forms-js when no override is provided.
Form-specific wording (e.g. EuroQol's first-person sentences for EQ-5D-5L) lives in the form template, not in data-model. Templates carry per-itemKey overrides on the CollectorRequest.sections[].itemCustomizations[itemKey].labels bag (defined by appTemplates.ItemLabels in hds-lib). Renderers prefer the override; in their absence they fall back to the item's canonical wording.
This means: do not add questionnaire-specific wording to items. Add the generic wording to items here, and have the form template override it for that questionnaire's licensed phrasing.
When an item uses ratio/proportion (or any ratio/* eventType) with discrete select options, hook values must be chosen to align with the semantic anchors of established competing scales — not by evenly distributing N points across [0, 1].
See documentation/DESIGN-NOTES.md → "Scale hook placement" for the full rule, canonical placements, and anti-patterns.
Headline placements (memorise these):
| Scale levels | Hooks | Covers |
|---|---|---|
| 5-level severity | 0.0 / 0.25 / 0.5 / 0.75 / 1.0 |
EQ-5D-5L, PROMIS 5-level, VRS-5, ICF, WHODAS |
| 4-level severity | 0.0 / 0.25 / 0.5 / 1.0 (subset of 5-level, no 0.75) |
Apple HealthKit HKCategoryValueSeverity |
| 3-level absolute | 0.25 / 0.5 / 1.0 |
Mild/Moderate/Severe intake screeners |
| 3-level relative | 0.25 / 0.5 / 0.75 |
wellbeing-sex-drive (deviation from baseline, not absolute severity) |
| 11-level NRS | 0.0 / 0.1 / 0.2 / … / 1.0 |
Pain NRS — every other hook aligns with 5-level |
Anti-pattern: never use 0.0 / 0.33 / 0.66 / 1.0 for a 4-level scale. "Moderate" belongs at 0.5, not 0.66.
- Does the clinical concept already exist somewhere in
definitions/items/? If yes, extend or reuse — do not duplicate. - Survey competing scales for the construct (PROMIS, ICF, SNOMED, LOINC, Apple HealthKit, FHIR questionnaires). Place your hooks at semantic anchors.
- Pick the kebab-case domain key (
<stream>-<concept>). No app / source / questionnaire prefixes. - Localise
label,description, optionlabels with at minimum{ en, fr }. - Add SNOMED CT / LOINC / ICF / FHIR references — per the cross-system compatibility principle. Use the existing
hl7fhir:block pattern for FHIR mapping (seedocumentation/DESIGN-NOTES.mdbody-weight example). - Place the item in the correct clinical-domain stream. Create a new child stream under an existing parent before creating a new top-level domain.
- Run
npm run buildto verify validation passes.
Default answer: don't. Try hard to reuse an existing eventType with item-level constraints on options / hooks. The eventType registry is intentionally small.
Reasons to add one:
- A genuinely new shape (new JSON Schema structure) — e.g. multi-dimensional vector, composite object, new unit family.
- Existing types cannot encode the data without semantic distortion.
Reasons not to add one:
- "We want a different label" (labels belong on items/options, not eventTypes).
- "We want a different range" (ranges belong on items via option hooks or min/max constraints).
- "This questionnaire/app has its own concept" (the data still has a shape, and the shape is probably already defined).
New streams are cheap; adding children under existing parents is cheaper still.
- New top-level stream → only for a genuinely new clinical domain (e.g.
function-*for ICF Activities & Participation). Document its scope in a newdocumentation/<DOMAIN>.md. - New child under existing parent → normal case. Match the hierarchy style (flat vs nested) of siblings.
The EQ-5D-5L questionnaire has 5 dimensions + a VAS. None of them are EQ-5D-5L-specific:
| EQ-5D-5L dimension | HDS item (reusable) | Clinical construct | Shared with |
|---|---|---|---|
| Mobility | function-mobility |
ICF d450 walking function |
WHODAS, SF-36 PF, PROMIS Physical Function, Barthel |
| Self-care | function-self-care |
ICF d510/d540 personal care |
Katz ADL, Barthel, SF-36, WHODAS |
| Usual activities | function-usual-activities |
ICF d630/d845/d920 role/participation |
SF-36 Role, PROMIS Social Roles, WHODAS |
| Pain/Discomfort | symptom-pain-severity |
Overall pain intensity (VRS-5 / NRS) | PROMIS Pain Intensity, SF-36 BP, BPI, NRS |
| Anxiety/Depression | wellbeing-mental-distress |
Summary distress severity | K6/K10, HADS-Total, PROMIS Emotional Distress |
| EQ VAS | wellbeing-self-rated-health |
Self-rated health (SRH) | EQ VAS, SF-36 GH, PROMIS Global01, NHIS/MEPS |
The questionnaire's identity lives in the form template (a CollectorRequest built with appTemplates.CollectorRequest plus per-section itemCustomizations.labels for the EuroQol-licensed wording), not in the data model. When PROMIS Short Forms or SF-36 are added later, they reuse the same items via new templates.
A bridge that ingests health data from a third-party app or device maps its source observations to generic HDS items, not to bridge-prefixed copies. Typical mappings:
- Source's appetite / dietary observation →
nutrition-appetite. - Source's cervical-fluid observation →
body-vulva-mucus-inspect(via thecervical-fluidconverter, which normalises across charting methods). - Source's mood / affect observation →
wellbeing-mood(via themoodconverter, which normalises across emotion vocabularies).
The bridge owns only its app streams (notes, chat, service messages) under its {appStreamId} — app-contextual content that has no HDS-native home. It does not own the health-data items themselves. See definitions/appStreams.yaml.
Apple HealthKit's 4-level HKCategoryValueSeverity maps as a subset of HDS 5-level:
HDS 5-level: 0.0 ─── 0.25 ─── 0.5 ─── 0.75 ─── 1.0
None Slight Moderate Severe Very severe
Apple 4-level: 0.0 ─── 0.25 ─── 0.5 ─────────── 1.0
NotPresent Mild Moderate Severe
(= Apple's ceiling)
Closest-value matching is correct for all 4 Apple values and 4 of 5 HDS values. Only HDS "Severe" (0.75) has no Apple equivalent — pick a tie-break direction (recommend upward to Apple Severe 1.0) and document it in the bridge.
- Do not create questionnaire-prefixed or source-prefixed items (e.g.
<questionnaire>-mobility,<bridge>-appetite,<vendor>-pain). See §1. - Do not evenly distribute hook values for
ratio/*selects (0 / 0.33 / 0.66 / 1.0for 4 levels). Use semantic-anchor placement. - Do not rely on naive closest-value matching for cross-scale conversion at the event-type boundary. Use explicit label-lookup tables in bridges when the source scheme is known.
- Do not create "administration record" / "questionnaire envelope" eventTypes for grouping events from a single form filling. The events' shared
timeproperty is the grouping key. Derived scores (like EQ-5D state code, utility index) are recomputed on the reader side. - Do not add a new eventType when item-level constraints on an existing type would do the job (see §"When adding a new eventType").
- Do not stream-nest by questionnaire. Data points belong in their clinical-domain streams.
- Do not bake questionnaire-specific wording (e.g. EuroQol's first-person sentences) into items. Items carry generic, reusable labels; form templates layer per-form overrides via
itemCustomizations.labels(see §7). - Do not create a new top-level stream when a child under an existing parent would serve.
data-model/data-model/
├── README.md # What this repo is
├── AGENTS.md # (this file) primer for agents
├── CHANGELOG.md
├── package.json, eslint.config.mjs, etc.
│
├── definitions/ # THE SOURCE OF TRUTH
│ ├── items/*.yaml # Health data point definitions (~73 items across ~11 YAML files)
│ ├── streams/*.yaml # Clinical-domain tree (~36 streams)
│ ├── eventTypes/
│ │ ├── eventTypes-hds.json # Custom HDS event type JSON Schemas (~33)
│ │ └── eventTypes-legacy.json # Standard Pryv measurement types (~200)
│ ├── converters/
│ │ ├── cervical-fluid/ # 9D vector converter (15+ charting methods)
│ │ └── mood/ # 5D vector converter (5 methods)
│ ├── appStreams.yaml # App-contextual sub-streams (notes, chat)
│ ├── datasources/ # External data source references
│ ├── conversions/ # Unit conversions (mass, length, temperature)
│ ├── settings/settings.yaml # HDS user-settings definitions
│ ├── hl7-defaults/category.yaml # FHIR default categories
│ └── inputs.yaml # Input-type coercion map
│
├── src/
│ ├── items.js # Loader + checkItemVsEvenType validator
│ ├── streams.js # Stream loader
│ ├── eventTypes.js # EventType registry loader
│ ├── appStreams.js # App-stream loader
│ ├── converters.js # Converter loader
│ ├── datasources.js # Datasource loader
│ ├── conversions.js # Unit conversion loader
│ ├── settings.js # Settings loader
│ ├── build.js # Pack generation (writes dist/pack.json)
│ └── schemas/items.js # AJV JSON Schema for item YAML structure
│
├── documentation/ # Design notes, per-domain references
│ ├── DESIGN-NOTES.md # Item design principles (+ scale hook placement)
│ ├── SYMPTOMS.md # Symptom domain, Apple HealthKit mapping, decisions
│ ├── MOOD.md # Mood converter design, circumplex/PAD model
│ ├── CERVICAL-POSITION.md # 3D vector design
│ ├── MENSTRUAL-CYCLE.md # Cycle modeling
│ ├── PHYSICAL-ACTIVITY.md # Activity items
│ └── SKIN.md # Skin observations
│
├── scripts/ # setup / deploy shell scripts
├── tests/ # Vitest test suite
└── dist/ # Generated pack.json + gh-pages clone (git-managed)
Commonly-cited scales / terminologies for mapping new items:
- ICF qualifier (WHO) — 5 levels (0–4): No problem / Mild / Moderate / Severe / Complete.
- Apple HealthKit
HKCategoryValueSeverity— 4 levels: NotPresent / Mild / Moderate / Severe. - PROMIS Short Forms — many 5-level Likerts (Physical Function, Pain Intensity, Emotional Distress – Anxiety, Emotional Distress – Depression, etc.).
- VRS-5 / VRS-4 — verbal rating scales for pain and other severities.
- NRS 0–10 — numeric rating scale, clinical pain gold standard (LOINC 72514-3).
- ICF d-codes —
d450walking,d510washing,d540dressing,d630-d649household,d845-d859work,d920recreation. - WHODAS 2.0 — 6 domains, 5-level Likert.
- Katz ADL, Barthel Index — binary/ordinal per-item ADL assessment.
- SF-36 PF / RE / RP — physical function and role functioning subscales.
- Kessler K6 / K10 — 6/10 items, combined psychological distress.
- HADS — 14 items, split anxiety (7) + depression (7).
- PHQ-2 / PHQ-9 — depression (LOINC 75275-8 for PHQ-9 total).
- GAD-2 / GAD-7 — anxiety (LOINC 69737-5 for GAD-7 total).
- WHO-5 — positive wellbeing.
- DASS-21 — depression / anxiety / stress.
- Apple HealthKit
HKStateOfMind(iOS 17+) — valence + 38 emotion labels.
- EQ VAS — 0–100 mm (EuroQol).
- SF-1 / SF-36 GH item 1 / PROMIS Global01 — 5-level (Excellent / Very good / Good / Fair / Poor).
- NHIS / MEPS / ESS / SILC — national health surveys, 5-level SRH question.
- SNOMED
425058002; LOINC72006-0,32622-6,82589-3.
- SNOMED CT — https://bioportal.bioontology.org/ontologies/SNOMEDCT
- LOINC — https://loinc.org/
- ICD-10 / ICD-11 — WHO
- ICF — WHO
- ATC (medications) — WHO
An item may carry deprecated: true (boolean, optional) to signal that it is kept in the data-model — so existing events keep validating and rendering — but is not used to create new data points. Plan 50 formalizes this contract.
Rules for authors:
- Set
deprecated: trueon the item to mark it. Do not delete the item: existing events on disk still need itsstreamId/eventType/typeto be resolvable by readers. - Add a
description:line explaining why deprecated and what to use instead. (Failure mode: silently-deprecated items leave consumers guessing.) - Bump the item's
versionif you also adjust other fields, but thedeprecated:flag itself is independent of version.
Rules for consumers (hds-lib, hds-forms-js, hds-webapp, doctor-dashboard, app-data-model-browser):
- Default listings hide deprecated items. Form-builder item browsers, item pickers, ribbon "+" sheets, and the data-model browser must filter
deprecated:trueout of their UI lists by default (opt-in "show deprecated" toggle is acceptable for inspector tools). - Reading still works.
getHDSModel().itemsDefs.forKey(key)andforEvent(event)MUST still return deprecated items so existing events render correctly. The filter is at the discovery / picker layer only, never at the resolution layer. - Writing is discouraged but not blocked. Bridges still actively writing to deprecated streams keep working until they migrate. The validator does not reject
events.createagainst deprecated items — that's a migration concern, not a data-integrity one.
Schema: the deprecated field is recognised in src/schemas/items.js and round-trips through pack.json.
From documentation/SYMPTOMS.md (2026-03-17):
- Acne dual-source — when a bridge exposes acne in both a skin field and a generic symptoms field, both map to the existing
body-skin-acneitem (reuse). Single item regardless of source. - Anxiety/Stress in symptom arrays —
wellbeing-mood(5D vectors) captures transient mood tags. Validated distress severity ratings from PROs (K6/K10, HADS, GAD/PHQ, etc.) live inwellbeing-mental-distress/-anxiety/-depression(added in v1.3.0 alongside EQ-5D-5L). The two are different constructs and coexist. - Increased appetite — modeled as
nutrition-appetite(ratio/proportion, 3 hooks). - Weight fluctuations — not a symptom; derived from body-weight.
- Stream hierarchy — full clinical sub-categories under
symptom/.
From documentation/MOOD.md (Plan 24, 2026-03-23):
- 5D vector space: valence / arousal / dominance / socialOrientation / temporalFocus.
- Converter engine: weighted Euclidean distance.
- Methods supported: multiple external vocabularies (Apple HealthKit, Daylio, How We Feel, third-party bridges) plus a direct
_rawvirtual method.
- Work on a feature branch (
feature/<desc>or similar), never directly onmain. - Run
npm run buildafter YAML changes — the loader validates schema, stream/eventType references, and item↔eventType compatibility. - Before committing changes touching
package.json:grep '"file:' package.json(local-file dependencies must be replaced with git URLs before committing). CHANGELOG.mdlives at the root; update under[Unreleased]when adding items / streams / eventTypes.- The
dist/folder is thegh-pagesbranch — deploys tomodel.datasafe.dev.
This repo is the source of the HDS vocabulary. Other public repos consume pack.json:
- hds-lib-js — fetches
pack.jsonvia the platform service-infoassets.hds-modelURL, exposes items / streams / eventTypes / converters asHDSModel. See itsAGENTS.md. - hds-forms-js — React renderers for
HDSItemDef. Implements theslider,select,composite,convertible,datasource-search, etc. field types declared on items here. - app-data-model-browser — small public viewer for the deployed
pack.json.
When a change here breaks a consumer's typechecks (e.g. removing an option, renaming an item key), bump version in package.json, document under [Unreleased] in CHANGELOG.md, and notify the consumer repos.
Living document — extend it whenever you uncover a non-obvious convention or an architectural decision worth memorising. The most recent entry in CHANGELOG.md is the canonical record of what shipped; this file captures the why behind the rules.