CDCgov · robertmitchellv · Jun 22, 2026 · Jun 11, 2026 · Jun 12, 2026 · Jun 13, 2026
@@ -36,7 +36,7 @@ The pipeline creates an `AugmentationContext` for each document before any work
 All of this is per the eICR Data Augmentation Header template (`urn:hl7ii:2.16.840.1.113883.10.20.15.2.1.3:2025-11-01`):
 
 - **`templateId`** — signals the document conforms to the augmentation header template
-- **`id`** — new UUID with `assigningAuthorityName="ecr-refinement"`
+- **`id`** — new UUID with `assigningAuthorityName="ecr-refiner"`
 - **`effectiveTime`** — timestamp of the augmentation operation (with timezone)
 - **`setId`** — new UUID (replaces original, or inserted if absent)
 - **`versionNumber`** — reset to 1
@@ -57,9 +57,9 @@ Refinement attaches an unanchored `<footnote>` to every section in the refined e
 
 The "what was configured" and "what actually happened" columns usually agree, but they can diverge. The most common divergence is the no-match case: a jurisdiction configures a section for refinement, the matching step finds nothing in the section that matches the configured codes, and the refiner stubs the section rather than preserving an orphaned narrative. The footnote makes that decision visible — a reviewer sees "Action: refine, Outcome: Refined; no matches found" in the same row and doesn't have to wonder why a refine-configured section came out empty.
 
-The footnote ID is built from the section's LOINC code and the augmentation timestamp (`ecr-refinement-{loinc}-{timestamp}`), so every footnote in a refinement run is structurally tied to the augmentation author's `<time>` value. A consumer can verify document integrity by checking that all footnote IDs in a document carry the same timestamp the augmentation header advertises.
+The footnote ID is built from the section's LOINC code and the augmentation timestamp (`ecr-refiner-{loinc}-{timestamp}`), so every footnote in a refinement run is structurally tied to the augmentation author's `<time>` value. The ID keeps only the timestamp's leading digits (the timezone offset is dropped), so a consumer can check a document's consistency by confirming that every footnote ID carries the same digits as the timestamp the augmentation header advertises.
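
The consumer-side check described in this paragraph could look roughly like the sketch below. It assumes the ID shape given here plus the optional `-{n}` suffix from `_build_footnote_id`, and it assumes LOINC codes have the usual `digits-checkdigit` form. The names `FOOTNOTE_ID` and `footnotes_match_header` are illustrative and not part of the refiner.

```python
import re

# Mirrors the documented shape: ecr-refiner-{loinc}-{timestamp digits},
# with an optional -{n} suffix for repeated LOINC sections.
# Assumption: the LOINC part looks like "46240-8" (digits, hyphen, check digit).
FOOTNOTE_ID = re.compile(
    r"^ecr-refiner-(?P<loinc>\d+-\d)-(?P<stamp>\d+)(?:-(?P<n>\d+))?$"
)


def footnotes_match_header(footnote_ids: list[str], header_time: str) -> bool:
    """Return True when every footnote ID carries the header timestamp's digits."""
    leading = re.match(r"\d+", header_time)
    if leading is None:
        # The header time has no leading digits, so nothing can match it.
        return False
    expected = leading.group(0)
    for footnote_id in footnote_ids:
        parsed = FOOTNOTE_ID.match(footnote_id)
        if parsed is None or parsed.group("stamp") != expected:
            return False
    return True
```

Collecting the footnote IDs and the header `<time>` value from the document is left to the caller, since that depends on the XML library in use.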
 
-The user-facing labels for the configuration source and the runtime outcome live in `section/constants.py` as small dicts keyed by enum values. Editing the copy is one file change with no code touches.
+The user-facing labels for the configuration source and the runtime outcome live in `narrative/constants.py` as small dicts keyed by enum values. Changing the copy means editing that one file, with no changes to the rendering logic.
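
A minimal sketch of the "small dicts keyed by enum values" pattern, for readers who have not opened `narrative/constants.py`. The `Outcome` enum, its members, and the `OUTCOME_LABELS` table are hypothetical stand-ins. Only the "Refined; no matches found" wording and the `.get(value, str(value))` fallback come from this PR.

```python
from enum import Enum


class Outcome(Enum):  # hypothetical enum; the real one is not shown in this diff
    REFINED = "refined"
    REFINED_NO_MATCH = "refined_no_match"
    RETAINED = "retained"


# Hypothetical label table. Editing user-facing copy means editing these strings.
OUTCOME_LABELS: dict[Outcome, str] = {
    Outcome.REFINED: "Refined",
    Outcome.REFINED_NO_MATCH: "Refined; no matches found",
}


def outcome_label(outcome: Outcome) -> str:
    """Look up the display label, falling back to the enum's string form."""
    return OUTCOME_LABELS.get(outcome, str(outcome))
```

The fallback means a newly added enum member still renders something in the footnote table before its label is written.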
 
 ## Supporting modules
 

@@ -0,0 +1,61 @@
+# Section narrative writers
+
+This package owns every transformation the refiner makes to a CDA
+`<section>`'s human-readable narrative `<text>` element. A section's `<text>`
+is what a reviewer sees when they open a refined eICR in a CDA stylesheet; the
+machine-readable `<entry>` elements are handled elsewhere (the matching
+engines). These writers decide what story the `<text>` tells about what the
+refiner did.
+
+## Why this is its own module
+
+Everything that touches a section's `<text>` lives here so the narrative
+behavior — and the CDA R2 validity rules it has to respect — can be reasoned
+about in one place. The matching engines (`entry_matching`, `generic_matching`)
+and the orchestrator (`refine.py`) call into this package; they never build
+narrative elements directly.
+
+## Layout
+
+- **`elements.py`** — the shared low-level primitives. `_make_element` /
+  `_sub_element` emit namespace-qualified elements (every node written into
+  `<text>` must carry the `urn:hl7-org:v3` namespace or it fails
+  `NarrativeBlock.xsd`). `_ensure_text_element` places a `<text>` in the
+  correct CDA R2 `xs:sequence` slot. `remove_all_comments` scrubs stale source
+  comments. Every other module here builds on these.
+
+- **`footnote.py`** — the per-section provenance footnote. Refinement attaches
+  an unanchored `<footnote>` to every section (refined, retained, removed, or
+  narrative-stripped) carrying a one-row table: what the jurisdiction
+  configured vs. what the refiner actually did. The footnote's `xs:ID` encodes
+  the augmentation run's timestamp so a consumer can structurally tie every
+  footnote to the document's augmentation header.
+
+- **`writers.py`** — the narrative-body writers that replace or stub a
+  section's `<text>`:
+  - `replace_narrative_with_removal_notice` — strip the narrative to a notice
+    while keeping clinical entries for machine processing.
+  - `restore_narrative` — put back a saved `<text>` deep copy (the generic
+    matching path clears `<text>` during processing to avoid false matches,
+    then restores it).
+  - `create_minimal_section` — reduce a section to a `nullFlavor="NI"` stub
+    with a status message (no match found, or configured for removal).
+
+## Invariants
+
+- **Namespace everything.** All emitted elements go through
+  `_make_element` / `_sub_element`. A bare (unprefixed) element silently fails
+  `NarrativeBlock.xsd` validation.
+- **Respect the `xs:sequence`.** A `<text>` must sit after `<title>` (or
+  after `<code>` when there is no `<title>`) in `POCD_MT000040.Section`.
+  Insertion always goes through the placement helpers rather than a bare `append`.
+- **These functions mutate the section in place.** Consistent with the rest of
+  the `ecr` service; the pipeline owns parse/serialize.
+
+## Planned: narrative reconstruction
+
+A third narrative disposition — reconstruct the `<text>` from the entries that
+survived refinement — will land here as a `reconstruction.py` peer of
+`writers.py`, built on the same `elements.py` primitives. See
+`docs/decisions/0010_2026-06-05_narrative-reconstruction.md` for the design
+(typed-value renderer + per-`template_id` field maps + per-section joins).
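
To illustrate the "Namespace everything" invariant from the README above, here is a small sketch of how a bare element differs from a qualified one. It uses the standard library's `xml.etree.ElementTree` rather than `lxml` so it runs without dependencies, and `make_element` is an illustrative stand-in for `_make_element`, not the package's helper. It demonstrates tag qualification only and does not run schema validation.

```python
import xml.etree.ElementTree as ET

HL7_NAMESPACE = "urn:hl7-org:v3"


def make_element(local_name: str, **attribs: str) -> ET.Element:
    """Create a detached element in the HL7 v3 namespace (Clark notation)."""
    return ET.Element(f"{{{HL7_NAMESPACE}}}{local_name}", attribs)


text = make_element("text")
qualified = ET.SubElement(text, f"{{{HL7_NAMESPACE}}}paragraph")
# No namespace on this one: the mistake the invariant guards against.
bare = ET.SubElement(text, "paragraph")

# A namespace-aware lookup only sees the qualified child.
found = text.findall("hl7:paragraph", {"hl7": HL7_NAMESPACE})
```

Both children look like `paragraph` when skimming serialized output, which is why routing every element through one factory is easier to audit than checking tags by eye.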
@@ -0,0 +1,19 @@
+from .elements import remove_all_comments
+from .footnote import append_section_provenance_footnote
+from .reconstruction import reconstruct_narrative
+from .writers import (
+    create_minimal_section,
+    replace_narrative_with_reconstruction,
+    replace_narrative_with_removal_notice,
+    restore_narrative,
+)
+
+__all__ = [
+    "append_section_provenance_footnote",
+    "create_minimal_section",
+    "reconstruct_narrative",
+    "remove_all_comments",
+    "replace_narrative_with_reconstruction",
+    "replace_narrative_with_removal_notice",
+    "restore_narrative",
+]
@@ -33,18 +33,8 @@
 # NOTE:
 # TABLE HEADERS
 # =============================================================================
-# column headers for the narrative tables the refiner writes. the clinical
-# data table headers describe the columns for the refined clinical content
-# table; the provenance table headers describe the columns for the
-# per-section provenance footnote table
-
-CLINICAL_DATA_TABLE_HEADERS: Final[list[str]] = [
-    "Display Text",
-    "Code",
-    "Code System",
-    "Is Trigger Code",
-    "Matching Condition Code",
-]
+# column headers for the per-section provenance footnote table the refiner
+# writes into every section's narrative
 
 PROVENANCE_TABLE_HEADERS: Final[list[str]] = [
     "Section (LOINC)",

@@ -0,0 +1,105 @@
+from lxml import etree
+from lxml.etree import _Element
+
+from app.services.format import remove_element
+
+from ..model import (
+    HL7_NAMESPACE,
+    HL7_NS,
+)
+
+# NOTE:
+# ELEMENT FACTORY HELPERS
+# =============================================================================
+# every element emitted into <text> must be qualified with the HL7 v3
+# namespace for NarrativeBlock.xsd validation to pass
+
+
+def _make_element(local_name: str, **attribs: str) -> _Element:
+    """
+    Create a namespace-qualified narrative element.
+
+    Returns a detached element in the urn:hl7-org:v3 namespace. Use
+    `_sub_element` instead when the new element should be appended
+    to an existing parent.
+    """
+
+    element = etree.Element(f"{{{HL7_NAMESPACE}}}{local_name}")
+    for key, value in attribs.items():
+        element.set(key, value)
+    return element
+
+
+def _sub_element(parent: _Element, local_name: str, **attribs: str) -> _Element:
+    """
+    Create a namespace-qualified child element appended to `parent`.
+
+    Thin wrapper around etree.SubElement that applies Clark notation
+    for the HL7 v3 namespace, matching the pattern used in augment.py.
+    """
+
+    element = etree.SubElement(parent, f"{{{HL7_NAMESPACE}}}{local_name}")
+    for key, value in attribs.items():
+        element.set(key, value)
+    return element
+
+
+# NOTE:
+# TEXT PLACEMENT HELPERS
+# =============================================================================
+
+
+def _ensure_text_element(section: _Element) -> _Element:
+    """
+    Return the section's <text> element, creating one if absent.
+
+    If the section has no <text>, a new empty <text> is created and
+    inserted after <title> per the CDA R2 xs:sequence for
+    POCD_MT000040.Section: templateId -> id -> code -> title -> text ->
+    confidentialityCode -> languageCode -> subject -> author ->
+    informant -> entry -> component.
+
+    If there is no <title> either, the <text> is inserted after
+    <code>, which immediately precedes <title> in the sequence.
+    Last resort: append to the section.
+    """
+
+    text_element = section.find("hl7:text", namespaces=HL7_NS)
+    if text_element is not None:
+        return text_element
+
+    text_element = _make_element("text")
+
+    title_element = section.find("hl7:title", namespaces=HL7_NS)
+    if title_element is not None:
+        title_element.addnext(text_element)
+        return text_element
+
+    code_element = section.find("hl7:code", namespaces=HL7_NS)
+    if code_element is not None:
+        code_element.addnext(text_element)
+        return text_element
+
+    section.append(text_element)
+    return text_element
+
+
+# NOTE:
+# COMMENT CLEANUP
+# =============================================================================
+
+
+def remove_all_comments(section: _Element) -> None:
+    """
+    Remove all XML comments from a processed section.
+
+    After refining a section, inline comments left over from the source
+    document may no longer be accurate or relevant. This scrubs them
+    so the refined output doesn't carry misleading annotations forward.
+    """
+
+    xpath_result = section.xpath(".//comment()")
+    if isinstance(xpath_result, list):
+        for comment in xpath_result:
+            if isinstance(comment, etree._Element):
+                remove_element(comment)
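
The placement rule in `_ensure_text_element` can be exercised in isolation with the sketch below. It is a re-expression using the standard library's `xml.etree.ElementTree`, not the helper itself. The stdlib has no `addnext()`, so the sketch inserts by child index instead, and the names `q` and `ensure_text` are illustrative.

```python
import xml.etree.ElementTree as ET

NS = "urn:hl7-org:v3"


def q(local_name: str) -> str:
    """Qualify a local name with the HL7 v3 namespace (Clark notation)."""
    return f"{{{NS}}}{local_name}"


def ensure_text(section: ET.Element) -> ET.Element:
    """Return the section's <text>, inserting one after <title> or <code> if absent."""
    existing = section.find(q("text"))
    if existing is not None:
        return existing

    text = ET.Element(q("text"))
    children = list(section)
    for anchor_name in ("title", "code"):
        anchor = section.find(q(anchor_name))
        if anchor is not None:
            # Insert directly after the anchor so <text> lands before any <entry>.
            section.insert(children.index(anchor) + 1, text)
            return text

    # Last resort, mirroring the original helper: append at the end.
    section.append(text)
    return text
```

Calling it twice on the same section returns the same `<text>` element, so callers can use it as a get-or-create.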
@@ -0,0 +1,169 @@
+import re
+
+from lxml.etree import _Element
+
+from ..model import SectionProvenanceRecord
+from .constants import (
+    PROVENANCE_LABEL,
+    PROVENANCE_OUTCOME_NOTES,
+    PROVENANCE_SOURCE_NOTES,
+    PROVENANCE_TABLE_HEADERS,
+)
+from .elements import _ensure_text_element, _sub_element
+
+# NOTE:
+# PROVENANCE FOOTNOTE
+# =============================================================================
+# every section in the refined document carries a trailing <footnote>
+# documenting how the refiner treated it. the footnote is unanchored:
+# no <footnoteRef> points to it. This represents "annotation attached
+# to the section as a whole" — valid per NarrativeBlock.xsd's
+# StrucDoc.Text and StrucDoc.Footnote content models (both allow
+# footnote as an optional child with no anchoring requirement) and
+# sidesteps the need to walk arbitrary source narrative looking for
+# anchor points, which would be fragile across eICR vendors
+#
+# the footnote's xs:ID embeds the leading digits of the augmentation
+# run's timestamp, so a consumer can programmatically verify that the
+# footnotes and the header belong to the same run (e.g., "every refiner
+# footnote ID contains the digits of the augmentation author's <time>")
+#
+# the footnote's data row carries both the configured action ("what
+# the jurisdiction asked for") and the runtime outcome ("what the
+# refiner actually did"). the two columns let a reader see at a glance
+# whether a refiner policy override fired — most rows show the outcome
+# confirming the configuration, but the no-match policy override
+# produces an outcome that diverges from the configured action
+
+
+def _build_footnote_id(
+    loinc_code: str,
+    augmentation_timestamp: str,
+    occurrence_index: int = 0,
+) -> str:
+    """
+    Build a document-unique xs:ID for a refiner provenance footnote.
+
+    The ID is of the form
+    `ecr-refiner-{loinc}-{timestamp-digits}`, optionally with a
+    `-{n}` suffix for the rare case where the same LOINC appears on
+    multiple top-level sections in a single document. The timestamp
+    digits are extracted from the augmentation author's <time> value
+    (HL7 V3 `YYYYMMDDHHMMSS±ZZZZ` format) by keeping the leading
+    run of digits. The timezone offset is dropped: `+` is not legal
+    in an xs:ID, and the offset digits are not needed there.
+
+    xs:ID cannot start with a digit or hyphen, so the `ecr-refiner-`
+    prefix is load-bearing: it ensures the resulting string always
+    satisfies the XML Name production.
+
+    Args:
+        loinc_code: The section's LOINC code (e.g., "46240-8").
+        augmentation_timestamp: The augmentation author's time value,
+            shared across all footnotes in this refinement run.
+        occurrence_index: Zero-based disambiguator for the rare case
+            where the same LOINC appears on multiple top-level
+            sections. Zero (the normal case) produces no suffix;
+            nonzero values append `-N`.
+
+    Returns:
+        A document-unique xs:ID-safe string.
+    """
+
+    match = re.match(r"^\d+", augmentation_timestamp)
+    timestamp_digits = match.group(0) if match else ""
+    base = f"ecr-refiner-{loinc_code}-{timestamp_digits}"
+    return base if occurrence_index == 0 else f"{base}-{occurrence_index}"
+
+
+def append_section_provenance_footnote(
+    section: _Element,
+    provenance: SectionProvenanceRecord,
+    augmentation_timestamp: str,
+    occurrence_index: int = 0,
+) -> None:
+    """
+    Append an unanchored <footnote> carrying refiner provenance.
+
+    Called by refine_eicr after processing every section (refine,
+    retain, remove, narrative-removed) so that every section in the
+    refined document carries a consistent provenance record.
+
+    The footnote contains a bolded label paragraph followed by a
+    single-row table summarizing the jurisdiction's configuration
+    and the runtime outcome for this section. The table follows
+    NarrativeBlock.xsd's StrucDoc.Table content model with proper
+    <thead>/<th> header semantics and <tbody>/<tr>/<td> body rows.
+
+    The provenance record passed in must have its `outcome` field
+    finalized — refine_eicr does this via dataclasses.replace before
+    calling this function. If the field still holds its default
+    value at render time, that's a bug in refine_eicr's
+    interpretation logic, not in this function.
+
+    If the section has no <text> element (e.g., a retained section
+    where the source document omitted it), one is created and inserted
+    per `_ensure_text_element`'s CDA R2 xs:sequence rules.
+
+    Args:
+        section: The section element to annotate.
+        provenance: The SectionProvenanceRecord built during plan
+            creation and finalized by refine_eicr.
+        augmentation_timestamp: The augmentation run's <time> value,
+            shared across all footnotes in this refinement run.
+        occurrence_index: Disambiguator for repeated-LOINC sections;
+            zero for the normal case.
+    """
+
+    text_element = _ensure_text_element(section)
+
+    footnote_id = _build_footnote_id(
+        loinc_code=provenance.loinc_code,
+        augmentation_timestamp=augmentation_timestamp,
+        occurrence_index=occurrence_index,
+    )
+    footnote = _sub_element(text_element, "footnote", ID=footnote_id)
+
+    # bolded label paragraph
+    label_paragraph = _sub_element(footnote, "paragraph")
+    label_content = _sub_element(label_paragraph, "content", styleCode="Bold")
+    label_content.text = PROVENANCE_LABEL
+
+    # provenance table
+    table = _sub_element(footnote, "table", border="1")
+    thead = _sub_element(table, "thead")
+    header_row = _sub_element(thead, "tr")
+    for header in PROVENANCE_TABLE_HEADERS:
+        th = _sub_element(header_row, "th")
+        th.text = header
+
+    tbody = _sub_element(table, "tbody")
+    row = _sub_element(tbody, "tr")
+    _add_provenance_cell(row, provenance.loinc_code)
+    _add_provenance_cell(row, provenance.display_name)
+    _add_provenance_cell(row, "Yes" if provenance.include else "No")
+    _add_provenance_cell(row, provenance.action)
+    _add_provenance_cell(row, "Yes" if provenance.narrative == "retain" else "No")
+    _add_provenance_cell(
+        row,
+        f"v{provenance.config_version}"
+        if provenance.config_version is not None
+        else "—",
+    )
+    _add_provenance_cell(
+        row,
+        PROVENANCE_SOURCE_NOTES.get(provenance.source, str(provenance.source)),
+    )
+    _add_provenance_cell(
+        row,
+        PROVENANCE_OUTCOME_NOTES.get(provenance.outcome, str(provenance.outcome)),
+    )
+
+
+def _add_provenance_cell(row: _Element, text: str) -> None:
+    """
+    Append a single <td> with text content to a provenance table row.
+    """
+
+    td = _sub_element(row, "td")
+    td.text = text
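
The diff does not show how `refine_eicr` picks `occurrence_index`, so the sketch below is one plausible way a caller could assign it: count earlier sections with the same LOINC in document order. `build_footnote_id` is a standalone copy of the rule in `_build_footnote_id` so the example runs on its own, and `footnote_ids_for` is hypothetical caller logic.

```python
import re


def build_footnote_id(loinc_code: str, timestamp: str, occurrence_index: int = 0) -> str:
    """Standalone copy of the ID rule in `_build_footnote_id`, for illustration."""
    match = re.match(r"^\d+", timestamp)
    digits = match.group(0) if match else ""
    base = f"ecr-refiner-{loinc_code}-{digits}"
    return base if occurrence_index == 0 else f"{base}-{occurrence_index}"


def footnote_ids_for(loinc_codes: list[str], timestamp: str) -> list[str]:
    """Assign zero-based occurrence indexes in document order (hypothetical)."""
    seen: dict[str, int] = {}
    ids = []
    for loinc in loinc_codes:
        index = seen.get(loinc, 0)  # how many times this LOINC appeared before
        ids.append(build_footnote_id(loinc, timestamp, index))
        seen[loinc] = index + 1
    return ids
```

With this scheme the first section for a LOINC keeps the unsuffixed ID, and only the repeats get `-1`, `-2`, and so on, which matches the docstring's "zero produces no suffix" rule.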