Second Surveillance-Toolkit PR in the cross-repo PartnerReport
upstreaming effort. Continues from PR-07a-code with the post-feedback
Reference / Partner Report polish, confidence-interval plumbing, the
Validation Report build-out, and the NeoIPC-Tools / NeoIPC-BuildTools
PowerShell module integration.
Code-only scope: this PR ships the `.qmd` / `.Rmd` / `.R` / `.ps1` /
`.yml` / `.yaml` / `.lua` / `.tex` / `.psd1` source. Translation
artefacts produced by po4a (`.po`, `.pot`, `content.<lang>/_sR.yaml`,
`common.<lang>.yaml`, `glossary.<lang>.yaml`, locale-specific `.qmd`
wrappers, compiled `.mo`) land separately as PR-07b-translations,
which opens once this PR merges. The split is driven by Copilot's
hard review-size limits on translation-heavy PRs.
What the PR delivers
--------------------
* Validation Report build-out. 42 validation rule chunks under
`reports/Validation-Report/rules/` driving the report's findings
table, plus the localized problem-detail + solution content under
`<lang>/_problem_detail_*.Rmd` and `<lang>/_solution_*.Rmd`.
Renders to PDF via the existing Quarto + po4a pipeline.
* Confidence intervals throughout the Reference and Partner Reports.
Layered controls (`includeOverallConfidenceIntervals`,
`includeOwnConfidenceIntervals`, `includeReferenceConfidenceIntervals`)
plumb the neoipcr CI helpers into all rate tables; sparse-data
thresholds and footnotes mark cells where the underlying counts
fall below the report threshold.
* NeoIPC-Tools + NeoIPC-BuildTools modules. Vendored under
`scripts/modules/`, replacing the dot-sourced helper scripts.
NeoIPC-Tools wraps the DHIS2 client surface (private GET / DELETE
helpers in `Private/DHIS2Http.ps1`, public cmdlets in
`Public/{OrgUnits,Tracker,UserInfo,ReportHelpers,PAT}.ps1`).
NeoIPC-BuildTools centralizes the metadata-conversion + antibiotic
/ pathogen / code-map / object-properties helpers the build
scripts share.
* `Read-UserInfo` cmdlet. Query DHIS2 user accounts with optional
org-unit filtering, the `-Path` API-base parameter, and an edit-URL
builder for direct navigation. `Get-NeoipcServerKey` exposes the
cache-key construction the ArgumentCompleter and site-code writer
share.
* PowerShell wrapper alignment. `Language` parameter renamed to
`Locale` across the wrappers (carrying through Quarto's locale
conventions), JSON build-report sidecars under
`<report>/_output/`, `[CmdletBinding(SupportsShouldProcess)]` +
`$PSCmdlet.ShouldProcess()` gates on the render call sites, and
`$null -ne $Dhis2Port` guards on the `Nullable[int]` parameter
(so port 0 isn't conflated with absent).
* po4a YAML key updater hardening. `Update-Po4aYamlKeys.ps1` parses
the `[type: yaml] <master> $lang:<target>` config lines without
clobbering the surrounding `$Matches` state; `# manual-keys`
entries skip the key regeneration as intended.
* Locale alignment. Python tooling defaults switched from `gr` to
`el` for Greek to match the IETF / Weblate locale code; po4a
configs follow.
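
The config-line parsing described in the po4a bullet above can be sketched as follows. This is a minimal Python illustration, not the PowerShell code in `Update-Po4aYamlKeys.ps1`: the function and type names are hypothetical, and tolerating trailing options after the target is an assumption. The match result is returned as a local value, so no shared match state (the analogue of PowerShell's `$Matches`) is touched.

```python
import re
from typing import List, NamedTuple, Optional

# Lines of the form: [type: yaml] <master> $lang:<target>
# Anything after the target (e.g. options) is ignored; that is an assumption.
_YAML_ENTRY = re.compile(
    r"^\[type:\s*yaml\]\s+(?P<master>\S+)\s+\$lang:(?P<target>\S+)"
)


class YamlEntry(NamedTuple):
    """Hypothetical container for one parsed config line."""
    master: str
    target: str


def parse_yaml_entry(line: str) -> Optional[YamlEntry]:
    """Parse one config line; return None if it is not a yaml entry."""
    match = _YAML_ENTRY.match(line.strip())
    if match is None:
        return None
    return YamlEntry(match["master"], match["target"])


def parse_config(text: str) -> List[YamlEntry]:
    """Collect all yaml entries from a config file's text, in order."""
    entries = []
    for line in text.splitlines():
        entry = parse_yaml_entry(line)
        if entry is not None:
            entries.append(entry)
    return entries
```

Because each call returns its own result, a caller can hold an earlier match while parsing further lines without the later parse overwriting it.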
A handful of report-content concerns deferred from PR-07a land here.
See `tmp/upstreaming-review-log.md` in the workspace for the
deferral targets and the `Brar/Surveillance-Toolkit:PartnerReport`
reference points.
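
The `$null -ne $Dhis2Port` guard mentioned in the wrapper-alignment bullet relies on separating "no value" from "value 0". The sketch below shows the same distinction in Python; `build_base_url` and the host name are hypothetical and not part of this PR.

```python
from typing import Optional


def build_base_url(host: str, port: Optional[int] = None) -> str:
    """Append the port only when one was actually supplied."""
    # A truthiness check (`if port:`) would treat 0 like None.
    # Comparing against None keeps port 0 distinct from "absent".
    if port is not None:
        return f"https://{host}:{port}"
    return f"https://{host}"
```

With this shape, `build_base_url("dhis2.example.org", 0)` keeps the explicit `:0`, while omitting the argument produces a URL without a port.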
.github/copilot-instructions.md (+31 -16): 31 additions & 16 deletions
@@ -6,20 +6,27 @@ This file documents the Surveillance-Toolkit repository. If this repository is c
 
 ## Guardrails
 
-The first seven rules below are **universal** — mirrored in every NeoIPC repository's instruction files. If you add or change a universal guardrail here, add `<!-- SYNC: propagate to all repos -->` next to it so the change gets propagated when the workspace is next used. The last rule is specific to this repository.
+The first ten rules below (those without a *(repo-specific)* tag) are **universal** — mirrored in every NeoIPC repository's instruction files. If you add or change a universal guardrail here, add `<!-- SYNC: propagate to all repos -->` next to it so the change gets propagated when the workspace is next used. The remaining rules are specific to this repository.
 
-- **Never** put personal names or other identifying information in source code (comments, strings, commit messages, etc.).
-- **Never** read, write, or access files under `secrets/`, `data/local/`, or `.env`.
+- **Never** put personal names or other identifying information in source code (comments, strings, commit messages, etc.), except in copyright statements and file-header attribution lines (e.g. `Author:`, `@author`, `Copyright (c)` fields).
+- **Never** read, write, or access files under `secrets/`, `data/local/`, or `.env`. This includes listing, globbing, searching, or interacting with these paths in any way — not just reading file contents. If the user provides a path under these directories, use it as-is without exploring the directory.
 - **Never** push directly to `main` or `master` on this repository.
 - **Never** make HTTP calls to the DHIS2 API or attempt to read JSON files returned from the DHIS2 API. These files contain sensitive surveillance data and are not needed for code-level tasks.
 - **Never** put absolute local paths into files that get checked in. Use relative paths or generic placeholders. Local checkout paths are developer-specific and meaningless to others.
 - Treat infection definitions in this repository as normative. When a conflict exists between code and definitions, **fix the code**, not the definitions.
+- **Never** invent or paraphrase clinical definitions, thresholds, or measurement criteria. Always look up the normative text in `doc/protocol/` (or the relevant definition file) before writing or modifying footnotes, tooltips, or explanatory text that describes how a metric is defined or measured. If no protocol definition exists for the concept, flag it rather than guessing. *(repo-specific)*
 - **Never** introduce non-permissive dependencies (fonts, libraries, templates). All fonts must be SIL OFL or equivalent.
 - **Always** keep `CLAUDE.md` and `.github/copilot-instructions.md` in sync within this repository. When you modify one, apply the same change to the other.
+- **Always** push back when evidence contradicts the user's suggestion or implied assumption. Do not defer to the user's position when authoritative sources (AMA Manual of Style, protocol definitions, language specifications, etc.) say otherwise. Present the evidence clearly and let the user decide.
+- **Always** consider both personal data protection (GDPR) and organizational/reputational concerns when making decisions about data shared between partners, published in reports, or exposed through APIs. Small cell counts in shared reports can expose which departments had specific rare pathogens or resistance patterns.
+- **Never** add an unconditional reference (formal `@tbl-*`/`@fig-*` or textual) to content that is conditionally included. If a table, figure, section, or any content depends on a configuration flag, all references to it must be conditional on the same flag. This applies to all conditionally present content: tables, figures, sections, reference data, confidence intervals, and any other content whose presence depends on configuration. When a text contains a cross-reference to conditional content, split it into a base string (always shown) and a conditional suffix (shown only when the target is present), provide two complete variants, or use a glue placeholder that resolves to the cross-reference when the target is present and to empty when it is not. *(repo-specific)*
 - Do not use the R `argparse` package (it requires Python). Use shared `parse-args.R` or JSON parameter files instead. *(repo-specific)*
 - **Never** use single letters or bare numbers as YAML keys in string resource files. po4a's YAML module fails to extract some single-letter keys (e.g., `u`), and short keys are not expressive. Use descriptive names instead (e.g., `female`/`male`/`undetermined` instead of `f`/`m`/`u`). When a YAML key must map to a short code from DHIS2, add a mapping in the R code. *(repo-specific)*
 - String values must not be duplicated across YAML layers (glossary, common, report-specific) or across report-specific files. If two reports share a string, move it to `common.yaml`. Run `scripts/Test-StringResourceLayers.ps1` to check before committing changes to string resource files. *(repo-specific)*
-- The **AMA Manual of Style** is the reference for human-language style questions (capitalisation, punctuation, terminology). The glossary may carry multiple casing variants of a term (e.g., lowercase for running text, title case for headings) — use whichever fits the context. *(repo-specific)*
+- The **AMA Manual of Style** is the reference for human-language style questions (capitalisation, punctuation, terminology). The glossary may carry multiple casing variants of a term (e.g., lowercase for running text, title case for headings) — use whichever fits the context. Disease names are common nouns and are lowercase in running text (e.g., "necrotising enterocolitis", "pneumonia") unless they contain a proper noun (e.g., "Crohn's disease"). The sentence-case glossary variants (`_sc`) exist for labels and headings, not because the terms are proper nouns. *(repo-specific)*
+- **Never** use imperative voice in Partner Report string resources (outlier interpretation, callout text, or any user-facing prose in `_sR.yaml`). The report cannot know the full clinical context; use suggestive phrasing ("this may indicate…", "…may warrant attention") instead of directives ("Review…", "Confirm…", "Read this…"). *(repo-specific)*
+- **Always** use table-visible labels in outlier interpretation strings. The terms in callout prose must match the row labels shown in the corresponding table so readers can identify the referenced metric — but apply running-text casing, not label casing. For example, use "pneumonia" (from the Table 1 row label "Pneumonia") not "HAP", and "CVC-associated sepsis/BSI" (from the Table 2 row label) not "CVC-associated infection rate". When the same metric ID appears in multiple tables with different display labels (e.g., "CVC" in Table 2 vs Table 8), the `localize_metric_name()` function uses `table_name` context to resolve the correct label. *(repo-specific)*
+- **Never** edit files that are generated by po4a or by `scripts/update-glossary-po.py`. These files are overwritten on every pipeline run. Generated files include: `common.<lang>.yaml`, `content.<lang>/` directories, `_quarto-<lang>.yml`, `Validation-Report/<lang>/` directories, `doc/protocol/<lang>/`, `glossary.<lang>.yaml`, and any other file that appears as a translation target in `po/*.po4a.cfg`. **Never** edit `.pot` files either — they are regenerated by po4a / the glossary script. When changing translatable content, follow this order: **(1)** edit the English source file (e.g., `common.yaml`, `content/_sR.yaml`, `glossary.yaml`), **(2)** run `scripts/Invoke-Localization.ps1 -Update` (or the appropriate po4a / `scripts/update-glossary-po.py` command) so the pipeline regenerates the `.pot` and updates the `msgid` entries in the `.po` files, **(3)** only then edit `msgstr` values in `po/<scope>.<lang>.po` (or use Weblate) to provide or fix translations against the now-current `msgid`. Editing `.po` files before step 2 risks writing translations against stale `msgid` strings that po4a will mark fuzzy or discard on the next run. *(repo-specific)*
 
 ---
 
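
The guardrail on conditional cross-references added in this hunk names two patterns: a base string with a conditional suffix, and a placeholder that resolves to empty when the target is absent. A minimal Python sketch of both follows. The helper names, the example strings, and the `@tbl-ci` label are hypothetical; the repository's actual string resources live in YAML and are rendered from R, which this sketch does not reproduce.

```python
def with_conditional_suffix(base: str, suffix: str, target_present: bool) -> str:
    """Return the base text, adding the cross-reference suffix only if its target exists."""
    if target_present:
        return base + suffix
    return base


def fill_placeholder(template: str, ref_text: str, target_present: bool) -> str:
    """Resolve the {ref} placeholder to the cross-reference, or to empty text."""
    return template.format(ref=ref_text if target_present else "")


# Hypothetical flag value, mirroring a control such as includeOverallConfidenceIntervals.
include_ci = False

sentence = with_conditional_suffix(
    "Rates are shown per table row",
    " (confidence intervals in @tbl-ci)",
    include_ci,
) + "."

templated = fill_placeholder(
    "Rates are shown per table row{ref}.",
    " (confidence intervals in @tbl-ci)",
    include_ci,
)
```

Both variants key the reference off the same flag that controls the target, so the text never points at a table that was not rendered.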
@@ -137,23 +144,29 @@ Translatable content is managed via [po4a](https://po4a.org/) with Weblate for c
 
 po4a is a Perl tool that is **incompatible with native Windows**. On Windows, always run it via **WSL**.
 
-A recent version is required for all features. Use a git checkout of the master branch:
+A recent version is required for all features. The repository includes po4a as a git submodule at `tools/po4a/`. Initialize it with:
 
 ```bash
-# Typical setup (in WSL or Linux/macOS)
-cd ~/dev
-git clone https://github.com/mquinson/po4a.git
+git submodule update --init tools/po4a
 ```
 
-**Invocation**: The dev checkout must be called with `PERLLIB` set so it finds its own libraries (not system-installed ones):
+**Preferred interface**: Use `scripts/Invoke-Localization.ps1` instead of invoking po4a directly. It handles WSL, path resolution, and the full pipeline automatically:
+
+```powershell
+./scripts/Invoke-Localization.ps1 -Update # full pipeline (all configs + glossary)
+./scripts/Invoke-Localization.ps1 -Update -Config reports # po4a for reports only
 **Note:** The glossary (`glossary.yaml`) is **not** managed by po4a. It uses a custom script (`scripts/update-glossary-po.py`) that generates monolingual gettext PO with `msgctxt` for Weblate variant grouping and plural support. See the helper scripts table below.
 
 ### Target languages
 
-af, de, es, et, fr, gr, it, ne, tr (9 languages)
+af, de, el, es, et, fr, it, ne, tr (9 languages)
 
 ### Helper scripts (in `scripts/`)
 
 | Script | Purpose |
 |--------|---------|
+| `Invoke-Localization.ps1` | Unified localization wrapper with tab completion. `-Update` runs the full pipeline (fix layers → YAML keys → po4a → glossary). `-Test` runs read-only validation. See `-Config`, `-Force`, `-DryRun` switches. |
 | `Update-Po4aYamlKeys.ps1` | Auto-extract YAML keys for po4a config (run after changing YAML structure) |
 | `Test-PoPlaceholders.ps1` | Validate placeholder consistency between source and translations |
 | `update-glossary-po.py` | Convert `glossary.yaml` to/from monolingual gettext PO (replaces po4a for glossary). Requires `ruamel.yaml` and `polib`. Run after editing `glossary.yaml` to regenerate `.pot` and merge `.po` files. Use `--generate-yaml` to produce localized `glossary.<lang>.yaml`. |
@@ -190,7 +205,7 @@ When adding a new file to po4a that already has manual translations:
 3. Add the file entry to the relevant `.po4a.cfg` (if not already present).
 4. Use `po4a-gettextize` to import the existing translation into a **temporary** `.po` file:
 5. **Remove fuzzy flags** from the gettextize output. `po4a-gettextize` marks most translations as `fuzzy` (even correct ones), and po4a ignores fuzzy translations when generating output. Strip them before merging:
 ```bash
@@ -200,7 +215,7 @@ When adding a new file to po4a that already has manual translations:
-7. Verify with a round-trip: `PERLLIB=~/dev/po4a/lib ~/dev/po4a/po4a <config-file>` — check that the generated files match the backup.
+7. Verify with a round-trip: `PERLLIB=tools/po4a/lib tools/po4a/po4a <config-file>` — check that the generated files match the backup.
 
 **Important**: Run steps 4–6 in a **single WSL session** (one `wsl -e bash -c '...'` invocation). Temp files in `/tmp` do not persist across separate WSL invocations on Windows.