Conversation
) (#186) * fix(drift): structured stub sentinel + WARNING for doc-only probes (#177) The 7 documentation-only plugins (alphagenome, cosmic_cgc, dbnsfp, ensembl_gene, gevir, gnomad_metrics, pqtl) returned a placeholder 'sha256:phase-k-stub-not-implemented' fingerprint. That fake sha256 was stamped into provenance as if a real probe ran, and at drift time the missing baseline surfaced as an indistinct 'probe_failed' (or a false-green 'clean' had a baseline been regenerated from the stub). Option 1 from #177: - Add api.stub_fingerprint(reason) returning a structured sentinel {probe_status: 'stub', reason, fingerprint: 'stub:no-programmatic-source'}. run_builder records the honest token instead of a fake sha256. - drift_runner invokes the probe first and short-circuits a probe_status == 'stub' result to a distinct status='stub' (no baseline required). - drift_cli prints a visible WARNING (stderr) for stubs and exits clean, so an intentional doc-only stub is never a silent false-green. - Rewrite the 7 stub probes to call stub_fingerprint() with a provider-specific reason. Adds one regression test guarding stub -> status='stub' (not clean/ probe_failed). * fix(drift): emit stub WARNING to stderr in --json mode + surface stubs in CI summary (#186 review) Addresses Copilot's review on PR #186: the stub WARNING was only emitted on the human-readable output path, so `hvantk drift --all --json` (used by .github/workflows/drift.yml, which captures stdout to a file) re-hid documentation-only stubs. - drift_cli: move the stub WARNING into a shared loop that runs in BOTH --json and human modes, writing to stderr so machine-readable stdout stays clean while CI step logs stay visibly non-green. - drift_to_pr.py: count `stub` entries in the summary and (new handle_stub) write a `- STUB:` line to GITHUB_STEP_SUMMARY, symmetric with drifted/probe_failed, so stubs are not dropped from the rendered CI summary. Docstring updated to document the `stub` status. - test: regression test asserting --json mode emits the stub WARNING to stderr (and keeps stdout clean JSON). The drift_to_pr.py step-summary handler was surfaced by an adversarial review of the CLI fix and verified offline via a --dry-run synthetic report. * test(drift): make stub-WARNING test work on Click >= 8.2 (mix_stderr removed) CI runs Click >= 8.2, which removed the `CliRunner(mix_stderr=...)` kwarg (streams are always captured separately there), so the test raised TypeError: CliRunner.__init__() got an unexpected keyword argument 'mix_stderr'. - Construct the runner with mix_stderr=False on Click 8.1.x, falling back to CliRunner() on >= 8.2 (try/except TypeError). - Assert on `result.stdout` (stdout-only on both versions) rather than `result.output` (which mixes stderr back in on >= 8.2). Verified the exact assertions pass under both Click 8.1.8 and 8.4.1.
…rden dedup, qc sci-notation, lint) (#187) * fix(#185): safe deferred-review polish (catalog data_source, gff3, burden dedup, qc sci-notation, lint) Addresses the mechanical / behavior-preserving subset of #185. Each item was re-verified at HEAD (post PR-5..PR-8 refactor) before changing. Catalog metadata: - Normalize data_source "Custom" -> canonical names across the 10 migrated genomics catalogs (ClinGen, ClinVar, dbNSFP, Ensembl, GeVIR, gnomAD, GTEx, GWAS Catalog, INSIDER, MSigDB) so `catalog list --data-source <Name>` and unified_registry.search(data_source=...) match the real source name. - ensembl_gene file format "gff" -> "gff3" (path is *.gff3.gz; cosmetic). Code: - enrichex/burden.py _build_gene_to_sets_ht: dedup genes within a set via dict.fromkeys (order-preserving) so a duplicated gene is not double-counted after explode_rows. Latent (parse_geneset_tsv dedups upstream); guards other callers. + regression test (verified non-vacuous). - hgc/qc_report.py render_qc_summary_markdown: scientific notation for very small magnitudes (tiny HWE p-values) instead of collapsing to "0.000"; exact 0 stays "0.000". - tools/plugins/plugins_cli.py: split the bundled-schema read and the user catalog read into separate try blocks (correct error attribution) and chain exceptions with `from exc` (Ruff B904). - tools/enrichex/burden_cli.py: drop placeholder-less f-string prefixes (Ruff F541), 7 occurrences. Deliberately NOT included (need maintainer decisions, tracked in #185): - schema-enum tightening for data_source - gnomad_metrics variant_types SV/CNV scope (domain call) - gwas_catalog / msigdb URL pinning vs provenance-only - unified_registry.py fail-fast for an unmapped primary_domain (blast radius) * fix(#185): dedup gene_sets at compute_geneset_burden_mt entry for size/coverage consistency (#187 review) Copilot flagged that deduping only inside _build_gene_to_sets_ht left gene_set_size (len(genes)), gene_coverage_pct, and the min_gene_set_size filter computed from the RAW list — so a caller passing duplicate genes within a set would get a deduped membership (n_genes_found) inconsistent with a non-deduped size/denominator. Fix: normalize each set's gene list once at the top of compute_geneset_burden_mt (order-preserving dict.fromkeys) so the min-size filter, gene_set_size, gene_coverage_pct, and the gene->set membership all derive from the same deduped genes. No-op for the canonical pipeline (parse_geneset_tsv already dedups). The _build_gene_to_sets_ht dedup is kept as defense for direct callers. + regression test asserting gene_set_size reflects the deduped count.
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a "stub" probe sentinel and shared stub helper; enriches drift-runner results and flows to surface stub warnings; converts several probe modules to use the stub helper; adds gene-set deduplication and tests; corrects dataset catalog metadata and minor CLI/formatting tweaks. ChangesDrift Probe Stub Status Infrastructure
Gene-Set Burden Deduplication
Dataset Catalog and Metadata Updates
Supporting Improvements
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Not up to standards ⛔🔴 Issues
|
| Category | Results |
|---|---|
| Documentation | 26 minor |
| ErrorProne | 1 high |
| Security | 11 high |
🟢 Metrics 8 complexity
Metric Results Complexity 8
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Pull request overview
This PR improves the drift-detection pipeline to explicitly support documentation-only (“stub”) datasets, ensuring they surface as visible warnings rather than producing misleading “clean” or “probe_failed” results. It also tightens gene set burden calculations by deduplicating within-set gene lists consistently, and applies small catalog and report-rendering corrections.
Changes:
- Add a first-class stub probe contract (
stub_fingerprint/PROBE_STATUS_STUB) and propagate stub status through drift runner, CLI output, and CI reporting. - Normalize/deduplicate gene lists within gene sets to prevent double-counting and inconsistent set-size/coverage metrics; add regression tests.
- Apply minor catalog metadata fixes and improve QC markdown formatting for very small numeric values.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
.github/scripts/drift_to_pr.py |
Surfaces stub entries in CI step summary output without opening PRs for them. |
hvantk/core/plugin/api.py |
Introduces standardized stub fingerprint API/constants for probes that cannot be programmatically fingerprinted. |
hvantk/core/plugin/drift_runner.py |
Propagates stub probe results as status="stub" and changes runner ordering to probe-first to avoid false “probe_failed” on stub baselines. |
hvantk/tools/plugins/drift_cli.py |
Emits stub WARNINGs to stderr in both human and --json modes while keeping stdout machine-readable. |
hvantk/tools/plugins/plugins_cli.py |
Improves error handling when reading bundled catalog schema / catalog JSON. |
hvantk/algorithms/enrichex/burden.py |
Deduplicates genes within each gene set (order-preserving) to keep sizes/membership consistent and avoid double counting. |
hvantk/tools/enrichex/burden_cli.py |
Minor string formatting cleanup in CLI output. |
hvantk/algorithms/hgc/qc_report.py |
Uses scientific notation for very small magnitudes to avoid rendering as 0.000. |
hvantk/tests/test_drift_runner.py |
Adds regression test ensuring stub probes are reported as stub (not false-green or probe_failed) when baseline is missing. |
hvantk/tests/test_drift_cli.py |
Adds regression test ensuring stub WARNING appears on stderr even in --json mode. |
hvantk/tests/enrichex/test_burden_hail.py |
Adds regression tests for within-set deduplication affecting gene_set_size and gene→set membership. |
hvantk/skills/alphagenome/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/cosmic_cgc/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/dbnsfp/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/ensembl_gene/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/gevir/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/gnomad_metrics/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/pqtl/drift_probe.py |
Converts stub probe implementation to standardized stub_fingerprint(...). |
hvantk/skills/clingen/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/clinvar/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/dbnsfp/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/ensembl_gene/catalog/datasets.json |
Corrects data_source metadata and fixes file format from gff to gff3. |
hvantk/skills/gevir/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/gnomad_metrics/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/gtex_eqtl/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/gwas_catalog/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/insider/catalog/datasets.json |
Corrects data_source metadata. |
hvantk/skills/msigdb/catalog/datasets.json |
Corrects data_source metadata. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| observed = _invoke_with_timeout(spec.drift_probe, timeout=timeout) | ||
| except DriftProbeError as exc: | ||
| return DriftResult( | ||
| dataset_name=spec.name, | ||
| status="probe_failed", | ||
| expected=expected, | ||
| probe_error=exc, |
There was a problem hiding this comment.
Valid — verified at drift_runner.py (the #186 probe-first reordering): the two probe-failure except branches early-returned probe_failed without ever reading spec.test_paths.drift_fingerprint, so a probe exception combined with a missing baseline surfaced only the probe error.
Fixed in c4484ab: both probe-failure paths now go through a _probe_failed() helper that appends expected fingerprint is missing at <path> when the baseline file is absent, while preserving the underlying probe error. No change when a baseline exists (so real plugins are unaffected). Added a regression test asserting the error mentions both the probe failure and the missing fingerprint.
…188 review) Copilot (PR #188) noted that the probe-first ordering introduced in #186 means a probe_failed early-return no longer mentions a missing expected fingerprint: if the probe raises AND no baseline is committed, only the probe error surfaced, masking the missing-baseline config error. Route both probe-failure paths through a _probe_failed() helper that appends "expected fingerprint is missing at <path>" when the baseline file is absent, while preserving the underlying probe error. No change when a baseline exists. + regression test (probe raises + no baseline -> error mentions both).
This pull request introduces robust support for documentation-only ("stub") datasets in the drift detection pipeline. It standardizes the handling and reporting of sources that lack programmatically accessible data, ensuring they are surfaced as visible warnings rather than producing misleading results. Additionally, the pull request improves deduplication logic in gene set burden calculations and makes minor catalog corrections.
Drift detection improvements:
status="stub"for datasets with no programmatic probe, updating the drift pipeline and reporting to handle these as surfaced warnings rather than silent successes or failures. This includes a newhandle_stubfunction, updates to CLI summary output, and changes to the drift runner logic to recognize and propagate stub status. (.github/scripts/drift_to_pr.py,hvantk/core/plugin/drift_runner.py) [1] [2] [3] [4] [5] [6] [7]stub_fingerprint,PROBE_STATUS_STUB,STUB_FINGERPRINT_TOKEN) to be used by drift probes that cannot programmatically fingerprint their source. (hvantk/core/plugin/api.py)alphagenome,cosmic_cgc,dbnsfp,ensembl_gene,gevir) to use the newstub_fingerprinthelper and provide clear reasons for stub status. (hvantk/skills/alphagenome/drift_probe.py,hvantk/skills/cosmic_cgc/drift_probe.py,hvantk/skills/dbnsfp/drift_probe.py,hvantk/skills/ensembl_gene/drift_probe.py,hvantk/skills/gevir/drift_probe.py) [1] [2] [3] [4] [5]Gene set burden calculation improvements:
hvantk/algorithms/enrichex/burden.py) [1] [2]Data catalog corrections:
data_sourcefield in several dataset catalogs to reflect the actual upstream source instead of "Custom" and fixed a file format typo from"gff"to"gff3". (hvantk/skills/clingen/catalog/datasets.json,hvantk/skills/clinvar/catalog/datasets.json,hvantk/skills/dbnsfp/catalog/datasets.json,hvantk/skills/ensembl_gene/catalog/datasets.json,hvantk/skills/gevir/catalog/datasets.json) [1] [2] [3] [4] [5] [6]Reporting improvements:
hvantk/algorithms/hgc/qc_report.py)Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests