
Commit 499fb4d

Merge Polish v0.8.22 documentation gaps after release-readiness sweep
2 parents 7a9406a + 10dff2a commit 499fb4d

11 files changed

Lines changed: 324 additions & 13 deletions

docs/blog/skillbench-biomcp-skills.md

Lines changed: 4 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
# Healthcare Agents Need Skills, Not Just Models
22

3-
*Curated skills lift healthcare agent pass rates from 34% to 86%. BioMCP ships one — optimized for 30+ biomedical sources across 12 entity types.*
3+
*Curated skills lift healthcare agent pass rates from 34% to 86%. BioMCP ships one — optimized for 30+ biomedical sources across the current public entity surface.*
44

55
![Agent Skills for Healthcare — BioMCP ships an embedded skill that lifts healthcare agent pass rates by +51.9pp](images/slide-7-agent-skills.png)
66

@@ -28,13 +28,14 @@ But a fast tool without domain knowledge is just a fast way to get lost.
2828

2929
BioMCP ships an embedded skill — a `SKILL.md` with procedural guidance that teaches agents how to actually do biomedical work. Not what the APIs return, but *how to investigate* a variant, *how to profile* a drug's safety, *how to trace* a resistance mechanism across genes, drugs, trials, and literature.
3030

31-
The skill covers all 12 entity types:
31+
The skill covers the public entity surface:
3232

3333
| Entity | What agents can do | Sources |
3434
|--------|-------------------|---------|
3535
| Gene | Function, pathways, druggability, disease associations | MyGene, Enrichr, OpenTargets |
3636
| Variant | Pathogenicity, clinical evidence, population frequency, trial matching | ClinVar, CIViC, OncoKB, MyVariant |
3737
| Trial | Search by condition, mutation, drug, status, location | ClinicalTrials.gov, NCI CTS |
38+
| Diagnostic | Find gene- and disease-linked diagnostic tests | NCBI GTR, WHO IVD, OpenFDA |
3839
| Drug | Labels, interactions, adverse events, approvals, targets | DrugBank, ChEMBL, OpenFDA |
3940
| Article | Literature search, citation graphs, full text | PubMed, Semantic Scholar, Europe PMC |
4041
| Disease | Ontology, gene associations, phenotype matching | Monarch, MyDisease, OpenTargets |
@@ -111,7 +112,7 @@ biomcp skill install
111112

112113
BioMCP auto-detects your agent — Claude Code, Codex, Gemini CLI, Pi, Cursor, Copilot — and installs the skill to the right directory.
113114

114-
Two commands. Thirty sources. Twelve entity types. The domain expertise to use them.
115+
Two commands. Thirty sources. The public entity surface. The domain expertise to use it.
115116

116117
---
117118

docs/blog/we-deleted-35-tools.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ They're not wrong. We learned the same lesson the hard way.
2020

2121
## The Problem
2222

23-
BioMCP connects AI agents to 20+ biomedical databases — PubMed, ClinVar, ClinicalTrials.gov, gnomAD, OpenFDA, and more — covering 12 entity types from genes and variants to clinical trials and adverse events. The original Python version worked, but it was fighting the agent at every turn.
23+
BioMCP connects AI agents to 30+ biomedical data sources — PubMed, ClinVar, ClinicalTrials.gov, gnomAD, OpenFDA, NCBI GTR, WHO IVD, and more — across a public entity surface that spans genes, variants, diagnostics, clinical trials, drugs, adverse events, and literature. The original Python version worked, but it was fighting the agent at every turn.
2424

2525
**36 tools ate the context window.** Every MCP connection loaded all tool descriptions — `article_searcher`, `trial_getter`, `variant_searcher`, and 33 more — consuming ~16,600 tokens before a single query. That's 8% of a 200K context window gone on tool signatures alone.
2626

docs/concepts/what-is-biomcp.md

Lines changed: 9 additions & 4 deletions
@@ -25,13 +25,18 @@ need a different mental model per endpoint.
2525

2626
## Entities
2727

28-
BioMCP covers entities across clinical, research, and regulatory domains:
28+
BioMCP covers a public entity surface across clinical, research, and regulatory
29+
domains:
2930

30-
**Core clinical entities:** gene, variant, trial, article, drug, disease
31+
**Core clinical and research entities:** gene, variant, trial, article, drug,
32+
disease, and diagnostic
3133

32-
**Extended entities:** pathway, protein, adverse-event, PGx (pharmacogenomics)
34+
**Extended entities:** pathway, protein, adverse-event, PGx (pharmacogenomics),
35+
GWAS, and phenotype
3336

34-
**Discovery entities:** GWAS and phenotype search
37+
**Discovery and local-analytics surfaces:** `discover` resolves free-text
38+
biomedical phrases into follow-up commands, while `study` operates on local
39+
downloaded cBioPortal-style datasets.
3540

3641
## Why this matters for agents
3742

docs/sources/gtr.md

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
1+
---
2+
title: "NCBI Genetic Testing Registry MCP Tool for Diagnostics | BioMCP"
3+
description: "Use BioMCP to search GTR-backed genetic tests, fetch source-native diagnostic cards, and manage the local NCBI Genetic Testing Registry bundle."
4+
---
5+
6+
# NCBI Genetic Testing Registry
7+
8+
NCBI Genetic Testing Registry (GTR) is the right source when you need gene-centric genetic tests, laboratory offerings, testing methods, or condition-linked diagnostics. In BioMCP, GTR is the local-runtime backbone for the multi-source `diagnostic` entity: `search diagnostic --source gtr` stays on the NCBI bundle, while the default `--source all` route can merge GTR rows with WHO IVD rows and keeps source provenance visible.
9+
10+
BioMCP auto-downloads the GTR `test_version.gz` and `test_condition_gene.txt` bulk exports into `BIOMCP_GTR_DIR` or the default platform data directory on first diagnostic use, and refreshes stale data after 7 days. Full health output reports `GTR local data (<root>)`, and `biomcp gtr sync` forces a refresh.
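
The 7-day refresh rule for the two GTR bulk files can be illustrated with a small sketch. This is not BioMCP's implementation: the function name `gtr_bundle_state` and the labels `missing`, `stale`, and `fresh` are hypothetical, and the sketch assumes staleness is judged by file modification time. Only the file names and the 7-day window come from the text above.

```python
import time
from pathlib import Path
from typing import Optional

# File names and the 7-day window are taken from the documentation text.
# Everything else here is a hypothetical illustration.
GTR_FILES = ("test_version.gz", "test_condition_gene.txt")
REFRESH_WINDOW_SECONDS = 7 * 24 * 60 * 60


def gtr_bundle_state(root: Path, now: Optional[float] = None) -> str:
    """Classify a local GTR directory as 'missing', 'stale', or 'fresh'."""
    now = time.time() if now is None else now
    paths = [root / name for name in GTR_FILES]
    # A bundle counts as missing unless every required file is present.
    if not all(p.is_file() for p in paths):
        return "missing"
    # Assumption: the oldest file decides whether the bundle is stale.
    oldest_mtime = min(p.stat().st_mtime for p in paths)
    if now - oldest_mtime > REFRESH_WINDOW_SECONDS:
        return "stale"
    return "fresh"
```

A caller could pass `Path(os.environ["BIOMCP_GTR_DIR"])` as `root`; the authoritative readiness report remains `biomcp health`.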
11+
12+
## What BioMCP exposes
13+
14+
| Command | What BioMCP gets from this source | Integration note |
15+
|---|---|---|
16+
| `search diagnostic --gene <symbol> --source gtr` | Gene-linked genetic test rows | Uses the local GTR condition/gene relation bundle for gene-centric lookup |
17+
| `search diagnostic --disease <name> --source gtr` | Condition-linked GTR diagnostic rows | Applies the same bounded disease phrase matching used by the multi-source diagnostic route |
18+
| `search diagnostic --type <method> --source gtr` | GTR testing-method filtered rows | Matches source-native GTR methods |
19+
| `search diagnostic --manufacturer <name> --source gtr` | Laboratory or provider filtered rows | Filters over GTR lab/provider metadata when available |
20+
| `get diagnostic GTR000006692.3` | Source-native GTR diagnostic summary card | GTR accessions are the detail identifiers |
21+
| `get diagnostic GTR000006692.3 genes` | Joined gene symbols for the test | JSON keeps the full deduped symbol arrays even when markdown rows are compact |
22+
| `get diagnostic GTR000006692.3 conditions` | Joined condition names for the test | Preserves GTR condition provenance |
23+
| `get diagnostic GTR000006692.3 methods` | GTR source-native testing methods | GTR supports `genes`, `conditions`, `methods`, and opt-in `regulatory` overlay sections |
24+
| `biomcp health` | GTR local readiness row | Reports whether the local bundle is configured, stale, missing, or available at the default path |
25+
| `biomcp gtr sync` | Explicit GTR refresh | Force-refreshes the local GTR bulk files |
26+
27+
## Example commands
28+
29+
```bash
30+
biomcp search diagnostic --gene BRCA1 --source gtr --limit 5
31+
```
32+
33+
Returns GTR genetic-test rows linked to BRCA1 with accession IDs ready for `get diagnostic`.
34+
35+
```bash
36+
biomcp search diagnostic --disease "hereditary breast cancer" --source gtr --limit 5
37+
```
38+
39+
Searches GTR condition-linked diagnostic rows using the bounded disease phrase filter.
40+
41+
```bash
42+
biomcp get diagnostic GTR000006692.3 genes conditions methods
43+
```
44+
45+
Fetches a source-native GTR diagnostic card with the main GTR-supported detail sections.
46+
47+
```bash
48+
biomcp gtr sync
49+
```
50+
51+
Force-refreshes the local GTR bulk bundle without waiting for the next automatic refresh.
52+
53+
```bash
54+
biomcp health
55+
```
56+
57+
Shows the `GTR local data` readiness row alongside WHO IVD and the other local-runtime bundles.
58+
59+
## API access
60+
61+
No BioMCP API key required. BioMCP auto-downloads the public GTR bulk files into
62+
`BIOMCP_GTR_DIR` or the default data directory on first use. The optional
63+
`regulatory` section for diagnostic cards is an OpenFDA device overlay and can
64+
benefit from `OPENFDA_API_KEY`; base GTR search and detail do not require a key.
65+
66+
## Official source
67+
68+
[NCBI Genetic Testing Registry](https://www.ncbi.nlm.nih.gov/gtr/) is the
69+
official NCBI diagnostic-test registry behind BioMCP's GTR-backed diagnostic
70+
workflow. BioMCP reads the public bulk export at
71+
<https://ftp.ncbi.nlm.nih.gov/pub/GTR/data/>.
72+
73+
## Related docs
74+
75+
- [Diagnostic](../user-guide/diagnostic.md)
76+
- [Disease](../user-guide/disease.md)
77+
- [Data Sources](../reference/data-sources.md)
78+
- [Troubleshooting](../troubleshooting.md)

docs/sources/index.md

Lines changed: 3 additions & 1 deletion
@@ -1,6 +1,6 @@
11
---
22
title: "Biomedical Data Sources for AI Agents | BioMCP"
3-
description: "Explore BioMCP source guides for PubMed, ClinicalTrials.gov, ClinVar, OpenFDA, CDC WONDER VAERS, UniProt, gnomAD, Reactome, Semantic Scholar, ChEMBL, OpenTargets, SEER Explorer, CIViC, OncoKB, cBioPortal, DDInter, EMA, WHO Prequalification, WHO Prequalified IVD, CDC CVX/MVX, KEGG, PharmGKB / CPIC, Human Protein Atlas, and Monarch Initiative."
3+
description: "Explore BioMCP source guides for PubMed, ClinicalTrials.gov, ClinVar, OpenFDA, CDC WONDER VAERS, UniProt, gnomAD, Reactome, Semantic Scholar, ChEMBL, OpenTargets, SEER Explorer, CIViC, OncoKB, cBioPortal, DDInter, EMA, WHO Prequalification, NCBI Genetic Testing Registry, WHO Prequalified IVD, CDC CVX/MVX, MedlinePlus, KEGG, PharmGKB / CPIC, Human Protein Atlas, and Monarch Initiative."
44
---
55

66
# Biomedical Data Sources for AI Agents
@@ -31,8 +31,10 @@ Use these pages when you already know the provider you trust, the keyword you ar
3131
| DDInter | Structured drug-drug interactions, severity levels, and class-oriented partner review | [DDInter](ddinter.md) |
3232
| EMA | EU regulatory, safety, and shortage context for medicines | [EMA](ema.md) |
3333
| WHO Prequalification | WHO-backed medicine and vaccine prequalification search plus global access checks | [WHO Prequalification](who-prequalification.md) |
34+
| NCBI Genetic Testing Registry | Gene-centric genetic tests, GTR diagnostic cards, and local bundle lifecycle | [NCBI Genetic Testing Registry](gtr.md) |
3435
| WHO Prequalified IVD | Infectious-disease diagnostic products, assay formats, and WHO product-card provenance | [WHO Prequalified IVD](who-ivd.md) |
3536
| CDC CVX/MVX | Vaccine brand-to-antigen bridge for EMA/default lookups and explicit WHO vaccine search | [CDC CVX/MVX](cdc-cvx.md) |
37+
| MedlinePlus | Plain-language disease/symptom context for `discover` and disease `clinical_features` | [MedlinePlus](medlineplus.md) |
3638
| KEGG | KEGG pathway IDs, summary cards, and pathway genes | [KEGG](kegg.md) |
3739
| PharmGKB / CPIC | Pharmacogenomic recommendations, frequencies, and clinical annotations | [PharmGKB / CPIC](pharmgkb.md) |
3840
| Human Protein Atlas | Tissue expression, localization, and cancer-expression context | [Human Protein Atlas](human-protein-atlas.md) |

docs/sources/medlineplus.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
1+
---
2+
title: "MedlinePlus MCP Tool for Plain-Language Disease Context | BioMCP"
3+
description: "Use BioMCP to add MedlinePlus plain-language context to discover results and opt-in disease clinical-feature summaries."
4+
---
5+
6+
# MedlinePlus
7+
8+
MedlinePlus is the right source when you need plain-language disease or symptom context alongside structured biomedical identifiers. In BioMCP, MedlinePlus supplements `biomcp discover` for disease and symptom-oriented prompts; it is suppressed for gene, drug, pathway, and other flows where consumer-health prose would add noise.
9+
10+
MedlinePlus also backs the opt-in `get disease <name_or_id> clinical_features` section. That section surfaces reviewed clinical-summary rows for configured diseases and can fall back to embedded reviewed fixtures when live MedlinePlus search is unavailable.
11+
12+
## What BioMCP exposes
13+
14+
| Command | What BioMCP gets from this source | Integration note |
15+
|---|---|---|
16+
| `biomcp discover <query>` | Plain-language context for disease and symptom queries | Supplemental only; OLS4 remains the required structured-concept backbone |
17+
| `get disease <name_or_id> clinical_features` | MedlinePlus clinical-summary feature rows | Opt-in disease section that keeps consumer-health context out of default disease cards |
18+
19+
## Example commands
20+
21+
```bash
22+
biomcp discover "symptoms of Marfan syndrome"
23+
```
24+
25+
Returns structured discover follow-ups with supplemental MedlinePlus plain-language context when the query resolves as a disease or symptom flow.
26+
27+
```bash
28+
biomcp get disease "uterine leiomyoma" clinical_features
29+
```
30+
31+
Fetches the opt-in MedlinePlus-backed clinical-features section for a configured disease.
32+
33+
```bash
34+
biomcp get disease MONDO:0007947 clinical_features
35+
```
36+
37+
Uses a resolved disease identifier while keeping the MedlinePlus clinical-summary section explicit.
38+
39+
## API access
40+
41+
No BioMCP API key required. BioMCP uses the public MedlinePlus Search endpoint
42+
for supplemental discover context and disease clinical-feature summaries, with
43+
embedded reviewed fixtures as the offline fallback for configured diseases.
44+
45+
## Official source
46+
47+
[MedlinePlus](https://medlineplus.gov/) is the official NLM consumer-health
48+
site. BioMCP uses the public [MedlinePlus Search](https://wsearch.nlm.nih.gov/ws/query)
49+
endpoint for the surfaces described here.
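
As a rough illustration of what a request to that public endpoint looks like, the sketch below builds a MedlinePlus Search URL. The helper name `medlineplus_search_url` is hypothetical, and the choice of the `db=healthTopics`, `term`, and `retmax` parameters is an assumption about a typical query, not a statement of what BioMCP sends.

```python
from urllib.parse import urlencode

# Endpoint URL as given in the documentation text.
MEDLINEPLUS_SEARCH_URL = "https://wsearch.nlm.nih.gov/ws/query"


def medlineplus_search_url(term: str, retmax: int = 5) -> str:
    """Build a health-topics search URL (hypothetical helper, no network I/O)."""
    # Assumption: English health topics database and a small result cap.
    query = urlencode({"db": "healthTopics", "term": term, "retmax": retmax})
    return f"{MEDLINEPLUS_SEARCH_URL}?{query}"


print(medlineplus_search_url("Marfan syndrome"))
```

The function only assembles the URL, so it can be inspected or logged without contacting the service.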
50+
51+
## Related docs
52+
53+
- [Discover](../user-guide/discover.md)
54+
- [Disease](../user-guide/disease.md)
55+
- [Data Sources](../reference/data-sources.md)
56+
- [Source Licensing](../reference/source-licensing.md)

docs/sources/semantic-scholar.md

Lines changed: 23 additions & 2 deletions
@@ -7,13 +7,13 @@ description: "Use BioMCP to add Semantic Scholar TLDRs, citations, references, a
77

88
Semantic Scholar matters when you already have the paper and need the graph around it: the TLDR, the follow-up literature, the references it builds on, and the related papers worth checking next. It turns a flat article lookup into a literature-review workflow that an agent can keep extending without losing the thread.
99

10-
In BioMCP, `search article` does not expose `--source semantic-scholar`. Instead, Semantic Scholar is an automatic optional search leg when the filter set is compatible. The dedicated helper commands on this page are the direct reason to come here: `get article <id> tldr`, `article citations`, `article references`, and `article recommendations`.
10+
In BioMCP, `search article` does not expose `--source semantic-scholar`. Instead, Semantic Scholar is an automatic optional search leg when the filter set is compatible. Without `S2_API_KEY` it runs in shared-pool mode at 1 request every 2 seconds; with the key it runs in authenticated mode at 1 request per second. The dedicated helper commands on this page are the direct reason to come here: `get article <id> tldr`, `article citations`, `article references`, and `article recommendations`.
1111

1212
## What BioMCP exposes
1313

1414
| Command | What BioMCP gets from this source | Integration note |
1515
|---|---|---|
16-
| `search article` | Optional compatible search-leg enrichment | Semantic Scholar joins article search automatically when the filter set allows it |
16+
| `search article` | Optional compatible search-leg enrichment plus source status | Semantic Scholar joins article search automatically when the filter set allows it; `--source semantic-scholar` is not a public source switch |
1717
| `get article <id> tldr` | TLDR text, influence counts, and related article metadata | Dedicated Semantic Scholar helper |
1818
| `article citations <id>` | Citation graph rows | Dedicated Semantic Scholar helper |
1919
| `article references <id>` | Reference graph rows | Dedicated Semantic Scholar helper |
@@ -49,6 +49,27 @@ Returns a recommendations table with PMID, title, journal, and year columns.
4949

5050
Optional `S2_API_KEY` for dedicated quota and higher reliability. Configure it with the [API Keys](../getting-started/api-keys.md) guide and request one from the [Semantic Scholar API page](https://www.semanticscholar.org/product/api).
5151

52+
Without `S2_API_KEY`, BioMCP uses the shared unauthenticated pool at
53+
1 req/2sec. A shared-pool HTTP 429 fails fast with guidance to set the key
54+
instead of retrying against the same public pool. With `S2_API_KEY`, BioMCP
55+
sends authenticated requests at 1 req/sec and honors authenticated
56+
`Retry-After` responses before retrying. Source status and debug-plan output
57+
report `auth_mode` as `shared_pool` or `authenticated`, but never print the
58+
secret key or key prefix.
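
The two pacing modes above can be sketched as a minimum-interval limiter. This is an illustrative sketch, not BioMCP's code: the class `MinIntervalLimiter` and the helpers `auth_mode` and `limiter_for` are hypothetical names, and only the intervals and the `shared_pool` / `authenticated` labels come from the text. Retry handling for HTTP 429 and `Retry-After` is deliberately left out.

```python
import time
from typing import Callable, Mapping, Optional

SHARED_POOL_INTERVAL = 2.0    # 1 req / 2 sec without S2_API_KEY
AUTHENTICATED_INTERVAL = 1.0  # 1 req / sec with S2_API_KEY


def auth_mode(env: Mapping[str, str]) -> str:
    """Return the mode label without ever exposing the key itself."""
    return "authenticated" if env.get("S2_API_KEY") else "shared_pool"


class MinIntervalLimiter:
    """Enforce a minimum spacing between consecutive requests."""

    def __init__(
        self,
        interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay


def limiter_for(env: Mapping[str, str]) -> MinIntervalLimiter:
    """Pick the pacing interval from the presence of S2_API_KEY."""
    if auth_mode(env) == "authenticated":
        return MinIntervalLimiter(AUTHENTICATED_INTERVAL)
    return MinIntervalLimiter(SHARED_POOL_INTERVAL)
```

A caller would invoke `limiter_for(os.environ).wait()` before each request. Injecting `clock` and `sleep` keeps the pacing logic testable without real delays.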
59+
60+
## Runtime behavior
61+
62+
`search article` exposes Semantic Scholar as an automatic compatible leg rather
63+
than a user-selectable source flag. Keep using `--source all`, `pubtator`,
64+
`europepmc`, `pubmed`, or `litsense2` for the public source switch; Semantic
65+
Scholar joins only when the article filters can support it.
66+
67+
JSON search responses can include redacted Semantic Scholar source status under
68+
`_meta.source_status[]`, and `--debug-plan` mirrors that redacted status in the
69+
article leg so operators can distinguish `ok`, `degraded`, and `unavailable`
70+
without exposing credentials. Degradation of the optional Semantic Scholar leg
71+
should not be read as a PubMed, Europe PMC, or PubTator failure.
72+
5273
## Official source
5374

5475
[Semantic Scholar](https://www.semanticscholar.org/) is the official literature-graph product behind BioMCP's TLDR and citation helper workflows.

docs/troubleshooting.md

Lines changed: 40 additions & 1 deletion
@@ -314,7 +314,46 @@ vaccine name/brand search when `--product-type vaccine` is set. It does not
314314
change WHO finished-pharma/API lookups or `get drug`, and pure `--region us`
315315
searches do not use the CVX root.
316316

317-
## 16) Diagnostic local data not available
317+
## 16) DDInter local data not available
318+
319+
Drug interaction commands depend on the local DDInter CSV bundle. BioMCP
320+
auto-downloads the eight ATC-sliced DDInter files on first use for
321+
`biomcp drug interactions <name>` and `get drug <name> interactions`, but full
322+
`biomcp health` is still the right readiness surface when you need to debug the
323+
local DDInter state:
324+
325+
```bash
326+
biomcp health
327+
```
328+
329+
Interpret the DDInter row like this:
330+
331+
- `configured`: `BIOMCP_DDINTER_DIR` is set and all required DDInter CSV files are present
332+
- `configured (stale)`: `BIOMCP_DDINTER_DIR` is set and complete, but at least one file is older than the 72-hour refresh window
333+
- `available (default path)`: BioMCP found a complete DDInter bundle in the default platform data directory
334+
- `available (default path, stale)`: the default-path DDInter bundle is complete but older than the 72-hour refresh window
335+
- `not configured`: no complete DDInter root was found at the default path, so DDInter-backed interaction rows are currently unavailable but the install is not considered broken
336+
- `error (missing: ...)`: BioMCP found a partial DDInter root or unreadable file; install the missing files or point `BIOMCP_DDINTER_DIR` at a complete bundle
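
For scripted checks, the row states listed above can be mapped to a usable flag and a next step. This is a hypothetical sketch: the function `interpret_ddinter_row` and its advice strings are not part of BioMCP, and treating stale bundles as still usable is an assumption drawn from the descriptions above.

```python
from typing import Tuple

# Hypothetical advice strings, loosely paraphrasing the guidance in this section.
ADVICE_NONE = "no action needed"
ADVICE_SYNC = "run `biomcp ddinter sync` to refresh"
ADVICE_FETCH = "run a DDInter-backed command or `biomcp ddinter sync`"
ADVICE_REPAIR = "install missing files or set BIOMCP_DDINTER_DIR to a complete bundle"


def interpret_ddinter_row(status: str) -> Tuple[bool, str]:
    """Map a DDInter health status string to (usable, suggested next step)."""
    status = status.strip()
    if status.startswith("error (missing:"):
        return (False, ADVICE_REPAIR)
    if status == "not configured":
        return (False, ADVICE_FETCH)
    if status in ("configured (stale)", "available (default path, stale)"):
        # Assumption: a complete but stale bundle still serves interaction rows.
        return (True, ADVICE_SYNC)
    if status in ("configured", "available (default path)"):
        return (True, ADVICE_NONE)
    raise ValueError(f"unrecognized DDInter status: {status!r}")
```

Raising on unknown strings keeps the sketch from silently misreading a status format it was not written for.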
337+
338+
If a refresh fails, retry explicitly with `biomcp ddinter sync`:
339+
340+
```bash
341+
biomcp ddinter sync
342+
```
343+
344+
If you need to override the default path:
345+
346+
```bash
347+
export BIOMCP_DDINTER_DIR="/path/to/ddinter"
348+
biomcp health
349+
```
350+
351+
Manual preseed remains supported for offline or controlled environments. A
352+
complete DDInter root must contain the eight public DDInter CSV files downloaded
353+
from the ATC-sliced DDInter bundle. Empty interaction results stay scoped to the
354+
current DDInter bundle and do not prove absence of clinical interactions.
355+
356+
## 17) Diagnostic local data not available
318357

319358
Diagnostic search/get flows depend on two local bundles:
320359

mkdocs.yml

Lines changed: 2 additions & 0 deletions
@@ -54,8 +54,10 @@ nav:
5454
- DDInter: sources/ddinter.md
5555
- EMA: sources/ema.md
5656
- WHO Prequalification: sources/who-prequalification.md
57+
- NCBI Genetic Testing Registry: sources/gtr.md
5758
- WHO Prequalified IVD: sources/who-ivd.md
5859
- CDC CVX/MVX: sources/cdc-cvx.md
60+
- MedlinePlus: sources/medlineplus.md
5961
- KEGG: sources/kegg.md
6062
- PharmGKB / CPIC: sources/pharmgkb.md
6163
- Human Protein Atlas: sources/human-protein-atlas.md
