
idea: Automated evidence synthesis for NF Data Portal (papers + datasets → consensus summaries) #23

@changtotheintothemoon

Description

Problem

Users have to read dozens of NF (Neurofibromatosis) papers and datasets to answer a simple question. That’s slow and inconsistent. We want automation that ingests each publication/dataset we collect, extracts key findings, and generates auditable, citation-backed consensus answers to user questions.

Goal

Build a “Consensus-style” (https://consensus.app/home/about-us/) capability inside the NF portal:

  • Batch-analyze publications and datasets on ingest.
  • Let users ask scoped questions about NF and receive a graded, evidence-weighted summary with links to the underlying sources.
  • Support filters (study type, cohort size, NF subtype, outcome, patient population, etc.) and show uncertainty.

User stories

  • Researcher/Clinician: “Does MEK inhibition reduce plexiform tumor volume in pediatric NF1 patients?” → portal returns a 3–6 sentence consensus with confidence, plus an evidence table of studies/datasets and direct citations.
  • Data curator: New PubMed IDs or datasets are added → background job parses, extracts metadata/results, indexes embeddings, and (optionally) precomputes snapshots.
  • PI/Analyst: Filter to RCTs vs. observational; export evidence table + citations for a grant or report.

Scope (MVP)

  1. Ingestion & parsing

    • Publications: PubMed/DOI metadata, PDF parsing (section-aware; e.g., GROBID or equivalent).
    • Datasets: core metadata (modality, NF subtype, sample size, outcomes), link to Synapse/accession.
  2. Extraction

    • Study design (RCT, cohort, case series), cohort size, endpoints, effect directions/magnitudes when available.
    • NF-specific entities: NF1/NF2/Schwannomatosis, intervention, tumor type, age group. (A sketch of the extraction record follows this list.)
  3. Indexing & retrieval

    • Embeddings over abstract + results + methods + table captions.
    • Store structured fields (Postgres) and vector index (OpenSearch/FAISS).
  4. Answer synthesis

    • LLM generates a short consensus answer with:

      • Inline citation brackets mapping to exact papers/datasets
      • Confidence grade derived from evidence quality (see “Evidence grading”)
      • “What we don’t know yet” bullets (uncertainty)
  5. UI

    • Consensus Card (answer + confidence + top 5 sources)
    • Evidence Table (sortable: study type, N, outcome, effect, link)
    • Filters (study type, NF subtype, age, outcome)
  6. Automation

    • Background worker triggers on new/updated records.
    • Optional nightly job to refresh embeddings/summaries.
  7. Provenance & guardrails

    • Every claim links back to exact passage (page + sentence range if possible).
    • “View Snapshot” per paper/dataset: key findings, methods, limitations.
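
A minimal sketch of the structured record that step 2 (Extraction) might produce, assuming Python for the pipeline; every field and enum name here is illustrative, not a settled schema:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class StudyDesign(Enum):
    RCT = "rct"
    COHORT = "cohort"
    CASE_SERIES = "case_series"

@dataclass
class ExtractedStudy:
    """One publication/dataset after parsing + extraction (illustrative)."""
    source_id: str                     # PubMed ID, DOI, or Synapse accession
    design: Optional[StudyDesign]      # None if the extractor could not tell
    n: Optional[int]                   # cohort size, if reported
    nf_subtype: Optional[str]          # e.g. "NF1", "NF2", "schwannomatosis"
    intervention: Optional[str]        # e.g. "MEK inhibitor"
    tumor_type: Optional[str]          # e.g. "plexiform neurofibroma"
    age_group: Optional[str]           # e.g. "pediatric"
    endpoints: list[str] = field(default_factory=list)
    effect_direction: Optional[str] = None  # "benefit"/"harm"/"null", when stated
```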

Evidence grading (initial heuristic)

  • Weight = f(study type, sample size, recency, journal tier/peer review, consistency across studies).
  • Display: High / Moderate / Low confidence.
  • Show why: e.g., “3 RCTs (N=212) consistent direction; 1 small conflicting cohort study.”
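
As a rough sketch of that heuristic (all coefficients below are placeholders to be tuned, not recommendations):

```python
from datetime import date

def evidence_weight(design: str, n: int, year: int, peer_reviewed: bool) -> float:
    """Toy weight combining study type, sample size, recency, and peer review.
    Every coefficient here is an illustrative placeholder."""
    design_w = {"rct": 1.0, "cohort": 0.6, "case_series": 0.3}.get(design, 0.2)
    size_w = min(n / 200.0, 1.0)                    # saturates at N=200
    age = date.today().year - year
    recency_w = max(0.0, 1.0 - age / 20.0)          # linear decay over ~20 years
    review_w = 1.0 if peer_reviewed else 0.5
    return design_w * (0.5 + 0.5 * size_w) * (0.5 + 0.5 * recency_w) * review_w

def grade(total_weight: float, consistent_direction: bool) -> str:
    """Map aggregate weight + cross-study consistency to the displayed label."""
    if total_weight >= 2.0 and consistent_direction:
        return "High"
    if total_weight >= 0.8:
        return "Moderate"
    return "Low"
```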

API sketch

  • POST /consensus/query → {question, filters} → {answer, confidence, citations[], evidence[]}
  • GET /evidence?filters=... → tabular results
  • POST /ingest/publication|dataset → triggers parse/extract/index
  • GET /snapshot/{id} → per-source extracted summary + provenance
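
A hedged sketch of the query endpoint's shape, written here with FastAPI (the framework choice is an assumption, and run_consensus_pipeline is a hypothetical helper, stubbed out):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ConsensusQuery(BaseModel):
    question: str
    filters: dict[str, str] = {}   # e.g. {"study_type": "rct", "nf_subtype": "NF1"}

class ConsensusAnswer(BaseModel):
    answer: str                    # short consensus text with inline [n] brackets
    confidence: str                # "High" / "Moderate" / "Low"
    citations: list[str]           # source ids backing the [n] brackets
    evidence: list[dict]           # rows for the evidence table

def run_consensus_pipeline(question: str, filters: dict[str, str]) -> ConsensusAnswer:
    # Hypothetical retrieval-then-synthesis pipeline; stubbed here.
    raise NotImplementedError

@app.post("/consensus/query", response_model=ConsensusAnswer)
def consensus_query(q: ConsensusQuery) -> ConsensusAnswer:
    return run_consensus_pipeline(q.question, q.filters)
```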

Data model (key fields)

  • Source: id, type (publication|dataset), title, authors, year, journal, link, access
  • NF tags: subtype, tumor type, population (age/sex), intervention
  • Study: design, N, endpoints, effect (dir/magnitude if reported), limitations
  • Embeddings: chunks + vector ids
  • Provenance: docId, page, char offsets
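
For the embeddings piece, a minimal FAISS sketch (the dimensionality depends on whichever embedding model is chosen, so 384 is an assumption; OpenSearch would replace this in a managed deployment):

```python
import faiss
import numpy as np

DIM = 384  # depends on the chosen embedding model (assumption)

index = faiss.IndexFlatIP(DIM)   # inner-product index; L2-normalize for cosine
chunk_ids: list[str] = []        # maps FAISS row -> provenance (docId, page, offsets)

def add_chunks(vectors: np.ndarray, ids: list[str]) -> None:
    """vectors: (n, DIM) float32, already L2-normalized."""
    index.add(vectors)
    chunk_ids.extend(ids)

def search(query_vec: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
    """Return (chunk id, score) pairs for the top-k nearest chunks."""
    scores, rows = index.search(query_vec.reshape(1, -1).astype(np.float32), k)
    return [(chunk_ids[r], float(s)) for r, s in zip(rows[0], scores[0]) if r != -1]
```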

Acceptance criteria

  • Upload/ingest ≥10 NF publications + ≥3 datasets → indexed without manual cleanup.
  • Query returns a concise consensus answer (3–6 sentences, matching the user story) with a confidence label and ≥3 citations.
  • Evidence table supports filter by study type and NF subtype.
  • Each citation opens a per-source snapshot showing the exact supporting passage.
  • No uncited claims in the consensus text (see the check sketched after this list).
  • Background job auto-processes new items and updates the index.
  • Basic red-team checks: model refuses to answer outside NF scope; shows “insufficient evidence” when appropriate.
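
The "no uncited claims" criterion can be enforced mechanically. A naive sketch, assuming citations appear as inline [n] brackets:

```python
import re

def uncited_sentences(consensus_text: str) -> list[str]:
    """Return sentences that lack an inline [n] citation bracket (naive splitter)."""
    sentences = re.split(r"(?<=[.!?])\s+", consensus_text.strip())
    return [s for s in sentences if s and not re.search(r"\[\d+\]", s)]

# A synthesis result would be rejected (or regenerated) if this list is non-empty.
```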

Nice-to-haves (post-MVP)

  • Effect size extraction from tables/figures.
  • Interactive comparison (“paper vs. paper” diffs, forest-plot style view).
  • Human-in-the-loop feedback to correct extractions and boost good sources.
  • Export to DOCX/CSV with citations.

Risks & mitigations

  • Hallucinations: Strict cite-or-silent rule; show source snippet; retrieval-first pipeline.
  • PII/licensing: Only process permitted PDFs/datasets; respect data-use agreements.
  • Heterogeneous outcomes: Normalize outcome terminology; map to controlled vocab where possible.
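
For outcome normalization, a trivial lookup sketch; the mapping entries are illustrative, and a real system would source them from whichever NF vocabulary the team standardizes on (see open questions):

```python
# Illustrative synonym -> controlled-term map; real entries would come from
# the controlled vocabulary chosen for the portal.
OUTCOME_MAP = {
    "tumor shrinkage": "tumor_volume_reduction",
    "reduction in tumor volume": "tumor_volume_reduction",
    "pfs": "progression_free_survival",
    "progression-free survival": "progression_free_survival",
}

def normalize_outcome(raw: str) -> str:
    key = raw.strip().lower()
    return OUTCOME_MAP.get(key, key)  # fall back to the cleaned raw term
```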

Open questions

  • Which NF ontologies/vocabularies do we standardize on?
  • Do we restrict to human studies for clinical queries by default?
  • Preferred embedding/model stack for our environment?
