
# BioAnalyzer Curator Table

A sortable, searchable online table of BioAnalyzer predictions for candidate curatable PubMed articles, intended for real-world testing by curators (see Levi's suggestion).

**Zero-cost option:** A static version of the curator table can be hosted on GitHub Pages (no server, no Docker). See `docs/curator-table/README.md` for setup. Use the Streamlit app when you need to collect curator feedback; use the static table for viewing and sharing predictions with curators at minimal cost.

## Quick start

**Recommended** – use the CLI (Streamlit is included in the main project):

```bash
BioAnalyzer run table
```

Then open http://localhost:8501. Use `--port` to change the port:

```bash
BioAnalyzer run table --port 8502
```

**Alternative** – run Streamlit directly from the repo root (after `pip install -e .` or in Docker):

```bash
streamlit run curator_table/app.py
```

Or from this directory:

```bash
pip install -r requirements.txt   # only if not using the main project install
streamlit run app.py
```

## Data format

The app expects a CSV or Parquet file with at least:

- `PMID` (required)
- `Title` (recommended)
- The six status columns:
  `Host Species Status`, `Body Site Status`, `Condition Status`,
  `Sequencing Type Status`, `Taxa Level Status`, `Sample Size Status`
  (values: `PRESENT`, `PARTIALLY_PRESENT`, `ABSENT`)

Optional columns: `Journal`, `Summary`, `Year`, `Publication Date`, `Processing Time`.

You can use:

- Exports from the BioAnalyzer CLI/API (e.g. `analysis_results.csv`).
- The validation dataset format (e.g. after merging predictions and metadata into one table with `PMID`, `Title`, and the six status columns).
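As a quick sanity check before pointing the app at a file, the schema above can be verified with the standard library. This is a minimal sketch (not part of the app); `check_curator_table` is a hypothetical helper, and the two sample rows are illustrative:

```python
import csv
import io

# The six status columns the table expects (from the format above).
STATUS_COLUMNS = [
    "Host Species Status", "Body Site Status", "Condition Status",
    "Sequencing Type Status", "Taxa Level Status", "Sample Size Status",
]
ALLOWED_VALUES = {"PRESENT", "PARTIALLY_PRESENT", "ABSENT"}

def check_curator_table(fieldnames, rows):
    """Return a list of problems; an empty list means the file should load."""
    problems = []
    if "PMID" not in fieldnames:
        problems.append("missing required column: PMID")
    for col in STATUS_COLUMNS:
        if col not in fieldnames:
            problems.append(f"missing status column: {col}")
        else:
            bad = {r[col] for r in rows} - ALLOWED_VALUES
            if bad:
                problems.append(f"{col}: unexpected values {sorted(bad)}")
    return problems

# A two-row example in the expected shape (illustrative PMIDs and titles).
csv_text = """\
PMID,Title,Host Species Status,Body Site Status,Condition Status,Sequencing Type Status,Taxa Level Status,Sample Size Status
12345678,Gut microbiome in IBD,PRESENT,PRESENT,PARTIALLY_PRESENT,PRESENT,ABSENT,PARTIALLY_PRESENT
23456789,Skin microbiota survey,PRESENT,PRESENT,ABSENT,PRESENT,PRESENT,ABSENT
"""
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)
problems = check_curator_table(reader.fieldnames, rows)
print(problems)  # → []
```

The same check works on a Parquet export after reading it into rows with pandas or pyarrow.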

## Features

- Search by PMID, title, journal, or summary.
- Sort by any column (PMID, Title, or any status).
- A PubMed link per row (opens in a new tab).
- Curator feedback: record a verdict (Correct / Incorrect / Uncertain) per PMID; feedback is saved to `curator_feedback.csv` in the current working directory and can be downloaded for later analysis.
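The saved feedback can then be tallied offline. A sketch of that later analysis step, assuming a simple per-verdict row layout for `curator_feedback.csv` (the app's actual column layout is not documented above, so the `PMID`/`verdict` fields here are an assumption):

```python
import csv
from collections import Counter

# Assumed layout of curator_feedback.csv: one row per recorded verdict.
# The columns the app actually writes may differ.
rows = [
    {"PMID": "12345678", "verdict": "Correct"},
    {"PMID": "23456789", "verdict": "Incorrect"},
    {"PMID": "34567890", "verdict": "Correct"},
]
with open("curator_feedback.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["PMID", "verdict"])
    writer.writeheader()
    writer.writerows(rows)

# Later analysis: tally verdicts to estimate prediction accuracy.
with open("curator_feedback.csv", newline="") as f:
    counts = Counter(row["verdict"] for row in csv.DictReader(f))
print(dict(counts))  # → {'Correct': 2, 'Incorrect': 1}
```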

## Scale and next steps

- First version: 1k–5k rows from a single CSV/Parquet file (as above).
- Larger runs: run a big batch on SuperStudio, export the results to CSV/Parquet (or a database), then point the app at that file (or add a DB backend later).
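For the eventual DB-backend path, a minimal sketch of loading a batch export into SQLite with the standard library (the table name `predictions` and the sample rows are illustrative, not part of the project):

```python
import csv
import io
import sqlite3

# Illustrative rows standing in for a real batch export file.
csv_text = "PMID,Title\n12345678,Example article\n23456789,Another article\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Load into SQLite so a future DB backend could query instead of
# re-reading a large flat file on every app start.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE predictions (PMID TEXT, Title TEXT)")
con.executemany("INSERT INTO predictions VALUES (:PMID, :Title)", rows)
count = con.execute("SELECT COUNT(*) FROM predictions").fetchone()[0]
print(count)  # → 2
```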

See `docs/CURATOR_TABLE_DESIGN.md` for the full design (scale, fields, feedback loop, APIs, and implementation plan).