feat: add Streamlit UI for consent form compliance checking#39

Open
GovindhKishore wants to merge 13 commits into ga4gh:main from GovindhKishore:feat/streamlit-ui

@GovindhKishore

Summary

Adds src/ui/app.py - a Streamlit frontend that exposes the full compliance pipeline to end users without requiring any CLI or code interaction.

Closes #38

Changes

New files

  • src/ui/__init__.py
  • src/ui/app.py - Streamlit frontend

Modified files

  • src/compliance/checker.py
    • extract_text now accepts either a file path string (CLI usage) or a BytesIO object (Streamlit usage). For BytesIO input it uses pypdf.PdfReader directly instead of PyPDFLoader, which requires a file path on disk. A filename parameter was added so the file type can be detected when a BytesIO object is passed.
    • run_compliance_check signature changed from file_path: str to consent_text: str.
    • llm parameter made optional (llm=None) to support the Gemini backend, which does not use a LangChain LLM instance.
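
For reviewers who want a picture of the dual-input behaviour described above, here is a minimal sketch. It is not the code in checker.py: the exact signature, the error handling, and the UTF-8 decoding of TXT files are assumptions, and for brevity it passes both paths and streams to pypdf.PdfReader, whereas the PR keeps PyPDFLoader for the path case.

```python
from io import BytesIO
from pathlib import Path
from typing import Optional, Union


def extract_text(source: Union[str, BytesIO], filename: Optional[str] = None) -> str:
    """Illustrative sketch: read text from a file path or an in-memory buffer."""
    if isinstance(source, BytesIO):
        # A buffer carries no name, so the caller must supply one (assumption).
        if filename is None:
            raise ValueError("filename is required when passing a BytesIO object")
        name = filename
    else:
        name = filename if filename is not None else source

    suffix = Path(name).suffix.lower()

    if suffix == ".pdf":
        # Imported lazily so that TXT-only usage does not need pypdf installed.
        from pypdf import PdfReader

        # PdfReader accepts a path or a binary stream, so no temp file is needed.
        reader = PdfReader(source)
        return "\n".join((page.extract_text() or "") for page in reader.pages)

    if suffix == ".txt":
        if isinstance(source, BytesIO):
            return source.getvalue().decode("utf-8")  # UTF-8 is an assumption
        return Path(source).read_text(encoding="utf-8")

    raise ValueError(f"Unsupported file type: {suffix!r}")
```

In a Streamlit app, the object returned by st.file_uploader is a BytesIO subclass with a name attribute, so a call along the lines of extract_text(uploaded, filename=uploaded.name) would fit this sketch.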

UI behaviour

  1. User selects LLM backend - Gemini 2.5 Flash (free, local testing) or OpenAI GPT-4 (production)
  2. User enters API key for selected backend
  3. User uploads consent form (PDF or TXT)
  4. On button click:
    • Text extracted directly from BytesIO - no temp file written to disk
    • Study type auto-detected and displayed before LLM call
    • Total check count shown (universal + conditional breakdown)
    • Full pipeline runs - retrieval + single LLM call
  5. Results rendered in two sections:
    • Tier 1 - Universal Checks (8 checks, every form)
    • Tier 2 - Conditional Checks (triggered by detected study type)
    • Each result shows ✅ / ❌, check name, reason, GA4GH citation
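
The two-tier rendering in step 5 could be sketched roughly as follows. The result fields (tier, passed, name, reason, citation) and both helper names are hypothetical, since this description does not show the structure returned by run_compliance_check; the helpers are kept free of Streamlit so they can be exercised on their own.

```python
from typing import Any, Dict, List

Result = Dict[str, Any]


def split_by_tier(results: List[Result]) -> Dict[str, List[Result]]:
    """Group results into universal (tier 1) and conditional (tier 2) checks."""
    return {
        "universal": [r for r in results if r["tier"] == 1],
        "conditional": [r for r in results if r["tier"] == 2],
    }


def format_result(result: Result) -> str:
    """Build one display line: pass/fail icon, check name, reason, citation."""
    icon = "✅" if result["passed"] else "❌"
    return f'{icon} {result["name"]}: {result["reason"]} [{result["citation"]}]'
```

Each formatted line could then be passed to a Streamlit call such as st.markdown under the matching section heading.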

Testing

Tested manually end-to-end with Gemini 2.5 Flash using a made-up pediatric rare disease consent form. 19 checks ran (8 universal + 11 conditional across clinical_genomic, pediatric, rare_disease, large_scale). Verdicts and citations were accurate when checked against the real GA4GH documents.

Notes for reviewer

  • All existing tests still pass
  • checker.py changes are backward compatible - extract_text still accepts file paths, existing CLI usage unchanged.
  • Gemini is for local testing only and is not added to requirements.txt. Production LLM remains OpenAI as specified in the project requirements.

@GovindhKishore
Author

Hi @dedyli. This PR adds optional Gemini support only for testing convenience since it provides a free API, making it easier for reviewers and contributors to run the pipeline. The production backend remains OpenAI as per project requirements, and the implementation ensures that Gemini is purely optional and does not affect the existing OpenAI-based workflow.
