Digital Phenotyping Foundation Model (DPFM)

Project Context

Samsung Medical Center x Samsung MX Health collaboration (October 2025 to June 2028, n=1,250).

Core: AI research on lifelog data — foundation model representation learning, cross-modal alignment with genomic/clinical data, clinical outcome prediction.

Scientific framework: Allostasis (predictive regulation) provides the theoretical model for WHY lifelog patterns are meaningful — watch data is a continuous readout of the body's allostatic regulation. Allostasis is the lens, not the computational target.

Project Structure

digital-phenotyping-fm/
├── literature/        # Paper study: notes, reviews, references.bib
├── src/dpfm/          # Core ML library
│   ├── data/          # Data processors (lifelog, omics, clinical)
│   ├── models/        # FM, alignment, predictor architectures
│   ├── training/      # Lightning modules (pretrain, align, finetune)
│   └── evaluation/    # Metrics
├── configs/           # Hydra YAML configs
├── data/              # Data (gitignored except schemas/)
│   └── schemas/       # Data dictionary (tracked)
├── experiments/       # Per-experiment directories (date_name/)
├── notebooks/         # Jupyter (exploration, analysis, figures)
├── reports/           # Presentations, reports, paper manuscripts
│   ├── presentations/ # PPT/Keynote
│   ├── progress/      # Research progress reports
│   └── papers/        # Paper drafts
├── scripts/           # CLI entry points
└── tests/             # Unit tests

Conventions

  • Python 3.10+, PyTorch + Lightning, Hydra configs
  • Package: pip install -e ".[dev]"
  • Tests: pytest
  • Lint: ruff check src/
  • Literature notes: literature/notes/{author}_{year}_{keyword}.md
  • Experiments: experiments/{YYYY-MM-DD}_{short_name}/
  • Data schemas in data/schemas/ — the only tracked data files

Data Modalities

  1. Lifelog — Samsung Health watch: HR, steps, sleep, stress, SpO2, calories, body composition
  2. Omics — WGS + 287 PGS, Olink Explore HT (proteomics), 16S rRNA (microbiome)
  3. Clinical — InBody, BP, CGM, blood chemistry (glucose, HbA1c, insulin, lipid panel)

Target Clinical Phenotypes

  • Primary: Hypertension, T2DM, ASCVD
  • Secondary: Dementia risk, Depression/Anxiety, Insomnia, Dyslipidemia, Obesity
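One way the primary/secondary split could be carried into code is a small registry that downstream prediction heads or evaluation scripts read from. The sketch below is illustrative only: the `Tier` enum, the `PHENOTYPES` mapping, the key spellings, and `phenotypes_in` are all hypothetical names, not identifiers from `src/dpfm/`.

```python
from enum import Enum


class Tier(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


# Keys are assumed identifiers for the phenotypes listed above.
PHENOTYPES = {
    "hypertension": Tier.PRIMARY,
    "t2dm": Tier.PRIMARY,
    "ascvd": Tier.PRIMARY,
    "dementia_risk": Tier.SECONDARY,
    "depression_anxiety": Tier.SECONDARY,
    "insomnia": Tier.SECONDARY,
    "dyslipidemia": Tier.SECONDARY,
    "obesity": Tier.SECONDARY,
}


def phenotypes_in(tier: Tier) -> list[str]:
    """Return phenotype keys for one tier, in declaration order."""
    return [name for name, t in PHENOTYPES.items() if t is tier]


if __name__ == "__main__":
    print(phenotypes_in(Tier.PRIMARY))
    print(phenotypes_in(Tier.SECONDARY))
```

Keeping the tier next to each phenotype in a single mapping means a change to the target list only needs to be made in one place.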

NotebookLM Knowledge Base

  • "Allostasis at the Core of Brain Function" — 424 sources on allostasis theory
    • ID: 1846219f-a072-4544-9721-65a6aa89904f
    • Use mcp__notebooklm__notebook_query for source-grounded queries on the allostasis framework