Commit 2bfe5e7

docs: add Hugging Face model card
1 parent a9d9041 commit 2bfe5e7

2 files changed

Lines changed: 222 additions & 0 deletions

Lines changed: 219 additions & 0 deletions
@@ -0,0 +1,219 @@
---
license: apache-2.0
language:
- en
library_name: geno-lewm
base_model:
- HuggingFaceBio/Carbon-500M
datasets:
- abdelstark/geno-lewm-data
tags:
- genomics
- bioinformatics
- variant-effect-prediction
- world-model
- carbon-500m
- research
---

# GenoLeWM model package

<p>
<a href="https://huggingface.co/spaces/abdelstark/geno-lewm"><img alt="Space" src="https://img.shields.io/badge/Space-GenoLeWM-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000000"></a>
<a href="https://huggingface.co/abdelstark/geno-lewm"><img alt="Model" src="https://img.shields.io/badge/Checkpoint-abdelstark%2Fgeno--lewm-FFD21E?style=for-the-badge&logo=huggingface&logoColor=000000"></a>
<a href="https://huggingface.co/abdelstark/geno-lewm-runs/tree/main/geno-lewm-v021-strong-4f36eef-10k-r1"><img alt="Run tree" src="https://img.shields.io/badge/Run%20Tree-v0.2.1-0B7285?style=for-the-badge&logo=huggingface&logoColor=ffffff"></a>
<a href="https://github.com/AbdelStark/GenoLeWM"><img alt="GitHub" src="https://img.shields.io/badge/GitHub-GenoLeWM-181717?style=for-the-badge&logo=github&logoColor=ffffff"></a>
</p>

GenoLeWM is an alpha research project for action-conditioned latent world
models over genomic edits. This repository contains the public v0.1 model
package: the trainable GenoLeWM predictor/action-encoder artifacts, calibration
file, training evidence, evaluation evidence, and checksums.

This is not a standard `transformers.AutoModel.from_pretrained()` package. The
checkpoint is loaded by the `geno-lewm` runtime. Carbon-500M is a frozen state
encoder dependency and is not bundled in this repository.

## Claim Boundary

Use this checkpoint as a research artifact for reproducible local scoring,
artifact inspection, and method development. Do not use it for clinical
diagnosis, clinical decision support, deployment readiness claims, privacy
claims, or broad claims that GenoLeWM outperforms Carbon. The measured results
below are narrow artifact-level evaluations.

## Published Artifacts

| Artifact | Location | Notes |
| --- | --- | --- |
| v0.1 release checkpoint | this repository | Stable public package `geno-lewm-v0.1.0-r1` |
| Generated package model card | [`model_card.md`](https://huggingface.co/abdelstark/geno-lewm/blob/main/model_card.md) | Checksum-bound output from `tools.release.model_package` |
| Training evidence | [`training_run_manifest.json`](https://huggingface.co/abdelstark/geno-lewm/blob/main/training_run_manifest.json), [`training_run_card.md`](https://huggingface.co/abdelstark/geno-lewm/blob/main/training_run_card.md), [`training_run_SHA256SUMS`](https://huggingface.co/abdelstark/geno-lewm/blob/main/training_run_SHA256SUMS) | Carbon-backed training run evidence |
| Evaluation evidence | [`eval_metrics.json`](https://huggingface.co/abdelstark/geno-lewm/blob/main/eval_metrics.json), [`eval_report.md`](https://huggingface.co/abdelstark/geno-lewm/blob/main/eval_report.md), [`eval_config.effective.yaml`](https://huggingface.co/abdelstark/geno-lewm/blob/main/eval_config.effective.yaml) | Held-out chr21 ClinVar evaluation |
| Efficiency evidence | [`efficiency_report.json`](https://huggingface.co/abdelstark/geno-lewm/blob/main/efficiency_report.json) | Release efficiency measurement |
| Integrity manifest | [`SHA256SUMS`](https://huggingface.co/abdelstark/geno-lewm/blob/main/SHA256SUMS) | Package file hashes |
| Interactive Space | [`abdelstark/geno-lewm`](https://huggingface.co/spaces/abdelstark/geno-lewm) | Artifact browser and checkpoint-backed scoring UI |
| Dataset package | [`abdelstark/geno-lewm-data`](https://huggingface.co/datasets/abdelstark/geno-lewm-data) | Public data snapshot and data card |
| v0.2.1 run tree | [`abdelstark/geno-lewm-runs`](https://huggingface.co/abdelstark/geno-lewm-runs/tree/main/geno-lewm-v021-strong-4f36eef-10k-r1) | Newer benchmark/demo checkpoint and result artifacts |

The generated `model_card.md` in this repository is intentionally terse because
it is part of the checksum-bound release package. This top-level card is the
human-facing Hugging Face model documentation.

## Model Identity

| Field | Value |
| --- | --- |
| Release id | `geno-lewm-v0.1.0-r1` |
| Model version | `0.1.0` |
| Manifest id | `sha256:861ec142cc87f3fac01751ef538553356dfba439e6da99064b4adb121e75c215` |
| Predictor artifact | `predictor.safetensors` |
| Predictor hash | `sha256:6642c604a1352727969c86664f291fd6d2193c1c65bc6f9baf9b716469c52731` |
| Action encoder hash | `sha256:8b2311d768855ab440b26dbbef5ddbda252cc8bb2c69509d28fa4bcf8eff025a` |
| Calibration hash | `sha256:d4cf4778ac8e5557d363aca43cd13723b0ed9983b83215ab164d2b642b886201` |
| Frozen encoder | Carbon-500M, mounted as `/carbon` in release jobs |
| Encoder revision | `5d31d59b3c845b288a13aedb1358934196852eec` |
| Dataset snapshot | `geno-lewm-data-v0.1.0-r1` |

The newer Space default checkpoint is separate:
`geno-lewm-v0.2.1-r1` in
[`geno-lewm-v021-strong-4f36eef-10k-r1/suite/model`](https://huggingface.co/abdelstark/geno-lewm-runs/tree/main/geno-lewm-v021-strong-4f36eef-10k-r1/suite/model).
It is published as run-tree evidence, not as a replacement for this stable v0.1
model package.

## Training Summary

The v0.1 checkpoint was trained as a JEPA-style predictor over frozen
Carbon-500M latent states.

| Field | Value |
| --- | --- |
| Run id | `first-snv-carbon-500m-r1` |
| Config | `training_config.effective.yaml` |
| Commit | `cd2bfccb33ec5a2df3c4707e8be8443f4682dad3` |
| Samples | 160,000 |
| Steps | 20,000 |
| Final training loss | 0.36124 |
| Status | completed |

## v0.1 Evaluation

The evaluation split is held-out ClinVar GRCh38 chr21 with binary labels:
pathogenic/likely pathogenic (P/LP) versus benign/likely benign (B/LB). Scores
use `sigma_raw`; intervals are deterministic stratified bootstrap confidence
intervals from `eval_metrics.json`.

| Split | N | Positives | Negatives | Metric | Value | 95% CI |
| --- | ---: | ---: | ---: | --- | ---: | --- |
| `eval_clinvar_chr21` | 3,000 | 494 | 2,506 | AUROC | 0.519160 | 0.491366 to 0.546846 |
| `eval_clinvar_chr21` | 3,000 | 494 | 2,506 | Average precision | 0.165174 | 0.155331 to 0.177035 |
| `eval_clinvar_chr21` | 3,000 | 494 | 2,506 | Balanced accuracy at 0.5 | 0.500000 | 0.500000 to 0.500000 |
| `eval_clinvar_chr21` | 3,000 | 494 | 2,506 | Accuracy at 0.5 | 0.164667 | 0.164667 to 0.164667 |

Negative finding: this v0.1 slice does not establish useful clinical
performance, non-coding performance, multi-edit behavior, or superiority over
Carbon. The AUROC confidence interval includes 0.5, so ranking on this split is
not distinguishable from chance.

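The two threshold rows in the v0.1 evaluation table can be checked with plain arithmetic from the class counts. The sketch below assumes, since the report does not say so directly, that every variant lands on the positive side of the 0.5 cutoff. Under that assumption accuracy collapses to the positive prevalence and balanced accuracy to 0.5, which is what the table reports.

```python
# Counts taken from the v0.1 evaluation table.
n_pos, n_neg = 494, 2506
n_total = n_pos + n_neg

# Assumption (not stated in the report): at the 0.5 cutoff every variant is
# predicted positive, so TP = n_pos, FP = n_neg, and TN = FN = 0.
tp, fp, tn, fn = n_pos, n_neg, 0, 0

accuracy = (tp + tn) / n_total
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (sensitivity + specificity) / 2

print(f"accuracy          = {accuracy:.6f}")
print(f"balanced accuracy = {balanced_accuracy:.6f}")
```

If that reading is right, it would also account for the zero-width intervals on both rows: a stratified bootstrap keeps the class counts fixed, so neither quantity can vary across resamples.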
## v0.1 Efficiency

Measured by `tools.release.efficiency_report` on `cuda:NVIDIA H200`.

| Measurement | Value |
| --- | ---: |
| Single-variant latency | 494.056 ms |
| Batched throughput | 2.024 variants/s |
| Peak memory | 1,152,656,384 bytes |

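For readers who prefer other units, the v0.1 efficiency figures convert as follows. This is only arithmetic on the reported values; the variable names are illustrative and nothing here is read from `efficiency_report.json`.

```python
# Values copied from the v0.1 efficiency table.
latency_ms = 494.056
throughput_per_s = 2.024
peak_memory_bytes = 1_152_656_384

# Peak memory in binary gibibytes (2**30 bytes).
peak_memory_gib = peak_memory_bytes / 2**30

# Rate implied by scoring one variant at a time at the measured latency.
implied_serial_rate = 1000.0 / latency_ms

print(f"peak memory: {peak_memory_gib:.3f} GiB")
print(f"implied serial rate: {implied_serial_rate:.3f} variants/s")
```

The implied serial rate agrees with the reported batched throughput to three decimals. The card does not state the batch size used, so whether batching contributed anything in this measurement is left open.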
## v0.2.1 Run-Tree Benchmark Evidence

The Space also exposes the newer `geno-lewm-v0.2.1-r1` checkpoint from the run
tree. Its benchmark suite is broader than v0.1 and includes Carbon zero-shot
comparisons, but the results are mixed and mostly negative relative to Carbon on
the measured slices.

| Slice | N | Metric | GenoLeWM | Baseline | Delta |
| --- | ---: | --- | ---: | ---: | ---: |
| ClinVar coding | 16 | AUROC | 0.734375 | 0.921875 | -0.187500 |
| ClinVar coding | 16 | Average precision | 0.852976 | 0.951923 | -0.098947 |
| ClinVar coding | 16 | Balanced accuracy | 0.750000 | 0.687500 | +0.062500 |
| ClinVar non-coding | 16 | AUROC | 0.562500 | 0.875000 | -0.312500 |
| ClinVar non-coding | 16 | Average precision | 0.605456 | 0.914423 | -0.308967 |
| ClinVar non-coding | 16 | Balanced accuracy | 0.437500 | 0.687500 | -0.250000 |
| BRCA2 saturation | 32 | Spearman rho | 0.149194 | 0.476906 | -0.327713 |
| TraitGym Mendelian | 32 | Spearman rho | -0.027965 | -0.083894 | +0.055929 |
| Phased-haplotype rollout | 8 | Cosine mean | 0.288861 | 0.997831 | -0.708970 |
| Synthetic edit-chain rollout | 8 | Cosine mean | 0.301608 | 0.991240 | -0.689631 |

The v0.2.1 readiness report is `ok=true` for artifact coverage and provenance.
That is not a model-quality success claim. The rollout speed report is
`ok=false`: k=5 measured a 2.41x speedup and met its 2x target, but k=20
measured 2.47x and missed its 5x target.

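The rollout speed verdict can be reproduced from the quoted numbers. The sketch below assumes a horizon passes when its measured speedup is at least its target and that the report is `ok` only if every horizon passes; that rule is a guess at the report's logic, not something this card states.

```python
# Measured speedups and targets quoted for the v0.2.1 rollout speed report.
rollout_speed = {
    5: {"measured": 2.41, "target": 2.0},
    20: {"measured": 2.47, "target": 5.0},
}

# Assumption: a horizon passes when measured >= target, and the overall
# report is ok only if every horizon passes.
passed = {k: v["measured"] >= v["target"] for k, v in rollout_speed.items()}
ok = all(passed.values())

for k, hit in passed.items():
    print(f"k={k}: {'met' if hit else 'missed'} target")
print(f"ok={ok}")
```

Under that rule the single miss at k=20 is enough to produce `ok=false`, even though k=5 cleared its target.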
The v0.2.1 efficiency report measured one sample with no warmup on
`cuda:NVIDIA H200`: 115,262.94 ms single-variant latency, 0.3095 variants/s
throughput, and 1,966,149,632 bytes peak memory. Treat that as run evidence, not
a production serving benchmark.

## Loading Artifacts

Install the package:

```bash
python -m pip install "geno-lewm[train,eval]==0.2.1"
```

Download the v0.1 model package:

```python
from huggingface_hub import snapshot_download

# Downloads the repository snapshot into the local Hugging Face cache and
# returns the path to it.
model_dir = snapshot_download("abdelstark/geno-lewm")
```

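After downloading, the package files can be checked against the `SHA256SUMS` manifest. The following is a minimal sketch that assumes the manifest uses the common `sha256sum` text layout (`<hex digest>  <relative path>` per line); the actual layout of this repository's manifest is not described here, and `verify_sha256sums` is a hypothetical helper, not part of `geno-lewm`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256sums(package_dir, manifest_name="SHA256SUMS"):
    """Return the relative paths whose digest differs from the manifest."""
    package_dir = Path(package_dir)
    mismatches = []
    for line in (package_dir / manifest_name).read_text().splitlines():
        if not line.strip():
            continue
        # Assumed line layout: "<hex digest>  <relative path>".
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # "*" marks binary mode in sha256sum
        if sha256_of(package_dir / name) != expected.lower():
            mismatches.append(name)
    return mismatches
```

Calling `verify_sha256sums(model_dir)` on the downloaded snapshot would return an empty list when every listed file matches.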
Download the v0.2.1 run-tree model artifacts:

```python
from huggingface_hub import snapshot_download

# Restrict the download to the v0.2.1 model directory inside the run tree.
# The returned path is the snapshot root; the model files sit under the
# matched subdirectory.
run_dir = snapshot_download(
    "abdelstark/geno-lewm-runs",
    allow_patterns="geno-lewm-v021-strong-4f36eef-10k-r1/suite/model/*",
)
```

For scoring, Carbon-500M must also be available. The release manifests record
the encoder as `/carbon` because training, evaluation, and demo jobs mounted
`HuggingFaceBio/Carbon-500M` there at revision
`5d31d59b3c845b288a13aedb1358934196852eec`. The Space can resolve and remap
that encoder from the Hub before scoring.

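One way to obtain the encoder locally at the recorded revision is to pin `revision` in `snapshot_download`. This is a sketch only: `download_carbon` is a hypothetical helper, and how the `geno-lewm` runtime is pointed at the resulting directory (a flag, an environment variable, or a `/carbon` mount) is not covered by this card.

```python
CARBON_REPO_ID = "HuggingFaceBio/Carbon-500M"
CARBON_REVISION = "5d31d59b3c845b288a13aedb1358934196852eec"


def download_carbon(local_dir=None):
    """Fetch Carbon-500M at the revision recorded in the release manifests."""
    # Imported lazily so the constants can be inspected without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        CARBON_REPO_ID,
        revision=CARBON_REVISION,
        local_dir=local_dir,  # None keeps the files in the default cache
    )
```

Pinning the commit hash rather than a branch name keeps the encoder weights identical to the ones used in the release jobs.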
Example single-variant invocation once the model directory and Carbon encoder
are available:

```bash
geno-lewm-score \
  --model-dir "$MODEL_DIR" \
  --backend auto \
  --variant chrSynthetic:3073:A:T \
  --window ACGTACGTACGTACGT \
  --window-start-bp 3064 \
  --receipt receipt.json
```

The `REF` allele in `--variant` must match the supplied reference window at the
variant locus. If it does not, scoring fails before model inference.

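The `REF` check can be illustrated with a small standalone function. This sketch assumes a 1-based variant position and a 0-based window start, because that is the convention under which the example's `A` at position 3073 lines up with the supplied window. The real `geno-lewm` check may use a different convention, and `ref_matches_window` is a hypothetical name.

```python
def ref_matches_window(variant, window, window_start_bp):
    """Check that the variant's REF allele matches the reference window."""
    chrom, pos, ref, alt = variant.split(":")
    # Assumed convention: 1-based variant position, 0-based window start.
    offset = int(pos) - 1 - window_start_bp
    if offset < 0 or offset + len(ref) > len(window):
        raise ValueError("variant locus falls outside the supplied window")
    return window[offset:offset + len(ref)].upper() == ref.upper()


# Values from the example invocation: offset 8 in the window holds "A".
print(ref_matches_window("chrSynthetic:3073:A:T", "ACGTACGTACGTACGT", 3064))
```

With both coordinates treated as 1-based, the offset would be 9 and the window base there is `C`, so the same example would be rejected. It is worth confirming the convention against the runtime before preparing windows in bulk.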
## Limitations

- Alpha research checkpoint; not a clinical, diagnostic, or deployment model.
- v0.1 evaluation is narrow: held-out chr21 ClinVar P/LP versus B/LB labels.
- v0.2.1 benchmark evidence is broader but mixed, with multiple negative deltas
  versus Carbon zero-shot and source-state rollout baselines.
- Carbon-500M is required at runtime and is resolved separately from this model
  package.
- Calibration is proof-scale and should be interpreted only within the reported
  artifact context.
- Fixture outputs and UI demos are not model-quality evidence.

## License

Apache-2.0.

mkdocs.yml

Lines changed: 3 additions & 0 deletions
@@ -151,6 +151,9 @@ nav:
 - FAQ: faq.md
 - Maintainers: maintainers.md
 - Implementation tracker: roadmap/IMPLEMENTATION.md
+- Release:
+- Hugging Face model card: release/huggingface-model-card.md
+- Signing keys: release/signing-keys.md
 - Community:
 - Contributing: contributing.md
 - Code of Conduct: code-of-conduct.md
