Commit 8ffd8d8

Add Evo 1.5 (#104)
* add evo 1.5
* update readme
1 parent 8683dcb commit 8ffd8d8

3 files changed

Lines changed: 22 additions & 2 deletions

README.md

Lines changed: 17 additions & 0 deletions
@@ -8,9 +8,12 @@ Evo has 7 billion parameters and is trained on [OpenGenome](https://huggingface.
 
 We describe Evo in the paper [“Sequence modeling and design from molecular to genome scale with Evo”](https://www.science.org/doi/10.1126/science.ado9336).
 
+We describe Evo 1.5 in the paper [“Semantic mining of functional _de novo_ genes from a genomic language model”](https://www.biorxiv.org/content/10.1101/2024.12.17.628962). We used the Evo 1.5 model to generate [SynGenome](https://evodesign.org/syngenome/), the first AI-generated genomics database containing over 100 billion base pairs of synthetic DNA sequences.
+
 We provide the following model checkpoints:
 | Checkpoint Name | Description |
 |----------------------------------------|-------------|
+| `evo-1.5-8k-base` | A model pretrained with 8,192 context obtained by extending the pretraining of `evo-1-8k-base` to process 50% more training data. |
 | `evo-1-8k-base` | A model pretrained with 8,192 context. We use this model as the base model for molecular-scale finetuning tasks. |
 | `evo-1-131k-base` | A model pretrained with 131,072 context using `evo-1-8k-base` as the base model. We use this model to reason about and generate sequences at the genome scale. |
 | `evo-1-8k-crispr` | A model finetuned using `evo-1-8k-base` as the base model to generate CRISPR-Cas systems. |
@@ -194,3 +197,17 @@ Please cite the following publication when referencing Evo.
 URL = {https://www.science.org/doi/abs/10.1126/science.ado9336},
 }
 ```
+
+Please cite the following publication when referencing Evo 1.5.
+
+```
+@article {merchant2024semantic,
+author = {Merchant, Aditi T and King, Samuel H and Nguyen, Eric and Hie, Brian L},
+title = {Semantic mining of functional de novo genes from a genomic language model},
+year = {2024},
+doi = {10.1101/2024.12.17.628962},
+publisher = {Cold Spring Harbor Laboratory},
+URL = {https://www.biorxiv.org/content/early/2024/12/18/2024.12.17.628962},
+journal = {bioRxiv}
+}
+```

evo/models.py

Lines changed: 4 additions & 1 deletion
@@ -9,6 +9,7 @@
 
 
 MODEL_NAMES = [
+    'evo-1.5-8k-base',
     'evo-1-8k-base',
     'evo-1-131k-base',
     'evo-1-8k-crispr',
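
The constructor signature shown in the hunk headers below takes its default by position (`MODEL_NAMES[1]`), so prepending an entry changes which name that index resolves to. A minimal sketch of the effect, assuming the list order shown in this hunk; `before` and `after` are illustrative names, and the real list continues past the entries visible here:

```python
# Entries visible in the hunk; the real list has more entries after these.
before = ['evo-1-8k-base', 'evo-1-131k-base', 'evo-1-8k-crispr']

# The commit prepends the new checkpoint name.
after = ['evo-1.5-8k-base'] + before

# The constructor default is written as MODEL_NAMES[1] in the hunk headers.
default_before = before[1]
default_after = after[1]

print(default_before)  # evo-1-131k-base
print(default_after)   # evo-1-8k-base
```

Callers that pass `model_name` explicitly are unaffected; only code relying on the positional default would see a different checkpoint.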
@@ -35,7 +36,8 @@ def __init__(self, model_name: str = MODEL_NAMES[1], device: str = None):
 
         if model_name == 'evo-1-8k-base' or \
            model_name == 'evo-1-8k-crispr' or \
-           model_name == 'evo-1-8k-transposon':
+           model_name == 'evo-1-8k-transposon' or \
+           model_name == 'evo-1.5-8k-base':
             config_path = 'configs/evo-1-8k-base_inference.yml'
         elif model_name == 'evo-1-131k-base':
             config_path = 'configs/evo-1-131k-base_inference.yml'
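
The condition in this hunk routes every 8k-context checkpoint, now including `evo-1.5-8k-base`, to the same inference config. A minimal sketch of that routing as a set-membership test; `EIGHT_K_MODELS` and `resolve_config_path` are hypothetical names rather than identifiers from `evo/models.py`, and the error branch is an assumption because the hunk does not show what follows the `elif`:

```python
# Hypothetical helper mirroring the if/elif chain in the hunk above.
EIGHT_K_MODELS = {
    'evo-1-8k-base',
    'evo-1-8k-crispr',
    'evo-1-8k-transposon',
    'evo-1.5-8k-base',  # added by this commit
}


def resolve_config_path(model_name: str) -> str:
    """Return the inference config path for a checkpoint name."""
    if model_name in EIGHT_K_MODELS:
        return 'configs/evo-1-8k-base_inference.yml'
    if model_name == 'evo-1-131k-base':
        return 'configs/evo-1-131k-base_inference.yml'
    # Assumption: the diff does not show how unknown names are handled.
    raise ValueError(f'unknown model name: {model_name!r}')


print(resolve_config_path('evo-1.5-8k-base'))
```

Reusing the `evo-1-8k-base` config is consistent with the README description of Evo 1.5 as an extension of that checkpoint's pretraining.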
@@ -59,6 +61,7 @@ def __init__(self, model_name: str = MODEL_NAMES[1], device: str = None):
 
 
 HF_MODEL_NAME_MAP = {
+    'evo-1.5-8k-base': 'evo-design/evo-1.5-8k-base',
     'evo-1-8k-base': 'togethercomputer/evo-1-8k-base',
     'evo-1-131k-base': 'togethercomputer/evo-1-131k-base',
     'evo-1-8k-crispr': 'LongSafari/evo-1-8k-crispr',
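
The map in this hunk ties each short checkpoint name to a Hugging Face repository ID, and the new checkpoint sits under a different organization than the original base models. A sketch of the lookup using only the entries visible here; `hf_repo_for` is a hypothetical helper, and how `evo/models.py` actually consumes the map is not shown in the diff:

```python
# Entries visible in the hunk; the real map has more entries after these.
HF_MODEL_NAME_MAP = {
    'evo-1.5-8k-base': 'evo-design/evo-1.5-8k-base',
    'evo-1-8k-base': 'togethercomputer/evo-1-8k-base',
    'evo-1-131k-base': 'togethercomputer/evo-1-131k-base',
    'evo-1-8k-crispr': 'LongSafari/evo-1-8k-crispr',
}


def hf_repo_for(model_name: str) -> str:
    """Hypothetical helper: map a short checkpoint name to its repo ID."""
    try:
        return HF_MODEL_NAME_MAP[model_name]
    except KeyError:
        raise ValueError(f'unknown model name: {model_name!r}') from None


# Split the repo ID into organization and repository name.
org, _, repo = hf_repo_for('evo-1.5-8k-base').partition('/')
print(org, repo)  # evo-design evo-1.5-8k-base
```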

evo/version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-version = '0.3'
+version = '0.4'
