This repository contains research code for **shortest-path flow matching** with **descriptor-conditioned mixture bases** for descriptor-controlled generation.
Instead of relying on a single Gaussian base distribution, the method learns a **condition-dependent mixture base** jointly with a **descriptor-conditioned flow field**, trained via shortest-path (optimal transport) flow matching. Conditioning the base enables the model to adapt its starting distribution across conditions, improving **out-of-distribution (OOD) generalization** to unseen conditions.
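
In code, the training objective reduces to a straight-path regression. Below is a minimal sketch of one training step, assuming PyTorch; `velocity_net` and `mixture_base` are hypothetical stand-ins for this repository's actual modules, not its real API:

```python
import torch

def training_step(velocity_net, mixture_base, x1, descriptor):
    """One shortest-path flow-matching step (illustrative sketch).

    `velocity_net` and `mixture_base` are hypothetical stand-ins:
    the base samples from a descriptor-conditioned Gaussian mixture
    rather than a single N(0, I).
    """
    # Draw starting points from the condition-dependent mixture base.
    x0 = mixture_base.sample(descriptor)              # (batch, dim)
    t = torch.rand(x1.shape[0], 1, device=x1.device)  # times in [0, 1]

    # Straight-line (shortest-path) interpolant between base and data.
    xt = (1 - t) * x0 + t * x1
    target = x1 - x0  # constant velocity along the straight path

    # Regress the descriptor-conditioned velocity field onto the target.
    pred = velocity_net(xt, t, descriptor)
    return torch.mean((pred - target) ** 2)
```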
## Publication
This repository accompanies the arXiv manuscript:
**Title:** *Shortest-Path Flow Matching with Mixture-Conditioned Bases for OOD Generalization to Unseen Conditions*
We construct a synthetic benchmark of letter populations, where each condition corresponds to a letter and a specific rotation. Each descriptor encodes the letter identity and rotation, and the model learns a mixture base distribution per condition. This setup allows us to test extrapolation to unseen letters and rotation angles.
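
As a concrete (hypothetical) illustration, such a descriptor could concatenate a one-hot letter identity with a sin/cos embedding of the rotation angle; the sin/cos pair keeps rotation continuous, so unseen angles fall between training conditions:

```python
import math
import torch

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def letter_descriptor(letter: str, angle_deg: float) -> torch.Tensor:
    """Hypothetical descriptor: one-hot letter identity + sin/cos rotation."""
    one_hot = torch.zeros(len(LETTERS))
    one_hot[LETTERS.index(letter.upper())] = 1.0
    theta = math.radians(angle_deg)
    rotation = torch.tensor([math.sin(theta), math.cos(theta)])
    return torch.cat([one_hot, rotation])  # shape: (28,)

# e.g. the condition "letter R rotated by 45 degrees":
d = letter_descriptor("R", 45.0)
```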
### Morphological Perturbations
We evaluate on high-content imaging data in feature space. Cells (from BBBC021 and RxRx1) are embedded with a vision backbone, and the model is trained to generate unseen phenotypic responses from compound descriptors alone.
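
A feature-space pipeline of this kind might look like the sketch below. The ResNet-50 backbone is an assumption chosen for illustration, not necessarily the encoder used in these experiments; any pretrained vision model producing per-cell feature vectors fits the same pattern:

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

# Hypothetical feature extractor: a pretrained ResNet-50 with the
# classification head removed, so each cell crop maps to a 2048-d vector.
weights = ResNet50_Weights.DEFAULT
backbone = resnet50(weights=weights)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = weights.transforms()

@torch.no_grad()
def embed_cells(images):
    """Map a batch of cell-image crops (PIL images) to feature vectors."""
    batch = torch.stack([preprocess(img) for img in images])
    return backbone(batch)  # (batch, 2048)
```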
### Perturbation Datasets
For transcriptomic perturbations, we use chemical- or CRISPR-based single-cell datasets (Norman, ComboSciPlex, Replogle, and iAstrocytes). Conditions correspond to perturbation embeddings from pretrained models, and the model learns the distribution of perturbed cells.
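
At generation time, sampling for a new condition means drawing from that condition's mixture base and integrating the learned velocity field from t = 0 to t = 1. A minimal Euler-integration sketch, reusing the hypothetical interfaces from the training sketch above:

```python
import torch

@torch.no_grad()
def generate(velocity_net, mixture_base, descriptor, n_steps: int = 100):
    """Sample generated cells for one condition (illustrative sketch)."""
    # Start from the descriptor-conditioned mixture base, not N(0, I).
    x = mixture_base.sample(descriptor)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0], 1), i * dt, device=x.device)
        x = x + dt * velocity_net(x, t, descriptor)  # Euler step
    return x  # generated feature/expression vectors at t = 1
```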
## Documentation
Check the <a href="./docs/index.md">documentation</a> for more information about how to use the model and get the data.
## License
This work is released under the MIT license; please see <a href="./LICENSE">the license file</a> for more information.