Commit 093c451

Merge

Author: Sebastian Birk
Parents: d2db7e4 + 5fcbb9c

5 files changed

Lines changed: 263 additions & 34 deletions

README.md

Lines changed: 23 additions & 23 deletions

    @@ -1,13 +1,13 @@
    -# inflow
    +# MintFlow

     [![Tests][badge-tests]][link-tests]
     [![Documentation][badge-docs]][link-docs]

    -[badge-tests]: https://img.shields.io/github/actions/workflow/status/sebastianbirk/inflow/test.yaml?branch=main
    -[link-tests]: https://github.com/sebastianbirk/inflow/actions/workflows/test.yml
    -[badge-docs]: https://img.shields.io/readthedocs/inflow
    +[badge-tests]: https://img.shields.io/github/actions/workflow/status/sebastianbirk/mintflow/test.yaml?branch=main
    +[link-tests]: https://github.com/sebastianbirk/mintflow/actions/workflows/test.yml
    +[badge-docs]: https://img.shields.io/readthedocs/mintflow

    -Cellular decomposition of intrinsic and neighborhood-induced omic effects
    +Microenvironment-induced and INtrinsic Transcriptomic FLOWs

     ## Installing the Python Environment
     **SANGER INTERNAL**: The environment is already available on farm.

    @@ -20,8 +20,8 @@ conda activate /nfs/team361/aa36/PythonEnvs_2/envinflowdec27/

     Alternatively, you can create the python environment yourself:
     ```commandline
    -git clone https://github.com/Lotfollahi-lab/inflow.git # clone the repo
    -cd ./inflow/
    +git clone https://github.com/Lotfollahi-lab/mintflow.git # clone the repo
    +cd ./mintflow/
     conda env create -f environment.yml --prefix SOME_EMPTY_PATH
     ```


    @@ -30,24 +30,24 @@ It's highly recommended to setup wandb before proceeding.

     To do so:
     - Go to https://wandb.ai/ and create an account.
    -- Create a project called "inFlow".
    +- Create a project called "MintFlow".

     ## Quick Start
    -You can use inflow as a local package, because it's not pip installable at the moment.
    +You can use mintflow as a local package, because it's not pip installable at the moment.

     To do so:
     ```commandline
    -git clone https://github.com/Lotfollahi-lab/inflow.git # clone the repo
    -cd ./inflow/
    +git clone https://github.com/Lotfollahi-lab/mintflow.git # clone the repo
    +cd ./mintflow/
     ```
    -The easiest way to run inflow is through the command line interface (CLI).
    +The easiest way to run MintFlow is through the command line interface (CLI).
     This involves two steps
     1. Creating four config files (you duplicate/modify template config files).
    -2. Running inflow with a single command line.
    +2. Running mintflow with a single command line.

     ### Rule of thumbs §1 for modifying the config files
     In the template config files, there are `TODO`-s of different types that you may need to modify
    -- Category 1: `TODO:ESSENTIAL:TUNE`: the basic/essential parts to run inflow.
    +- Category 1: `TODO:ESSENTIAL:TUNE`: the basic/essential parts to run mintflow.
     - Category 2: `TODO:TUNE`: less essneitial and/or technical details.
     - Category 3: `TODO:check`: parameters of even less importance compared to category 1 and category 2.
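
The README hunk above names three `TODO` categories used in the template config files. As a minimal sketch of how one might list the markers still present in a copied config, the helper below scans the file's text and groups line numbers by category. The function name `find_todo_markers` and the sample keys `path_h5ad` and `batch_size` are hypothetical and not taken from the repository; only the three marker strings come from the README.

```python
import re
from collections import defaultdict

# Marker strings exactly as spelled in the README (case-sensitive).
MARKERS = ("TODO:ESSENTIAL:TUNE", "TODO:TUNE", "TODO:check")
_PATTERN = re.compile("|".join(re.escape(marker) for marker in MARKERS))


def find_todo_markers(config_text):
    """Return {marker: [1-based line numbers]} for every marker found in the text."""
    found = defaultdict(list)
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            found[match.group(0)].append(lineno)
    return dict(found)


if __name__ == "__main__":
    # Hypothetical config fragment; only the last two keys appear in this commit.
    sample = (
        'path_h5ad: "/abs/path/a.h5ad"  # TODO:ESSENTIAL:TUNE\n'
        "batch_size: 32  # TODO:TUNE\n"
        'flag_finaleval_createanndata_alltissuescombined: "True" # TODO:check\n'
        'method_ODE_solver: "dopri5"\n'
    )
    for marker, lines in find_todo_markers(sample).items():
        print(marker, lines)
```

Reading the file with `pathlib.Path(...).read_text()` and passing the result to the helper would give a quick checklist, starting with the `TODO:ESSENTIAL:TUNE` entries.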


    @@ -58,7 +58,7 @@ If you are, for example, a biologist with no interest/experience in computationa
     Please follow these steps
     - Training data config file:
     - Make a copy of `./cli/SampleConfigFiles/config_data_train.yml` and rename it to `YOUR_CONFIG_DATA_TRAIN.yml`
    -- Read the block of comments tarting with *"# Inflow expects a list of .h5ad files stored on disk, ..."*.
    +- Read the block of comments tarting with *"# MintFlow expects a list of .h5ad files stored on disk, ..."*.
     - Modify some parts marked by `TODO:...` and according to *"Rule of thumbs §1"* explained above.



    @@ -76,29 +76,29 @@ Please follow these steps
     - Make a copy of `./cli/SampleConfigFiles/config_training.yml` and rename it to `YOUR_CONFIG_TRAINING.yml`.
     - Modify some parts marked by `TODO:...` and according to *"Rule of thumbs §1"* explained above.

    -### Step 2 of Using the CLI: Running inflow
    +### Step 2 of Using the CLI: Running MintFlow

     ```commandline
    -cd ./inflow/ # if you haven't already done it above.
    +cd ./mintflow/ # if you haven't already done it above.
     cd ./cli/

    -python inflow_cli.py \
    +python mintflow_cli.py \
     --file_config_data_train YOUR_CONFIG_DATA_TRAIN.yml \
     --file_config_data_test YOUR_CONFIG_DATA_TEST.yml \
     --file_config_model YOUR_CONFIG_MODEL.yml \
     --file_config_training YOUR_CONFIG_TRAINING.yml \
     --path_output "./Your/Output/Path/ToDump/Results/" \
     --flag_verbose "True" \
     ```
    -The recommended way of accessing inflow predictions is by `adata_inflowOutput_norm.h5ad` and `adata_inflowOutput_unnorm.h5ad` created in the provided `--path_output`and `adata.obsm` and `adata.uns` in these files.
    -In the former file `..._norm.h5ad` the readcount matrix `adata.X` as well as inflow predictions Xint and Xspl are row normalised, while in the latter file `_unnorm.h5ad` they are not.
    +The recommended way of accessing MintFlow predictions is by `adata_mintflowOutput_norm.h5ad` and `adata_mintflowOutput_unnorm.h5ad` created in the provided `--path_output`and `adata.obsm` and `adata.uns` in these files.
    +In the former file `..._norm.h5ad` the readcount matrix `adata.X` as well as MintFlow predictions Xint and Xspl are row normalised, while in the latter file `_unnorm.h5ad` they are not.

    -Inflow dumps a README file in the provided `--path_output`, as well as each subfolder therein.
    +MintFlow dumps a README file in the provided `--path_output`, as well as each subfolder therein.

     ## Common Issues
    -- Use absolute paths (and not relative paths like `../../some/path/`) in the config files, as well as when running `python inflow_cli.py ...`.
    +- Use absolute paths (and not relative paths like `../../some/path/`) in the config files, as well as when running `python mintflow_cli.py ...`.
     - TODO: intro to the script for tune window width.
    -- It's common to face out of memory issue in the very last step where the big anndata objects `adata_inflowOutput_norm.h5ad` and `adata_inflowOutput_unnorm.h5ad` are created and dumped.
    +- It's common to face out of memory issue in the very last step where the big anndata objects `adata_mintflowOutput_norm.h5ad` and `adata_mintflowOutput_unnorm.h5ad` are created and dumped.
     If that step fails, the results are still accesible in the output path the subfolder `CheckpointAndPredictions/`.
     One can laod the `.pt` files by
     ```python

cli/SampleConfigFiles/config_training.yml

Lines changed: 4 additions & 0 deletions

    @@ -69,3 +69,7 @@ flag_finaleval_createanndata_alltissuescombined: "True" # TODO:check
     # - adata_inflowOutput_unnorm: the anndata "before" applying `sc.pp.normalize_total` where `adata.X` is not row normalised --> inflow predictions `Xint` and `Xspl` sum up to the unnormalised version of `adata.X`.
     # - adata_inflowOutput_norm: the anndata "after" applying `sc.pp.normalize_total` where `adata.X` is row normalised --> inflow predictions `Xint` and `Xspl` sum up to the normalised version of `adata.X`.

    +
    +method_ODE_solver: "dopri5"
    +# The ODE solver, i.e. the `method` argument passed to the function `torchdiffeq.odeint`.
    +# TODO: report the effect on running time.
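
The new `method_ODE_solver` key is read in the CLI as `config_training['method_ODE_solver']`, so a training config written before this commit would raise a `KeyError`. The sketch below shows one possible way to read the key with a fallback and an early sanity check. It is not code from the repository: `get_ode_solver`, `KNOWN_ODE_SOLVERS`, and the fallback behaviour are assumptions, and the solver list is my understanding of the names `torchdiffeq.odeint` accepts, which may not be exhaustive for every installed version.

```python
# Solver names believed to be accepted by torchdiffeq.odeint(method=...).
# Assumption: check this set against the torchdiffeq version actually installed.
KNOWN_ODE_SOLVERS = {
    # adaptive-step solvers
    "dopri8", "dopri5", "bosh3", "fehlberg2", "adaptive_heun",
    # fixed-step solvers
    "euler", "midpoint", "rk4", "explicit_adams", "implicit_adams",
}

# The value used in the sample config added by this commit.
DEFAULT_ODE_SOLVER = "dopri5"


def get_ode_solver(config_training):
    """Return the configured solver name, defaulting when the key is absent.

    Raises ValueError for names outside KNOWN_ODE_SOLVERS so that a typo is
    reported before training starts rather than at the first odeint call.
    """
    method = config_training.get("method_ODE_solver", DEFAULT_ODE_SOLVER)
    if method not in KNOWN_ODE_SOLVERS:
        raise ValueError(
            f"Unknown method_ODE_solver {method!r}; expected one of {sorted(KNOWN_ODE_SOLVERS)}"
        )
    return method


if __name__ == "__main__":
    print(get_ode_solver({"method_ODE_solver": "rk4"}))  # explicit choice
    print(get_ode_solver({}))                            # older config without the key
```

Whether a silent default is preferable to a hard failure is a design choice; failing loudly keeps old configs from running with a solver the user never selected.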

cli/inflow_cli.py

Lines changed: 3 additions & 2 deletions

    @@ -594,7 +594,8 @@ def _convert_TrueFalse_to_bool(dict_input):
         'coef_zinb_spl_loglik': 1.0,
         'dict_config_batchtoken': {
             'flag_enable_batchtoken_flowmodule': config_model['flag_enable_batchtoken_flowmodule']
    -    }
    +    },
    +    'method_ODE_solver':config_training['method_ODE_solver']
     }

     # create a list of `AdjMatPredLoss`-s ====

    @@ -1450,7 +1451,7 @@ def _convert_TrueFalse_to_bool(dict_input):
     if issparse(vects_sl['muxspl']):
         vects_sl['muxspl'] = vects_sl['muxspl'].toarray() # TODO:implement visualizations directly for sparse Xspl.

    -list_predXspl.append(vects_sl['muxspl'])
    +list_predXspl.append(vects_sl['muxspl_before_sc_pp_normalize_total'])

     del vects_sl
     gc.collect()
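
The README in this commit advises absolute paths when running `python mintflow_cli.py ...`. A minimal sketch of building that command line with every path resolved is given below. The flag names are the ones shown in the README; the helper `build_cli_command`, its parameter names, and the default script name are assumptions for illustration and are not part of the repository.

```python
import sys
from pathlib import Path


def build_cli_command(config_data_train, config_data_test, config_model,
                      config_training, path_output, verbose=True,
                      script="mintflow_cli.py"):
    """Build the argument list for the CLI with all paths made absolute."""

    def absolute(path):
        # resolve() turns a relative path into an absolute one based on the
        # current working directory; the path does not need to exist yet.
        return str(Path(path).expanduser().resolve())

    return [
        sys.executable, script,
        "--file_config_data_train", absolute(config_data_train),
        "--file_config_data_test", absolute(config_data_test),
        "--file_config_model", absolute(config_model),
        "--file_config_training", absolute(config_training),
        "--path_output", absolute(path_output),
        # The README passes this flag as the string "True".
        "--flag_verbose", "True" if verbose else "False",
    ]


if __name__ == "__main__":
    command = build_cli_command(
        "YOUR_CONFIG_DATA_TRAIN.yml",
        "YOUR_CONFIG_DATA_TEST.yml",
        "YOUR_CONFIG_MODEL.yml",
        "YOUR_CONFIG_TRAINING.yml",
        "./Your/Output/Path/ToDump/Results/",
    )
    print(" ".join(command))
```

The resulting list could be passed to `subprocess.run(command, cwd=..., check=True)` with `cwd` set to the repository's `cli/` folder, since the README runs the script from there.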

src/inflow/cli/analresults/disentanglement_violinplot.py

Lines changed: 18 additions & 5 deletions

    @@ -11,6 +11,7 @@
     import seaborn as sns
     import pandas as pd
     from tqdm.autonotebook import tqdm
    +from scipy.sparse import issparse


     def func_eqeq(a, b):

    @@ -50,25 +51,37 @@ def vis(
         ]
         list_geneindex_inLR.sort()

    +    np_X = adata_unnorm.X
    +    if issparse(np_X):
    +        np_X = np_X.toarray()
    +
         for cnt_vertical_slice in tqdm(range(min_cnt_vertical_slice, max_cnt_vertical_slice), desc="Creating violin plots for tissue {}".format(idx_slplus1)):

             for nameop, op_eqorbiggerthaneq, func_operator in zip(['eq', 'biggerthaneq'], ['==', '>='], [func_eqeq, func_biggerthaneq]):

    -            mask_inLR = func_operator(adata_unnorm.X.toarray()[:, list_geneindex_inLR], cnt_vertical_slice)
    +            mask_inLR = func_operator(np_X[:, list_geneindex_inLR], cnt_vertical_slice)

    -            mask_notinLR = func_operator(adata_unnorm.X.toarray()[:, list(set(range(adata_unnorm.shape[1])) - set(list_geneindex_inLR))], cnt_vertical_slice)
    +            mask_notinLR = func_operator(np_X[:, list(set(range(adata_unnorm.shape[1])) - set(list_geneindex_inLR))], cnt_vertical_slice)

    -            mask_all = func_operator(adata_unnorm.X.toarray(), cnt_vertical_slice)
    +            mask_all = func_operator(np_X, cnt_vertical_slice)


                 slice_pred_inLR = pred_Xspl_rownormcorrected[:, list_geneindex_inLR][mask_inLR].flatten()
                 slice_pred_notinLR = pred_Xspl_rownormcorrected[:, list(set(range(adata_unnorm.shape[1])) - set(list_geneindex_inLR))][mask_notinLR].flatten()

    +            # make the denumerators `denum_notinLRDB` and `denum_inLRDB`
    +            if op_eqorbiggerthaneq == '==':
    +                denum_notinLRDB = cnt_vertical_slice
    +                denum_inLRDB = cnt_vertical_slice
    +            else:
    +                denum_notinLRDB = np_X[:, list(set(range(adata_unnorm.shape[1])) - set(list_geneindex_inLR))][mask_notinLR].flatten()
    +                denum_inLRDB = np_X[:, list_geneindex_inLR][mask_inLR].flatten()
    +
                 plt.figure()
                 sns.violinplot(
                     data={
    -                    'not in LR-DB': slice_pred_notinLR / ((cnt_vertical_slice + 0.0) if(op_eqorbiggerthaneq == '==') else adata_unnorm.X.toarray()[:, list(set(range(adata_unnorm.shape[1])) - set(list_geneindex_inLR))][mask_notinLR].flatten()),
    -                    'in LR-DB': slice_pred_inLR / ((cnt_vertical_slice + 0.0) if(op_eqorbiggerthaneq == '==') else adata_unnorm.X.toarray()[:, list_geneindex_inLR][mask_inLR].flatten()),
    +                    'not in LR-DB': slice_pred_notinLR / denum_notinLRDB,
    +                    'in LR-DB': slice_pred_inLR / denum_inLRDB,
                     },
                     cut=0
                 )
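
The hunk above divides each selected `Xspl` prediction by a denominator that depends on the comparison: for `==` every selected count equals `cnt_vertical_slice`, so that scalar is used, while for `>=` each entry is divided by its own observed count. The dependency-free sketch below illustrates that logic on plain lists for a single row. The function `ratios_pred_to_count` and the choice to skip zero denominators are assumptions of this sketch; the repository code works on 2-D arrays with boolean masks and has no such guard.

```python
import operator

_COMPARATORS = {"==": operator.eq, ">=": operator.ge}


def ratios_pred_to_count(pred_row, count_row, threshold, op):
    """Return pred / denominator for entries whose count satisfies `count op threshold`.

    op "==": the denominator is `threshold` (all selected counts equal it).
    op ">=": the denominator is each selected entry's own count.
    """
    compare = _COMPARATORS[op]
    ratios = []
    for pred, count in zip(pred_row, count_row):
        if not compare(count, threshold):
            continue
        denominator = threshold if op == "==" else count
        if denominator == 0:
            # Assumption of this sketch: skip zero counts instead of dividing by zero.
            continue
        ratios.append(pred / denominator)
    return ratios


if __name__ == "__main__":
    pred = [0.5, 1.0, 3.0, 0.0]   # hypothetical Xspl predictions for four genes
    counts = [1, 2, 4, 0]         # hypothetical observed counts for the same genes
    print(ratios_pred_to_count(pred, counts, 2, "=="))
    print(ratios_pred_to_count(pred, counts, 2, ">="))
```

Since the config comments state that `Xint` and `Xspl` sum to `adata.X`, each ratio can be read as the share of an observed count attributed to `Xspl`, under that stated property.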
