-
Notifications
You must be signed in to change notification settings - Fork 163
Description
Hello there. It has been more than a month now that I have been trying to run scvelo without positive outcome. The script that I use (after numerous trials) is the following:
import dask.array as da
import scanpy as sc
import pandas as pd
import matplotlib.pyplot as plt
import scvelo as scv
spliced = sc.read_mtx("/data/ppapadopoulos/scvelo/spliced_reads/spliced.mtx")
spliced.var_names = pd.read_csv("/data/ppapadopoulos/scvelo/spliced_reads/genes.txt", header=None)[0]
spliced.obs_names = pd.read_csv("/data/ppapadopoulos/scvelo/spliced_reads/barcodes.txt", header=None)[0]
unspliced = sc.read_mtx("/data/ppapadopoulos/scvelo/unspliced_reads/unspliced.mtx")
unspliced.var_names = pd.read_csv("/data/ppapadopoulos/scvelo/unspliced_reads/genes.txt", header=None)[0]
unspliced.obs_names = pd.read_csv("/data/ppapadopoulos/scvelo/unspliced_reads/barcodes.txt", header=None)[0]
common_genes = spliced.var_names.intersection(unspliced.var_names)
common_obs = spliced.obs_names.intersection(unspliced.obs_names)
spliced = spliced[common_obs, :][:, common_genes]
unspliced = unspliced[common_obs, :][:, common_genes]
adata = sc.AnnData(X=spliced.X, var=spliced.var, obs=spliced.obs)
adata.layers['unspliced'] = unspliced.X
adata.layers['spliced'] = spliced.X
adata = adata[::100, :]
scv.pp.filter_and_normalize(adata, min_counts=1, min_counts_u=1, min_cells=1, log=True)
sc.pp.neighbors(adata, n_neighbors=10, n_pcs=20)
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
scv.tl.recover_dynamics(adata, n_jobs=8)
I have access to a pc with 132GB RAM and 32 threads, although I try to find a memory efficient way of executing the process (several memory related errors). My main issues are obviously related to the last 3 commands. When I remove the duplicates from my samples, I finally get velocity vectors that are equal to zero. If I don't remove them, I get a corrupted neighbor graph that messes up the rest of the process. Even when everything seems to work, the scv.tl.recovery_dynamics fails after trying to be executed overnight.. If you have any advice, I would really appreciate it...