error in run_ulm() #171

Open
@rebeccaorourke-cu

Describe the bug
I get an error when running the following, which mirrors the transcription factor activity inference step in the Pseudo-bulk functional analysis vignette:

Mes1_tf_acts, Mes1_tf_pvals = dc.run_ulm(mat=Mes1_mat, net=collectri)
Mes1_tf_acts

TypeError Traceback (most recent call last)
Cell In[80], line 2
1 # Infer pathway activities with ulm
----> 2 Mes1_tf_acts, Mes1_tf_pvals = dc.run_ulm(mat=Mes1_mat, net=collectri)
3 Mes1_tf_acts

File ~/Documents/Projects/Bates/py_decoupler/.venv/lib/python3.9/site-packages/decoupler/method_ulm.py:109, in run_ulm(mat, net, source, target, weight, batch_size, min_n, verbose, use_raw)
107 net = rename_net(net, source=source, target=target, weight=weight)
108 net = filt_min_n(c, net, min_n=min_n)
--> 109 sources, targets, net = get_net_mat(net)
111 # Match arrays
112 net = match(c, targets, net)

File ~/Documents/Projects/Bates/py_decoupler/.venv/lib/python3.9/site-packages/decoupler/pre.py:258, in get_net_mat(net)
255 targets = X.index.values
256 X = X.values
--> 258 return sources.astype('U'), targets.astype('U'), X.astype(np.float32)

TypeError: float() argument must be a string or a number, not 'NAType'

To Reproduce
The error actually seems to stem from get_net_mat(), which can be run without my count matrix:

collectri = dc.get_collectri(organism='mouse', split_complexes=False)
collectri

source  target  weight  pmid
Myc     Tert    1       10022128;10491298;10606235;10637317;10723141;1...
AP1     Jun     1       10022869;10037172;10208431;10366004;11281649;1...
AP1     Jun     1       10022869;10037172;10208431;10366004;11281649;1...
AP1     Jun     1       10022869;10037172;10208431;10366004;11281649;1...
AP1     Jun     1       10022869;10037172;10208431;10366004;11281649;1...
...     ...     ...     ...
Runx1   Lcp2    1
Runx1   Prr5l   1
Twist1  Gli1    1
Usf1    Nup188  1       22951020
Zfp148  Rnls    1       25295465

58549 rows × 4 columns

Because collectri has duplicated source/target rows, I removed them, along with rows containing missing values and the pmid column:

collectri = collectri.drop_duplicates(subset=['source','target'], keep = False)
collectri = collectri.dropna()
collectri = collectri.drop('pmid', axis=1)
collectri = collectri.reset_index(drop=True)
collectri

source  target     weight
Myc     Tert       1
Smad3   Jun        1
Smad4   Jun        1
Stat5a  Il2        1
Stat5b  Il2        1
...     ...        ...
Gata2   Psd4       1
Gata2   Tnfaip8l1  1
Max     Serf2      1
Usf1    Nup188     1
Zfp148  Rnls       1

33717 rows × 3 columns
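
A note on the deduplication step above: in pandas, `drop_duplicates(..., keep=False)` discards every row that has a duplicate, including the first occurrence, whereas `keep='first'` retains one copy of each pair. The sketch below uses a small made-up network (not CollecTRI) to show the difference; whether dropping all copies was intended here is not something this example can settle.

```python
import pandas as pd

# Toy network, made up for illustration: the AP1/Jun pair appears twice.
toy = pd.DataFrame({
    "source": ["AP1", "AP1", "Myc"],
    "target": ["Jun", "Jun", "Tert"],
    "weight": [1, 1, 1],
})

# keep=False removes every row whose (source, target) pair is duplicated.
dropped_all = toy.drop_duplicates(subset=["source", "target"], keep=False)

# keep="first" keeps one row per (source, target) pair.
kept_first = toy.drop_duplicates(subset=["source", "target"], keep="first")

print(len(dropped_all))  # 1: only Myc/Tert survives
print(len(kept_first))   # 2: AP1/Jun and Myc/Tert
```

With `keep=False`, a pair such as AP1/Jun would disappear from the network entirely rather than being collapsed to a single row.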

dc.get_net_mat(collectri)

TypeError Traceback (most recent call last)
Cell In[102], line 1
----> 1 dc.get_net_mat(collectri)

File ~/Documents/Projects/Bates/py_decoupler/.venv/lib/python3.9/site-packages/decoupler/pre.py:258, in get_net_mat(net)
255 targets = X.index.values
256 X = X.values
--> 258 return sources.astype('U'), targets.astype('U'), X.astype(np.float32)

TypeError: float() argument must be a string or a number, not 'NAType'
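
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the traceback shows a `pd.NA` value reaching `X.astype(np.float32)`. That can happen when the `weight` column uses a pandas nullable dtype such as `Int64`, because pivoting then fills absent source/target pairs with `pd.NA` instead of `NaN`. The sketch below uses a made-up toy network, not CollecTRI, and does not call decoupler. It only shows that casting `weight` to a plain float dtype before pivoting lets the wide matrix convert to `float32`.

```python
import numpy as np
import pandas as pd

# Toy network (made up for illustration). The nullable Int64 dtype for
# weight is an assumption about what the real data might look like.
toy_net = pd.DataFrame({
    "source": ["Myc", "Smad3", "Smad4"],
    "target": ["Tert", "Jun", "Jun"],
    "weight": pd.array([1, 1, 1], dtype="Int64"),
})

# Cast to a plain NumPy float so that missing pairs become NaN, not pd.NA.
toy_net["weight"] = toy_net["weight"].astype("float64")

# Targets become rows and sources become columns; absent pairs are NaN.
wide = toy_net.pivot(index="target", columns="source", values="weight")

# Replace NaN with 0 and convert to a float32 array.
dense = wide.fillna(0).to_numpy(dtype=np.float32)

print(wide.columns.tolist())
print(dense)
```

On the real network, the analogous check would be to inspect `collectri.dtypes` and, if `weight` is not a NumPy float, cast it with `collectri['weight'] = collectri['weight'].astype(float)` before calling `dc.run_ulm`. Whether this resolves the error reported here is untested.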

Expected behavior
A clear and concise description of what you expected to happen.

System

  • OS: macOS Sequoia (M1)
  • Python version: 3.9.1
  • scanpy version: 1.10.3
  • decoupler version: 1.9.0
  • numpy version: 2.0.2
  • pandas version: 2.2.3

Additional context
It seemed appropriate to open a new issue rather than post to the previous one, since the two seem unrelated to me. I previously reported a problem with get_pseudobulk, which was solved by installing decoupler version 1.9.0. I was unable to install version 1.9.2 as you had suggested. If the solution to the run_ulm error is to install 1.9.2, I need some help doing so.

python3 -m pip install 'decoupler==1.9.2'

ERROR: Could not find a version that satisfies the requirement decoupler==1.9.2 (from versions: 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.3.1, 1.3.2, 1.3.3, 1.3.4, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0)
ERROR: No matching distribution found for decoupler==1.9.2
