Commit 7e5babd

committed
Updated docs
1 parent f8de9ea commit 7e5babd

7 files changed

Lines changed: 209 additions & 9 deletions

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ and this project adheres to [Semantic Versioning][].
1010

1111
## 2.0.0
1212

13-
Major update to accomodate the scverse template.
13+
Major update to accommodate the scverse template {cite}`scverse`.
1414

1515
All functions have been rewritten to follow the new API, errors when running previous versions (`1.X.X`) are expected if `decoupler >= 2.0.0` is installed.
1616

src/decoupler/mt/_consensus.py

Lines changed: 32 additions & 1 deletion
@@ -65,9 +65,40 @@ def consensus(
6565
result: dict | AnnData,
6666
verbose: bool = False,
6767
) -> Tuple[pd.DataFrame, pd.DataFrame] | None:
68-
"""
68+
r"""
6969
Consensus score across methods.
7070
71+
For each method, enrichment scores are split into positive and negative subsets
72+
and transformed independently into z-scores.
73+
74+
1. Subset values based on sign (direction).
75+
2. Mirror each subset into positive and negative values with the same magnitude.
76+
3. Compute z-scores for each subset: :math:`z_i = \frac{x_i - \mu}{\sigma}`.
77+
4. Restore the original signs to the z-scored values.
78+
79+
This transformation ensures comparability across methods while preserving the
80+
biological interpretation of activation (positive) and inhibition (negative).
81+
The final consensus enrichment score :math:`ES` is computed as the mean of
82+
these signed z-scores across methods.
83+
84+
.. math::
85+
86+
ES = \frac{\sum_{m=1}^{M} z_{i}^{(m)}}{M}
87+
88+
Where:
89+
90+
- :math:`M` is the number of methods
91+
- :math:`z_{i}^{(m)}` is the z-score from method :math:`m`.
92+
93+
A two-sided :math:`p_{value}` is then calculated from the consensus score using
94+
the survival function of the standard normal distribution.
95+
96+
.. math::
97+
98+
p = 2 \times \mathrm{sf}\bigl(\lvert \mathrm{ES} \rvert \bigr)
99+
100+
%(yestest)s
101+
71102
Parameters
72103
----------
73104
result
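
The four steps and the two formulas in this docstring can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `_consensus.py`: the helper names `mirrored_zscore` and `consensus_sketch` are hypothetical, and the population standard deviation is assumed for :math:`\sigma`.

```python
import numpy as np
from scipy import stats


def mirrored_zscore(x: np.ndarray) -> np.ndarray:
    """Signed z-scores computed separately for positive and negative values."""
    x = np.asarray(x, dtype=float)
    z = np.zeros_like(x)
    for mask in (x > 0, x < 0):  # step 1: subset by sign
        vals = np.abs(x[mask])
        if vals.size == 0:
            continue
        # Step 2: mirror the subset around zero, so its mean is 0 by construction.
        mirrored = np.concatenate([vals, -vals])
        sigma = mirrored.std()  # assumption: population standard deviation
        # Steps 3 and 4: z = (x - 0) / sigma, which keeps the original sign.
        z[mask] = x[mask] / sigma
    return z


def consensus_sketch(scores: dict[str, np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
    """Mean of signed z-scores across methods and a two-sided p-value."""
    z = np.stack([mirrored_zscore(s) for s in scores.values()])
    es = z.mean(axis=0)
    pv = 2 * stats.norm.sf(np.abs(es))
    return es, pv


# Hypothetical scores from two methods on very different scales.
a = np.array([1.0, 2.0, -1.0, -3.0])
es, pv = consensus_sketch({"method_a": a, "method_b": 10 * a})
```

Because each subset is scaled by its own spread, the two hypothetical methods contribute identical z-scores even though their raw scores differ by a factor of ten.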

src/decoupler/mt/_gsea.py

Lines changed: 2 additions & 2 deletions
@@ -198,7 +198,7 @@ def _func_gsea(
198198
199199
ES = L_{arg max |L|}
200200
201-
When multiple random permutations are done, statistical significance is assessed via empirical testing.
201+
When multiple random permutations are done (``times > 1``), statistical significance is assessed via empirical testing.
202202
203203
.. math::
204204
@@ -220,7 +220,7 @@ def _func_gsea(
220220
- :math:`\mu{+}` is the mean of positive values in :math:`ES_{rand}`
221221
- :math:`\mu{-}` is the mean of negative values in :math:`ES_{rand}`
222222
223-
Finally, the obtained math:`p_value` are adjusted by Benjamini-Hochberg correction.
223+
%(yestest)s
224224
225225
%(params)s
226226
%(times)s

src/decoupler/mt/_ora.py

Lines changed: 1 addition & 1 deletion
@@ -176,7 +176,7 @@ def _func_ora(
176176
.. figure:: /_static/images/ora.png
177177
:alt: Over Representation Analysis (ORA) schematic.
178178
:align: center
179-
:width: 75%
179+
:width: 100%
180180
181181
Over Representation Analysis (ORA) scheme.
182182

src/decoupler/mt/_viper.py

Lines changed: 92 additions & 1 deletion
@@ -181,9 +181,100 @@ def _func_viper(
181181
penalty: int | float = 20,
182182
verbose: bool = False,
183183
) -> Tuple[np.ndarray, np.ndarray]:
184-
"""
184+
r"""
185185
Virtual Inference of Protein-activity by Enriched Regulon analysis (VIPER) :cite:`viper`.
186186
187+
This approach first ranks features based on their absolute values and computes a one-tail score.
188+
189+
.. math::
190+
191+
\begin{align}
192+
w &= \frac{w}{max(|w|)} \\
193+
l_{orig} &= 1_{w \neq 0} \\
194+
l &= \frac{l_{orig}}{\sum_{i=1}^{k} \frac{l_i}{max(l_{orig})}max(l_{orig})} \\
195+
q^{norm} &= \Phi^{-1}(2|q-0.5| + (1 + max(|q-0.5|))) \\
196+
S_1 &= \sum_{i=1}^{k}q_i^{norm}l_i(1-|w_i|) \\
197+
\end{align}
198+
199+
Where:
200+
201+
- :math:`w \in [-1, +1]` is a vector of interaction weights across features
202+
- :math:`l \in [0, 1]` is a vector of interaction likelihoods across features
203+
- :math:`q \in [0, 1]` is a vector of quantiles based on the molecular readouts across features
204+
- :math:`k` is the number of features in :math:`q`
205+
- :math:`\Phi^{-1}` is the inverse of the cumulative distribution function of the standard normal distribution
206+
- :math:`q^{norm} \in [-\infty,+\infty]` are the z-scores of the deviation of quantiles from 0.5
207+
208+
:math:`S_1` encodes the magnitude of the enrichment score, irrespective of the interaction signs in ``net``.
209+
210+
Then, :math:`q` are z-transformed and weighted by their interaction strength and likelihood.
211+
212+
.. math::
213+
214+
S_2 = \sum_{i=1}^{k}w_il_i(\Phi^{-1}(q_i))
215+
216+
In this case, :math:`S_2` takes the direction (sign) of interactions into consideration.
217+
218+
Afterwards, a summary score :math:`S_3` is obtained.
219+
220+
.. math::
221+
222+
S_3 =
223+
\begin{cases}
224+
(|S_2| + S_1) \times \mathrm{sgn}(S_2) & \text{if } S_1 > 0 \\
225+
S_2 & \text{if } S_1 \leq 0
226+
\end{cases}
227+
228+
An enrichment score :math:`ES` is obtained by comparing :math:`S_3` to a
229+
null model generated through an analytical approach that shuffles features.
230+
231+
.. math::
232+
233+
ES = S_3\sqrt{\sum_{i=1}^{k}l_{orig,i}^{2}}
234+
235+
Together with a :math:`p_{value}`
236+
237+
.. math::
238+
239+
p_{value} = \Phi(ES)
240+
241+
Additionally, when computing multiple sources simultaneously, a pleiotropic correction is employed.
242+
243+
In brief, all possible pairs of sources AB are generated under two conditions:
244+
245+
1. both A and B are significantly enriched (p < ``reg_sign=0.05``)
246+
2. they share at least ``n_targets=10`` features
247+
248+
Subsequently, an :math:`ES` and its associated :math:`p_{value}` are computed for
249+
both A (:math:`pA`) and B (:math:`pB`) based only on the shared features.
250+
Then the pleiotropy score (:math:`PS`) is computed.
251+
252+
.. math::
253+
254+
PS =
255+
\begin{cases}
256+
\frac{1}{(1+|\log_{10}(pB) - \log_{10}(pA)|)^{\frac{20}{n_a}}} \text{ if } pA < pB \\
257+
\frac{1}{(1+|\log_{10}(pA) - \log_{10}(pB)|)^{\frac{20}{n_b}}} \text{ if } pA > pB
258+
\end{cases}
259+
260+
Where:
261+
262+
- :math:`n_a` is the number of test pairs involving the source A
263+
- :math:`n_b` is the number of test pairs involving the source B
264+
265+
This score is used to update :math:`l_{orig}`.
266+
267+
.. math::
268+
269+
l_{orig, i} =
270+
\begin{cases}
271+
PS \times 1_{\{i \in A\}} \text{ if } pA < pB \\
272+
PS \times 1_{\{i \in B\}} \text{ if } pA > pB
273+
\end{cases}
274+
275+
A new :math:`ES` and :math:`p_{value}` are calculated following all
276+
the previous steps but using the updated :math:`l_{orig}`.
277+
187278
%(yestest)s
188279
189280
%(params)s
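
The summary score :math:`S_3` and the pleiotropy score :math:`PS` described in this docstring can be sketched as below. This is only an illustration of the two formulas: the function names are hypothetical, it is assumed that the exponent numerator 20 corresponds to the ``penalty`` default in the signature, and the case :math:`S_1 = 0` is treated like :math:`S_1 < 0`.

```python
import numpy as np


def summary_score(s1: float, s2: float) -> float:
    """S_3: add the one-tail magnitude S_1 to |S_2| only when S_1 is positive."""
    if s1 > 0:
        return float((abs(s2) + s1) * np.sign(s2))
    return float(s2)


def pleiotropy_score(
    p_a: float, p_b: float, n_a: int, n_b: int, penalty: float = 20
) -> float:
    """PS for a pair of sources A and B, from p-values on their shared features."""
    # The exponent uses the number of test pairs of the more significant source.
    n = n_a if p_a < p_b else n_b
    diff = abs(np.log10(p_a) - np.log10(p_b))
    return float(1.0 / (1.0 + diff) ** (penalty / n))


s3_pos = summary_score(s1=1.0, s2=-2.0)  # S_1 > 0: magnitude grows, sign of S_2 kept
s3_neg = summary_score(s1=-1.0, s2=-2.0)  # S_1 <= 0: S_2 is returned unchanged
ps = pleiotropy_score(p_a=1e-3, p_b=1e-1, n_a=20, n_b=10)
```

With equal p-values the score is 1, and it shrinks as the gap between the two p-values on the shared features widens.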

src/decoupler/mt/_waggr.py

Lines changed: 47 additions & 1 deletion
@@ -146,9 +146,55 @@ def _func_waggr(
146146
seed: int | float = 42,
147147
verbose: bool = False,
148148
) -> Tuple[np.ndarray, np.ndarray]:
149-
"""
149+
r"""
150150
Weighted Aggregate (WAGGR) :cite:`decoupler`.
151151
152+
This approach aggregates the molecular features :math:`x_i` from one observation :math:`i` with
153+
the feature weights :math:`w` of a given feature set :math:`j` into an enrichment score :math:`ES`.
154+
155+
This method can use any aggregation function, which by default is the weighted mean.
156+
157+
.. math::
158+
159+
ES = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}
160+
161+
Another simpler option is the weighted sum.
162+
163+
.. math::
164+
165+
ES = \sum_{i=1}^{n} w_i x_i
166+
167+
Alternatively, this method can also take any defined function :math:`f` as long as it aggregates :math:`x_i` and
168+
:math:`w` into a single :math:`ES`.
169+
170+
.. math::
171+
172+
ES = f(w_i, x_i)
173+
174+
This functionality makes it relatively easy to implement and try new enrichment methods.
175+
176+
When multiple random permutations are done (``times > 1``), statistical significance is assessed via empirical testing.
177+
178+
.. math::
179+
180+
p_{value}=\frac{ES_{rand} \geq ES}{P}
181+
182+
Where:
183+
184+
- :math:`ES_{rand}` are the enrichment scores of the random permutations
185+
- :math:`P` is the total number of permutations
186+
187+
Additionally, :math:`ES` is updated to a normalized enrichment score :math:`NES`.
188+
189+
.. math::
190+
191+
NES = \frac{ES - \mu(ES_{rand})}{\sigma(ES_{rand})}
192+
193+
Where:
194+
195+
- :math:`\mu` is the mean
196+
- :math:`\sigma` is the standard deviation
197+
152198
%(yestest)s
153199
154200
%(params)s
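
The aggregation, empirical p-value, and :math:`NES` formulas in this docstring can be sketched as below. This is a rough sketch rather than the `_func_waggr` implementation: the names `wmean`, `wsum`, and `waggr_sketch` are hypothetical, the null model is assumed to shuffle features, and the p-value follows the one-sided formula exactly as written in the docstring.

```python
import numpy as np


def wmean(x: np.ndarray, w: np.ndarray) -> float:
    """Weighted mean of feature values x with weights w."""
    return float(np.sum(w * x) / np.sum(w))


def wsum(x: np.ndarray, w: np.ndarray) -> float:
    """Weighted sum of feature values x with weights w."""
    return float(np.sum(w * x))


def waggr_sketch(x, w, fun=wmean, times=1000, seed=42):
    """Return ES, NES, and an empirical p-value from `times` feature shuffles."""
    rng = np.random.default_rng(seed)
    es = fun(x, w)
    # Null distribution: recompute the score after shuffling the features.
    es_rand = np.array([fun(rng.permutation(x), w) for _ in range(times)])
    pv = float(np.mean(es_rand >= es))  # fraction of random scores >= ES
    nes = float((es - es_rand.mean()) / es_rand.std())
    return es, nes, pv


# Hypothetical data: 10 features, the feature set covers the 3 largest values.
x = np.arange(10, dtype=float)
w = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=float)
es, nes, pv = waggr_sketch(x, w)
```

Any callable with the same `(x, w)` signature can be passed as `fun`, which mirrors the docstring's point that new aggregation functions are easy to try.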

src/decoupler/mt/_zscore.py

Lines changed: 34 additions & 2 deletions
@@ -15,9 +15,41 @@ def _func_zscore(
1515
flavor: str = 'RoKAI',
1616
verbose: bool = False,
1717
) -> Tuple[np.ndarray, np.ndarray]:
18-
"""
18+
r"""
1919
Z-score (ZSCORE) :cite:`zscore`.
2020
21+
This approach computes the mean value of the molecular features for known targets,
22+
optionally subtracts the overall mean of all measured features,
23+
and normalizes the result by the standard deviation of all features and the square
24+
root of the number of targets.
25+
26+
This formulation was originally introduced in KSEA, which explicitly includes the
27+
subtraction of the global mean to compute the enrichment score :math:`ES`.
28+
29+
.. math::
30+
31+
ES = \frac{(\mu_s-\mu_p) \times \sqrt m }{\sigma}
32+
33+
Where:
34+
35+
- :math:`\mu_s` is the mean of targets
36+
- :math:`\mu_p` is the mean of all features
37+
- :math:`m` is the number of targets
38+
- :math:`\sigma` is the standard deviation of all features
39+
40+
However, in the RoKAI implementation, this global mean subtraction was omitted.
41+
42+
.. math::
43+
44+
ES = \frac{\mu_s \times \sqrt m }{\sigma}
45+
46+
A two-sided :math:`p_{value}` is then calculated from the enrichment score using
47+
the survival function :math:`sf` of the standard normal distribution.
48+
49+
.. math::
50+
51+
p = 2 \times \mathrm{sf}\bigl(\lvert \mathrm{ES} \rvert \bigr)
52+
2153
%(yestest)s
2254
2355
%(params)s
@@ -41,7 +73,7 @@ def _func_zscore(
4173
n = np.sqrt(np.count_nonzero(adj, axis=0))
4274
mean = mat.dot(adj) / np.sum(np.abs(adj), axis=0)
4375
es = ((mean - mean_all.reshape(-1, 1)) * n) / stds.reshape(-1, 1)
44-
pv = sts.norm.cdf(-np.abs(es))
76+
pv = 2 * sts.norm.sf(np.abs(es))
4577
return es, pv
4678

4779
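
The two flavors and the two-sided p-value described in the docstring can be compared with a small sketch. It is illustrative only: `zscore_sketch` is a hypothetical helper that works on a single observation, the flavor string `"KSEA"` is assumed as the counterpart to the `'RoKAI'` default, and the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np
from scipy import stats


def zscore_sketch(x, is_target, flavor="RoKAI"):
    """Enrichment score and two-sided p-value for one observation."""
    x = np.asarray(x, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    mu_s = x[is_target].mean()  # mean of the targets
    m = np.count_nonzero(is_target)  # number of targets
    sigma = x.std(ddof=1)  # assumption: sample standard deviation
    # KSEA subtracts the mean of all features; RoKAI omits that term.
    mu_p = x.mean() if flavor == "KSEA" else 0.0
    es = (mu_s - mu_p) * np.sqrt(m) / sigma
    pv = 2 * stats.norm.sf(abs(es))
    return float(es), float(pv)


# Hypothetical readouts for five features, two of which are targets.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
is_target = [False, False, False, True, True]
es_rokai, pv_rokai = zscore_sketch(x, is_target, flavor="RoKAI")
es_ksea, pv_ksea = zscore_sketch(x, is_target, flavor="KSEA")
```

When the global mean is positive, as in this toy input, the RoKAI flavor yields the larger score because nothing is subtracted from the target mean.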
