Release v0.7.0 · settylab/kompot

[0.7.0] - 2026-04-13

Breaking changes

Drop Python 3.9 support: kompot now requires Python ≥ 3.10 (driven by mellon ≥ 1.7.0 dependency).

New simplified API

kompot.de(), kompot.da(), and kompot.smooth_expression() now use Settings dataclasses (GPSettings, FDRSettings, FilterSettings, StorageSettings, OutputSettings) so the common case stays simple while advanced options remain discoverable. The old compute_differential_* and compute_smoothed_expression() functions still work but emit a deprecation warning.
dry_run=True on de() prints a resource plan (memory, disk, field overwrites) without running the analysis. Replaces the standalone dry_run_differential_expression().
ModelSettings lets you inject pre-fitted predictors into de(), da(), and smooth_expression() to skip fitting or reuse models across runs.

New features

Null distribution inspection: return_full_results=True now includes a "null" key in the result dict exposing all null gene data: Mahalanobis distances, smoothed expression, fold changes, z-scores, and standard deviations. A lightweight alternative (OutputSettings(return_null_data=True)) returns only the summary table and metadata (gene indices, names, seed, provenance) without the full expression matrices.
External null distributions for FDR: supply your own null distribution instead of relying on column-shuffled null genes.
- FDRSettings(null_mahalanobis=...): pre-computed null Mahalanobis distances (e.g., from a control-vs-control run).
- FDRSettings(null_expression=(expr1, expr2)): raw null expression matrices fitted through the same GP model.
- FDRSettings(combine_with_internal=True): concatenate external and internal null distributions.
kompot.compute_fdr(real_mahal, null_mahal): standalone FDR computation from Mahalanobis distances (no AnnData needed). Returns a DataFrame with mahalanobis, pvalue, local_fdr, tail_fdr, is_de.
kompot.extract_null_distribution(adata): extract Mahalanobis distances from a DE run for reuse as a null distribution elsewhere.
kompot.recompute_fdr(adata, null_mahalanobis): recompute FDR on existing DE results with a new null distribution, updating adata.var in place.
DifferentialExpression.compute_fdr(null_mahal): sklearn-like method to compute FDR after predict(compute_mahalanobis=True).
Empirical variance (GPSettings(use_empirical_variance=True)): estimates per-gene heteroscedastic noise from GP residuals and adjusts Mahalanobis distances accordingly. Works with or without biological replicates.
CenteredLinear kernel for better extrapolation at cell-state boundaries (opt-in via cov_func; default remains Matern52).
More accurate uncertainty: density estimators now use mellon 1.7.1's default Laplacian optimizer instead of ADVI.

Run history and reproducibility

Run parameters are now stored grouped by Settings dataclass, making them directly reconstructible.
RunInfo.call_args() returns a kwargs dict that reproduces the run — edit it and pass to de()/da() to re-run with tweaked parameters.
RunInfo.to_settings() returns the Settings objects from a previous run for inspection.

Improvements

Input validation at construction time: all Settings dataclasses now validate fields in __post_init__. Invalid values like GPSettings(sigma=-1) or FDRSettings(threshold=1.5) raise immediately with a clear message instead of failing deep inside mellon or JAX. The public API functions (de(), da(), smooth_expression()) also validate AnnData inputs upfront (obsm key shape, condition existence, condition1 != condition2, gene names, landmarks dimensions).
Plotting functions return Optional[plt.Figure] (controlled by return_fig) instead of (fig, ax) tuples, and no longer call plt.show().
Consistent parameter naming across plot functions: background_color_key → color, de_column → direction_column, embedding_key → basis.
RunInfo HTML display now shows parameters hierarchically by Settings group (gp.sigma, fdr.threshold, …) instead of a flat list.
RunComparison shows individual changed fields (e.g. gp.ls_factor: 10.0 → 5.0) instead of opaque dict diffs.
kompot smooth CLI command for single-condition GP smoothing from the command line, matching the full Python API (condition selection, gene subsetting, empirical variance, sample variance).
--no-progress flag added to the DA CLI; progress bars can now be fully suppressed in both DA and DE.
DA CLI now exposes --store-arrays-on-disk, --disk-storage-dir, and --max-memory-ratio, matching the DE CLI's StorageSettings coverage.
FDR is disabled by default when sample_col is provided (not yet calibrated for sample variance). Override with FDRSettings(null_genes=...).
Remove statsmodels dependency.

Bug fixes

Restore shared-landmark precomputation in DE (requires mellon ≥ 1.7.1). Mellon's compute_landmarks had a silent string-vs-enum bug where gp_type="fixed" did not match GaussianProcessType.FIXED, causing the function to return None instead of the documented fall-through. Kompot's shared-landmark precomputation in DifferentialExpression.fit() and the per-condition fallback in ExpressionModel.fit() both routed through this code path, so on every DE call kompot was silently dropping the cross-condition shared landmark grid (each condition ended up with an independent full GP) and ignoring the user-supplied random_state for landmark selection (mellon's internal _compute_landmarks fell back to the hardcoded DEFAULT_RANDOM_SEED=42). Pinning mellon>=1.7.1 enables the fix transparently — no kompot code changes were required.
Shared landmarks across conditions in DA. DifferentialAbundance.fit() now passes gp_type="fixed" to compute_landmarks and forwards gp_type="fixed" to the per-condition DensityEstimators. Previously, when either condition had fewer cells than n_landmarks, mellon's auto-selection fell back to gp_type=FULL for that estimator, silently discarding the shared-landmark grid that DA had just computed on the combined data — the two density predictors then used independent full GPs, breaking the symmetry assumption behind the Mahalanobis-style abundance comparison. This brings DA into structural parity with DE.
Fix local FDR numerical instability (Grenander estimator replaces statsmodels Poisson GLM).
Fix tail FDR: replace Benjamini-Hochberg on empirical p-values (which breaks when n_null << n_genes) with fdrtool-style survival function ratio Fdr(d) = S_null(d) / S_mix(d).
Fix cell_filter docs: parameter includes matching cells, not excludes.
Fix missing field_mapping in DA run history: append_to_run_history was called before field_mapping was computed, so DA history entries never recorded which fields were written.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.7.0

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

[0.7.0] - 2026-04-13

Breaking changes

New simplified API

New features

Run history and reproducibility

Improvements

Bug fixes

Uh oh!