[0.7.0] - 2026-04-13
Breaking changes
Drop Python 3.9 support: kompot now requires Python ≥ 3.10 (driven by the mellon ≥ 1.7.0 dependency).
New simplified API
kompot.de(), kompot.da(), and kompot.smooth_expression() now use Settings dataclasses (GPSettings, FDRSettings, FilterSettings, StorageSettings, OutputSettings) so the common case stays simple while advanced options remain discoverable. The old compute_differential_* and compute_smoothed_expression() functions still work but emit a deprecation warning.
dry_run=True on de() prints a resource plan (memory, disk, field overwrites) without running the analysis. It replaces the standalone dry_run_differential_expression().
ModelSettings lets you inject pre-fitted predictors into de(), da(), and smooth_expression() to skip fitting or reuse models across runs.
New features
Null distribution inspection: return_full_results=True now includes a "null" key in the result dict exposing all null gene data: Mahalanobis distances, smoothed expression, fold changes, z-scores, and standard deviations. A lightweight alternative (OutputSettings(return_null_data=True)) returns only the summary table and metadata (gene indices, names, seed, provenance) without the full expression matrices.
External null distributions for FDR: supply your own null distribution instead of relying on column-shuffled null genes.
FDRSettings(null_mahalanobis=...): pre-computed null Mahalanobis distances (e.g., from a control-vs-control run).
FDRSettings(null_expression=(expr1, expr2)): raw null expression matrices fitted through the same GP model.
FDRSettings(combine_with_internal=True): concatenate external and internal null distributions.
kompot.compute_fdr(real_mahal, null_mahal): standalone FDR computation from Mahalanobis distances (no AnnData needed). Returns a DataFrame with mahalanobis, pvalue, local_fdr, tail_fdr, is_de.
kompot.extract_null_distribution(adata): extract Mahalanobis distances from a DE run for reuse as a null distribution elsewhere.
kompot.recompute_fdr(adata, null_mahalanobis): recompute FDR on existing DE results with a new null distribution, updating adata.var in place.
DifferentialExpression.compute_fdr(null_mahal): sklearn-like method to compute FDR after predict(compute_mahalanobis=True).
Empirical variance (GPSettings(use_empirical_variance=True)): estimates per-gene heteroscedastic noise from GP residuals and adjusts Mahalanobis distances accordingly. Works with or without biological replicates.
CenteredLinear kernel for better extrapolation at cell-state boundaries (opt-in via cov_func; default remains Matern52).
More accurate uncertainty: density estimators now use mellon 1.7.1's default Laplacian optimizer instead of ADVI.
Run history and reproducibility
Run parameters are now stored grouped by Settings dataclass, making them directly reconstructible.
RunInfo.call_args() returns a kwargs dict that reproduces the run — edit it and pass to de()/da() to re-run with tweaked parameters.
RunInfo.to_settings() returns the Settings objects from a previous run for inspection.
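The idea of storing run parameters grouped by Settings dataclass, so that they can be rebuilt later, can be sketched as follows. This is a minimal illustration using only the standard library: SketchGPSettings, SketchFDRSettings, store_run, and restore_run are hypothetical names invented for this example and are not kompot's actual classes or storage format.

```python
from dataclasses import asdict, dataclass


# Hypothetical stand-ins for Settings dataclasses (not kompot's real classes).
@dataclass
class SketchGPSettings:
    sigma: float = 1.0
    n_landmarks: int = 5000


@dataclass
class SketchFDRSettings:
    threshold: float = 0.05


# Maps each stored group name back to the class that can rebuild it.
SETTINGS_CLASSES = {"gp": SketchGPSettings, "fdr": SketchFDRSettings}


def store_run(settings):
    """Serialize {group name: dataclass instance} into plain nested dicts."""
    return {name: asdict(obj) for name, obj in settings.items()}


def restore_run(stored):
    """Rebuild the dataclass instances from the grouped plain dicts."""
    return {name: SETTINGS_CLASSES[name](**fields) for name, fields in stored.items()}


original = {"gp": SketchGPSettings(sigma=2.0), "fdr": SketchFDRSettings(threshold=0.1)}
stored = store_run(original)

# Tweak one stored parameter, then rebuild the settings for a re-run.
stored["fdr"]["threshold"] = 0.01
rerun = restore_run(stored)
```

Because each group maps one-to-one onto a dataclass, editing a stored value and rebuilding the object is a dictionary update followed by a constructor call.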
Improvements
Input validation at construction time: all Settings dataclasses now validate fields in __post_init__. Invalid values like GPSettings(sigma=-1) or FDRSettings(threshold=1.5) raise immediately with a clear message instead of failing deep inside mellon or JAX. The public API functions (de(), da(), smooth_expression()) also validate AnnData inputs upfront (obsm key shape, condition existence, condition1 != condition2, gene names, landmarks dimensions).
Plotting functions return Optional[plt.Figure] (controlled by return_fig) instead of (fig, ax) tuples, and no longer call plt.show().
kompot smooth CLI command for single-condition GP smoothing from the command line, matching the full Python API (condition selection, gene subsetting, empirical variance, sample variance).
--no-progress flag added to the DA CLI; progress bars can now be fully suppressed in both DA and DE.
DA CLI now exposes --store-arrays-on-disk, --disk-storage-dir, and --max-memory-ratio, matching the DE CLI's StorageSettings coverage.
FDR is disabled by default when sample_col is provided, because it is not yet calibrated for sample variance. Override with FDRSettings(null_genes=...).
Remove statsmodels dependency.
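The construction-time validation described above can be illustrated with a small sketch. The two classes below are hypothetical stand-ins, not kompot's actual GPSettings or FDRSettings. The exact accepted ranges and the exception type (ValueError here) are assumptions; only the rejection of sigma=-1 and threshold=1.5 comes from the notes above.

```python
from dataclasses import dataclass


@dataclass
class SketchGPSettings:
    """Hypothetical stand-in; the real GPSettings has more fields."""

    sigma: float = 1.0

    def __post_init__(self):
        # Runs right after the generated __init__, so bad values fail early.
        if self.sigma < 0:
            raise ValueError(f"sigma must be non-negative, got {self.sigma}")


@dataclass
class SketchFDRSettings:
    """Hypothetical stand-in; the real FDRSettings has more fields."""

    threshold: float = 0.05

    def __post_init__(self):
        # Assumed valid range: a proportion between 0 and 1 inclusive.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError(f"threshold must be in [0, 1], got {self.threshold}")


def describe_error(factory):
    """Return the validation message raised by factory(), or None if it succeeds."""
    try:
        factory()
    except ValueError as exc:
        return str(exc)
    return None
```

With this pattern, `describe_error(lambda: SketchGPSettings(sigma=-1))` reports the problem at construction time, before any numerical code runs.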
Bug fixes
Restore shared-landmark precomputation in DE (requires mellon ≥ 1.7.1). Mellon's compute_landmarks had a silent string-vs-enum bug: gp_type="fixed" did not match GaussianProcessType.FIXED, so the function returned None instead of taking the documented fall-through. Kompot's shared-landmark precomputation in DifferentialExpression.fit() and the per-condition fallback in ExpressionModel.fit() both routed through this code path. As a result, every DE call silently dropped the cross-condition shared landmark grid (each condition ended up with an independent full GP) and ignored the user-supplied random_state for landmark selection (mellon's internal _compute_landmarks fell back to the hardcoded DEFAULT_RANDOM_SEED=42). Pinning mellon>=1.7.1 enables the fix transparently; no kompot code changes were required.
Shared landmarks across conditions in DA. DifferentialAbundance.fit() now passes gp_type="fixed" to compute_landmarks and forwards gp_type="fixed" to the per-condition DensityEstimators. Previously, when either condition had fewer cells than n_landmarks, mellon's auto-selection fell back to gp_type=FULL for that estimator and silently discarded the shared-landmark grid that DA had just computed on the combined data. The two density predictors then used independent full GPs, breaking the symmetry assumption behind the Mahalanobis-style abundance comparison. This brings DA into structural parity with DE.
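The string-vs-enum mismatch behind the DE landmark bug is a general Python pitfall, and a minimal reconstruction may help. The code below is not mellon's implementation: SketchGPType, landmarks_buggy, and landmarks_fixed are hypothetical names, and the returned lists merely stand in for landmark arrays.

```python
from enum import Enum


class SketchGPType(Enum):
    """Hypothetical enum standing in for a GP type selector."""

    FULL = "full"
    FIXED = "fixed"


def landmarks_buggy(gp_type, n_landmarks):
    # A plain string never compares equal to a plain Enum member, so
    # gp_type="fixed" matches neither branch below.
    if gp_type == SketchGPType.FULL:
        return []
    if gp_type == SketchGPType.FIXED:
        return list(range(n_landmarks))
    # No explicit return: the function silently yields None.


def landmarks_fixed(gp_type, n_landmarks):
    # Normalize first: the Enum constructor accepts a member or its value.
    gp_type = SketchGPType(gp_type)
    if gp_type is SketchGPType.FULL:
        return []
    return list(range(n_landmarks))
```

Normalizing the argument at the top of the function makes both spellings behave identically and turns an unknown string into an immediate ValueError rather than a silent None.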
Fix tail FDR: replace Benjamini-Hochberg on empirical p-values (which breaks when n_null << n_genes) with an fdrtool-style survival-function ratio, Fdr(d) = S_null(d) / S_mix(d).
Fix cell_filter docs: the parameter includes matching cells rather than excluding them.
Fix missing field_mapping in DA run history: append_to_run_history was called before field_mapping was computed, so DA history entries never recorded which fields were written.
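The survival-function ratio from the tail FDR fix above, Fdr(d) = S_null(d) / S_mix(d), can be sketched with the standard library alone. This is an illustrative sketch, not kompot's implementation. It assumes that S_mix is the empirical survival function of the real genes' distances, that S(d) counts values greater than or equal to d, and that the ratio is capped at 1; the function names are hypothetical.

```python
from bisect import bisect_left


def survival(sorted_values, d):
    """Empirical survival function: fraction of values that are >= d."""
    n = len(sorted_values)
    # bisect_left returns the number of values strictly below d.
    return (n - bisect_left(sorted_values, d)) / n


def tail_fdr(real_distances, null_distances):
    """Estimate Fdr(d) = S_null(d) / S_mix(d) at each real distance d."""
    null_sorted = sorted(null_distances)
    mix_sorted = sorted(real_distances)
    estimates = []
    for d in real_distances:
        s_null = survival(null_sorted, d)
        # s_mix is never zero because d itself is one of the real distances.
        s_mix = survival(mix_sorted, d)
        estimates.append(min(1.0, s_null / s_mix))
    return estimates


# Toy distances: two real genes sit far beyond every null distance.
real = [0.5, 1.0, 1.5, 2.0, 6.0, 8.0]
null = [0.4, 0.9, 1.2, 1.8, 2.1]
fdr = tail_fdr(real, null)
```

Unlike a per-gene empirical p-value, the ratio uses the null sample only through its survival curve, so a null set much smaller than the gene set still gives a usable estimate in the tail.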