Starting from version 0.20.1, this format is based on Keep a Changelog, and this project adheres to Semantic Versioning. The full commit history is available in the commit logs.
- Added a flag to turn on or off Importance Sampling in {class}`scvi.external.RESOLVI` {meth}`~scvi.external.RESOLVI.differential_expression`, {pr}`3708`.
- Add dispersion tests, including support for {class}`scvi.external.SCVIVA`, {pr}`3677`.
- Add support for Pandas 3, {pr}`3638`.
- Add support for running scVI-Tools on TPU, {pr}`3690`.
- Fix checkpointing for {class}`scvi.model.TOTALVI`, {pr}`3651`.
- Fix Integrated Gradients receiving continuous and categorical covariates in the reverse order, {pr}`3660`.
- Fix loading of minified adata into a non-minified model, {pr}`3691`.
- Change the use of Figshare as storage to SCVERSE S3, {pr}`3667`.
- Change explicit training configuration objects for scvi-tools, reducing reliance on loose kwargs and improving clarity across training APIs, {pr}`3666`.
- Removed all Jax tests from mandatory tests and put them under a special tag, {pr}`3703`.
- Add MLFlow support, {pr}`3573`.
- Add support for MuData during Ray autotune {pr}`3545`.
- Add {meth}`~scvi.external.TorchMRVI.get_normalized_expression` function to {class}`scvi.external.TorchMRVI`, {pr}`3579`.
- Add support for anndatabatch dataloading, {pr}`36XX`.
- Add modality auto-ordering for mudata in {class}`~scvi.model.MULTIVI` {pr}`3622` and fix DE.
- Fix {class}`scvi.model.TOTALVI` convert_legacy_save function with updated model parameters {pr}`3561`.
- Fix configurable mdata filename for {class}`scvi.autotune.AutotuneExperiment`, {pr}`3580`.
- Fix inference on GPU for PyTorch implementation of {class}`scvi.external.MRVI`, {pr}`3586`.
- Fix model loading and DE with labels in PyTorch implementation of {class}`scvi.external.MRVI`, {pr}`3615`.
- Fix non-multi-GPU training to keep history in memory, and not on disk, by default {pr}`3543`.
- Fix missing model history for multi-GPU training, and add an option to log on step {pr}`3516`.
- Fix external indices validation in {class}`scvi.dataloaders.SemiSupervisedDataSplitter` {pr}`3601`.
- Fix issues with hub model loaded from path during scArches query and model loaded with adata=None while adata does not exist {pr}`3628`.
- Fix `batch_size` in {class}`scvi.external.ContrastiveVI` data loader {pr}`3629`.
- Update model {class}`scvi.model.DestVI` with fine cell-type classifier {pr}`3380`.
- Removed graceful shutdown from Jupyter notebook, {pr}`3556`.
- Removed several {class}`~scvi.external.SCBASSET` tests that caused failure on GitHub actions, {pr}`3632`.
- Add a PyTorch implementation of {class}`scvi.external.MRVI` {pr}`3304`.
- Add checkpointing with {class}`scvi.autotune.AutotuneExperiment` {pr}`3452`.
- Add downstream analysis functions multi-GPU support {pr}`3443`.
- Add {class}`scvi.external.CYTOVI` for dealing with cytometry data {pr}`3456`.
- Add scArches support for {class}`scvi.external.SCVIVA` {pr}`3494`.
- Add a backend parameter and make the class {class}`scvi.external.MRVI` a wrapper to the Jax or Torch implementations {pr}`3498`.
- Add Lightning checkpointing to trainer fit {pr}`3501`.
- Add MuData minification option to {class}`~scvi.model.MULTIVI` {pr}`3039`.
- Add solver parameter for SVD stability in {class}`scvi.autotune.AutotuneExperiment` {pr}`3524`.
- Fix library size calculation in {class}`scvi.model.TOTALVI` {pr}`3452`.
- Fix scArches surgery in {class}`scvi.external.SysVI` {pr}`3466`.
- Fix VAE load size mismatch when using extra covariates with custom datamodule {pr}`3461`.
- Fix {class}`~scvi.external.POISSONVI` differential_accessibility {pr}`3473`.
- Made the Jax dependency optional in scvi-tools {pr}`3426`.
- Remove the support for Python 3.10, {pr}`3441`.
- Remove the support for setup_anndata in {class}`~scvi.model.MULTIVI`, {pr}`3486`.
- Add support for using AnnCollection {class}`scvi.dataloaders.CollectionAdapter` dataloader for {class}`scvi.model.SCVI` and {class}`scvi.model.SCANVI`, {pr}`3362`.
- Add a fix to {func}`~scvi.model.SCVI.differential_expression`, {pr}`3418`.
- Add {class}`scvi.module.base.SupervisedModuleClass` to the classifier, {pr}`3430`.
- Temporarily pinned the Jax version to <0.7.0 to be able to install numpyro.
- Removed bad legacy code in {class}`scvi.model.base.ArchesMixin`, {pr}`3417`.
- Removed deprecated {class}`scvi.train.SaveBestState` from code {pr}`3420`.
- Added posterior predictive samples batch projection. {pr}`3369`.
- Added getting protein probabilities in {class}`~scvi.model.MULTIVI` {pr}`3341`.
- Add {class}`scvi.external.SCVIVA` for representation of cells and their environments in spatial transcriptomics {pr}`3172`.
- Add support for Python 3.13 {pr}`3247`.
- Fix bug in {class}`scvi.external.TOTALANVI` scArches {pr}`3355`.
- Fix bug in {class}`scvi.external.MRVI` downstream analysis use of external adata {pr}`3324`.
- Fix bug in perplexity calculation in {class}`scvi.model.AmortizedLDA` {pr}`3373`.
- Update Read the Docs tutorials with one main preprocessing tutorial {pr}`3363`.
- Removed default arguments from test function parameters due to ruff pre-commit v0.12.0 with PT028 rule {pr}`3393`.
- Add {class}`scvi.external.METHYLANVI` for modeling methylation-labeled data from single-cell bisulfite sequencing (scBS-seq) {pr}`3066`.
- Add supervised module class {class}`scvi.module.base.SupervisedModuleClass` {pr}`3237`.
- Add `get_normalized` model property for any generative model, and change `get_accessibility_estimates` to `get_normalized_accessibility` where needed {pr}`3238`
- Add {class}`scvi.external.TOTALANVI` for modeling single-cell RNA and CITE-seq protein data that integrates semi-supervised cell type annotations to jointly infer both protein expression and cell states {pr}`3259`
- Add Custom Dataloaders registry support {pr}`2932`
- Add support for using Census and LaminAI custom dataloaders for {class}`scvi.model.SCVI` and {class}`scvi.model.SCANVI` {pr}`2932`
- Add early stopping KL warmup steps {pr}`3262`
- Add a minification option to {class}`~scvi.model.LinearSCVI` {pr}`3294`
- Update the Read the Docs tutorials index page with interactive, filterable options {pr}`3276`
- Handle missing `monitor` during early stopping {pr}`3226`
- Fix a bug in {class}`scvi.external.SysVI` `get_normalized_expression` {pr}`3255`
- Add support for Integrated Gradients for multimodal models {pr}`3264`
- Fix a bug in resolVI `get_normalized_expression` {pr}`3308`
- Fix a bug in resolVI gene-assay dispersion {pr}`3308`
- Updated Scvi-Tools AWS hub to Weizmann instead of Berkeley. {pr}`3246`.
- Updated resolVI to use rapids-singlecell. {pr}`3308`.
- Removed Jax version constraint for mrVI training. {pr}`3309`.
- Add {class}`scvi.external.Decipher` for dimensionality reduction and interpretable representation learning in single-cell RNA sequencing data {pr}`3015`, {pr}`3091`.
- Add multiGPU support for {class}`~scvi.model.SCVI`, {class}`~scvi.model.SCANVI`, {class}`~scvi.model.CondSCVI`, {class}`~scvi.model.LinearSCVI`, {class}`~scvi.model.TOTALVI`, {class}`~scvi.model.MULTIVI` and {class}`~scvi.model.PEAKVI`. {pr}`3125`.
- Add an exception callback to {class}`scvi.train._callbacks.SaveCheckpoint` in order to save the optimal model during training in case of failure because of NaNs in gradients. {pr}`3159`.
- Add {meth}`~scvi.model.SCVI.get_normalized_expression` for models: {class}`~scvi.model.PEAKVI`, {class}`~scvi.external.POISSONVI`, {class}`~scvi.model.CondSCVI`, {class}`~scvi.model.AUTOZI`, {class}`~scvi.external.CellAssign` and {class}`~scvi.external.GIMVI`. {pr}`3121`.
- Add {class}`scvi.external.RESOLVI` for bias correction in single-cell resolved spatial transcriptomics {pr}`3144`.
- Add semisupervised training mixin class {class}`scvi.model.base.SemisupervisedTrainingMixin`. {pr}`3164`.
- Add scib-metrics support for {class}`scvi.autotune.AutotuneExperiment` and {class}`scvi.train._callbacks.ScibCallback` for autotune for scib metrics {pr}`3168`.
- Add support for dask arrays in AnnTorchDataset. {pr}`3193`.
- Add a common use cases section in the docs' user guide. {pr}`3200`.
- Add {class}`scvi.external.SysVI` for cycle consistency loss and VampPrior {pr}`3195`.
- Fixed bug in distributed {class}`scvi.dataloaders.ConcatDataLoader` {pr}`3053`.
- Fixed bug when loading Pyro-based models and scArches support for Pyro {pr}`3138`
- Disabled vmap in {class}`scvi.external.MRVI` for large sample sizes to avoid out-of-memory errors. Store distance matrices as a numpy array in xarray to reduce memory usage {pr}`3146`.
- Fixed {class}`scvi.external.MRVI` MixtureSameFamily log probability calculation {pr}`3189`.
- Updated the CI workflow with multiGPU tests {pr}`3053`.
- Set `mode="change"` as the default DE method. Compute positive and negative LFC separately by default (`test_mode="three"`). Corrected the computation of pseudocounts and made it the default to add a pseudocount for genes that are not expressed (`pseudocount=None`), according to Eq. 10 of Boyeau et al., PNAS 2023 {pr}`2826`
- Add MuData Minification option to {class}`~scvi.model.TOTALVI` {pr}`3061`.
- Add Support for MPS usage in Mac {pr}`3100`.
- Add support for torch compile before train (EXPERIMENTAL) {pr}`2931`.
- Add support for Numpy 2.0 {pr}`2842`.
- Changed scvi-hub ModelCard and add criticism metrics to the card {pr}`3078`.
- MuData support for {class}`~scvi.model.MULTIVI` via the method {meth}`~scvi.model.MULTIVI.setup_mudata` {pr}`3038`.
- Fixed batch_size pop to get in {class}`scvi.dataloaders.DataSplitter` {pr}`3128`.
- Updated the CI workflow with internet, private and optional tests {pr}`3082`.
- Changed loompy stored files to anndata {pr}`2842`.
- Address AnnData >= 0.11 deprecation warning for {class}`anndata.experimental` by replacing instances with {class}`anndata.abc` and {class}`anndata.io` {pr}`3085`.
- Removed the support for loompy and local mde function {pr}`2842`.
- Added adaptive handling for last training minibatch of 1–2 cells in case of `datasplitter_kwargs={"drop_last": False}` and `train_size = None` by moving them into the validation set, if available. {pr}`3036`.
- Add `batch_key` and `labels_key` to {meth}`scvi.external.SCAR.setup_anndata`. {pr}`3045`.
- Implemented variance of ZINB distribution. {pr}`3044`.
- Support for minified mode while retaining counts to skip the encoder.
- New Training plan argument `update_only_decoder` to use stored latent codes and skip training of the encoder.
- Refactored code for minified models. {pr}`2883`.
- Add {class}`scvi.external.METHYLVI` for modeling methylation data from single-cell bisulfite sequencing (scBS-seq) experiments {pr}`2834`.
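
The ZINB variance mentioned in the list above can be checked numerically. The following is a minimal sketch and not the code from the referenced pull request: it assumes the common parameterization with negative binomial mean `mu`, inverse dispersion `theta`, and zero-inflation probability `pi` (all names here are illustrative), and compares the textbook closed form with moments summed directly from the probability mass function.

```python
import math


def zinb_variance(mu: float, theta: float, pi: float) -> float:
    """Closed-form variance of a zero-inflated negative binomial.

    Assumes X = 0 with probability pi, otherwise X ~ NB(mean=mu, inverse dispersion=theta).
    """
    nb_var = mu + mu**2 / theta  # variance of the negative binomial component
    return (1.0 - pi) * nb_var + pi * (1.0 - pi) * mu**2


def nb_pmf(k: int, mu: float, theta: float) -> float:
    """Negative binomial pmf in the mean / inverse-dispersion parameterization."""
    log_p = (
        math.lgamma(k + theta)
        - math.lgamma(theta)
        - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def zinb_variance_from_pmf(mu: float, theta: float, pi: float, k_max: int = 2000) -> float:
    """Variance obtained by summing the first two moments over the pmf (truncated at k_max)."""
    mean = 0.0
    second_moment = 0.0
    for k in range(1, k_max):  # k = 0 contributes nothing to either moment
        p = (1.0 - pi) * nb_pmf(k, mu, theta)
        mean += k * p
        second_moment += k * k * p
    return second_moment - mean**2


closed_form = zinb_variance(mu=5.0, theta=2.0, pi=0.3)
numeric = zinb_variance_from_pmf(mu=5.0, theta=2.0, pi=0.3)
print(closed_form, numeric)
```

The two values should agree up to floating-point and truncation error, which makes this a convenient sanity check when comparing against a library implementation.
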
- Breaking Change: Fix `get_outlier_cell_sample_pairs` function in {class}`scvi.external.MRVI` to correctly compute the maximum log-density across in-sample cells rather than the aggregated posterior log-density {pr}`3007`.
- Fix references to `scvi.external` in {meth}`scvi.external.SCAR.setup_anndata`.
- Fix gimVI to append mini batches first into CPU during get_imputed and get_latent operations {pr}`3058`.
- Add support for Python 3.12 {pr}`2966`.
- Add support for categorical covariates in scArches in {class}`scvi.model.base.ArchesMixin` {pr}`2936`.
- Add assertion error in cellAssign for checking duplicates in celltype markers {pr}`2951`.
- Add {meth}`scvi.external.POISSONVI.get_region_factors` {pr}`2940`.
- {attr}`scvi.settings.dl_persistent_workers` allows using persistent workers in {class}`scvi.dataloaders.AnnDataLoader` {pr}`2924`.
- Add option for using external indexes in data splitting classes that are under `scvi.dataloaders` by passing `external_indexing=list[train_idx,valid_idx,test_idx]` as well as in all models available {pr}`2902`.
- Add warning if creating data splits in `scvi.dataloaders` that create a last batch with fewer than 3 cells {pr}`2916`.
- Add new experimental functional API for hyperparameter tuning with {func}`scvi.autotune.run_autotune` and {class}`scvi.autotune.AutotuneExperiment` to replace {class}`scvi.autotune.ModelTuner`, {class}`scvi.autotune.TunerManager`, and {class}`scvi.autotune.TuneAnalysis` {pr}`2561`.
- Add experimental class {class}`scvi.nn.Embedding` implementing methods for extending embeddings {pr}`2574`.
- Add experimental support for representing batches with continuously valued embeddings by passing in `batch_representation="embedding"` to {class}`scvi.model.SCVI` {pr}`2576`.
- Add experimental mixin classes {class}`scvi.model.base.EmbeddingMixin` and {class}`scvi.module.base.EmbeddingModuleMixin` {pr}`2576`.
- Add the option to generate synthetic spatial coordinates in {func}`scvi.data.synthetic_iid` with argument `generate_coordinates` {pr}`2603`.
- Add experimental support for using custom {class}`lightning.pytorch.core.LightningDataModule`s in {func}`scvi.autotune.run_autotune` {pr}`2605`.
- Add {class}`scvi.external.VELOVI` for RNA velocity estimation using variational inference {pr}`2611`.
- Add `unsigned` argument to {meth}`scvi.hub.HubModel.pull_from_s3` to allow for unsigned downloads of models from AWS S3 {pr}`2615`.
- Add support for `batch_key` in {meth}`scvi.model.CondSCVI.setup_anndata` {pr}`2626`.
- Add support for {meth}`scvi.model.base.RNASeqMixin` in {class}`scvi.model.CondSCVI` {pr}`2915`.
- Add `load_best_on_end` argument to {class}`scvi.train.SaveCheckpoint` to load the best model state at the end of training {pr}`2672`.
- Add experimental class {class}`scvi.distributions.BetaBinomial` implementing the Beta-Binomial distribution with mean-dispersion parameterization for modeling scBS-seq methylation data {pr}`2692`.
- Add support for custom dataloaders in {class}`scvi.model.base.VAEMixin` methods by specifying the `dataloader` argument {pr}`2748`.
- Add the option to use a normal distribution in the generative model of {class}`scvi.model.SCVI` by passing in `gene_likelihood="normal"` {pr}`2780`.
- Add {class}`scvi.external.MRVI` for modeling sample-level heterogeneity in single-cell RNA-seq data {pr}`2756`.
- Add support for reference mapping with {class}`mudata.MuData` models to {class}`scvi.model.base.ArchesMixin` {pr}`2578`.
- Add argument `return_mean` to {meth}`scvi.model.base.VAEMixin.get_reconstruction_error` and {meth}`scvi.model.base.VAEMixin.get_elbo` to allow computation without averaging across cells {pr}`2362`.
- Add support for setting `weights="importance"` in {meth}`scvi.model.SCANVI.differential_expression` {pr}`2362`.
- Deprecate {func}`scvi.data.cellxgene`, to be removed in v1.3. Please directly use the cellxgene-census instead {pr}`2542`.
- Deprecate {func}`scvi.nn.one_hot`, to be removed in v1.3. Please directly use the `one_hot` function in PyTorch instead {pr}`2608`.
- Deprecate {class}`scvi.train.SaveBestState`, to be removed in v1.3. Please use {class}`scvi.train.SaveCheckpoint` instead {pr}`2673`.
- Deprecate `save_best` argument in {meth}`scvi.model.PEAKVI.train` and {meth}`scvi.model.MULTIVI.train`, to be removed in v1.3. Please pass in `enable_checkpointing` or specify a custom checkpointing procedure with {class}`scvi.train.SaveCheckpoint` instead {pr}`2673`.
- Move {func}`scvi.model.base._utils._load_legacy_saved_files` to {func}`scvi.model.base._save_load._load_legacy_saved_files` {pr}`2731`.
- Move {func}`scvi.model.base._utils._load_saved_files` to {func}`scvi.model.base._save_load._load_saved_files` {pr}`2731`.
- Move {func}`scvi.model.base._utils._initialize_model` to {func}`scvi.model.base._save_load._initialize_model` {pr}`2731`.
- Move {func}`scvi.model.base._utils._validate_var_names` to {func}`scvi.model.base._save_load._validate_var_names` {pr}`2731`.
- Move {func}`scvi.model.base._utils._prepare_obs` to {func}`scvi.model.base._de_core._prepare_obs` {pr}`2731`.
- Move {func}`scvi.model.base._utils._de_core` to {func}`scvi.model.base._de_core._de_core` {pr}`2731`.
- Move {func}`scvi.model.base._utils._fdr_de_prediction` to {func}`scvi.model.base._de_core_._fdr_de_prediction` {pr}`2731`.
- {func}`scvi.data.synthetic_iid` now generates unique variable names for protein and accessibility data {pr}`2739`.
- The `data_module` argument in {meth}`scvi.model.base.UnsupervisedTrainingMixin.train` has been renamed to `datamodule` for consistency {pr}`2749`.
- Change the default saving method of variable names for {class}`mudata.MuData` based models (e.g. {class}`scvi.model.TOTALVI`) to a dictionary of per-mod variable names instead of a concatenated array of all variable names. Users may replicate the previous behavior by passing in `legacy_mudata_format=True` to {meth}`scvi.model.base.BaseModelClass.save` {pr}`2769`.
- Changed internal activation function in {class}`scvi.nn.DecoderTOTALVI` to Softplus to increase numerical stability. This is the new default for new models. Previously trained models will be loaded with exponential activation function {pr}`2913`.
- Fix logging of accuracy for cases with one sample per class in scANVI {pr}`2938`.
- Disable adversarial classifier if training with a single batch. Previously this raised a None error {pr}`2914`.
- {meth}`~scvi.model.SCVI.get_normalized_expression` fixed for Poisson distribution and Negative Binomial with latent_library_size {pr}`2915`.
- Fix {meth}`scvi.module.VAE.marginal_ll` when `n_mc_samples_per_pass=1` {pr}`2362`.
- Enable the option to drop the last minibatch during training via `datasplitter_kwargs={"drop_last": True}` {pr}`2926`.
- Fix JAX to be deterministic on CUDA when seed is manually set {pr}`2923`.
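
To make the effect of the `drop_last` option in the list above concrete, here is a small sketch in plain Python. It is not scvi-tools code; the helper `minibatch_sizes` is hypothetical and only mimics how a data loader partitions `n_obs` observations into minibatches.

```python
def minibatch_sizes(n_obs: int, batch_size: int, drop_last: bool) -> list[int]:
    """Return the size of every minibatch in one epoch (hypothetical helper)."""
    n_full, remainder = divmod(n_obs, batch_size)
    sizes = [batch_size] * n_full
    # The trailing, incomplete minibatch is kept only when drop_last is False.
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


# 257 cells with a batch size of 128 leave a trailing minibatch of a single cell.
print(minibatch_sizes(257, 128, drop_last=False))
print(minibatch_sizes(257, 128, drop_last=True))
```

A trailing minibatch of one or two cells is the case the changelog entries on small last batches refer to; dropping it trades a few unused training cells per epoch for uniformly sized minibatches.
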
- Remove {class}`scvi.autotune.ModelTuner`, {class}`scvi.autotune.TunerManager`, and {class}`scvi.autotune.TuneAnalysis` in favor of new experimental functional API with {func}`scvi.autotune.run_autotune` and {class}`scvi.autotune.AutotuneExperiment` {pr}`2561`.
- Remove `feed_labels` argument and corresponding code paths in {meth}`scvi.module.SCANVAE.loss` {pr}`2644`.
- Remove {class}`scvi.train._callbacks.MetricsCallback` and argument `additional_val_metrics` in {class}`scvi.train.Trainer` {pr}`2646`.
- Breaking change: In `scvi.autotune._manager` we changed the parameter in RunConfig from `local_dir` to `storage_path`, see issue `2908` {pr}`2689`.
- Add argument `return_logits` to {meth}`scvi.external.SOLO.predict` that allows returning logits instead of probabilities when passing in `soft=True` to replicate the buggy behavior previous to v1.1.3 {pr}`2870`.
- Breaking change: Fix {meth}`scvi.external.SOLO.predict` to correctly return probabilities instead of logits when passing in `soft=True` (the default option) {pr}`2689`.
- Breaking change: Fix {class}`scvi.dataloaders.SemiSupervisedDataSplitter` to properly sample unlabeled observations without replacement {pr}`2816`.
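
The distinction between logits and probabilities in the SOLO entry above can be illustrated with a generic softmax. This is a sketch of the general relationship only, not the SOLO implementation; the two-class ordering in the example is an assumption made for illustration.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert unnormalized logits into probabilities that sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical classifier output for one cell, ordered as (doublet, singlet).
logits = [2.0, -1.0]
probs = softmax(logits)
print(probs)
```

Logits can be any real numbers, whereas probabilities lie in [0, 1] and sum to one, so downstream thresholds written for one scale are not valid for the other.
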
- Address AnnData >= 0.10 deprecation warning for {func}`anndata.read` by replacing instances with {func}`anndata.read_h5ad` {pr}`2531`.
- Address AnnData >= 0.10 deprecation warning for {class}`anndata._core.sparse_dataset.SparseDataset` by replacing instances with {class}`anndata.abc.CSCDataset` and {class}`anndata.abc.CSRDataset` {pr}`2531`.
- Correctly apply non-default user parameters in {class}`scvi.external.POISSONVI` {pr}`2522`.
- Add {class}`scvi.external.ContrastiveVI` for contrastiveVI {pr}`2242`.
- Add {class}`scvi.dataloaders.BatchDistributedSampler` for distributed training {pr}`2102`.
- Add `additional_val_metrics` argument to {class}`scvi.train.Trainer`, allowing to specify additional metrics to compute and log during the validation loop using {class}`scvi.train._callbacks.MetricsCallback` {pr}`2136`.
- Expose `accelerator` and `device` arguments in {meth}`scvi.hub.HubModel.load_model` {pr}`2166`.
- Add `load_sparse_tensor` argument in {class}`scvi.data.AnnTorchDataset` for directly loading SciPy CSR and CSC data structures to their PyTorch counterparts, leading to faster data loading depending on the sparsity of the data {pr}`2158`.
- Add per-group LFC information to {meth}`scvi.criticism.PosteriorPredictiveCheck.differential_expression`. `metrics["diff_exp"]` is now a dictionary where `summary` stores the summary dataframe, and `lfc_per_model_per_group` stores the per-group LFC {pr}`2173`.
- Expose {meth}`torch.save` keyword arguments in {class}`scvi.model.base.BaseModelClass.save` and {class}`scvi.external.GIMVI.save` {pr}`2200`.
- Add `model_kwargs` and `train_kwargs` arguments to {meth}`scvi.autotune.ModelTuner.fit` {pr}`2203`.
- Add `datasplitter_kwargs` to model `train` methods {pr}`2204`.
- Add `use_posterior_mean` argument to {meth}`scvi.model.SCANVI.predict` for stochastic prediction of cell type labels {pr}`2224`.
- Add support for Python 3.10+ type annotations in {class}`scvi.autotune.ModelTuner` {pr}`2239`.
- Add the option to log device statistics in {meth}`scvi.autotune.ModelTuner.fit` with argument `monitor_device_stats` {pr}`2260`.
- Add the option to pass in a random seed to {meth}`scvi.autotune.ModelTuner.fit` with argument `seed` {pr}`2260`.
- Automatically log the learning rate when `reduce_lr_on_plateau=True` in training plans {pr}`2280`.
- Add {class}`scvi.external.POISSONVI` to model scATAC-seq fragment counts with a Poisson distribution {pr}`2249`
- {class}`scvi.train.SemiSupervisedTrainingPlan` now logs the classifier calibration error {pr}`2299`.
- Passing `enable_checkpointing=True` into `train` methods is now compatible with our model saves. Additional options can be specified by initializing with {class}`scvi.train.SaveCheckpoint` {pr}`2317`.
- {attr}`scvi.settings.dl_num_workers` is now correctly applied as the default `num_workers` in {class}`scvi.dataloaders.AnnDataLoader` {pr}`2322`.
- Passing in `indices` to {class}`scvi.criticism.PosteriorPredictiveCheck` allows for running metrics on a subset of the data {pr}`2361`.
- Add `seed` argument to {func}`scvi.model.utils.mde` for reproducibility {pr}`2373`.
- Add {meth}`scvi.hub.HubModel.save` and {meth}`scvi.hub.HubMetadata.save` {pr}`2382`.
- Add support for Optax 0.1.8 by renaming instances of {func}`optax.additive_weight_decay` to {func}`optax.add_weight_decay` {pr}`2396`.
- Add support for hosting {class}`scvi.hub.HubModel` on AWS S3 via {meth}`scvi.hub.HubModel.pull_from_s3` and {meth}`scvi.hub.HubModel.push_to_s3` {pr}`2378`.
- Add a clearer error message for {func}`scvi.data.poisson_gene_selection` when input data does not contain raw counts {pr}`2422`.
- Add API for using custom dataloaders with {class}`scvi.model.SCVI` by making `adata` argument optional on initialization and adding optional argument `data_module` to {meth}`scvi.model.base.UnsupervisedTrainingMixin.train` {pr}`2467`.
- Add support for Ray 2.8–2.9 in {class}`scvi.autotune.ModelTuner` {pr}`2478`.
- Fix bug where `n_hidden` was not being passed into {class}`scvi.nn.Encoder` in {class}`scvi.model.AmortizedLDA` {pr}`2229`
- Fix bug in {class}`scvi.module.SCANVAE` where classifier probabilities were interpreted as logits. This is backwards compatible as loading older models will use the old code path {pr}`2301`.
- Fix bug in {class}`scvi.external.GIMVI` where `batch_size` was not properly used in inference methods {pr}`2366`.
- Fix error message formatting in {meth}`scvi.data.fields.LayerField.transfer_field` {pr}`2368`.
- Fix ambiguous error raised in {meth}`scvi.distributions.NegativeBinomial.log_prob` and {meth}`scvi.distributions.ZeroInflatedNegativeBinomial.log_prob` when `scale` not passed in and value not in support {pr}`2395`.
- Fix initialization of {class}`scvi.distributions.NegativeBinomial` and {class}`scvi.distributions.ZeroInflatedNegativeBinomial` when `validate_args=True` and optional parameters not passed in {pr}`2395`.
- Fix error when re-initializing {class}`scvi.external.GIMVI` with the same datasets {pr}`2446`.
- Replace `sparse` with `sparse_format` argument in {meth}`scvi.data.synthetic_iid` for increased flexibility over dataset format {pr}`2163`.
- Revalidate `devices` when automatically switching from MPS to CPU accelerator in {func}`scvi.model._utils.parse_device_args` {pr}`2247`.
- Refactor {class}`scvi.data.AnnTorchDataset`, now loads continuous data as {class}`numpy.float32` and categorical data as {class}`numpy.int64` by default {pr}`2250`.
- Support fractional GPU usage in {class}`scvi.autotune.ModelTuner` {pr}`2252`.
- Tensorboard is now the default logger in {class}`scvi.autotune.ModelTuner` {pr}`2260`.
- Match `momentum` and `epsilon` in {class}`scvi.module.JaxVAE` to the default values in PyTorch {pr}`2309`.
- Change {class}`scvi.train.SemiSupervisedTrainingPlan` and {class}`scvi.train.ClassifierTrainingPlan` accuracy and F1 score computations to use `"micro"` reduction rather than `"macro"` {pr}`2339`.
- Internal refactoring of {meth}`scvi.module.VAE.sample` and {meth}`scvi.model.base.RNASeqMixin.posterior_predictive_sample` {pr}`2377`.
- Change `xarray` and `sparse` from mandatory to optional dependencies {pr}`2480`.
- Use {class}`anndata.abc.CSCDataset` and {class}`anndata.abc.CSRDataset` instead of the deprecated {class}`anndata._core.sparse_dataset.SparseDataset` for type checks {pr}`2485`.
- Make `use_observed_lib_size` argument adjustable in {class}`scvi.module.LDVAE` {pr}`2494`.
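
The switch from `"macro"` to `"micro"` reduction noted in the list above changes reported accuracy whenever classes are imbalanced. The sketch below uses the usual definitions (micro pools all observations, macro averages per-class accuracy) in plain Python; it does not reproduce the training-plan code, and the labels are made up for illustration.

```python
from collections import defaultdict


def micro_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Fraction of correct predictions, pooled over all observations."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of the per-class accuracies."""
    counts = defaultdict(lambda: [0, 0])  # label -> [n_correct, n_total]
    for t, p in zip(y_true, y_pred):
        counts[t][0] += int(t == p)
        counts[t][1] += 1
    per_class = [n_correct / n_total for n_correct, n_total in counts.values()]
    return sum(per_class) / len(per_class)


# Eight cells of a common type and two of a rare type; one rare cell is misclassified.
y_true = ["T cell"] * 8 + ["B cell"] * 2
y_pred = ["T cell"] * 8 + ["T cell", "B cell"]
print(micro_accuracy(y_true, y_pred), macro_accuracy(y_true, y_pred))
```

With micro reduction, abundant cell types dominate the logged metric, while macro reduction weights every class equally, so values logged before and after this change are not directly comparable.
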
- Remove deprecated `use_gpu` argument in favor of PyTorch Lightning arguments `accelerator` and `devices` {pr}`2114`.
- Remove deprecated `scvi._compat.Literal` class {pr}`2115`.
- Remove chex dependency {pr}`2482`.
- Add support for AnnData 0.10.0 {pr}`2271`.
- Disable the default selection of MPS when `accelerator="auto"` in Lightning {pr}`2167`.
- Change JAX models to use `dict` instead of {class}`flax.core.FrozenDict` according to the Flax migration guide google/flax#3191 {pr}`2222`.
- Fix bug in {class}`scvi.model.base.PyroSviTrainMixin` where `training_plan` argument is ignored {pr}`2162`.
- Fix missing docstring for `unlabeled_category` in {class}`scvi.model.SCANVI.setup_anndata` and reorder arguments {pr}`2189`.
- Fix Pandas 2.0 unpickling error in {meth}`scvi.model.base.BaseModelClass.convert_legacy_save` by switching to {func}`pandas.read_pickle` for the setup dictionary {pr}`2212`.
- Fix link to Scanpy preprocessing in introduction tutorial {pr}`2154`.
- Fix link to Ray Tune search API in autotune tutorial {pr}`2154`.
- Add support for Python 3.11 {pr}`1977`.
- Upper bound Chex dependency to 0.1.8 due to NumPy installation conflicts {pr}`2132`.
- Add {class}`scvi.criticism.PosteriorPredictiveCheck` for model evaluation {pr}`2058`.
- Add {func}`scvi.data.reads_to_fragments` for scATAC data {pr}`1946`
- Add default `stacklevel` for `warnings` in `scvi.settings` {pr}`1971`.
- Add scBasset motif injection procedure {pr}`2010`.
- Add importance-sampling-based differential expression procedure {pr}`1872`.
- Raise clearer error when initializing {class}`scvi.external.SOLO` from {class}`scvi.model.SCVI` with extra categorical or continuous covariates {pr}`2027`.
- Add the option to generate {class}`mudata.MuData` in {meth}`scvi.data.synthetic_iid` {pr}`2028`.
- Add option for disabling shuffling prior to splitting data in {class}`scvi.dataloaders.DataSplitter` {pr}`2037`.
- Add {meth}`scvi.data.AnnDataManager.create_torch_dataset` and expose custom sampler ability {pr}`2036`.
- Log training loss through Lightning's progress bar {pr}`2043`.
- Filter Jax undetected GPU warnings {pr}`2044`.
- Raise warning if MPS backend is selected for PyTorch, see pytorch/pytorch#77764 {pr}`2045`.
- Add `deregister_manager` function to {class}`scvi.model.base.BaseModelClass`, allowing to clear {class}`scvi.data.AnnDataManager` instances from memory {pr}`2060`.
- Add the option to use a linear classifier in {class}`scvi.model.SCANVI` {pr}`2063`.
- Add lower bound 0.12.1 for Numpyro dependency {pr}`2078`.
- Add a new section in scBasset tutorial for motif scoring {pr}`2079`.
- Fix creation of minified adata by copying original uns dict {pr}`2000`. This issue arises with anndata>=0.9.0.
- Fix {class}`scvi.model.TOTALVI` and {class}`scvi.model.MULTIVI` handling of missing protein values {pr}`2009`.
- Fix bug in {meth}`scvi.distributions.NegativeBinomialMixture.sample` where `theta` and `mu` arguments were switched around {pr}`2024`.
- Fix bug in {meth}`scvi.dataloaders.SemiSupervisedDataLoader.resample_labels` where the labeled dataloader was not being reinitialized on subsample {pr}`2032`.
- Fix typo in {class}`scvi.model.JaxSCVI` example snippet {pr}`2075`.
- Use sphinx book theme for documentation {pr}`1673`.
- {meth}`scvi.model.base.RNASeqMixin.posterior_predictive_sample` now outputs 3-d {class}`sparse.GCXS` matrices {pr}`1902`.
- Add an option to specify `dropout_ratio` in {meth}`scvi.data.synthetic_iid` {pr}`1920`.
- Update to lightning 2.0 {pr}`1961`
- Hyperopt is a new default searcher for tuner {pr}`1961`
- {class}`scvi.train.AdversarialTrainingPlan` no longer encodes data twice during a training step, instead uses same latent for both optimizers {pr}`1961`, {pr}`1980`
- Switch back to using sphinx autodoc typehints {pr}`1970`.
- Disable default seed, run `scvi.settings.seed` after import for reproducibility {pr}`1976`.
- Deprecate `use_gpu` in favor of PyTorch Lightning arguments `accelerator` and `devices`, to be removed in v1.1 {pr}`1978`.
- Docs organization {pr}`1983`.
- Validate training data and code URLs for {class}`scvi.hub.HubMetadata` and {class}`scvi.hub.HubModelCardHelper` {pr}`1985`.
- Keyword arguments for encoders and decoders can now be passed in from the model level {pr}`1986`.
- Expose `local_dir` as a public property in {class}`scvi.hub.HubModel` {pr}`1994`.
- Use {func}`anndata.concat` internally inside {meth}`scvi.external.SOLO.from_scvi_model` {pr}`2013`.
- {class}`scvi.train.SemiSupervisedTrainingPlan` and {class}`scvi.train.ClassifierTrainingPlan` now log accuracy, F1 score, and AUROC metrics {pr}`2023`.
- Switch to cellxgene census for backend for cellxgene data function {pr}`2030`.
- Change default `max_cells` and `truncation` in {meth}`scvi.model.base.RNASeqMixin.get_importance_weights` {pr}`2064`.
- Refactor heuristic for default `max_epochs` as a separate function {meth}`scvi.model._utils.get_max_epochs_heuristic` {pr}`2083`.
- Remove the ability to set up ST data in {class}`~scvi.external.SpatialStereoscope.from_rna_model`, which was deprecated. ST data should be set up using {class}`~scvi.external.SpatialStereoscope.setup_anndata` {pr}`1949`.
- Remove custom reusable doc decorator which was used for the docs {pr}`1970`.
- Remove `drop_last` as an integer from {class}`~scvi.dataloaders.AnnDataLoader`, add typing and code cleanup {pr}`1975`.
- Remove seqfish and seqfish plus datasets {pr}`2017`.
- Remove support for Python 3.8 (NEP 29) {pr}`2021`.
- Fix totalVI differential expression when integer sequential protein names are automatically used {pr}`1951`.
- Fix peakVI scArches test case {pr}`1962`.
- Allow passing in `map_location` into {meth}`~scvi.hub.HubMetadata.from_dir` and {meth}`~scvi.hub.HubModelCardHelper.from_dir` and set default to `"cpu"` {pr}`1960`.
- Updated tutorials {pr}`1966`.
- Fix `return_dist` docstring of {meth}`scvi.model.base.VAEMixin.get_latent_representation` {pr}`1932`.
- Fix hyperlink to pymde docs {pr}`1944`
- Use ruff for fixing and linting {pr}`1921`, {pr}`1941`.
- Use sphinx autodoc instead of sphinx-autodoc-typehints {pr}`1941`.
- Remove .flake8 and .prospector files {pr}`1923`.
- Log individual loss terms in {meth}`scvi.module.MULTIVAE.loss` {pr}`1936`.
- Setting up ST data in {class}`~scvi.external.SpatialStereoscope.from_rna_model` is deprecated. ST data should be set up using {class}`~scvi.external.SpatialStereoscope.setup_anndata` {pr}`1803`.
- Fixed the computation of ELBO during training plan logging when using global kl terms. {pr}`1895`
- Fixed usage of {class}`scvi.train.SaveBestState` callback, which affected {class}`scvi.model.PEAKVI` training. If using {class}`~scvi.model.PEAKVI`, please upgrade. {pr}`1913`
- Fixed the original seed for jax-based models to work with jax 0.4.4. {pr}`1907`, {pr}`1909`
- Model hyperparameter tuning is available through {class}`~scvi.autotune.ModelTuner` (beta) {pr}`1785`, {pr}`1802`, {pr}`1831`.
- Pre-trained models can now be uploaded to and downloaded from Hugging Face models using the {mod}`~scvi.hub` module {pr}`1779`, {pr}`1812`, {pr}`1828`, {pr}`1841`, {pr}`1851`, {pr}`1862`.
- {class}`~anndata.AnnData` `.var` and `.varm` attributes can now be registered through new fields in {mod}`~scvi.data.fields` {pr}`1830`, {pr}`1839`.
- {class}`~scvi.external.SCBASSET`, a reimplementation of the original scBasset model, is available for representation learning of scATAC-seq data (experimental) {pr}`1839`, {pr}`1844`, {pr}`1867`, {pr}`1874`, {pr}`1882`.
- {class}`~scvi.train.LowLevelPyroTrainingPlan` and {class}`~scvi.model.base.PyroModelGuideWarmup` added to allow the use of vanilla PyTorch optimization on Pyro models {pr}`1845`, {pr}`1847`.
- Add {meth}`scvi.data.cellxgene` function to download cellxgene datasets {pr}`1880`.
- Latent mode support changed so that user data is no longer edited in-place {pr}`1756`.
- Minimum supported PyTorch Lightning version is now 1.9 {pr}`1795`, {pr}`1833`, {pr}`1863`.
- Minimum supported Python version is now 3.8 {pr}`1819`.
- Poetry removed in favor of Hatch for builds and publishing {pr}`1823`.
- `setup_anndata` docstrings fixed, `setup_mudata` docstrings added {pr}`1834`, {pr}`1837`.
- {meth}`~scvi.data.add_dna_sequence` adds DNA sequences to {class}`~anndata.AnnData` objects using genomepy {pr}`1839`, {pr}`1842`.
- Update tutorial formatting with pre-commit {pr}`1850`
- Expose `accelerators` and `devices` arguments in {class}`~scvi.train.Trainer` {pr}`1864`.
- Development in GitHub Codespaces is now supported {pr}`1836`.
- {class}`~scvi.module.base.LossRecorder` has been removed in favor of {class}`~scvi.module.base.LossOutput` {pr}`1869`.
- {class}`~scvi.train.JaxTrainingPlan` now correctly updates `global_step` through PyTorch Lightning by using a dummy optimizer {pr}`1791`.
- CUDA compatibility issue fixed in {meth}`~scvi.distributions.ZeroInflatedNegativeBinomial.sample` {pr}`1813`.
- Device-backed {class}`~scvi.dataloaders.AnnTorchDataset` fixed to work with sparse data {pr}`1824`.
- Fix bug in {meth}`~scvi.model.base._log_likelihood.compute_reconstruction_error` causing the first batch to be ignored; see more details in {issue}`1854` {pr}`1857`.
- {ghuser}`adamgayoso`
- {ghuser}`eroell`
- {ghuser}`gokceneraslan`
- {ghuser}`macwiatrak`
- {ghuser}`martinkim0`
- {ghuser}`saroudant`
- {ghuser}`vitkl`
- {ghuser}`watiss`
- {class}`~scvi.train.TrainingPlan` allows custom PyTorch optimizers #1747.
- Improvements to {class}`~scvi.train.JaxTrainingPlan` #1747 #1749.
- {class}`~scvi.module.base.LossRecorder` is deprecated. Please substitute with {class}`~scvi.module.base.LossOutput` #1749.
- All training plans require keyword args after the first positional argument #1749.
- {class}`~scvi.module.base.JaxBaseModuleClass` absorbed features from the `JaxModuleWrapper`, rendering the `JaxModuleWrapper` obsolete, so it was removed #1751.
- Add {class}`scvi.external.Tangram` and {class}`scvi.external.tangram.TangramMapper` that implement Tangram for mapping scRNA-seq data to spatial data #1743.
- Remove confusing warning about kl warmup, log kl weight instead #1773
- {class}`~scvi.module.base.LossRecorder` no longer allows access to dictionaries of values if provided during initialization #1749.
- `JaxModuleWrapper` removed #1751.
- Fix `n_proteins` usage in {class}`~scvi.model.MULTIVI` #1737.
- Remove unused param in {class}`~scvi.model.MULTIVI` #1741.
- Fix random seed handling for Jax models #1751.
- Add latent mode support in {class}`~scvi.model.SCVI` #1672. This allows for loading a model using latent representations only (i.e., without the full counts). Not only does this speed up inference by using the cached latent distribution parameters (thus skipping the encoding step), but it also helps in scenarios where the full counts are not available but cached latent parameters are. We provide utility functions and methods to dynamically convert a model to latent mode.
- Added {class}`~scvi.external.SCAR` as an external model for ambient RNA removal #1683.
- Faster inference in PyTorch with `torch.inference_mode` #1695.
- Upgrade to Lightning 1.6 #1719.
- Update the CI workflow to separate static code checking from pytest #1710.
- Add Python 3.10 to CI workflow #1711.
- Add {meth}`~scvi.data.AnnDataManager.register_new_fields` #1689.
- Use sphinx-contrib-bibtex for references #1731.
- {meth}`~scvi.model.base.VAEMixin.get_latent_representation`: more explicit and better docstring #1732.
- Replace custom attrdict with {class}`~ml_collections` implementation #1696.
- Add weight support to {class}`~scvi.model.MULTIVI` #1697. Old models can't be loaded anymore.
- Support for PyTorch Lightning 1.7 #1622.
- Allow `flax` to use any mutable states used by a model generically with {class}`~scvi.module.base.TrainStateWithState` #1665, #1700.
- Update publication links in `README` #1667.
- Docs now include floating window cross-references with `hoverxref`, external links with `linkcode`, and `grid` #1678.
- Fix `get_likelihood_parameters()` failure when `gene_likelihood != "zinb"` in {class}`~scvi.model.base.RNASeqMixin` #1618.
- Fix exception logic when not using the observed library size in {class}`~scvi.module.VAE` initialization #1660.
- Replace instances of `super().__init__()` that passed an argument to `super()`, which caused the `autoreload` extension to throw errors #1671.
- Change the cell2location tutorial that caused the docs build to fail #1674.
- Replace instances of `max_epochs` as `int`s for new PyTorch Lightning #1686.
- Catch case when `torch.backends.mps` is not implemented #1692.
- Fix Poisson sampling in {meth}`~scvi.module.VAE.sample` #1702.
- Move the `training` argument from the {class}`~scvi.module.JaxVAE` constructor to a keyword argument of the call method. This simplifies the {class}`~scvi.module.base.JaxModuleWrapper` logic and avoids the reinstantiation of {class}`~scvi.module.JaxVAE` during evaluation #1580.
- Add a static method on the `BaseModelClass` to return the `AnnDataManager`'s full registry #1617.
- Clarify docstrings for continuous and categorical covariate keys #1637.
- Remove poetry lock, use newer build system #1645.
- Fix CellAssign to accept extra categorical covariates #1629.
- Fix an issue where `max_epochs` was never determined heuristically for totalVI; instead, it would always default to 400 #1639.
Make sure notebooks are up to date for real this time :).
- Experimental MuData support for {class}`~scvi.model.TOTALVI` via the method {meth}`~scvi.model.TOTALVI.setup_mudata`. For several of the existing `AnnDataField` classes, there is now a MuData counterpart with an additional `mod_key` argument used to indicate the modality where the data lives (e.g. {class}`~scvi.data.fields.LayerField` to {class}`~scvi.data.fields.MuDataLayerField`). These modified classes are simply wrapped versions of the original `AnnDataField` code via the new {class}`scvi.data.fields.MuDataWrapper` method #1474.
- Modification of the {meth}`~scvi.module.VAE.generative` method's outputs to return prior and likelihood properties as {class}`~torch.distributions.distribution.Distribution` objects. Concerned modules are {class}`~scvi.module.AmortizedLDAPyroModule`, {class}`AutoZIVAE`, {class}`~scvi.module.MULTIVAE`, {class}`~scvi.module.PEAKVAE`, {class}`~scvi.module.TOTALVAE`, {class}`~scvi.module.SCANVAE`, {class}`~scvi.module.VAE`, and {class}`~scvi.module.VAEC`. This facilitates the manipulation of these distributions for model training and inference #1356.
- Major changes to Jax support for scvi-tools models to generalize beyond {class}`~scvi.model.JaxSCVI`. Support for Jax remains experimental and is subject to breaking changes:
    - Consistent module interface for Flax modules (Jax-backed) via {class}`~scvi.module.base.JaxModuleWrapper`, such that they are compatible with the existing {class}`~scvi.model.base.BaseModelClass` #1506.
    - {class}`~scvi.train.JaxTrainingPlan` now leverages PyTorch Lightning to factor out Jax-specific training loop implementation #1506.
    - Enable basic device management in Jax-backed modules #1585.
- Add {meth}`~scvi.module.base.PyroBaseModuleClass.on_load` callback which is called on {meth}`~scvi.model.base.BaseModuleClass.load` prior to loading the module state dict #1542.
- Refactor metrics code and use {class}`~torchmetrics.MetricCollection` to update metrics in bulk #1529.
- Add `max_kl_weight` and `min_kl_weight` to {class}`~scvi.train.TrainingPlan` #1595.
- Add a warning to {class}`~scvi.model.base.UnsupervisedTrainingMixin` that is raised if `max_kl_weight` is not reached during training #1595.
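
To illustrate how `max_kl_weight` and `min_kl_weight` can bound a KL warmup, and why a warning is useful when training ends before the maximum is reached, here is a minimal sketch. The function names and the epoch-based linear schedule are assumptions made for illustration; this is not the scvi-tools implementation.

```python
def linear_kl_weight(epoch, n_epochs_kl_warmup, min_kl_weight=0.0, max_kl_weight=1.0):
    """Hypothetical linear warmup: anneal from min_kl_weight to max_kl_weight."""
    # No warmup requested, or warmup already finished: use the maximum weight.
    if n_epochs_kl_warmup <= 0 or epoch >= n_epochs_kl_warmup:
        return max_kl_weight
    slope = max_kl_weight - min_kl_weight
    return min_kl_weight + slope * epoch / n_epochs_kl_warmup


def reaches_max_kl_weight(max_epochs, n_epochs_kl_warmup, min_kl_weight=0.0, max_kl_weight=1.0):
    """Return True if any training epoch is run with the maximum KL weight."""
    highest = max(
        (
            linear_kl_weight(epoch, n_epochs_kl_warmup, min_kl_weight, max_kl_weight)
            for epoch in range(max_epochs)
        ),
        default=min_kl_weight,  # covers max_epochs == 0
    )
    return highest >= max_kl_weight


# Training for fewer epochs than the warmup length never reaches the maximum,
# which is the situation a warning would flag.
if not reaches_max_kl_weight(max_epochs=100, n_epochs_kl_warmup=400):
    print("max_kl_weight is not reached during training")
```

Under this assumed schedule, either training for longer or shortening the warmup lets the weight reach its maximum.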
- Any methods relying on the output of `inference` and `generative` from existing scvi-tools models (e.g. {class}`~scvi.model.SCVI`, {class}`~scvi.model.SCANVI`) will need to be modified to accept `torch.Distribution` objects rather than tensors for each parameter (e.g. `px_m`, `px_v`) #1356.
- The signature of {meth}`~scvi.train.TrainingPlan.compute_and_log_metrics` has changed to support the use of {class}`~torchmetrics.MetricCollection`. The typical modification required will look like changing `self.compute_and_log_metrics(scvi_loss, self.elbo_train)` to `self.compute_and_log_metrics(scvi_loss, self.train_metrics, "train")`. The same is necessary for validation metrics except with `self.val_metrics` and the mode `"validation"` #1529.
- Fix issue with {meth}`~scvi.model.SCVI.get_normalized_expression` with multiple samples and additional continuous covariates. This bug originated from {meth}`~scvi.module.VAE.generative` failing to match the dimensions of the continuous covariates with the input when `n_samples>1` in {meth}`~scvi.module.VAE.inference` in multiple module classes #1548.
- Add support for padding layers in {meth}`~scvi.model.SCVI.prepare_query_anndata`, which is necessary to run {meth}`~scvi.model.SCVI.load_query_data` for a model set up with a layer instead of `X` #1575.
Note: When applying any model using the {class}`~scvi.train.AdversarialTrainingPlan` (e.g.
{class}`~scvi.model.TOTALVI`, {class}`~scvi.model.MULTIVI`), you should make sure to use v0.16.4
instead of v0.16.3 or v0.16.2. This release fixes a critical bug in the training plan.
- Fix critical issue in {class}`~scvi.train.AdversarialTrainingPlan` where `kl_weight` was overwritten to 0 at each step (#1566). Users should avoid using v0.16.2 and v0.16.3, which both include this bug.
- Removes sphinx max version and removes jinja dependency (#1555).
- Upper bounds protobuf due to pytorch lightning incompatibilities (#1556). Note that #1556 has unique changes as PyTorch Lightning >=1.6.4 adds the upper bound in their requirements.
- Raise appropriate error when `backup_url` is not provided and the file is missing on {meth}`~scvi.model.base.BaseModelClass.load` (#1527).
- Pipe `loss_kwargs` properly in {class}`~scvi.train.AdversarialTrainingPlan`, and fix incorrectly piped kwargs in {class}`~scvi.model.TOTALVI` and {class}`~scvi.model.MULTIVI` (#1532).
- Update scArches Pancreas tutorial, DestVI tutorial (#1520).
- {class}`~scvi.dataloaders.SemiSupervisedDataLoader` and {class}`~scvi.dataloaders.SemiSupervisedDataSplitter` no longer take `unlabeled_category` as an initial argument. Instead, the `unlabeled_category` is fetched from the labels' state registry, assuming that the {class}`~scvi.data.AnnDataManager` object is registered with a {class}`~scvi.data.fields.LabelsWithUnlabeledObsField` (#1515).
- Bug fixed in {class}`~scvi.model.SCANVI` where `self._labeled_indices` was being improperly set (#1515).
- Fix issue where {class}`~scvi.model.SCANVI.load_query_data` would not properly add an obs column with the unlabeled category when the `labels_key` was not present in the query data.
- Disable extension of categories for labels in {class}`~scvi.model.SCANVI.load_query_data` (#1519).
- Fix an issue with {meth}`~scvi.model.SCANVI.prepare_query_data` to ensure it does nothing when genes are completely matched (#1520).
This release features a refactor of {class}`~scvi.model.DestVI` (#1457):

- Bug fix in cell type amortization, which leads to on-par performance of cell type amortization `V_encoder` with free parameter for cell type proportions `V`.
- Bug fix in library size in {class}`~scvi.model.CondSCVI`, which led to a downstream dependency between the sum over cell type proportions `v_ind` and the library size `library` in {class}`~scvi.model.DestVI`.
- `neg_log_likelihood_prior` is no longer computed on a random subset of single cells but on cell-type-specific subclustering, using cluster variance `var_vprior`, cluster mean, and cluster mixture proportion `mp_vprior` for computation. This leads to more stable results and faster computation time. Setting `vamp_prior_p` in {func}`~scvi.model.DestVI.from_rna_model` to the expected resolution is critical in this algorithm.
- The new default is to also use dropout `dropout` during the decoder of {class}`~scvi.model.CondSCVI` and subsequently `dropout_decoder` in {class}`~scvi.model.DestVI`; we found this to be beneficial after the bug fixes listed above.
- We changed the weighting of the loss on the variances of beta and the prior of eta.
::: {note}
Due to the bug fixes listed above, this version of {class}`~scvi.model.DestVI` is not backwards
compatible. Despite instability in training in the outdated version, we were able to reproduce
results generated with this code. We therefore do not strictly encourage rerunning old experiments.
:::
We published a new tutorial that incorporates a new utility package,
`destvi_utils`, which generates exploratory plots of the
results of {class}`~scvi.model.DestVI`. We refer to the manual of this package for further
documentation.
- Docs changes (installation #1498, {class}`~scvi.model.DestVI` user guide #1501 and #1508, dark mode code cells #1499).
- Add `backup_url` to the {meth}`~scvi.model.base.BaseModelClass.load` method of each model class, enabling automatic downloading of the model save file (#1505).
- Support for legacy loading is removed from {meth}`~scvi.model.base.BaseModelClass.load`. A utility to convert old files to the new file format has been added: {meth}`~scvi.model.base.BaseModelClass.convert_legacy_save` (#1505).
- Breaking changes to {class}`~scvi.model.DestVI` as specified above (#1457).
- {meth}`~scvi.model.base.RNASeqMixin.get_likelihood_parameters` fix for `n_samples > 1` and `dispersion="gene_cell"` #1504.
- Fix backwards compatibility for legacy TOTALVI models #1502.
- Add common types file #1467.
- The new default is to not pin memory during training when using a GPU. This is much better for shared GPU environments without any performance regression #1473.
- Add peakVI publication reference #1463.
- Update notebooks with new install functionality for Colab #1466.
- Simplify changing the training plan for pyro #1470.
- Optionally scale ELBO by a scalar in {class}`~scvi.train.PyroTrainingPlan` #1469.
- Raise `NotImplementedError` when `categorical_covariate_keys` are used with {meth}`scvi.model.SCANVI.load_query_data` (#1458).
- Fix behavior when `continuous_covariate_keys` are used with {meth}`scvi.model.SCANVI.classify` (#1458).
- Unlabeled category values are automatically populated when {meth}`scvi.model.SCANVI.load_query_data` is run on an `adata_target` missing the labels column (#1458).
- Fix dataframe rendering in dark mode docs (#1448).
- Fix variance constraint in {class}`~scvi.model.AmortizedLDA` that set an artificial bound on latent topic variance (#1445).
- Fix {meth}`scvi.model.base.ArchesMixin.prepare_query_data` to work cross-device (e.g., model trained on cuda but method used on cpu; see #1451).
- Remove the setuptools pinned requirement due to the new PyTorch 1.11 fix (#1436).
- Switch to myst-parsed Markdown for docs (#1435).
- Add `prepare_query_data(adata, reference_model)` to {class}`~scvi.model.base.ArchesMixin` to enable query data cleaning prior to reference mapping (#1441).
- Add Human Lung Cell Atlas tutorial (#1442).
- Errors when arbitrary kwargs are passed into `setup_anndata()` (#1439).
- Fix {class}`scvi.external.SOLO` to use `train_size=0.9` by default, which enables early stopping to work properly (#1438).
- Fix scArches version warning (#1431).
- Fix backwards compat for {class}`~scvi.model.SCANVI` loading (#1441).
- Remove `labels_key` from {class}`~scvi.model.MULTIVI` as it is not used in the model (#1393).
- Use scvi-tools mean/inv_disp parameterization of negative binomial for {class}`~scvi.model.JaxSCVI` likelihood (#1386).
- Use `setup` for Flax-based modules (#1403).
- Reimplement {class}`~scvi.module.JaxVAE` using inference/generative paradigm with {class}`~scvi.module.base.JaxBaseModuleClass` (#1406).
- Use multiple particles optionally in {class}`~scvi.model.JaxSCVI` (#1385).
- {class}`~scvi.external.SOLO` no longer warns about count data (#1411).
- Class docs are now one page on the docs' site (#1415).
- Copied AnnData objects are assigned a new uuid and transfer is attempted (#1416).
- Fix an issue with using gene lists and proteins lists as well as `transform_batch` for {class}`~scvi.model.TOTALVI` (#1413).
- Error gracefully when NaNs present in {class}`~scvi.data.fields.CategoricalJointObsmField` (#1417).
In this release, we have completely refactored the logic behind our data handling strategy (i.e.
`setup_anndata`) to allow for:
- Readable data handling for existing models.
- Modular code for the easy addition of custom data fields to incorporate into models.
- Avoidance of unexpected edge cases when more than one model is instantiated in one session.
Important Note: This change will not break pipelines for model users (except a
small change to {class}`~scvi.model.SCANVI`). However, there are several breaking changes for model
developers. The data handling tutorial goes over these changes in detail.

This refactor is centered around the new {class}`~scvi.data.AnnDataManager` class, which
orchestrates any data processing necessary for scvi-tools and stores necessary information, rather
than adding additional fields to the AnnData input.
:::{figure} docs/_static/img/anndata_manager_schematic.svg
:align: center
:alt: Schematic of data handling strategy with AnnDataManager
:class: img-fluid

Schematic of data handling strategy with {class}`~scvi.data.AnnDataManager`
:::
We also have an exciting new experimental Jax-based scVI implementation via
{class}`~scvi.model.JaxSCVI`. While this implementation has limited functionality, we have found it
to be substantially faster than the PyTorch-based implementation. For example, on a 10-core Intel
CPU, Jax on only a CPU can be as fast as PyTorch with a GPU (RTX3090). We will be planning further
Jax integrations in the next releases.
- Major refactor to data handling strategy with the introduction of {class}`~scvi.data.AnnDataManager` (#1237).
- Prevent clobbering between models using the same AnnData object with model instance specific {class}`~scvi.data.AnnDataManager` mappings (#1342).
- Add `size_factor_key` to {class}`~scvi.model.SCVI`, {class}`~scvi.model.MULTIVI`, {class}`~scvi.model.SCANVI`, and {class}`~scvi.model.TOTALVI` (#1334).
- Add references to the scvi-tools journal publication to the README (#1338, #1339).
- Addition of {func}`scvi.model.utils.mde` (#1372) for faster visualization of scvi-tools embeddings.
- Documentation and user guide fixes (#1364, #1361).
- Fix for {class}`~scvi.external.SOLO` when {class}`~scvi.model.SCVI` was set up with a `labels_key` (#1354).
- Updates to tutorials (#1369, #1371).
- Furo docs theme (#1290).
- Add {class}`scvi.model.JaxSCVI` and {class}`scvi.module.JaxVAE`, drop Numba dependency for checking if data is count data (#1367).
- The keyword argument `run_setup_anndata` has been removed from built-in datasets since there is no longer a model-agnostic `setup_anndata` method (#1237).
- The function `scvi.model._metrics.clustering_scores` has been removed due to incompatibility with new data handling (#1237).
- {class}`~scvi.model.SCANVI` now takes `unlabeled_category` as an argument to {meth}`~scvi.model.SCANVI.setup_anndata` rather than on initialization (#1237).
- `setup_anndata` is now a class method on model classes and requires specific function calls to ensure proper {class}`~scvi.data.AnnDataManager` setup and model save/load. Any model inheriting from {class}`~scvi.model.base.BaseModelClass` will need to re-implement this method (#1237).
    - To adapt existing custom models to v0.15.0, one can reference the guidelines below. For some examples of how this was done for the existing models in the codebase, please reference the following PRs: (#1301, #1302).
        - `scvi._CONSTANTS` has been changed to `scvi.REGISTRY_KEYS`.
        - `setup_anndata()` functions are now class functions and follow a specific structure. Please refer to {meth}`~scvi.model.SCVI.setup_anndata` for an example.
        - `scvi.data.get_from_registry()` has been removed. This method can be replaced by {meth}`scvi.data.AnnDataManager.get_from_registry`.
        - The setup dict stored directly on the AnnData object, `adata["_scvi"]`, has been deprecated. Instead, this information now lives in {attr}`scvi.data.AnnDataManager.registry`.
            - The data registry can be accessed at {attr}`scvi.data.AnnDataManager.data_registry`.
            - Summary stats can be accessed at {attr}`scvi.data.AnnDataManager.summary_stats`.
            - Any field-specific information (e.g. `adata.obs["categorical_mappings"]`) now lives in field-specific state registries. These can be retrieved via the function {meth}`~scvi.data.AnnDataManager.get_state_registry`.
        - `register_tensor_from_anndata()` has been removed. To register tensors with no relevant `AnnDataField` subclass, create a new subclass of {class}`~scvi.data.fields.BaseAnnDataField` and add it to the appropriate model's `setup_anndata()` function.
Bug fixes, minor improvements of docs, code formatting.
- Update black formatting to the stable release (#1324)
- Refresh readme, move tasks image to docs (#1311).
- Add a 0.14.5 release note to the index (#1296).
- Add test to ensure extra {class}`~scvi.model.SCANVI` training of a pre-trained {class}`~scvi.model.SCVI` model does not change original model weights (#1284).
- Fix issue in {class}`~scvi.model.TOTALVI` protein background prior initialization to not include protein measurements that are known to be missing (#1282).
- Upper-bound setuptools due to PyTorch import bug (#1309).
Bug fixes, new tutorials.
- Fix `kl_weight` floor for PyTorch-based models (#1269).
- Add support for more Pyro guides (#1267).
- Update scArches, harmonization tutorials, add basic R tutorial, tabula muris label transfer tutorial (#1274).
Bug fixes, some tutorial improvements.
- `kl_weight` handling for Pyro-based models (#1242).
- Allow override of missing protein inference in {class}`~scvi.model.TOTALVI` (#1251). This allows treating all 0s in a particular batch for one protein as biologically valid.
- Fix load documentation (e.g., {meth}`~scvi.model.SCVI.load`, {meth}`~scvi.model.TOTALVI.load`) (#1253).
- Fix model history on load with Pyro-based models (#1255).
- Model construction tutorial uses new static setup anndata (#1257).
- Add a codebase overview figure to docs (#1231).
Bug fix.
- Bug fix to {func}`~scvi.model.base.BaseModelClass` to retain tensors registered by `register_tensor_from_anndata` (#1235).
- Expose an instance of our `DocstringProcessor` to aid in documenting derived implementations of the `setup_anndata` method (#1235).
Bug fix and new tutorial.
- Bug fix in {class}`~scvi.external.RNAStereoscope` where loss was computed with mean for a minibatch instead of the sum. This ensures reproducibility with the original implementation (#1228).
- New Cell2location contributed tutorial (#1232).
Minor hotfixes.
- Filter out mitochondrial genes as a preprocessing step in the Amortized LDA tutorial (#1213).
- Remove `verbose=True` argument from early stopping callback (#1216).
In this release, we have completely revamped the scvi-tools documentation website by creating a new set of user guides that provide:
- The math behind each method (in a succinct, online methods-like way)
- The relationship between the math and the functions associated with each model
- The relationship between math variables and code variables
Our previous User Guide has been renamed to Tutorials and contains all of our existing tutorials (including tutorials for developers).
Another noteworthy addition in this release is the implementation of the (amortized) Latent Dirichlet Allocation (aka LDA) model applied to single-cell gene expression data. We have also prepared a tutorial that demonstrates how to use this model, using a PBMC 10K dataset from 10x Genomics as an example application.
Lastly, in this release we have made a change to reduce user and developer confusion by making the
previously global `setup_anndata` method a static class-specific method instead. This provides more
clarity on which parameters are applicable for this call, for each model class. Below is a
before/after for the DESTVI and TOTALVI model classes:
:::{figure} docs/_static/img/setup_anndata_before_after.svg
:align: center
:alt: setup_anndata before and after
:class: img-fluid

`setup_anndata` before and after
:::
- Added fixes to support PyTorch Lightning 1.4 (#1103)
- Simplified data handling in R tutorials with sceasy and addressed bugs in package installation (#1122).
- Moved library size distribution computation to model init (#1123)
- Updated Contribution docs to describe how we backport patches (#1129)
- Implemented Latent Dirichlet Allocation as a PyroModule (#1132)
- Made `setup_anndata` a static method on model classes rather than one global function (#1150).
- Used PyTorch Lightning's `seed_everything` method to set seed (#1151).
- Fixed a bug in {class}`~scvi.model.base.PyroSampleMixin` for posterior sampling (#1158).
- Added CITE-seq datasets (#1182).
- Added user guides to our documentation (#1127, #1157, #1180, #1193, #1183, #1204)
- Early stopping now prints the reason for stopping when applicable (#1208)
- `setup_anndata` is now an abstract method on model classes. Any model inheriting from {class}`~scvi.model.base.BaseModelClass` will need to implement this method (#1150).
None!
- Updated `OrderedDict` typing import to support all Python 3.7 versions (#1114).
None!
- Update PyTorch Lightning version dependency to `>=1.3,<1.4` (#1104).
None!
This release adds features for tighter integration with Pyro for model development, fixes for
{class}`~scvi.external.SOLO`, and other enhancements. Users of {class}`~scvi.external.SOLO` are
strongly encouraged to upgrade as previous bugs will affect performance.
- Add {class}`scvi.model.base.PyroSampleMixin` for easier posterior sampling with Pyro (#1059).
- Add {class}`scvi.model.base.PyroSviTrainMixin` for automated training of Pyro models (#1059).
- Ability to pass kwargs to {class}`~scvi.module.Classifier` when using {class}`~scvi.external.SOLO` (#1078).
- Ability to get doublet predictions for simulated doublets in {class}`~scvi.external.SOLO` (#1076).
- Add "comparison" column to differential expression results (#1074).
- Clarify {class}`~scvi.external.CellAssign` size factor usage. See class docstring.
- Update minimum Python version to `3.7.2` (#1082).
- Slight interface changes to {class}`~scvi.train.PyroTrainingPlan`. `"elbo_train"` and `"elbo_test"` are now the average over minibatches, as ELBO should be on the scale of the full data, and `optim_kwargs` can be set on initialization of the training plan (#1059, #1101).
- Use pandas' read pickle function for pbmc dataset metadata loading (#1099).
- Adds `n_samples_overall` parameter to functions for denoised expression/accessibility/etc. This is used during differential expression (#1090).
- Ignore configure optimizers warning when training Pyro-based models (#1064).
- Fix scale of library size for simulated doublets and expression in {class}`~scvi.external.SOLO` when using observed library size to train original {class}`~scvi.model.SCVI` model (#1078, #1085). Previously, library sizes in this case were not appropriately put on the log scale.
- Fix issue where anndata setup with a layer led to errors in {class}`~scvi.external.SOLO` (#1098).
- Fix `adata` parameter of {func}`scvi.external.SOLO.from_scvi_model`, which previously did nothing (#1078).
- Fix default `max_epochs` of {class}`~scvi.model.SCANVI` when initializing using pre-trained model of {class}`~scvi.model.SCVI` (#1079).
- Fix bug in `predict()` function of {class}`~scvi.model.SCANVI`, which only occurred for soft predictions (#1100).
None!
From the user perspective, this release features the new differential expression functionality (to
be described in a manuscript). For now, it is accessible from
{func}`~scvi.model.SCVI.differential_expression`. From the developer perspective, we made changes
with respect to {class}`scvi.dataloaders.DataSplitter` and surrounding the Pyro backend. Finally,
we also made changes to adapt our code to PyTorch Lightning version 1.3.
- Pass `n_labels` to {class}`~scvi.module.VAE` from {class}`~scvi.model.SCVI` (#1055).
- Require PyTorch Lightning > 1.3, add relevant fixes (#1054).
- Add DestVI reference (#1060).
- Add PeakVI links to README (#1046).
- Automatic delta and eps computation in differential expression (#1043).
- Allow the doublet ratio parameter to be changed for use in SOLO (#1066).
- Fix an issue where `transform_batch` options in {class}`~scvi.model.TOTALVI` were accidentally altering the batch encoding in the encoder, which led to poor results (#1072). This bug was introduced in version 0.9.0.
These breaking changes do not affect the user API, though will impact model developers.
- Use PyTorch Lightning data modules for {class}`scvi.dataloaders.DataSplitter` (#1061). This induces a breaking change in the way the data splitter is used. It is no longer callable and now has a `setup` method. See {class}`~scvi.train.TrainRunner` and its source code, which is straightforward.
- No longer require training plans to be initialized with `n_obs_training` argument (#1061). `n_obs_training` is now a property that can be set before actual training to rescale the loss.
- Log Pyro loss as `train_elbo` and sum over steps (#1071).
- Includes new optional variance parameterization for the `Encoder` module (#1037).
- Provides a new way to select subpopulations for DE using Pandas queries (#1041).
- Update reference to peakVI (#1046).
- Pin PyTorch Lightning version to <1.3.
- PeakVI minor enhancements to differential accessibility and fix scArches support (#1019)
- Add DestVI to the codebase (#1011)
- Versioned tutorial links (#1005)
- Remove old VAEC (#1006)
- Use `.numpy()` to convert torch tensors to numpy ndarrays (#1016).
- Support backed AnnData (#1017); just load anndata with `scvi.data.read_h5ad(path, backed='r+')`.
- Solo interface enhancements (#1009).
- Updated README (#1028)
- Use Python warnings instead of logger warnings (#1021)
- Change totalVI protein background default to `False` if fewer than 10 proteins are used (#1034).
- Fix `SaveBestState` warning (#1024).
- New default SCANVI max epochs if loaded with pretrained SCVI model (#1025), restore old `<v0.9` behavior.
- Fix marginal log likelihood computation, which was only being computed on the final minibatch of a dataloader. This bug was introduced in the `0.9.X` versions (#1033).
- Fix bug where extra categoricals were not properly extended in `transfer_anndata_setup` (#1030).
- Update Pyro module backend to better enforce usage of `model` and `guide`, automate passing of number of training examples to Pyro modules (#990).
- Minimum Pyro version bumped (#988).
- Improve docs' clarity (#989).
- Add glossary to developer user guide (#999).
- Add num threads config option to `scvi.settings` (#1001).
- Add CellAssign tutorial (#1004).
This release features our new software development kit for building new probabilistic models. Our hope is that others will be able to develop new models by importing scvi-tools into their own packages.
From the user perspective, there are two package-wide API breaking changes and one
{class}`~scvi.model.SCANVI` specific breaking change enumerated below. From the method developer
perspective, the entire model backend has been revamped using PyTorch Lightning, and no old code
will be compatible with this and future versions. Also, we dropped support for Python 3.6.
- `n_epochs` is now `max_epochs` for consistency with PyTorch Lightning and to better reflect the functionality of the parameter.
- `use_cuda` is now `use_gpu` for consistency with PyTorch Lightning.
- `frequency` is now `check_val_every_n_epoch` for consistency with PyTorch Lightning.
- `train_fun_kwargs` and `kwargs` throughout the `train()` methods in the codebase have been removed and various arguments have been reorganized into `plan_kwargs` and `trainer_kwargs`. Generally speaking, `plan_kwargs` deal with model optimization like KL warmup, while `trainer_kwargs` deal with the actual training loop like early stopping.
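
For scripts that still pass the old argument names, the renames above could be applied mechanically. The sketch below is a hypothetical helper (`migrate_train_kwargs` and `RENAMED_TRAIN_ARGS` are not part of scvi-tools) that only covers the three one-to-one renames; the removed `train_fun_kwargs` and `kwargs` have no direct equivalent and still need to be sorted by hand into `plan_kwargs` and `trainer_kwargs`.

```python
# Old-to-new argument names, taken from the renames listed above.
RENAMED_TRAIN_ARGS = {
    "n_epochs": "max_epochs",
    "use_cuda": "use_gpu",
    "frequency": "check_val_every_n_epoch",
}


def migrate_train_kwargs(old_kwargs):
    """Return a copy of old_kwargs with old argument names replaced (hypothetical helper)."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        new_name = RENAMED_TRAIN_ARGS.get(name, name)
        # Refuse to silently overwrite when both the old and new name are given.
        if new_name in new_kwargs:
            raise ValueError(f"'{new_name}' was specified more than once")
        new_kwargs[new_name] = value
    return new_kwargs


old_call = {"n_epochs": 100, "use_cuda": True, "frequency": 5}
print(migrate_train_kwargs(old_call))
```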
- `use_cuda` was removed from the init of each model and was not replaced by `use_gpu`. By default, every model is initialized on CPU but can be moved to a device via `model.to_device()`. If a model is trained with `use_gpu=True`, the model will remain on the GPU after training.
- We now support specifying which GPU device to use if there are multiple available GPUs.
- {class}`~scvi.model.SCANVI` no longer pretrains an {class}`~scvi.model.SCVI` model by default. This functionality, however, is preserved via the new {func}`~scvi.model.SCANVI.from_scvi_model` method.
- `n_epochs_unsupervised` and `n_epochs_semisupervised` have been removed from `train`. They have been replaced with `max_epochs` for semisupervised training.
- `n_samples_per_label` is a new argument which will subsample the number of labeled training examples to train on per label each epoch.
- {class}`~scvi.model.PEAKVI` implementation (#877, #921).
- {class}`~scvi.external.SOLO` implementation (#923, #933).
- {class}`~scvi.external.CellAssign` implementation (#940).
- {class}`~scvi.external.RNAStereoscope` and {class}`~scvi.external.SpatialStereoscope` implementation (#889, #959).
- Pyro integration via {class}`~scvi.module.base.PyroBaseModuleClass` (#895, #903, #927, #931).
- {class}`~scvi.model.SCANVI` bug fixes (#879)
- {class}`~scvi.external.GIMVI` moved to external api (#885)
- {class}`~scvi.model.TOTALVI`, {class}`~scvi.model.SCVI`, and {class}`~scvi.model.SCANVI` now support multiple covariates (#886)
- Added callback for saving the best state of a model (#887)
- Option to disable progress bar (#905)
- load() documentation improvements (#913)
- updated tutorials, guides, documentation (#924, #925, #929, #934, #947, #971)
- the track is now public (#938)
- {class}`~scvi.model.SCANVI` now logs classification loss (#966)
- get_likelihood_parameter() bug (#967)
- model.history is now pandas DataFrames (#949)
- `freeze_classifier` option in {func}`~scvi.model.SCANVI.load_query_data` for the case when `weight_decay` passed to {func}`~scvi.model.SCANVI.train` also passes to `ClassifierTrainer`
Online updates of {class}`~scvi.model.SCVI`, {class}`~scvi.model.SCANVI`, and {class}`~scvi.model.TOTALVI` with the scArches method
It is now possible to iteratively update these models with new samples without altering the model for the "reference" population. Here we use the scArches method. For usage, please see the tutorial in the user guide.
To enable scArches in our models, we added a few new options. The first is `encode_covariates`,
which is an `SCVI` option to encode the one-hotted batch covariate. We also allow users to exchange
batch norm in the encoder and decoder with layer norm, which can be thought of as batch norm but per
cell. As the layer norm we use has no parameters, it's a bit faster than models with batch norm. We
find few differences between using batch norm or layer norm in our models, though we have
kept defaults the same in this case. To run scArches effectively, batch norm should be exchanged
with layer norm.
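
The "batch norm but per cell" description can be made concrete with a small pure-Python sketch. It is an illustration only, not the scvi-tools implementation: the function names are hypothetical, and running statistics and learnable scale/shift parameters are left out.

```python
import statistics


def standardize(values, eps=1e-5):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def batch_norm(matrix):
    """Normalize each feature (column) across the cells in the mini-batch."""
    columns = [standardize(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


def layer_norm(matrix):
    """Normalize each cell (row) across its own features, with no parameters."""
    return [standardize(row) for row in matrix]


# Two cells (rows) with three hidden features (columns).
hidden = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
by_cell = layer_norm(hidden)
by_feature = batch_norm(hidden)
```

Because `layer_norm` only looks at one row at a time, the output for a cell does not depend on which other cells share its mini-batch. That independence is presumably why layer norm suits adding new samples without altering the reference model.
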
The learned prior parameters for the protein background were previously randomly initialized. Now, they can be
set with the `empirical_protein_background_prior` option in {class}`~scvi.model.TOTALVI`. This
option fits a two-component Gaussian mixture model per cell, separating those proteins that are
background for the cell and those that are foreground, and aggregates the learned mean and variance
of the smaller component across cells. This computation is done per batch, if the `batch_key` was
registered. We emphasize this is just for the initialization of a learned parameter in the model.
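
The procedure above can be sketched in plain Python. This is a rough illustration under several assumptions, not the code used by scvi-tools: the EM routine and both function names are hypothetical, the inputs are assumed to be log-transformed protein counts, "smaller component" is read as the component with the lower mean, and aggregation is assumed to be a simple average across cells.

```python
import math
import statistics


def fit_two_component_gmm(values, n_iter=50):
    """Fit a 1-D two-component Gaussian mixture with plain EM."""
    means = [min(values), max(values)]  # deterministic initialization
    start_var = max(statistics.pvariance(values), 1e-6)
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [
                weights[k]
                * math.exp(-0.5 * (x - means[k]) ** 2 / variances[k])
                / math.sqrt(2 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)
    return means, variances


def background_prior(cells):
    """Average the lower-mean component's mean and variance across cells."""
    bg_means, bg_vars = [], []
    for log_counts in cells:
        means, variances = fit_two_component_gmm(log_counts)
        k = 0 if means[0] <= means[1] else 1  # assumed background component
        bg_means.append(means[k])
        bg_vars.append(variances[k])
    return statistics.fmean(bg_means), statistics.fmean(bg_vars)


# Toy log-scale protein values for two cells from one batch.
cells = [
    [0.9, 1.0, 1.1, 1.2, 0.8, 5.0, 5.2, 4.8],
    [0.5, 0.7, 0.6, 0.4, 4.0, 4.2],
]
prior_mean, prior_var = background_prior(cells)
```

With a registered `batch_key`, the same routine would be run once per batch on that batch's cells.
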
Many of our models like `SCVI`, `SCANVI`, and {class}`~scvi.model.TOTALVI` learn a latent library
size variable. The option `use_observed_lib_size` may now be passed on model initialization. We
have set this as `True` by default, as we see no regression in performance, and training is a bit
faster.
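
As a rough sketch of what an observed library size means, the snippet below computes it directly from a count matrix instead of inferring it. It assumes the library size is the total count per cell and that it is used on the log scale. The notes above do not spell out either detail, and `observed_log_library_size` is a hypothetical name.

```python
import math


def observed_log_library_size(counts):
    """Log of the total counts in each cell (row)."""
    return [math.log(sum(cell)) for cell in counts]


# Two cells, three genes.
counts = [[3, 0, 7], [10, 5, 5]]
log_library = observed_log_library_size(counts)
```

Since the value is read straight off the data, there is no latent variable to infer, which is consistent with the slightly faster training noted above.
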
- To facilitate these enhancements, saved {class}`~scvi.model.TOTALVI` models from previous versions will not load properly. This is due to an architecture change of the totalVI encoder, related to latent library size handling.
- The default latent distribution for {class}`~scvi.model.TOTALVI` is now `"normal"`.
- Autotune was removed from this release. We could not maintain the code given the new API changes, and we will soon have alternative ways to tune hyperparameters.
- Protein names during `setup_anndata` are now stored in `adata.uns["_scvi"]["protein_names"]`, instead of `adata.uns["scvi_protein_names"]`.
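
Code that has to read protein names from objects set up under either layout can check both locations. The sketch below uses plain dictionaries as stand-ins for `adata.uns`; `get_protein_names` is a hypothetical helper, not an scvi-tools function.

```python
def get_protein_names(uns):
    """Hypothetical helper: read protein names from the new or old location."""
    scvi_registry = uns.get("_scvi", {})
    if "protein_names" in scvi_registry:
        # New location: adata.uns["_scvi"]["protein_names"]
        return list(scvi_registry["protein_names"])
    if "scvi_protein_names" in uns:
        # Old location: adata.uns["scvi_protein_names"]
        return list(uns["scvi_protein_names"])
    raise KeyError("no protein names found; was setup_anndata run?")


new_layout = {"_scvi": {"protein_names": ["CD3", "CD4"]}}
old_layout = {"scvi_protein_names": ["CD3", "CD4"]}
names_new = get_protein_names(new_layout)
names_old = get_protein_names(old_layout)
```
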
- Fixed an issue where the unlabeled category affected the SCANVI architecture prior distribution.
  Unfortunately, by fixing this bug, loading previously trained (<v0.8.0)
  {class}`~scvi.model.SCANVI` models will fail.
This small update provides access to our new Discourse forum from the documentation.
Scvi is now scvi-tools. Version 0.7 introduces many breaking changes. The best way to learn how to use scvi-tools is with our documentation and tutorials.
- New high-level API and data loading, please see tutorials and examples for usage.
- `GeneExpressionDataset` and associated classes have been removed.
- Built-in datasets now return `AnnData` objects.
- `scvi-tools` now relies entirely on the [AnnData] format.
- `scvi.models` has been moved to `scvi.core.module`.
- `Posterior` classes have been reduced to wrappers on `DataLoaders`.
- `scvi.inference` has been split to `scvi.core.data_loaders` for `AnnDataLoader` classes and `scvi.core.trainers` for trainer classes.
- Usage of classes like `Trainer` and `AnnDataLoader` now require the `AnnData` data object as input.
The scvi-tools package used to be scvi. This page commemorates all the hard work on the scvi package by our numerous contributors.
- @romain
- @adam
- @eddie
- @jeff
- @pierre
- @max
- @yining
- @gabriel
- @achille
- @chenling
- @jules
- @david-kelley
- @william-yang
- @oscar
- @casey-greene
- @jamie-morton
- @valentine-svensson
- @stephen-flemming
- @michael-raevsky
- @james-webber
- @galen
- @francesco-brundu
- @primoz-godec
- @eduardo-beltrame
- @john-reid
- @han-yuan
- @gokcen-eraslan
- downgrade anndata>=0.7 and scanpy>=1.4.6 @galen
- make loompy optional, raise skmisc import error @adam
- fix PBMCDataset download bug @galen
- fix AnnDatasetFromAnnData _X in adata.obs bug @galen
- add tqdm to within cluster DE genes @adam
- restore tqdm to use a simple bar instead of ipywidget @adam
- move to numpydoc for docstrings @adam
- update issues templates @adam
- Poisson variable gene selection @valentine-svensson
- BrainSmallDataset set default save_path_10X @gokcen-eraslan
- train_size must be a float between 0.0 and 1.0 @galen
- bump dependency versions @galen
- remove reproducibility notebook @galen
- fix scanVI dataloading @pierre
- updates to totalVI posterior functions and notebooks @adam
- update seurat v3 HVG selection now using skmisc loess @adam
- add back Python 3.6 support @adam
- get_sample_scale() allows gene selection @valentine-svensson
- bug fix to the dataset to anndata method with how cell measurements are stored @adam
- fix requirements @adam
- bug in the version for Louvain in setup.py @adam
- update highly variable gene selection to handle sparse matrices @adam
- update DE docstrings @pierre
- improve posterior save load to also handle subclasses @pierre
- Create NB and ZINB distributions with torch and refactor code accordingly @pierre
- typos in autozivae @achille
- bug in csc sparse matrices in anndata data loader @adam
- handles gene and cell attributes with the same name @han-yuan
- fixes anndata overwriting when loading @adam, @pierre
- formatting in basic tutorial @adam
- updates on TotalVI and LDVAE @adam
- fix documentation, compatibility and diverse bugs @adam, @pierre @romain
- fix for external module on scanpy @galen
- do not automatically upper case genes @adam
- AutoZI @oscar
- Made the intro tutorial more user-friendly @adam
- Tests for LDVAE notebook @adam
- black codebase @achille @gabriel @adam
- fix compatibility issues with sklearn and numba @romain
- fix Anndata @francesco-brundu
- docstring, totalVI, totalVI notebook and CITE-seq data @adam
- fix type @eduardo-beltrame
- fixing installation guide @jeff
- improved error message for dispersion @stephen-flemming
- gimVI @achille
- synthetic correlated datasets, fixed bug in marginal log likelihood @oscar
- autotune, dataset enhancements @gabriel
- documentation @jeff
- more consistent posterior API, docstring, validation set @adam
- fix anndataset @michael-raevsky
- linearly decoded VAE @valentine-svensson
- support for scanpy, fixed bugs, dataset enhancements @achille
- fix filtering bug, synthetic correlated datasets, docstring, differential expression @pierre
- better docstring @jamie-morton
- classifier based on library size for doublet detection @david-kelley
- corrected notebook @jules
- added UMAP and updated harmonization code @chenling @romain
- support for batch indices in csvdataset @primoz-godec
- speeding up likelihood computations @william-yang
- better anndata interop @casey-greene
- early stopping based on classifier accuracy @david-kelley
- updated to torch v1 @jules
- added stress tests for harmonization @chenling
- fixed autograd breaking @romain
- make removal of empty cells more efficient @john-reid
- switch to os.path.join @casey-greene
- added baselines and datasets for sMFISH imputation @jules
- added harmonization content @chenling
- fixing bugs on DE @romain
- annotation notebook @eddie
- Memory footprint management @jeff
- updated early stopping @max
- docstring @james-webber
- First release on PyPi
- Skeleton code and dependencies @jeff
- Unit tests @max
- PyTorch implementation of scVI @eddie @max
- Dataset preprocessing @eddie @max @yining
- First scVI TensorFlow version @romain