Releases: PavlidisLab/Gemma
1.32.4
This patch release refines the ongoing single-cell work in Gemma.
- display basic single-cell statistics on the dataset page
- minimal integration for Cell Browser
- filtering improvements have finally been merged! #1407
- filtering of raw 10x Cell Ranger datasets
- faster startup for Gemma Web by initializing some costly beans asynchronously
- add a distinct MultiQC report for the cell type annotation pipeline
Gemma REST 2.9.2
We made a few additions to the REST API, notably the ability to filter predicates and objects from statements. We also fixed endpoints producing TSV that were incorrectly marked as binary.
Refer to the full changelog for more details.
Single-cell overview
The single-cell overview from the development trunk was partially backported to show the number of cells and a link to a UCSC Cell Browser.
Filtering of 10x Cell Ranger datasets
Gemma now integrates with Cell Ranger to provide filtering of raw 10x Cell Ranger datasets. Thanks to @rachadele for the initial implementation.
The filter is applied if an unfiltered 10x MEX dataset is detected. We use various heuristics to do that efficiently and ultimately resort to checking the MTX file.
We also implemented detection of 10x chemistry from GEO series metadata, which is used to parameterize the filter.
Support for Java 19 and beyond
The rule limiting the Java version for building Gemma has been removed. We now only require a minimum of Java 11 and support for Java up to 24 has been tested. The class files that we generate are still targeting Java 8, however.
1.32.2
This patch release comes with numerous bug fixes and performance improvements. There are a couple of neat refinements for single-cell data and anticipatory work to integrate visualization in the next minor release.
- add support for cursor fetching and "native" streaming with MySQL
- standardize options for writing expression data files across all CLIs
- add options for replacing CTAs and CLCs
- add support for Cell Browser-compatible outputs
- improve the characteristics browser and address all possible "owners" of characteristics
- add support for cell-level masks and aggregating single-cell data with a mask
- add assay as a category for curating datasets
1.32.1
This release addresses numerous issues introduced in 1.32.0.
- disable caching of processed vectors until #1401 is resolved
- fix pubmed metadata update by making BibliographicReference mutable
- fix aggregation of single-cell data from the command line
- include factor names in analysis result set filename
- support loading single-cell data from AnnData using 32 bit integers and single-precision floats
- set the normalized flag when generating quantile-normalized processed vectors
- major improvement for the diffExAnalyze CLI tool
1.31.13
1.31.12
Changeset
- fix P-value plot
- increase QC plot size
- mark JSESSIONID as http-only
- add consistent ordering behavior when ordering by a nullable field in the REST API
- filter stacktraces printed in the console to hide proxy and Hibernate internals
- allow HEAD on all GET endpoints
- numerous improvements for parsing and loading data from GEO
1.31.11
1.31.10
1.31.9
1.31.8
Changeset
- fix various issues with platform and vector merging
- new endpoint exposing batch information and effect (reserved for curators)
- quantitation type can be retrieved by name in the REST API
- improvement for creating and deleting vectors in batch
- improve serialization of interaction and continuous factors when producing result sets in TSV
Improved encoding of interactions and continuous factors in result sets TSV output
Although rarely used, Gemma's linear model can handle continuous factors. The TSV output now fully supports this.
When we produce a TSV output for a result set, we need to encode three types of contrasts: single factor, interaction of two factors and continuous factors. Those are encoded as follows:
- contrast_{fv_id}_{key} for a single factor
- contrast_{fv_id1}_{fv_id2}_{key} for an interaction between two factors
- contrast_{key} for a continuous factor
where {key} is one of coefficient, log2fc, tstat or pvalue.
Gemma is inherently limited to a single continuous factor per result set. If that were to change, we would have to account for this by adjusting the encoding.
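For readers who consume these TSV files, the encoding above can be sketched as a pair of small Python helpers. This is only an illustration: the function names are hypothetical, and it assumes factor value IDs are integers, which the notes above do not state.

```python
from typing import Sequence, Tuple

# Keys listed in the release notes for contrast columns.
VALID_KEYS = ("coefficient", "log2fc", "tstat", "pvalue")


def contrast_column(key: str, factor_value_ids: Sequence[int] = ()) -> str:
    """Build a contrast column name (hypothetical helper).

    No ID    -> continuous factor:       contrast_{key}
    One ID   -> single factor:           contrast_{fv_id}_{key}
    Two IDs  -> two-factor interaction:  contrast_{fv_id1}_{fv_id2}_{key}
    """
    if key not in VALID_KEYS:
        raise ValueError(f"unknown key: {key!r}")
    if len(factor_value_ids) > 2:
        raise ValueError("at most two factor value IDs are supported")
    return "_".join(["contrast", *map(str, factor_value_ids), key])


def parse_contrast_column(name: str) -> Tuple[Tuple[int, ...], str]:
    """Split a contrast column name into (factor value IDs, key).

    Assumes IDs are integers; none of the keys contain an underscore,
    so splitting on "_" is unambiguous.
    """
    parts = name.split("_")
    if len(parts) < 2 or parts[0] != "contrast" or parts[-1] not in VALID_KEYS:
        raise ValueError(f"not a contrast column: {name!r}")
    ids = tuple(int(p) for p in parts[1:-1])
    if len(ids) > 2:
        raise ValueError(f"too many factor value IDs in: {name!r}")
    return ids, parts[-1]
```

An empty ID tuple returned by the parser indicates the continuous factor, which stays unambiguous only while a result set has a single continuous factor.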
Retrieve differential expression results across datasets
The RESTful API has been bumped to 2.8.0 and features a new endpoint for retrieving DE results for a given gene across all datasets, subsets and result sets curated in Gemma.
Results can be filtered at the dataset-level with the usual query and filter parameters and paginated with offset and limit. They can also be filtered by corrected P-value using threshold to reject results with a poor fit for the given gene.
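As a rough sketch of how these parameters combine, the Python helper below assembles a request URL for the endpoint. The helper name and the base URL are placeholders rather than part of Gemma. The parameter names (query, filter, offset, limit, threshold) are taken from the description above. No request is sent.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def de_results_url(
    base_url: str,
    taxon: str,
    gene: str,
    *,
    query: Optional[str] = None,
    filter_: Optional[str] = None,
    offset: Optional[int] = None,
    limit: Optional[int] = None,
    threshold: Optional[float] = None,
) -> str:
    """Build the URL for DE results of one gene (hypothetical helper)."""
    path = (
        f"{base_url.rstrip('/')}/datasets/analyses/differential/results"
        f"/taxa/{quote(taxon, safe='')}/genes/{quote(gene, safe='')}"
    )
    candidates = {
        "query": query,
        "filter": filter_,
        "offset": offset,
        "limit": limit,
        "threshold": threshold,
    }
    # Only send parameters that were set; 0 is a valid offset.
    params = {k: v for k, v in candidates.items() if v is not None}
    return f"{path}?{urlencode(params)}" if params else path


# Placeholder base URL; substitute the address of your Gemma REST API.
url = de_results_url(
    "https://example.org/rest/v2", "human", "BRCA1",
    offset=0, limit=20, threshold=0.05,
)
```

The resulting URL can be passed to any HTTP client.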
GET /datasets/analyses/differential/results/taxa/human/genes/BRCA1 HTTP/1.1
The endpoint can also be requested to produce a tabular output by passing Accept: text/tab-separated-values.
GET /datasets/analyses/differential/results/taxa/{taxon}/genes/{gene} HTTP/1.1
Accept: text/tab-separated-values
Retrieve raw vectors with quantitation type names
It is now possible to retrieve vectors for a given experiment by quantitation type name.
GET /datasets/{dataset}/data/raw?quantitationType={name}
Common quantitation type names for raw data vectors are:
- log2cpm
- counts
- rpkm
- rma value
- value
The first three are used for RNA-Seq data.
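Because some of these names contain a space (for example "rma value"), the name has to be percent-encoded when placed in the query string. The Python sketch below shows one way to build such a URL. The helper name, base URL and dataset identifier are placeholders for illustration and are not taken from Gemma.

```python
from urllib.parse import quote, urlencode


def raw_vectors_url(base_url: str, dataset: str, quantitation_type: str) -> str:
    """Build the URL for raw vectors of a dataset, selected by
    quantitation type name (hypothetical helper)."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    query = urlencode({"quantitationType": quantitation_type}, quote_via=quote)
    return f"{base_url.rstrip('/')}/datasets/{quote(dataset, safe='')}/data/raw?{query}"


# Placeholder base URL and dataset identifier.
rma_url = raw_vectors_url("https://example.org/rest/v2", "GSE2018", "rma value")
```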