Releases: PavlidisLab/Gemma

1.32.4

29 Oct 20:58

This patch release of Gemma refines the ongoing single-cell work.

  • display basic single-cell statistics on the dataset page
  • minimal integration for Cell Browser
  • filtering improvements have finally been merged (#1407)
  • filtering of raw 10x Cell Ranger datasets
  • faster startup for Gemma Web by initializing some costly beans asynchronously
  • add a distinct MultiQC report for the cell type annotation pipeline

Gemma REST 2.9.2

We made a few additions to the REST API, notably the ability to filter predicates and objects from statements. We also fixed endpoints producing TSV that were incorrectly marked as binary.

Refer to the full changelog for more details.

Single-cell overview

The single-cell overview from the development trunk was partially backported to show the number of cells and a link to a UCSC Cell Browser.

Filtering of 10x Cell Ranger datasets

Gemma now integrates with Cell Ranger to provide filtering of raw 10x Cell Ranger datasets. Thanks to @rachadele for the initial implementation.

The filter is applied if an unfiltered 10x MEX dataset is detected. We use various heuristics to detect this efficiently and, as a last resort, check the MTX file.

We also implemented detection of the 10x chemistry from GEO series metadata, which is used to parameterize the filter.

Support for Java 19 and beyond

The rule limiting the Java version for building Gemma has been removed. We now only require a minimum of Java 11, and Java versions up to 24 have been tested. The class files that we generate still target Java 8, however.

1.32.2

05 Aug 22:17

This patch release comes with numerous bug fixes and performance improvements. There are a couple of neat refinements for single-cell data and anticipatory work to integrate visualization in the next minor release.

  • add support for cursor fetching and "native" streaming with MySQL
  • standardize options for writing expression data files across all CLIs
  • add options for replacing CTAs and CLCs
  • add support for Cell Browser-compatible outputs
  • improve the characteristics browser and address all possible "owners" of characteristics
  • add support for cell-level masks and aggregating single-cell data with a mask
  • add assay as a category for curating datasets

1.32.1

21 Jun 19:31

This release addresses numerous issues introduced in 1.32.0.

  • disable caching of processed vectors until #1401 is resolved
  • fix pubmed metadata update by making BibliographicReference mutable
  • fix aggregation of single-cell data from the command line
  • include factor names in analysis result set filename
  • support loading single-cell data from AnnData using 32 bit integers and single-precision floats
  • set the normalized flag when generating quantile-normalized processed vectors
  • major improvement for the diffExAnalyze CLI tool

1.31.13

27 Mar 21:44

Tag hotfix

1.31.12

29 Oct 20:17

Changeset

  • fix P-value plot
  • increase QC plot size
  • mark JSESSIONID as http-only
  • add consistent ordering behavior when ordering by a nullable field in the REST API
  • filter stacktraces printed in the console to hide proxy and Hibernate internals
  • allow HEAD on all GET endpoints
  • numerous improvements for parsing and loading data from GEO

1.31.11

29 Oct 20:18

Tag hotfix

1.31.10

29 Oct 20:18

Tag hotfix

1.31.9

10 Jul 05:34

Tag hotfix

1.31.8

26 Jun 02:41

Changeset

  • fix various issues with platform and vector merging
  • new endpoint exposing batch information and effect (reserved for curators)
  • quantitation type can be retrieved by name in the REST API
  • improvement for creating and deleting vectors in batch
  • improve serialization of interaction and continuous factors when producing result sets in TSV

Improved encoding of interactions and continuous factors in result sets TSV output

Although rarely used, Gemma's linear model can handle continuous factors. The TSV output now fully supports this.

When we produce a TSV output for a result set, we need to encode three types of contrasts: single factor, interaction of two factors and continuous factors. Those are encoded as follows:

  • contrast_{fv_id}_{key} for a single factor
  • contrast_{fv_id1}_{fv_id2}_{key} for an interaction between two factors
  • contrast_{key} for a continuous factor

where {key} is one of coefficient, log2fc, tstat or pvalue.

Gemma is inherently limited to a single continuous factor per result set. If that were to change, we would have to account for this by adjusting the encoding.
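
As an illustration of the encoding described above, here is a minimal Python sketch that classifies a result set column name. The function and type names are hypothetical and not part of Gemma, and the sketch assumes that factor value IDs are plain integers.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Zero, one or two factor value IDs, followed by one of the four keys.
# Assumption: {fv_id} is always a non-negative integer.
_CONTRAST = re.compile(
    r"^contrast_(?:(\d+)_(?:(\d+)_)?)?(coefficient|log2fc|tstat|pvalue)$"
)


class Contrast(NamedTuple):
    factor_value_ids: Tuple[int, ...]  # () for a continuous factor
    key: str


def parse_contrast_column(name: str) -> Optional[Contrast]:
    """Return the parsed contrast, or None if the column is not a contrast."""
    match = _CONTRAST.match(name)
    if match is None:
        return None
    ids = tuple(int(g) for g in match.groups()[:2] if g is not None)
    return Contrast(ids, match.group(3))
```

With this sketch, contrast_12_34_log2fc yields two factor value IDs (an interaction), contrast_12_pvalue yields one (a single factor) and contrast_tstat yields none (a continuous factor).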

Retrieve differential expression results across datasets

The RESTful API has been bumped to 2.8.0 and features a new endpoint for retrieving differential expression (DE) results for a given gene across all datasets, subsets and result sets curated in Gemma.

Results can be filtered at the dataset-level with the usual query and filter parameters and paginated with offset and limit. They can also be filtered by corrected P-value using threshold to reject results with a poor fit for the given gene.

GET /datasets/analyses/differential/results/taxa/human/genes/BRCA1 HTTP/1.1

The endpoint can also be requested to produce a tabular output by passing Accept: text/tab-separated-values.

GET /datasets/analyses/differential/results/taxa/{taxon}/genes/{gene} HTTP/1.1
Accept: text/tab-separated-values
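
The requests above could be prepared from Python roughly as follows. This is a sketch using only the standard library; the helper name is hypothetical and the base URL is an assumption that should be adjusted to the Gemma instance being queried. The request is built but not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: base URL of the REST API; change it for your own deployment.
BASE_URL = "https://gemma.msl.ubc.ca/rest/v2"


def de_results_request(taxon, gene, *, threshold=None, offset=None,
                       limit=None, tsv=False, base_url=BASE_URL):
    """Build (without sending) a GET request for DE results of a gene."""
    path = (
        "/datasets/analyses/differential/results"
        f"/taxa/{quote(taxon, safe='')}/genes/{quote(gene, safe='')}"
    )
    # Only include the parameters that were explicitly provided.
    candidates = (("threshold", threshold), ("offset", offset), ("limit", limit))
    params = {k: v for k, v in candidates if v is not None}
    url = base_url + path + ("?" + urlencode(params) if params else "")
    # Ask for the tabular representation only when requested.
    headers = {"Accept": "text/tab-separated-values"} if tsv else {}
    return Request(url, headers=headers)
```

Passing the returned object to urllib.request.urlopen would perform the call, for example with de_results_request("human", "BRCA1", threshold=0.05, tsv=True).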

Retrieve raw vectors with quantitation type names

It is now possible to use a name for retrieving vectors for a given experiment.

GET /datasets/{dataset}/data/raw?quantitationType={name}

Common quantitation type names for raw data vectors are:

  • log2cpm
  • counts
  • rpkm
  • rma value
  • value

The first three are used for RNA-Seq data.
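
Some of these names contain spaces (for example, rma value), so they need to be percent-encoded when placed in the query string. A small sketch of this follows; the helper name and the dataset identifier GSE12345 are hypothetical.

```python
from urllib.parse import quote, urlencode


def raw_data_path(dataset: str, quantitation_type: str) -> str:
    """Build the path and query for retrieving raw vectors by name."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"quantitationType": quantitation_type}, quote_via=quote)
    return f"/datasets/{quote(dataset, safe='')}/data/raw?{query}"


# GSE12345 is a placeholder identifier used for illustration only.
print(raw_data_path("GSE12345", "rma value"))
# prints /datasets/GSE12345/data/raw?quantitationType=rma%20value
```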

1.31.7

29 Oct 20:18

Tag hotfix