Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[0.2.1]

Fixed

  • Compatibility with anndata>=0.12.13 (#240) @eroell

[0.2.0]

Fixed

  • Assigning .X to a view of an X-less {class}~ehrdata.EHRData (e.g. one created with layers= only) no longer raises TypeError: 'NoneType' object does not support item assignment. The view is now materialised before the assignment, consistent with how AnnData handles other field modifications on views. (#233) @eroell

Modified

  • {func}~ehrdata.infer_feature_types now treats integer columns with values 0, ..., n as numeric. A new argument binary_as controls whether 0/1 columns are considered numeric or categorical. (#231) @eroell

[0.1.2]

Added

  • {func}~ehrdata.io.from_pandas with format='long' provides a new keyword argument fill_time_gaps that fills gaps in the time axis for the common case of integer time steps from 0 to n_timesteps. (#229) @eroell
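
The idea behind filling time gaps in long-format data can be sketched in plain Python. This is an illustration only, not the ehrdata implementation: the helper name `fill_gaps_long`, the tuple layout, and the choice of steps `0, ..., n_timesteps - 1` are assumptions made for this example.

```python
def fill_gaps_long(rows, n_timesteps, fill_value=None):
    """Return long-format rows with every integer time step present.

    rows: iterable of (obs_id, time, value) tuples (hypothetical layout).
    n_timesteps: number of steps; this sketch assumes steps 0..n_timesteps-1.
    """
    rows = list(rows)
    # Index the observed values by (obs_id, time) for constant-time lookup.
    observed = {(obs_id, t): value for obs_id, t, value in rows}
    obs_ids = sorted({obs_id for obs_id, _, _ in rows})
    # Emit one row per (obs_id, time step); missing steps get fill_value.
    return [
        (obs_id, t, observed.get((obs_id, t), fill_value))
        for obs_id in obs_ids
        for t in range(n_timesteps)
    ]


rows = [("p1", 0, 1.0), ("p1", 2, 3.0), ("p2", 1, 5.0)]
filled = fill_gaps_long(rows, n_timesteps=3)
```

After filling, every observation has the same number of time steps, which is what allows the long table to be reshaped into a regular array.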

Modified

  • {func}~ehrdata.dt.mimic_2 now encodes the censor_flg column following the lifelines convention, with 1=event and 0=censored. Previously, the loader kept the reverse encoding that the original dataset provides. (#227) @sueoglu
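
For code that still holds flags in the previous encoding (1=censored, 0=event), the conversion to the event-indicator convention is a simple flip. The helper below is a hypothetical sketch for illustration and is not part of ehrdata.

```python
def to_event_indicator(censored_flags):
    """Convert flags with 1=censored, 0=event to 1=event, 0=censored."""
    flags = list(censored_flags)
    # Guard against values outside the binary encoding.
    if any(flag not in (0, 1) for flag in flags):
        raise ValueError("flags must be 0 or 1")
    return [1 - flag for flag in flags]


event_observed = to_event_indicator([0, 1, 1, 0])
```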

Fixed

  • {func}~ehrdata.io.from_pandas with format='long' no longer misaligns entries in .X/.layers with .obs when the input df is not sorted by the obs id keys. (#228) @eroell

Documentation

  • Documentation style polishing (#223) @zethson

[0.1.1]

Added

  • {func}~ehrdata.io.omop.setup_connection can read .parquet files. (#217) @eroell

Fixed

  • Slicing of EHRData objects fixed when the backing object is an AnnData. (#218) @eroell

Maintenance

  • More concise messages in {func}~ehrdata.infer_feature_types. (#215) @zethson

[0.1.0]

Added

  • {func}~ehrdata.move_to_obs and {func}~ehrdata.move_to_x are new helpers for conveniently moving variables from central 2D arrays to the .obs field, and vice versa. (#199) @eroell
  • {func}~ehrdata.dt.physionet2019 as another out-of-the-box, conveniently available dataset with 40,000 ICU stays from the PhysioNet 2019 challenge. (#204) @eroell
  • time_precision parameter ("date" or "datetime") to {func}~ehrdata.io.omop.setup_variables and {func}~ehrdata.io.omop.setup_interval_variables for finer temporal granularity control. (#210) @eroell

Fixed

  • {func}~ehrdata.io.read_h5ad fixed issues when backed=True. (#199) @eroell
  • {func}~ehrdata.io.read_h5ad fixed bug when .X is None and harmonize_missing_features is True. (#206) @eroell
  • {func}~ehrdata.io.omop.setup_obs with observation_table="person_visit_occurrence" now supports multiple visits per patient, creating one row per visit with unique observation IDs, instead of failing with xarray conversion errors with non-unique indices. (#210) @eroell
  • OMOP time interval boundaries now use half-open intervals [start, end) to prevent duplicate measurements at interval boundaries. (#210) @eroell
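
Why half-open intervals avoid duplicates can be shown with a small standalone sketch. It is not the ehrdata OMOP code: the function name `assign_interval` and its parameters are assumptions for this example. With intervals [start, end), a measurement taken exactly on a boundary belongs to the later interval only, whereas closed intervals would match both neighbours.

```python
from datetime import datetime, timedelta


def assign_interval(timestamp, start, interval_length, n_intervals):
    """Return the index i with start + i*len <= timestamp < start + (i+1)*len.

    Returns None if the timestamp lies outside all n_intervals intervals.
    """
    offset = timestamp - start
    if offset < timedelta(0):
        return None
    # Floor division of two timedeltas yields an int interval index.
    index = offset // interval_length
    return index if index < n_intervals else None


start = datetime(2020, 1, 1)
hour = timedelta(hours=1)
on_boundary = assign_interval(start + hour, start, hour, n_intervals=3)
at_end = assign_interval(start + 3 * hour, start, hour, n_intervals=3)
```

Here `on_boundary` is 1, so the measurement at the boundary is counted once, and `at_end` is None because the final end point is excluded.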

Maintenance

  • Support Python 3.14 (#194) @Zethson
  • Address FutureWarnings across multiple places (#200) @eroell
  • Enhanced tutorial structure (#208) @eroell

Modified

  • Dataset generator function ed.dt.ehrdata_blobs now takes n_cat_var and n_categories arguments to generate categorical (integer encoded) time series data (#207) @sueoglu
  • If enrich_var_with_feature_info=True in {func}~ehrdata.io.omop.setup_variables and {func}~ehrdata.io.omop.setup_interval_variables, data_table_concept_ids not included within the concept table are now mapped from their respective alternate concept_id included in the concept_relationship table to retrieve the available feature information. (#205) @KilianDahm
  • {func}~ehrdata.io.omop.setup_variables and {func}~ehrdata.io.omop.setup_interval_variables, when used with "person", now check birth_datetime to provide meaningful behaviour and error messages. (#210) @eroell
  • {func}~ehrdata.integrations.vitessce.gen_default_config provides convenience to generate a config directly from an EHRData object, and should be used instead of the previous ehrdata.integrations.vitessce.gen_config. (#211) @eroell

[0.0.10]

{class}~ehrdata.EHRData drops the .R field, and now supports 3D data storage in any slot of .layers. See the {doc}tutorials/getting_started tutorial for an introduction to this behaviour. Support for 3D data storage in .X is planned for a future release.

Maintenance

  • Enhanced {doc}tutorials/getting_started (#184) @eroell
  • Move from zarr<3 to zarr>=3 (#185) @eroell

Modified

  • EHRData drops the .R field in favor of using .layers for any 3D data arrays (#184) @eroell
  • EHRData's shape property will always return a 3 dimensional shape. If an EHRData object has flat arrays only, the third dimension will be 1. (#184) @eroell
  • The following functions now take a layer argument: {func}~ehrdata.io.read_csv, {func}~ehrdata.io.from_pandas, {func}~ehrdata.io.to_pandas, {func}~ehrdata.io.omop.setup_variables, {func}~ehrdata.io.omop.setup_interval_variables, {func}~ehrdata.dt.ehrdata_blobs, {func}~ehrdata.dt.physionet2012. If it is left at its default, None, the .X field of EHRData is used. Since .X is 2D in this release, the layer argument needs to be used for 3D data. (#184) @eroell
  • {func}~ehrdata.io.write_zarr now writes an EHRData-specific store encoding, with anndata as a substore. This change allows using AnnData with its change to consolidated Zarr metadata, and better isolates AnnData's io. (#185) @eroell
  • {func}~ehrdata.io.read_zarr is adapted to read the new store encoding, and can also deal with AnnData stores. (#185) @eroell
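
The shape rule described above (always three dimensions, with a third dimension of 1 for flat arrays) can be illustrated without the library. The helper `as_3d_shape` is hypothetical and only mirrors the rule as stated in this changelog.

```python
def as_3d_shape(array_shape):
    """Map a 2D or 3D array shape to the 3D shape convention."""
    shape = tuple(array_shape)
    if len(shape) == 2:
        # Flat (observations x variables) data gets a time dimension of 1.
        return (*shape, 1)
    if len(shape) == 3:
        return shape
    raise ValueError(f"expected a 2D or 3D shape, got {shape}")


flat = as_3d_shape((100, 5))
longitudinal = as_3d_shape((100, 5, 24))
```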

[0.0.9]

Maintenance

  • Use custom logger & remove pydata sparse (#176) @Zethson
  • Replace figshare with scverse S3 (#177) @Zethson
  • Update template to v0.6.0 (#166) @Zethson

Fixed

  • Fix order of var created in ed.io.omop.setup_variables and ed.io.omop.setup_interval_variables (#179) @eroell

Modified

  • Rename ed.pl.vitessce.gen_config to ed.integrations.vitessce.gen_config (#181) @eroell
  • Rename ed.tl.omop.EHRDataset to ed.integrations.torch.OMOPEHRDataset (#181) @eroell

[0.0.8]

Fixed

  • Update duckdb imports for future (#157) @eroell

Maintenance

  • Private subset method for EHRData (#160) @eroell
  • Remove omop package dependency (#160) @eroell

[0.0.7]

Fixed

  • Fix tests and Getting Started Notebook (#155) @eroell

Maintenance

  • Update duckdb imports for future (#155) @eroell

[0.0.6]

Fixed

  • Cleaned up and updated tutorial notebooks (#140) @agerardy

Added

  • {func}~ehrdata.io.read_csv Reads a csv file (#136) @eroell
  • {func}~ehrdata.io.read_h5ad Reads an h5ad file (#136) @eroell
  • {func}~ehrdata.io.read_zarr Reads a zarr file (#136) @eroell
  • {func}~ehrdata.io.write_h5ad Writes an h5ad file (#136) @eroell
  • {func}~ehrdata.io.write_zarr Writes a zarr file (#136) @eroell
  • {func}~ehrdata.io.from_pandas Transform a given {class}~pandas.DataFrame into an {class}~ehrdata.EHRData object (#136) @eroell
  • {func}~ehrdata.io.to_pandas Transform an {class}~ehrdata.EHRData object into a {class}~pandas.DataFrame (#136) @eroell
  • {func}~ehrdata.dt.mimic_2 Loads the MIMIC-II dataset (#136) @eroell
  • {func}~ehrdata.dt.mimic_2_preprocessed Loads the preprocessed MIMIC-II dataset (#136) @eroell
  • {func}~ehrdata.dt.diabetes_130_raw Loads the raw diabetes-130 dataset (#136) @eroell
  • {func}~ehrdata.dt.diabetes_130_fairlearn Loads the preprocessed diabetes-130 dataset by fairlearn (#136) @eroell
  • {func}~ehrdata.infer_feature_types Infer feature types in an {class}~ehrdata.EHRData object (#136) @eroell
  • {func}~ehrdata.feature_type_overview Overview of inferred feature types (#136) @eroell
  • {func}~ehrdata.replace_feature_types Replacing inferred feature types (#136) @eroell
  • {func}~ehrdata.harmonize_missing_values Harmonize missing values in an {class}~ehrdata.EHRData object (#136) @eroell

[0.0.5]

Fixed

  • Initialize EHRData with X and layers (#132) @eroell

Modified

  • Rename .t attribute to .tem

[0.0.4]

Fixed

  • Zarr version to less than 3

[0.0.3]

Fixed

  • Added missing zarr dependency

[0.0.2]

Added

  • Expanded documentation
  • Improved OMOP Extraction
  • Support for COO sparse matrices for R
  • A ed.dt.ehrdata_blobs test data generator function
  • Replace -1 encoded missing values with nans in physionet2012 challenge data

Breaking changes

  • Renamed r to R

[0.0.1] - 2024-11-04

Added

  • Initial release

[Unreleased]

Added

  • Basic tool, preprocessing and plotting functions

Fixed

  • Tutorial notebooks updated to align with breaking changes