Releases: schneiderkamplab/syntheval

SynthEval v1.7.1

11 May 10:16
a424f48

Improvements:

  • Rich console mode no longer suppresses warnings, and warnings are now handled in a nicer way and can be toggled with SynthEval's new show_warnings argument.

Bug fixes:

  • The FIO metric would fail when only a single outcome variable was provided, due to a mistaken variable reassignment.
  • The MIA metric would fail when the synthetic dataset was smaller than the holdout dataset. We added a catch for this case, since keeping a large holdout set may be intentional.
  • PCA is not really meant to work with fewer rows than columns; for the PCA metric, this would cause an error from sklearn. We made a catch and now produce a warning when this is the case.
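
The PCA guard from the last item can be pictured roughly as follows. This is a minimal sketch, not SynthEval's actual code: the helper name `check_pca_shape` and the warning text are hypothetical and chosen only for illustration.

```python
import warnings


def check_pca_shape(n_rows: int, n_cols: int) -> bool:
    """Return True if the data has at least as many rows as columns.

    Hypothetical helper (not part of SynthEval): instead of letting the
    PCA step fail, it emits a warning and tells the caller to skip it.
    """
    if n_rows < n_cols:
        warnings.warn(
            f"PCA skipped: {n_rows} rows is fewer than {n_cols} columns.",
            RuntimeWarning,
        )
        return False
    return True
```

A caller would run the PCA comparison only when the helper returns `True`, so the rest of the evaluation carries on either way.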

SynthEval v1.7.0

22 Apr 12:44
3b2b0a4

The 1.7.0 update introduces the new "rich" console mode, based on the Rich library, and refines the metrics output interface, allowing dynamic switching between modes without excess tech debt. The analysis target variable pattern has been completely overhauled to use a configurable object that handles all labels relevant to downstream task analyses, along with confounders, with finer-grained control. In addition, we introduce a timeout feature that allows long-running metrics to be interrupted and skipped, we improve error handling and configurability, and we add Maximum Mean Discrepancy and Feature Importance Overlap to the metric library.

Refactor:

  • Changed the format_output method in all metric classes to return a list of tuples suitable for rich console output, replacing previous string-based formatting. This affects the metric template, core metric class, and all metric implementations.
  • Changed the analysis_target_var to an analysis_target object: parsing is handled automatically to retain the simple interface, but the target can also be passed as an object for finer-grained control. See the new section in guides/syntheval_guide.ipynb for a tutorial on how this object can be used.

New Features:

  • Added a console argument for the main class, which allows specifying rich (new) or ascii (legacy) formatting for the console print, or turning it off entirely. We added a check that automatically switches from rich to ascii in a notebook environment to prevent crashing the terminal.
  • Added a timeout feature based on asyncio, for the main evaluation loop to allow interrupting and skipping of metrics that take too long to complete. By default, timeout is not enabled.
  • Added a plot_figures attribute to metric classes, allowing users to control figure plotting separately from verbosity.
  • Added a corresponding argument enable_plots to the SynthEval class, to control plotting in the main evaluation loop.
  • Added a missing_directive argument to the SynthEval class, so that the user can control if SynthEval should raise a warning, drop rows with missingness, or ignore that there is missingness and carry on. We added a small discussion in the guides/preprocessing.md guide on missingness, for users interested in other solutions.
  • Added the Maximum Mean Discrepancy (MMD) metric, recording both the biased and unbiased versions of the statistic. A detailed explanation and reference are added in guides/metrics_references.md.
  • Added the Feature Importance Overlap (FIO) metric, which checks two properties related to feature selection/ranking tasks. Namely, that the importance values assigned by a predictive model match (mean absolute error and weighted mean absolute error), and that the features recovered in a ranking at 5%, 10%, 25% and 50% of features are the same. The metric can also plot the top feature importance scores. A description of this metric is added in guides/metrics_references.md.
  • With the new analysis_target object, the increased flexibility allowed some important changes in the classification accuracy, auroc difference, attribute disclosure risk, and statistical parity metrics, now accounting for multiple potential labels, and dynamically removing confounders in prediction tasks (this is also considered in the new FIO metric).
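
To make the "biased and unbiased versions" of the MMD statistic concrete, here is a minimal sketch of the two textbook estimators of squared MMD. It is not SynthEval's implementation: the Gaussian kernel, the fixed `gamma` bandwidth, and the function names are assumptions made for this example; see guides/metrics_references.md for what the library actually does.

```python
import numpy as np


def rbf_kernel(a: np.ndarray, b: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    """Gaussian kernel matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd_squared(x: np.ndarray, y: np.ndarray, gamma: float = 1.0):
    """Return (biased, unbiased) estimates of squared MMD between x and y."""
    m, n = len(x), len(y)
    k_xx = rbf_kernel(x, x, gamma)
    k_yy = rbf_kernel(y, y, gamma)
    k_xy = rbf_kernel(x, y, gamma)

    # Biased estimator: plain means, including the diagonal terms k(x_i, x_i).
    biased = k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

    # Unbiased estimator: drop the diagonal of the within-sample kernels.
    unbiased = (
        (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
        + (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
        - 2.0 * k_xy.mean()
    )
    return biased, unbiased
```

The biased estimate is never negative, while the unbiased one can dip below zero when the two samples are very similar, which is one reason to record both.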

Documentation:

  • Minor details were added to the README.md.
  • Major reformatting of the metrics overview in README.md into tables with metric keywords and links to the method documentation, and in the few instances where applicable to the guides/metrics_references.md.
  • Guide codebooks were refreshed with the latest features, new metrics, and nicer printing. guides/syntheval_guide.ipynb now includes a new part on the analysis_target object.
  • New guides/preprocessing.md added to document preprocessing steps.
  • Adjusted a number of the metric docstrings.

Changes:

  • Improved error handling in several metric scripts by raising ValueError instead of printing warnings or passing on failed assertions, ensuring clearer feedback for users and that errors are raised in the active console.
  • Changed default ranking system for the benchmark method, from linear (min-max sum) to summation (flat sum).
  • Changed the default preprocessing for PCA metric from "mean" to "std" for consistency.
  • Changed the default F1 setting in the classification metric from micro to weighted, for more saturation-aware behaviour on imbalanced classification problems.
  • Changed the default confidence interval unit for CIO and DWM metrics from sem to std. The new option ci can be used to switch back to the old behaviour if needed.
  • Attribute disclosure metric had the sensitive argument removed; the sensitive attributes are now parsed through the analysis_target object.
  • Statistical parity metric similarly had the protected_attribute argument removed, and now also uses the sensitive_vars attribute in the analysis_target object. In addition, the statistical parity metric can now evaluate multiple protected attributes.
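
The switch from micro to weighted F1 matters most on imbalanced data. The following self-contained sketch uses the standard definitions of the two averages (it does not call SynthEval, and the function name is made up for this example) to show how they diverge when a classifier only ever predicts the majority class.

```python
from collections import Counter


def f1_micro_and_weighted(y_true, y_pred):
    """Return (micro_f1, weighted_f1) for single-label classification."""
    n = len(y_true)
    support = Counter(y_true)

    # Micro-averaging pools all decisions; for single-label problems
    # this is the same number as plain accuracy.
    micro = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Weighted averaging computes F1 per class, then weights by class support.
    weighted = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = count - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        weighted += (count / n) * f1
    return micro, weighted


# 8 majority-class rows, 2 minority-class rows, and a classifier that
# always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
micro, weighted = f1_micro_and_weighted(y_true, y_pred)
# micro is 0.8, weighted is about 0.711
```

The micro score simply reports the 80% accuracy, whereas the weighted score is pulled down because the minority class contributes an F1 of zero.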

Bug fixes:

  • Hellinger distance metric had a division-by-zero error when determining the bin width if the interquartile range was 0: this error is now caught, and the binning in the non-monotonic case is handled using variable bin widths.
  • Adding the timeout feature caused a number of warnings about plotting outside of the main thread on certain versions of matplotlib with tkinter; this is fixed by using the 'Agg' backend for plotting. In addition, asyncio caused a notebook crash on some versions, so we added a catch that replaces direct event-loop calls with a loop-safe helper using coroutines if a loop is already running.
  • Improved type casting to meet NumPy 2.x scalar handling (tests broke because of previously lazy type handling).
  • Attribute disclosure risk, membership inference attack, and Kolmogorov-Smirnov test metrics would return undefined STD for STD calculation on lists of length 1 (which was not a huge problem, but would throw warnings); we made catches to avoid this.
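
For readers unfamiliar with the event-loop problem behind the notebook crash, the sketch below shows one common way to apply an asyncio timeout from synchronous code whether or not a loop is already running. It is an illustration under stated assumptions, not SynthEval's code: the names `run_with_timeout` and `run_sync` and the toy coroutine metrics are hypothetical, and the fallback of running a fresh loop in a worker thread is just one possible design.

```python
import asyncio
import concurrent.futures


async def run_with_timeout(coro_fn, timeout):
    """Await coro_fn(); return its result, or None if it exceeds `timeout` seconds."""
    try:
        return await asyncio.wait_for(coro_fn(), timeout)
    except asyncio.TimeoutError:
        return None  # the caller treats None as "metric skipped"


def run_sync(coro_fn, timeout):
    """Run the coroutine from synchronous code, with or without a running loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running (plain script): asyncio.run is safe to call.
        return asyncio.run(run_with_timeout(coro_fn, timeout))
    # A loop is already running (e.g. in a notebook): calling asyncio.run here
    # would raise, so start a fresh loop in a worker thread instead.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(lambda: asyncio.run(run_with_timeout(coro_fn, timeout)))
        return future.result()


async def fast_metric():
    # Stand-in for a metric that finishes quickly.
    await asyncio.sleep(0.01)
    return {"value": 1.0}


async def slow_metric():
    # Stand-in for a metric that takes too long.
    await asyncio.sleep(5)
    return {"value": 2.0}
```

Note that `asyncio.wait_for` can only interrupt a coroutine at an `await` point; a purely CPU-bound computation would need a different mechanism to be stopped early.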

SynthEval v1.6.2

24 Mar 10:10

This release includes various fixes and minor improvements.

  • cls_acc: now outputs the signed difference (negative means worse, and positive means better, than training on the real data)
  • cls_acc: it is now possible to select only a subset of the classifiers
  • cls_acc: extended output of the individual accuracies and differences can be enabled in the result data frame
  • auroc: signed difference enabled
  • cio: fixed mistake in calculation of CIs

SynthEval v1.6.1

16 Feb 09:25
4e45dd5

Fix deprecation warnings and add stratification to classification accuracy difference metric.
Smaller fixes related to datatypes.

SynthEval v1.6.0

20 Dec 12:49

SynthEval 1.6.0 introduces a new metric category called "fairness", along with doctests and more.

Fairness dimension

Synthetic data can be generated for various reasons. Privacy is usually part of the motivation, but augmentation is also a key factor.
Creating synthetic data that accurately mimics real-world datasets also propagates biases and imbalances unless extra steps are taken. Whether to try to mitigate such biases and imbalances and thus hurt resemblance to the training data remains a dilemma, but there are many arguments for tackling these problems early in the pipeline rather than downstream.
Enabling fairness metrics to be treated separately from utility and privacy metrics in SynthEval will hopefully prove productive for promoting additional fairness dimensions to be investigated, and for monitoring how utility and privacy are affected by better/worse fairness.

Doctests

Tests for SynthEval have long been so outdated that they were practically non-existent. In this update, we reviewed all metrics and key files and added documentation and doctests to all metrics and main scripts. The doctests serve as examples of how each method functions and are verified through pytest. The new yml file on GitHub runs all doctests every time the main branch is updated, and the status of the latest run is now displayed at the top of the front page.

Other updates and fixes

  • Statistical Parity Difference is added as the first fairness metric.
  • A helper function is added for formatting in the console print (old stuff can be updated).
  • Minor fix for MIA metric, also renamed activation key from "mia_risk" to "mia" to avoid misinterpretation.
  • MIA metric had its saved outputs changed back from F1 to recall and precision.
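
As background for the new fairness metric, statistical parity difference is usually defined as the gap in positive-outcome rates between two groups of a protected attribute. The sketch below implements that textbook definition in plain Python; the function name and arguments are hypothetical, and the release notes do not say whether SynthEval computes it in exactly this way.

```python
def statistical_parity_difference(outcomes, groups, group_a, group_b, positive=1):
    """Return P(outcome = positive | group_a) - P(outcome = positive | group_b)."""

    def positive_rate(group):
        # Outcomes of all rows that belong to the given group.
        selected = [o for o, g in zip(outcomes, groups) if g == group]
        if not selected:
            raise ValueError(f"no rows found for group {group!r}")
        return sum(o == positive for o in selected) / len(selected)

    return positive_rate(group_a) - positive_rate(group_b)


# Toy data: group "a" receives the positive outcome 3 times out of 4,
# group "b" only 1 time out of 4.
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a"] * 4 + ["b"] * 4
spd = statistical_parity_difference(outcomes, groups, "a", "b")
# spd is 0.5
```

A value of 0 means both groups receive the positive outcome at the same rate; the further from 0, the larger the disparity.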

SynthEval v1.5.1

17 Sep 13:55

This release adds the eps risk privacy loss and the Quantile MSE.

SynthEval v1.5.0

23 Aug 07:40

This release includes many fixes to the previous iteration, as well as a new metric: the PCA eigenvalue- and angle difference.

SynthEval v1.4.1

24 Apr 08:02

GitHub release to go alongside the existing release on PyPI