Releases: schneiderkamplab/syntheval
SynthEval v1.7.1
Improvements:
- Rich console mode no longer suppresses warnings. Warnings are now handled more gracefully and can be toggled with SynthEval's new `show_warnings` argument.
Bug fixes:
- The FIO metric would fail when only a single outcome variable was provided, due to a mistaken variable reassignment.
- The MIA metric would fail when the size of the synthetic data was less than the size of the holdout data. We made a catch because keeping a large holdout set may be intended.
- PCA is not really meant to work with fewer rows than columns; for the PCA metric, this would cause an error from sklearn. We made a catch and now produce a warning when this is the case.
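A guard of the kind described for the PCA metric could look roughly like the sketch below. This is an illustration only, not SynthEval's actual code: the helper name `check_pca_shape` and its return convention are hypothetical.

```python
import warnings


def check_pca_shape(n_rows: int, n_cols: int) -> bool:
    """Return True if a PCA comparison is sensible for data of this shape.

    Hypothetical helper: emits a warning instead of letting the
    decomposition fail when there are fewer rows than columns.
    """
    if n_rows < n_cols:
        warnings.warn(
            f"PCA skipped: {n_rows} rows is fewer than {n_cols} columns.",
            RuntimeWarning,
        )
        return False
    return True
```

A caller would skip the metric when the helper returns `False`, so the rest of the evaluation can continue.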
SynthEval v1.7.0
The 1.7.0 update introduces the new "rich" console mode, based on the Rich library, and refines the metrics output interface, allowing dynamic switching between modes without excess tech debt. The analysis target variable pattern has been completely overhauled to use a configurable object that handles all labels relevant to downstream task analyses, along with confounders, with finer-grained control. In addition, we introduce a timeout feature that allows long metrics to be interrupted and skipped, we improve error handling and configurability, and we add Maximum Mean Discrepancy, along with Feature Importance Overlap, to the metric library.
Refactor:
- Changed the `format_output` method in all metric classes to return a list of tuples suitable for rich console output, replacing the previous string-based formatting. This affects the metric template, core metric class, and all metric implementations.
- Changed `analysis_target_var` to an `analysis_target` object: parsing is handled automatically to retain the simple interface, but the target can also be passed as an object for finer-grained control. See the new section in `guides/syntheval_guide.ipynb` for a tutorial on how this object can be used.
New Features:
- Added a `console` argument for the main class, which allows specifying use of `rich` (new) or `ascii` (legacy) formatting for the console print, or turning it `off` entirely. We added a check to automatically switch to `ascii` from `rich` if in a notebook environment to prevent crashing the terminal.
- Added a timeout feature based on `asyncio` for the main evaluation loop, to allow interrupting and skipping of metrics that take too long to complete. By default, timeout is not enabled.
- Added a `plot_figures` attribute to metric classes, allowing users to control figure plotting separately from verbosity.
- Added a corresponding argument `enable_plots` to the SynthEval class, to control plotting in the main evaluation loop.
- Added a `missing_directive` argument to the SynthEval class, so that the user can control whether SynthEval should raise a warning, drop rows with missingness, or ignore that there is missingness and carry on. We added a small discussion on missingness in the `guides/preprocessing.md` guide, for users interested in other solutions.
- Added the Maximum Mean Discrepancy (MMD) metric, recording both the biased and unbiased versions of the statistic. A detailed explanation and reference are added in `guides/metrics_references.md`.
- Added the Feature Importance Overlap (FIO) metric, which checks two properties related to feature selection/ranking tasks: that the importance values assigned by a predictive model match (mean absolute error and weighted mean absolute error), and that the features recovered in a ranking at 5%, 10%, 25% and 50% of features are the same. The metric can also plot the top feature importance scores. A description of this metric is added in `guides/metrics_references.md`.
- With the new `analysis_target` object, the increased flexibility allowed some important changes in the classification accuracy, AUROC difference, attribute disclosure risk, and statistical parity metrics, which now account for multiple potential labels and dynamically remove confounders in prediction tasks (this is also considered in the new FIO metric).
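To illustrate the difference between the biased and unbiased MMD statistics mentioned above, here is a minimal pure-Python sketch of the standard squared-MMD estimators. The Gaussian (RBF) kernel, the `gamma` bandwidth, and the function names are assumptions made for this example; SynthEval's own implementation and kernel choice may differ (see `guides/metrics_references.md`).

```python
import math
from itertools import product


def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Return (biased, unbiased) estimates of squared MMD.

    Both samples need at least two points for the unbiased estimate.
    """
    m, n = len(xs), len(ys)
    k_xx = sum(rbf(a, b, gamma) for a, b in product(xs, xs))
    k_yy = sum(rbf(a, b, gamma) for a, b in product(ys, ys))
    k_xy = sum(rbf(a, b, gamma) for a, b in product(xs, ys))

    # Biased estimate: within-sample sums include the diagonal terms.
    biased = k_xx / m**2 + k_yy / n**2 - 2 * k_xy / (m * n)

    # Unbiased estimate: drop the diagonal terms. For the RBF kernel
    # k(x, x) = 1, so each diagonal sums to the sample size.
    unbiased = (
        (k_xx - m) / (m * (m - 1))
        + (k_yy - n) / (n * (n - 1))
        - 2 * k_xy / (m * n)
    )
    return biased, unbiased
```

With this kernel the biased estimate is never smaller than the unbiased one, and the unbiased estimate can be slightly negative when the two samples come from the same distribution.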
Documentation:
- Minor details were added to the `README.md`.
- Major reformatting of the metrics overview in `README.md` into tables with metric keywords and links to the method documentation and, in the few instances where applicable, to `guides/metrics_references.md`.
- Guide codebooks were refreshed with the latest features, new metrics, and nicer printing. `guides/syntheval_guide.ipynb` now includes a new part on the `analysis_target` object.
- New `guides/preprocessing.md` added to document preprocessing steps.
- Adjusted a number of the metric docstrings.
Changes:
- Improved error handling in several metric scripts by raising `ValueError` instead of printing warnings or passing on failed assertions, ensuring clearer feedback for users and that errors are raised in the active console.
- Changed the default ranking system for the benchmark method from `linear` (min-max sum) to `summation` (flat sum).
- Changed the default preprocessing for the PCA metric from "mean" to "std" for consistency.
- Changed the default F1 setting in the classification metric from `micro` to `weighted`, for more saturation-aware behaviour on imbalanced classification problems.
- Changed the default confidence interval unit for the CIO and DWM metrics from `sem` to `std`. The new option `ci` can be used to switch back to the old behaviour if needed.
- The attribute disclosure metric had the `sensitive` argument removed; the sensitive attributes are now passed through the `analysis_target` object.
- The statistical parity metric similarly had the `protected_attribute` argument removed, and now also uses the `sensitive_vars` attribute in the `analysis_target` object. In addition, the statistical parity metric can now evaluate multiple protected attributes.
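The effect of switching the F1 default from `micro` to `weighted` can be seen in a small pure-Python sketch. The function below is written for illustration only and is not taken from SynthEval; it follows the usual definitions (the same ones behind the `average="micro"` and `average="weighted"` options of scikit-learn's `f1_score`).

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, weighted_f1) for single-label classification."""
    pairs = list(zip(y_true, y_pred))
    support = Counter(y_true)
    tp_total = fp_total = fn_total = 0
    weighted = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weighted F1: per-class F1 averaged by true-label frequency.
        weighted += support[label] / len(y_true) * f1
        tp_total, fp_total, fn_total = tp_total + tp, fp_total + fp, fn_total + fn
    # Micro F1: pool the counts over all classes before computing F1.
    micro_denom = 2 * tp_total + fp_total + fn_total
    micro = 2 * tp_total / micro_denom if micro_denom else 0.0
    return micro, weighted


# Imbalanced toy data: nine negatives, one positive, and a classifier
# that always predicts the majority class.
micro, weighted = f1_scores([0] * 9 + [1], [0] * 10)
```

On this toy data the micro score equals plain accuracy (0.9), while the weighted score is lower because the missed minority class contributes an F1 of zero.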
Bug fixes:
- The Hellinger distance metric had a division by zero error when determining the bin width if the interquartile range was 0: this error is now caught, and binning in the non-monotonic case is handled using a variable bin width.
- Adding the timeout feature caused a number of warnings from plotting outside of the main thread on certain versions of matplotlib with tkinter; this is fixed by using the 'Agg' backend for plotting. In addition, `asyncio` caused a notebook crash on some versions, so we added a catch that replaces direct event-loop calls with a loop-self-helper using coroutines if a loop is already running.
- Improved type casting to meet NumPy 2.x scalar handling (tests broke because of previously lazy type handling).
- The attribute disclosure risk, membership inference attack, and Kolmogorov-Smirnov test metrics would return an undefined STD when it was calculated on lists of length 1 (which was not a huge problem, but would throw warnings); we made catches to avoid this.
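The general pattern behind an `asyncio`-based timeout that also copes with an already-running event loop (as in notebooks) can be sketched as follows. This is a hedged illustration, not SynthEval's implementation: the names `evaluate`, `run_sync`, `fast_metric` and `slow_metric` are hypothetical, and the fallback shown here (running the coroutine on a fresh loop in a worker thread) is only one possible way to avoid nesting event loops.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fast_metric():
    return "fast result"


async def slow_metric():
    await asyncio.sleep(10)  # stands in for a long computation
    return "slow result"


async def evaluate(metrics, timeout=None):
    """Await each metric in turn; record None for any that exceeds `timeout` seconds."""
    results = {}
    for name, metric in metrics.items():
        try:
            # timeout=None means "wait indefinitely" (timeout disabled).
            results[name] = await asyncio.wait_for(metric(), timeout)
        except asyncio.TimeoutError:
            results[name] = None  # metric skipped
    return results


def run_sync(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running (plain script), so asyncio.run is safe.
        return asyncio.run(coro)
    # A loop is already running (e.g. in a notebook): use a fresh loop
    # in a worker thread instead of nesting event loops.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


results = run_sync(evaluate({"fast": fast_metric, "slow": slow_metric}, timeout=0.05))
```

Note that `asyncio.wait_for` can only cancel a coroutine at an `await` point, so purely blocking, CPU-bound code would need a different mechanism to be interrupted.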
SynthEval v1.6.2
This release includes various fixes and minor improvements.
- cls_acc: now outputs the signed difference (negative means worse and positive means better than the real training data)
- cls_acc: it is now possible to select only a subset of the classifiers
- cls_acc: extended output with individual accuracies and differences can be enabled in the result data frame
- auroc: signed difference enabled
- cio: fixed mistake in calculation of CIs
SynthEval v1.6.1
Fix deprecation warnings and add stratification to classification accuracy difference metric.
Smaller fixes wrt. datatypes.
SynthEval v1.6.0
SynthEval 1.6.0 introduces a new category for metrics called "fairness", doctests, and more.
Fairness dimension
Synthetic data can be generated for various reasons. Privacy is usually part of the motivation, but augmentation is also a key factor.
Creating synthetic data that accurately mimics real-world datasets also propagates biases and imbalances unless extra steps are taken. Whether to try to mitigate such biases and imbalances and thus hurt resemblance to the training data remains a dilemma, but there are many arguments for tackling these problems early in the pipeline rather than downstream.
Enabling fairness metrics to be treated separately from utility and privacy metrics in SynthEval will hopefully prove productive for promoting additional fairness dimensions to be investigated, and for monitoring how utility and privacy are affected by better/worse fairness.
Doctests
Tests for SynthEval have long been so outdated that they were practically non-existent. In this update, we reviewed all metrics and key files and added additional documentation and doctests to all metrics and main scripts. The doctests work as examples of how each method functions and are verified to work through pytest. The new yml file on GitHub runs all doctests every time the main branch is updated, and the status from the latest run is now displayed at the top of the front page.
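For readers unfamiliar with the format, a doctest is a usage example embedded in a docstring that doubles as a test. The function below is a made-up example, not one of SynthEval's metrics.

```python
import doctest


def signed_difference(real_score: float, synth_score: float) -> float:
    """Return the synthetic score minus the real score.

    >>> signed_difference(0.5, 0.75)
    0.25
    >>> signed_difference(0.75, 0.5)
    -0.25
    """
    return synth_score - real_score


if __name__ == "__main__":
    # Run every docstring example in this module and report the outcome.
    print(doctest.testmod())
```

Examples like this can also be collected by pytest with its `--doctest-modules` option.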
Other updates and fixes
- Statistical Parity Difference is added as the first fairness metric.
- A helper function is added for formatting in the console print (old stuff can be updated).
- Minor fix for MIA metric, also renamed activation key from "mia_risk" to "mia" to avoid misinterpretation.
- MIA metric had its saved outputs changed back from F1 to recall and precision.
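As a rough illustration of what a statistical parity difference measures, the sketch below compares positive-prediction rates between two groups. The function name, the binary protected attribute, and the sign convention (unprivileged minus privileged) are assumptions for this example; SynthEval's metric may be defined differently.

```python
def statistical_parity_difference(predictions, groups, privileged):
    """Return P(pred = 1 | unprivileged) - P(pred = 1 | privileged).

    `predictions` holds 0/1 outcomes and `groups` the protected
    attribute value for each sample.
    """
    priv = [p for p, g in zip(predictions, groups) if g == privileged]
    unpriv = [p for p, g in zip(predictions, groups) if g != privileged]
    if not priv or not unpriv:
        raise ValueError("Both groups need at least one sample.")
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)


# Group "a" receives a positive prediction in 3 of 4 cases, group "b" in 1 of 4.
spd = statistical_parity_difference(
    predictions=[1, 1, 1, 0, 1, 0, 0, 0],
    groups=["a", "a", "a", "a", "b", "b", "b", "b"],
    privileged="a",
)
```

A value of zero would indicate equal positive rates across the two groups; here the unprivileged group is favoured less often, so the difference is negative.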
SynthEval v1.5.1
This release adds the eps risk privacy loss and the Quantile MSE metrics.
SynthEval v1.5.0
This release includes many fixes to the previous iteration, as well as a new metric: the PCA eigenvalue- and angle difference.
SynthEval v1.4.1
GitHub release to go alongside the existing release on PyPI