2 changes: 2 additions & 0 deletions README.md
@@ -17,6 +17,7 @@ Inspired by [Awesome Synthetic Data](https://github.com/gretelai/awesome-synthet
+ [Copulas](https://github.com/sdv-dev/Copulas): a Python library for modeling multivariate distributions and sampling from them using copula functions.
+ [CTGAN](https://github.com/sdv-dev/CTGAN): SDV’s collection of deep learning-based synthetic data generators for single table data.
+ [DataGene](https://github.com/firmai/datagene): a tool to train, test, and validate datasets, detect and compare dataset similarity between real and synthetic datasets.
+ [distilabel](https://github.com/argilla-io/distilabel): Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
+ [DoppelGANger](https://github.com/fjxmlzn/DoppelGANger): a synthetic data generation framework based on generative adversarial networks (GANs).
+ [DP_WGAN-UCLANESL](https://github.com/nesl/nist_differential_privacy_synthetic_data_challenge): this solution trains a Wasserstein generative adversarial network (w-GAN) on the real private dataset.
+ [DPSyn](https://github.com/usnistgov/PrivacyEngCollabSpace/tree/master/tools/de-identification/Differential-Privacy-Synthetic-Data-Challenge-Algorithms/DPSyn): an algorithm for synthesizing microdata while satisfying differential privacy.
@@ -30,6 +31,7 @@ Inspired by [Awesome Synthetic Data](https://github.com/gretelai/awesome-synthet
+ [Synner](https://github.com/huda-lab/synner): an open source tool to generate real-looking synthetic data by visually specifying the properties of the dataset.
+ [Synth](https://www.getsynth.com/): an open source data-as-code tool that provides a simple CLI workflow for generating consistent data in a scalable way.
+ [Synthea](https://synthetichealth.github.io/synthea/): an open source synthetic patient generator that models the medical history of synthetic patients.
+ [Synthetic Data Generator](https://huggingface.co/spaces/argilla/synthetic-data-generator): Build datasets using natural language for free using Hugging Face inference endpoints, Gradio and distilabel.
+ [Synthetic data vault (SDV)](https://sdv.dev/): one of the first open source synthetic data solutions, SDV provides tools for generating synthetic data for tabular, relational, and time series data.
+ [TGAN](https://github.com/sdv-dev/TGAN): generative adversarial training for generating synthetic tabular data.
+ [Tofu](https://github.com/spiros/tofu): a Python library for generating synthetic UK Biobank data.