GANs are well known for their success in realistic image generation. However, they can also be applied to tabular data generation. We will review and examine some recent papers about tabular GANs in action.
Generative Adversarial Networks (GANs) are well known for their success in realistic image generation. However, they can also be applied to generate tabular data. Here we give you the opportunity to try some of them.
- 4. Run all experiments with `python ./Research/run_experiment.py` or `python run_experiment.py`. You may add more datasets, adjust validation type and categorical encoders.
- 5. Observe metrics across all experiments in console or
- in `./Research/results/fit_predict_scores.txt`
-
+ 4. Run all experiments with `python ./Research/run_experiment.py` or `python run_experiment.py`. You may
+ add more datasets, adjust validation type and categorical encoders.
+ 5. Observe metrics across all experiments in console or in `./Research/results/fit_predict_scores.txt`

- ## Acknowledgments
+ **Experiment design**

- The author would like to thank the Open Data Science community [7] for many
- valuable discussions and educational help in the growing field of machine and
- deep learning. Special thanks also go to Sber [8] for the opportunity to work on
- such tasks and for providing computational resources.
+

- ## References
+ **Picture 1.1** Experiment design and workflow

- [1] Jonathan Hui. GAN — What is Generative Adversarial Networks GAN? (2018), medium article
+ ## Results
+ To determine the best sampling strategy, the ROC AUC scores for each dataset were min-max scaled and then averaged
+ across datasets.
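
The scaling and averaging described in the Results text can be sketched as follows. This is a minimal illustration, not the repository's own code: the function name `rank_strategies`, the dataset and strategy names, and all ROC AUC values are hypothetical. Within each dataset the worst strategy maps to 0 and the best to 1 (100%), and each strategy's scaled scores are then averaged over datasets.

```python
from statistics import mean


def rank_strategies(scores):
    """Min-max scale ROC AUC per dataset, then average per strategy.

    scores: {dataset_name: {strategy_name: roc_auc}}
    returns: {strategy_name: mean of its scaled scores across datasets}
    """
    scaled_by_strategy = {}
    for by_strategy in scores.values():
        lowest = min(by_strategy.values())
        highest = max(by_strategy.values())
        span = highest - lowest
        for strategy, auc in by_strategy.items():
            # Assumption: if every strategy ties on a dataset, each counts as best.
            scaled = (auc - lowest) / span if span > 0 else 1.0
            scaled_by_strategy.setdefault(strategy, []).append(scaled)
    return {s: mean(values) for s, values in scaled_by_strategy.items()}


# Hypothetical ROC AUC values, for illustration only (not experiment results).
example_scores = {
    "dataset_a": {"none": 0.70, "gan": 0.80, "sample_original": 0.75},
    "dataset_b": {"none": 0.60, "gan": 0.60, "sample_original": 0.90},
}

averaged = rank_strategies(example_scores)
best = max(averaged, key=averaged.get)
print(averaged, best)
```

Scaling per dataset before averaging keeps a dataset with a wide ROC AUC range from dominating the comparison, at the cost of discarding the absolute score levels.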

- [2] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio. Generative Adversarial Networks (2014). arXiv:1406.2661
+ **Table 1.2** Results of different sampling strategies across datasets; higher is better (100% is the maximum ROC AUC per dataset)

- [3] Lei Xu, Kalyan Veeramachaneni. Synthesizing Tabular Data using Generative Adversarial Networks (2018). arXiv:1811.11264v1 [cs.LG]
[4] Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. Modeling Tabular Data using Conditional GAN (2019). arXiv:1907.00503v2 [cs.LG]
+ ## Acknowledgments

- [5] Denis Vorotyntsev. Benchmarking Categorical Encoders (2019). Medium post
+ The author would like to thank the Open Data Science community [7] for many valuable discussions and educational help in the
+ growing field of machine and deep learning.
+
+ ## Citation
+
+ If you use **GAN-for-tabular-data** in a scientific publication, we would appreciate references to the following BibTeX entry:
[7] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila. Analyzing and Improving the Image Quality of StyleGAN (2019) arXiv:1912.04958v2 [cs.CV]
+ [1] Lei Xu, Kalyan Veeramachaneni. Synthesizing Tabular Data using Generative Adversarial Networks (2018).
+ arXiv:1811.11264v1 [cs.LG]

- [8] ODS.ai: Open data science (2020), https://ods.ai/
+ [2] Alexia Jolicoeur-Martineau, Kilian Fatras, Tal Kachman. Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees (2023). https://github.com/SamsungSAILMontreal/ForestDiffusion [cs.LG]

- [9] Sber (2020), https://www.sberbank.ru/
+ [3] Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. Modeling Tabular Data using Conditional GAN. NeurIPS (2019)