Document/content/tests/AITG-MOD-03_Testing_for_Poisoned_Training_Sets.md
Lines changed: 4 additions & 8 deletions
@@ -31,14 +31,10 @@ This test identifies vulnerabilities associated with poisoned training datasets,
 - **Enforce MLOps Security**: Secure the entire MLOps pipeline. Use strict access controls on data storage, version control for data and code, and require reviews for any changes to the data preprocessing or training scripts.
 
 ### Suggested Tools for this Specific Test
-- **Cleanlab**: A data-centric AI package that automatically detects and corrects label errors, outliers, and other issues in datasets.
- - Tool Link: [Cleanlab on GitHub](https://github.com/cleanlab/cleanlab)
-- **Adversarial Robustness Toolbox (ART)**: Provides tools for crafting data poisoning attacks (for testing defenses) and implementing detection methods like activation clustering.
- - Tool Link: [ART on GitHub](https://github.com/Trusted-AI/adversarial-robustness-toolbox)
-- **Data Version Control (DVC)**: An open-source tool for data versioning, which is crucial for maintaining data integrity and reproducibility.
- - Tool Link: [DVC Website](https://dvc.org/)
-- **TensorFlow Data Validation (TFDV)**: A library for analyzing and validating machine learning data at scale. It can help detect anomalies and drift in your datasets.
+- **Cleanlab**: A data-centric AI package that automatically detects and corrects label errors, outliers, and other issues in datasets - [Cleanlab on GitHub](https://github.com/cleanlab/cleanlab)
+- **Adversarial Robustness Toolbox (ART)**: Provides tools for crafting data poisoning attacks (for testing defenses) and implementing detection methods like activation clustering - [ART on GitHub](https://github.com/Trusted-AI/adversarial-robustness-toolbox)
+- **Data Version Control (DVC)**: An open-source tool for data versioning, which is crucial for maintaining data integrity and reproducibility - [DVC Website](https://dvc.org/)
+- **TensorFlow Data Validation (TFDV)**: A library for analyzing and validating machine learning data at scale. It can help detect anomalies and drift in your datasets - [TFDV Documentation](https://www.tensorflow.org/tfx/data_validation/get_started)
 
 ### References
 - Northcutt, Curtis, et al. "Confident Learning: Estimating Uncertainty in Dataset Labels." Journal of Artificial Intelligence Research, 2021. [Link](https://arxiv.org/abs/1911.00068)
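
As context for the Cleanlab entry and the Confident Learning reference in this file, here is a minimal sketch of the per-class thresholding idea used to flag likely label errors, such as flipped labels in a poisoned training set. It is a simplified pure-Python illustration, not Cleanlab's API or implementation. The function name `flag_suspect_labels`, the sample data, and the fallback threshold of 1.0 for classes with no labeled examples are assumptions made for this example.

```python
def flag_suspect_labels(labels, pred_probs):
    """Return indices whose given label disagrees with the confident class.

    labels:     list of integer class labels (possibly noisy or poisoned)
    pred_probs: list of per-class probability lists, ideally out-of-sample
    """
    num_classes = len(pred_probs[0])

    # Per-class threshold: the mean predicted probability of class k
    # over the examples that carry the given label k.
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for y, probs in zip(labels, pred_probs):
        sums[y] += probs[y]
        counts[y] += 1
    # Assumption: a class with no labeled examples gets threshold 1.0.
    thresholds = [s / c if c else 1.0 for s, c in zip(sums, counts)]

    suspects = []
    for i, (y, probs) in enumerate(zip(labels, pred_probs)):
        # Classes whose probability clears that class's own threshold.
        confident = [k for k in range(num_classes) if probs[k] >= thresholds[k]]
        if not confident:
            continue  # no confident class, so make no claim about this example
        best = max(confident, key=lambda k: probs[k])
        if best != y:
            suspects.append(i)
    return suspects


# Hypothetical data: example 2 is labeled 0, but the model strongly predicts 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
    [0.2, 0.8], [0.1, 0.9], [0.3, 0.7],
]
print(flag_suspect_labels(labels, pred_probs))  # prints [2]
```

Flagged indices are candidates for manual review rather than confirmed poisoning. In practice the probabilities would come from cross-validated predictions so that the model has not memorized the suspect labels.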