This repository was created by the authors of the paper "Towards equitable AI for women’s health: accessible data as a catalyst for innovation".
Pre-print paper:
- Bianca Schor, Kristin Collett Caolo, Le Minh Thao Doan et al. Towards equitable AI for women’s health: accessible data as a catalyst for innovation, 10 March 2026, PREPRINT (Version 1) available at Research Square https://doi.org/10.21203/rs.3.rs-8001150/v1
Paper abstract:
Artificial intelligence (AI) is rapidly advancing across health domains, yet its integration into women’s health remains challenged, limited by under-representation in clinical literature and datasets, inconsistent data standards, and a lack of coordinated access to multimodal research-quality data resources. This research maps the current horizon of accessible (i.e. open and accessible on request) data that can contribute to AI development for women’s health. Main resources include clinical data repositories, cancer registries, biobanks and published research studies. We summarise data resources related to cancers (breast, cervical, endometrial, and ovarian), chronic and acute health conditions (cardiovascular), under-diagnosed conditions (endometriosis), wearable and vital sign data from remote health monitoring, and discuss other potential resources, such as the broader healthcare data in community care and pharmacy data. We provide a working definition of “women’s health”, a table centralising key accessible data sources under the level of resources (national registry/clinical study, single/multimodality), and discuss key challenges and opportunities to advance AI research and innovations in the field. To support accessibility and reuse, we also provide an open-access online repository of curated datasets. This paper thus offers a cornerstone for building an equitable AI for women’s health: it can support future assessments of data completeness, demographic diversity, clinical deployability, methodological benchmarks, licensing, pharmacovigilance, and contributes to highlighting the global AI research in the women’s health ecosystem.
Zenodo archive of full data table:
Schor, B., Caolo, K., Doan, L. M. T., Delfino, M., Occhipinti, A., Lu, H. Y., & Karoune, E. (2026). Towards equitable AI for women's health: accessible data as a catalyst for innovation (Version 2) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.19852537
Version 2 is the most recent update!
The spreadsheet of data for AI in Women's Health was initiated after the one-day conference "AI in Women’s Health: Bridging research and patient voices" on 5 June 2025, organised by the AI for Women's health supra-interest group.
For our paper, we aimed to bring together a list of datasets and databases that are easily accessible, and to advocate for making data open and FAIR to increase accessibility and reuse. This list is not intended to be comprehensive, but a starting point for others to add to.
If you have created or know of any women's health datasets or databases that are suitable for AI research, please contribute to our list.
Option 1:
- Open an issue in this repository.
- Add all the information about the data - see the columns in the spreadsheet.
- We will then review this and reply with a comment.
- Then we will open a pull request and add your data to the spreadsheet.
- We will add you to our contributor list using the all-contributors bot.
Option 2:
- Fill in this Google Form to provide all the information about the data - Link to form.
- We will then review the information and add the data to our list.
- We will add you to our contributor list using the all-contributors bot.
For help or more information, please contact Emma Karoune at ekaroune@turing.ac.uk
This work is licensed under a Creative Commons Attribution 4.0 International License.
This work is licensed under an MIT license.
