Commit 59d1de1

Update README.md (#34)
* Update README.md
  Added HF links and table of contents
* Update README.md
* Update README.md
  Fixed HF URLs
* Update README.md
  Added citation
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
1 parent c91f032 commit 59d1de1

1 file changed, +29 -12 lines changed

README.md
@@ -1,12 +1,30 @@
-# OpenPMC-VL
+# Open-PMC

 ----------------------------------------------------------------------------------------

 [![code checks](https://github.com/VectorInstitute/aieng-template/actions/workflows/code_checks.yml/badge.svg)](https://github.com/VectorInstitute/pmc-data-extraction/actions/workflows/code_checks.yml)
 [![integration tests](https://github.com/VectorInstitute/aieng-template/actions/workflows/integration_tests.yml/badge.svg)](https://github.com/VectorInstitute/pmc-data-extraction/actions/workflows/integration_tests.yml)
 [![license](https://img.shields.io/github/license/VectorInstitute/aieng-template.svg)](https://github.com/VectorInstitute/pmc-data-extraction/blob/main/LICENSE.md)

-A toolkit to download, augment, and benchmark OpenPMC-VL; a large dataset of image-text pairs extracted from open-access scientific articles on PubMedCentral.
+<div align="center">
+<img src="https://github.com/VectorInstitute/pmc-data-extraction/blob/0a969136344a07267bb558d01f3fe76b36b93e1a/media/open-pmc-pipeline.png?raw=true"
+alt="Open-PMC Pipeline"
+width="1000" />
+</div>
+
+A toolkit to download, augment, and benchmark Open-PMC; a large dataset of image-text pairs extracted from open-access scientific articles on PubMedCentral.
+
+For more details, see the following resources:
+- **arXiv Paper:** [http://arxiv.org/abs/2503.14377](http://arxiv.org/abs/2503.14377)
+- **Dataset:** [https://huggingface.co/datasets/vector-institute/open-pmc](https://huggingface.co/datasets/vector-institute/open-pmc)
+- **Model Checkpoint:** [https://huggingface.co/vector-institute/open-pmc-clip](https://huggingface.co/vector-institute/open-pmc-clip)
+
+## Table of Contents
+
+1. [Installing Dependencies](#installing-dependencies)
+2. [Download and Parse Image-Caption Pairs](#download-and-parse-image-caption-pairs-from-pubmed-articles)
+3. [Run Benchmarking Experiments](#run-benchmarking-experiments)
+4. [Citation](#citation)

 ## Installing dependencies

@@ -133,18 +151,17 @@ mmlearn_run \
 dataloader.test.batch_size=64 \
 resume_from_checkpoint="path/to/model/checkpoint"
 ```
-For more comprehensive examples of shell scripts that run various experiments with OpenPMC-VL, refer to `openpmcvl/experiment/scripts`.
+For more comprehensive examples of shell scripts that run various experiments with Open-PMC, refer to `openpmcvl/experiment/scripts`.
 For more information about `mmlearn`, please refer to the package's [official codebase](https://github.com/VectorInstitute/mmlearn).


-
-## References
-<a id="1">[1]</a> PMC-OA paper:
-```latex
-@article{lin2023pmc,
-title={PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents},
-author={Lin, Weixiong and Zhao, Ziheng and Zhang, Xiaoman and Wu, Chaoyi and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
-journal={arXiv preprint arXiv:2303.07240},
-year={2023}
+## Citation
+If you find the code useful for your research, please consider citing
+```bib
+@article{baghbanzadeh2025advancing,
+title={Advancing Medical Representation Learning Through High-Quality Data},
+author={Baghbanzadeh, Negin and Fallahpour, Adibvafa and Parhizkar, Yasaman and Ogidi, Franklin and Roy, Shuvendu and Ashkezari, Sajad and Khazaie, Vahid Reza and Colacci, Michael and Etemad, Ali and Afkanpour, Arash and Dolatabadi, Elham},
+journal={arXiv preprint arXiv:2503.14377},
+year={2025}
 }
 ```
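
Review note: the table of contents added in this commit links to headings by anchor, and two of the four target headings fall outside the hunks shown here. The sketch below is one possible way to check that every in-page link in a README resolves to a heading. It is not part of the repository: the function names are hypothetical, and the slug rule (lowercase, drop punctuation other than hyphens, turn spaces into hyphens) only approximates how GitHub derives anchors. For example, it does not handle the numeric suffixes GitHub appends to duplicate headings.

```python
import re


def github_anchor(heading: str) -> str:
    """Approximate the anchor GitHub derives from a Markdown heading."""
    text = heading.strip().lower()
    # Keep word characters, spaces, and hyphens; drop other punctuation.
    text = re.sub(r"[^\w\- ]", "", text)
    return text.replace(" ", "-")


def heading_anchors(markdown: str) -> set[str]:
    """Anchors for every ATX heading outside fenced code blocks."""
    anchors = set()
    in_fence = False
    for line in markdown.splitlines():
        # A line that starts with a fence marker toggles code-block state.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        match = re.match(r"#{1,6}\s+(.*)", line)
        if match and not in_fence:
            anchors.add(github_anchor(match.group(1)))
    return anchors


def broken_toc_links(markdown: str) -> list[str]:
    """In-page link targets that match no heading anchor."""
    targets = re.findall(r"\]\(#([^)\s]+)\)", markdown)
    known = heading_anchors(markdown)
    return [target for target in targets if target not in known]


# Illustrative input built only from headings visible in this diff.
SAMPLE = """\
## Table of Contents

1. [Installing Dependencies](#installing-dependencies)
2. [Citation](#citation)

## Installing dependencies

## Citation
"""

print(broken_toc_links(SAMPLE))
```

Running `broken_toc_links` on the full README text would list any anchors, such as `download-and-parse-image-caption-pairs-from-pubmed-articles`, that have no matching heading.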
