Author: Ian Kariuki
Institution: UVA School of Data Science
Advisor: Alex Gates
This project aims to increase the impact of research initiatives at UVA by analyzing historical UVA grant proposals and associated faculty information. The goal is to uncover patterns and characteristics correlated with successful funding outcomes.
Design and implement a scalable web scraping pipeline that collects and stores structured faculty biography metadata from UVA department websites. The collected metadata includes but is not limited to:
- Faculty Name
- Department Affiliation
- Professional Title
- Biography Text
- Research Expertise and Interests
- Publicly Available Contact Information (email)
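As a rough illustration, the fields listed above could be held in a small record type like the one below. This is a minimal sketch, not the project's actual schema: the class name `FacultyRecord`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class FacultyRecord:
    """Hypothetical container for one faculty biography (field names are assumptions)."""
    name: str
    department: str
    title: str
    biography: str
    research_interests: List[str] = field(default_factory=list)
    email: Optional[str] = None  # only stored when publicly listed


# Placeholder values for illustration only, not real scraped data.
record = FacultyRecord(
    name="Jane Doe",
    department="data science",
    title="Assistant Professor",
    biography="Placeholder biography text.",
    research_interests=["network science"],
)

# asdict() yields a plain dict that is easy to write out as JSON or a table row.
row = asdict(record)
```

Keeping `email` optional reflects that contact information is collected only when it is publicly available.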
The pipeline begins with faculty in UVA's School of Data Science and is designed to expand to additional departments as the project scales. Raw HTML snapshots are preserved to ensure reproducibility and to track changes to previously recorded data.
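One possible way to preserve raw HTML snapshots so that changes can be detected is to name each file by fetch time and content hash. The sketch below assumes a local directory layout and a helper called `save_snapshot`. Both are hypothetical and may differ from what the scrapers folder actually does.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def save_snapshot(html: str, url: str, root: Path) -> Path:
    """Store raw HTML under root/<url key>/<UTC timestamp>_<content hash>.html.

    Hypothetical helper: if an identical copy of the page was already saved,
    the existing file is returned instead of writing a duplicate.
    """
    content_hash = hashlib.sha256(html.encode("utf-8")).hexdigest()[:16]
    url_key = hashlib.sha256(url.encode("utf-8")).hexdigest()[:12]
    page_dir = root / url_key
    page_dir.mkdir(parents=True, exist_ok=True)

    # An existing file with the same content hash means the page has not changed.
    existing = sorted(page_dir.glob(f"*_{content_hash}.html"))
    if existing:
        return existing[0]

    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = page_dir / f"{stamp}_{content_hash}.html"
    path.write_text(html, encoding="utf-8")
    return path
```

With this layout, a page that has changed since the last run produces a new file next to the old one, so earlier versions remain available for comparison.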
Install the dependencies listed in requirements.txt (for example, with pip install -r requirements.txt).
From within the scrapers folder, run:
python run.py --departments "department name"
Example: python run.py --departments "data science" "economics" "psychology"
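For readers wondering how a flag like --departments can accept several quoted names, the sketch below shows one way to do it with the standard library's argparse. It is an assumption about how such parsing could work, not a copy of run.py. The function names `parse_args` and `normalize_departments` are hypothetical.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command line; nargs="+" collects one or more department names."""
    parser = argparse.ArgumentParser(description="Scrape faculty biography metadata.")
    parser.add_argument(
        "--departments",
        nargs="+",
        required=True,
        metavar="NAME",
        help='one or more department names, e.g. "data science" "economics"',
    )
    return parser.parse_args(argv)


def normalize_departments(names: List[str]) -> List[str]:
    """Lowercase each name and collapse repeated whitespace (hypothetical cleanup step)."""
    return [" ".join(name.lower().split()) for name in names]


# Equivalent to: python run.py --departments "data science" "economics" "psychology"
args = parse_args(["--departments", "data science", "economics", "psychology"])
departments = normalize_departments(args.departments)
```

Each quoted argument arrives as one list element, which is why multi-word names such as "data science" need quotes on the command line.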
