University of Virginia logo

Research

Author: Ian Kariuki

Institution: UVA School of Data Science

Advisor: Alex Gates

Project Intent:

Increase the impact of research initiatives at UVA by analyzing historical UVA grant proposals and associated faculty information. This work aims to uncover patterns and characteristics correlated with successful funding outcomes.

Data Acquisition:

Design and implement a scalable web scraping pipeline that collects and stores structured faculty biography metadata from UVA department websites. The collected metadata includes, but is not limited to:

- Faculty Name
- Department Affiliation
- Professional Title
- Biography Text
- Research Expertise and Interests
- Publicly Available Contact Information (email)

The pipeline will begin with faculty in UVA's School of Data Science and is designed to expand to additional departments as the project scales. Raw HTML snapshots are preserved to ensure reproducibility and to track changes to previously recorded data.
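The two ideas above (structured records plus preserved raw HTML) can be sketched as follows. This is a minimal illustration, not the project's actual scraper: the `FacultyRecord` fields cover only a subset of the metadata listed, the CSS class names (`name`, `title`, `bio`, `email`), the sample markup, and the helper names `parse_profile` and `save_snapshot` are all assumptions made for the example. Only the Python standard library is used.

```python
import hashlib
import tempfile
from dataclasses import dataclass, asdict
from html.parser import HTMLParser
from pathlib import Path


@dataclass
class FacultyRecord:
    """Subset of the metadata fields listed above (hypothetical schema)."""
    name: str
    department: str
    title: str
    biography: str
    email: str


class ProfileParser(HTMLParser):
    """Collect text from elements whose class attribute names a known field."""

    FIELDS = {"name", "title", "bio", "email"}  # assumed CSS class names

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.FIELDS:
            self._current = css_class
            self.values[css_class] = ""

    def handle_endtag(self, tag):
        # Simplification: assumes field elements contain no nested tags.
        self._current = None

    def handle_data(self, data):
        if self._current:
            self.values[self._current] += data.strip()


def parse_profile(html: str, department: str) -> FacultyRecord:
    """Turn one profile page into a structured record."""
    parser = ProfileParser()
    parser.feed(html)
    v = parser.values
    return FacultyRecord(
        name=v.get("name", ""),
        department=department,
        title=v.get("title", ""),
        biography=v.get("bio", ""),
        email=v.get("email", ""),
    )


def save_snapshot(html: str, snapshot_dir: Path, slug: str):
    """Store raw HTML under a content-hash filename.

    Returns (path, is_new). A changed page hashes to a new filename, so
    earlier snapshots are never overwritten and changes remain traceable.
    """
    digest = hashlib.sha256(html.encode("utf-8")).hexdigest()[:16]
    path = snapshot_dir / f"{slug}-{digest}.html"
    is_new = not path.exists()
    if is_new:
        snapshot_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(html, encoding="utf-8")
    return path, is_new


# Made-up markup standing in for a department profile page.
SAMPLE_HTML = """
<div class="profile">
  <h1 class="name">Jane Doe</h1>
  <p class="title">Assistant Professor</p>
  <p class="bio">Studies network science.</p>
  <a class="email" href="mailto:jdoe@example.edu">jdoe@example.edu</a>
</div>
"""

if __name__ == "__main__":
    record = parse_profile(SAMPLE_HTML, department="Data Science")
    print(asdict(record))
    with tempfile.TemporaryDirectory() as tmp:
        path, is_new = save_snapshot(SAMPLE_HTML, Path(tmp), "jane-doe")
        print(path.name, is_new)
```

Naming snapshots by content hash is one possible design choice: re-scraping an unchanged page writes nothing, while an edited page produces a second file next to the first.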

TO RUN:

Install the dependencies listed in requirements.txt (for example, pip install -r requirements.txt)

From within the scrapers folder, run

python run.py --departments "department name"

For example: python run.py --departments "data science" "economics" "psychology"
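For readers curious how a flag like --departments can accept several quoted names at once, here is a minimal sketch using argparse with nargs="+". The contents of run.py are not shown in this README, so this is an assumption about one way to do it, and the `parse_args` and `normalize` helpers are hypothetical names.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --departments takes one or more names."""
    parser = argparse.ArgumentParser(description="Scrape faculty biographies.")
    parser.add_argument(
        "--departments",
        nargs="+",        # one or more values after the flag
        required=True,
        metavar="NAME",
        help='department names, quoted if they contain spaces, e.g. "data science"',
    )
    return parser.parse_args(argv)


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so 'Data  Science' matches 'data science'."""
    return " ".join(name.lower().split())


if __name__ == "__main__":
    # Passing a list here stands in for the real command line.
    args = parse_args(["--departments", "data science", "economics", "psychology"])
    for department in args.departments:
        print(normalize(department))
```

The quotes matter at the shell level: without them, data science would arrive as two separate department names.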