Authors:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
1. Scrape hackathon data:
python lauzhack_scraper.py
Outputs: data/lauzhack_projects.csv, data/lauzhack_projects.json, data/lauzhack_hackathons.csv, data/lauzhack_hackathons.json
2. Enrich with GitHub data:
python enrich_github_data.py --token YOUR_GITHUB_TOKEN
Outputs: data/lauzhack_projects_with_github.csv, data/github_repos_data.json
Get your GitHub token: github.com/settings/tokens
Projects CSV:
- Basic: year, name, awards, description, team, link, tags
- GitHub: stars, forks, language, contributors, commit dates, README
Hackathons CSV:
- year, url, date, location, schedule
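As a quick sanity check on the projects CSV, the sketch below counts projects per year using only the standard library. It assumes the header row uses the basic column names listed above verbatim; the function name `count_projects_per_year` and the sample rows are made up for illustration and are not part of the repository.

```python
import csv
import io
from collections import Counter

# Basic columns as listed above. The exact header spelling in the
# real file is an assumption.
BASIC_COLUMNS = ["year", "name", "awards", "description", "team", "link", "tags"]


def count_projects_per_year(csv_file):
    """Return a Counter mapping year (as a string) to number of projects."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in BASIC_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return Counter(row["year"] for row in reader)


# Tiny in-memory sample with invented rows, standing in for the real file.
SAMPLE = io.StringIO(
    "year,name,awards,description,team,link,tags\n"
    "2023,Demo A,,An example,Alice;Bob,https://github.com/example/demo-a,ai\n"
    "2024,Demo B,1st,Another example,Carol,https://github.com/example/demo-b,web\n"
    "2024,Demo C,,Third example,Dan,,\n"
)
counts = count_projects_per_year(SAMPLE)
print(dict(counts))
```

To run it against the scraper output, pass a file opened with `open("data/lauzhack_projects.csv", newline="", encoding="utf-8")` instead of the in-memory sample.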
lauzhack_scraper.py
- Scrapes LauZHack websites (2023-2025) to extract project information and hackathon metadata
- Outputs: data/lauzhack_projects.csv, data/lauzhack_projects.json, data/lauzhack_hackathons.csv, data/lauzhack_hackathons.json
github_extractor.py
- Helper module with functions to extract GitHub repo data (stars, forks, commits, contributors, README)
- Used by enrich_github_data.py (no need to run directly)
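For readers curious what such a helper might involve, here is a minimal sketch of pulling stars, forks, and language for one repository from the GitHub REST API. It is not the actual code in github_extractor.py: the function names `parse_repo_url`, `summarize_repo`, and `fetch_repo` are hypothetical, and only the `GET /repos/{owner}/{repo}` endpoint and its `stargazers_count`, `forks_count`, and `language` fields are taken from the public API.

```python
import json
import urllib.request
from urllib.parse import urlparse


def parse_repo_url(url):
    """Return (owner, repo) for a github.com repository URL, else None."""
    parsed = urlparse(url if "://" in url else "https://" + url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return owner, repo


def summarize_repo(payload):
    """Pick a few fields out of a GitHub REST API repository object."""
    return {
        "stars": payload.get("stargazers_count"),
        "forks": payload.get("forks_count"),
        "language": payload.get("language"),
    }


def fetch_repo(owner, repo, token=None):
    """Fetch repository metadata from the GitHub REST API (network call)."""
    req = urllib.request.Request(f"https://api.github.com/repos/{owner}/{repo}")
    req.add_header("Accept", "application/vnd.github+json")
    if token:
        # A personal access token raises the API rate limit.
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp:
        return summarize_repo(json.load(resp))
```

Keeping URL parsing and field selection separate from the network call makes both easy to check offline, for example with `parse_repo_url("https://github.com/octocat/Hello-World")`.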
enrich_github_data.py
- Takes the scraped projects and enriches them with GitHub data for all repos
- Outputs: data/lauzhack_projects_with_github.csv, data/github_repos_data.json
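The enrichment step can be pictured as a join between scraped project rows and per-repository GitHub data. The sketch below illustrates that idea under stated assumptions: it keys the GitHub data by the project's `link` value and adds `stars`, `forks`, and `language` fields. The function `enrich_projects`, the key choice, and the sample data are hypothetical; the real script may structure this differently.

```python
def enrich_projects(projects, repo_data):
    """Return new project dicts with GitHub fields merged in where available.

    projects:  list of dicts, each with a "link" key (may be empty)
    repo_data: dict mapping a project link to a dict of GitHub fields
    """
    enriched = []
    for project in projects:
        extra = repo_data.get(project.get("link", ""), {})
        row = dict(project)  # copy so the input rows stay untouched
        for key in ("stars", "forks", "language"):
            # Projects without a matching repository get empty values.
            row[key] = extra.get(key, "")
        enriched.append(row)
    return enriched


# Invented sample data for illustration only.
projects = [
    {"year": "2024", "name": "Demo B", "link": "https://github.com/example/demo-b"},
    {"year": "2024", "name": "Demo C", "link": ""},
]
repo_data = {
    "https://github.com/example/demo-b": {"stars": 3, "forks": 1, "language": "Python"},
}
rows = enrich_projects(projects, repo_data)
print(rows)
```

Filling missing values with empty strings keeps every row the same shape, which is convenient when writing the result back out with `csv.DictWriter`.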
Copyright © 2025-2028 Swiss Data Science Center (SDSC), www.datascience.ch. All rights reserved. The SDSC is jointly established and legally represented by the École Polytechnique Fédérale de Lausanne (EPFL) and the Eidgenössische Technische Hochschule Zürich (ETH Zürich). This copyright encompasses all materials, software, documentation, and other content created and developed by the SDSC.