This repository contains code and analysis for exploring trends and insights within the Olympics dataset.
The dataset covers athletes, countries, and events from every Olympic Games between 1896 and 2016. It includes details such as:
- Athlete demographics
- Event outcomes
- GDP and population data for participating countries
Key findings:
- The USA, Russia, Germany, and China are the top medal winners, with swimming, diving, and wrestling as areas of dominance
- There is a positive correlation between GDP and medal tally
- Contingent size correlates with medal count
- A linear model predicts medal tally with low error
- Height and weight data produce accurate predictions for some sports
To replicate the analysis:
- Clone the repository
- Run the Jupyter notebook
The core libraries used are Pandas, Matplotlib, Seaborn, Statsmodels, and Scikit-learn.
Author:
- Vedant Deshpande