Author: Lily Gates
Data Source: Duolingo Language Report (2020–2025)
This project analyzes public data from the Duolingo Language Report (2020–2025) to explore global trends in language learning. Using Python and Plotly, it visualizes which languages are most popular in different countries over time and highlights patterns in language adoption.
Duolingo Language Report [2020–2025]: Public data (duolingo_language_report_2020_2025.xlsx)
- Overview Sheet: Contains global summary metrics and key statistics.
- Data by Country Sheet: Detailed language popularity data per country and per year (columns: pop1_2020, pop2_2020, … pop2_2025).
- Converts wide-format country-language data into a tidy long-form dataframe.
- Safely extracts the year from column names for consistent analysis.
- Filters top languages for visualization clarity.
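The processing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `country` identifier column, the sample values, and the helper names `to_long` and `keep_top_languages` are assumptions, and only the `pop<rank>_<year>` column pattern comes from the data description.

```python
import pandas as pd

# Hypothetical miniature of the "Data by Country" sheet; the identifier
# column name and the values are assumptions made for illustration.
wide = pd.DataFrame({
    "country": ["Argentina", "Japan", "Sweden"],
    "pop1_2020": ["English", "English", "Swedish"],
    "pop2_2020": ["Portuguese", "Korean", "Spanish"],
    "pop1_2021": ["English", "English", "Swedish"],
    "pop2_2021": ["Portuguese", "Korean", "English"],
})

def to_long(df, id_col="country"):
    """Melt pop<rank>_<year> columns into one row per country, rank and year."""
    value_cols = [c for c in df.columns if c.startswith("pop")]
    long_df = df.melt(id_vars=[id_col], value_vars=value_cols,
                      var_name="column", value_name="language")
    # Pull rank and year out of names like "pop1_2020". Names that do not
    # match the pattern yield NaN and are dropped instead of raising.
    parts = long_df["column"].str.extract(r"^pop(?P<rank>\d)_(?P<year>\d{4})$")
    long_df = long_df.join(parts).dropna(subset=["rank", "year", "language"])
    long_df["rank"] = long_df["rank"].astype(int)
    long_df["year"] = long_df["year"].astype(int)
    return long_df.drop(columns="column").reset_index(drop=True)

def keep_top_languages(long_df, n=3):
    """Keep only rows whose language is among the n most frequent."""
    top = long_df["language"].value_counts().nlargest(n).index
    return long_df[long_df["language"].isin(top)]

long_df = to_long(wide)
top_df = keep_top_languages(long_df, n=2)
print(long_df.head())
```

Extracting the year with a regular expression rather than slicing the column name keeps the step tolerant of unexpected columns, which is one plausible reading of "safely extracts the year".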
- Line Chart / Multi-Line Plot
- Shows the number of countries teaching each language over time (2020–2025).
- Horizontal Stacked Bar Chart
- Displays the top languages per year with “Most Popular” and “Second Most Popular” ranks.
- Interactive slider allows filtering by year.
- Overall Distribution Bar Chart
- Shows total number of countries teaching each language across all years.
This project requires Python and the following packages:
- Python 3.10+ – Recommended version for compatibility.
- pandas – For data manipulation and cleaning.
- plotly – For interactive visualizations.
- openpyxl – To read Excel files (.xlsx format).
- dash – For building interactive dashboards.
You can install all dependencies using pip:
pip install pandas plotly openpyxl dash
- Clone the repository:
git clone <repo_url>
cd duolingo_user_language_analysis
- Add the Excel data file:
- Ensure the file duolingo_language_report_2020_2025.xlsx is placed in the project folder.
- Install dependencies (if not already installed):
pip install pandas plotly openpyxl dash
- Run the Dash app:
python duolingo_user_language_analysis.py
- View the dashboard:
- The dashboard will automatically open in your default web browser.
- Use the slider to filter by year and explore the top languages per country.