An end-to-end project using Python, Jupyter Notebook, Excel, MySQL, Pandas, NumPy, and Power BI to analyze website traffic and Google AdWords data. This project transforms raw keyword-level data into a structured relational model to uncover insights for SEO, CPC trends, and digital marketing optimization.
- Project Objective
- Tools Used
- Project Files
- Project Tree
- Workflow
- Data Model Overview
- Key Features
- License
- Contributing
- Author
To convert raw AdWords and website traffic data into a clean, structured dataset that enables analysis of:
- Keyword ranking trends and performance
- CPC, competition, and keyword difficulty
- Traffic share and keyword cost-effectiveness
- Budget optimization for paid ads and SEO strategy
| Tool/Library | Purpose |
|---|---|
| Python | Assign keyword_id, create fact and keyword dimension tables |
| Jupyter Notebook | Interactive Python code and data processing |
| Pandas | Data manipulation and cleaning |
| NumPy | Numerical transformation support |
| Excel | Create competition and keyword_difficulty dimension tables |
| MySQL | Define fact table structure first, then import data & enforce relations |
| Power BI | Build dashboards, model schema, and use DAX for reporting |
- Raw Source
  - `data/raw/traffic_data_RAW.xls`: Original keyword and traffic data export
- Fact Table
  - `data/final/website_traffic_data.csv`: Keyword-level traffic metrics (Python-generated)
- Dimension Tables
  - `data/final/keyword.csv`: Keyword ID and text (Python-generated)
  - `data/final/competition.csv`: Competition scores (Excel-generated)
  - `data/final/keyword_difficulty.csv`: Difficulty ratings (Excel-generated)
- `notebooks/assaign_keyword_ID.ipynb`: Python notebook for:
  - Assigning `keyword_id`s
  - Generating `data/final/website_traffic_data.csv` and `data/final/keyword.csv`
- `sql/traffic_data_script.sql`: SQL script to:
  - Apply primary/foreign keys
  - Finalize schema relationships after importing all data
website_traffic_google_ADword_analysis/
├── configs/
│   └── db_config.yaml                      # Database connection details
│
├── data/
│   ├── raw/                                # Original, unprocessed data
│   │   └── traffic_data_RAW.xls
│   ├── interim/                            # Temporary/cleaned intermediate files
│   └── final/                              # Final datasets for analysis
│       ├── competition.csv
│       ├── keyword_difficulty.csv
│       ├── keyword.csv
│       └── website_traffic_data.csv
│
├── notebooks/
│   └── assaign_keyword_ID.ipynb            # Exploratory analysis & preprocessing
│
├── reports/
│   ├── dashboards/                         # Power BI or visualization dashboards
│   │   └── website traffic data.pbix
│   ├── figures/                            # Charts, plots & model diagrams
│   │   ├── competition.png
│   │   ├── excel_raw_data.png
│   │   ├── keyword_difficulty.png
│   │   ├── keyword.png
│   │   ├── mysql_table_relation.png
│   │   ├── power_bi_modeling.png
│   │   ├── traffic_data_dashboard.png
│   │   └── website_traffic_data.png
│   └── summary_reports/
│       └── website_traffic_adwords_report.pdf
│
├── scripts/                                # Automation & data processing scripts
│
├── sql/
│   └── traffic_data_script.sql             # SQL queries & schema
│
├── .gitignore                              # Ignore unnecessary files for Git
├── LICENCE                                 # License details
├── README.md                               # Project overview & setup instructions
└── requirements.txt                        # Python dependencies
def keyword_id(text):
    # Map a keyword phrase to a topic ID by substring match.
    # Branches are checked in order, so the first match wins;
    # common misspellings (e.g. "srum", "itl") are matched on purpose.
    if "scrum" in text or "csm" in text or "smum" in text or "srum" in text:
        return 1
    elif "amazon" in text or "aws" in text or "devops" in text:
        return 2
    elif (
        "pmp" in text
        or "project management" in text
        or "pmi" in text
        or "proyectos" in text
        or "it project" in text
    ):
        return 3
    elif "cloud" in text:
        return 4
    elif "capm" in text:
        return 5
    elif "cspo" in text:
        return 6
    elif "itil" in text or "itl" in text:
        return 7
    elif (
        "simpl" in text
        or "simli" in text
        or "simpi" in text
        or "smpli" in text
        or "simi" in text
        or "sipli" in text
    ):
        return 8
    elif "safe" in text or "scale" in text:
        return 9
    elif "togaf" in text or "udacity" in text or "it architect" in text:
        return 10
    # No branch matched: the function returns None implicitly.


df["Keyword ID"] = df["Keyword"].apply(keyword_id)
df.head(10)

- Extract and clean `keyword` names alongside IDs
def keyword(value):
    # Translate a keyword ID back into its display name.
    if value == 1:
        return 'Scrum Master'
    elif value == 2:
        return 'AWS'
    elif value == 3:
        return 'PMP'
    elif value == 4:
        return 'Cloud'
    elif value == 5:
        return 'CAPM'
    elif value == 6:
        return 'CSPO'
    elif value == 7:
        return 'ITIL'
    elif value == 8:
        return 'Simplilearn'
    elif value == 9:
        return 'SAFe'
    elif value == 10:
        return 'TOGAF'
    else:
        # Covers unmatched rows, where the ID is missing.
        return 'No Value'


website_traffic_data['Keyword'] = website_traffic_data['Keyword ID'].apply(keyword)
website_traffic_data.head(10)

- Clean and format data using Pandas and NumPy
- Export:
  - `data/final/website_traffic_data.csv`
  - `data/final/keyword.csv`
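
The two if/elif chains above encode the same ten topics twice: once as substring patterns and once as display names. As a hedged alternative, both could be driven from one ordered rule list. The sketch below is not the notebook's code: `RULES`, `NAMES`, `assign_keyword_id`, and `keyword_name` are hypothetical names, and lowercasing the input is an added assumption that the original function does not make.

```python
# Hypothetical single source of truth: (ID, display name, substrings to match).
# Order matters: the first rule with a matching substring wins,
# which mirrors the order of the if/elif branches.
RULES = [
    (1, "Scrum Master", ("scrum", "csm", "smum", "srum")),
    (2, "AWS", ("amazon", "aws", "devops")),
    (3, "PMP", ("pmp", "project management", "pmi", "proyectos", "it project")),
    (4, "Cloud", ("cloud",)),
    (5, "CAPM", ("capm",)),
    (6, "CSPO", ("cspo",)),
    (7, "ITIL", ("itil", "itl")),
    (8, "Simplilearn", ("simpl", "simli", "simpi", "smpli", "simi", "sipli")),
    (9, "SAFe", ("safe", "scale")),
    (10, "TOGAF", ("togaf", "udacity", "it architect")),
]

# ID -> display name lookup derived from the same rules.
NAMES = {kid: name for kid, name, _ in RULES}


def assign_keyword_id(text):
    """Return the ID of the first matching rule, or None if nothing matches."""
    text = str(text).lower()  # assumption: match case-insensitively
    for kid, _name, patterns in RULES:
        if any(pattern in text for pattern in patterns):
            return kid
    return None


def keyword_name(kid):
    """Return the display name for an ID, or 'No Value' for unknown IDs."""
    return NAMES.get(kid, "No Value")
```

With a DataFrame, this would plug in the same way as the original functions, for example `df["Keyword"].apply(assign_keyword_id)`. Keeping patterns and names in one list means a new topic only has to be added in one place.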
- Use Excel formulas (VLOOKUP, XLOOKUP, SUMIF) to create:
  - `data/final/competition.csv`
  - `data/final/keyword_difficulty.csv`
- Create the `website_traffic_data` table structure first in MySQL to avoid data mismatch
CREATE TABLE website_traffic_data (
    title VARCHAR(255) NOT NULL,
    keyword_id INT NOT NULL,
    position INT NOT NULL,
    previous_position INT NOT NULL,
    last_seen DATE NOT NULL,
    search_volume INT NOT NULL,
    cost_per_click DECIMAL(10, 2) NOT NULL,
    traffic INT NOT NULL,
    traffic_percent DECIMAL(10, 2) NOT NULL,
    traffic_cost INT NOT NULL,
    traffic_cost_percent DECIMAL(10, 2) NOT NULL,
    competition DECIMAL(10, 2) NOT NULL,
    number_of_results INT NOT NULL,
    keyword_difficulty INT NOT NULL
);

- Import all `.csv` files:
  - `data/final/website_traffic_data.csv`
  - `data/final/keyword.csv`
  - `data/final/competition.csv`
  - `data/final/keyword_difficulty.csv`
- Run `sql/traffic_data_script.sql` to:
  - Apply primary keys to dimension tables

ALTER TABLE competition
    ADD CONSTRAINT pk_competition PRIMARY KEY (`keyword ID`);
ALTER TABLE keyword_difficulty
    ADD CONSTRAINT pk_keyword_difficulty PRIMARY KEY (`keyword ID`);
ALTER TABLE keywords
    ADD CONSTRAINT pk_keywords PRIMARY KEY (`keyword ID`);

  - Add foreign key constraints to relate tables

ALTER TABLE website_traffic_data
    ADD CONSTRAINT fk_competition FOREIGN KEY (keyword_id) REFERENCES competition(`keyword ID`);
ALTER TABLE website_traffic_data
    ADD CONSTRAINT fk_keyword_difficulty FOREIGN KEY (keyword_id) REFERENCES keyword_difficulty(`keyword ID`);
ALTER TABLE website_traffic_data
    ADD CONSTRAINT fk_keywords FOREIGN KEY (keyword_id) REFERENCES keywords(`keyword ID`);

- Use MySQL Workbench ER Diagram to visually validate relationships between fact and dimension tables

- Imported all tables directly from MySQL
- Verified relationships using Power BI's model view
- Ensured correct cardinality and cross-filtering direction
- Modeled using a clean star schema layout for performance and clarity

- Created calculated columns and measures such as:
  - `Average CPC = AVERAGE('traffic_data website_traffic_data'[cost_per_click])`
  - `Total Search Volume = SUM('traffic_data website_traffic_data'[search_volume])`
  - `SUM('traffic_data website_traffic_data'[traffic])`
  - `Total Traffic Cost = SUM('traffic_data website_traffic_data'[traffic_cost])`
  - `Traffic Percent = AVERAGE('traffic_data website_traffic_data'[traffic_percent])`
  - `Calendar = CALENDAR([Min Date],[Max Date])`
  - `Max Date = MAX('traffic_data website_traffic_data'[last_seen])`
  - `Min Date = MIN('traffic_data website_traffic_data'[last_seen])`
- Built dashboards using visuals, slicers, and cards to showcase performance trends and keyword insights
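
For a quick cross-check of the dashboard numbers, the same aggregations as the DAX measures can be reproduced in pandas. This is a sketch under stated assumptions: the rows are made-up sample values, the column names follow the `CREATE TABLE` definition in this README, and `measures` and `calendar` are hypothetical names.

```python
import pandas as pd

# Made-up sample rows; the real values come from website_traffic_data.
fact = pd.DataFrame({
    "cost_per_click": [2.0, 4.0, 6.0],
    "search_volume": [100, 200, 300],
    "traffic_cost": [20, 80, 180],
    "traffic_percent": [1.0, 2.0, 3.0],
    "last_seen": pd.to_datetime(["2020-01-01", "2020-01-03", "2020-01-02"]),
})

measures = {
    "Average CPC": float(fact["cost_per_click"].mean()),
    "Total Search Volume": int(fact["search_volume"].sum()),
    "Total Traffic Cost": int(fact["traffic_cost"].sum()),
    "Traffic Percent": float(fact["traffic_percent"].mean()),
    "Min Date": fact["last_seen"].min(),
    "Max Date": fact["last_seen"].max(),
}

# CALENDAR([Min Date], [Max Date]) produces one row per day, inclusive of both ends.
calendar = pd.date_range(measures["Min Date"], measures["Max Date"], freq="D")

print(measures)
print(len(calendar), "calendar days")
```

If the pandas totals and the Power BI cards disagree on the full dataset, the difference usually points at a filter, a relationship direction, or a data type issue introduced during import.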
| Table Name | Type | Description | Key Field | Created Using |
|---|---|---|---|---|
| `website_traffic_data` | Fact Table | Keyword-level AdWords traffic metrics | `keyword_id` | Python |
| `keyword` | Dimension | Keyword ID and name mapping | `keyword_id` | Python |
| `competition` | Dimension | Keyword competition scores | `keyword_id` | Excel |
| `keyword_difficulty` | Dimension | Keyword difficulty ratings | `keyword_id` | Excel |
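
To illustrate how the star schema is meant to be queried, the fact table can be joined to a dimension on `keyword_id` and then aggregated. The sketch below uses made-up rows and assumes both frames name the key column `keyword_id`; `traffic_by_keyword` is a hypothetical name.

```python
import pandas as pd

# Made-up rows standing in for the fact table and the keyword dimension.
fact = pd.DataFrame({"keyword_id": [1, 1, 2], "traffic": [10, 5, 7]})
keyword = pd.DataFrame({"keyword_id": [1, 2],
                        "keyword": ["Scrum Master", "AWS"]})

# validate="many_to_one" makes pandas raise a MergeError if keyword_id
# is not unique in the dimension, i.e. if the cardinality is wrong.
joined = fact.merge(keyword, on="keyword_id", how="left", validate="many_to_one")

# Aggregate a fact metric by a dimension attribute.
traffic_by_keyword = joined.groupby("keyword")["traffic"].sum().to_dict()
print(traffic_by_keyword)
```

The many-to-one check corresponds to the cardinality that the Power BI model view is expected to show between the fact table and each dimension.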
- Assign and manage keyword IDs using Python
- Build normalized relational structure in MySQL
- Use Excel for additional dimension data
- Apply schema constraints and validate relationships with ER diagrams
- Model and visualize insights in Power BI with custom DAX measures
- Run Python notebook to generate:
  - `data/final/website_traffic_data.csv`
  - `data/final/keyword.csv`
- Create `data/final/competition.csv` and `data/final/keyword_difficulty.csv` in Excel
- In MySQL:
  - Create structure for `website_traffic_data` first
  - Import all `.csv` files
  - Run `sql/traffic_data_script.sql` to define schema and constraints
  - Validate schema with ERD view
- Connect Power BI to MySQL
- Model the data and use DAX to create KPIs and dashboards
This project is licensed under the MIT License.
Contributions are welcome! Please fork the repository and submit a pull request.
Hi, I'm Hemant, a data enthusiast passionate about turning raw data into meaningful business insights.
Let's connect:
- LinkedIn : LinkedIn Profile
- Email : [email protected]





