Yashwanth Kumar Kotla

Data Scientist & Machine Learning Engineer

Turning raw data into predictive models and actionable insights

👋 About Me

Master's student in Data Science at Webster University, building end-to-end machine learning pipelines and deploying predictive models as production-ready web applications. My work spans healthcare analytics, regression modelling, classification, and interactive dashboarding, with a focus on turning complex data into clear, measurable business value.

Hands-on experience across the full data science lifecycle, from EDA and feature engineering through model training, evaluation, and Streamlit deployment. Comfortable working with Python, SQL, and BI tools in fast-paced, project-driven environments.

🛠️ Skills

Category	Technologies
Languages	Python · Java · C · SQL
ML / Data Science	Scikit-Learn · Pandas · NumPy · Linear Regression · Logistic Regression · SVM · Random Forest · XGBoost · Feature Engineering
Data & EDA	Exploratory Data Analysis · Data Cleaning · Correlation Analysis · Seaborn · Matplotlib
Visualisation & BI	Tableau · Power BI · Streamlit · HTML5 · CSS3
Deployment & Tools	Streamlit · Joblib / Pickle · Docker · Jupyter Notebook · Git · GitHub

🚀 Featured Projects

1. Medical Insurance Cost Prediction

Regression · Streamlit Deployment · End-to-End ML Pipeline

Item	Detail
Problem	Predict individual medical insurance charges based on demographic & lifestyle features
Approach	Full pipeline — EDA → feature encoding → Linear Regression → evaluation → Streamlit deployment
Key Insight	Smoking status is the dominant cost driver (~3–4× charge multiplier); age and BMI are secondary predictors
Metrics	R² ≈ 0.77 · MAE ≈ $4,300 · RMSE ≈ $6,200
Stack	Python · Pandas · Scikit-Learn · Matplotlib · Seaborn · Streamlit
Live App	🚀 medicalinsurancecostpredict.streamlit.app
Repo	Medical-insurance-cost-prediction
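
The three metrics in the table above can be computed directly from actual and predicted charges. The following is a minimal sketch in plain Python, not the project's own code: the function name `regression_metrics` is hypothetical, and the sample values are made up for illustration rather than taken from the insurance dataset.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired lists of actual and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse

# Illustrative values only (hypothetical charges in dollars)
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]
r2, mae, rmse = regression_metrics(actual, predicted)
```

MAE and RMSE are both in the units of the target (dollars here). RMSE penalises large errors more heavily, which is why it is always at least as large as MAE.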

2. Diabetes Prediction & Diagnostic Analysis

Classification · Healthcare Analytics · EDA + Tableau Dashboard

Item	Detail
Problem	Diagnostically predict whether a patient has diabetes using the Pima Indians dataset
Approach	Data cleaning → EDA → feature selection → classification model training → Tableau dashboard for stakeholder reporting
Key Insight	Glucose level and BMI are the strongest predictive features; class imbalance required careful handling
Stack	Python · Scikit-Learn · Pandas · Seaborn · Tableau
Repo	https://github.com/Yashwanth-Kumar-Kotla/Diabetes-Prediction-using-Machine-Learning-with-Python
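
The table above does not say how the class imbalance was handled. One common option is inverse-frequency class weighting, sketched below in plain Python as an assumption rather than a description of the project. The function name `balanced_class_weights` and the label list are hypothetical; the formula, `n_samples / (n_classes * class_count)`, is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical labels: 6 non-diabetic (0) and 2 diabetic (1) patients
labels = [0] * 6 + [1] * 2
weights = balanced_class_weights(labels)
```

The minority class receives the larger weight, so misclassifying a diabetic patient costs the model more during training than misclassifying a non-diabetic one.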

3. Sales Data Analysis & Reporting

EDA · Business Intelligence · Automated Reporting

Item	Detail
Problem	Analyse monthly sales data to surface trends, anomalies, and actionable KPIs for the business team
Approach	End-to-end analysis — ingestion → cleaning → aggregation → visualisation → executive-ready report
Key Insight	Identified seasonal demand patterns and underperforming product segments; automated ETL reduced reporting overhead
Stack	Python · Pandas · Matplotlib · Excel · Power BI
Repo	https://github.com/Yashwanth-Kumar-Kotla/Linear_regression_on_AdvertisingSales_without_sklearn
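
The aggregation step in the approach above can be sketched as follows. This is a minimal illustration in plain Python, not the project's code: the record layout (ISO date string, amount), the function names, and the sample figures are all assumptions.

```python
from collections import defaultdict

def monthly_totals(records):
    """Sum sale amounts per month; each record is (ISO date string, amount)."""
    totals = defaultdict(float)
    for date, amount in records:
        totals[date[:7]] += amount  # "YYYY-MM" prefix identifies the month
    return dict(sorted(totals.items()))

def month_over_month(totals):
    """Fractional change versus the previous month (assumes nonzero totals)."""
    months = list(totals)
    return {
        cur: (totals[cur] - totals[prev]) / totals[prev]
        for prev, cur in zip(months, months[1:])
    }

# Hypothetical sales records
records = [
    ("2024-01-05", 100.0),
    ("2024-01-20", 100.0),
    ("2024-02-03", 250.0),
    ("2024-03-11", 125.0),
]
totals = monthly_totals(records)
growth = month_over_month(totals)
```

Large swings in the month-over-month figures are a simple first signal for the trends and anomalies the analysis is meant to surface.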

📊 ML Pipeline — How I Work

┌──────────────┐    ┌──────────────┐    ┌──────────────────┐
│  Data        │───▶│  EDA &       │───▶│  Feature         │
│  Ingestion   │    │ Visualisation│    │  Engineering     │
└──────────────┘    └──────────────┘    └────────┬─────────┘
                                                 │
                                                 ▼
┌──────────────┐    ┌──────────────┐    ┌──────────────────┐
│  Deployment  │◀───│  Evaluation  │◀───│  Model Training  │
│  (Streamlit) │    │  (R² / MAE)  │    │  (Train/Test)    │
└──────────────┘    └──────────────┘    └──────────────────┘

Every project follows this structured sequence, which keeps the work reproducible, interpretable, and ready for production.

🎯 Currently Learning & Goals

Now	Next 6 Months
Deep Learning fundamentals (CNNs, RNNs)	Build & deploy a multi-model comparison framework
Advanced feature engineering techniques	Explore NLP pipelines (text classification, sentiment)
SQL query optimisation & window functions	Pursue AWS or Google Cloud ML certification
SHAP / model explainability methods	Contribute to open-source data science tools

Target roles: Data Scientist · ML Engineer · Analytics Engineer

📬 Contact

Channel	Link
Email	kotla.yashwanthkumar@gmail.com
LinkedIn	linkedin.com/in/kotlayashwanthkumar
GitHub	github.com/Yashwanth-Kumar-Kotla
Twitter / X	x.com/@yashkotla

Open to internships, graduate roles, and project collaborations in Data Science & Machine Learning. Feel free to reach out.