Turning raw data into predictive models and actionable insights
Master's student in Data Science at Webster University, building end-to-end machine learning pipelines and deploying predictive models as production-ready web applications. My work spans healthcare analytics, regression modelling, classification, and interactive dashboarding, with a focus on turning complex data into clear, measurable business value.
Hands-on experience across the full data science lifecycle, from EDA and feature engineering through model training, evaluation, and Streamlit deployment. Comfortable working with Python, SQL, and BI tools in fast-paced, project-driven environments.
| Category | Technologies |
|---|---|
| Languages | Python · Java · C · SQL |
| ML / Data Science | Scikit-Learn · Pandas · NumPy · Linear Regression · Logistic Regression · SVM · Random Forest · XGBoost · Feature Engineering |
| Data & EDA | Exploratory Data Analysis · Data Cleaning · Correlation Analysis · Seaborn · Matplotlib |
| Visualisation & BI | Tableau · Power BI · Streamlit · HTML5 · CSS3 |
| Deployment & Tools | Streamlit · Joblib / Pickle · Docker · Jupyter Notebook · Git · GitHub |
Regression · Streamlit Deployment · End-to-End ML Pipeline
| Item | Detail |
|---|---|
| Problem | Predict individual medical insurance charges based on demographic & lifestyle features |
| Approach | Full pipeline: EDA → feature encoding → Linear Regression → evaluation → Streamlit deployment |
| Key Insight | Smoking status is the dominant cost driver (~3–4× charge multiplier); age and BMI are secondary predictors |
| Metrics | R² ≈ 0.77 · MAE ≈ $4,300 · RMSE ≈ $6,200 |
| Stack | Python · Pandas · Scikit-Learn · Matplotlib · Seaborn · Streamlit |
| Live App | medicalinsurancecostpredict.streamlit.app |
| Repo | Medical-insurance-cost-prediction |
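
To make the Metrics row easier to read, here is a minimal sketch of how R², MAE, and RMSE are computed from actual and predicted values. The function name `regression_metrics` and the sample charges are hypothetical and chosen for illustration only; they are not taken from the project or its dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute error, in the same units as the target (dollars here).
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error; penalises large misses more.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R²: share of the target's variance explained by the predictions.
    # Note: this divides by zero if every value in y_true is identical.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Hypothetical charges in dollars, for illustration only (not project data).
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]

r2, mae, rmse = regression_metrics(actual, predicted)
print(f"R2={r2:.3f}  MAE={mae:.1f}  RMSE={rmse:.1f}")
```

Read this way, an MAE of about $4,300 means the model's predictions miss the actual charge by roughly that amount on average, while the higher RMSE suggests that some individual errors are considerably larger.
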
Classification · Healthcare Analytics · EDA + Tableau Dashboard
| Item | Detail |
|---|---|
| Problem | Diagnostically predict whether a patient has diabetes using the Pima Indians dataset |
| Approach | Data cleaning → EDA → feature selection → classification model training → Tableau dashboard for stakeholder reporting |
| Key Insight | Glucose level and BMI are the strongest predictive features; class imbalance required careful handling |
| Stack | Python · Scikit-Learn · Pandas · Seaborn · Tableau |
| Repo | https://github.com/Yashwanth-Kumar-Kotla/Diabetes-Prediction-using-Machine-Learning-with-Python |
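
The table above does not say how the class imbalance was handled. One common option, shown here purely as an assumption, is to weight each class inversely to its frequency so that the minority class counts as much as the majority class during training. The helper name `balanced_class_weights` and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy labels for illustration only: six negatives (0) and two positives (1).
labels = [0] * 6 + [1] * 2

weights = balanced_class_weights(labels)
for cls in sorted(weights):
    print(f"class {cls}: weight {weights[cls]:.3f}")
```

With these weights, every class contributes the same total weight (count × weight) to the loss. In scikit-learn, passing `class_weight="balanced"` to an estimator such as `LogisticRegression` applies the same formula.
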
EDA · Business Intelligence · Automated Reporting
| Item | Detail |
|---|---|
| Problem | Analyse monthly sales data to surface trends, anomalies, and actionable KPIs for the business team |
| Approach | End-to-end analysis: ingestion → cleaning → aggregation → visualisation → executive-ready report |
| Key Insight | Identified seasonal demand patterns and underperforming product segments; automated ETL reduced reporting overhead |
| Stack | Python · Pandas · Matplotlib · Excel · Power BI |
| Repo | https://github.com/Yashwanth-Kumar-Kotla/Linear_regression_on_AdvertisingSales_without_sklearn |
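
As a rough illustration of the aggregation and anomaly steps listed above, the sketch below sums sales per month and flags months that sit far from the average. It uses only the standard library rather than the Pandas stack named in the table, and the z-score rule, the threshold, the function names, and the sample rows are all assumptions made for this example, not the project's actual method or data.

```python
from collections import defaultdict
from statistics import mean, pstdev


def monthly_totals(rows):
    """Sum amounts per 'YYYY-MM' month from (iso_date, amount) pairs."""
    totals = defaultdict(float)
    for iso_date, amount in rows:
        totals[iso_date[:7]] += amount  # 'YYYY-MM-DD' -> 'YYYY-MM'
    return dict(sorted(totals.items()))


def flag_anomalies(totals, z_threshold=1.5):
    """Return months whose total is more than z_threshold std devs from the mean."""
    values = list(totals.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all months identical, nothing to flag
    return [month for month, value in totals.items()
            if abs(value - mu) / sigma > z_threshold]


# Made-up sales rows for illustration only.
rows = [
    ("2024-01-05", 100.0), ("2024-01-20", 110.0),
    ("2024-02-11", 205.0),
    ("2024-03-02", 200.0),
    ("2024-04-15", 215.0),
    ("2024-05-09", 600.0),
]

totals = monthly_totals(rows)
anomalies = flag_anomalies(totals)
print(totals)
print("Anomalous months:", anomalies)
```

A z-score rule is a blunt instrument for short or strongly seasonal series; comparing each month with the same month in earlier years is often a better fit when seasonal demand patterns are expected.
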
Data Ingestion → EDA & Visualisation → Feature Engineering → Model Training (Train/Test) → Evaluation (R² / MAE) → Deployment (Streamlit)
Every project follows this structured loop, ensuring reproducibility, interpretability, and production readiness.
| Now | Next 6 Months |
|---|---|
| Deep Learning fundamentals (CNNs, RNNs) | Build & deploy a multi-model comparison framework |
| Advanced feature engineering techniques | Explore NLP pipelines (text classification, sentiment) |
| SQL query optimisation & window functions | Pursue AWS or Google Cloud ML certification |
| SHAP / model explainability methods | Contribute to open-source data science tools |
Target roles: Data Scientist · ML Engineer · Analytics Engineer
| Channel | Link |
|---|---|
| Email | kotla.yashwanthkumar@gmail.com |
| LinkedIn | linkedin.com/in/kotlayashwanthkumar |
| GitHub | github.com/Yashwanth-Kumar-Kotla |
| Twitter / X | x.com/@yashkotla |
Open to internships, graduate roles, and project collaborations in Data Science & Machine Learning. Feel free to reach out.