Commit 2394b05
Refine README content for clarity and updates
Updated various sections for clarity and consistency, including expertise and skills. Added new project details and refined the closing remarks.
1 parent 8f6a905 commit 2394b05

1 file changed: +26 -36 lines changed

README.md

Lines changed: 26 additions & 36 deletions
Original file line number · Diff line number · Diff line change
@@ -1,8 +1,7 @@
 ## Hi there, I’m Diogo Ribeiro 👋
 **Senior Data Scientist • Mathematician • based between the United Kingdom and Portugal**
 
-> “Knowledge is knowing a tomato is a fruit; wisdom is not putting it in a fruit salad.”
->
+> “Knowledge is knowing a tomato is a fruit; wisdom is not putting it in a fruit salad.”
 > — Miles Kington
 
 <p align="center">
@@ -11,7 +10,7 @@
 </a>
 </p>
 
-**I build production systems that turn messy data into decisions. Two decades across logistics, health, and engineering taught me the value of lean models, clean code, and reproducible pipelines. Lately, I’ve been shipping NLP and statistical modelling that helps teams reason about text and time series in real time.**
+I build production systems that turn messy data into decisions. Two decades across logistics, health, and engineering taught me the value of lean models, clean code, and reproducible pipelines. Lately I’ve been shipping NLP and statistical modelling that helps teams reason about text and time series in real time.
 
 <p align="center">
 <img src="data_has_a_better_idea.png"
@@ -26,62 +25,58 @@
 
 - **Machine Learning**
 Supervised & unsupervised learning, anomaly detection, time-series forecasting, optimisation.
-
 - **Graph & Network Analysis**
 Social/interaction networks, graph theory, dynamic metrics, community structure.
-
 - **Big Data Analytics**
 Pattern discovery in marketing, logistics, and urban systems (structured & unstructured data).
-
 - **Mathematical Modelling**
 Differential equations, statistical inference, numerical methods for complex systems.
-
 - **Sustainability & Urban Systems**
 Energy optimisation, smart environments, traffic prediction.
 
 ---
 
 ## 🛠️ Technical Skills
 
-- **Programming** — Python (typed, NumPy-first), SQL, R, TypeScript, Bash/Zsh, C, Fortran
+- **Programming** — Python (typed, NumPy-first), SQL, R, TypeScript, Bash/Zsh, C, Fortran
 - **ML / Data** — NumPy, Pandas, Polars, FireDucks; scikit-learn, XGBoost/LightGBM; PyTorch, TensorFlow; Statsmodels
-_Focus:_ time series, anomaly detection, GLMs/IRLS, robust statistics
-- **Data Eng & Streaming** — Apache Kafka, Flink, Spark, Databricks; Arrow/Parquet; Apache Iceberg lakehouse
-- **Cloud & Storage** — AWS S3, DynamoDB; PostgreSQL, MySQL, SQLite; MongoDB, InfluxDB
-- **DevEx & CI/CD** — Docker; GitHub Actions, Jenkins; Poetry; pre-commit (ruff, mypy, pytest-cov); semantic versioning
+_Focus:_ time series, anomaly detection, GLMs/IRLS, robust statistics
+- **Data Eng & Streaming** — Apache Kafka, Flink, Spark, Databricks; Arrow/Parquet; Apache Iceberg (lakehouse)
+- **Cloud & Storage** — AWS S3, DynamoDB; PostgreSQL, MySQL, SQLite; MongoDB, InfluxDB
+- **DevEx & CI/CD** — Docker; GitHub Actions, Jenkins; Poetry; pre-commit (ruff, mypy, pytest-cov); semantic versioning
 - **Testing & Quality** — pytest, coverage, property-based tests (hypothesis); static typing; security linting (bandit)
 
 ---
 
 ## 🔭 Research Interests
 
-- **Health Data Science** — real-time analytics from wearables/sensors, personalised baselines, clinical interpretability
-- **Graph Theory & Social Networks** — interaction graphs, diffusion/contagion models, community & role discovery
-- **Big Data & Marketing Analytics** — uplift modelling, sequence-aware attribution, lifetime value with drift control
-- **Sustainability & Energy Systems** — demand forecasting, optimisation under constraints, carbon-aware scheduling
-- **Smart Environments & Sensor Networks** — multimodal fusion (RSSI + activations), localisation, reliability modelling
-- **Behavioural & Labour Economics** — micro-behavioural patterns, incentive effects, heterogeneity and fairness
+- **Health Data Science** — real-time analytics from wearables/sensors, personalised baselines, clinical interpretability
+- **Graph Theory & Social Networks** — interaction graphs, diffusion/contagion models, community & role discovery
+- **Big Data & Marketing Analytics** — uplift modelling, sequence-aware attribution, lifetime value with drift control
+- **Sustainability & Energy Systems** — demand forecasting, optimisation under constraints, carbon-aware scheduling
+- **Smart Environments & Sensor Networks** — multimodal fusion (RSSI + activations), localisation, reliability modelling
+- **Behavioural & Labour Economics** — micro-behavioural patterns, incentive effects, heterogeneity and fairness
 - **Inequality & Sustainable Development** — distributional metrics, policy simulation, causal and counterfactual analysis
 
-> **Current themes:** real-time anomaly detection; Bayesian filtering/HMMs for indoor localisation; robust regression & GLMs (IRLS); LLM-assisted reporting with audit trails.
+> **Now:** real-time anomaly detection; Bayesian filtering/HMMs for indoor localisation; robust regression & GLMs (IRLS); LLM-assisted reporting with audit trails; **abx-next** (modern A/B experimentation utilities).
 
 ---
 
 ## 📌 Pinned Projects
 
-<!-- Replace the placeholder links with your repo URLs -->
+- **abx-next** — A/B experimentation utilities: CUPED/CUPAC hooks, triggered analysis, SRM guardrails, switchback helpers, and power simulations.
+👉 [repo](https://github.com/DiogoRibeiro7/abx-next)
 
 - **genSurvPy** — Survival-data generators (AFT/CPHM, censored data), reproducible simulations, and validation utilities.
-👉 [repo](https://github.com/DiogoRibeiro7/genSurvPy) <!-- update if different -->
-
+👉 [repo](https://github.com/DiogoRibeiro7/genSurvPy)
 
 - **smart-todo-action** — GitHub Action that extracts TODOs, groups by semantic labels/tags/metadata, and opens issues/changelogs.
-👉 [repo](https://github.com/DiogoRibeiro7/smart-todo-action) <!-- update if different -->
+👉 [repo](https://github.com/DiogoRibeiro7/smart-todo-action)
 
 - **navier-stokes-solvers** — CFD solvers for the 2D/3D Navier–Stokes equations (finite-difference & spectral variants), with buildable CLI targets and basic tests.
 👉 [repo](https://github.com/DiogoRibeiro7/navier-stokes-solvers)
 
-- **heavytails** — Utilities for heavy-tailed modelling and inference (tail index estimation, Pareto-like fits, EVT-style diagnostics).
+- **heavytails** — Utilities for heavy-tailed modelling and inference (tail index estimation, Pareto-like fits, EVT diagnostics).
 👉 [repo](https://github.com/DiogoRibeiro7/heavytails)
 
 ---
@@ -90,37 +85,29 @@
 
 ### Teaching @ESMAD
 - **Introduction to Logic & Set Theory (First Semester, 15 weeks)** — Logic (prop/FO), sets, induction, **differential & integral calculus**; notes + LaTeX.
-
 - **Linear Algebra (Second Semester, 15 weeks)** — Vector spaces and linear maps; matrices and determinants; eigenvalues/eigenvectors, diagonalisation; orthogonality, projections, Gram–Schmidt; least squares; **SVD and PCA**; numerical stability & conditioning; applications to optimisation and data science.
 Syllabus: _link_ · Slides (Beamer): _link_
-
 - **NLP & LLM mini-workshops** — Prompt design, evals, lightweight retrieval, and report generation with structured → narrative transforms.
 
 ### Seminars & Workshops
-
 - **Data Science Seminars** — End-to-end ML pipelines, feature engineering for time series, evaluation under drift, MLOps (CI/CD, data/versioning), and reproducible research practices.
 Slides: _link_ · Notebooks: _link_
-
 - **Sensors & Dashboards** — IoT data ingestion (MQTT/Kafka), time-series storage (InfluxDB/Parquet), streaming analytics (Flink), and dashboards (Grafana/Plotly/Dash) with alerting & anomaly detection.
 Slides: _link_ · Demo repo: _link_
-
 - **Applications of Matrices to Computational Graphics** — Linear transforms in 2D/3D, homogeneous coordinates, rotations (Euler vs. quaternions), camera models & projections, shading basics; **SVD/PCA** for geometry processing.
 Slides: _link_ · Code samples: _link_
 
-
 ### Selected Writings / Demos
 - **Streaming analytics with Iceberg + Flink + DynamoDB** — Architecture notes and example pipelines.
-
 - **Robust regression with IRLS** — ψ-functions, influence diagnostics, and uncertainty reporting.
-
 - **Time-series anomaly detection** — EWMA variants, adaptive σ, and change-point alerts for sensors.
 
 ---
 
 ## 🌟 Highlights
 
-- Interdisciplinary approach spanning computer science, mathematics, economics, and natural sciences.
-- Practical projects in **IoT**, automation, and environmental monitoring (Raspberry Pi + sensors).
+- Interdisciplinary approach spanning computer science, mathematics, economics, and natural sciences.
+- Practical projects in **IoT**, automation, and environmental monitoring (Raspberry Pi + sensors).
 - Ongoing work in ML for time series, anomaly detection, and robust statistical modelling.
 
 ---
@@ -135,7 +122,8 @@
 
 ---
 
-## 📈 Let’s Connect and Collaborate
+## 📈 Let’s Connect and Collaborate
+
 Thanks for visiting! I’m keen to partner with data enthusiasts, researchers, and product teams. Browse my projects or get in touch—happy to explore ideas and build useful things together.
 
 <div align="center">
@@ -149,5 +137,7 @@ Thanks for visiting! I’m keen to partner with data enthusiasts, researchers, a
 <img src="https://img.shields.io/badge/linkedin-%230077B5.svg?style=for-the-badge&logo=linkedin&logoColor=white" alt="LinkedIn" />
 </a>
 <a href="mailto:[email protected]">
-<img src="https://img.shields.io/badge/Gmail-D14836?logo=gmail&logoColor=white" alt="Email"></a>
+<img src="https://img.shields.io/badge/Gmail-D14836?logo=gmail&logoColor=white" alt="Email">
+</a>
 </div>
+