Bastien Dussap (@BastienDussap)

ML Engineer / Data Scientist

About Me

I'm an ML Engineer / Data Scientist at Metafora Biosystems, a biotechnology company based at Cochin Hospital (Paris 14). I work on METAflow, a novel AI-powered tool for flow cytometry analysis.

What I do:

Develop machine learning algorithms and data processing pipelines in Python for cytometry analysis
Build and maintain production-ready REST APIs using the Django framework
Design and deploy ML models on Google Cloud Platform (GCP), following the IEC 62304 standard for medical device software
Collaborate directly with users to gather feedback and translate requirements into actionable development tickets
Participate in Agile/Scrum workflows, including sprint planning and backlog management

Former PhD student in Machine Learning / Statistics at Université Paris-Saclay, affiliated with the Institut de Mathématiques d'Orsay and part of the Datashape team at INRIA, under the supervision of Gilles Blanchard and Marc Glisse.

Technical Skills

Languages & Frameworks

ML & Data Science

Libraries: NumPy, Pandas, Scikit-learn, PyTorch, Matplotlib
MLOps: Git, Docker basics, REST API development
Cloud: Google Cloud Platform (GCP) - Compute Engine, Cloud Storage
Methodologies: Agile/Scrum, IEC 62304 (medical device software)

Tools & Platforms

Version Control: Git
Documentation: LaTeX, Zotero, Markdown
OS: Linux, Windows (WSL)
AI tools: Claude, Copilot

Currently Learning

AWS Machine Learning Engineer certification
Docker & Kubernetes for ML deployment
Advanced MLOps patterns

PhD Research

Thesis: A Unified Framework for Label Shift Quantification

My doctoral research focused on quantification learning applied to cytometric datasets, particularly in the context of Metafora's METAflow software.

Key contributions:

Developed methods for automatic analysis of flow cytometry data using machine learning
Leveraged Reproducing Kernel Hilbert Spaces (RKHS) to embed and store high-dimensional features
Created transfer learning techniques to analyze new samples based on previously analyzed ones
Built frameworks to estimate population proportions in new samples

Read my thesis

Publications & Recognition

Best Paper Award

"Label Shift Quantification with Robust Guarantees via Distribution Feature Matching"
with G. Blanchard and B.-E. Chérief-Abdellatif

Best Student Paper Award - Research Track at ECML/PKDD 2023
ArXiv preprint
Conference proceedings

Abstract: We propose a unified framework based on distribution feature matching that recovers estimators from both classification-based and statistical mixture modeling approaches to quantification learning. We provide robust theoretical guarantees under label shift and investigate misspecification scenarios.
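
To make the idea concrete, here is a toy sketch of distribution feature matching under label shift. It is an illustration only, not the implementation from the paper or from METAflow: the random Fourier feature map, the projected-gradient solver, the synthetic Gaussian data, and every function name below are assumptions chosen for brevity. The class proportions of an unlabeled target sample are estimated by matching its mean feature vector to a mixture of the per-class mean feature vectors of a labeled source sample.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def rff(X, W, b):
    """Random Fourier features approximating a Gaussian kernel (illustrative choice)."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def estimate_proportions(feat_src, y_src, feat_tgt, n_classes, n_iter=500):
    """Minimize 0.5 * ||M @ alpha - t||^2 over the simplex by projected gradient."""
    # Columns of M are the per-class mean feature vectors of the source sample.
    M = np.stack([feat_src[y_src == k].mean(axis=0) for k in range(n_classes)], axis=1)
    t = feat_tgt.mean(axis=0)                # mean feature vector of the target sample
    G = M.T @ M
    step = 1.0 / np.linalg.eigvalsh(G)[-1]   # 1 / Lipschitz constant of the gradient
    alpha = np.full(n_classes, 1.0 / n_classes)
    for _ in range(n_iter):
        grad = G @ alpha - M.T @ t
        alpha = project_to_simplex(alpha - step * grad)
    return alpha

# Synthetic data: two Gaussian classes, balanced source, 80/20 target.
rng = np.random.default_rng(0)
centers = np.array([[-2.0, 0.0], [2.0, 0.0]])
X_src = np.vstack([rng.normal(centers[k], 1.0, size=(500, 2)) for k in range(2)])
y_src = np.repeat([0, 1], 500)
X_tgt = np.vstack([
    rng.normal(centers[0], 1.0, size=(800, 2)),
    rng.normal(centers[1], 1.0, size=(200, 2)),
])

# Shared random feature map (bandwidth 1, 200 features; both are arbitrary here).
W = rng.normal(size=(2, 200))
b = rng.uniform(0.0, 2.0 * np.pi, size=200)

alpha_hat = estimate_proportions(rff(X_src, W, b), y_src, rff(X_tgt, W, b), n_classes=2)
print(alpha_hat)  # estimated target class proportions, expected to be close to [0.8, 0.2]
```

The estimate depends only on mean feature vectors, so the labeled source sample can be summarized once and reused for each new target sample.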

Research Interests

Machine Learning: Kernel methods, transfer learning, statistical learning theory
Label Shift & Quantification Learning: Distribution matching, robust estimation
Kernel Mean Embedding: RKHS methods, feature representations
Applications: Flow cytometry analysis, biomedical data processing
MLOps: Model deployment, API development, production systems

Selected Talks

Conference Presentations

ECML/PKDD 2023 - Turin, Italy
Label Shift Quantification with Robust Guarantees via Distribution Feature Matching
🏆 Research Track – Best Student Paper Award

Invited Seminars

Journées de Statistique - Société Française de Statistique, 2023
DataShape Seminar - INRIA, 2023
Workshop FAST-BIG - Efficient Statistical Testing for High-Dimensional Models
Séminaire des doctorants - Institut de Mathématiques d'Orsay, 2023

Teaching & Service

Seminar Organization

Co-organizer of the Master's seminar in Statistics and Machine Learning at Université Paris-Saclay (2022-2024)

Teaching

Teaching Fellow - IUT Sceaux (2022-2023)
Mathematics for Management - L1 B.U.T GEA, taught by Prof. Patrick Pamphile
