Hi folks!
We are pleased to announce that Auto-Sklong is now available, together with its official paper, under the tag 0.0.9 🎉
❝ In a nutshell, what's Auto-Sklong?
📽️ Auto-Sklong (short for Auto-Scikit-Longitudinal) is built on @_gama (by @PGijsbers & @amore-labs), @scikit-longitudinal (our own Sklearn-like library of primitives for longitudinal classification tasks), and @_smac3 for Bayesian optimization (by the @automl team). It draws inspiration from longitudinal research by top-notch researchers such as Dr. Caio Ribeiro (@caioedurib), Dr. Tossapol Pomsuwan (@mastervii), Dr. Sergey Ovchinnik (@SergeyOvchinnik), and Dr. Fernando Otero (@febo). But really, what does it do? 👇👇👇
💡 Auto-Sklong is an open-source Python system that is among the first to explore the intersection of automated machine learning (AutoML) and longitudinal health-related data classification. It solves the Combined Algorithm Selection and Hyperparameter (CASH) optimization problem for longitudinal datasets, in which features are repeatedly measured at different time points (often called waves), a setting common in health and biomedical fields. Standard AutoML systems require flattening longitudinal data and so lose temporal insights (longitudinal data is similar to time-series data in some ways, but it is not the same thing). Auto-Sklong instead includes both data transformation and algorithm adaptation approaches, and evaluates pipelines with longitudinal-aware components, aiming to enhance both predictive accuracy and explainability!
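To make the "flattening loses temporal insights" point concrete, here is a minimal sketch in plain Python. It assumes a toy dataset with two waves per feature; the column names, the values, and the names `features_group`, `non_longitudinal`, and `flatten_by_mean` are made up for illustration and are not Auto-Sklong's API.

```python
from statistics import mean

# Toy longitudinal table (illustrative values only): one row per subject.
# Columns: smoke_w1, smoke_w2, bmi_w1, bmi_w2, age
rows = [
    [0, 1, 22.0, 24.0, 61],  # subject A: values rise from wave 1 to wave 2
    [1, 0, 24.0, 22.0, 61],  # subject B: same values, opposite trend
]

# Assumed layout: for each longitudinal feature, its column indices ordered by wave.
features_group = [[0, 1], [2, 3]]
# Columns measured only once (no waves).
non_longitudinal = [4]


def flatten_by_mean(row, groups, static):
    """Collapse every group of waves into its mean and keep static columns as-is."""
    aggregated = [mean(row[i] for i in group) for group in groups]
    return aggregated + [row[i] for i in static]


flat_rows = [flatten_by_mean(r, features_group, non_longitudinal) for r in rows]

# Both subjects end up with the same flattened representation,
# so the direction of change across waves is no longer visible.
print(flat_rows[0] == flat_rows[1])
```

Mean aggregation is only one possible flattening; the point of the sketch is that any order-insensitive summary hides the wave ordering that longitudinal-aware components are designed to use.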
With Auto-Sklong, we address the challenge of manually selecting the best pipeline for temporal data by offering automated search across:
- Search Methods: 4 strategies including Bayesian Optimization (via SMAC3), Asynchronous Successive Halving, Evolutionary Algorithms, and Random Search (via GAMA).
- Data Preparation: Utilities like MerWavTimePlus to preserve temporal structure for longitudinal methods.
- Data Transformation: 10 flattening methods (e.g., aggregation with mean/median, MerWavTimeMinus, SepWav with voting/stacking) for standard ML compatibility.
- Preprocessing: Longitudinal-aware feature selection like ExhaustiveCFSPerGroup, or standard CFS.
- Estimators: 11 classifiers, including 5 specialized longitudinal ones (e.g., LexicoRandomForestClassifier, NestedTreesClassifier, LexicoDeepForestClassifier) and 6 traditional (e.g., Random Forest, Gradient Boosting) from Scikit-learn.
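For readers new to CASH, the idea behind the simplest of these strategies (Random Search) can be sketched in a few lines of plain Python: sample an algorithm, sample its hyperparameters, score the result, keep the best. Everything below is a hypothetical stand-in: the tiny `SEARCH_SPACE`, the function names, and especially `toy_evaluate`, which is a placeholder objective and not a real model evaluation. It is not how Auto-Sklong, GAMA, or SMAC3 implement their search.

```python
import random

# Hypothetical, heavily simplified search space: algorithm -> hyperparameter choices.
SEARCH_SPACE = {
    "random_forest": {"n_estimators": [50, 100, 200], "max_depth": [3, 5, 10]},
    "gradient_boosting": {"n_estimators": [50, 100], "learning_rate": [0.01, 0.1]},
}


def sample_configuration(rng):
    """Pick an algorithm first, then one value for each of its hyperparameters."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    params = {
        name: rng.choice(values)
        for name, values in SEARCH_SPACE[algorithm].items()
    }
    return algorithm, params


def random_search(evaluate, n_trials, seed=0):
    """Return the best (algorithm, params) pair found and its score."""
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(n_trials):
        algorithm, params = sample_configuration(rng)
        score = evaluate(algorithm, params)
        if score > best_score:
            best_score, best_config = score, (algorithm, params)
    return best_config, best_score


def toy_evaluate(algorithm, params):
    # Placeholder objective with no relation to real predictive performance.
    # In practice this would be a cross-validated score of the fitted pipeline.
    return -abs(params["n_estimators"] - 100) / 100.0


best_config, best_score = random_search(toy_evaluate, n_trials=20)
print(best_config, best_score)
```

The algorithm and its hyperparameters are sampled jointly because the hyperparameters that exist depend on which algorithm was picked; that dependency is what makes CASH a combined problem rather than two separate ones.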
Maybe a look at the search space won't hurt!
[Figure: Auto-Sklong search space]
Not Enough? ❞
🗞️ The scientific paper (published by IEEE) is available at: https://doi.org/10.1109/BIBM62325.2024.10821737
More is coming, too: explore our GitHub issues, read through our README, and check our documentation! We've also enhanced the tutorials in the docs for a quick overview (PS: there is even a generated podcast about the paper!).
Open-Source Contribution, More Than Welcome! ❞
We hope this motivates you to contribute your own search methods, preprocessors, data transformation techniques and more! If we could achieve 1% of what @scikit-learn achieved 10 years ago (back in France 🇫🇷) for the global machine learning community, it'd be just insane!
So please share your suggestions! Without external input, how can we ensure we're advancing longitudinal AutoML workflows? 👀 New primitives from external contributors are welcome: simply open an issue to discuss.
❝ I guess it's now time for the tech-ish changelog!
🫵
https://pypi.org/project/auto-sklong/
[v0.0.9] - 2025-08-02 - BIBM Paper Publication and Early-Beta Transition
Added
- BIBM paper integration: Added links, badges, and references to the published paper (DOI: 10.1109/BIBM62325.2024.10821737) – core commit for release.
- Tutorial improvements in documentation – commit yesterday.
- GAMA inclusion for enhanced search methods – commit 13 hours ago.
- Document-dates plugin for docs – inferred from similar updates.
- Blurry tabs styling for docs – inferred from similar updates.
Enhanced
- Documentation overhaul: Revamped home page, improved wording, import examples, and docstrings – commits yesterday.
- Updated BIBM paper links and temporal dependency references – commits 7-8 hours ago.
- Improved authorship, PyPI README, and links – commits 13 hours ago.
- Updated development requirements (e.g., package versions) and uv.lock – commit on Jan 15, 2025.
- Enhanced tutorial and README – commits 14 hours ago and yesterday.
Resolved
- Fixed PyPI links and setup.py – commits 13 hours ago.
- Addressed post-acceptance tweaks in minor releases (e.g., since 0.0.4) – various commits.
Note: This changelog covers advancements since 0.0.4. For prior details, see the expanded history below.