Releases: klarna-incubator/mleko
v4.3.0
v4.2.0
v4.1.0
v4.1.0 (2024-05-18)
✨ Features
- tuning: Add support for enqueuing trials in `OptunaTuner`. (9e0b6b2)
- data splitting: Add support for stratification on multiple features in the `RandomSplitter`. (d745434)
- transformer: Add `metadata` option for the `ExpressionTransformer` that allows for creation of meta features not tracked in the `DataSchema`. (f16ea8b)
- transformer: Add `ExpressionTransformer` for creating features using the `vaex` expression system. (c0faf74)
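
To illustrate what stratification on multiple features means, here is a minimal, library-independent sketch. It is not the `RandomSplitter` implementation; the function name `stratified_split` and its parameters are hypothetical. The idea shown is that rows are grouped by the tuple of values in the chosen columns, and each group is split with the same fraction.

```python
import random
from collections import defaultdict


def stratified_split(rows, stratify_on, train_fraction=0.8, seed=42):
    """Hypothetical helper: split row indices while preserving the joint
    distribution of the columns listed in `stratify_on`."""
    rng = random.Random(seed)

    # Group row indices by the tuple of values in the stratification columns.
    strata = defaultdict(list)
    for idx, row in enumerate(rows):
        key = tuple(row[col] for col in stratify_on)
        strata[key].append(idx)

    train_idx, test_idx = [], []
    for indices in strata.values():
        # Shuffle within each stratum, then cut at the same fraction.
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Toy data: 4 strata (2 countries x 2 labels) with 10 rows each.
rows = [
    {"country": country, "label": label}
    for country in ("SE", "DE")
    for label in (0, 1)
    for _ in range(10)
]
train, test = stratified_split(rows, stratify_on=["country", "label"])
```

With this toy data, every combination of `country` and `label` contributes the same share of rows to each side of the split.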
v4.0.0
v4.0.0 (2024-05-09)
⛔️ BREAKING CHANGES
- exporter: Add `S3Exporter` that implements cached S3 exporting of files from the local disk. (d17b2d2)
- exporter: Add `BaseExporter` and `LocalExporter` implementations that support exporting data to disk, along with corresponding `Pipeline` steps. (6ce13cf)
✨ Features
- exporter: Add `LocalManifest` support for `LocalExporter` which simplifies caching logic and enables S3 manifest translations. (2199ff0)
- exporter: Add support for multiple data export using `LocalExporter`. (ff988b6)
- data source: Add support for reading manifest files from S3 buckets in `S3Ingester`. (9c68a9b)
- pipeline: Add `disable_cache` parameter to `Pipeline` execution. (da1e31a)
🐛 Bug Fixes
- data cleaning: Fix newline characters breaking CSV reading using Arrow. (3a7e594)
- tuning: Delete logging of storage URI to minimize risk of accidentally logging credentials. (054692d)
🛠️ Code Refactoring
- data source: Extract shared S3 logic to `utils` so that it can also be used by `S3Exporter`. (97a7974)
v3.2.0
v3.1.0
v3.0.0
v2.2.0
v2.2.0 (2024-03-22)
✨ Features
- filter: Add `ImblearnResamplingFilter` which is a wrapper for `imblearn` over- and under-samplers. (77a3d7d)
- filter: Add `ExpressionFilter` and base class for simple DataFrame filtering using `vaex` expressions. (dc679ff)
- cache: Add `disable_cache` argument to all cached functions to completely bypass all caching functionality. (fbdfc5d)
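
The effect of a `disable_cache` argument can be sketched with a small in-memory decorator. This is only an illustration of the bypass behavior described above, not the library's caching code; the decorator name `cached` and the in-memory dictionary store are assumptions made for the example.

```python
import functools


def cached(func):
    """Hypothetical in-memory cache with a per-call `disable_cache` switch."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, disable_cache=False, **kwargs):
        if disable_cache:
            # Bypass completely: neither read from nor write to the cache.
            return func(*args, **kwargs)
        key = (args, tuple(sorted(kwargs.items())))
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


calls = []  # records every real execution of the wrapped function


@cached
def square(x):
    calls.append(x)
    return x * x


square(3)                      # computed and stored
square(3)                      # served from the cache, no new call recorded
square(3, disable_cache=True)  # recomputed, cache untouched
```

After these three calls, `calls` holds two entries: one for the first computation and one for the call that bypassed the cache.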
📝 Documentation
- Update `CHANGELOG.md` format to include missing categories. (d97b32c)
v2.1.0
v2.0.0
v2.0.0 (2024-02-07)
⛔️ BREAKING CHANGES
- pipeline: Refactor `PipelineStep` to use `TypedDict` for both inputs and outputs. (2eb623c)
🐛 Bug Fixes
- data cleaning: Rename empty column name to `_empty` to prevent `vaex` crashes. (da72b75)
- data cleaning: Cast boolean columns to `int8` during cleaning to reduce label encoding needs. (d94f7c9)
- Add reserved keyword column name replacement to prevent evaluation errors from `vaex`. (3969ffd)
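
The two column-name fixes can be pictured with the following sketch. It is not the library's cleaning code: the function name `sanitize_column_names` is hypothetical, and appending an underscore to reserved keywords is an assumed replacement scheme chosen for illustration. Only the `_empty` name comes from the release note.

```python
import keyword


def sanitize_column_names(names):
    """Hypothetical helper: make column names safe for expression evaluation."""
    cleaned = []
    for name in names:
        if name == "":
            # An empty column name is renamed to `_empty`.
            cleaned.append("_empty")
        elif keyword.iskeyword(name):
            # Assumed scheme: suffix reserved Python keywords with "_".
            cleaned.append(f"{name}_")
        else:
            cleaned.append(name)
    return cleaned


sanitized = sanitize_column_names(["id", "", "class", "amount"])
print(sanitized)  # ['id', '_empty', 'class_', 'amount']
```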
🛠️ Code Refactoring
- Improve error logging messages, and update codebase to new `black` format. (a29ad45)
- cache: Break out cache handler retrieval method. (aba9e41)