Releases: kedro-org/kedro
0.18.13
Release 0.18.13
Major features and improvements
- Added support for Python 3.11. This includes tackling challenges like dependency pinning and test adjustments to ensure a smooth experience. Detailed migration tips are provided below for further context.
- Added new `OmegaConfigLoader` features:
  - Allowed registering of custom resolvers to `OmegaConfigLoader` through `CONFIG_LOADER_ARGS`.
  - Added support for global variables to `OmegaConfigLoader`.
- Added `kedro catalog resolve` CLI command that resolves dataset factories in the catalog with any explicit entries in the project pipeline.
- Implemented a flat `conf/` structure for modular pipelines, and accordingly, updated the `kedro pipeline create` and `kedro catalog create` commands.
- Updated new Kedro project template and Kedro starters:
  - Changed Kedro starters and new Kedro projects to use `OmegaConfigLoader`.
  - Converted `setup.py` in new Kedro project template and Kedro starters to `pyproject.toml` and moved flake8 configuration to the dedicated file `.flake8`.
  - Updated the spaceflights starter to use the new flat `conf/` structure.
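As a minimal sketch of registering custom resolvers through `CONFIG_LOADER_ARGS`, a project's `settings.py` might contain something like the following. The `custom_resolvers` key is our understanding of the expected key name and should be checked against the configuration documentation for your Kedro version; the resolver names `add` and `today` are hypothetical.

```python
# settings.py (sketch): register custom resolvers for OmegaConfigLoader.
# In a real project this file would also set CONFIG_LOADER_CLASS to
# OmegaConfigLoader; that import is omitted so the sketch runs standalone.
from datetime import date

CONFIG_LOADER_ARGS = {
    # Assumed key name: maps each resolver name to a callable.
    "custom_resolvers": {
        "add": lambda *values: sum(values),         # hypothetical resolver
        "today": lambda: date.today().isoformat(),  # hypothetical resolver
    }
}
```

Configuration files could then call these resolvers by name using OmegaConf's resolver interpolation syntax.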
Bug fixes and other changes
- Updated `OmegaConfigLoader` to ignore config from hidden directories like `.ipynb_checkpoints`.
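To illustrate the idea of skipping config found under hidden directories, here is a small standalone sketch. It is not Kedro's implementation; the helper `in_hidden_directory` and the example file paths are hypothetical.

```python
from pathlib import PurePosixPath


def in_hidden_directory(path: str) -> bool:
    """Return True if any parent directory of the path starts with a dot."""
    parent_parts = PurePosixPath(path).parent.parts
    return any(
        part.startswith(".") and part not in (".", "..") for part in parent_parts
    )


# Hypothetical config files discovered under conf/.
config_files = [
    "conf/base/catalog.yml",
    "conf/base/.ipynb_checkpoints/catalog-checkpoint.yml",
]

# Keep only files that do not live under a hidden directory.
visible = [f for f in config_files if not in_hidden_directory(f)]
```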
Documentation changes
- Revised the `data` section to restructure beginner and advanced pages about the Data Catalog and datasets.
- Moved contributor documentation to the GitHub wiki.
- Updated example of using generator functions in nodes.
- Added migration guide from the `ConfigLoader` and the `TemplatedConfigLoader` to the `OmegaConfigLoader`. The `ConfigLoader` and the `TemplatedConfigLoader` are deprecated and will be removed in the `0.19.0` release.
Migration Tips for Python 3.11:
- PyTables on Windows: Users on Windows with Python >=3.8 should note we've pinned `pytables` to `3.8.0` due to compatibility issues.
- Spark Dependency: We've set an upper version limit for `pyspark` at <3.4 due to breaking changes in 3.4.
- Testing with Python 3.10: The latest `moto` version now supports parallel test execution for Python 3.10, resolving previous issues.
Breaking changes to the API
Upcoming deprecations for Kedro 0.19.0
- Renamed abstract dataset classes, in accordance with the Kedro lexicon. Dataset classes ending with "DataSet" are deprecated and will be removed in 0.19.0. Note that all of the below classes are also importable from `kedro.io`; only the module where they are defined is listed as the location.
| Type | Deprecated Alias | Location |
|---|---|---|
| `AbstractDataset` | `AbstractDataSet` | `kedro.io.core` |
| `AbstractVersionedDataset` | `AbstractVersionedDataSet` | `kedro.io.core` |
- Using the `layer` attribute at the top level is deprecated; it will be removed in Kedro version 0.19.0. Please move `layer` inside the `metadata` -> `kedro-viz` attributes.
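The `layer` move can be sketched on a catalog entry represented as a plain dictionary. The helper `move_layer_to_metadata` and the example entry are hypothetical and only illustrate the target shape; they are not part of Kedro.

```python
import copy


def move_layer_to_metadata(entry: dict) -> dict:
    """Return a copy of a catalog entry with a top-level `layer` moved
    under metadata -> kedro-viz."""
    migrated = copy.deepcopy(entry)
    layer = migrated.pop("layer", None)
    if layer is not None:
        migrated.setdefault("metadata", {}).setdefault("kedro-viz", {})["layer"] = layer
    return migrated


# Hypothetical catalog entry using the deprecated top-level attribute.
old_entry = {
    "type": "pandas.CSVDataSet",
    "filepath": "data/01_raw/companies.csv",
    "layer": "raw",
}
new_entry = move_layer_to_metadata(old_entry)
```

Entries without a top-level `layer` are returned unchanged, and the input dictionary is never modified.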
Community contributions
Thanks to Laíza Milena Scheid Parizotto and Jonathan Cohen.
0.18.12
Release 0.18.12
Major features and improvements
- Added dataset factories feature which uses pattern matching to reduce the number of catalog entries.
- Activated all built-in resolvers by default for `OmegaConfigLoader` except for `oc.env`.
- Added `kedro catalog rank` CLI command that ranks dataset factories in the catalog by matching priority.
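To give a feel for how a dataset factory pattern can stand in for many catalog entries, here is a standalone sketch that uses regular expressions. It is an illustration only, not Kedro's implementation, and it does not model matching priority; the functions `pattern_to_regex` and `resolve` and the example pattern are hypothetical.

```python
import re
from typing import Optional


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a pattern such as '{name}_csv' into a regex with named groups."""
    parts = re.split(r"(\{\w+\})", pattern)
    regex = "".join(
        f"(?P<{part[1:-1]}>.+)" if part.startswith("{") else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{regex}$")


def resolve(dataset_name: str, factories: dict) -> Optional[dict]:
    """Return the first factory template that matches, with placeholders filled in."""
    for pattern, template in factories.items():
        match = pattern_to_regex(pattern).match(dataset_name)
        if match:
            return {
                key: value.format(**match.groupdict()) if isinstance(value, str) else value
                for key, value in template.items()
            }
    return None


# One hypothetical factory entry replaces an explicit entry per CSV dataset.
factories = {
    "{name}_csv": {"type": "pandas.CSVDataSet", "filepath": "data/01_raw/{name}.csv"}
}
reviews = resolve("reviews_csv", factories)
```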
Bug fixes and other changes
- Consolidated dependencies and optional dependencies in `pyproject.toml`.
- Made validation of unique node outputs much faster.
- Updated `kedro catalog list` to show datasets generated with factories.
Documentation changes
- Recommended `ruff` as the linter and removed mentions of `pylint`, `isort`, `flake8`.
Community contributions
Thanks to Laíza Milena Scheid Parizotto and Chris Schopp.
Breaking changes to the API
Upcoming deprecations for Kedro 0.19.0
- `ConfigLoader` and `TemplatedConfigLoader` will be deprecated. Please use `OmegaConfigLoader` instead.
0.18.11
Release 0.18.11
Major features and improvements
- Added databricks-iris as an official starter.
Bug fixes and other changes
- Reworked micropackaging workflow to use standard Python packaging practices.
- Made `kedro micropkg package` accept `--verbose`.
Documentation changes
- Significant improvements to the documentation that covers working with Databricks and Kedro, including a new page for workspace-only development, and a guide to choosing the best workflow for your use case.
- Updated documentation for deploying with Prefect for version 2.0.
0.18.10
0.18.9
Major features and improvements
- `kedro run --params` now updates interpolated parameters correctly when using `OmegaConfigLoader`.
- Added `metadata` attribute to `kedro.io` datasets. This is ignored by Kedro, but may be consumed by users or external plugins.
- Added `kedro.logging.RichHandler`. This replaces the default `rich.logging.RichHandler` and is more flexible; users can turn off the `rich` traceback if needed.
Bug fixes and other changes
- `OmegaConfigLoader` will return a `dict` instead of `DictConfig`.
- `OmegaConfigLoader` does not show a `MissingConfigError` when the config files exist but are empty.
Documentation changes
- Added documentation for collaborative experiment tracking within Kedro-Viz.
- Revised section on deployment to better organise content and reflect how recently docs have been updated.
- Minor improvements to fix typos and revise docs to align with engineering changes.
Breaking changes to the API
- `kedro package` does not produce `.egg` files anymore, and now relies exclusively on `.whl` files.
Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:
0.18.8
Major features and improvements
- Added `KEDRO_LOGGING_CONFIG` environment variable, which can be used to configure logging from the beginning of the `kedro` process.
- Removed the logs folder from the new Kedro project template. File-based logging remains, but it only records messages at level INFO and above and writes to the project root instead.
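Because the variable has to be visible when the `kedro` process starts, one way to use it from Python is to pass it in the environment of a child process. This is a sketch under assumptions: the file location `conf/logging.yml` is hypothetical, and the commented-out `kedro run` call only makes sense inside a Kedro project.

```python
import os
from pathlib import Path

# Hypothetical location of a custom logging configuration file.
logging_config = Path("conf") / "logging.yml"

# Copy the current environment and add the variable so that a child
# process sees it from the moment it starts.
env = dict(os.environ, KEDRO_LOGGING_CONFIG=str(logging_config.resolve()))

# Inside a Kedro project, the run could then be launched like this:
# import subprocess
# subprocess.run(["kedro", "run"], env=env, check=True)
```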
Bug fixes and other changes
- Improvements to Jupyter E2E tests.
- Added full `kedro run` CLI command to session store to improve run reproducibility using `Kedro-Viz` experiment tracking.
Documentation changes
- Improvements to documentation about configuration.
- Improvements to Sphinx toolchain including incrementing to use a newer version.
- Improvements to documentation on visualising Kedro projects on Databricks, and additional documentation about the development workflow for Kedro projects on Databricks.
- Updated Technical Steering Committee membership documentation.
- Revised documentation section about linting and formatting and extended to give details of `flake8` configuration.
- Updated table of contents for documentation to reduce scrolling.
- Expanded FAQ documentation.
- Added a 404 page to documentation.
- Added deprecation warnings about the removal of `kedro.extras.datasets`.
0.18.7
Release 0.18.7
Major features and improvements
- Added new Kedro CLI `kedro jupyter setup` to set up Jupyter Kernel for Kedro.
- `kedro package` now includes the project configuration in a compressed `tar.gz` file.
- Added functionality to the `OmegaConfigLoader` to load configuration from compressed files of `zip` or `tar` format. This feature requires `fsspec>=2023.1.0`.
- Significant improvements to on-boarding documentation that covers setup for new Kedro users. Also some major changes to the spaceflights tutorial to make it faster to work through. We think it's a better read. Tell us if it's not.
Bug fixes and other changes
- Added a guide and tooling for developing Kedro for Databricks.
- Implemented missing dict-like interface for `_ProjectPipeline`.
0.18.6
Release 0.18.6
Bug fixes and other changes
- Fixed bug that prevented reading or writing datasets with `s3a` or `s3n` filepaths
- Fixed bug with overriding nested parameters using the `--params` flag
- Fixed bug that made session store incompatible with `Kedro-Viz` experiment tracking
Migration guide from Kedro 0.18.5 to 0.18.6
A regression introduced in Kedro version 0.18.5 caused the Kedro-Viz console to fail to show experiment tracking correctly. If you experienced this issue, you will need to:
- upgrade to Kedro version `0.18.6`
- delete any erroneous session entries created with Kedro 0.18.5 from your `session_store.db`, stored at `<project-path>/data/session_store.db`.
Thanks to Kedroids tomohiko kato, tsanikgr and maddataanalyst for very detailed reports about the bug.
0.18.5
Release 0.18.5
NOTE: This version of Kedro introduced a bug that causes the Kedro-Viz console to fail to show experiment tracking correctly. We recommend that you don't use it and use Kedro version `0.18.6` instead.
Major features and improvements
- Added new `OmegaConfigLoader` which uses `OmegaConf` for loading and merging configuration.
- Added the `--conf-source` option to `kedro run`, allowing users to specify a source for project configuration for the run.
- Added `omegaconf` syntax as option for `--params`. Keys and values can now be separated by colons or equals signs.
- Added support for generator functions as nodes, i.e. using `yield` instead of `return`.
  - Enable chunk-wise processing in nodes with generator functions.
  - Save node outputs after every `yield` before proceeding with next chunk.
- Fixed incorrect parsing of Azure Data Lake Storage Gen2 URIs used in datasets.
- Added support for loading credentials from environment variables using `OmegaConfigLoader`.
- Added new `--namespace` flag to `kedro run` to enable filtering by node namespace.
- Added a new argument `node` for all four dataset hooks.
- Added the `kedro run` flags `--nodes`, `--tags`, and `--load-versions` to replace `--node`, `--tag`, and `--load-version`.
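The generator-node behaviour in the list above (process in chunks, save after every `yield`) can be sketched without Kedro. The function `run_generator_node` is a hypothetical stand-in for a runner, not Kedro's actual code, and `process_in_chunks` is an example node function.

```python
from typing import Callable, Iterator, List


def process_in_chunks(rows: List[int], chunk_size: int = 2) -> Iterator[List[int]]:
    """Node-style generator function: yields one processed chunk at a time."""
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        yield [value * 10 for value in chunk]


def run_generator_node(func: Callable[..., Iterator], save: Callable, *args) -> None:
    """Hypothetical runner: save each output as soon as it is yielded,
    before the generator moves on to the next chunk."""
    for output in func(*args):
        save(output)


saved_chunks: List[List[int]] = []
run_generator_node(process_in_chunks, saved_chunks.append, [1, 2, 3, 4, 5])
```

Because each chunk is handed to `save` before the next one is produced, only one chunk needs to be held in memory at a time.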
Bug fixes and other changes
- Commas surrounded by square brackets (only possible for nodes with default names) will no longer split the arguments to `kedro run` options which take a list of nodes as inputs (`--from-nodes` and `--to-nodes`).
- Fixed bug where `micropkg` manifest section in `pyproject.toml` isn't recognised as allowed configuration.
- Fixed bug causing `load_ipython_extension` not to register the `%reload_kedro` line magic when called in a directory that does not contain a Kedro project.
- Added `anyconfig`'s `ac_context` parameter to `kedro.config.commons` module functions for more flexible `ConfigLoader` customizations.
- Replaced references to the `kedro.pipeline.Pipeline` object throughout the test suite with the `kedro.modular_pipeline.pipeline` factory.
- Fixed bug causing the `after_dataset_saved` hook only to be called for one output dataset when multiple are saved in a single node and async saving is in use.
- Log level for "Credentials not found in your Kedro project config" was changed from `WARNING` to `DEBUG`.
- Added safe extraction of tar files in `micropkg pull` to fix vulnerability caused by CVE-2007-4559.
- Documentation improvements
  - Bug fix in table font size
  - Updated API docs links for datasets
  - Improved CLI docs for `kedro run`
  - Revised documentation for visualisation to build plots and for experiment tracking
  - Added example for loading external credentials to the Hooks documentation
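For context on the CVE-2007-4559 fix listed above: the vulnerability is path traversal during tar extraction, where a member name such as `../evil.txt` escapes the target directory. A common mitigation, sketched here with the standard library, is to validate every member path before extracting. The helpers `is_within_directory` and `safe_extract` are hypothetical names and not necessarily how Kedro implements the fix.

```python
import tarfile
from pathlib import Path


def is_within_directory(directory: Path, target: Path) -> bool:
    """Return True if the resolved target is the directory itself or inside it."""
    directory = directory.resolve()
    target = target.resolve()
    return target == directory or directory in target.parents


def safe_extract(tar: tarfile.TarFile, destination: Path) -> None:
    """Extract the archive only if every member stays inside the destination."""
    for member in tar.getmembers():
        if not is_within_directory(destination, destination / member.name):
            raise ValueError(
                f"Refusing to extract {member.name!r}: path escapes {destination}"
            )
    tar.extractall(destination)
```

Recent Python versions also provide extraction filters in `tarfile` that address the same class of problem.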
Breaking changes to the API
Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:
Upcoming deprecations for Kedro 0.19.0
- `project_version` will be deprecated in `pyproject.toml`; please use `kedro_init_version` instead.
- Deprecated `kedro run` flags `--node`, `--tag`, and `--load-version` in favour of `--nodes`, `--tags`, and `--load-versions`.
0.18.4
Major features and improvements
- Made Kedro instantiate datasets from `kedro_datasets` with higher priority than `kedro.extras.datasets`. `kedro_datasets` is the namespace for the new `kedro-datasets` Python package.
- The config loader objects now implement `UserDict` and the configuration is accessed through `conf_loader['catalog']`.
- You can configure config file patterns through `settings.py` without creating a custom config loader.
- Added the following new datasets:
| Type | Description | Location |
|---|---|---|
| `svmlight.SVMLightDataSet` | Work with svmlight/libsvm files using scikit-learn library | `kedro.extras.datasets.svmlight` |
| `video.VideoDataSet` | Read and write video files from a filesystem | `kedro.extras.datasets.video` |
| `video.video_dataset.SequenceVideo` | Create a video object from an iterable sequence to use with `VideoDataSet` | `kedro.extras.datasets.video` |
| `video.video_dataset.GeneratorVideo` | Create a video object from a generator to use with `VideoDataSet` | `kedro.extras.datasets.video` |
- Implemented support for a functional definition of schema in `dask.ParquetDataSet` to work with the `dask.to_parquet` API.
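To show what dictionary-style access such as `conf_loader['catalog']` looks like for a `UserDict`-based loader, here is a standalone sketch. The class `SketchConfigLoader` and its lazy-loading behaviour are hypothetical and do not reproduce Kedro's config loaders.

```python
from collections import UserDict


class SketchConfigLoader(UserDict):
    """Hypothetical loader: maps config keys to loader callables and
    loads each key on first access."""

    def __init__(self, sources):
        super().__init__()
        # key -> zero-argument callable returning that key's configuration
        self._sources = sources

    def __getitem__(self, key):
        if key not in self.data:
            if key not in self._sources:
                raise KeyError(key)
            self.data[key] = self._sources[key]()
        return self.data[key]


conf_loader = SketchConfigLoader(
    {"catalog": lambda: {"companies": {"type": "pandas.CSVDataSet"}}}
)
catalog_config = conf_loader["catalog"]
```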
Bug fixes and other changes
- Fixed `kedro micropkg pull` for packages on PyPI.
- Fixed `format` in `save_args` for `SparkHiveDataSet`; previously it didn't allow you to save it as delta format.
- Fixed save errors in `TensorFlowModelDataset` when used without versioning; previously, it wouldn't overwrite an existing model.
- Added support for `tf.device` in `TensorFlowModelDataset`.
- Updated error message for `VersionNotFoundError` to handle insufficient permission issues for cloud storage.
- Updated Experiment Tracking docs with working examples.
- Updated MatplotlibWriter Dataset, TextDataset, plotly.PlotlyDataSet and plotly.JSONDataSet docs with working examples.
- Modified implementation of the Kedro IPython extension to use `local_ns` rather than a global variable.
- Refactored `ShelveStore` to its own module to ensure multiprocessing works with it.
- `kedro.extras.datasets.pandas.SQLQueryDataSet` now takes optional argument `execution_options`.
- Removed `attrs` upper bound to support newer versions of Airflow.
- Bumped the lower bound for the `setuptools` dependency to <=61.5.1.
Minor breaking changes to the API
Upcoming deprecations for Kedro 0.19.0
- `kedro test` and `kedro lint` will be deprecated.
Documentation
- Revised the Introduction to shorten it
- Revised the Get Started section to remove unnecessary information and clarify the learning path
- Updated the spaceflights tutorial to simplify the later stages and clarify what the reader needed to do in each phase
- Moved some pages that covered advanced materials into more appropriate sections
- Moved visualisation into its own section
- Fixed a bug that degraded user experience: the table of contents is now sticky when you navigate between pages
- Added redirects where needed on ReadTheDocs for legacy links and bookmarks
Contributions from the Kedroid community
We are grateful to the following for submitting PRs that contributed to this release: jstammers, FlorianGD, yash6318, carlaprv, dinotuku, williamcaicedo, avan-sh, Kastakin, amaralbf, BSGalvan, levimjoseph, daniel-falk, clotildeguinard, avsolatorio, and picklejuicedev for comments and input to documentation changes