# Releases: kedro-org/kedro
## 0.18.3

### Major features and improvements
- Implemented autodiscovery of project pipelines. A pipeline created with `kedro pipeline create <pipeline_name>` can now be accessed immediately, without needing to register it explicitly in `src/<package_name>/pipeline_registry.py`, either individually by name (e.g. `kedro run --pipeline=<pipeline_name>`) or as part of the combined default pipeline (e.g. `kedro run`). By default, the simplified `register_pipelines()` function in `pipeline_registry.py` looks like:

  ```python
  def register_pipelines() -> Dict[str, Pipeline]:
      """Register the project's pipelines.

      Returns:
          A mapping from pipeline names to ``Pipeline`` objects.
      """
      pipelines = find_pipelines()
      pipelines["__default__"] = sum(pipelines.values())
      return pipelines
  ```

- The Kedro IPython extension should now be loaded with `%load_ext kedro.ipython`.
- The line magic `%reload_kedro` now accepts keyword arguments, e.g. `%reload_kedro --env=prod`.
- Improved the resume-pipeline suggestion for `SequentialRunner`: it now backtracks to the closest persisted inputs from which a failed run can be resumed.
### Bug fixes and other changes

- Changed the default value of rich logging's `show_locals` to `False`, to make sure credentials and other sensitive data aren't shown in logs (see the sketch after this list).
- Rich traceback handling is disabled on Databricks so that exceptions now halt execution as expected. This is a workaround for a bug in `rich`.
- When using `kedro run -n [some_node]`, if `some_node` is missing a namespace the resulting error message will suggest the correct node name.
- Updated documentation for `rich` logging.
- Updated the Prefect deployment documentation to allow for reruns with saved versioned datasets.
- The Kedro IPython extension now surfaces errors when it cannot load a Kedro project.
- Relaxed the `delta-spark` upper bound to allow compatibility with Spark 3.1.x and 3.2.x.
- Added `gdrive` to the list of cloud protocols, enabling Google Drive paths for datasets.
- Added an SVG logo resource for the IPython kernel.
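For anyone who deliberately wants local variables back in their tracebacks (for example in a trusted debugging environment), here is a minimal sketch using `rich`'s public API. It assumes that calling `rich.traceback.install` again overrides the handler Kedro sets up:

```python
# A minimal sketch, assuming rich.traceback.install(show_locals=True)
# overrides the traceback handler Kedro installs. Beware: local values,
# including credentials, will then appear in tracebacks.
from rich.traceback import install

install(show_locals=True)
```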
### Upcoming deprecations for Kedro 0.19.0

- The Kedro IPython extension will no longer be available as `%load_ext kedro.extras.extensions.ipython`; use `%load_ext kedro.ipython` instead.
- `kedro jupyter convert`, `kedro build-docs`, `kedro build-reqs` and `kedro activate-nbstripout` will be deprecated.
## 0.18.2

### Major features and improvements

- Added `abfss` to the list of cloud protocols, enabling `abfss` paths.
- Kedro now uses the Rich library to format terminal logs and tracebacks.
- The file `conf/base/logging.yml` is now optional. See our documentation for details.
- Introduced a `kedro.starters` entry point. This enables plugins to create custom starter aliases used by `kedro starter list` and `kedro new` (see the sketch after this list).
- Reduced the `kedro new` prompts to just one question asking for the project name.
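A minimal sketch of how a plugin might register a custom starter alias through the new `kedro.starters` entry point. The plugin module, alias and template URL are hypothetical, and `KedroStarterSpec` is assumed to be importable from `kedro.framework.cli.starters`:

```python
# my_plugin/plugin.py -- hypothetical plugin module
from kedro.framework.cli.starters import KedroStarterSpec

# Exposed via the entry point, e.g. in setup.py:
#   entry_points={"kedro.starters": ["my_plugin = my_plugin.plugin:starters"]}
starters = [
    KedroStarterSpec(
        alias="my-starter",  # shown by `kedro starter list`, usable in `kedro new`
        template_path="https://github.com/your-org/your-kedro-starter.git",
    )
]
```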
### Bug fixes and other changes

- Bumped the `pyyaml` upper bound to make Kedro compatible with the pyodide stack.
- Updated the project template's Sphinx configuration to use `myst_parser` instead of `recommonmark`.
- Reduced the number of log lines by changing the logging level from `INFO` to `DEBUG` for low-priority messages.
- Kedro's framework-side logging configuration no longer performs file-based logging. Hence superfluous `info.log`/`errors.log` files are no longer created in your project root, and running Kedro on read-only file systems such as Databricks Repos is now possible.
- The `root` logger is now set to the Python default level of `WARNING` rather than `INFO`. Kedro's logger is still set to emit `INFO` level messages.
- `SequentialRunner` now has a consistent execution order across multiple runs with sorted nodes.
- Bumped the upper bound for the Flake8 dependency to <5.0.
- `kedro jupyter notebook/lab` no longer reuses a Jupyter kernel.
- Required `cookiecutter>=2.1.1` to address a known command injection vulnerability.
- The session store no longer fails if a username cannot be found with `getpass.getuser`.
- Added generic typing for `AbstractDataSet` and `AbstractVersionedDataSet`, as well as typing for all datasets (see the sketch after this list).
- Rendered the deployment guide flowchart as a Mermaid diagram, and added Dask.
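A minimal sketch of the new generic typing on `AbstractDataSet`, assuming a hypothetical custom dataset that loads and saves `numpy` arrays; the two type parameters declare the load and save types respectively:

```python
from typing import Any, Dict

import numpy as np

from kedro.io import AbstractDataSet


class ImageDataSet(AbstractDataSet[np.ndarray, np.ndarray]):
    """Hypothetical dataset: loads and saves numpy arrays on local disk."""

    def __init__(self, filepath: str):
        self._filepath = filepath

    def _load(self) -> np.ndarray:
        return np.load(self._filepath)

    def _save(self, data: np.ndarray) -> None:
        np.save(self._filepath, data)

    def _describe(self) -> Dict[str, Any]:
        return {"filepath": self._filepath}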
### Minor breaking changes to the API

- The module `kedro.config.default_logger` no longer exists; default logging configuration is now set automatically through `kedro.framework.project.LOGGING`. Unless you explicitly import `kedro.config.default_logger`, you do not need to make any changes.

### Upcoming deprecations for Kedro 0.19.0

- `kedro.extras.ColorHandler` will be removed in 0.19.0.
## 0.18.1

### Major features and improvements

- Added a new hook `after_context_created` that passes the `KedroContext` instance as `context` (see the sketch after this list).
- Added a new CLI hook `after_command_run`.
- Added more detail to the YAML `ParserError` exception error message.
- Added an option to `SparkDataSet` to specify a `schema` load argument, allowing a user-defined schema to be supplied instead of relying on Spark's schema inference.
- The Kedro package no longer contains a built version of the Kedro documentation, significantly reducing the package size.
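A minimal sketch of a hook implementation consuming the new `after_context_created` hook; the class name and the body are illustrative only:

```python
from kedro.framework.hooks import hook_impl


class ProjectHooks:
    @hook_impl
    def after_context_created(self, context) -> None:
        # The freshly created KedroContext is passed in as `context`,
        # e.g. to inspect the config loader attached to it.
        print(context.config_loader)
```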
### Bug fixes and other changes

- A fatal error is no longer logged when a Kedro session is created in a directory without git.
- Fixed `CONFIG_LOADER_CLASS` validation so that `TemplatedConfigLoader` can be specified in `settings.py`. Any `CONFIG_LOADER_CLASS` must be a subclass of `AbstractConfigLoader`.
- Added the runner name to the `run_params` dictionary used in pipeline hooks.
- Updated the Databricks documentation to include how to get it working with the IPython extension and Kedro-Viz.
- Updated the sections on visualisation, namespacing and experiment tracking in the spaceflights tutorial to correspond to the complete spaceflights starter.
- Fixed `Jinja2` syntax loading with `TemplatedConfigLoader` using `globals.yml`.
- Removed the global `_active_session`, `_activate_session` and `_deactivate_session`. Plugins that need to access objects such as the config loader should now do so through `context` in the new `after_context_created` hook.
- `config_loader` is available as a public read-only attribute of `KedroContext`.
- Made the `hook_manager` argument optional for `runner.run` (see the sketch after this list).
- `kedro docs` now opens an online version of the Kedro documentation instead of a locally built version.
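A minimal sketch of calling a runner directly without supplying a `hook_manager`, assuming a trivial single-node pipeline built inline:

```python
from kedro.io import DataCatalog, MemoryDataSet
from kedro.pipeline import Pipeline, node
from kedro.runner import SequentialRunner


def say_hello() -> str:
    return "hello"


pipeline = Pipeline([node(say_hello, inputs=None, outputs="greeting")])
catalog = DataCatalog({"greeting": MemoryDataSet()})

# hook_manager is now optional, so a plain programmatic run works:
SequentialRunner().run(pipeline, catalog)
```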
### Upcoming deprecations for Kedro 0.19.0

- `kedro docs` will be removed in 0.19.0.
## 0.18.0

### TL;DR ✨
Kedro 0.18.0 strives to reduce the complexity of the project template and get us closer to a stable release of the framework. We've introduced the full micro-packaging workflow 📦, which allows you to import packages, utility functions and existing pipelines into your Kedro project. Integration with IPython and Jupyter has been streamlined in preparation for enhancements to Kedro's interactive workflow. Additionally, the release comes with long-awaited Python 3.9 and 3.10 support 🐍.
### Major features and improvements

#### Framework
- Added `kedro.config.abstract_config.AbstractConfigLoader` as an abstract base class for all `ConfigLoader` implementations. `ConfigLoader` and `TemplatedConfigLoader` now inherit directly from this base class.
- Streamlined the `ConfigLoader.get` and `TemplatedConfigLoader.get` API and delegated the actual `get` method implementation to the `kedro.config.common` module.
- The `hook_manager` is no longer a global singleton. The `hook_manager` lifecycle is now managed by the `KedroSession`, and a new `hook_manager` will be created every time a `session` is instantiated.
- Added support for specifying parameters mapping in `pipeline()` without the `params:` prefix.
- Added a new API, `Pipeline.filter()` (previously in `KedroContext._filter_pipeline()`), to filter parts of a pipeline.
- Added `username` to the session store for logging during experiment tracking.
- A packaged Kedro project can now be imported and run from another Python project as follows:

  ```python
  from my_package.__main__ import main

  main(["--pipeline", "my_pipeline"])  # or just main() if no parameters are needed for the run
  ```

#### Project template
- Removed `cli.py` from the Kedro project template. By default, all CLI commands, including `kedro run`, are now defined on the Kedro framework side. You can still define custom CLI commands by creating your own `cli.py`.
- Removed `hooks.py` from the Kedro project template. Registration hooks have been removed in favour of `settings.py` configuration, but you can still define execution timeline hooks by creating your own `hooks.py`.
- Removed the `.ipython` directory from the Kedro project template. The IPython/Jupyter workflow no longer uses IPython profiles; it now uses an IPython extension.
- The default `kedro run` configuration environment names can now be set in `settings.py` using the `CONFIG_LOADER_ARGS` variable. The relevant keyword arguments to supply are `base_env` and `default_run_env`, which are set to `base` and `local` respectively by default (see the sketch after this list).
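A minimal `settings.py` sketch of this new configuration, spelling out the documented defaults explicitly:

```python
# settings.py -- the values below are the documented defaults, shown explicitly
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
}
```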
#### DataSets

- Added the following new datasets:

| Type | Description | Location |
|---|---|---|
| `pandas.XMLDataSet` | Read XML into a pandas DataFrame and write a pandas DataFrame to XML | `kedro.extras.datasets.pandas` |
| `networkx.GraphMLDataSet` | Work with NetworkX using GraphML files | `kedro.extras.datasets.networkx` |
| `networkx.GMLDataSet` | Work with NetworkX using Graph Modelling Language files | `kedro.extras.datasets.networkx` |
| `redis.PickleDataSet` | Loads/saves data from/to a Redis database | `kedro.extras.datasets.redis` |
- Added `partitionBy` support and exposed `save_args` for `SparkHiveDataSet`.
- Exposed `open_args_save` in `fs_args` for `pandas.ParquetDataSet`.
- Refactored the `load` and `save` operations for `pandas` datasets to leverage `pandas`' own API and delegate `fsspec` operations to it. This reduces the need for our own `fsspec` wrappers.
- Merged `pandas.AppendableExcelDataSet` into `pandas.ExcelDataSet`.
- Added `save_args` to `feather.FeatherDataSet`.
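A minimal sketch of using the new `pandas.XMLDataSet` from the table above directly in Python; the file path and data are illustrative, and it assumes the optional `lxml` dependency that pandas uses for XML I/O is installed:

```python
import pandas as pd

from kedro.extras.datasets.pandas import XMLDataSet

data_set = XMLDataSet(filepath="data/01_raw/books.xml")  # illustrative path
data_set.save(pd.DataFrame({"title": ["Kedro"], "year": [2022]}))
reloaded = data_set.load()  # back into a pandas DataFrame
```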
#### Jupyter and IPython integration

- The only recommended way to work with Kedro in Jupyter or IPython is now the Kedro IPython extension. Managed Jupyter instances should load this via `%load_ext kedro.extras.extensions.ipython` and use the line magic `%reload_kedro`.
- `kedro ipython` launches an IPython session that preloads the Kedro IPython extension.
- `kedro jupyter notebook/lab` creates a custom Jupyter kernel that preloads the Kedro IPython extension and launches a notebook with that kernel selected. There is no longer a need to specify `--all-kernels` to show all available kernels.
#### Dependencies

- Bumped the minimum version of `pandas` to 1.3. Any `storage_options` should continue to be specified under `fs_args` and/or `credentials`.
- Added support for Python 3.9 and 3.10; dropped support for Python 3.6.
- Updated the `black` dependency in the project template to a non-pre-release version.
#### Other
- Documented distribution of Kedro pipelines with Dask.
### Breaking changes to the API

#### Framework

- Removed `RegistrationSpecs` and its associated `register_config_loader` and `register_catalog` hook specifications in favour of `CONFIG_LOADER_CLASS`/`CONFIG_LOADER_ARGS` and `DATA_CATALOG_CLASS` in `settings.py`.
- Removed the deprecated functions `load_context` and `get_project_context`.
- Removed the deprecated `CONF_SOURCE`, `package_name`, `pipeline`, `pipelines`, `config_loader` and `io` attributes from `KedroContext`, as well as the deprecated `KedroContext.run` method.
- Added the `PluginManager` `hook_manager` argument to `KedroContext` and the `Runner.run()` method, which will be provided by the `KedroSession`.
- Removed the public method `get_hook_manager()` and replaced its functionality with `_create_hook_manager()`.
- Enforced that only one run can be successfully executed as part of a `KedroSession`; `run_id` has been renamed to `session_id` as a result.
#### Configuration loaders

- The `settings.py` setting `CONF_ROOT` has been renamed to `CONF_SOURCE`. The default value of `conf` remains unchanged.
- The `ConfigLoader` and `TemplatedConfigLoader` argument `conf_root` has been renamed to `conf_source`.
- `extra_params` has been renamed to `runtime_params` in `kedro.config.config.ConfigLoader` and `kedro.config.templated_config.TemplatedConfigLoader`.
- The environment defaulting behaviour has been removed from `KedroContext` and is now implemented in a `ConfigLoader` class (or equivalent) with the `base_env` and `default_run_env` attributes.
#### DataSets

- `pandas.ExcelDataSet` now uses the `openpyxl` engine instead of `xlrd`.
- `pandas.ParquetDataSet` now calls `pd.to_parquet()` upon saving. Note that the argument `partition_cols` is not supported.
- The `spark.SparkHiveDataSet` API has been updated to reflect `spark.SparkDataSet`. The `write_mode=insert` option has also been replaced with `write_mode=append` as per the Spark style guide. This change addresses Issue 725 and Issue 745. Additionally, `upsert` mode now leverages `checkpoint` functionality and requires a valid `checkpointDir` to be set for the current `SparkContext`.
- `yaml.YAMLDataSet` can no longer save a `pandas.DataFrame` directly, but it can save a dictionary. Use `pandas.DataFrame.to_dict()` to convert your `pandas.DataFrame` to a dictionary before you attempt to save it to YAML (see the sketch after this list).
- Removed `open_args_load` and `open_args_save` from the following datasets: `pandas.CSVDataSet`, `pandas.ExcelDataSet`, `pandas.FeatherDataSet`, `pandas.JSONDataSet` and `pandas.ParquetDataSet`.
- `storage_options` are now dropped if they are specified under `load_args` or `save_args` for the following datasets: `pandas.CSVDataSet`, `pandas.ExcelDataSet`, `pandas.FeatherDataSet`, `pandas.JSONDataSet` and `pandas.ParquetDataSet`.
- Renamed `lambda_data_set`, `memory_data_set` and `partitioned_data_set` to `lambda_dataset`, `memory_dataset` and `partitioned_dataset`, respectively, in `kedro.io`.
- The dataset `networkx.NetworkXDataSet` has been renamed to `networkx.JSONDataSet`.
#### CLI

- Removed `kedro install` in favour of `pip install -r src/requirements.txt` to install project dependencies.
- Removed the `--parallel` flag from `kedro run` in favour of `--runner=ParallelRunner`. The `-p` flag is now an alias for `--pipeline`.
- `kedro pipeline package` has been replaced by `kedro micropkg package` and, in addition to the `--alias` flag used to rename the package, now accepts a module name and path to the pipeline or utility module to package, relative to `src/<package_name>/`. The `--version` CLI option has been removed in favour of setting a `__version__` variable in the micro-package's `__init__.py` file.
- `kedro pipeline pull` has been replaced by `kedro micropkg pull` and now also supports `--destination` to provide a location for pulling the package.
- Removed `kedro pipeline list` and `kedro pipeline describe` in favour of `kedro registry list` and `kedro registry describe`.
- `kedro package` and `kedro micropkg package` now save `egg` and `whl` or `tar` files in the `<project_root>/dist` folder (previously `<project_root>/src/dist`).
- Changed the behaviour of `kedro build-reqs` to compile requirements from `requirements.txt` instead of `requirements.in` and save them to `requirements.lock` instead of `requirements.txt`.
- `kedro jupyter notebook/lab` no longer accept `--all-kernels` or `--idle-timeout` flags. `--all-kernels` is now the default behaviour.
- `KedroSession.run` now raises `ValueError` rather than `KedroContextError` when the pipeline contains no nodes. The same `ValueError` is raised when there are no matching tags.
- `KedroSession.run` now raises `ValueError` rather than `KedroContextError` w...
## 0.17.7

### Major features and improvements

- `pipeline` now accepts `tags` and a collection of `Node`s and/or `Pipeline`s rather than just a single `Pipeline` object. `pipeline` should be used in preference to `Pipeline` when creating a Kedro pipeline.
- `pandas.SQLTableDataSet` and `pandas.SQLQueryDataSet` now only open one connection per database, at instantiation time (therefore at catalog creation time), rather than one per load/save operation.
- Added a new command group, `micropkg`, to replace `kedro pipeline pull` and `kedro pipeline package` with `kedro micropkg pull` and `kedro micropkg package` for Kedro 0.18.0. `kedro micropkg package` saves packages to `project/dist` while `kedro pipeline package` saves packages to `project/src/dist`.
### Bug fixes and other changes

- Added tutorial documentation for experiment tracking.
- Added Plotly dataset documentation.
- Added the upper limit `pandas<1.4` to maintain compatibility with `xlrd~=1.0`.
- Bumped the `Pillow` minimum version requirement to 9.0 (Python 3.7+ only) following CVE-2022-22817.
- Fixed `PickleDataSet` to be copyable, and hence work with the parallel runner.
- Upgraded `pip-tools`, which is used by `kedro build-reqs`, to 6.5 (Python 3.7+ only). This `pip-tools` version is compatible with `pip>=21.2`, including the most recent releases of `pip`. Python 3.6 users should continue to use `pip-tools` 6.4 and `pip<22`.
- Added `astro-iris` as an alias for `astro-airlow-iris`, so that old tutorials can still be followed.
- Added details about Kedro's Technical Steering Committee and governance model.
### Upcoming deprecations for Kedro 0.18.0

- `kedro pipeline pull` and `kedro pipeline package` will be deprecated. Please use `kedro micropkg` instead.
## 0.17.6

### Major features and improvements

- Added a `pipelines` global variable to the IPython extension, allowing you to access the project's pipelines in `kedro ipython` or `kedro jupyter notebook`.
- Enabled overriding nested parameters with `params` in the CLI, i.e. `kedro run --params="model.model_tuning.booster:gbtree"` updates parameters to `{"model": {"model_tuning": {"booster": "gbtree"}}}`.
- Added an option to `pandas.SQLQueryDataSet` to specify a `filepath` with a SQL query, in addition to the current method of supplying the query itself in the `sql` argument (see the sketch after the table below).
- Extended `ExcelDataSet` to support saving Excel files with multiple sheets.
- Added the following new datasets:

| Type | Description | Location |
|---|---|---|
| `plotly.JSONDataSet` | Works with plotly graph object Figures (saves as a json file) | `kedro.extras.datasets.plotly` |
| `pandas.GenericDataSet` | Provides a 'best effort' facility to read/write any format provided by the `pandas` library | `kedro.extras.datasets.pandas` |
| `pandas.GBQQueryDataSet` | Loads data from a Google BigQuery table using a provided SQL query | `kedro.extras.datasets.pandas` |
| `spark.DeltaTableDataSet` | Dataset designed to handle Delta Lake Tables and their CRUD-style operations, including update, merge and delete | `kedro.extras.datasets.spark` |
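A minimal sketch of the new `filepath` option on `pandas.SQLQueryDataSet`; the query file and connection string are illustrative:

```python
from kedro.extras.datasets.pandas import SQLQueryDataSet

data_set = SQLQueryDataSet(
    filepath="queries/monthly_report.sql",            # SQL is read from this file
    credentials={"con": "sqlite:///data/db.sqlite"},  # illustrative connection string
)
df = data_set.load()  # executes the query and returns a DataFrame
```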
### Bug fixes and other changes

- Fixed an issue where `kedro new --config config.yml` was ignoring the config file when `prompts.yml` didn't exist.
- Added documentation for `kedro viz --autoreload`.
- Added support for arbitrary backends (via importable module paths) that satisfy the `pickle` interface to `PickleDataSet`.
- Added support for `sum` syntax for connecting pipeline objects (see the sketch after this list).
- Upgraded `pip-tools`, which is used by `kedro build-reqs`, to 6.4. This `pip-tools` version requires `pip>=21.2` while adding support for `pip>=21.3`. To upgrade `pip`, please refer to their documentation.
- Relaxed the bounds on the `plotly` requirement for `plotly.PlotlyDataSet` and the `pyarrow` requirement for `pandas.ParquetDataSet`.
- `kedro pipeline package <pipeline>` now raises an error if the `<pipeline>` argument doesn't look like a valid Python module path (e.g. has `/` instead of `.`).
- Added a new `overwrite` argument to `PartitionedDataSet` and `MatplotlibWriter` to enable deletion of existing partitions and plots on dataset `save`.
- `kedro pipeline pull` now works when the project requirements contain entries such as `-r`, `--extra-index-url` and local wheel files (Issue #913).
- Fixed slow startup caused by catalog processing, by reducing the exponential growth of extra processing during `_FrozenDatasets` creation.
- Removed `.coveragerc` from the Kedro project template. `coverage` settings are now given in `pyproject.toml`.
- Fixed a bug where packaging or pulling a modular pipeline with the same name as the project's package name would throw an error (or silently pass without including the pipeline source code in the wheel file).
- Removed an unintentional dependency on `git`.
- Fixed an issue where nested pipeline configuration was not included in the packaged pipeline.
- Deprecated the "Thanks for supporting contributions" section of release notes to simplify the contribution process; Kedro 0.17.6 is the last release that includes it. This process has been replaced with the automatic GitHub feature.
- Fixed a bug where the version on the tracking datasets didn't match the session id and the versions of regular versioned datasets.
- Fixed an issue where datasets in `load_versions` that are not found in the data catalog would silently pass.
- Altered the string representation of nodes so that node inputs/outputs order is preserved rather than being alphabetically sorted.
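A minimal sketch of the `sum` syntax for connecting pipeline objects, assuming the default integer start value of `sum` is handled (which is what this feature added); the two pipelines are empty placeholders:

```python
from kedro.pipeline import Pipeline

pipeline_a = Pipeline([])  # placeholder pipelines
pipeline_b = Pipeline([])

combined = sum([pipeline_a, pipeline_b])  # equivalent to pipeline_a + pipeline_b
```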
### Upcoming deprecations for Kedro 0.18.0

- `kedro.extras.decorators` and `kedro.pipeline.decorators` are being deprecated in favour of Hooks.
- `kedro.extras.transformers` and `kedro.io.transformers` are being deprecated in favour of Hooks.
- The `--parallel` flag on `kedro run` is being removed in favour of `--runner=ParallelRunner`. The `-p` flag will change to be an alias for `--pipeline`.
- `kedro.io.DataCatalogWithDefault` is being deprecated, to be removed entirely in 0.18.0.
### Thanks for supporting contributions

Deepyaman Datta, Brites, Manish Swami, Avaneesh Yembadi, Zain Patel, Simon Brugman, Kiyo Kunii, Benjamin Levy, Louis de Charsonville, Simon Picard
## 0.17.5

### Major features and improvements

- Added a new CLI group, `registry`, with the associated commands `kedro registry list` and `kedro registry describe`, to replace `kedro pipeline list` and `kedro pipeline describe`.
- Added support for dependency management at the modular pipeline level. When a pipeline with `requirements.txt` is packaged, its dependencies are embedded in the modular pipeline wheel file. Upon pulling the pipeline, Kedro will append the dependencies to the project's `requirements.in`. More information is available in our documentation.
- Added support for bulk packaging/pulling modular pipelines using `kedro pipeline package/pull --all` and `pyproject.toml`.
- Removed `cli.py` from the Kedro project template. By default, all CLI commands, including `kedro run`, are now defined on the Kedro framework side. These can be overridden in turn by a plugin or a `cli.py` file in your project. A packaged Kedro project will respect the same hierarchy when executed with `python -m my_package`.
- Removed `.ipython/profile_default/startup/` from the Kedro project template in favour of `.ipython/profile_default/ipython_config.py` and the `kedro.extras.extensions.ipython` extension.
- Added support for the `dill` backend to `PickleDataSet`.
- Imports are now refactored at `kedro pipeline package` and `kedro pipeline pull` time, so that aliasing a modular pipeline doesn't break it.
- Added the following new datasets to support basic experiment tracking (see the sketch after the table):

| Type | Description | Location |
|---|---|---|
| `tracking.MetricsDataSet` | Dataset to track numeric metrics for experiment tracking | `kedro.extras.datasets.tracking` |
| `tracking.JSONDataSet` | Dataset to track data for experiment tracking | `kedro.extras.datasets.tracking` |
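A minimal sketch of saving metrics with the new `tracking.MetricsDataSet`; the file path and metric values are illustrative, and the dataset is assumed to be save-only (loading raises an error):

```python
from kedro.extras.datasets.tracking import MetricsDataSet

metrics = MetricsDataSet(filepath="data/09_tracking/metrics.json")  # illustrative path
metrics.save({"accuracy": 0.92, "f1": 0.87})  # numeric metrics only; loading not supported
```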
### Bug fixes and other changes

- Bumped the minimum required `fsspec` version to 2021.04.
- Fixed the `kedro install` and `kedro build-reqs` flows when uninstalled dependencies are present in a project's `settings.py`, `context.py` or `hooks.py` (Issue #829).
- Pinned `dynaconf` to `<3.1.6` because the method signature for `_validate_items`, which is used in Kedro, changed.
### Minor breaking changes to the API

### Upcoming deprecations for Kedro 0.18.0

- `kedro pipeline list` and `kedro pipeline describe` are being deprecated in favour of the new commands `kedro registry list` and `kedro registry describe`.
- `kedro install` is being deprecated in favour of using `pip install -r src/requirements.txt` to install project dependencies.

### Thanks for supporting contributions
## 0.17.4

### Major features and improvements

- Added the following new datasets:

| Type | Description | Location |
|---|---|---|
| `plotly.PlotlyDataSet` | Works with plotly graph object Figures (saves as a json file) | `kedro.extras.datasets.plotly` |
### Bug fixes and other changes

- Defined our set of Kedro Principles! Have a read through our docs.
- `ConfigLoader.get()` now raises a `BadConfigException`, with a more helpful error message, if a configuration file cannot be loaded (for instance due to wrong syntax or poor formatting).
- `run_id` now defaults to `save_version` when `after_catalog_created` is called, similarly to what happens during a `kedro run`.
- Fixed a bug where `kedro ipython` and `kedro jupyter notebook` didn't work if the `PYTHONPATH` was already set.
- Updated the IPython extension to allow passing `env` and `extra_params` to `reload_kedro`, similar to how the IPython script works.
- `kedro info` now outputs whether a plugin has any `hooks` or `cli_hooks` implemented.
- `PartitionedDataSet` now supports lazily materializing data on save (see the sketch after this list).
- `kedro pipeline describe` now defaults to the `__default__` pipeline when no pipeline name is provided, and also shows the namespace the nodes belong to.
- Fixed an issue where `spark.SparkDataSet` with versioning enabled would throw a `VersionNotFoundError` when using databricks-connect from a remote machine and saving to the dbfs filesystem.
- `EmailMessageDataSet` added to the doctree.
- When node inputs do not pass validation, the error message is now shown as the most recent exception in the traceback (Issue #761).
- `kedro pipeline package` now only packages the parameter file that exactly matches the specified pipeline name, and the parameter files in a directory with the pipeline name.
- Extended support to newer versions of third-party dependencies (Issue #735).
- Ensured consistent references to `model input` tables in accordance with our Data Engineering convention.
- Changed behaviour so that `kedro pipeline package` takes the pipeline package version, rather than the kedro package version. If the pipeline package version is not present, then the package version is used.
- Launched GitHub Discussions and the Kedro Discord Server.
- Improved the error message when versioning is enabled for a dataset previously saved as non-versioned (Issue #625).
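A minimal sketch of lazy saving with `PartitionedDataSet`, where partition values are callables evaluated only at save time; the path and partition names are illustrative:

```python
import pandas as pd

from kedro.io import PartitionedDataSet

data_set = PartitionedDataSet(
    path="data/07_model_output/parts",  # illustrative path
    dataset="pandas.CSVDataSet",
)
data_set.save(
    {
        # each value is a callable, materialised only when its partition is written
        "part_a": lambda: pd.DataFrame({"x": [1]}),
        "part_b": lambda: pd.DataFrame({"x": [2]}),
    }
)
```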
## 0.17.3

### Major features and improvements

- Kedro plugins can now override built-in CLI commands.
- Added a `before_command_run` hook for plugins to add extra behaviour before Kedro CLI commands run.
- `pipelines` from `pipeline_registry.py` and `register_pipeline` hooks are now loaded lazily when they are first accessed, not on startup:

  ```python
  from kedro.framework.project import pipelines

  print(pipelines["__default__"])  # pipeline loading is only triggered here
  ```

### Bug fixes and other changes
- `TemplatedConfigLoader` now correctly inserts default values when no globals are supplied.
- Fixed a bug where the `KEDRO_ENV` environment variable had no effect on instantiating the `context` variable in an IPython session or a Jupyter notebook.
- Plugins with empty CLI groups are no longer displayed in the Kedro CLI help screen.
- Duplicate commands will no longer appear twice in the Kedro CLI help screen.
- CLI commands from sources with the same name will show under one list in the help screen.
- The setup of a Kedro project, including adding `src` to the path and configuring settings, is now handled via the `bootstrap_project` method.
- `configure_project` is invoked if a `package_name` is supplied to `KedroSession.create`. This is added for backward-compatibility purposes, to support a workflow that creates a `Session` manually. It will be removed in `0.18.0`.
- Stopped swallowing all `ModuleNotFoundError`s if `register_pipelines` is not found, so that a more helpful error message appears when a dependency is missing, e.g. Issue #722.
- When `kedro new` is invoked using a configuration yaml file, `output_dir` is no longer a required key; by default, the current working directory will be used.
- When `kedro new` is invoked using a configuration yaml file, the appropriate `prompts.yml` file is now used for validating the provided configuration. Previously, validation was always performed against the Kedro project template `prompts.yml` file.
- When a relative path to a starter template is provided, `kedro new` now generates user prompts to obtain configuration, rather than supplying empty configuration.
- Fixed an error when using starters on Windows with Python 3.7 (Issue #722).
- Fixed a decoding error for config files that contain accented characters, by opening them for reading in UTF-8.
- Fixed an issue where an `after_dataset_loaded` run would finish before a dataset was actually loaded when using the `--async` flag.
### Upcoming deprecations for Kedro 0.18.0

- `kedro.versioning.journal.Journal` will be removed.
- The following properties on `kedro.framework.context.KedroContext` will be removed:
  - `io`, in favour of `KedroContext.catalog`
  - `pipeline` (equivalent to `pipelines["__default__"]`)
  - `pipelines`, in favour of `kedro.framework.project.pipelines`
## 0.17.2

### Major features and improvements

- Added support for the `compress_pickle` backend to `PickleDataSet`.
- Enabled loading pipelines without creating a `KedroContext` instance:

  ```python
  from kedro.framework.project import pipelines

  print(pipelines)
  ```

- Projects generated with `kedro>=0.17.2`:
  - should define pipelines in `pipeline_registry.py` rather than `hooks.py` (see the sketch after this list);
  - when run as a package, will behave the same as `kedro run`.
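A minimal sketch of the `pipeline_registry.py` module such projects should define; the pipeline content is an empty placeholder:

```python
# src/<package_name>/pipeline_registry.py
from typing import Dict

from kedro.pipeline import Pipeline


def register_pipelines() -> Dict[str, Pipeline]:
    my_pipeline = Pipeline([])  # build your nodes here
    return {"__default__": my_pipeline}
```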
### Bug fixes and other changes

- If `settings.py` is not importable, the errors will be surfaced earlier in the process, rather than at runtime.
### Minor breaking changes to the API

- `kedro pipeline list` and `kedro pipeline describe` no longer accept the redundant `--env` parameter.
- `from kedro.framework.cli.cli import cli` no longer includes the `new` and `starter` commands.

### Upcoming deprecations for Kedro 0.18.0

- `kedro.framework.context.KedroContext.run` will be removed in release 0.18.0.