Releases: PrefectHQ/prefect
Cloud Ready
Changelog
0.6.0
Released July 16, 2019
Features
- Add the Prefect CLI for working with core objects both locally and in cloud - #1059
- Add RemoteEnvironment for simple executor based executions - #1215
- Add the ability to share caches across Tasks and Flows - #1222
- Add the ability to submit tasks to specific dask workers for task / worker affinity - #1229
Enhancements
- Refactor mapped caching to be independent of order - #1082
- Refactor caching to allow for caching across multiple runs - #1082
- Allow for custom secret names in Result Handlers - #1098
- Have `execute cloud-flow` CLI immediately set the flow run state to `Failed` if environment fails - #1122
- Validate configuration objects on initial load - #1136
- Add `auto_generated` property to Tasks for convenient filtering - #1135
- Disable dask work-stealing in kubernetes via scheduler config - #1166
- Implement backoff retry settings on Client calls - #1187
- Explicitly set Dask keys for a better Dask visualization experience - #1218
- Implement a local cache which persists for the duration of a Python session - #1221
- Implement in-process retries for Cloud Tasks which request retry in less than one minute - #1228
- Support `Client.login()` with API tokens - #1240
- Add live log streaming for `prefect run cloud` command - #1241
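The backoff retry entry above (#1187) does not say how the Client's retries are configured. As a rough illustration of the general technique only, here is a minimal stand-alone sketch of exponential backoff; `call_with_backoff` and all of its parameters are hypothetical names and are not part of Prefect's API.

```python
import time


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, factor=2.0, sleep=time.sleep):
    """Call fn(), retrying on any exception with exponentially growing delays.

    Hypothetical helper for illustration; not Prefect's Client implementation.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= factor


# Usage: a call that fails twice before succeeding.
attempts = {"count": 0}
waits = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Record the waits instead of actually sleeping.
result = call_with_backoff(flaky_call, sleep=waits.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.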
Task Library
- Add task to trigger AWS Step function workflow #1012
- Add task to copy files within Google Cloud Storage - #1206
- Add task for downloading files from Dropbox - #1205
Fixes
- Fix issue with mapped caching in Prefect Cloud - #1096
- Fix issue with Result Handlers deserializing incorrectly in Cloud - #1112
- Fix issue caused by breaking change in `marshmallow==3.0.0rc7` - #1151
- Fix issue with passing results to Prefect signals - #1163
- Fix issue with `flow.update` not preserving mapped edges - #1164
- Fix issue with Parameters and Context not being raw dictionaries - #1186
- Fix issue with asynchronous, long-running mapped retries in Prefect Cloud - #1208
- Fix issue with automatically applied collections to task call arguments when using the imperative API - #1211
Breaking Changes
- The CLI commands `prefect execute-flow` and `prefect execute-cloud-flow` no longer exist - #1059
- The `slack_notifier` state handler now uses a `webhook_secret` kwarg to pull the URL from a Secret - #1075
- Use GraphQL for Cloud logging - #1193
- Remove the `CloudResultHandler` default result handler - #1198
- Rename `LocalStorage` to `Local` - #1236
Contributors
Season 8
A Release Has No Name
Changelog
0.5.4
Released May 28, 2019
Features
- Add new `UnionSchedule` for combining multiple schedules, allowing for complex schedule specifications - #428
- Allow for Cloud users to securely pull Docker images from private registries - #1028
Enhancements
- Add `prefect_version` kwarg to `Docker` storage for controlling the version of prefect installed into your containers - #1010, #533
- Warn users if their Docker storage base image uses a different python version than their local machine - #999
- Add flow run id to k8s labels on Cloud Environment jobs / pods for easier filtering in deployment - #1016
- Allow for `SlackTask` to pull the Slack webhook URL from a custom named Secret - #1023
- Raise informative errors when Docker storage push / pull fails - #1029
- Standardized `__repr__`s for various classes, to remove inconsistencies - #617
- Allow for use of local images in Docker storage - #1052
- Allow for doc tests and doc generation to run without installing `all_extras` - #1057
Task Library
- Add task for creating new branches in a GitHub repository - #1011
- Add tasks to create, delete, invoke, and list AWS Lambda functions #1009
- Add tasks for integration with spaCy pipelines #1018
- Add tasks for querying Postgres database #1022
- Add task for waiting on a Docker container to run and optionally raising for nonzero exit code - #1061
- Add tasks for communicating with Redis #1021
Fixes
- Ensure that state change handlers are called even when unexpected initialization errors occur - #1015
- Fix an issue where a mypy assert relied on an unavailable import - #1034
- Fix an issue where user configurations were loaded after config interpolation had already taken place - #1037
- Fix an issue with saving a flow visualization to a file from a notebook - #1056
- Fix an issue in which mapped tasks incorrectly tried to run when their upstream was skipped - #1068
- Fix an issue in which mapped tasks were not using their caches locally - #1067
Breaking Changes
- Changed the signature of `configuration.load_configuration()` - #1037
- Local Secrets now raise `ValueError`s when not found in context - #1047
Contributors
The Release is Bright and Full of Features
Changelog
0.5.3
Released May 7, 2019
Features
Enhancements
- Flow now has optional `storage` keyword - #936
- Flow `environment` argument now defaults to a `CloudEnvironment` - #936
- `Queued` states accept `start_time` arguments - #955
- Add new `Bytes` and `Memory` storage classes for local testing - #956, #961
- Add new `LocalEnvironment` execution environment for local testing - #957
- Add new `Aborted` state for Flow runs which are cancelled by users - #959
- Added an `execute-cloud-flow` CLI command for working with cloud deployed flows - #971
- Add new `flows.run_on_schedule` configuration option for affecting the behavior of `flow.run` - #972
- Allow for Tasks with `manual_only` triggers to be root tasks - #667
- Allow compression of serialized flows #993
- Allow for serialization of user written result handlers - #623
- Allow for state to be serialized in certain triggers and cache validators - #949
- Add new `filename` keyword to `flow.visualize` for automatically saving visualizations - #1001
- Add new `LocalStorage` option for storing Flows locally - #1006
Task Library
- None
Fixes
- Fix Docker storage not pulling correct flow path - #968
- Fix `run_flow` loading to decode properly by using cloudpickle - #978
- Fix Docker storage for handling flow names with spaces and weird characters - #969
- Fix non-deterministic issue with mapping in the DaskExecutor - #943
Breaking Changes
- Remove `flow.id` and `task.id` attributes - #940
- Removed old WIP environments - #936 (Note: Changes from #936 regarding environments don't break any Prefect code because environments weren't used yet outside of Cloud.)
- Update `flow.deploy` and `client.deploy` to use `set_schedule_active` kwarg to match Cloud - #991
- Removed `Flow.generate_local_task_ids()` - #992
Contributors
- None
Unredacted: The 0.5.2 Release
0.5.2
Released April 19, 2019
Features
- Implement two new triggers that allow for specifying bounds on the number of failures or successes - #933
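The bounded triggers from #933 can be pictured with a small stand-alone sketch. The changelog does not name the triggers or their parameters, so `failures_within_bounds`, `at_least`, and `at_most` below are hypothetical, and plain strings stand in for Prefect state objects.

```python
def failures_within_bounds(upstream_states, at_least=None, at_most=None):
    """Return True when the number of failed upstream states lies within the bounds.

    Hypothetical helper: states are plain strings here, not Prefect State objects.
    """
    failed = sum(1 for state in upstream_states if state == "Failed")
    if at_least is not None and failed < at_least:
        return False
    if at_most is not None and failed > at_most:
        return False
    return True


states = ["Success", "Failed", "Failed", "Success"]
allowed = failures_within_bounds(states, at_least=1, at_most=2)
blocked = failures_within_bounds(states, at_most=1)
```

With two failures upstream, the first check passes and the second does not. A successes-based variant would count `"Success"` in the same way.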
Enhancements
- `DaskExecutor(local_processes=True)` supports timeouts - #886
- Calling `Secret.get()` from within a Flow context raises an informative error - #927
- Add new keywords to `Task.set_upstream` and `Task.set_downstream` for handling keyed and mapped dependencies - #823
- Downgrade default logging level to "INFO" from "DEBUG" - #935
- Add start times to queued states - #937
- Add `is_submitted` to states - #944
- Introduce new `ClientFailed` state - #938
Task Library
- Add task for sending Slack notifications via Prefect Slack App - #932
Fixes
- Fix issue with timeouts behaving incorrectly with unpickleable objects - #886
- Fix issue with Flow validation being performed even when eager validation was turned off - #919
- Fix issue with downstream tasks with `all_failed` triggers running if an upstream Client call fails in Cloud - #938
Breaking Changes
- Remove `prefect make user config` from cli commands - #904
- Change `set_schedule_active` keyword in Flow deployments to `set_schedule_inactive` to match Cloud - #941
Contributors
- None
It Takes a Village
0.5.1
Released April 4, 2019
Features
- API reference documentation is now versioned - #270
- Add `S3ResultHandler` for handling results to / from S3 buckets - #879
- Add ability to use `Cached` states across flow runs in Cloud - #885
Enhancements
- Bump to latest version of `pytest` (4.3) - #814
- `Client.deploy` accepts optional `build` kwarg for avoiding building Flow environment - #876
- Bump `distributed` to 1.26.1 for enhanced security features - #878
- Local secrets automatically attempt to load secrets as JSON - #883
- Add task logger to context for easily creating custom logs during task runs - #884
Task Library
- Add `ParseRSSFeed` for parsing a remote RSS feed - #856
- Add tasks for working with Docker containers and images - #864
- Add task for creating a BigQuery table - #895
Fixes
- Only checkpoint tasks if running in cloud - #839, #854
- Adjusted small flake8 issues for names, imports, and comparisons - #849
- Fix bug preventing `flow.run` from properly using cached tasks - #861
- Fix tempfile usage in `flow.visualize` so that it runs on Windows machines - #858
- Fix issue caused by Python 3.5.2 bug for Python 3.5.2 compatibility - #857
- Fix issue in which `GCSResultHandler` was not pickleable - #879
- Fix issue with automatically converting callables and dicts to tasks - #894
Breaking Changes
- Change the call signature of `Dict` task from `run(**task_results)` to `run(keys, values)` - #894
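To make the `Dict` signature change concrete, here is a minimal sketch of the two call shapes using plain functions. `run_old` and `run_new` are hypothetical stand-ins, not the task's actual implementation. One plausible consequence, assumed here rather than stated in the changelog, is that keys no longer have to be strings.

```python
def run_old(**task_results):
    # Old shape: results arrive as keyword arguments, so every key must be a string.
    return dict(task_results)


def run_new(keys, values):
    # New shape: keys and values arrive as two parallel sequences.
    return dict(zip(keys, values))


old_result = run_old(a=1, b=2)
new_result = run_new(["a", 2], [1, "two"])
```

Code that calls `run` directly with keyword arguments would need to be updated to pass the two sequences instead.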
Contributors
Open Source Launch!
0.5.0
Released March 24, 2019
Features
- Add `checkpoint` option for individual `Task`s, as well as a global `checkpoint` config setting for storing the results of Tasks using their result handlers - #649
- Add `defaults_from_attrs` decorator to easily construct `Task`s whose attributes serve as defaults for `Task.run` - #293
- Environments follow new hierarchy (PIN-3) - #670
- Add `OneTimeSchedule` for one-time execution at a specified time - #680
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- Pre-populate `prefect.context` with various formatted date strings during execution - #704
- Add ability to overwrite task attributes such as "name" when calling tasks in the functional API - #717
- Release Prefect Core under the Apache 2.0 license - #762
Enhancements
- Refactor all `State` objects to store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- Add `google.cloud.storage` as an optional extra requirement so that the `GCSResultHandler` can be exposed better - #626
- Add a `start_time` check for Scheduled flow runs, similar to the one for Task runs - #605
- Project names can now be specified for deployments instead of IDs - #633
- Add a `createProject` mutation function to the client - #633
- Add timestamp to auto-generated API docs footer - #639
- Refactor `Result` interface into `Result` and `SafeResult` - #649
- The `manual_only` trigger will pass if `resume=True` is found in context, which indicates that a `Resume` state was passed - #664
- Added DockerOnKubernetes environment (PIN-3) - #670
- Added Prefect docker image (PIN-3) - #670
- `defaults_from_attrs` now accepts a splatted list of arguments - #676
- Add retry functionality to `flow.run(on_schedule=True)` for local execution - #680
- Add `helper_fns` keyword to `ShellTask` for pre-populating helper functions to commands - #681
- Convert a few DEBUG level logs to INFO level logs - #682
- Added DaskOnKubernetes environment (PIN-3) - #695
- Load `context` from Cloud when running flows - #699
- Add `Queued` state - #705
- `flow.serialize()` will always serialize its environment, regardless of `build` - #696
- `flow.deploy()` now raises an informative error if your container cannot deserialize the Flow - #711
- Add `_MetaState` as a parent class for states that modify other states - #726
- Add `flow` keyword argument to `Task.set_upstream()` and `Task.set_downstream()` - #749
- Add `is_retrying()` helper method to all `State` objects - #753
- Allow for state handlers which return `None` - #753
- Add daylight saving time support for `CronSchedule` - #729
- Add `idempotency_key` and `context` arguments to `Client.create_flow_run` - #757
- Make `EmailTask` more secure by pulling credentials from secrets - #706
Task Library
- Add `GCSUpload` and `GCSDownload` for uploading / retrieving string data to / from Google Cloud Storage - #673
- Add `BigQueryTask` and `BigQueryInsertTask` for executing queries against BigQuery tables and inserting data - #678, #685
- Add `FilterTask` for filtering out lists of results - #637
- Add `S3Download` and `S3Upload` for interacting with data stored on AWS S3 - #692
- Add `AirflowTask` and `AirflowTriggerDAG` tasks to the task library for running individual Airflow tasks / DAGs - #735
- Add `OpenGitHubIssue` and `CreateGitHubPR` tasks for interacting with GitHub repositories - #771
- Add Kubernetes tasks for deployments, jobs, pods, and services - #779
- Add Airtable tasks - #803
- Add Twitter tasks - #803
- Add `GetRepoInfo` for pulling GitHub repository information - #816
Fixes
- Fix edge case in doc generation in which some `Exception`s' call signature could not be inspected - #513
- Fix bug in which exceptions raised within flow runner state handlers could not be sent to Cloud - #628
- Fix issue wherein heartbeats were not being called on a fixed interval - #669
- Fix issue wherein code blocks inside of method docs couldn't use `**kwargs` - #658
- Fix bug in which Prefect-generated Keys for S3 buckets were not properly converted to strings - #698
- Fix next line after Docker Environment push/pull from overwriting progress bar - #702
- Fix issue with `JinjaTemplate` not being pickleable - #710
- Fix issue with creating secrets from JSON documents using the Core Client - #715
- Fix issue with deserialization of JSON secrets unnecessarily calling `json.loads` - #716
- Fix issue where `IntervalSchedules` didn't respect daylight saving time after serialization - #729
Breaking Changes
- Remove the `BokehRunner` and associated webapp - #609
- Rename `ResultHandler` methods from `serialize/deserialize` to `write/read` - #612
- Refactor all `State` objects to store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- `Client.create_flow_run` now returns a string instead of a `GraphQLResult` object to match the API of `deploy` - #630
- `flow.deploy` and `client.deploy` require a `project_name` instead of an ID - #633
- Upstream state results now take precedence for task inputs over `cached_inputs` - #591
- Rename `Match` task (used inside control flow) to `CompareValue` - #638
- `Client.graphql()` now returns a response with up to two keys (`data` and `errors`). Previously the `data` key was automatically selected - #642
- `ContainerEnvironment` was changed to `DockerEnvironment` - #670
- The environment `from_file` was moved to `utilities.environments` - #670
- Removed `start_tasks` argument from `FlowRunner.run()` and `check_upstream` argument from `TaskRunner.run()` - #672
- Remove support for Python 3.4 - #671
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- Remove `make_return_failed_handler` as `flow.run` now returns all task states - #693
- Refactor Airflow migration tools into a single `AirflowTask` in the task library for running individual Airflow tasks - #735
- `name` is now required on all Flow objects - #732
- Separate installation "extras" packages into multiple, smaller extras - #739
...
Version 0.4.1
Major Features
- Add ability to run scheduled flows locally via `on_schedule` kwarg in `flow.run()` - #519
- Allow tasks to specify their own result handlers, ensure inputs and outputs are stored only when necessary, and ensure no raw data is sent to the database - #587
Minor Features
- Allow for building `ContainerEnvironment`s locally without pushing to registry - #514
- Make mapping more robust when running children tasks multiple times - #541
- Always prefer `cached_inputs` over upstream states, if available - #546
- Add hooks to `FlowRunner.initialize_run()` for manipulating task states and contexts - #548
- Improve state-loading strategy for Prefect Cloud - #555
- Introduce `on_failure` kwarg to Tasks and Flows for user-friendly failure callbacks - #551
- Include `scheduled_start_time` in context for Flow runs - #524
- Add GitHub PR template - #542
- Allow flows to be deployed to Prefect Cloud without a project id - #571
- Introduce serialization schemas for ResultHandlers - #572
- Add new `metadata` attribute to States for managing user-generated results - #573
- Add new `JSONResultHandler` for serializing small bits of data without external storage - #576
- Use `JSONResultHandler` for all Parameter caching - #590
Fixes
- Fixed `flow.deploy()` attempting to access a nonexistent string attribute - #503
- Ensure all logs make it to the logger service in deployment - #508, #552
- Fix a situation where `Paused` tasks would be treated as `Pending` and run - #535
- Ensure errors raised in state handlers are trapped appropriately in Cloud Runners - #554
- Ensure unexpected errors raised in FlowRunners are robustly handled - #568
- Fixed non-deterministic errors in mapping caused by clients resolving futures of other clients - #569
- Older versions of Prefect will now ignore fields added by newer versions when deserializing objects - #583
- Result handler failures now result in clear task run failures - #575
- Fix issue deserializing old states with empty metadata - #590
- Fix issue serializing `cached_inputs` - #594
Breaking Changes
- Move `prefect.client.result_handlers` to `prefect.engine.result_handlers` - #512
- Removed `inputs` kwarg from `TaskRunner.run()` - #546
- Moves the `start_task_ids` argument from `FlowRunner.run()` to `Environment.run()` - #544, #545
- Convert `timeout` kwarg from `timedelta` to `integer` - #540
- Remove `timeout` kwarg from `executor.wait` - #569
- Serialization of States will ignore any result data that hasn't been processed - #581
- Removes `VersionedSchema` in favor of implicit versioning: serializers will ignore unknown fields and the `create_object` method is responsible for recreating missing ones - #583
- Convert and rename `CachedState` to a successful state named `Cached`, and also remove the superfluous `cached_result` attribute - #586
Version 0.4.0
Major Features
- Add support for Prefect Cloud - #374, #406, #473, #491
- Add versioned serialization schemas for `Flow`, `Task`, `Parameter`, `Edge`, `State`, `Schedule`, and `Environment` objects - #310, #318, #319, #340
- Add ability to provide `ResultHandler`s for storing private result data - #391, #394, #430
- Support depth-first execution of mapped tasks and tracking of both the static "parent" and dynamic "children" via `Mapped` states - #485
Minor Features
- Add new `TimedOut` state for task execution timeouts - #255
- Use timezone-aware dates throughout Prefect - #325
- Add `description` and `tags` arguments to `Parameter`s - #318
- Allow edge `key` checks to be skipped in order to create "dummy" flows from metadata - #319
- Add new `names_only` keyword to `flow.parameters` - #337
- Add utility for building GraphQL queries and simple schemas from Python objects - #342
- Add links to downloadable Jupyter notebooks for all tutorials - #212
- Add `to_dict` convenience method for `DotDict` class - #341
- Refactor requirements to a custom `ini` file specification - #347
- Refactor API documentation specification to `toml` file - #361
- Add new SQLite tasks for basic SQL scripting and querying - #291
- Executors now pass `map_index` into the `TaskRunner`s - #373
- All schedules support `start_date` and `end_date` parameters - #375
- Add `DateTime` marshmallow field for timezone-aware serialization - #378
- Adds ability to put variables into context via the config - #381
- Adds new `client.deploy` method for adding new flows to the Prefect Cloud - #388
- Add `id` attribute to `Task` class - #416
- Add new `Resume` state for resuming from `Paused` tasks - #435
- Add support for heartbeats - #436
- Add new `Submitted` state for signaling that `Scheduled` tasks have been handled - #445
- Add ability to add custom environment variables and copy local files into `ContainerEnvironment`s - #453
- Add `set_secret` method to Client for creating and setting the values of user secrets - #452
- Refactor runners into `CloudTaskRunner` and `CloudFlowRunner` classes - #431
- Added functions for loading default `engine` classes from config - #477
Fixes
- Fixed issue with `GraphQLResult` reprs - #374
- `CronSchedule` produces expected results across daylight savings time transitions - #375
- `utilities.serialization.Nested` properly respects `marshmallow.missing` values - #398
- Fixed issue in capturing unexpected mapping errors during task runs - #409
- Fixed issue in `flow.visualize()` so that mapped flow states can be passed and colored - #387
- Fixed issue where `IntervalSchedule` was serialized at "second" resolution, not lower - #427
- Fixed issue where `SKIP` signals were preventing multiple layers of mapping - #455
- Fixed issue with multi-layer mapping in `flow.visualize()` - #454
- Fixed issue where Prefect Cloud `cached_inputs` weren't being used locally - #434
- Fixed issue where `Config.set_nested` would have an error if the provided key was nested deeper than an existing terminal key - #479
- Fixed issue where `state_handlers` were not called for certain signals - #494
Breaking Changes
- Remove `NoSchedule` and `DateSchedule` schedule classes - #324
- Change `serialize()` method to use schemas rather than custom dict - #318
- Remove `timestamp` property from `State` classes - #305
- Remove the custom JSON encoder library at `prefect.utilities.json` - #336
- `flow.parameters` now returns a set of parameters instead of a dictionary - #337
- Renamed `to_dotdict` -> `as_nested_dict` - #339
- Moved `prefect.utilities.collections.GraphQLResult` to `prefect.utilities.graphql.GraphQLResult` - #371
- `SynchronousExecutor` now does not do depth first execution for mapped tasks - #373
- Renamed `prefect.utilities.serialization.JSONField` -> `JSONCompatible`, removed its `max_size` feature, and no longer automatically serialize payloads as strings - #376
- Renamed `prefect.utilities.serialization.NestedField` -> `Nested` - #376
- Renamed `prefect.utilities.serialization.NestedField.dump_fn` -> `NestedField.value_selection_fn` for clarity - #377
- Local secrets are now pulled from `secrets` in context instead of `_secrets` - #382
- Remove Task and Flow descriptions, Flow project & version attributes - #383
- Changed `Schedule` parameter from `on_or_after` to `after` - #396
- Environments are immutable and return `dict` keys instead of `str`; some arguments for `ContainerEnvironment` are removed - #398
- `environment.run()` and `environment.build()`; removed the `flows` CLI and replaced it with a top-level CLI command, `prefect run` - #400
- The `set_temporary_config` utility now accepts a single dict of multiple config values, instead of just a key/value pair, and is located in `utilities.configuration` - #401
- Bump `click` requirement to 7.0, which changes underscores to hyphens at CLI - #409
- `IntervalSchedule` rejects intervals of less than one minute - #427
- `FlowRunner` returns a `Running` state, not a `Pending` state, when flows do not finish - #433
- Remove the `task_contexts` argument from `FlowRunner.run()` - #440
- Remove the leading underscore from Prefect-set context keys - #446
- Removed throttling tasks within the local cluster - #470
- Even `start_tasks` will not run before their state's `start_time` (if the state is `Scheduled`) - #474
- `DaskExecutor`'s "processes" keyword argument was renamed "local_processes" - #477
- Removed the `mapped` and `map_index` kwargs from `TaskRunner.run()`. These values are now inferred automatically - #485
- The `upstream_states` dictionary used by the Runners only includes `State` values, not lists of `States`. The use case that required lists of `States` is now covered by the `Mapped` state. - #485
Version 0.3.3
Major Features
- Refactor `FlowRunner` and `TaskRunner` into modular `Runner` pipelines - #260, #267
- Add configurable `state_handlers` for `FlowRunner`s, `Flow`s, `TaskRunner`s, and `Task`s - #264, #267
- Add gmail and slack notification state handlers w/ tutorial - #274, #294
Minor Features
- Add a new method `flow.get_tasks()` for easily filtering flow tasks by attribute - #242
- Add new `JinjaTemplateTask` for easily rendering jinja templates - #200
- Add new `PAUSE` signal for halting task execution - #246
- Add new `Paused` state corresponding to `PAUSE` signal, and new `pause_task` utility - #251
- Add ability to timeout task execution for all executors except `DaskExecutor(processes=True)` - #240
- Add explicit unit test to check Black formatting (Python 3.6+) - #261
- Add ability to set local secrets in user config file - #231, #274
- Add `is_skipped()` and `is_scheduled()` methods for `State` objects - #266, #278
- Adds `now()` as a default `start_time` for `Scheduled` states - #278
- `Signal` classes now pass arguments to underlying `State` objects - #279
- Run counts are tracked via `Retrying` states - #281
Fixes
- Flow consistently raises if passed a parameter that doesn't exist - #149