Releases: PrefectHQ/prefect
Open Source Launch!
0.5.0
Released March 24, 2019
Features
- Add `checkpoint` option for individual `Task`s, as well as a global `checkpoint` config setting for storing the results of Tasks using their result handlers - #649
- Add `defaults_from_attrs` decorator to easily construct `Task`s whose attributes serve as defaults for `Task.run` - #293
- Environments follow new hierarchy (PIN-3) - #670
- Add `OneTimeSchedule` for one-time execution at a specified time - #680
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- Pre-populate `prefect.context` with various formatted date strings during execution - #704
- Add ability to overwrite task attributes such as "name" when calling tasks in the functional API - #717
- Release Prefect Core under the Apache 2.0 license - #762
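The `defaults_from_attrs` entry (#293) describes a decorator that lets instance attributes act as defaults for `run`. The following is a minimal pure-Python sketch of that idea, not Prefect's actual implementation: the `Greeter` class is a hypothetical stand-in for a `Task` subclass, and the sketch assumes arguments are passed by keyword and that `None` means "not provided".

```python
from functools import wraps


def defaults_from_attrs(*attr_names):
    """Fill missing keyword arguments from same-named instance attributes."""

    def decorator(run_method):
        @wraps(run_method)
        def wrapper(self, **kwargs):
            for name in attr_names:
                # Assumption: a value of None is treated as "not provided".
                if kwargs.get(name) is None:
                    kwargs[name] = getattr(self, name)
            return run_method(self, **kwargs)

        return wrapper

    return decorator


class Greeter:
    """Hypothetical stand-in for a Task subclass."""

    def __init__(self, greeting="hello", name="world"):
        self.greeting = greeting
        self.name = name

    @defaults_from_attrs("greeting", "name")
    def run(self, greeting=None, name=None):
        return f"{greeting}, {name}"


task = Greeter(greeting="hi")
print(task.run())                # hi, world
print(task.run(name="Prefect"))  # hi, Prefect
```

Values given at call time win over the attributes set at construction, which is the behavior the changelog entry suggests.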
Enhancements
- Refactor all `State` objects to store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- Add `google.cloud.storage` as an optional extra requirement so that the `GCSResultHandler` can be exposed better - #626
- Add a `start_time` check for Scheduled flow runs, similar to the one for Task runs - #605
- Project names can now be specified for deployments instead of IDs - #633
- Add a `createProject` mutation function to the client - #633
- Add timestamp to auto-generated API docs footer - #639
- Refactor `Result` interface into `Result` and `SafeResult` - #649
- The `manual_only` trigger will pass if `resume=True` is found in context, which indicates that a `Resume` state was passed - #664
- Added DockerOnKubernetes environment (PIN-3) - #670
- Added Prefect docker image (PIN-3) - #670
- `defaults_from_attrs` now accepts a splatted list of arguments - #676
- Add retry functionality to `flow.run(on_schedule=True)` for local execution - #680
- Add `helper_fns` keyword to `ShellTask` for pre-populating helper functions to commands - #681
- Convert a few DEBUG level logs to INFO level logs - #682
- Added DaskOnKubernetes environment (PIN-3) - #695
- Load `context` from Cloud when running flows - #699
- Add `Queued` state - #705
- `flow.serialize()` will always serialize its environment, regardless of `build` - #696
- `flow.deploy()` now raises an informative error if your container cannot deserialize the Flow - #711
- Add `_MetaState` as a parent class for states that modify other states - #726
- Add `flow` keyword argument to `Task.set_upstream()` and `Task.set_downstream()` - #749
- Add `is_retrying()` helper method to all `State` objects - #753
- Allow for state handlers which return `None` - #753
- Add daylight saving time support for `CronSchedule` - #729
- Add `idempotency_key` and `context` arguments to `Client.create_flow_run` - #757
- Make `EmailTask` more secure by pulling credentials from secrets - #706
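To make the "state handlers which return `None`" entry (#753) concrete, here is a small pure-Python sketch of how a runner might chain handlers. It does not use Prefect itself. The `call_state_handlers` helper and the plain-string states are hypothetical, and the sketch assumes handlers take `(obj, old_state, new_state)` and that a `None` return leaves the proposed state unchanged.

```python
from typing import Callable, List, Optional

State = str  # hypothetical stand-in for a real State object
Handler = Callable[[object, State, State], Optional[State]]


def call_state_handlers(
    obj: object, old_state: State, new_state: State, handlers: List[Handler]
) -> State:
    """Run each handler in order; a None return keeps the current new_state."""
    for handler in handlers:
        result = handler(obj, old_state, new_state)
        if result is not None:
            new_state = result
    return new_state


seen = []


def log_only(obj, old_state, new_state):
    # Side effect only; implicitly returns None.
    seen.append((old_state, new_state))


def retry_on_failure(obj, old_state, new_state):
    # Replaces a failure with a retry, otherwise passes the state through.
    return "Retrying" if new_state == "Failed" else new_state


final = call_state_handlers(
    "my-task", "Running", "Failed", [log_only, retry_on_failure]
)
print(final)  # Retrying
```

Under this assumption, a handler written purely for logging or notification no longer needs an explicit `return new_state`.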
Task Library
- Add `GCSUpload` and `GCSDownload` for uploading / retrieving string data to / from Google Cloud Storage - #673
- Add `BigQueryTask` and `BigQueryInsertTask` for executing queries against BigQuery tables and inserting data - #678, #685
- Add `FilterTask` for filtering out lists of results - #637
- Add `S3Download` and `S3Upload` for interacting with data stored on AWS S3 - #692
- Add `AirflowTask` and `AirflowTriggerDAG` tasks to the task library for running individual Airflow tasks / DAGs - #735
- Add `OpenGitHubIssue` and `CreateGitHubPR` tasks for interacting with GitHub repositories - #771
- Add Kubernetes tasks for deployments, jobs, pods, and services - #779
- Add Airtable tasks - #803
- Add Twitter tasks - #803
- Add `GetRepoInfo` for pulling GitHub repository information - #816
Fixes
- Fix edge case in doc generation in which some `Exception`s' call signature could not be inspected - #513
- Fix bug in which exceptions raised within flow runner state handlers could not be sent to Cloud - #628
- Fix issue wherein heartbeats were not being called on a fixed interval - #669
- Fix issue wherein code blocks inside of method docs couldn't use `**kwargs` - #658
- Fix bug in which Prefect-generated Keys for S3 buckets were not properly converted to strings - #698
- Fix next line after Docker Environment push/pull from overwriting progress bar - #702
- Fix issue with `JinjaTemplate` not being pickleable - #710
- Fix issue with creating secrets from JSON documents using the Core Client - #715
- Fix issue with deserialization of JSON secrets unnecessarily calling `json.loads` - #716
- Fix issue where `IntervalSchedules` didn't respect daylight saving time after serialization - #729
Breaking Changes
- Remove the `BokehRunner` and associated webapp - #609
- Rename `ResultHandler` methods from `serialize/deserialize` to `write/read` - #612
- Refactor all `State` objects to store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- `Client.create_flow_run` now returns a string instead of a `GraphQLResult` object to match the API of `deploy` - #630
- `flow.deploy` and `client.deploy` require a `project_name` instead of an ID - #633
- Upstream state results now take precedence for task inputs over `cached_inputs` - #591
- Rename `Match` task (used inside control flow) to `CompareValue` - #638
- `Client.graphql()` now returns a response with up to two keys (`data` and `errors`). Previously the `data` key was automatically selected - #642
- `ContainerEnvironment` was changed to `DockerEnvironment` - #670
- The environment `from_file` was moved to `utilities.environments` - #670
- Removed `start_tasks` argument from `FlowRunner.run()` and `check_upstream` argument from `TaskRunner.run()` - #672
- Remove support for Python 3.4 - #671
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- Remove `make_return_failed_handler` as `flow.run` now returns all task states - #693
- Refactor Airflow migration tools into a single `AirflowTask` in the task library for running individual Airflow tasks - #735
- `name` is now required on all Flow objects - #732
- Separate installation "extras" packages into multiple, smaller extras - #739
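The `serialize/deserialize` to `write/read` rename (#612) is easier to picture with a toy handler. The classes below are hypothetical and do not subclass anything from Prefect. They sketch the shape such a handler could take under the new names, assuming `write` returns a string that `read` later accepts.

```python
import json
import os
import tempfile


class JSONStringHandler:
    """Hypothetical handler: the 'location' is the JSON payload itself."""

    def write(self, result):
        return json.dumps(result)

    def read(self, blob):
        return json.loads(blob)


class LocalFileHandler:
    """Hypothetical handler: stores the result on disk and returns its path."""

    def __init__(self, directory):
        self.directory = directory

    def write(self, result):
        # mkstemp creates a uniquely named file inside the target directory.
        fd, path = tempfile.mkstemp(suffix=".json", dir=self.directory)
        with os.fdopen(fd, "w") as f:
            json.dump(result, f)
        return path

    def read(self, path):
        with open(path) as f:
            return json.load(f)


handler = JSONStringHandler()
location = handler.write({"rows": [1, 2, 3]})
print(handler.read(location))  # {'rows': [1, 2, 3]}
```

Code written against the old method names would need the two methods renamed. The round trip (`read(write(x)) == x`) is the property either handler is expected to keep.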
...
Version 0.4.1
Major Features
- Add ability to run scheduled flows locally via `on_schedule` kwarg in `flow.run()` - #519
- Allow tasks to specify their own result handlers, ensure inputs and outputs are stored only when necessary, and ensure no raw data is sent to the database - #587
Minor Features
- Allow for building `ContainerEnvironment`s locally without pushing to registry - #514
- Make mapping more robust when running children tasks multiple times - #541
- Always prefer `cached_inputs` over upstream states, if available - #546
- Add hooks to `FlowRunner.initialize_run()` for manipulating task states and contexts - #548
- Improve state-loading strategy for Prefect Cloud - #555
- Introduce `on_failure` kwarg to Tasks and Flows for user-friendly failure callbacks - #551
- Include `scheduled_start_time` in context for Flow runs - #524
- Add GitHub PR template - #542
- Allow flows to be deployed to Prefect Cloud without a project id - #571
- Introduce serialization schemas for ResultHandlers - #572
- Add new `metadata` attribute to States for managing user-generated results - #573
- Add new `JSONResultHandler` for serializing small bits of data without external storage - #576
- Use `JSONResultHandler` for all Parameter caching - #590
Fixes
- Fixed `flow.deploy()` attempting to access a nonexistent string attribute - #503
- Ensure all logs make it to the logger service in deployment - #508, #552
- Fix a situation where `Paused` tasks would be treated as `Pending` and run - #535
- Ensure errors raised in state handlers are trapped appropriately in Cloud Runners - #554
- Ensure unexpected errors raised in FlowRunners are robustly handled - #568
- Fixed non-deterministic errors in mapping caused by clients resolving futures of other clients - #569
- Older versions of Prefect will now ignore fields added by newer versions when deserializing objects - #583
- Result handler failures now result in clear task run failures - #575
- Fix issue deserializing old states with empty metadata - #590
- Fix issue serializing `cached_inputs` - #594
Breaking Changes
- Move `prefect.client.result_handlers` to `prefect.engine.result_handlers` - #512
- Removed `inputs` kwarg from `TaskRunner.run()` - #546
- Moves the `start_task_ids` argument from `FlowRunner.run()` to `Environment.run()` - #544, #545
- Convert `timeout` kwarg from `timedelta` to `integer` - #540
- Remove `timeout` kwarg from `executor.wait` - #569
- Serialization of States will ignore any result data that hasn't been processed - #581
- Removes `VersionedSchema` in favor of implicit versioning: serializers will ignore unknown fields and the `create_object` method is responsible for recreating missing ones - #583
- Convert and rename `CachedState` to a successful state named `Cached`, and also remove the superfluous `cached_result` attribute - #586
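The "implicit versioning" entry (#583) says serializers ignore unknown fields and that `create_object` recreates missing ones. The sketch below illustrates that behavior with the standard library only. The `TaskRecord` dataclass and this standalone `create_object` function are hypothetical and are not the marshmallow-based schemas the changelog refers to.

```python
from dataclasses import dataclass, fields


@dataclass
class TaskRecord:
    """Hypothetical object as understood by an older library version."""

    name: str
    max_retries: int = 0  # default used when the payload omits the field


def create_object(payload: dict) -> TaskRecord:
    """Build a TaskRecord, dropping unknown keys and defaulting missing ones."""
    known = {f.name for f in fields(TaskRecord)}
    kept = {key: value for key, value in payload.items() if key in known}
    return TaskRecord(**kept)


# Payload written by a newer version that added a "cache_key" field.
from_newer = create_object({"name": "extract", "max_retries": 2, "cache_key": "abc"})
# Payload written by an older version that predates "max_retries".
from_older = create_object({"name": "extract"})

print(from_newer)  # TaskRecord(name='extract', max_retries=2)
print(from_older)  # TaskRecord(name='extract', max_retries=0)
```

The trade-off of this style is that compatibility no longer depends on an explicit version tag, but every new field needs a sensible default.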
Version 0.4.0
Major Features
- Add support for Prefect Cloud - #374, #406, #473, #491
- Add versioned serialization schemas for `Flow`, `Task`, `Parameter`, `Edge`, `State`, `Schedule`, and `Environment` objects - #310, #318, #319, #340
- Add ability to provide `ResultHandler`s for storing private result data - #391, #394, #430
- Support depth-first execution of mapped tasks and tracking of both the static "parent" and dynamic "children" via `Mapped` states - #485
Minor Features
- Add new `TimedOut` state for task execution timeouts - #255
- Use timezone-aware dates throughout Prefect - #325
- Add `description` and `tags` arguments to `Parameter`s - #318
- Allow edge `key` checks to be skipped in order to create "dummy" flows from metadata - #319
- Add new `names_only` keyword to `flow.parameters` - #337
- Add utility for building GraphQL queries and simple schemas from Python objects - #342
- Add links to downloadable Jupyter notebooks for all tutorials - #212
- Add `to_dict` convenience method for `DotDict` class - #341
- Refactor requirements to a custom `ini` file specification - #347
- Refactor API documentation specification to `toml` file - #361
- Add new SQLite tasks for basic SQL scripting and querying - #291
- Executors now pass `map_index` into the `TaskRunner`s - #373
- All schedules support `start_date` and `end_date` parameters - #375
- Add `DateTime` marshmallow field for timezone-aware serialization - #378
- Adds ability to put variables into context via the config - #381
- Adds new `client.deploy` method for adding new flows to the Prefect Cloud - #388
- Add `id` attribute to `Task` class - #416
- Add new `Resume` state for resuming from `Paused` tasks - #435
- Add support for heartbeats - #436
- Add new `Submitted` state for signaling that `Scheduled` tasks have been handled - #445
- Add ability to add custom environment variables and copy local files into `ContainerEnvironment`s - #453
- Add `set_secret` method to Client for creating and setting the values of user secrets - #452
- Refactor runners into `CloudTaskRunner` and `CloudFlowRunner` classes - #431
- Added functions for loading default `engine` classes from config - #477
Fixes
- Fixed issue with `GraphQLResult` reprs - #374
- `CronSchedule` produces expected results across daylight savings time transitions - #375
- `utilities.serialization.Nested` properly respects `marshmallow.missing` values - #398
- Fixed issue in capturing unexpected mapping errors during task runs - #409
- Fixed issue in `flow.visualize()` so that mapped flow states can be passed and colored - #387
- Fixed issue where `IntervalSchedule` was serialized at "second" resolution, not lower - #427
- Fixed issue where `SKIP` signals were preventing multiple layers of mapping - #455
- Fixed issue with multi-layer mapping in `flow.visualize()` - #454
- Fixed issue where Prefect Cloud `cached_inputs` weren't being used locally - #434
- Fixed issue where `Config.set_nested` would have an error if the provided key was nested deeper than an existing terminal key - #479
- Fixed issue where `state_handlers` were not called for certain signals - #494
Breaking Changes
- Remove `NoSchedule` and `DateSchedule` schedule classes - #324
- Change `serialize()` method to use schemas rather than custom dict - #318
- Remove `timestamp` property from `State` classes - #305
- Remove the custom JSON encoder library at `prefect.utilities.json` - #336
- `flow.parameters` now returns a set of parameters instead of a dictionary - #337
- Renamed `to_dotdict` -> `as_nested_dict` - #339
- Moved `prefect.utilities.collections.GraphQLResult` to `prefect.utilities.graphql.GraphQLResult` - #371
- `SynchronousExecutor` now does not do depth first execution for mapped tasks - #373
- Renamed `prefect.utilities.serialization.JSONField` -> `JSONCompatible`, removed its `max_size` feature, and no longer automatically serialize payloads as strings - #376
- Renamed `prefect.utilities.serialization.NestedField` -> `Nested` - #376
- Renamed `prefect.utilities.serialization.NestedField.dump_fn` -> `NestedField.value_selection_fn` for clarity - #377
- Local secrets are now pulled from `secrets` in context instead of `_secrets` - #382
- Remove Task and Flow descriptions, Flow project & version attributes - #383
- Changed `Schedule` parameter from `on_or_after` to `after` - #396
- Environments are immutable and return `dict` keys instead of `str`; some arguments for `ContainerEnvironment` are removed - #398
- `environment.run()` and `environment.build()`; removed the `flows` CLI and replaced it with a top-level CLI command, `prefect run` - #400
- The `set_temporary_config` utility now accepts a single dict of multiple config values, instead of just a key/value pair, and is located in `utilities.configuration` - #401
- Bump `click` requirement to 7.0, which changes underscores to hyphens at CLI - #409
- `IntervalSchedule` rejects intervals of less than one minute - #427
- `FlowRunner` returns a `Running` state, not a `Pending` state, when flows do not finish - #433
- Remove the `task_contexts` argument from `FlowRunner.run()` - #440
- Remove the leading underscore from Prefect-set context keys - #446
- Removed throttling tasks within the local cluster - #470
- Even `start_tasks` will not run before their state's `start_time` (if the state is `Scheduled`) - #474
- `DaskExecutor`'s "processes" keyword argument was renamed "local_processes" - #477
- Removed the `mapped` and `map_index` kwargs from `TaskRunner.run()`. These values are now inferred automatically - #485
- The `upstream_states` dictionary used by the Runners only includes `State` values, not lists of `States`. The use case that required lists of `States` is now covered by the `Mapped` state. - #485
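The `set_temporary_config` change (#401), from a single key/value pair to a dict of several values, can be sketched as a context manager. This is an illustrative stand-alone version, not the code in `utilities.configuration`. The module-level `config` dict, the `set_nested` helper, and the dotted-key convention are assumptions made for the example.

```python
import copy
from contextlib import contextmanager

# Hypothetical global configuration for the example.
config = {"debug": False, "tasks": {"max_retries": 0}}


def set_nested(target: dict, dotted_key: str, value) -> None:
    """Set target['a']['b'] = value for a dotted key such as 'a.b'."""
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        target = target.setdefault(key, {})
    target[leaf] = value


@contextmanager
def set_temporary_config(temp_config: dict):
    """Apply several config overrides at once and restore them on exit."""
    backup = copy.deepcopy(config)
    try:
        for dotted_key, value in temp_config.items():
            set_nested(config, dotted_key, value)
        yield config
    finally:
        # Restore in place so existing references to `config` stay valid.
        config.clear()
        config.update(backup)


with set_temporary_config({"debug": True, "tasks.max_retries": 3}):
    print(config["debug"], config["tasks"]["max_retries"])  # True 3

print(config)  # {'debug': False, 'tasks': {'max_retries': 0}}
```

Taking a dict means several related settings can be overridden in one `with` block instead of nesting one context manager per key.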
Version 0.3.3
Major Features
- Refactor `FlowRunner` and `TaskRunner` into a modular `Runner` pipelines - #260, #267
- Add configurable `state_handlers` for `FlowRunner`s, `Flow`s, `TaskRunner`s, and `Task`s - #264, #267
- Add gmail and slack notification state handlers w/ tutorial - #274, #294
Minor Features
- Add a new method `flow.get_tasks()` for easily filtering flow tasks by attribute - #242
- Add new `JinjaTemplateTask` for easily rendering jinja templates - #200
- Add new `PAUSE` signal for halting task execution - #246
- Add new `Paused` state corresponding to `PAUSE` signal, and new `pause_task` utility - #251
- Add ability to timeout task execution for all executors except `DaskExecutor(processes=True)` - #240
- Add explicit unit test to check Black formatting (Python 3.6+) - #261
- Add ability to set local secrets in user config file - #231, #274
- Add `is_skipped()` and `is_scheduled()` methods for `State` objects - #266, #278
- Adds `now()` as a default `start_time` for `Scheduled` states - #278
- `Signal` classes now pass arguments to underlying `State` objects - #279
- Run counts are tracked via `Retrying` states - #281
Fixes
- Flow consistently raises if passed a parameter that doesn't exist - #149
Breaking Changes
Version 0.3.2
Major Features
- Local parallelism with `DaskExecutor` - #151, #186
- Resource throttling based on `tags` - #158, #186
- `Task.map` for mapping tasks - #186
- Added `AirFlow` utility for importing Airflow DAGs as Prefect Flows - #232
Minor Features
- Use Netlify to deploy docs - #156
- Add changelog - #153
- Add `ShellTask` - #150
- Base `Task` class can now be run as a dummy task - #191
- New `return_failed` keyword to `flow.run()` for returning failed tasks - #205
- Some minor changes to `flow.visualize()` for visualizing mapped tasks and coloring nodes by state - #202
- Added new `flow.replace()` method for swapping out tasks within flows - #230
- Add `debug` kwarg to `DaskExecutor` for optionally silencing dask logs - #209
- Update `BokehRunner` for visualizing mapped tasks - #220
- Env var configuration settings are typed - #204
- Implement `map` functionality for the `LocalExecutor` - #233
Fixes
- Fix issue with Versioneer not picking up git tags - #146
- `DotDict`s can have non-string keys - #193
- Fix unexpected behavior in assigning tags using contextmanagers - #190
- Fix bug in initialization of Flows with only `edges` - #225
- Remove "bottleneck" when creating pipelines of mapped tasks - #224
Breaking Changes
Version 0.3.1
Version 0.3.0
Major Features
- BokehRunner - #104, #128
- Control flow: `ifelse`, `switch`, and `merge` - #92
- Set state from `reference_tasks` - #95, #137
- Add flow `Registry` - #90
- Output caching with various `cache_validators` - #84, #107
- Dask executor - #82, #86
- Automatic input caching for retries, manual-only triggers - #78
- Functional API for `Flow` definition
- `State` classes
- `Signals` to transmit `State`
Minor Features
- Add custom syntax highlighting to docs - #141
- Add `bind()` method for tasks to call without copying - #132
- Cache expensive flow graph methods - #125
- Docker environments - #71
- Automatic versioning via Versioneer - #70
- `TriggerFail` state - #67
- State classes - #59
Fixes
- None
Breaking Changes
- None