This example demonstrates a Lakeflow Job that uses conditional task execution based on data quality checks.
The Lakeflow Job consists of the following tasks:
- Checks data quality and calculates bad records
- Evaluates if bad records exceed a threshold (100 records)
- Routes to different processing paths based on the condition:
  - If bad records > 100: runs the `fix_path` task
  - If bad records ≤ 100: runs the `skip_path` task
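The routing rule above can be sketched in plain Python (the function name is illustrative and not part of the project's code; the threshold value comes from the job description):

```python
THRESHOLD = 100  # bad-record threshold used by the condition task

def choose_path(bad_records: int) -> str:
    """Mirror the job's condition: route to fix_path or skip_path."""
    if bad_records > THRESHOLD:
        return "fix_path"
    return "skip_path"

print(choose_path(250))  # fix_path: well above the threshold
print(choose_path(100))  # skip_path: exactly at the threshold is not "greater than"
```

Note the strict inequality: a run with exactly 100 bad records takes the skip path, matching the "> 100" condition in the job.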
- `src/`: Notebook source code for this project.
  - `src/check_quality.py`: Checks data quality and outputs bad record count
  - `src/fix_path.py`: Handles cases with high bad record counts
  - `src/skip_path.py`: The skip path
- `resources/`: Resource configurations (jobs, pipelines, etc.)
  - `resources/conditional_execution.py`: Job definition with conditional tasks
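The actual job definition lives in `resources/conditional_execution.py`. As a rough illustration of the same shape, a condition task expressed in a bundle's YAML form might look like this (task keys, notebook paths, and the `bad_records` task-value name are assumptions for the sketch, not taken from the project):

```yaml
resources:
  jobs:
    conditional_execution_example:
      name: conditional_execution_example
      tasks:
        - task_key: check_quality
          notebook_task:
            notebook_path: ../src/check_quality.py
        - task_key: bad_records_check
          depends_on:
            - task_key: check_quality
          condition_task:
            op: GREATER_THAN
            left: "{{tasks.check_quality.values.bad_records}}"
            right: "100"
        - task_key: fix_path
          depends_on:
            - task_key: bad_records_check
              outcome: "true"
          notebook_task:
            notebook_path: ../src/fix_path.py
        - task_key: skip_path
          depends_on:
            - task_key: bad_records_check
              outcome: "false"
          notebook_task:
            notebook_path: ../src/skip_path.py
```

In this shape, the `check_quality` notebook would publish its count with `dbutils.jobs.taskValues.set(key="bad_records", value=...)`, the condition task compares it against the threshold, and each downstream task declares which `outcome` of the condition it depends on.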
For more information about conditional task execution, see the Databricks documentation.
Choose how you want to work on this project:
(a) Directly in your Databricks workspace, see https://docs.databricks.com/dev-tools/bundles/workspace.
(b) Locally with an IDE like Cursor or VS Code, see https://docs.databricks.com/vscode-ext.
(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html.
If you're developing with an IDE, dependencies for this project should be installed using uv:
- Make sure you have the uv package manager installed. It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
- Run `uv sync --dev` to install the project's dependencies.
The Databricks workspace and IDE extensions provide a graphical interface for working with this project. It's also possible to interact with it directly using the CLI:
1. Authenticate to your Databricks workspace, if you have not done so already:

   ```
   $ databricks configure
   ```

2. To deploy a development copy of this project, type:

   ```
   $ databricks bundle deploy --target dev
   ```

   (Note that "dev" is the default target, so the `--target` parameter is optional here.)

   This deploys everything that's defined for this project. For example, this project will deploy a job called `[dev yourname] conditional_execution_example` to your workspace. You can find that resource by opening your workspace and clicking on Jobs & Pipelines.

3. To run the job, use the "run" command:

   ```
   $ databricks bundle run conditional_execution_example
   ```