A plugin to execute Apache Beam pipelines via Kestra.
This plugin allows you to execute Apache Beam YAML pipelines directly from Kestra. It supports both Java and Python SDKs and can dispatch jobs to various runners including Direct, Flink, Spark, and Google Cloud Dataflow.
Key features:
- Multi-Runner Support: Switch between local execution (Direct) and distributed processing (Flink, Spark, Dataflow) by changing a simple property.
- Dual SDK: Run pipelines using the Java SDK (embedded) or the Python SDK (via Docker/TaskRunner).
- Metrics Integration: Automatically captures Beam Counters, Distributions, and Gauges and exposes them as Kestra metrics.
- Declarative Pipelines: Define your Beam pipeline using the Beam YAML syntax inline or from a file.
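To make the "Declarative Pipelines" feature concrete, here is a minimal sketch of a Beam YAML definition. It uses only standard Beam YAML constructs (`type: chain`, `ReadFromText`, `LogForTesting`). The file name `data.txt` is a placeholder, and whether the plugin passes every Beam YAML construct through unchanged is an assumption.

```yaml
# Minimal Beam YAML sketch: a linear pipeline that reads a text file
# and logs each element. "data.txt" is a placeholder path.
pipeline:
  type: chain            # transforms run in order, each feeding the next
  transforms:
    - type: ReadFromText
      config:
        path: "data.txt"
    - type: LogForTesting  # built-in Beam YAML transform that logs elements
```

A definition like this is what would go in the task's inline pipeline definition or in a separate file.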
Prerequisites:
- Java 21
- Docker
- Gradle
To test this plugin with distributed runners (Flink and Spark) locally, a helper script is provided to spin up the necessary infrastructure.
Starting the environment: Run the following script to start Flink (JobManager/TaskManager) and Spark (Master/Worker) using Docker Compose. The script includes health checks to ensure services are fully ready before returning.
./local-setup-unit.sh
Accessing Services: Once the script completes, you can access the dashboards:
- Flink Dashboard: http://localhost:56478
- Spark Master UI: http://localhost:56479
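For readers unfamiliar with this kind of setup, the following is a rough sketch of Docker Compose services for the Flink half of such an environment. It is not the file shipped with this plugin: the service names, image tag, and task slot count are assumptions, and only the host port 56478 is taken from the dashboard URL above. The Spark master and worker would be declared along the same lines.

```yaml
# Hypothetical docker-compose.yml sketch (Flink only), not the plugin's actual file.
services:
  jobmanager:
    image: flink:latest          # assumed image tag
    command: jobmanager
    ports:
      - "56478:8081"             # host port from the README, Flink web UI default 8081
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
  taskmanager:
    image: flink:latest
    command: taskmanager
    depends_on:
      - jobmanager
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 2
```

With a layout like this, the JobManager serves the dashboard and the TaskManager registers with it over the Compose network using the service name `jobmanager`.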
./gradlew check --parallel
VSCode:
Follow the README.md within the .devcontainer folder for a quick and easy way to get up and running with developing plugins if you are using VSCode.
Other IDEs:
./gradlew shadowJar && docker build -t kestra-custom . && docker run --rm -p 8080:8080 kestra-custom server local
Note
You need to relaunch this whole command every time you make a change to your plugin.
Then go to http://localhost:8080; your plugin will be available to use.
- Full documentation can be found under: kestra.io/docs
- Documentation for developing a plugin is included in the Plugin Developer Guide
RunPipeline executes Beam YAML pipelines inline or from a file, forwards Beam counters, distributions, and gauges into Kestra metrics, and can target Direct, Flink, Spark, or Dataflow runners with the Java or Python SDK.
id: beam-direct
namespace: dev.beam
tasks:
  - id: run-beam
    type: io.kestra.plugin.beam.RunPipeline
    sdk: JAVA
    beamRunner: DIRECT
    definition: |
      pipeline:
        transforms:
          - type: ReadFromText
            config:
              path: "data.txt"
          - type: Count

Apache 2.0 © Kestra Technologies
We release new versions every month. Give the main repository a star to stay up to date with the latest releases and get notified about future updates.

