Simulates random data generation with INSERTs, UPDATEs, and DELETEs across three tables (`customers`, `orders`, `order_items`). Designed to be used in combination with the CDC PostgreSQL Connector to demonstrate change data capture pipelines.
+
+---
+
+## Purpose
+
+This flow provides a continuous stream of realistic database mutations against a PostgreSQL instance. It automatically creates the required schema, tables, and publication on first run, so no manual SQL setup is needed.
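
The first-run setup described here could be sketched as a small helper that emits idempotent DDL. This is only an illustration under stated assumptions: the function name `build_init_statements`, the publication name `cdc_demo_pub`, and all column definitions are hypothetical and are not taken from the flow's actual init script.

```python
def build_init_statements(schema: str, publication: str = "cdc_demo_pub") -> list[str]:
    """Return SQL statements that are safe to re-run on every start.

    Identifiers are interpolated directly, so `schema` and `publication`
    must come from trusted configuration, never from user input.
    """
    # Hypothetical column layouts; the real flow's tables may differ.
    tables = {
        "customers": "id SERIAL PRIMARY KEY, name TEXT NOT NULL",
        "orders": (
            "id SERIAL PRIMARY KEY, "
            f"customer_id INTEGER REFERENCES {schema}.customers(id)"
        ),
        "order_items": (
            "id SERIAL PRIMARY KEY, "
            f"order_id INTEGER REFERENCES {schema}.orders(id), "
            "quantity INTEGER NOT NULL"
        ),
    }
    statements = [f"CREATE SCHEMA IF NOT EXISTS {schema}"]
    for name, columns in tables.items():
        statements.append(f"CREATE TABLE IF NOT EXISTS {schema}.{name} ({columns})")
    # CREATE PUBLICATION has no IF NOT EXISTS clause, so guard it with a
    # lookup in the pg_publication catalog.
    table_list = ", ".join(f"{schema}.{name}" for name in tables)
    statements.append(
        "DO $$ BEGIN "
        f"IF NOT EXISTS (SELECT 1 FROM pg_publication WHERE pubname = '{publication}') THEN "
        f"CREATE PUBLICATION {publication} FOR TABLE {table_list}; "
        "END IF; END $$"
    )
    return statements
```

Calling `build_init_statements("demo")` yields five statements (one schema, three tables, one guarded publication) that could be executed in order over any PostgreSQL connection.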
+
+## Components
+
+| Component | Type | Description |
+|-----------|------|-------------|
+| ExecuteScript (init) | Processor | Creates schema, tables, and publication on first run |
+| GenerateFlowFile | Processor | Triggers periodic data generation |
+- `org.apache.nifi:nifi-standard-nar:2.8.0`: included with standard NiFi installations
+- PostgreSQL JDBC driver (`postgresql-42.7.10.jar`): uploaded as a Parameter Context Asset
+
+## Parameters
+
+| Parameter | Description |
+|-----------|-------------|
+| `Database Connection URL` | JDBC URL to the Postgres instance |
+| `Database Name` | Postgres database name |
+| `Database User` | Postgres username |
+| `Database Password` | Postgres password (sensitive; use a parameter provider) |
+| `Schema Name` | Schema for the generated tables |
+| `Database Driver` | JDBC driver asset (bound to the uploaded JAR) |
+
+## Configuration
+
+Deploy via the Environment CD pipeline using an `environments/<env>/config.yaml` entry, or test in CI using the test YAML at `flows/data-generator/tests/test_postgres_cdc_demo.yaml`.
+
+For secrets (`Database Connection URL`, `Database Password`), use the auto-provisioned Snowflake Parameter Provider with `#{PARAM_NAME}` references.
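
To make the `#{PARAM_NAME}` reference syntax concrete, the sketch below substitutes references in a string from a plain dictionary. It only illustrates the idea: NiFi resolves parameter references itself, and the helper `resolve_references` and the parameter names `PG_HOST` and `PG_DB` are hypothetical.

```python
import re

# Matches #{NAME}, capturing everything up to the closing brace.
PARAM_REF = re.compile(r"#\{([^}]+)\}")


def resolve_references(value: str, parameters: dict[str, str]) -> str:
    """Replace each #{NAME} in `value` with parameters[NAME].

    Raises KeyError when a referenced name is missing, so a forgotten
    secret fails loudly instead of leaving the placeholder in place.
    Escaped references are not handled in this sketch.
    """
    return PARAM_REF.sub(lambda match: parameters[match.group(1)], value)


# Hypothetical values standing in for what a parameter provider would supply.
url = resolve_references(
    "jdbc:postgresql://#{PG_HOST}:5432/#{PG_DB}",
    {"PG_HOST": "db.example.internal", "PG_DB": "demo"},
)
```

Keeping the connection URL and password as references means the test YAML and flow definition never contain the secret values themselves.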
+
+## Expected Behaviour
+
+Once started, the flow continuously generates random INSERT, UPDATE, and DELETE operations against the three tables. The PostgreSQL publication enables downstream CDC connectors to capture these changes in real time.
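
The kind of mutation stream described here can be approximated with a short loop. This sketch is not the flow's implementation: it uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere, covers only a simplified `customers` table, and the function name and column layout are assumptions.

```python
import random
import sqlite3


def apply_random_mutation(conn: sqlite3.Connection, rng: random.Random) -> str:
    """Apply one random INSERT, UPDATE, or DELETE to `customers`; return its kind."""
    ids = [row[0] for row in conn.execute("SELECT id FROM customers")]
    # With an empty table there is nothing to update or delete.
    kind = rng.choice(["INSERT", "UPDATE", "DELETE"]) if ids else "INSERT"
    if kind == "INSERT":
        conn.execute(
            "INSERT INTO customers (name) VALUES (?)",
            (f"customer-{rng.randrange(10_000)}",),
        )
    elif kind == "UPDATE":
        conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?",
            (f"renamed-{rng.randrange(10_000)}", rng.choice(ids)),
        )
    else:
        conn.execute("DELETE FROM customers WHERE id = ?", (rng.choice(ids),))
    conn.commit()
    return kind


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
rng = random.Random(42)  # fixed seed so runs are repeatable
kinds = [apply_random_mutation(conn, rng) for _ in range(50)]
```

Against PostgreSQL, each committed statement would appear in the publication's change stream, which is what a downstream CDC connector consumes.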
+
+## Validation Tests
+
+The test file at [`flows/data-generator/tests/test_postgres_cdc_demo.py`](../flows/data-generator/tests/test_postgres_cdc_demo.py) validates that the flow deploys correctly and processes data against a live PostgreSQL instance provisioned via the ephemeral CI runtime.
File: wiki/Flows.md (6 additions, 0 deletions)
@@ -14,6 +14,12 @@ See [How to Use This Repo](How-to-Use-This-Repo) for instructions on importing o
 |------|-------------|
 | [Hello World](Flows--Hello-World) | Minimal example demonstrating the NiFi Hub flow structure. Generates a FlowFile and logs its attributes. |
 
+### Data Generator
+
+| Flow | Description |
+|------|-------------|
+| [Postgres CDC Demo](Flows--Postgres-CDC-Demo) | Simulates random data generation (INSERTs, UPDATEs, DELETEs) across multiple tables for use with the CDC PostgreSQL Connector. |
File: wiki/Introduction-and-Concepts--CD.md (38 additions, 13 deletions)
@@ -1,6 +1,6 @@
 # CD Pipeline
 
-NiFi Hub has two CD mechanisms: **Flow Deploy** for testing flows against a live Snowflake runtime during PR review, and **Environment CD** for managing Openflow infrastructure declaratively as code.
+NiFi Hub has two CD mechanisms: **Flow Deploy** for testing flows against ephemeral Snowflake runtimes during PR review, and **Environment CD** for managing Openflow infrastructure declaratively as code.
 
 ---
 
@@ -9,22 +9,47 @@ NiFi Hub has two CD mechanisms: **Flow Deploy** for testing flows against a live
 **Workflow:** `flow-deploy.yml`
 **Trigger:** A maintainer with admin or maintain permission comments `deploy this flow` on a PR
 
-This workflow lets maintainers test a flow against a real Snowflake Openflow runtime before merging. It is intended as a validation step during PR review, not for production deployment.
+This workflow lets maintainers test a flow against a real Snowflake Openflow runtime before merging. It provisions an **ephemeral runtime**, deploys the flow, runs tests, and tears everything down automatically.
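
The trigger rule can be sketched as a small predicate over the comment text and the commenter's permission level. This is a hedged illustration, not the logic in `flow-deploy.yml`: the names `DeployRequest` and `parse_deploy_comment` are hypothetical, and case-insensitive matching is an assumption. The "do not clean" phrase is the opt-out from clean-up that the workflow steps mention.

```python
from dataclasses import dataclass
from typing import Optional

# Permission levels allowed to trigger a deploy, per the trigger description.
ALLOWED_PERMISSIONS = {"admin", "maintain"}


@dataclass(frozen=True)
class DeployRequest:
    clean_up: bool  # False when the comment asks to keep the deployment


def parse_deploy_comment(body: str, permission: str) -> Optional[DeployRequest]:
    """Return a DeployRequest if the comment should trigger a deploy, else None."""
    text = body.lower()  # assumption: the phrases are matched case-insensitively
    if "deploy this flow" not in text:
        return None
    if permission not in ALLOWED_PERMISSIONS:
        return None
    return DeployRequest(clean_up="do not clean" not in text)
```

For example, a maintainer commenting `deploy this flow, do not clean` would produce `DeployRequest(clean_up=False)`, while the same comment from a user with only write permission would be ignored.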
 
 ### What Happens
 
 1. The workflow identifies flow JSON files changed in the PR
-2. **Builds all extension bundles** and uploads the resulting NARs to the target runtime
-3. **Deploys each changed flow** as a process group on the runtime, using the runtime's REST API
-4. **Runs the flow's validation tests** (`flows/<bucket>/tests/test_<flow-name>.py`) against the deployed process group
-5. **Cleans up** the deployed process group and uploaded NARs (unless the comment includes "do not clean")
-6. **Posts a comment** on the PR with deployment details, processor/controller service summary, and per-test results
-
-### Configuration
-
-The target runtime is configured via a GitHub Environment named `snowflake-runtime-ci`, which provides:
-- `SNOWFLAKE_RUNTIME_URL`: the Openflow runtime endpoint
-- `SNOWFLAKE_RUNTIME_PAT`: a PAT with permission to deploy to the runtime
+2. For each changed flow that has a test YAML (`flows/<bucket>/tests/test_<flow-name>.yaml`):
+   - **Provisions an ephemeral runtime** named `CI_<FLOW>_<PR>_<RUN_ID>` with the configuration from the test YAML (node type, network rules, registries, etc.)
+   - **Uploads custom NARs** from GitHub Releases if specified in the test YAML's `nars` field
+   - **Deploys the flow** from the PR branch as a process group on the runtime
+   - **Applies parameters and assets** from the test YAML's `flow` section
+   - **Fetches the auto-provisioned Snowflake Parameter Provider** to inject secrets from Snowflake
+   - **Waits** for the flow to process data (60 seconds)
+   - **Runs validation tests** (`flows/<bucket>/tests/test_<flow-name>.py`) against the deployed process group
+   - **Tears down** the ephemeral runtime (unless the comment includes "do not clean")
+3. **Posts a comment** on the PR with per-test results and failure details
+4. **Creates a Check Run** that blocks the PR merge if tests fail
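
The naming conventions in the steps above can be sketched as two helpers. Several details are assumptions rather than facts from this repository: that a flow lives at `flows/<bucket>/<flow-name>.json`, that hyphens in the flow name become underscores in test file names, and that the runtime name is the upper-cased flow name with non-alphanumeric characters replaced by underscores. The example path `flows/data-generator/postgres-cdc-demo.json` is likewise hypothetical.

```python
import re
from pathlib import PurePosixPath


def companion_test_files(flow_json: str) -> tuple[str, str]:
    """Return the (test YAML, test script) paths expected for a flow JSON path."""
    path = PurePosixPath(flow_json)
    bucket = path.parent.name
    flow_name = path.stem.replace("-", "_")  # assumed hyphen-to-underscore mapping
    base = f"flows/{bucket}/tests/test_{flow_name}"
    return f"{base}.yaml", f"{base}.py"


def runtime_name(flow_json: str, pr_number: int, run_id: int) -> str:
    """Build a name of the form CI_<FLOW>_<PR>_<RUN_ID> for an ephemeral runtime."""
    stem = PurePosixPath(flow_json).stem
    flow = re.sub(r"[^A-Za-z0-9]+", "_", stem).upper()  # assumed sanitisation rule
    return f"CI_{flow}_{pr_number}_{run_id}"
```

Including both the PR number and the run ID in the runtime name keeps concurrent or repeated runs for the same PR from colliding.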
+
+### Test YAML Configuration
+
+Each flow that needs CI testing has a `flows/<bucket>/tests/test_<flow-name>.yaml` file following the [CI runtime schema](../scripts/ci/ci-runtime-schema.json). Key fields: