Commit 834ca67

Snowflake Loader 0.3.0 (#1088)
1 parent 43493d4 commit 834ca67

5 files changed, +29 -9 lines changed

docs/pipeline-components-and-applications/loaders-storage-targets/snowflake-streaming-loader/configuration-reference/_common_config.md

Lines changed: 14 additions & 2 deletions
@@ -11,8 +11,12 @@ import Link from '@docusaurus/Link';
 <td>Optional. Default value <code>1 second</code>. Events are emitted to Snowflake after a maximum of this duration, even if the <code>maxBytes</code> size has not been reached</td>
 </tr>
 <tr>
-<td><code>batching.uploadConcurrency</code></td>
-<td>Optional. Default value 3. How many batches can we send simultaneously over the network to Snowflake</td>
+<td><code>batching.uploadParallelismFactor</code></td>
+<td>Optional. Default value 2.5. Controls how many batches can we send simultaneously over the network to Snowflake. E.g. If there are 4 available processors, and <code>uploadParallelismFactor</code> is 2.5, then the loader sends up to 10 batches in parallel. Adjusting this value can cause the app to use more or less of the available CPU.</td>
+</tr>
+<tr>
+<td><code>cpuParallelismFactor</code></td>
+<td>Optional. Default value 0.75. Controls how the loaders splits the workload into concurrent batches which can be run in parallel. E.g. If there are 4 available processors, and <code>cpuParallelismFactor</code> is 0.75, then the loader processes 3 batches concurrently. Adjusting this value can cause the app to use more or less of the available CPU.</td>
 </tr>
 <tr>
 <td><code>retries.setupErrors.delay</code></td>
@@ -67,6 +71,10 @@ import Link from '@docusaurus/Link';
 <td><code>monitoring.webhook.tags.*</code></td>
 <td>Optional. A map of key/value strings to be included in the payload content sent to the webhook.</td>
 </tr>
+<tr>
+<td><code>monitoring.webhook.heartbeat.*</code></td>
+<td>Optional. Default value <code>5.minutes</code>. How often to send a heartbeat event to the webhook when healthy.</td>
+</tr>
 <tr>
 <td><code>monitoring.sentry.dsn</code></td>
 <td>Optional. Set to a Sentry URI to report unexpected runtime exceptions.</td>
@@ -95,3 +103,7 @@ import Link from '@docusaurus/Link';
 <td><code>output.good.jdbcQueryTimeout</code></td>
 <td>Optional. Sets the query timeout on the JDBC driver which connects to Snowflake</td>
 </tr>
+<tr>
+<td><code>http.client.maxConnectionsPerServer</code></td>
+<td> Optional. Default value 4. Configures the internal HTTP client used for alerts and telemetry. The maximum number of open HTTP requests to any single server at any one time.</td>
+</tr>
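
The two parallelism factors introduced in this file both scale with the number of available processors. The arithmetic in the descriptions can be sketched as below. This is an illustration only: `parallelismFromFactor` is a hypothetical helper, and the rounding rule (round up, never below 1) is an assumption, since the descriptions only give whole-number examples.

```javascript
// Hypothetical helper, not the loader's own code.
// Assumption: fractional results are rounded up and the result is at least 1.
function parallelismFromFactor(availableProcessors, factor) {
  return Math.max(1, Math.ceil(availableProcessors * factor));
}

// The two worked examples from the descriptions, for 4 available processors.
const uploadParallelism = parallelismFromFactor(4, 2.5); // batching.uploadParallelismFactor
const cpuParallelism = parallelismFromFactor(4, 0.75);   // cpuParallelismFactor

console.log({ uploadParallelism, cpuParallelism });
```

With 4 processors this gives 10 parallel uploads and 3 concurrent batches, matching the figures quoted in the table rows.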

docs/pipeline-components-and-applications/loaders-storage-targets/snowflake-streaming-loader/configuration-reference/_kafka_config.md

Lines changed: 1 addition & 1 deletion
@@ -23,6 +23,6 @@
 <td>Optional. A map of key/value pairs for <a href="https://docs.confluent.io/platform/current/installation/configuration/producer-configs.html" target="_blank">any standard Kafka producer configuration option</a>.</td>
 </tr>
 <tr>
-<td><code>output.bad.maxRecordSize.*</code></td>
+<td><code>output.bad.maxRecordSize</code></td>
 <td>Optional. Default value 1000000. Any single failed event sent to Kafka should not exceed this size in bytes</td>
 </tr>

docs/pipeline-components-and-applications/loaders-storage-targets/snowflake-streaming-loader/configuration-reference/_kinesis_config.md

Lines changed: 11 additions & 3 deletions
@@ -23,8 +23,16 @@
 <td>Optional. Default value 1000. How many events the Kinesis client may fetch in a single poll. Only used when `input.retrievalMode` is Polling.</td>
 </tr>
 <tr>
-<td><code>input.bufferSize</code></td>
-<td>Optional. Default value 1. The number of batches of events which are pre-fetched from kinesis. The default value is known to work well.</td>
+<td><code>input.workerIdentifier</code></td>
+<td>Optional. Defaults to the <code>HOSTNAME</code> environment variable. The name of this KCL worker used in the dynamodb lease table.</td>
+</tr>
+<tr>
+<td><code>input.leaseDuration</code></td>
+<td>Optional. Default value <code>10 seconds</code>. The duration of shard leases. KCL workers must periodically refresh leases in the dynamodb table before this duration expires.</td>
+</tr>
+<tr>
+<td><code>input.maxLeasesToStealAtOneTimeFactor</code></td>
+<td>Optional. Default value <code>2.0</code>. Controls how to pick the max number of shard-leases to steal at one time. E.g. If there are 4 available processors, and <code>maxLeasesToStealAtOneTimeFactor</code> is 2.0, then allow the KCL to steal up to 8 leases. Allows bigger instances to more quickly acquire the shard-leases they need to combat latency.</td>
 </tr>
 <tr>
 <td><code>output.bad.streamName</code></td>
@@ -47,6 +55,6 @@
 <td>Optional. Default value 5242880. The maximum number of bytes we are allowed to send to Kinesis in 1 PutRecords request.</td>
 </tr>
 <tr>
-<td><code>output.bad.maxRecordSize.*</code></td>
+<td><code>output.bad.maxRecordSize</code></td>
 <td>Optional. Default value 1000000. Any single event failed event sent to Kinesis should not exceed this size in bytes</td>
 </tr>
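
The description of `input.maxLeasesToStealAtOneTimeFactor` says larger instances may steal more shard leases at once. A rough sketch of how that limit might grow with instance size follows. `maxLeasesToSteal` is a hypothetical helper, and rounding up with a floor of 1 is an assumption not stated in the description.

```javascript
// Hypothetical helper, not the loader's own code.
// Assumption: the limit is ceil(processors * factor), and at least 1.
function maxLeasesToSteal(availableProcessors, factor = 2.0) {
  return Math.max(1, Math.ceil(availableProcessors * factor));
}

// Compare a few instance sizes using the default factor of 2.0.
const leasesByInstanceSize = [1, 2, 4, 8].map((cpus) => ({
  cpus,
  maxLeases: maxLeasesToSteal(cpus),
}));

console.table(leasesByInstanceSize);
```

The 4-processor row reproduces the documented example of 8 leases.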

docs/pipeline-components-and-applications/loaders-storage-targets/snowflake-streaming-loader/configuration-reference/_snowflake_config.md

Lines changed: 2 additions & 2 deletions
@@ -28,9 +28,9 @@
 </tr>
 <tr>
 <td><code>output.good.table</code></td>
-<td>Optional. Default value `events`. Name to use for the events table</td>
+<td>Optional. Default value <code>events</code>. Name to use for the events table</td>
 </tr>
 <tr>
 <td><code>output.good.channel</code></td>
-<td>Optional. Default value `snowplow`. Name to use for the Snowflake channel. If you run multiple loaders in parallel, then each channel must be given a unique name.</td>
+<td>Optional. Default value <code>snowplow</code>. Prefix to use for the snowflake channels. The full name will be suffixed with a number, e.g. <code>snowplow-1</code>. If you run multiple loaders in parallel, then each loader must be configured with a unique channel prefix.</td>
 </tr>
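
Since `output.good.channel` is now a prefix rather than a full channel name, a deployment running several loaders needs distinct prefixes. The sketch below shows the naming pattern from the description and a simple pre-deployment check for clashes. Both functions are hypothetical helpers, and the `loader-b` prefix is an invented example value.

```javascript
// Hypothetical helpers, not part of the loader.
// Naming pattern from the description: prefix, hyphen, number (e.g. snowplow-1).
function channelName(prefix, index) {
  return `${prefix}-${index}`;
}

// Returns every prefix that is configured on more than one loader.
function duplicatePrefixes(prefixes) {
  const seen = new Set();
  const duplicates = new Set();
  for (const prefix of prefixes) {
    if (seen.has(prefix)) duplicates.add(prefix);
    seen.add(prefix);
  }
  return [...duplicates];
}

// Two loaders left on the default prefix would clash.
const clashes = duplicatePrefixes(["snowplow", "snowplow", "loader-b"]);
console.log(channelName("snowplow", 1), clashes);
```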

src/componentVersions.js

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ export const versions = {
 s3Loader: '2.2.9',
 s3Loader22x: '2.2.9',
 lakeLoader: '0.5.0',
-snowflakeStreamingLoader: '0.2.4',
+snowflakeStreamingLoader: '0.3.0',
 
 // Data Modelling
 // dbt
