docs/en/data_source/catalog/hive_catalog.md (5 additions, 3 deletions)
@@ -38,7 +38,9 @@ To ensure successful SQL workloads on your Hive cluster, your StarRocks cluster
 - Parquet and ORC files support the following compression formats: NO_COMPRESSION, SNAPPY, LZ4, ZSTD, and GZIP.
 - Textfile files support the NO_COMPRESSION compression format.
 
-You can use the session variable [`connector_sink_compression_codec`](../../sql-reference/System_variable.md#connector_sink_compression_codec) to specify the compression algorithm used for sinking data to Hive tables.
+You can use the table property [`compression_codec`](../../data_source/catalog/hive_catalog.md#properties) or the system variable [`connector_sink_compression_codec`](../../sql-reference/System_variable.md#connector_sink_compression_codec) to specify the compression algorithm used for sinking data to Hive tables.
+
+When writing to a Hive table, if the table's properties include a compression codec, StarRocks preferentially uses that algorithm to compress the written data. Otherwise, it uses the compression algorithm set in the system variable `connector_sink_compression_codec`.
 
 ## Integration preparations
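The priority rule introduced above can be exercised from a SQL session. The following is a minimal sketch, assuming a Hive catalog named `hive_catalog` with a database `sales` and tables `orders` and `orders_copy` (all hypothetical names); the session variable applies only when the target table defines no codec of its own:

```SQL
-- Session-level default codec; hypothetical names throughout.
SET connector_sink_compression_codec = 'zstd';

-- Confirm the current value.
SHOW VARIABLES LIKE 'connector_sink_compression_codec';

-- If hive_catalog.sales.orders_copy carries its own compression codec in its
-- table properties, that codec wins; otherwise the files are written with zstd.
INSERT INTO hive_catalog.sales.orders_copy
SELECT * FROM hive_catalog.sales.orders;
```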
@@ -1021,7 +1023,7 @@ The following table describes a few key properties.
 | location | The file path in which you want to create the managed table. When you use HMS as metastore, you do not need to specify the `location` parameter, because StarRocks will create the table in the default file path of the current Hive catalog. When you use AWS Glue as metadata service:<ul><li>If you have specified the `location` parameter for the database in which you want to create the table, you do not need to specify the `location` parameter for the table. As such, the table defaults to the file path of the database to which it belongs. </li><li>If you have not specified the `location` for the database in which you want to create the table, you must specify the `location` parameter for the table.</li></ul> |
 | file_format | The file format of the managed table. Supported file formats are Parquet, ORC, and Textfile. ORC and Textfile formats are supported from v3.3 onwards. Valid values: `parquet`, `orc`, and `textfile`. Default value: `parquet`. |
-| compression_codec | The compression algorithm used for the managed table. This property is deprecated in v3.2.3, since which version the compression algorithm used for sinking data to Hive tables is uniformly controlled by the session variable [connector_sink_compression_codec](../../sql-reference/System_variable.md#connector_sink_compression_codec). |
+| compression_codec | The compression algorithm used for the managed table. |
 
 ### Examples
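To illustrate the `file_format` and `compression_codec` properties together, here is a sketch of creating a managed Hive table; the catalog, database, table, and column names are hypothetical, and the exact CREATE TABLE options available depend on your StarRocks version:

```SQL
CREATE TABLE hive_catalog.sales.orders_managed (
    order_id   BIGINT,
    order_date DATE,
    amount     DECIMAL(10, 2)
)
PROPERTIES (
    "file_format" = "parquet",
    -- Table-level codec; takes precedence over connector_sink_compression_codec.
    "compression_codec" = "zstd"
);
```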
@@ -1068,7 +1070,7 @@ Note that sinking data to external tables is disabled by default. To sink data t
 :::note
 
 - You can grant and revoke privileges by using [GRANT](../../sql-reference/sql-statements/account-management/GRANT.md) and [REVOKE](../../sql-reference/sql-statements/account-management/REVOKE.md).
-- You can use the session variable [connector_sink_compression_codec](../../sql-reference/System_variable.md#connector_sink_compression_codec) to specify the compression algorithm used for sinking data to Hive tables.
+- You can use the table property [`compression_codec`](../../data_source/catalog/hive_catalog.md#properties) or the system variable [`connector_sink_compression_codec`](../../sql-reference/System_variable.md#connector_sink_compression_codec) to specify the compression algorithm used for sinking data to Hive tables. StarRocks prioritizes the compression codec specified in the table property.
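To make the note concrete, the following sketch sinks data into an existing external Hive table; the catalog, database, and table names are hypothetical, and it assumes writing to external tables has already been enabled and the required INSERT privilege granted:

```SQL
-- Fallback codec for tables that define none of their own (hypothetical names).
SET connector_sink_compression_codec = 'gzip';

INSERT INTO hive_catalog.sales.orders_ext
SELECT order_id, order_date, amount
FROM default_catalog.local_db.orders_staging;
```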
docs/en/data_source/catalog/iceberg/iceberg_catalog.md (1 addition, 1 deletion)
@@ -1432,7 +1432,7 @@ Description: The file format of the Iceberg table. Only the Parquet format is su
 ###### compression_codec
 
-Description: The compression algorithm used for the Iceberg table. The supported compression algorithms are SNAPPY, GZIP, ZSTD, and LZ4. Default value: `gzip`. This property is deprecated in v3.2.3, since which version the compression algorithm used for sinking data to Iceberg tables is uniformly controlled by the session variable [connector_sink_compression_codec](../../../sql-reference/System_variable.md#connector_sink_compression_codec).
+Description: The compression algorithm used for the Iceberg table. The supported compression algorithms are SNAPPY, GZIP, ZSTD, and LZ4. Default value: `zstd`.
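A short sketch of how the `compression_codec` property might be set when creating an Iceberg table through the catalog; the catalog, database, table, and column names are hypothetical:

```SQL
CREATE TABLE iceberg_catalog.analytics.events (
    event_id   BIGINT,
    event_time DATETIME,
    payload    STRING
)
PROPERTIES (
    -- Omit this property to fall back to the default (zstd) or, where applicable,
    -- to the connector_sink_compression_codec session variable.
    "compression_codec" = "lz4"
);
```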
docs/en/sql-reference/System_variable.md (4 additions, 1 deletion)
@@ -349,7 +349,10 @@ Used for MySQL client compatibility. No practical usage.
 ### connector_sink_compression_codec
 
-* **Description**: Specifies the compression algorithm used for writing data into Hive tables or Iceberg tables, or exporting data with Files().
+* **Description**: Specifies the compression algorithm used for writing data into Hive tables or Iceberg tables, or exporting data with Files(). This variable only takes effect in the following situations:
+  * The `compression_codec` property does not exist in the Hive table.
+  * The `compression_codec` and `write.parquet.compression-codec` properties do not exist in the Iceberg table.
+  * The `compression` property is not set when using `INSERT INTO FILES()`.
 
 * **Valid values**: `uncompressed`, `snappy`, `lz4`, `zstd`, and `gzip`.
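The third situation in the list above, unloading with FILES() while leaving its `compression` property unset, can be sketched as follows; the S3 path, region, and credentials are placeholders, and the sketch assumes the standard FILES() unload properties:

```SQL
SET connector_sink_compression_codec = 'zstd';

-- No "compression" property is given, so the session variable decides the codec.
INSERT INTO FILES(
    "path" = "s3://my-bucket/export/orders/",
    "format" = "parquet",
    "aws.s3.region" = "us-west-2",
    "aws.s3.access_key" = "AAA",
    "aws.s3.secret_key" = "BBB"
)
SELECT * FROM sales.orders;
```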