Description
The Delta Kernel API accepts Hadoop configuration via the Engine interface. However, DeltaInputSource hides this and instantiates the engine with an empty Configuration().
Proposal: add a hadoopProperties JSON field to the Delta Lake input source, so that its entries are applied to the Hadoop Configuration passed to the engine.
Example:
"inputSource": {
  "type": "delta",
  "tablePath": "s3a://bucket/path/to/table",
  "hadoopProperties": {
    "fs.s3a.access.key": "${AWS_ACCESS_KEY_ID}",
    "fs.s3a.secret.key": "${AWS_SECRET_ACCESS_KEY}",
    "fs.s3a.session.token": "${AWS_SESSION_TOKEN}",
    "fs.s3a.endpoint": "s3.amazonaws.com"
  }
}
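A rough sketch of how the input source might apply such a map, in case it helps the discussion. The class name HadoopPropertiesSketch and the helper applyHadoopProperties are hypothetical and not part of the existing code; the sketch takes a generic setter so it runs without Hadoop on the classpath. In the real input source the setter would presumably be the set method of a Hadoop Configuration, which is then handed to the Delta Kernel engine (assumed wiring shown in the comments only).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch: none of these names exist in the current DeltaInputSource.
public class HadoopPropertiesSketch {

    // Copies every entry of the spec-level hadoopProperties map into a configuration
    // object through the given setter. A null map is treated as "no overrides",
    // which keeps today's behavior of an empty configuration.
    static void applyHadoopProperties(Map<String, String> hadoopProperties,
                                      BiConsumer<String, String> setter) {
        if (hadoopProperties == null) {
            return;
        }
        for (Map.Entry<String, String> entry : hadoopProperties.entrySet()) {
            setter.accept(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        // Values as they would arrive from the parsed ingestion spec.
        Map<String, String> hadoopProperties = new LinkedHashMap<>();
        hadoopProperties.put("fs.s3a.endpoint", "s3.amazonaws.com");

        // A plain map stands in for org.apache.hadoop.conf.Configuration here.
        // Assumed wiring in the real input source:
        //   Configuration conf = new Configuration();
        //   applyHadoopProperties(hadoopProperties, conf::set);
        //   Engine engine = DefaultEngine.create(conf);
        Map<String, String> conf = new LinkedHashMap<>();
        applyHadoopProperties(hadoopProperties, conf::put);

        System.out.println(conf);
    }
}
```

Treating a missing field as an empty map would keep existing ingestion specs working unchanged.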
Motivation
Allow providing Hadoop properties such as S3A credentials, endpoints, and similar options at the ingestion spec level rather than globally.
This enables ingesting from different environments and adding configuration specific to the ingestion task.
Example use case: support reading Delta tables from S3 with temporary AWS STS credentials without relying on global configuration.