Enable Support for Custom Session+Proxy Configurations #644
Description
This PR introduces the ability for users to pass custom `requests.Session` objects to the `SharingClient` in the Delta Sharing Python library. This enhancement allows users to configure more complex session settings that cannot be achieved using environment variables alone, such as authenticated proxies, custom headers, SSL configurations, timeout settings, and other session-related configurations. This provides users with greater flexibility when working in complex network environments or when specific session configurations are required.
This PR also updates the Delta Sharing File System in Spark to support proxy configurations. This means users can now define proxy settings, including authenticated proxies, custom headers, SSL configurations, and timeout settings, through the Spark configuration.
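As a rough illustration of the kind of session the description has in mind, the sketch below builds a `requests.Session` with an authenticated proxy, a custom header, an optional CA bundle, and a default timeout. It is not code from this PR: `build_session`, `DefaultTimeoutAdapter`, the proxy URL, and the header name are hypothetical placeholders, and the timeout is applied through a small `HTTPAdapter` subclass because `requests.Session` has no timeout attribute of its own.

```python
import requests
from requests.adapters import HTTPAdapter


class DefaultTimeoutAdapter(HTTPAdapter):
    """Hypothetical helper: applies a default timeout when the caller sets none."""

    def __init__(self, *args, timeout=30.0, **kwargs):
        self.timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # Session.send passes timeout=None when the caller did not specify one.
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)


def build_session(proxy_url, ca_bundle=None, timeout=30.0):
    """Return a requests.Session routed through proxy_url (all values are placeholders)."""
    session = requests.Session()
    # Proxy credentials can be embedded in the URL: http://user:password@host:port
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    # Illustrative custom header sent with every request made through this session.
    session.headers.update({"X-Example-Header": "example-value"})
    if ca_bundle is not None:
        # Path to a CA bundle used to verify the server's TLS certificate.
        session.verify = ca_bundle
    adapter = DefaultTimeoutAdapter(timeout=timeout)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


session = build_session("http://user:password@proxy.example.com:8080")
# Per the PR description, the session would then be handed to the client, e.g.:
#   client = delta_sharing.SharingClient("<profile-file-path>", session=session)
```

Note that `requests` may give proxy environment variables such as `HTTPS_PROXY` precedence over `session.proxies` unless `session.trust_env` is set to `False`, so a session configured this way is best tested in an environment where those variables are unset.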
Key Changes
1. `SharingClient` Class Update (Python)
The `SharingClient` class now accepts an optional `session` parameter in its constructor. This allows users to pass a custom `requests.Session` object when creating a `SharingClient`. If no session is provided, a new `requests.Session` will be created as before:
2. `DataSharingRestClient` Class Update (Python)
The `DataSharingRestClient` class now accepts an optional `session` parameter. The custom session is passed from `SharingClient` to `DataSharingRestClient`, ensuring all HTTP requests utilize the custom session.
The `__auth_session` method uses the provided session or creates a new one if none is provided:
3. High-Level Function Updates (Python)
The `load_as_pandas` and `load_table_changes_as_pandas` functions now accept an optional `sharing_client` parameter. If a `sharing_client` is provided, these functions will use its `rest_client` for making HTTP requests, ensuring the custom session is used.
4. Proxy Configuration Support (Spark)
The `DeltaSharingFileSystem` class now supports proxy configurations through its configuration settings. Users can define proxy hosts, ports, and other related settings in the Spark configuration. To use this, configure the following properties:
5. `ConfUtils` Updates (Spark)
The `ConfUtils` utility object has been updated to handle the retrieval and validation of proxy-related configurations, including custom headers and SSL configurations.
6. HTTP Client Configuration (Spark)
The `DeltaSharingFileSystem.createHttpClient` method has been enhanced to configure the HTTP client with proxy settings, custom headers, SSL configurations, and timeout settings.
Example Usage
Python
This is a simplified example of how to use the updated `SharingClient` with a custom `requests.Session` to configure an authenticated proxy, custom headers, and other settings:
Spark