-
Notifications
You must be signed in to change notification settings - Fork 854
feat: Add API and Connection timeout similar to OpenAI python SDK #2458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: Add API and Connection timeout similar to OpenAI python SDK #2458
Conversation
|
Hey @ranadeepsingh 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:
To test your commit locally, please follow our guild on building from source. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #2458 +/- ##
==========================================
- Coverage 84.36% 84.34% -0.02%
==========================================
Files 335 335
Lines 17734 17777 +43
Branches 1602 1633 +31
==========================================
+ Hits 14961 14994 +33
- Misses 2773 2783 +10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds configurable timeout parameters to the SynapseML OpenAI connector to align with the OpenAI Python SDK defaults and improve reliability for long-running requests. The changes introduce three distinct timeout parameters: apiTimeout (per-request timeout for API responses), connectionTimeout (timeout for establishing HTTP connections), and timeout (global operation timeout for the entire DataFrame transformation).
Key changes:
- Updated default timeout values: API timeout increased from 360s to 600s (10 minutes), connection timeout separated from API timeout and set to 5.0s to match OpenAI SDK
- Added global operation timeout feature that stops processing new rows after a specified duration, returning HTTP 408 responses for remaining rows
- Fixed synthetic HTTP responses to include proper
HTTP/1.1protocol version instead ofNULL
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| HTTPTransformer.scala | Added three timeout parameters (apiTimeout, connectionTimeout, timeout), implemented global timeout tracking with broadcast variables, and added early timeout checks |
| HTTPSchema.scala | Fixed protocol version in stringToResponse, emptyResponse, and binaryToResponse to use HTTP/1.1 instead of null |
| SimpleHTTPTransformer.scala | Updated to pass all three timeout parameters to HTTPTransformer |
| Clients.scala | Added precomputedResponse field to RequestWithContext for handling timeout responses without making HTTP calls |
| HTTPClients.scala | Modified sendRequestWithContext to handle precomputed responses and separated connection timeout from API timeout |
| OpenAI.scala | Added global parameter keys for the three timeout types |
| OpenAIDefaults.scala | Added setters, getters, and resetters for timeout parameters with validation (must be > 0) |
| OpenAIDefaults.py | Added Python bindings for timeout methods with client-side validation |
| OpenAIPrompt.scala | Registered timeout parameters with GlobalParams and changed default from timeout to apiTimeout (600.0s) |
| CognitiveServiceBase.scala | Modified to pass timeout parameters to SimpleHTTPTransformer |
| OpenAIPromptSuite.scala | Added tests for long input handling, timeout configuration, and short timeout behavior |
| OpenAIDefaultsSuite.scala | Added tests for timeout getters, resetters, and validation |
| test_OpenAIDefaults.py | Added Python tests for timeout setters, getters, resetters, and validation |
| HTTPTransformerSuite.scala | Added tests for default timeout values, custom timeout values, and global timeout behavior |
| SimpleHTTPTransformerSuite.scala | Updated test to use setApiTimeout instead of setTimeout |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
core/src/test/scala/com/microsoft/azure/synapse/ml/io/split1/HTTPTransformerSuite.scala
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/io/http/HTTPTransformer.scala
Outdated
Show resolved
Hide resolved
…PTransformer.scala Co-authored-by: Copilot <[email protected]>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
What changes are proposed in this pull request?
Summary
Align SynapseML OpenAI connector timeout defaults with the OpenAI Python SDK and add user-configurable timeout parameters to prevent hanging operations and improve reliability for long-running requests.
Timeout Configuration Changes
New Parameters
apiTimeoutconnectionTimeouttimeoutKey Features
Global Operation Timeout: When
timeoutis set, the entire DataFrame transformation will stop making new API calls after the specified duration, returning HTTP 408 (Request Timeout) errors for remaining rows. Useful for batch processing with time constraints.Early Timeout Detection: Added early timeout check at partition start to avoid network errors (e.g.,
UnknownHostException) when using very short timeouts.Configurable via OpenAIDefaults or Per-Transformer:
HTTP/1.1 Protocol Version: Fixed synthetic responses (e.g., timeout errors) to include proper
HTTP/1.1protocol version instead ofNULL.Files Changed
Core HTTP Layer:
HTTPTransformer.scala- Added three timeout parameters, broadcast start time for distributed timeout tracking, early timeout checkHTTPSchema.scala- FixedstringToResponse,emptyResponse,binaryToResponseto useProtocolVersionData("HTTP", 1, 1)SimpleHTTPTransformer.scala- Pass timeout parameter to HTTPTransformerClients.scala- AddedprecomputedResponsefield toRequestWithContextHTTPClients.scala- Handle precomputed responses for timed-out requestsOpenAI Layer:
OpenAI.scala- AddedOpenAITimeoutKey,OpenAIApiTimeoutKey,OpenAIConnectionTimeoutKeyOpenAIDefaults.scala- Added setters/getters/resetters for all three timeouts with validationOpenAIDefaults.py- Added Python bindings for timeout methodsOpenAIPrompt.scala- Register timeout parameter with GlobalParamsCognitiveServiceBase.scala- Pass timeout parameters to SimpleHTTPTransformerTests:
OpenAIPromptSuite.scala- Added tests for long input handling, timeout configuration, short timeout behaviorOpenAIDefaultsSuite.scala- Added tests for timeout getters, resetters, and validationtest_OpenAIDefaults.py- Added Python tests for timeout setters/getters, resetters, and validationDocumentation:
OpenAI.ipynb- Added "Configuring Timeouts" section with examplesHow is this patch tested?
Test Coverage
Scala Tests (
OpenAIPromptSuite):Long Input Handling- Tests behavior with normal, null, and very long (1000x, 10000x repetition) inputsTimeout Configuration- Validates all three timeout parameters are properly set and retrievedShort Timeout Returns Timeout Error- Validates HTTP 408 responses for timed-out operationsScala Tests (
OpenAIDefaultsSuite):Test Getters- Validates all timeout getters return correct valuesTest Resetters- Validates all timeout resetters clear values properlyTest Parameter Validation- Validates timeout values must be > 0Python Tests (
test_OpenAIDefaults.py):test_setters_and_getters- Tests timeout set/get operationstest_resetters- Tests timeout reset operationstest_parameter_validation- Tests timeout validation (must be > 0)Does this PR change any dependencies?
Does this PR add a new feature? If so, have you added samples on website?
Documentation added to
OpenAI.ipynbnotebook with:OpenAIDefaultsfor global configuration