Skip to content

Snow 1853185 jdbc driver v3.16 native okta http retry storm#2064

Merged
sfc-gh-fpawlowski merged 106 commits intomasterfrom
SNOW-1853185-JDBC-Driver-v3.16-Native-Okta-HTTP-Retry-Storm
Mar 11, 2025
Merged

Snow 1853185 jdbc driver v3.16 native okta http retry storm#2064
sfc-gh-fpawlowski merged 106 commits intomasterfrom
SNOW-1853185-JDBC-Driver-v3.16-Native-Okta-HTTP-Retry-Storm

Conversation

@sfc-gh-fpawlowski
Copy link
Copy Markdown
Contributor

@sfc-gh-fpawlowski sfc-gh-fpawlowski commented Feb 5, 2025

Overview

SNOW-1853185

Pre-review self checklist

  • PR branch is updated with all the changes from master branch
  • The code is correctly formatted (run mvn -P check-style validate)
  • New public API is not unnecessary exposed (run mvn verify and inspect target/japicmp/japicmp.html)
  • The pull request name is prefixed with SNOW-XXXX:
  • Code is in compliance with internal logging requirements

External contributors - please answer these questions before submitting a pull request. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Issue: #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency or upgrading an existing one
    • I am adding new public/protected component not marked with @SnowflakeJdbcInternalApi (note that public/protected methods/fields in classes marked with this annotation are already internal)
  3. Please describe how your code solves the related issue.

The issue with the previous code was that each time we did not succeed in authenticating using OKTA we threw an exception (Authentication timeout). It was caught in the outer loop and all steps were redone.
There were two major issues - described in the Jira task. First of all we did not wait before retrying those steps - because the exception was thrown before sleeping for the backoff time. Second problem was retrying all steps, while usually some of them were unnecessary (especially the first step of the federated auth flow).

To make code more modular the chosen solution included creating 2 classes.

  1. RetryContextManager - controlling the custom logic injected into the retry loop of RestRequest class. Instead of throwing the exception we can now pass custom logic that will be able to update the request to be sent with refreshed data.
  2. RetryContext - containing information about currently ongoing request-retrying process. It is important for example to make the custom logic aware of the time passed since the first request was sent (regarding the overall timeout for the operations).

@sfc-gh-fpawlowski sfc-gh-fpawlowski self-assigned this Feb 5, 2025
Comment thread src/main/java/net/snowflake/client/jdbc/RestRequest.java
@sfc-gh-fpawlowski sfc-gh-fpawlowski marked this pull request as ready for review February 10, 2025 01:07
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java
Comment thread src/main/java/net/snowflake/client/jdbc/RetryContext.java Outdated
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java Outdated
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java
Copy link
Copy Markdown
Contributor

@sfc-gh-mkubik sfc-gh-mkubik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the failing StatementLatestIT.testSetQueryTimeoutForAsyncQueryUsingConnectionProperty test somehow related to the changes made in this PR or it's just some flaky test that failed randomly?

@sfc-gh-fpawlowski
Copy link
Copy Markdown
Contributor Author

Is the failing StatementLatestIT.testSetQueryTimeoutForAsyncQueryUsingConnectionProperty test somehow related to the changes made in this PR or it's just some flaky test that failed randomly?

Seems like there were some temporary issues - I reran the failed jobs and they passed.
Failing Jenkins afaik is caused by the GS change - being fixed here: #2109

Comment thread src/main/java/net/snowflake/client/jdbc/RetryContextManager.java Outdated
Comment thread src/test/java/net/snowflake/client/core/SessionUtilLatestIT.java Outdated
Comment thread src/main/java/net/snowflake/client/core/HttpUtil.java
Comment thread src/main/java/net/snowflake/client/core/HttpUtil.java
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java Outdated
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java Outdated
Comment thread src/main/java/net/snowflake/client/core/SessionUtil.java
@sfc-gh-fpawlowski sfc-gh-fpawlowski merged commit c71218f into master Mar 11, 2025
137 of 140 checks passed
@sfc-gh-fpawlowski sfc-gh-fpawlowski deleted the SNOW-1853185-JDBC-Driver-v3.16-Native-Okta-HTTP-Retry-Storm branch March 11, 2025 22:38
@github-actions github-actions bot locked and limited conversation to collaborators Mar 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants