feat(go/adbc/driver/flightsql): Add OAuth Support to Flight Client #2651

Merged
merged 28 commits into from
Apr 17, 2025

xborder
Contributor

@xborder xborder commented Mar 25, 2025

Description

This pull request introduces OAuth support to the Flight client in the Go driver. The changes add OAuth access token support and implement the token exchange and client credentials OAuth flows.

Related Issues

Changes Made

  1. Added token as a database option
  2. Added support for Token Exchange. If configured, the token is exchanged and the result is added to the Authorization header as a Bearer token
  3. Added support for Client Credentials. If configured, client_id and client_secret are used to obtain an access token that is added to the Authorization header as a Bearer token
  4. Added new driver options to allow third-party applications to configure OAuth flows (see the options below)
  5. Added tests

OAuth 2.0 Configuration Options

| Option | Description |
|---|---|
| adbc.flight.sql.oauth.flow | Specifies the OAuth 2.0 flow type to use. Possible values: client_credentials, token_exchange |
| adbc.flight.sql.oauth.client_id | Unique identifier issued to the client application by the authorization server |
| adbc.flight.sql.oauth.client_secret | Secret associated with the client_id. Used to authenticate the client application to the authorization server |
| adbc.flight.sql.oauth.token_uri | The endpoint URL where the client application requests tokens from the authorization server |
| adbc.flight.sql.oauth.scope | Space-separated list of permissions that the client is requesting access to (e.g., "read.all offline_access") |
| adbc.flight.sql.oauth.exchange.subject_token | The security token that the client application wants to exchange |
| adbc.flight.sql.oauth.exchange.subject_token_type | Identifier for the type of the subject token. Check list below for supported token types. |
| adbc.flight.sql.oauth.exchange.actor_token | A security token that represents the identity of the acting party |
| adbc.flight.sql.oauth.exchange.actor_token_type | Identifier for the type of the actor token. Check list below for supported token types. |
| adbc.flight.sql.oauth.exchange.aud | The intended audience for the requested security token |
| adbc.flight.sql.oauth.exchange.resource | The resource server where the client intends to use the requested security token |
| adbc.flight.sql.oauth.exchange.scope | Specific permissions requested for the new token |
| adbc.flight.sql.oauth.exchange.requested_token_type | The type of token the client wants to receive in exchange. Check list below for supported token types. |

Supported token types:

  • urn:ietf:params:oauth:token-type:access_token
  • urn:ietf:params:oauth:token-type:refresh_token
  • urn:ietf:params:oauth:token-type:id_token
  • urn:ietf:params:oauth:token-type:saml1
  • urn:ietf:params:oauth:token-type:saml2
  • urn:ietf:params:oauth:token-type:jwt
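
As an illustration of how the exchange options above relate to the token exchange request defined in RFC 8693, here is a minimal sketch that turns an option map into the form body sent to the token endpoint. The option keys are the ones listed in this PR; the mapping from each option to a form parameter (for example `aud` to `audience`) and the helper name `tokenExchangeForm` are assumptions made for this example, not code from the driver.

```go
package main

import (
	"fmt"
	"net/url"
)

// grantTypeTokenExchange is the grant_type value defined by RFC 8693.
const grantTypeTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"

// exchangeParams maps driver option keys to RFC 8693 form parameter names.
// The option keys come from the PR description; the mapping is an assumption.
var exchangeParams = map[string]string{
	"adbc.flight.sql.oauth.exchange.subject_token":        "subject_token",
	"adbc.flight.sql.oauth.exchange.subject_token_type":   "subject_token_type",
	"adbc.flight.sql.oauth.exchange.actor_token":          "actor_token",
	"adbc.flight.sql.oauth.exchange.actor_token_type":     "actor_token_type",
	"adbc.flight.sql.oauth.exchange.aud":                  "audience",
	"adbc.flight.sql.oauth.exchange.resource":             "resource",
	"adbc.flight.sql.oauth.exchange.scope":                "scope",
	"adbc.flight.sql.oauth.exchange.requested_token_type": "requested_token_type",
}

// tokenExchangeForm (hypothetical helper) builds the request body for a token
// exchange. RFC 8693 requires subject_token and subject_token_type; every
// other parameter is optional and is only sent when the option is set.
func tokenExchangeForm(opts map[string]string) (url.Values, error) {
	if opts["adbc.flight.sql.oauth.exchange.subject_token"] == "" ||
		opts["adbc.flight.sql.oauth.exchange.subject_token_type"] == "" {
		return nil, fmt.Errorf("subject_token and subject_token_type are required")
	}
	form := url.Values{}
	form.Set("grant_type", grantTypeTokenExchange)
	for opt, param := range exchangeParams {
		if v := opts[opt]; v != "" {
			form.Set(param, v)
		}
	}
	return form, nil
}

func main() {
	form, err := tokenExchangeForm(map[string]string{
		"adbc.flight.sql.oauth.exchange.subject_token":      "example-subject-token",
		"adbc.flight.sql.oauth.exchange.subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
		"adbc.flight.sql.oauth.exchange.aud":                "example-audience",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(form.Encode())
}
```

The token values and audience in `main` are placeholders.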

@xborder xborder changed the title feat(go/adbc/flightsql): Add OAuth Support to Flight Client feat(go/adbc/driver/flightsql): Add OAuth Support to Flight Client Mar 25, 2025
@xborder xborder marked this pull request as ready for review March 25, 2025 23:42
@xborder xborder requested a review from zeroshade as a code owner March 25, 2025 23:42
@github-actions github-actions bot added this to the ADBC Libraries 18 milestone Mar 25, 2025
@jbonofre jbonofre self-requested a review March 27, 2025 12:59
@jbonofre
Member

I would suggest updating the docs as part of this PR (basically with what you have in the PR description).

@howareyouman howareyouman left a comment

Thank you, Helder, for this PR!
The only concerns I have are around thread safety.

}, nil
}

func (f *tokenExchange) GetToken(ctx context.Context) (*oauth2.Token, error) {

Could you please add RWLock here to make this method thread safe?

}, nil
}

func (c *clientCredentials) GetToken(ctx context.Context) (*oauth2.Token, error) {

The same question applies here: should it be thread safe? If so, could you please add an RWLock here?

Member

@zeroshade zeroshade left a comment

Maybe we can use https://pkg.go.dev/golang.org/x/oauth2/clientcredentials#Config instead and just call https://pkg.go.dev/golang.org/x/oauth2/clientcredentials#Config.TokenSource

If we do that, then most of this code can be entirely removed in favor of just using https://pkg.go.dev/google.golang.org/grpc/credentials/oauth#TokenSource and https://pkg.go.dev/google.golang.org/grpc#WithPerRPCCredentials which would manage the oauth flow for us entirely as long as we can construct the token source which performs refreshes as necessary (as per the clientcredentials config above)

Essentially, SetOptions can create the TokenSource and then getFlightClient just adds the token source as a dialoption via WithPerRPCCredentials. That would be much more convenient and less work/code for us to maintain

@xborder
Contributor Author

xborder commented Apr 7, 2025

@zeroshade I was not aware of that, thank you for bringing this up. While testing this proposal I realized that TokenSource requires transport security. I think this conflicts with development and with how the tests run, since it makes TLS mandatory. What alternatives do I have here? The options I see:

  • grpc-go seems to have some test certs for its examples. Would it make sense to have the same for these tests?
  • Implement PerRPCCredentials ourselves so that RequireTransportSecurity returns false under certain conditions. Besides that, I think the implementation would not be much different, just more aligned with grpc DialOptions.

Do you have any suggestions?

@xborder xborder closed this Apr 7, 2025
@xborder xborder reopened this Apr 7, 2025
@zeroshade
Member

We have a TLS suite set up already for tests here:

you can reuse that

@zeroshade
Member

Also, TokenSource on its own shouldn't require transport security; it just has a method that returns a bool letting the caller know whether or not it requires it.

xborder added 7 commits April 8, 2025 19:31
Replaced extra structs for OAuth in favor of grpc's TokenSource and grpc.DialOption WithPerRPCCredentials
* Changed OAuth tests to use TLS since it is a requirement from grpc's TokenSource and WithPerRPCCredentials
* Split DoSetupSuit to setupFlightServer and setupDatabase so they can be used independently
* Removed adbc.flight.sql.token in favour of adbc.flight.sql.authorization_header when the client wants to pass a token or adbc.flight.sql.oauth.exchange.subject_token when client wants to do token exchange
* Simplified options to set oauth flow. Now client can set client_credentials or token_exchange instead of integers
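
With the option set described in these commits, the three ways of authenticating could be configured roughly as in the sketch below. The `adbc.flight.sql.*` keys and the flow values are taken from this PR; the `uri` key, the `Bearer ` prefix on the header value, the requirement of `token_uri` for token exchange, and the helper names are assumptions made for the example. Each map would then be passed to the driver when creating a database.

```go
package main

import "fmt"

// Option keys. The adbc.flight.sql.* names come from this PR; "uri" is
// assumed to be the standard ADBC database URI key.
const (
	optURI                 = "uri"
	optAuthorizationHeader = "adbc.flight.sql.authorization_header"
	optOAuthFlow           = "adbc.flight.sql.oauth.flow"
	optOAuthTokenURI       = "adbc.flight.sql.oauth.token_uri"
	optOAuthClientID       = "adbc.flight.sql.oauth.client_id"
	optOAuthClientSecret   = "adbc.flight.sql.oauth.client_secret"
	optOAuthScope          = "adbc.flight.sql.oauth.scope"
	optSubjectToken        = "adbc.flight.sql.oauth.exchange.subject_token"
	optSubjectTokenType    = "adbc.flight.sql.oauth.exchange.subject_token_type"
)

// staticTokenOptions passes an already obtained access token directly,
// replacing the removed adbc.flight.sql.token option.
func staticTokenOptions(uri, accessToken string) map[string]string {
	return map[string]string{
		optURI:                 uri,
		optAuthorizationHeader: "Bearer " + accessToken,
	}
}

// clientCredentialsOptions selects the client credentials flow.
func clientCredentialsOptions(uri, tokenURI, clientID, clientSecret, scope string) map[string]string {
	return map[string]string{
		optURI:               uri,
		optOAuthFlow:         "client_credentials",
		optOAuthTokenURI:     tokenURI,
		optOAuthClientID:     clientID,
		optOAuthClientSecret: clientSecret,
		optOAuthScope:        scope,
	}
}

// tokenExchangeOptions selects the token exchange flow for a subject token.
func tokenExchangeOptions(uri, tokenURI, subjectToken, subjectTokenType string) map[string]string {
	return map[string]string{
		optURI:              uri,
		optOAuthFlow:        "token_exchange",
		optOAuthTokenURI:    tokenURI,
		optSubjectToken:     subjectToken,
		optSubjectTokenType: subjectTokenType,
	}
}

func main() {
	// All endpoints and credentials below are placeholders.
	opts := clientCredentialsOptions(
		"grpc+tls://localhost:32010",
		"https://idp.example.com/oauth2/token",
		"my-client", "my-secret", "read.all offline_access",
	)
	for k, v := range opts {
		fmt.Printf("%s = %s\n", k, v)
	}
}
```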
@xborder xborder requested a review from lidavidm as a code owner April 11, 2025 17:47
@@ -159,6 +159,12 @@ few optional authentication schemes:
header will then be sent back as the ``authorization`` header on all
future requests.

- (Go only) OAuth 2.0 authentication flows.
Member

Is this really Go only? Anything that uses the flightsql driver should be able to use the options that are being added. (We should add constants to the python adbc_driver_flightsql package)

Contributor Author

Does it make sense to create a separate PR for this?

Member

It makes sense to make a separate PR to add the option constants, but I would still say that the "Go only" should be removed as nothing would prevent any other binding from using these options.

Contributor Author

Done. #2714

@davidhcoe
Contributor

Just to add to this (and this could be a follow-on PR) - what happens if https://github.com/apache/arrow-adbc/blob/a187ead78afebe85c75b466a06ad6e01ae4ac8c6/go/adbc/driver/flightsql/flightsql_statement.go#L194C21-L194C25 is part of a long running operation and the token expires? How does the token get refreshed and then continue the poll operation?

By the way, this is a concern for all ADBC drivers in general, as we don't seem to have a standard way of failing "mid stream" and then "resume" the long running operation without starting it over.

@davidhcoe
Contributor

Or if

return cnxn.execute(ctx, s.sqlQuery, opts...)
runs long.

@davidhcoe
Contributor

And for context, #2655 is making an attempt to demonstrate this capability for BigQuery in the .NET driver when using Entra authentication. The Databricks driver also needs to do this. Essentially, any OAuth protected data source could support this type of retry behavior.

@xborder
Contributor Author

xborder commented Apr 14, 2025

@davidhcoe I think you can refer to this comment.
TokenSource returns the token it already obtained, or triggers a refresh if that token has expired.
I tested locally and could confirm that it was triggering refresh calls to the IdP in between flight calls. Even if the request runs long, it should still be valid until the request is finished/aborted. For the next request it should refresh.
Testing those edge cases will require a better setup, but this PR should cover the most common scenarios.

@xborder xborder requested a review from zeroshade April 16, 2025 13:13
Member

@zeroshade zeroshade left a comment

Just a couple of nits, but otherwise this looks good to me! Thanks for this!

@zeroshade zeroshade merged commit 2ea2fcb into apache:main Apr 17, 2025
40 of 41 checks passed