-
Notifications
You must be signed in to change notification settings - Fork 993
Add pyhive connection timeout #7300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 3 commits
619d06e
e82306e
cb26cb8
5862230
c0e5ba9
6b349b7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -160,7 +160,8 @@ def __init__( | |||||||||
| check_hostname=None, | ||||||||||
| ssl_cert=None, | ||||||||||
| thrift_transport=None, | ||||||||||
| ssl_context=None | ||||||||||
| ssl_context=None, | ||||||||||
| connection_timeout=None, | ||||||||||
| ): | ||||||||||
| """Connect to HiveServer2 | ||||||||||
|
|
||||||||||
|
|
@@ -175,6 +176,7 @@ def __init__( | |||||||||
| Incompatible with host, port, auth, kerberos_service_name, and password. | ||||||||||
| :param ssl_context: A custom SSL context to use for HTTPS connections. If provided, | ||||||||||
| this overrides check_hostname and ssl_cert parameters. | ||||||||||
| :param connection_timeout: Millisecond timeout for Thrift connections. | ||||||||||
|
ryanbordo marked this conversation as resolved.
Outdated
|
||||||||||
| The way to support LDAP and GSSAPI is originated from cloudera/Impyla: | ||||||||||
| https://github.com/cloudera/impyla/blob/255b07ed973d47a3395214ed92d35ec0615ebf62 | ||||||||||
| /impala/_thrift_api.py#L152-L160 | ||||||||||
|
|
@@ -193,6 +195,8 @@ def __init__( | |||||||||
| ), | ||||||||||
| ssl_context=ssl_context, | ||||||||||
| ) | ||||||||||
| if connection_timeout: | ||||||||||
| thrift_transport.setTimeout(connection_timeout) | ||||||||||
|
|
||||||||||
| if auth in ("BASIC", "NOSASL", "NONE", None): | ||||||||||
| # Always needs the Authorization header | ||||||||||
|
|
@@ -236,6 +240,8 @@ def __init__( | |||||||||
| if auth is None: | ||||||||||
| auth = 'NONE' | ||||||||||
| socket = thrift.transport.TSocket.TSocket(host, port) | ||||||||||
| if connection_timeout: | ||||||||||
|
ryanbordo marked this conversation as resolved.
Outdated
|
||||||||||
| socket.setTimeout(connection_timeout) | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FYI this sets both connection and socket timeout. There's a nice description of the difference here.
Suggested change
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good mention. My goal with this was originally to manage the socket timeout. However, I think I will leave this way to not get too granular.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In Java, a socket can have different
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From my short research, it looks like the short answer is no, but possible with an explicit
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we can call
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. okay, I read the code,
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Got it. Thanks for the feedback.
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Pushed up a solution for this, don't love it but should work. We can't guarantee a public API for setting a socket timeout so I believe it's this or reverting to only setting a timeout once, for connection and socket. Let me know what you think
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm leaning towards reverting to the connection timeout only, for both connections and socket, plus only on host, port combos for tradeoffs in simplicity. A user with a provided thrift object could fine tune themselves
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reverting to the previous PR for these reasons. Plus, the pyhive connection wrapper provides polling and ability to run async, which lessons the need for a separate socket timeout. We can always add onto this given user feedback and a user can use a custom thrift transport to fine tune this still Line 483 in c8a0ecb
Line 524 in c8a0ecb
|
||||||||||
| if auth == 'NOSASL': | ||||||||||
| # NOSASL corresponds to hive.server2.authentication=NOSASL in hive-site.xml | ||||||||||
| self._transport = thrift.transport.TTransport.TBufferedTransport(socket) | ||||||||||
|
|
||||||||||
Uh oh!
There was an error while loading. Please reload this page.