-
Notifications
You must be signed in to change notification settings - Fork 164
Fix instance cache key collisions with different configurations #530
Conversation
23e4e6f to
36e3c60
Compare
36e3c60 to
9228470
Compare
|
|
||
| // Configuration parameters that should affect cache keys. These are authentication | ||
| // and session parameters that distinguish different connections to the same database. | ||
| var cacheKeyParams = []string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to more suggestions. This list is certainly not exhaustive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also allow users to override this global.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding of the issue, users try to pass a unique token here, which is then used to disambiguate two connections (whose parameters otherwise overlap). So maybe it is better to make this completely configurable by the user, without providing any pre-configured values?
|
|
||
| // Configuration parameters that should affect cache keys. These are authentication | ||
| // and session parameters that distinguish different connections to the same database. | ||
| var cacheKeyParams = []string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding of the issue, users try to pass a unique token here, which is then used to disambiguate two connections (whose parameters otherwise overlap). So maybe it is better to make this completely configurable by the user, without providing any pre-configured values?
| paramString := relevantParams.Encode() | ||
| hash := sha256.Sum256([]byte(paramString)) | ||
|
|
||
| return fmt.Sprintf("%s#%x", basePath, hash[:8]) // Use first 8 bytes of hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my understanding: why only the first 8 bytes of the hash?
| paramString := relevantParams.Encode() | ||
| hash := sha256.Sum256([]byte(paramString)) | ||
|
|
||
| return fmt.Sprintf("%s#%x", basePath, hash[:8]) // Use first 8 bytes of hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how does the # work here? If we use, e.g., hello.db#myhash as the DB path, isn't that going to create unique new files with the different hashes instead of, e.g., opening an existing file? I tried to have a look at the C++ side of this: DBInstanceCache::GetOrCreateInstance calls into either CreateInstanceInternal or GetInstanceInternal, both of which call GetDBAbsolutePath(database, *local_fs); on the database string (path). I can't find any special-handling of the #, but maybe I missed it / miss something here?
|
I just had a look at duckdb/duckdb#13129, and my understanding is that the changes in that PR are not yet in go-duckdb. I wonder if we have to do anything on the go-duckdb side at all? Once that change is in, we just pass our key-value (user-defined) parameters into the connection, and DuckDB will disambiguate? |
|
The current PR is nonsense as it stands. Let's just say that my mental model was wrong. 😄 From what I understand, duckdb/duckdb#13129 only added validation (which was shipped with DuckDB 1.4.0), and does not allow multiple users with different auth to connect to the same database file, or am I missing something? |
The current cache key generation only uses the database path and ignores query parameters. This causes cache collisions when multiple users connect to the same database instance with different session/auth params, leading to incorrect connection sharing.
Fixes https://github.com/duckdblabs/duckdb-internal/issues/5534