All notable changes to the DuckDB MSSQL Extension are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- dbt segfault with `threads >= 2` (spec 052, closes #126). Catalog entries (`MSSQLTableEntry`, `MSSQLSchemaEntry`) switched from `unique_ptr` to `shared_ptr` ownership; concurrent first-load of the same table is coordinated via per-table singleflight so only one thread issues the SQL Server round trip (waiters re-check the cache). Lifetime extension across `Invalidate()` / `OnDetach` is done via per-`ClientContext` bind-time anchors registered as `ClientContextState`: every `MSSQLSchemaEntry::LookupEntry` / `MSSQLCatalog::LookupSchema` / catalog `Scan` callback path stashes the `shared_ptr` into the per-context `MSSQLBindAnchors` holder, which DuckDB releases via the `QueryEnd` hook. In-flight binders therefore hold every entry they touched alive for the full bind+execute span; release happens naturally at end of query, with no catalog-lifetime accumulation. Audit of `MSSQLMetadataCache::GetTableMetadata` confirms its only caller (`MSSQLTableSet::LoadSingleEntry`) copies fields immediately; contract pinned in the header. `MSSQLStatisticsProvider` returns by value, no pointer-handout surface.
- `MSSQLTableEntry::EnsurePKLoaded` double-free under thread stress (spec 052). Two threads both saw the load flag false, both fetched, and both move-assigned to `pk_info_`; the second move freed the loser's previous `vector<PKColumnInfo>` while the first thread still held it. Caught by AddressSanitizer during the spec 052 invalidation-race soak. Fixed with a `pk_load_mutex_` + `std::atomic<bool> pk_loaded_` acquire/release publication so the lock-free fast path stays correct under the C++ memory model.
- `MSSQLCatalog::RefreshCache` connection leak on TDS hiccup (spec 052). Under thread stress (~318 invalidations / 30 s), SQL Server occasionally returns a transient TDS error mid-`Refresh`. The exception propagated past `connection_pool_->Release`, leaving one connection checked out and tripping `~ConnectionPool`'s quiescence-contract assert at catalog teardown. Wrapped the `Refresh` call in try/release.
- `MSSQLTableEntry::EnsurePKLoaded` / `GetStorageInfo` connection leak on TDS hiccup (spec 052). Both functions intentionally swallow exceptions to fall back to `pk_info_.exists = false` / cached `approx_row_count_`, but the outer `catch (...)` never released the pool-owned connection acquired inside the try. Every SELECT bind calls `EnsurePKLoaded`, so under scenario 5/8 stress a single TDS hiccup inside `PrimaryKeyInfo::Discover` stranded one connection in `active_connections_` for the lifetime of the pool. Nested an inner try/release/throw around the SQL Server I/O so the connection is returned BEFORE the outer fallback catch runs.
- `ConnectionPool::Shutdown` cleanup-thread strand on quiescence violation (spec 052). `D_ASSERT` in DuckDB-debug builds throws `InternalException` rather than calling `abort()`. The throw fired while `cleanup_thread_` was still joinable; `~ConnectionPool noexcept` caught the exception, but `~std::thread` on the joinable thread then called `std::terminate()`. Reordered: signal + join the cleanup thread and close pooled connections FIRST, then emit the warning and assert at the end. The warning is the operator-visibility signal; the trailing assert preserves the debug invariant without stranding resources on its unwind path.
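The `EnsurePKLoaded` fix above (a mutex plus an acquire/release atomic flag) can be sketched as follows. This is a minimal illustration, not the extension's code: `LazyPKInfo`, its members, and the reduced `PKColumnInfo` are hypothetical stand-ins, and the loader callable takes the place of the SQL Server round trip.

```cpp
#include <atomic>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for the extension's PKColumnInfo.
struct PKColumnInfo {
    std::string name;
};

// Hypothetical holder showing mutex + atomic<bool> publication.
class LazyPKInfo {
public:
    // `load` is any callable returning std::vector<PKColumnInfo>; it stands in
    // for the SQL Server round trip and runs at most once if it succeeds.
    template <class Loader>
    const std::vector<PKColumnInfo> &Ensure(Loader &&load) {
        // Fast path: the acquire load pairs with the release store below, so a
        // thread that observes `true` also observes the fully written vector.
        if (loaded_.load(std::memory_order_acquire)) {
            return columns_;
        }
        std::lock_guard<std::mutex> guard(mutex_);
        // Re-check under the lock: another thread may have completed the load
        // while this one was waiting for the mutex.
        if (!loaded_.load(std::memory_order_relaxed)) {
            columns_ = load(); // only one thread ever assigns
            loaded_.store(true, std::memory_order_release);
        }
        return columns_;
    }

private:
    std::mutex mutex_;
    std::atomic<bool> loaded_{false};
    std::vector<PKColumnInfo> columns_;
};
```

Because the assignment happens only under the lock and only while the flag is still false, two racing threads can no longer both move-assign into the same vector. If the loader throws, the flag stays false and the next caller retries.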
- `test/cpp/test_concurrent_reads.cpp` scenarios 5-8 (spec 052 US2/US3 + concurrent-write acceptance): 30 s soak runs that exercise every spec 052 lifetime guarantee under AddressSanitizer/UBSan.
  - Scenario 5: 4 readers + invalidator at 50 ms cadence (~2500 reads × ~300 invalidations).
  - Scenario 6: scenario 5 plus a `duckdb_schemas()` / `duckdb_tables()` walker exercising the bulk-scan anchor path (~1200 reads × ~260 invalidations × ~200 schema walks).
  - Scenario 7: 4 writers + 1 reader on one shared table with disjoint PK ranges (~550 INSERT/UPDATE/DELETE cycles + ~500 reads).
  - Scenario 8: 4 pure-write threads (~2700 INSERTs / 30 s).
- `.github/workflows/concurrency-tests.yml` job that rebuilds the extension with ASan/UBSan and runs `test-concurrent-reads` on every PR that touches the catalog or singleflight surfaces. Includes an 8 GB swap on `/mnt` to absorb the link-time RAM peak, a `TestDB`-creation step for scenarios 4-8, and `LD_PRELOAD`-ed `libasan` / `libubsan` on the test binary run only (so the preload doesn't leak into vcpkg's compiler probe).
- Wipe bearer credentials on destruction. `MSSQLConnectionInfo` gains a user-declared destructor that `OPENSSL_cleanse`s `password` and `access_token`; `~MSSQLCatalog` wipes the cached `fedauth_token_utf16le_` byte vector. `OPENSSL_cleanse` defeats dead-store elimination, so secrets do not linger in heap-recycled memory after the owning `shared_ptr<MSSQLConnectionInfo>` or `MSSQLCatalog` is destroyed. Rule-of-five compliance: copy/move ctors and assigns on `MSSQLConnectionInfo` are explicitly `= default`-ed so that user-declaring the destructor doesn't silently disable move generation.
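A minimal sketch of the wipe-on-destruction pattern described above. To stay self-contained it does not call OpenSSL: `SecureWipe` is a hypothetical portable stand-in for `OPENSSL_cleanse`, and `ConnectionSecrets` is a reduced, hypothetical version of a credential holder, not the real `MSSQLConnectionInfo`.

```cpp
#include <cstddef>
#include <string>

// Stand-in for OPENSSL_cleanse (assumption: used here only so the sketch has
// no OpenSSL dependency). Writing through a volatile pointer is a common way
// to keep the compiler from treating the stores as dead.
inline void SecureWipe(void *data, std::size_t len) {
    volatile unsigned char *p = static_cast<volatile unsigned char *>(data);
    while (len--) {
        *p++ = 0;
    }
}

// Hypothetical, reduced credential holder.
struct ConnectionSecrets {
    std::string password;
    std::string access_token;

    ConnectionSecrets() = default;
    // A user-declared destructor suppresses the implicit move operations, so
    // all four copy/move members are defaulted explicitly.
    ConnectionSecrets(const ConnectionSecrets &) = default;
    ConnectionSecrets(ConnectionSecrets &&) = default;
    ConnectionSecrets &operator=(const ConnectionSecrets &) = default;
    ConnectionSecrets &operator=(ConnectionSecrets &&) = default;

    ~ConnectionSecrets() {
        WipeString(password);
        WipeString(access_token);
    }

    // Overwrites the string's current characters in place; the length is kept.
    static void WipeString(std::string &s) {
        if (!s.empty()) {
            SecureWipe(&s[0], s.size());
        }
    }
};
```

Note that this only clears the buffer the string currently owns; copies made earlier (for example by reallocation) are outside its reach.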
Major release: integrated authentication (Kerberos + Windows SSPI), process-wide singleton cleanup, security hardening, and a deep codec refactor. Closes #82 (custom Application Name) and #96 (ATTACH/DETACH-in-Python-loop crash class).
- Custom Application Name in connection string (spec 047 FR-014, closes #82). ADO.NET keys `Application Name` / `ApplicationName` / `App Name` / `application_name` (case-insensitive), URI query parameter `applicationname`, and MSSQL secret fields `application_name` (canonical) / `applicationname` (fallback) propagate to LOGIN7 `program_name`, visible as `APP_NAME()` / `sys.dm_exec_sessions.program_name`. Empty falls back to the extension default (`"DuckDB MSSQL Extension"`); values exceeding 128 UTF-16 code units are clamped client-side so what the user sees in `APP_NAME()` equals what we sent.
- `lazy_validation` ATTACH option + `mssql_attach_validation_timeout` setting (spec 047 FR-011). Eager ATTACH validation is on by default (wrong password / unreachable host fail ATTACH instead of being deferred to the first query); opt out per ATTACH with `lazy_validation true` to preserve the pre-047 lazy behaviour (useful for container / orchestrator startup where the SQL Server may not yet be reachable). The setting bounds the eager round-trip; `0` (default) inherits `mssql_connection_timeout`.
- `mssql_close_all()` scalar function (spec 047 FR-013). Closes every diagnostic-API connection opened via `mssql_open()` in one call; returns the count of handles closed. Idempotent: a second call after a full close returns 0. Recommended shutdown hook for hosts that use the diagnostic API but do not track individual handles. Marked `[DEPRECATED]` from registration: lives in the same group as `mssql_open` / `mssql_close` / `mssql_ping` and will be removed alongside them in a future major release once the catalog-bound API covers all diagnostic needs (FR-010).
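The empty-value fallback and the 128-code-unit clamp for the application name can be sketched like this. `ResolveAppNameSketch` is a hypothetical helper written for illustration; the default string and the limit come from the entry above, while the surrogate-pair handling is an assumption, since the changelog does not say how that edge case is treated.

```cpp
#include <cstddef>
#include <string>

// Default and limit as stated in the changelog entry.
static const char16_t kDefaultAppName[] = u"DuckDB MSSQL Extension";
static const std::size_t kMaxAppNameUnits = 128;

// Hypothetical helper: returns the program_name to place in LOGIN7.
std::u16string ResolveAppNameSketch(const std::u16string &requested) {
    if (requested.empty()) {
        return kDefaultAppName; // nothing supplied: use the extension default
    }
    if (requested.size() <= kMaxAppNameUnits) {
        return requested;
    }
    std::size_t cut = kMaxAppNameUnits;
    // Assumption: avoid splitting a surrogate pair. If the last kept unit is a
    // high surrogate, drop it as well so the result stays valid UTF-16.
    char16_t last = requested[cut - 1];
    if (last >= 0xD800 && last <= 0xDBFF) {
        --cut;
    }
    return requested.substr(0, cut);
}
```

Clamping before the value is sent is what keeps `APP_NAME()` on the server identical to the string the client believes it supplied.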
- LOGIN7 default `program_name` unified to `"DuckDB MSSQL Extension"` (spec 047 FR-014 side-effect). Pre-047 SQL auth sent `"DuckDB"` and integrated auth sent `"DuckDB MSSQL Extension"`. The single resolution point (`ResolveAppName` helper, called by every auth path) now sends `"DuckDB MSSQL Extension"` uniformly when no `application_name` is supplied. Observable change for SQL-auth users who previously saw `APP_NAME() = 'DuckDB'`; they will now see `'DuckDB MSSQL Extension'` unless they explicitly set `Application Name=DuckDB`.
- `mssql_open` / `mssql_close` / `mssql_ping` are now documented as `[DEPRECATED]` (spec 047 FR-010). They remain functional and are kept for backward compatibility. Prefer ATTACH + the catalog-bound functions (`mssql_scan`, `mssql_exec`, `mssql_pool_stats`) which integrate with DuckDB's catalog lifecycle and the per-catalog pool ownership introduced in spec 047. The handle manager singleton these three functions share is the last extension-internal process-wide state and will be removed together with the functions themselves.
- Azure TokenCache cross-instance aliasing fixed (spec 047 FR-012). Pre-047, two DuckDB instances in the same process that each defined a secret with the same name (e.g. `mssql_secret`) shared a single TokenCache row keyed by `secret_name` alone; instance B could silently authenticate with instance A's already-acquired token even when the two secrets resolved to different Azure principals. The cache key is now namespaced by `(DatabaseInstance address, cache_key)`; tokens from different instances are independent. The `OnDetach` invalidation path is scoped to the calling instance's namespace so a sibling instance sharing the secret name keeps its token.
- ATTACH credentials are now validated eagerly by default (spec 047 FR-011). Wrong passwords / unreachable hosts surface as ATTACH errors instead of being deferred to the first query. Error messages never contain the password (audited via sentinel substring assertion in `test/sql/attach/attach_validates_credentials.test`). Opt out with `lazy_validation true` for container/orchestration scenarios; ceiling controlled by the new `mssql_attach_validation_timeout` setting.
- Process-wide singletons removed (spec 047, closes issue #96): `MssqlPoolManager`, `MSSQLContextManager`, and `MSSQLResultStreamRegistry` are gone. Connection pools are now owned per-`MSSQLCatalog` via `unique_ptr`; result streams live on the catalog as `RegisterStream` / `RetrieveStream` methods. Two attached MSSQL databases under different ATTACH aliases no longer alias to the same pool, and ATTACH/DETACH cycles in a Python-style loop (the issue #96 repro) succeed indefinitely.
- Windows SSPI integrated authentication (spec 042 Phase 4). `WinSspiAuthenticator` via `secur32.dll`'s Negotiate package. Uses the current Windows logon session, so no `kinit` is needed. Same connection-string surface as POSIX (`Trusted_Connection=yes` / `authenticator=winsspi`). Mirrors the structure of `Krb5Authenticator`; shares the `IAuthenticator` interface and the SPNEGO continuation loop in `TdsConnection::AuthenticateIntegrated`. Linked against `secur32.lib` from the Windows SDK; no third-party dependency.
- Integrated Authentication (Kerberos) for POSIX hosts (spec 042, phases 1-3). Adds the `IAuthenticator` multi-round interface, parser support for the `microsoft/go-mssqldb` connection-string surface, LOGIN7 `fIntSecurity` wiring, `0xED` SSPI continuation tokens, and a POSIX Kerberos backend via system GSSAPI. Self-contained `test/kerberos/` docker-compose stack (KDC + SQL Server + test-client); no real Active Directory required.
  - New connection-string keys (verbatim from `go-mssqldb`): `authenticator`, `krb5-configfile`, `krb5-keytabfile`, `krb5-credcachefile`, `krb5-realm`, `service_principal_name`.
  - Aliases: `Trusted_Connection=yes`, `Integrated Security=SSPI/true`; these resolve to `krb5` on POSIX, `winsspi` on Windows.
  - Three credential modes on POSIX (Linux only for keytab + raw): credential cache (default, uses `kinit` ticket), keytab, raw credentials (secret-only).
  - macOS supports credential-cache mode (uses `GSS.framework`); keytab and raw modes are rejected at construction time with a clear error pointing at the Linux container path.
  - Verbatim GSSAPI status text in errors plus actionable hints (no ccache → run kinit; clock skew → ntp/chrony; SPN not registered → setspn -L; etc.) per spec 042 R8.
  - New end-user documentation: `Kerberos.md` (mirrors `AZURE.md`).
  - Windows SSPI (`winsspi` authenticator) is Phase 4, pending. WSL2 Ubuntu is the supported testing path on Windows in the meantime.
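The TokenCache namespacing fix listed above amounts to widening the cache key from a name to an (instance, name) pair. A minimal sketch under stated assumptions: `TokenCacheSketch` and its methods are hypothetical, the instance is represented by an opaque address, and locking is omitted for brevity (a real process-wide cache would need it).

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical token cache keyed by (owning instance address, cache key).
class TokenCacheSketch {
public:
    using Key = std::pair<std::uintptr_t, std::string>;

    void Put(const void *instance, const std::string &cache_key, std::string token) {
        tokens_[MakeKey(instance, cache_key)] = std::move(token);
    }

    // Returns nullptr when this instance holds no token under cache_key.
    const std::string *Find(const void *instance, const std::string &cache_key) const {
        auto it = tokens_.find(MakeKey(instance, cache_key));
        return it == tokens_.end() ? nullptr : &it->second;
    }

    // Drops only the calling instance's entry, as an OnDetach-style hook would.
    void Invalidate(const void *instance, const std::string &cache_key) {
        tokens_.erase(MakeKey(instance, cache_key));
    }

private:
    static Key MakeKey(const void *instance, const std::string &cache_key) {
        // The address is used purely as an opaque namespace identifier.
        return Key(reinterpret_cast<std::uintptr_t>(instance), cache_key);
    }

    std::map<Key, std::string> tokens_;
};
```

With this key shape, two instances that both define `mssql_secret` get separate rows, and invalidating one leaves the sibling's token in place.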
- Linux build with Kerberos enabled now links `libkrb5` explicitly. Previous builds failed at the link step with `undefined reference to symbol 'krb5_free_error_message'` on distros where `krb5-gssapi.pc` doesn't transitively pull in `libkrb5` (Ubuntu 24.04 is the documented case). Affects spec 042 raw-credentials mode users on Linux only; macOS uses `GSS.framework`, which bundles all symbols. Configure-time warnings now cite both Debian (`libkrb5-dev`) and RHEL (`krb5-devel`) package names.
- Hardened FEDAUTH JWT debug logging: `tds_connection.cpp` previously hex-dumped the first 20 bytes of the access token at debug level 2. Replaced with size-only logging plus `(contents redacted)`.
- Raw-credentials Kerberos mode is SECRET-ONLY by design: cleartext `Password` is rejected in any connection string when integrated auth is selected. Defends against cleartext passwords in connection-string logs.
- Per-connection krb5 overrides (`krb5-configfile`, `krb5-credcachefile`) apply through `gss_acquire_cred_from` `cred_store` elements per instance, not via process-global `setenv()`. Thread-safe vs concurrent `getenv` on pool worker threads.
- Raw-mode `MEMORY:` ccache is destroyed after `gss_acquire_cred_from` copies credentials internally, so cleartext credentials don't linger in MIT's process-global ccache registry.
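The size-only logging from the first item in this list can be illustrated with a small helper. `DescribeTokenForLog` and the exact message wording are assumptions made for this sketch; only the "size plus `(contents redacted)`" idea comes from the entry.

```cpp
#include <string>

// Hypothetical helper: describes a bearer token for debug output without
// copying any of its bytes into the log line.
std::string DescribeTokenForLog(const std::string &token) {
    return "access token: " + std::to_string(token.size()) + " bytes (contents redacted)";
}
```

Logging the length still helps diagnose truncated or empty tokens, while even a short prefix of a JWT would leak header material into log files.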
- README's stale "Windows Authentication: Only SQL Server authentication is supported" limitation removed. Windows SSPI is now scoped as "Phase 4 pending" with WSL2 documented as the interim testing path.
- README's Secret Fields and Key Aliases tables expanded with Kerberos rows.
- `docs/architecture.md` Authentication Strategy Pattern section updated to document the new `IAuthenticator` layered interface and the `IntegratedAuthStrategy` adapter.
- `docs/TESTING.md` gained a Kerberos Tests section covering the docker-compose stack and WSL2 testing.
- New `src/include/tds/auth/iauthenticator.hpp`: three-method multi-round interface (`InitialBytes` / `NextBytes` / `Free`), modeled on `microsoft/go-mssqldb`'s `integratedauth.IntegratedAuthenticator`. No DuckDB headers, so the TDS auth layer is reusable outside DuckDB.
- New `src/tds/auth/krb5_authenticator.{hpp,cpp}`: POSIX GSSAPI implementation. SPNEGO mechanism. Inline GSS OID literals to work around macOS GSS.framework not exporting the well-known OID symbols.
- New `src/include/tds/auth/integrated_auth_strategy.hpp`: adapter wrapping `IAuthenticator` in the existing `AuthenticationStrategy` interface.
- `src/tds/tds_protocol.cpp` gains `BuildLogin7WithSSPI` (sets `OptionFlags2.fIntSecurity`, writes SPNEGO blob into LOGIN7's SSPI field; `cbSSPILong` fallback for blobs > 65 535 bytes) and `BuildSSPIMessage` (continuation packet, type `0x11`).
- `src/tds/tds_token_parser.cpp` recognizes `TokenType::SSPI = 0xED`.
- `src/tds/tds_connection.cpp` gains `AuthenticateIntegrated()`, which drives the full SPNEGO continuation loop on `0xED` tokens, with an 8-round cap to detect cross-realm misconfiguration.
- `src/connection/mssql_pool_manager.cpp` gains `GetOrCreatePoolWithIntegratedAuth`: each pool connection builds a fresh `Krb5Authenticator` so kinit-refreshed tickets are picked up on the next fill. Logs verbatim GSSAPI errors to stderr on pool refill failures.
- `CMakeLists.txt` adds `ENABLE_KRB5` option (default ON on POSIX), pkg-config GSSAPI discovery on Linux, `find_library(GSS_FRAMEWORK GSS)` on macOS, `secur32` linkage hook for Windows (Phase 4).
- Type codec consolidation (spec 045). Per-type encoding/decoding/literal/DDL logic consolidated into 9 family modules under `src/codec/`: boolean/integer/float/decimal/money/string/binary/datetime/uuid. Each `<family>_codec.cpp` owns `EncodeToBcp` / `DecodeFromTds` / `FormatSqlLiteral` / `FormatDdlTypeName` for its types. Dispatch via `FamilyFromLogicalType` switch in `literal_format.cpp` + `type_family.cpp`. 5 LogicalType-side dispatch sites collapsed; net −762 LOC across dispatch sites (3243→2481, −23.5%). Bonus: TIMESTAMP_MS/NS/S/TZ now round-trip losslessly through SQL Server DATETIME2(3/7/0/7) with full type-transparency. Closes issue #91 (BCP nvarchar character-vs-byte length) and #89 (VIEW catalog-vs-runtime type divergence). No new vcpkg deps. Per-row bench (1M rows): within 5% gate vs spec-044 baseline.
- Named instance resolver (spec 045 phases 0-2). SQL Server Browser (UDP 1434) discovery for named instances. Mock-browser test stack under `test/named-instance/`.
- UTF-16 codec consolidation (spec 044). Finishes the simdutf migration started in spec 043: every legacy `Utf16LE*` call site moves to the simdutf-backed wrapper. simdutf becomes the production UTF-16 codec; the legacy hand-rolled converter survives only as a private invalid-input fallback. Includes microbenchmark (`make bench-utf16`) and an end-to-end before/after benchmark (`test/bench/bench_codec_e2e.sh`).
- LOGIN7 non-ASCII fix + simdutf foundation (spec 043). Non-ASCII bytes in LOGIN7 username/password/database fields no longer get corrupted by the hand-rolled UTF-16 converter. Adds simdutf as a vcpkg dependency (statically linked, MIT). Foundation for spec 044's full migration.
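The multi-round interface and the capped continuation loop listed above can be sketched as follows. Everything here is hypothetical: `AuthenticatorSketch` only mirrors the three method names (`InitialBytes` / `NextBytes` / `Free`) with guessed signatures, `exchange` stands in for sending client bytes and reading the server's next `0xED` token, and the convention that an empty reply means the server has finished is an assumption of this sketch.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical shape of a three-method authenticator; real signatures may differ.
struct AuthenticatorSketch {
    virtual ~AuthenticatorSketch() = default;
    virtual Bytes InitialBytes() = 0;                        // first client token
    virtual Bytes NextBytes(const Bytes &server_token) = 0;  // reply to a server token
    virtual void Free() = 0;                                 // release security context
};

// Runs the token exchange until the server stops sending continuation tokens.
// Returns the number of rounds used; throws once max_rounds is exhausted, which
// in practice would point at a misconfiguration rather than a slow handshake.
int RunContinuationLoop(AuthenticatorSketch &auth,
                        const std::function<Bytes(const Bytes &)> &exchange,
                        int max_rounds = 8) {
    Bytes outgoing = auth.InitialBytes();
    for (int round = 1; round <= max_rounds; ++round) {
        Bytes server_token = exchange(outgoing);
        if (server_token.empty()) {
            auth.Free(); // handshake complete
            return round;
        }
        outgoing = auth.NextBytes(server_token);
    }
    auth.Free(); // release the context before reporting the failure
    throw std::runtime_error("SSPI continuation exceeded the round cap");
}
```

Keeping the interface free of transport details is what lets one loop serve both a GSSAPI backend and an SSPI backend.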
- Tier-1 lint and security checks added: CodeQL (C++), gitleaks, shellcheck, hadolint, yamllint, codespell. Dependabot updates enabled. PR prompt-injection scanner for review descriptions.
- CodeQL speedup (3-part): target restriction + vcpkg cache + submodule trim. Cuts CodeQL job runtime substantially on PR triggers.
- Kerberos integration job in CI: spins up the `test/kerberos/docker-compose.yml` KDC + SQL Server + test-client stack for every PR touching the integrated-auth path.
- Drive-by fix: 9 codec headers (spec 045) had `class ColumnMetadata` / `class BCPColumnMetadata` forward declarations while the real definitions are `struct`. MSVC mangles `class` and `struct` differently (clang/gcc don't), producing 16 unresolved-external LNK2019 errors at link time. Latent regression: the last successful MSVC build on main was 2026-05-15, BEFORE spec 045 merged. Fixed all 9 forward declarations.
0.1.18 - 2026-02-24
- XML data type support (spec 041). XML columns read as VARCHAR; BCP write path; clear errors for INSERT-with-RETURNING / UPDATE on XML columns.
- UDT type alias crash in catalog metadata queries (issue #81).
Earlier versions are tracked in git history under specs/NNN-*/ directories.
Notable recent specs:
- 041-xml-type-support — XML column read/write (TDS type 0xF1).
- 040-fix-datetimeoffset-nbc — DATETIMEOFFSET in NBC row reader.
- 039-order-pushdown — ORDER BY pushdown to SQL Server (experimental).
- 037-replace-libcurl-httplib — Replaced libcurl with bundled cpp-httplib for Azure OAuth2.
- 036-azure-token-docs — Azure AD documentation expansion.
- 034-duckdb-v15-upgrade — DuckDB v1.5 upgrade.
- 033-fix-catalog-scan — Catalog metadata cache fix.
- 032-fedauth-token-provider — Manual access token support for Azure AD.
- 031-connection-fedauth-refactor — Auth strategy pattern introduction.
- 027-ctas-bcp-integration — CTAS via BCP protocol.
- 024-mssql-copy-bcp — COPY TO via BCP.
- 020-multi-statement-scan — Multi-statement support in `mssql_scan`.
- 001-azure-token-infrastructure — Initial Azure AD support.
See specs/ for the full feature design history.