bigquery: add schema resolver, evolution, and table name sanitization #4385
Schema resolution + evolution
- Add schemaResolver that fetches table metadata, builds a proto descriptor via the adapt pipeline, caches per table, and deduplicates concurrent cold-cache loads via singleflight.
- Add schemaEvolver that diffs proto descriptor vs table schema and adds missing columns. Uses ETag with CAS-on-412 retry (max 5 attempts) so concurrent additive evolutions don't clobber each other; on retry exhaustion or a benign "another writer added the columns" outcome, signal the caller to retry the write.
- Add grpcSchemaMismatch classification via StorageError details. In handleWriteError: on schema mismatch, evolve and retry; on evolution failure, return a permanent BatchError so messages go to DLQ instead of being retried forever.
- Add bigquery_write_api_schema_evolutions_total and bigquery_write_api_schema_evolution_failures_total counters.

Table name handling
- Sanitize interpolated table names (dots, hyphens, slashes, and whitespace become underscores; non-ASCII-alphanumerics are stripped; leading digits prefixed with _; cap at 1024 chars).
- Reject empty-after-sanitization names with a permanent BatchError.

Error classification
- Add isPermanentBQError that recognises both gRPC permanent codes and *googleapi.Error 4xx (excluding 408/429), so REST 4xx errors from the resolver path no longer retry forever.

Lifecycle and concurrency
- Drop bqClient/datasetID fields from resolver and evolver; pass client + dataset as method arguments so they can't outlive the client. Close drains the resolver cache.
- Track previously fire-and-forget stream closes (evictStream and the idle sweeper) on a closeWg so Close waits for them before tearing down the underlying clients.
- Snapshot client and resolvedProjectID together under one connMu RLock in WriteBatch; pass projectID through to tableCacheKey.
- Run client closes unconditionally in Close: the bigquery and managedwriter clients are non-blocking gRPC teardowns; gating them on ctx.Err() leaks connections.

Config validation and tuning
- Reject delegates without target_principal at parse time.
- Lower max_in_flight default from 64 to 4 to match snowflake_streaming and iceberg sibling outputs.
- Detach the impersonate token source from Connect's ctx with context.WithoutCancel so a cancelled connect doesn't break later token refreshes.
- Warn explicitly that endpoint overrides disable authentication.

Cleanup
- Drop dead fieldNameMapping plumbing (resolvedSchema field, jsonToProtoBytes parameter, streamWithDescriptor field); it was never wired in production.
- Drop unused *service.Message argument from Resolve.
- Reject non-positive proto Kind in protoKindToBQFieldType instead of silently coercing to STRING; cap RECORD nesting at 15 (BigQuery's own limit) to fail fast on self-referential descriptors.
- Set Repeated on field schemas for repeated proto fields.

Tests
- Unit tests for resolver (cache, evict, no-client fallback), evolver helpers (kind mapping, schema diff, repeated fields), buildAuthOpts (endpoint override, credentials_json, no-auth), error classification (gRPC permanent, REST 4xx, wrapped), table-name sanitization, sweep goroutine, config validation (durations, delegates).
- Integration tests for schema evolution and table-name sanitization against the goccy bigquery emulator.
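
The RECORD nesting cap from the Cleanup list is easiest to see on a self-referential message. The sketch below uses hypothetical `msgDesc` and `fieldDesc` types in place of real proto descriptors, and it assumes the top-level row counts as depth 0; the PR's actual traversal may differ.

```go
package main

import "fmt"

// maxRecordDepth mirrors the nesting cap of 15 named in the description.
const maxRecordDepth = 15

// msgDesc and fieldDesc are hypothetical stand-ins for proto descriptors.
type msgDesc struct {
	Name   string
	Fields []fieldDesc
}

type fieldDesc struct {
	Name    string
	Message *msgDesc // nil for scalar fields
}

// checkNesting walks nested message fields and fails once the depth
// exceeds maxRecordDepth, so a recursive descriptor cannot loop forever.
func checkNesting(m *msgDesc, depth int) error {
	if depth > maxRecordDepth {
		return fmt.Errorf("message %s: RECORD nesting exceeds %d levels", m.Name, maxRecordDepth)
	}
	for _, f := range m.Fields {
		if f.Message == nil {
			continue // scalar field, nothing to descend into
		}
		if err := checkNesting(f.Message, depth+1); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// A linked-list style message that refers to itself.
	node := &msgDesc{Name: "Node"}
	node.Fields = []fieldDesc{
		{Name: "value"},
		{Name: "next", Message: node},
	}
	fmt.Println(checkNesting(node, 0))
}
```

Without the depth check, the recursion on `Node` would never terminate; with it, the walk stops with an error after a bounded number of steps.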
Commits: Single commit
Review: Schema resolver/evolver design is well thought out (singleflight on cold-cache loads, ETag-based CAS retry on 412, separate gRPC [...]
LGTM

createStream caches the ManagedStream for reuse across every batch routing to the same table. Bind it to the per-batch ctx and the first batch's cancellation (per-message deadline, source shutdown, ack timeout) will sever the cached stream, blocking every later AppendRows against it until the idle sweeper evicts the entry. Wrap with context.WithoutCancel so cancellation propagates only to the in-flight NewManagedStream dial, not to the long-lived stream itself.

Review: LGTM