Skip to content

Conciseness cleanups#30

Open
tonyalaribe wants to merge 8 commits into
masterfrom
claude/vibrant-bohr-sLKFW
Open

Conciseness cleanups#30
tonyalaribe wants to merge 8 commits into
masterfrom
claude/vibrant-bohr-sLKFW

Conversation

@tonyalaribe

Copy link
Copy Markdown
Contributor

Closes #

How to test

Checklist

  • Make sure you have described your changes and added all relevant screenshots or data.
  • Make sure your changes are tested (stories and/or unit, integration, or end-to-end tests).
  • Make sure to add/update documentation regarding your changes (or request one from the team).
  • You are NOT deprecating/removing a feature.

claude added 8 commits June 4, 2026 20:29
- metrics.rs: replace the 13 hand-written counter registrations with a
  counter_registry! macro (single source of truth for field + id + desc),
  collapse 5 observable gauges into a layer_gauge! macro, and generate the
  no-arg record_* helpers via a simple_recorders! macro.
- Remove a duplicated #[instrument] attribute on head_cached.
- Extract config::is_insecure_auth_allowed(), shared by the pgwire and gRPC
  auth paths instead of inlining the env-var check twice.
- Extract record_query_span() in pgwire_handlers, used by both the simple
  and extended query handlers.
Adds derive_more (debug + display features) and uses it for:
- StorageConfig: #[debug("[redacted]")] on the two credential fields keeps
  them out of {:?} output without a hand-rolled fmt impl.
- DmlQueryPlanner / DmlExec: #[debug(skip)] on the non-printable plan/session
  fields, preserving the previous Debug output.
- FoyerObjectStoreCache: container-level #[display]/#[debug] over inner.

Behavior-preserving; removes ~45 lines of boilerplate fmt code.
- Route the two error-swallowing payload-collection sites (proactive and
  background _last_checkpoint refresh) through the existing collect_payload
  helper instead of re-inlining the Stream/File match. The two sites that
  propagate errors or slice ranges keep their bespoke handling.
- Extract table_path_from_uri(), shared by both invalidate_checkpoint_cache
  impls instead of duplicating the scheme-strip + trim logic.
- Extract is_parquet_file() and use it at the 7 inline .ends_with(".parquet")
  call sites.
ToCharUDF and AtTimeZoneUDF both inlined the same ~20-line block to pull a
constant UTF-8 string out of a scalar-or-length-1-array argument (StringView
or String). Factor it into extract_scalar_string(arg, label); the label
parameterizes the error messages. Preserves the && short-circuit so is_null(0)
is never evaluated on an empty array.
The as_any/name/signature trio was spelled out identically in every UDF
ScalarUDFImpl block. A scalar_udf_boilerplate!("name") macro emits all three;
return_type and invoke_with_args stay per-impl. Covers JsonToPgText, ToChar,
AtTimeZone, JsonBuildArray, ToJson, ExtractEpoch, ApproxPercentile and
JsonbPathExists.
- functions.rs: push_json_primitive! macro collapses the identical
  downcast + null-aware json!() loop for the Int64/Float64/Boolean arms.
- manifest.rs: extract mutate() for the shared load->apply->save skeleton
  behind upsert() and remove_many().
query() and query_partitioned() both matched on the optional compiled
predicate to filter a bucket snapshot. Factor that into filter_snapshot();
the text_match query variant keeps its bespoke id-set + predicate flow.
Verified by the mem_buffer unit tests (27 passed).
The DynamoDB-locking storage-option block was written twice: once in
AwsConfig::build_storage_options and again, hand-inlined, in the per-project
custom-table path in database.rs. Extract a single add_dynamodb_locking_options
method and call it from both, so the locking config has one source of truth.
@claude

claude Bot commented Jun 4, 2026

Copy link
Copy Markdown

Code Review: Conciseness Cleanups

This is a solid refactoring PR that achieves genuine reduction of repetition (-180 net lines across 12 files) without behavioral changes. The extractions are well-scoped and improve long-term maintainability. A few minor items worth noting.


Summary of Changes

Area Technique Impact
config.rs Extract is_insecure_auth_allowed() and add_dynamodb_locking_options() Removes 3x env-var duplication; DynamoDB block was in 2 places
database.rs / dml.rs derive_more::Debug with #[debug("[redacted]")] Replaces manual Debug impls
functions.rs extract_scalar_string(), scalar_udf_boilerplate!, push_json_primitive! Removes ~60 lines of repeated UDF boilerplate
metrics.rs counter_registry!, layer_gauge!, simple_recorders! Each new counter/gauge is now a single line
object_store_cache.rs table_path_from_uri(), is_parquet_file(), collect_payload() 7x .ends_with(".parquet") calls consolidated; duplicate #[instrument] removed
mem_buffer.rs / manifest.rs / pgwire_handlers.rs Small function extractions Obvious dedup

Security: Improvement Worth Highlighting

Replacing the manual Debug impl on StorageConfig with derive_more::Debug and #[debug("[redacted]")] is strictly safer. A developer adding a new credential field to a manual impl could forget to redact it; with the attribute, a credential can only leak if the annotation is explicitly absent — a much easier failure to spot in review.


Minor Issues

1. Subtle error message regression in extract_scalar_string (src/functions.rs)

The shared helper normalizes messages. For the non-scalar array path, the original AtTimeZoneUDF said "Timezone must be a scalar string value", but the new helper says "Timezone must be a scalar value" (the word "string" is dropped). Harmless in practice, but if any tests assert on exact error text they will break. Worth a quick grep for the old strings.

2. DmlQueryPlanner::Debug output may differ slightly

The original impl called f.debug_struct("DmlQueryPlanner").finish() which renders as DmlQueryPlanner. derive_more with all fields #[debug(skip)] emits DmlQueryPlanner { } (with empty braces). Cosmetic only, but worth confirming no snapshot tests match on this output.

3. is_insecure_auth_allowed() reads the env var on every call

This was the same behaviour before (the read was inlined), but extraction makes it more visible. It is only called at startup/connection time so there is no performance concern. The real risk is in tests that mutate env vars between calls — both auth paths could see different values. Caching in OnceLock would be more robust, but that is a larger change worth a separate PR. Flagging for awareness only.

4. PR description is not filled in

The "Closes #" line and checklist items are left at template defaults. Even for a cleanup PR it helps to note that no tests are required (pure refactor) and which issue, if any, prompted it.


Nits

  • src/config.rs:92 — the one-liner for is_insecure_auth_allowed() runs ~95 characters. Worth checking against the project rustfmt max_width.
  • The scalar_udf_boilerplate! macro makes IDE go-to-definition for as_any / name / signature harder inside UDF impls. A short comment on each invocation site would help future readers.

Verdict

Approved with minor suggestions. Each extraction is small enough to verify at a glance and improves the code without over-abstracting. The credential-redaction change is a net security improvement. The only thing worth confirming before merge is that no tests assert on the exact error strings that were normalized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants