Skip to content

Improve logging by reducing noise and logging high-level events#350

Open
sfc-gh-mslot wants to merge 5 commits into
mainfrom
marcoslot/improve-logging
Open

Improve logging by reducing noise and logging high-level events#350
sfc-gh-mslot wants to merge 5 commits into
mainfrom
marcoslot/improve-logging

Conversation

@sfc-gh-mslot
Copy link
Copy Markdown
Collaborator

@sfc-gh-mslot sfc-gh-mslot commented May 13, 2026

Reduce noise and have more meaningful messages like:

  LOG:  pg_lake: compacted iceberg table public.events: rewrote 12 files (8456123 bytes, 450000 rows) into 1 files (7982041 bytes, 450000 rows); table is now 85417923 bytes
  LOG:  pg_lake: compacted iceberg table public.orders: rewrote 5 files (12582912 bytes, 60000 rows) into 1 files (11534336 bytes, 58000 rows), resolved 2000 position-deleted rows; table is
  now 11534336 bytes
 7982041 bytes, 450000 rows)
  LOG:  pg_lake: expired 48 files from iceberg table public.events
  LOG:  pg_lake: expired 48 files from dropped iceberg tables
  LOG:  pg_lake: cleaned up 3 orphaned files from iceberg table public.events
  LOG:  pg_lake: cleaned up 3 orphaned files from dropped iceberg tables

Signed-off-by: Marco Slot <marco.slot@snowflake.com>
Log a one-line summary after each cache-pressure pass (files/bytes
added, evicted, and skipped by reason) so we can see why the cache
stayed full or churned. Append the first 500 chars of the failing
query to pgduck_server parse/bind/execute WARNING lines so a stray
error is debuggable without replaying the session.

Signed-off-by: Marco Slot <marco.slot@snowflake.com>
Copy link
Copy Markdown
Collaborator

@sfc-gh-okalaci sfc-gh-okalaci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you can see all my notes as optional.

I'm a bit torn on the user-initiated lines. The user already sees those, they ran the command and got the result back over the wire, and a DBA who wants an audit trail can use log_statement = ddl.

The background ones are exactly the right thing to log at LOG though.

+1 on demoting the per-write adding/removing/expiring noise.

Comment thread pg_lake_table/src/init.c
Comment thread duckdb_pglake/src/fs/file_cache_manager.cpp
Comment thread pg_lake_table/src/fdw/writable_table.c Outdated
Comment thread pg_lake_table/src/ddl/create_table.c Outdated
Comment thread pg_lake_copy/src/copy/copy.c Outdated
Comment thread pg_lake_table/src/ddl/vacuum.c
Wrap the remote OpenFile call and both Write overloads with a try
block that emits a PGDUCK_SERVER_WARN for ExceptionType::HTTP and
rethrows. DuckDB's HTTPException message already carries URL, status
code, and reason, so a 404, 403, or 500 on an S3 object now surfaces
a single WARNING line before the error propagates to the client.
Non-HTTP exceptions pass through silently.

Signed-off-by: Marco Slot <marco.slot@snowflake.com>
@sfc-gh-mslot sfc-gh-mslot force-pushed the marcoslot/improve-logging branch from dd6013d to c2a59cb Compare May 15, 2026 08:52
sfc-gh-mslot added a commit that referenced this pull request May 15, 2026
- Demote "adding %s to cache" to DEBUG1 to match the existing "removing"
  counterpart (#350: Onur).
- Drop the LOG line for CREATE TABLE / DROP TABLE; per Onur, these
  events aren't valuable to log unconditionally. The compaction log in
  VACUUM (which runs in the background) stays.
- Drop the LOG line for COPY TO; same reason. Removes the now-unused
  StatsCollector capture and data_file_stats.h include.
- Refactor compaction stat bumps in TryCompactDataFiles into
  RecordRemovedFile / RecordAddedFile helpers (the runStats != NULL
  guard moves into the helper).
- Add test_vacuum_compaction_summary_log to verify the VACUUM
  "compacted iceberg table" LOG line emits exactly once with the
  expected file/byte/row counts.
- Demote "adding %s to cache" to DEBUG1 to match the existing "removing"
  counterpart (#350: Onur).
- Drop the LOG line for CREATE TABLE / DROP TABLE; per Onur, these
  events aren't valuable to log unconditionally. The compaction log in
  VACUUM (which runs in the background) stays.
- Drop the LOG line for COPY TO; same reason. Removes the now-unused
  StatsCollector capture and data_file_stats.h include.
- Refactor compaction stat bumps in TryCompactDataFiles into
  RecordRemovedFile / RecordAddedFile helpers (the runStats != NULL
  guard moves into the helper).
- Add test_vacuum_compaction_summary_log to verify the VACUUM
  "compacted iceberg table" LOG line emits exactly once with the
  expected file/byte/row counts.

Signed-off-by: Marco Slot <marco.slot@snowflake.com>
@sfc-gh-mslot sfc-gh-mslot force-pushed the marcoslot/improve-logging branch from 952c0ab to c2a59cb Compare May 15, 2026 09:43
The setup path waits 0.1s after pg_reload_conf so backends pick up
ALTER SYSTEM SET before the test runs. The teardown path skipped the
sleep, so the next test could see still-applied ALTER SYSTEM values
on its module-scoped pg_conn. Mirror the setup so the RESET propagates
before the next test starts.

Surfaced by pytest-split rebalance after adding
test_vacuum_compaction_summary_log: shard 3 sometimes ran
test_writable_object_store_catalog_options immediately before
test_writable_object_store_without_default_location_guc, where the
leftover object_store_catalog_location_prefix flipped which error path
fired.

Signed-off-by: Marco Slot <marco.slot@snowflake.com>
@sfc-gh-mslot sfc-gh-mslot force-pushed the marcoslot/improve-logging branch from 08245c4 to 803c555 Compare May 15, 2026 10:59
@sfc-gh-mslot sfc-gh-mslot marked this pull request as ready for review May 15, 2026 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants