Open
Labels
area/docdb, kind/bug, priority/high, qa_automation, qa_stress
Description
Jira Link: DB-19240
Steps:
- Create a 3-node, RF=3 universe on 2.29.0.0-b190.
- Create a database and load data into 4 tables for 30 minutes.
- Stop the load.
- Start the loop:
4.1. Start a workload that mixes DML and DDL operations (INSERT, DELETE, UPDATE, DROP COLUMN, ADD COLUMN, CHANGE TYPE). The operations are issued from one thread but run on different nodes, which can cause multiple concurrent catalog rewrites at the same moment.
4.2. Start backup creation.
4.3. Stop the workload.
4.4. Restore to a different database.
Each loop iteration increases the number of threads doing writes (the ALTER DDLs are still issued “sequentially”, or at least the test tries to issue them that way).
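The loop in step 4 can be sketched roughly as follows. This is a minimal illustration, not the actual test code: `LoopHooks`, `run_backup_loop`, the hook names, and the `restore_db_{i}` naming are all hypothetical stand-ins for whatever the test framework really provides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopHooks:
    """Hypothetical hooks; each one stands in for real test-framework code."""
    start_workload: Callable[[int], None]        # argument: number of writer threads
    create_backup: Callable[[], str]             # returns a backup identifier
    stop_workload: Callable[[], None]
    restore_backup: Callable[[str, str], None]   # (backup_id, target_database)


def run_backup_loop(hooks: LoopHooks, iterations: int,
                    initial_threads: int = 1, thread_step: int = 1) -> List[str]:
    """Run steps 4.1-4.4 repeatedly, adding writer threads on every pass."""
    backups: List[str] = []
    threads = initial_threads
    for i in range(iterations):
        hooks.start_workload(threads)            # 4.1: DML + ALTER workload
        try:
            backup_id = hooks.create_backup()    # 4.2: backup while workload runs
        finally:
            hooks.stop_workload()                # 4.3: stop even if the backup failed
        hooks.restore_backup(backup_id, f"restore_db_{i}")  # 4.4: different database
        backups.append(backup_id)
        threads += thread_step                   # more writer threads next iteration
    return backups
```

Because the backup runs between `start_workload` and `stop_workload`, the dump overlaps with the ALTER statements, which is the window in which the failure below appears.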
The backup task fails on the second try:
Caused by: java.lang.RuntimeException: Task id 479eba1e-8f45-4820-af70-e8c10f961931_PGSQL_TABLE_TYPE_db_8942b77c-a572-4bd6-a707-61fb0bceff12 status: Task failed during YSQL Dump phase with status YSQL_DUMP_COMMAND_FAILED. Please check YB-Controller logs on node 172.151.26.116 for more details
In the YB-Controller logs I see:
ysql_dump: error: query failed: ERROR: deadlock detected (query layer retry isn't possible because this is not the first command in the transaction. Consider using READ COMMITTED isolation level.)
DETAIL: Heartbeat: Transaction b5209972-4021-42e4-afd9-4775b658d6f8 aborted due to a deadlock: <1763692810079569>56207663-3bb0-4bbe-b302-977f52e82490-><1763692813568114>b5209972-4021-42e4-afd9-4775b658d6f8->: kDeadlock [serializable]
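For triage across many runs, the transactions named in the DETAIL line can be pulled out with a small parser. This is only a sketch: the function names are hypothetical, and the pattern is inferred from the single log line above rather than from any documented format. The meaning of the number in angle brackets is not stated here, so the code returns it as an opaque integer.

```python
import re
from typing import List, Optional, Tuple

_UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Matches "<digits>uuid" entries as they appear in the DETAIL line above.
_CYCLE_ENTRY = re.compile(rf"<(\d+)>({_UUID})")
_ABORTED = re.compile(rf"Transaction ({_UUID}) aborted due to a deadlock")


def parse_deadlock_cycle(detail: str) -> List[Tuple[int, str]]:
    """Return (bracketed number, transaction id) pairs in the order logged."""
    return [(int(num), txn) for num, txn in _CYCLE_ENTRY.findall(detail)]


def aborted_transaction(detail: str) -> Optional[str]:
    """Return the id of the transaction reported as aborted, if present."""
    match = _ABORTED.search(detail)
    return match.group(1) if match else None
```

Applied to the line above, the aborted transaction is also the last entry of the reported cycle, and the other entry identifies the transaction it was waiting on.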
This is a new test. I ran it on 2025.2, 2024.2, and the latest master (the build above), and only master fails with this issue.
Concurrent DDL. ysql_dump runs its transaction with isolation level SERIALIZABLE, READ ONLY, DEFERRABLE. For this combination, PG waits for a snapshot at which the transaction can run without a serialization failure, so the dump can never fail for that reason. However, YB does not seem to give this mode the same meaning, so a SELECT query in the dump can fail.
2025-11-21 02:41:11.686 UTC [50295] STATEMENT: SELECT t.tableoid, t.oid, i.indrelid, t.relname AS indexname, t.relpages, t.reltuples, t.relallvisible, pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, i.indkey, i.indisclustered, c.contype, c.conname, c.condeferrable, c.condeferred, c.tableoid AS contableoid, c.oid AS conoid, pg_catalog.pg_get_constraintdef(c.oid, false) AS condef, CASE WHEN i.indexprs IS NOT NULL THEN (SELECT pg_catalog.array_agg(attname ORDER BY attnum) FROM pg_catalog.pg_attribute WHERE attrelid = i.indexrelid) ELSE NULL END AS indattnames, (SELECT spcname FROM pg_catalog.pg_tablespace s WHERE s.oid = t.reltablespace) AS tablespace, t.reloptions AS indreloptions, i.indisreplident, i.indoption, inh.inhparent AS parentidx, i.indnkeyatts AS indnkeyatts, i.indnatts AS indnatts, (SELECT pg_catalog.array_agg(attnum ORDER BY attnum) FROM pg_catalog.pg_attribute WHERE attrelid = i.indexrelid AND attstattarget >= 0) AS indstatcols, (SELECT pg_catalog.array_agg(attstattarget ORDER BY attnum) FROM pg_catalog.pg_attribute WHERE attrelid = i.indexrelid AND attstattarget >= 0) AS indstatvals, i.indnullsnotdistinct FROM unnest('{16410,16415,16432,16437,16454,16459,16476,16481}'::pg_catalog.oid[]) AS src(tbloid) JOIN pg_catalog.pg_index i ON (src.tbloid = i.indrelid) JOIN pg_catalog.pg_class t ON (t.oid = i.indexrelid) JOIN pg_catalog.pg_class t2 ON (t2.oid = i.indrelid) LEFT JOIN pg_catalog.pg_constraint c ON (i.indrelid = c.conrelid AND i.indexrelid = c.conindid AND c.contype IN ('p','u','x')) LEFT JOIN pg_catalog.pg_inherits inh ON (inh.inhrelid = indexrelid) WHERE (i.indisvalid OR t2.relkind = 'p') AND i.indisready ORDER BY i.indrelid, indexname
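The error text says a query-layer retry is not possible because the failing statement is not the first command in the transaction. That leaves retrying the whole dump from the outside as one possible test-side mitigation. The sketch below assumes a hypothetical `run_dump` callable that returns an exit code and the captured stderr; it does not change the underlying behavior and is not how YB-Controller is known to handle this.

```python
import time
from typing import Callable, Tuple

# Substrings taken from the error shown above; treated here as retryable.
RETRYABLE_MARKERS = ("deadlock detected", "kDeadlock")


def is_retryable(stderr: str) -> bool:
    """True if the dump's stderr looks like the deadlock failure above."""
    return any(marker in stderr for marker in RETRYABLE_MARKERS)


def dump_with_retry(run_dump: Callable[[], Tuple[int, str]],
                    max_attempts: int = 3,
                    backoff_seconds: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Re-run the whole dump on a deadlock; return the attempt that succeeded."""
    for attempt in range(1, max_attempts + 1):
        exit_code, stderr = run_dump()
        if exit_code == 0:
            return attempt
        if not is_retryable(stderr) or attempt == max_attempts:
            raise RuntimeError(f"dump failed on attempt {attempt}: {stderr}")
        sleep(backoff_seconds * attempt)  # linear backoff before the next attempt
    raise RuntimeError("max_attempts must be at least 1")
```

Each retry starts a fresh transaction, so the dump reads a new snapshot; a retry only helps if the conflicting DDL has finished by then.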
All links in JIRA first comment
Issue Type
kind/bug
Warning: Please confirm that this issue does not contain any sensitive information
- I confirm this issue does not contain any sensitive information.