
Commit 1c12f5d

mfvanek and claude authored
Add a check for tables with incrementing column names [schemacrawler] (#852)
* Add a check for tables with incrementing column names [schemacrawler]
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Implement check for tables with incrementing column names [schemacrawler]
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix lint violations in DatabaseStructureChecksAutoConfiguration
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix imports
* Fix tests

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent cf3e940 commit 1c12f5d

18 files changed

Lines changed: 522 additions & 7 deletions


.claude/rules/javadoc.md

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Rules for Javadoc
+
+Applies to: all Java source files in `src/main/java/`
+
+## @since tag must reflect the current project version (JAVADOC_SINCE_MUST_MATCH_PROJECT_VERSION)
+
+Every new public class and every new public method added to an existing class must carry a `@since` Javadoc tag.
+
+**Determining the version:**
+
+1. Open the root `build.gradle.kts` file.
+2. Read the value of the `version` property (e.g., `version = "0.41.1"`).
+3. Use that exact string as the `@since` value.
+
+Example — if `build.gradle.kts` contains `version = "0.41.1"`:
+
+```java
+/**
+ * Check for tables with incrementing column names on a specific host.
+ *
+ * @author Ivan Vakhrushev
+ * @since 0.41.1
+ */
+public class TablesWithIncrementingColumnsCheckOnHost extends AbstractCheckOnHost<TableWithColumns> {
+```
+
+Do **not** hardcode a version from memory or a previous session. Always read `build.gradle.kts` at the time of writing to get the current value.
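
The "Determining the version" steps in the rule above could be automated along the following lines. This is only a sketch under stated assumptions: the class `SinceVersionReader` and its `readVersion` helper are hypothetical names that are not part of this commit, and the build script is assumed to declare the version on a single line of the form `version = "..."`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the commit: extracts the project version
// from the text of a build.gradle.kts file.
class SinceVersionReader {

    // Assumption: the version is declared on its own line as: version = "x.y.z"
    private static final Pattern VERSION_LINE =
        Pattern.compile("^\\s*version\\s*=\\s*\"([^\"]+)\"\\s*$", Pattern.MULTILINE);

    static Optional<String> readVersion(final String buildScript) {
        final Matcher matcher = VERSION_LINE.matcher(buildScript);
        // Return the first declared version, or empty when none is found
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(final String[] args) {
        final String script = "plugins { java }\nversion = \"0.41.1\"\n";
        // The value printed here is what would go into the @since tag
        System.out.println(readVersion(script).orElse("unknown"));
    }
}
```

Reading the value at the time of writing, rather than caching it, matches the intent of the rule: the `@since` tag always follows whatever the build script currently declares.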

doc/available_checks.md

Lines changed: 1 addition & 0 deletions
@@ -55,6 +55,7 @@ All checks can be divided into two groups:
 | 41 | Tables with no data | **runtime** | yes | [sql](https://github.com/mfvanek/pg-index-health-sql/blob/master/sql/tables_with_no_data.sql) |
 | 42 | Self-referenced foreign keys without `ON DELETE CASCADE` or `ON DELETE SET NULL` | static | yes | [sql](https://github.com/mfvanek/pg-index-health-sql/blob/master/sql/self_referenced_foreign_keys.sql) |
 | 43 | Columns that use [large object types (BLOB/CLOB)](https://www.postgresql.org/docs/current/largeobjects.html) | static | yes | [sql](https://github.com/mfvanek/pg-index-health-sql/blob/master/sql/columns_with_blob_type.sql) |
+| 44 | Tables with incrementing column names | static | yes | [sql](https://github.com/mfvanek/pg-index-health-sql/blob/master/sql/tables_with_incrementing_columns.sql) |
 
 ### Raw SQL queries to use with other languages
 
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+# Check for tables with incrementing column names
+
+Tables that contain groups of columns sharing a common base name followed by a sequential integer
+suffix (for example, `phone1`, `phone2`, `phone3` or `address1`, `address2`, `address3`) are a sign
+of de-normalization: multiple values of the same concept are stored as separate columns instead of
+being extracted into a dedicated child table linked by a foreign key.
+
+This pattern is often called a **repeating group** and violates the First Normal Form (1NF).
+It creates several practical problems:
+
+- Adding a new value (e.g., `phone4`) requires a schema migration.
+- Querying all values requires enumerating column names explicitly.
+- Constraints (uniqueness, not-null, references) must be repeated for each numbered column.
+- Searching across all values requires `OR` or `UNION` constructs instead of a simple index lookup.
+
+The check reports any table where two or more columns share the same non-numeric prefix.
+
+Similar to [SchemaCrawler's `LinterTableWithIncrementingColumns`](https://www.schemacrawler.com/lint.html).
+
+## SQL query
+
+- [tables_with_incrementing_columns.sql](https://github.com/mfvanek/pg-index-health-sql/blob/master/sql/tables_with_incrementing_columns.sql)
+
+## Check type
+
+- **static** (can be performed on an empty database in component/integration tests)
+
+## Support for partitioned tables
+
+Supports partitioned tables.
+The check is performed on the partitioned table itself (the parent one). Individual sections (descendants) are ignored.
+
+## Reproduction script
+
+```sql
+create schema if not exists demo;
+
+create table if not exists demo.orders (
+    id bigint generated always as identity primary key,
+    phone1 text,
+    phone2 text,
+    address1 text,
+    address2 text,
+    address3 text,
+    created_at timestamptz,
+    created_by text,
+    sku1 text,
+    "updatedAt" timestamptz,
+    "updatedBy" text
+);
+
+create table if not exists demo.events (
+    id bigint not null,
+    event_date date not null,
+    tag1 text,
+    tag2 text
+) partition by range (event_date);
+
+create table if not exists demo.events_2024
+    partition of demo.events
+    for values from ('2024-01-01') to ('2025-01-01');
+```
+
+## How to fix
+
+Extract the repeating group into a separate child table and link it back with a foreign key.
+
+```sql
+-- Before: repeating columns in the parent table
+-- orders.phone1, orders.phone2, ...
+
+-- After: a dedicated child table
+create table demo.order_phones (
+    id bigint generated always as identity primary key,
+    order_id bigint not null references demo.orders (id) on delete cascade,
+    phone text not null
+);
+```
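
The detection rule described in this document can be illustrated in plain Java. The sketch below is not the SQL query the check actually runs (that query lives in the linked `tables_with_incrementing_columns.sql` and is not shown in this commit). It assumes the rule is "a non-numeric prefix followed by a numeric suffix, with at least two columns sharing the prefix", which is consistent with the reproduction script and with the expected test results: `sku1` stands alone and `created_at`/`created_by` have no numeric suffix, so none of them are reported. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the heuristic; not the library's implementation.
class IncrementingColumnsSketch {

    // A prefix ending in a non-digit character, followed by a numeric suffix
    private static final Pattern NUMBERED = Pattern.compile("^(.*\\D)(\\d+)$");

    static List<String> findIncrementingColumns(final List<String> columnNames) {
        // Group numbered columns by prefix, preserving the declaration order
        final Map<String, List<String>> byPrefix = new LinkedHashMap<>();
        for (final String name : columnNames) {
            final Matcher matcher = NUMBERED.matcher(name);
            if (matcher.matches()) {
                byPrefix.computeIfAbsent(matcher.group(1), k -> new ArrayList<>()).add(name);
            }
        }
        // Keep only prefixes shared by two or more columns
        final List<String> result = new ArrayList<>();
        for (final List<String> group : byPrefix.values()) {
            if (group.size() >= 2) {
                result.addAll(group);
            }
        }
        return result;
    }

    public static void main(final String[] args) {
        // Column names of demo.orders from the reproduction script
        final List<String> columns = List.of("id", "phone1", "phone2", "address1", "address2",
            "address3", "created_at", "created_by", "sku1", "updatedAt", "updatedBy");
        System.out.println(findIncrementingColumns(columns));
        // prints [phone1, phone2, address1, address2, address3]
    }
}
```

The result for `demo.orders` lists the same five columns that the test in this commit expects for the `orders` table.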
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+# Check for tables with numbered column names
+
+Tables that contain groups of columns with the same base name and a numeric suffix
+(for example, `phone1`, `phone2`, `phone3` or `address1`, `address2`, `address3`) are a sign of denormalization:
+multiple values of the same concept are stored as separate columns instead of
+being moved into a separate child table linked by a foreign key.
+
+This pattern is often called a **repeating group**, and it violates the First Normal Form (1NF).
+It causes a number of practical problems:
+
+- Adding a new value (for example, `phone4`) requires a schema migration.
+- Retrieving all values requires listing the column names explicitly.
+- Constraints (uniqueness, NOT NULL, references) have to be repeated for each numbered column.
+- Searching across all values requires `OR` or `UNION` constructs instead of a simple index lookup.
+
+The check detects tables in which two or more columns have the same non-numeric prefix.
+
+Analogous to [SchemaCrawler `LinterTableWithIncrementingColumns`](https://www.schemacrawler.com/lint.html).
+
+## SQL query
+
+- [tables_with_incrementing_columns.sql](https://github.com/mfvanek/pg-index-health-sql/blob/master/sql/tables_with_incrementing_columns.sql)
+
+## Check type
+
+- **static** (can be run on an empty database in component/integration tests)
+
+## Support for partitioned tables
+
+Supports partitioned tables.
+The check is performed on the partitioned table itself (the parent). Individual partitions (descendants) are ignored.
+
+## Reproduction script
+
+```sql
+create schema if not exists demo;
+
+create table if not exists demo.orders (
+    id bigint generated always as identity primary key,
+    phone1 text,
+    phone2 text,
+    address1 text,
+    address2 text,
+    address3 text,
+    created_at timestamptz,
+    created_by text,
+    sku1 text,
+    "updatedAt" timestamptz,
+    "updatedBy" text
+);
+
+create table if not exists demo.events (
+    id bigint not null,
+    event_date date not null,
+    tag1 text,
+    tag2 text
+) partition by range (event_date);
+
+create table if not exists demo.events_2024
+    partition of demo.events
+    for values from ('2024-01-01') to ('2025-01-01');
+```
+
+## How to fix
+
+Move the repeating group into a separate child table and link it with a foreign key.
+
+```sql
+-- Before: repeating columns in the parent table
+-- orders.phone1, orders.phone2, ...
+
+-- After: a separate child table
+create table demo.order_phones (
+    id bigint generated always as identity primary key,
+    order_id bigint not null references demo.orders (id) on delete cascade,
+    phone text not null
+);
+```

pg-index-health-core/src/main/java/io/github/mfvanek/pg/core/checks/common/Diagnostic.java

Lines changed: 5 additions & 1 deletion
@@ -196,7 +196,11 @@ public enum Diagnostic implements CheckInfo {
     /**
      * Check for columns with {@code oid} or {@code lo} blob type.
      */
-    COLUMNS_WITH_BLOB_TYPE;
+    COLUMNS_WITH_BLOB_TYPE,
+    /**
+     * Check for tables with incrementing column names (e.g., {@code phone1}, {@code phone2}) indicating de-normalization.
+     */
+    TABLES_WITH_INCREMENTING_COLUMNS;
 
     private final CheckInfo inner;
 
pg-index-health-core/src/main/java/io/github/mfvanek/pg/core/checks/host/StandardChecksOnHost.java

Lines changed: 2 additions & 1 deletion
@@ -85,7 +85,8 @@ public List<DatabaseCheckOnHost<? extends DbObject>> apply(final PgConnection pg
             new ForeignKeysWithNullValuesCheckOnHost(pgConnection),
             new TablesWithNoDataCheckOnHost(pgConnection),
             new SelfReferencedForeignKeysCheckOnHost(pgConnection),
-            new ColumnsWithBlobTypeCheckOnHost(pgConnection)
+            new ColumnsWithBlobTypeCheckOnHost(pgConnection),
+            new TablesWithIncrementingColumnsCheckOnHost(pgConnection)
         );
     }
 }
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+/*
+ * Copyright (c) 2019-2026. Ivan Vakhrushev and others.
+ * https://github.com/mfvanek/pg-index-health
+ *
+ * This file is a part of "pg-index-health" - an embeddable schema linter for PostgreSQL
+ * that detects common anti-patterns and promotes best practices.
+ *
+ * Licensed under the Apache License 2.0
+ */
+
+package io.github.mfvanek.pg.core.checks.host;
+
+import io.github.mfvanek.pg.connection.PgConnection;
+import io.github.mfvanek.pg.core.checks.common.Diagnostic;
+import io.github.mfvanek.pg.core.checks.extractors.TableWithColumnsExtractor;
+import io.github.mfvanek.pg.model.table.TableWithColumns;
+
+/**
+ * Check for tables with incrementing column names (e.g., {@code phone1}, {@code phone2}) on a specific host.
+ * Such columns indicate de-normalization that could be replaced with a separate child table and a foreign key.
+ *
+ * @author Ivan Vakhrushev
+ * @see <a href="https://www.schemacrawler.com/lint.html">SchemaCrawler LinterTableWithIncrementingColumns</a>
+ * @since 0.41.1
+ */
+public class TablesWithIncrementingColumnsCheckOnHost extends AbstractCheckOnHost<TableWithColumns> {
+
+    /**
+     * Constructs a new instance of {@code TablesWithIncrementingColumnsCheckOnHost}.
+     *
+     * @param pgConnection the connection to the PostgreSQL database; must not be null
+     */
+    public TablesWithIncrementingColumnsCheckOnHost(final PgConnection pgConnection) {
+        super(TableWithColumns.class, pgConnection, Diagnostic.TABLES_WITH_INCREMENTING_COLUMNS, TableWithColumnsExtractor.of());
+    }
+}
@@ -0,0 +1,83 @@
+/*
+ * Copyright (c) 2019-2026. Ivan Vakhrushev and others.
+ * https://github.com/mfvanek/pg-index-health
+ *
+ * This file is a part of "pg-index-health" - an embeddable schema linter for PostgreSQL
+ * that detects common anti-patterns and promotes best practices.
+ *
+ * Licensed under the Apache License 2.0
+ */
+
+package io.github.mfvanek.pg.core.checks.host;
+
+import io.github.mfvanek.pg.core.checks.common.DatabaseCheckOnHost;
+import io.github.mfvanek.pg.core.checks.common.Diagnostic;
+import io.github.mfvanek.pg.core.fixtures.support.DatabaseAwareTestBase;
+import io.github.mfvanek.pg.core.fixtures.support.DatabasePopulator;
+import io.github.mfvanek.pg.model.column.Column;
+import io.github.mfvanek.pg.model.context.PgContext;
+import io.github.mfvanek.pg.model.table.Table;
+import io.github.mfvanek.pg.model.table.TableWithColumns;
+import org.jspecify.annotations.NonNull;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.ValueSource;
+
+import java.util.List;
+
+import static io.github.mfvanek.pg.core.support.AbstractCheckOnHostAssert.assertThat;
+
+class TablesWithIncrementingColumnsCheckOnHostTest extends DatabaseAwareTestBase {
+
+    private final DatabaseCheckOnHost<@NonNull TableWithColumns> check = new TablesWithIncrementingColumnsCheckOnHost(getPgConnection());
+
+    @Test
+    void shouldSatisfyContract() {
+        assertThat(check)
+            .hasType(TableWithColumns.class)
+            .hasDiagnostic(Diagnostic.TABLES_WITH_INCREMENTING_COLUMNS)
+            .hasHost(getHost())
+            .isStatic();
+    }
+
+    @ParameterizedTest
+    @ValueSource(strings = {PgContext.DEFAULT_SCHEMA_NAME, "custom"})
+    void onDatabaseWithThem(final String schemaName) {
+        executeTestOnDatabase(schemaName, DatabasePopulator::withIncrementingColumns, ctx ->
+            assertThat(check)
+                .executing(ctx)
+                .hasSize(1)
+                .usingRecursiveFieldByFieldElementComparator()
+                .containsExactly(
+                    TableWithColumns.of(
+                        Table.of(ctx, "orders", 8_192L),
+                        List.of(
+                            Column.ofNullable(ctx, "orders", "phone1"),
+                            Column.ofNullable(ctx, "orders", "phone2"),
+                            Column.ofNullable(ctx, "orders", "address1"),
+                            Column.ofNullable(ctx, "orders", "address2"),
+                            Column.ofNullable(ctx, "orders", "address3")
+                        )
+                    )
+                ));
+    }
+
+    @ParameterizedTest
+    @ValueSource(strings = {PgContext.DEFAULT_SCHEMA_NAME, "custom"})
+    void shouldWorkWithPartitionedTables(final String schemaName) {
+        executeTestOnDatabase(schemaName, DatabasePopulator::withIncrementingColumnsInPartitionedTable, ctx ->
+            assertThat(check)
+                .executing(ctx)
+                .hasSize(1)
+                .usingRecursiveFieldByFieldElementComparator()
+                .containsExactly(
+                    TableWithColumns.of(
+                        Table.of(ctx, "events"),
+                        List.of(
+                            Column.ofNullable(ctx, "events", "tag1"),
+                            Column.ofNullable(ctx, "events", "tag2")
+                        )
+                    )
+                ));
+    }
+}

pg-index-health-core/src/testFixtures/java/io/github/mfvanek/pg/core/fixtures/support/DatabasePopulator.java

Lines changed: 10 additions & 0 deletions
@@ -57,6 +57,7 @@
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableForBloatStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableWithBlobTypeColumnStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableWithDroppedColumnStatement;
+import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableWithIncrementingColumnsStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableWithJsonAndSerialColumnsStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableWithNoDataStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreatePartitionedTableWithNullableFieldsStatement;
@@ -75,6 +76,7 @@
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithColumnOfBigSerialTypeStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithFixedLengthVarcharStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithIdentityPrimaryKeyStatement;
+import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithIncrementingColumnsStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithInheritanceStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithNaturalKeyStatement;
 import io.github.mfvanek.pg.core.fixtures.support.statements.CreateTableWithSerialPrimaryKeyReferencesToAnotherTableStatement;
@@ -425,6 +427,14 @@ public DatabasePopulator withBlobTypeColumnInPartitionedTable() {
         return register(154, new CreatePartitionedTableWithBlobTypeColumnStatement());
     }
 
+    public DatabasePopulator withIncrementingColumns() {
+        return register(155, new CreateTableWithIncrementingColumnsStatement());
+    }
+
+    public DatabasePopulator withIncrementingColumnsInPartitionedTable() {
+        return register(156, new CreatePartitionedTableWithIncrementingColumnsStatement());
+    }
+
     public void populate() {
         try (SchemaNameHolder ignored = SchemaNameHolder.with(schemaName)) {
             ExecuteUtils.executeInTransaction(dataSource, statementsToExecuteInSameTransaction.values());
