Commit f07b38d

Merge pull request #83 from cbrianpace/main
Updates to improve floating data type compare
2 parents 3a5fd23 + 7b3eabb commit f07b38d

46 files changed

Lines changed: 1792 additions & 1900 deletions

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -3,4 +3,5 @@ target/*
 database/local_test.sql
 .DS_Store
 ._*
-test_local.txt
+test_local.txt
+TODO.md

README.md

Lines changed: 45 additions & 10 deletions
@@ -46,6 +46,7 @@ Before initiating the build and installation process, ensure the following prere
 - Unsupported data types: blob, long, longraw, bytea.
 - Cross-platform comparison limitations with boolean type.
 - Low precision types (float, real) cannot be compared to high precision types (double).
+- All low precision types are cast using a scale of 3. If a higher scale is required, consider using the map-expression override option.
 - Different databases cast float to different values. Use float-cast option to switch between char and notation (scientific notation) if there are compare problems with float data types.

 # Getting Started
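
The scale-3 cast described in the limitation added above can be illustrated with plain `printf`. This is only a sketch of the rounding idea, not pgCompare's actual cast expression: `cast_scale3` is a hypothetical helper, and the two inputs are the single-precision and double-precision renderings of pi.

```shell
# Hypothetical helper (not part of pgCompare): render a number with a scale of 3.
# LC_ALL=C keeps the decimal separator a dot regardless of the user's locale.
cast_scale3() {
    LC_ALL=C printf '%.3f\n' "$1"
}

float_pi=3.14159274          # pi as stored in a 32-bit float
double_pi=3.141592653589793  # pi as stored in a 64-bit double

cast_scale3 "$float_pi"    # prints 3.142
cast_scale3 "$double_pi"   # prints 3.142
```

With more digits the two renderings diverge at the seventh decimal place, which is the situation where a higher scale would need the map-expression override mentioned in the README.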
@@ -72,37 +73,62 @@ At a minimal the `repo-xxxxx` parameters are required in the properties file (or
 Run the script or use the command below to set up the PostgreSQL repository:

 ```shell
-java -jar pgcompare.jar --init
+java -jar pgcompare.jar init
 ```

 ## 5. Discover Tables

 Discover and map tables in specified schemas:

 ```shell
-java -jar pgcompare.jar --discovery
+java -jar pgcompare.jar discover
 ```

 # Usage

+## Command Line
+
+```shell
+java -jar pgcompare.jar <action> <options>
+```
+
+Actions:
+- **check**: Recompare the out of sync rows from previous compare
+- **compare**: Perform database compare
+- **copy-table**: Copy pgCompare metadata for table. Must specify table alias to copy using --table option
+- **discover**: Discover tables and columns
+- **init**: Initialize the repository database
+
+Options:
+
+-b|--batch {batch nbr}
+
+-p|--project Project ID
+
+-r|--report {file} Create html report of compare
+
+-t|--table {target table}
+
+--help
+
 ## Define Table Mapping

 1. Automatic Discovery

 Discover and map tables in specified schemas:

 ```shell
-java -jar pgcompare.jar --discover
+java -jar pgcompare.jar discover
 ```

 2. Manual Registration

-Insert mappings into `dc_table` and `dc_table_map` tables in the repository.
+Insert mappings into `dc_table`, `dc_table_map`, `dc_table_column`, and `dc_table_column_map` tables in the repository.

 ## Run Data Comparison

 ```shell
-java -jar pgcompare.jar --batch 0
+java -jar pgcompare.jar compare --batch 0
 ```

 Batch 0 processes all data. Use `PGCOMPARE-BATCH` or specify the batch number using the `--batch` argument to specify a batch number.
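
For scripts written against the pre-0.4.0 flags, the argument changes in this README diff can be summarized as a small translation helper. This is a sketch based only on the before/after command pairs shown in the diff. `to_verb_syntax` is a hypothetical function, not something pgCompare ships, and treating an argument list with no action flag as `compare` is an assumption drawn from the `--batch 0` example.

```shell
# Hypothetical helper: rewrite a pre-0.4.0 argument list into the verb-first form.
# Only handles simple arguments without embedded spaces.
to_verb_syntax() {
    action="compare"   # assumption: no action flag meant "run a compare"
    rest=""
    for arg in "$@"; do
        case "$arg" in
            --init)                 action="init" ;;
            --discover|--discovery) action="discover" ;;
            --check)                action="check" ;;
            *)                      rest="$rest $arg" ;;
        esac
    done
    printf '%s%s\n' "$action" "$rest"
}

to_verb_syntax --batch 0 --check   # prints: check --batch 0
to_verb_syntax --discovery         # prints: discover
```

The printed string is what would follow `java -jar pgcompare.jar` under the new syntax.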
@@ -112,11 +138,20 @@ Batch 0 processes all data. Use `PGCOMPARE-BATCH` or specify the batch number us
 Revalidate flagged rows:

 ```shell
-java -jar pgcompare.jar --batch 0 --check
+java -jar pgcompare.jar check --batch 0
 ```

 # Upgrading

+## Version 0.4.0 Enhancements
+
+- Improved casting of low precision data types.
+- Added html report generation.
+- Refactored code for efficiency.
+- Modified arguments and added 'verb' clause to command line.
+
+**Note:** Drop and recreate the repository to upgrade to 0.4.0.
+
 ## Version 0.3.0 Enhancements

 - DB2 support.
@@ -188,11 +223,11 @@ FROM dc_source s
 Properties are categorized into four sections: system, repository, source, and target. Each section has specific properties, as described in detail in the documentation. The properties can be specified via a configuration file, environment variables or a combination of both. To use environment variables, the environment variable will be the name of the property in upper case with dashes '-' converted to underscore '_' and prefixed with PGCOMPARE_. For example, batch-fetch-size can be set by using the environment variable PGCOMPARE_BATCH_FETCH_SIZE.

 ### System
+
 - batch-fetch-size: Sets the fetch size for retrieving rows from the source or target database.
 - batch-commit-size: The commit size controls the array size and number of rows concurrently inserted into the dc_source/dc_target staging tables.
 - batch-progress-report-size: Defines the number of rows used in mod to report progress.
-- database-source: Determines if the sorting of the rows based on primary key occurs on the source/target database. If set to true, the default, the rows will be sorted before being compared. If set to false, the sorting will take place in the repository database.
-- float-cast: Defines how float and double data types are cast for hash function (notation|char). Default is char.
+- database-sort: Determines if the sorting of the rows based on primary key occurs on the source/target database. If set to true, the default, the rows will be sorted before being compared. If set to false, the sorting will take place in the repository database.
 - loader-threads: Sets the number of threads to load data into the temporary tables. Default is 4. Set to 0 to disable loader threads.
 - log-level: Level to determine the amount of log messages written to the log destination.
 - log-destination: Location where log messages will be written. Default is stdout.
@@ -201,6 +236,8 @@ Properties are categorized into four sections: system, repository, source, and t
 - observer-throttle: Set to true or false, instructs the loader threads to pause and wait for the observer thread to catch up before continuing to load more data into the staging tables.
 - observer-throttle-size: Number of rows loaded before the loader thread will sleep and wait for clearance from the observer thread.
 - observer-vacuum: Set to true or false, instructs the observer whether to perform a vacuum on the staging tables during checkpoints.
+- stage-table-parallel: Default parallel degree to set on staging table (default: 0)
+- standard-number-format: Format used to cast numbers (default:0000000000000000000000.0000000000000000000000)

 ### Repository

@@ -214,7 +251,6 @@ Properties are categorized into four sections: system, repository, source, and t

 ### Source

-- source-database-hash: True or false, instructs the application where the hash should be computed (on the database or by the application).
 - source-dbname: Database or service name.
 - source-host: Database server name.
 - source-password: Database password.
@@ -226,7 +262,6 @@ Properties are categorized into four sections: system, repository, source, and t

 ### Target

-- target-database-hash: True or false, instructs the application where the hash should be computed (on the database or by the application).
 - target-dbname: Database or service name.
 - target-host: Database server name.
 - target-password: Database password.
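
The environment variable naming rule stated in the README (upper case, dashes converted to underscores, `PGCOMPARE_` prefix) can be sketched as a one-line shell function. `to_env_name` is a hypothetical helper used only for illustration, and the value `2000` in the comment is an arbitrary example, not a recommended setting.

```shell
# Hypothetical helper: derive the environment variable name for a property.
# tr maps a-z to A-Z and the trailing literal '-' to '_'.
to_env_name() {
    printf 'PGCOMPARE_%s\n' "$(printf '%s' "$1" | LC_ALL=C tr 'a-z-' 'A-Z_')"
}

to_env_name batch-fetch-size   # prints PGCOMPARE_BATCH_FETCH_SIZE
to_env_name loader-threads     # prints PGCOMPARE_LOADER_THREADS

# The resulting name would then be exported before running pgCompare, e.g.
#   export PGCOMPARE_BATCH_FETCH_SIZE=2000
```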

database/pgCompare.sql

Lines changed: 59 additions & 5 deletions
@@ -49,7 +49,8 @@ CREATE TABLE dc_table (
 pid int8 DEFAULT 1 NOT NULL,
 tid int8 GENERATED ALWAYS AS IDENTITY( INCREMENT BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
 table_alias text NULL,
-status varchar(10) DEFAULT 'disabled'::character varying NULL,
+enabled boolean default true,
+--status varchar(10) DEFAULT 'disabled'::character varying NULL,
 batch_nbr int4 DEFAULT 1 NULL,
 parallel_degree int4 DEFAULT 1 NULL,
 CONSTRAINT dc_table_pk PRIMARY KEY (tid)
@@ -61,7 +62,7 @@ CREATE TABLE dc_table_column (
 tid int8 NOT NULL,
 column_id int8 GENERATED ALWAYS AS IDENTITY( INCREMENT BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1 NO CYCLE) NOT NULL,
 column_alias text NOT NULL,
-status varchar(15) DEFAULT 'compare'::character varying NULL,
+enabled boolean default true,
 CONSTRAINT dc_table_column_pk PRIMARY KEY (column_id)
 );

@@ -91,12 +92,10 @@ CREATE TABLE dc_table_column_map (

 CREATE TABLE dc_table_history (
 tid int8 NOT NULL,
-load_id varchar(100) NULL,
 batch_nbr int4 NOT NULL,
 start_dt timestamptz NOT NULL,
 end_dt timestamptz NULL,
 action_result jsonb NULL,
-action_type varchar(20) NOT NULL,
 row_count int8 NULL
 );

@@ -107,7 +106,6 @@ CREATE TABLE dc_table_map (
 dest_type varchar(20) DEFAULT 'target'::character varying NOT NULL,
 schema_name text NOT NULL,
 table_name text NOT NULL,
-parallel_degree int4 DEFAULT 1 NULL,
 mod_column varchar(200) NULL,
 table_filter varchar(200) NULL,
 schema_preserve_case bool DEFAULT false NULL,
@@ -150,3 +148,59 @@ ALTER TABLE dc_table_map ADD CONSTRAINT dc_table_map_fk FOREIGN KEY (tid) REFERE
 --
 INSERT INTO pgcompare.dc_project (project_name,project_config) VALUES
 ('default',NULL);
+
+--
+-- Functions
+--
+-- DROP FUNCTION pgcompare.dc_copy_table(int4, int4);
+
+CREATE OR REPLACE FUNCTION pgcompare.dc_copy_table(p_pid integer, p_tid integer)
+ RETURNS bigint
+ LANGUAGE plpgsql
+AS $function$
+DECLARE
+    v_new_tid bigint;
+    v_new_cid bigint;
+    r_column record;
+BEGIN
+    -- Duplicate dc_table. tid is GENERATED ALWAYS, so it is omitted and a new value is generated.
+    INSERT INTO dc_table (pid, table_alias, enabled, batch_nbr, parallel_degree)
+    SELECT pid, table_alias, enabled, batch_nbr, parallel_degree
+    FROM dc_table
+    WHERE pid = p_pid AND tid = p_tid
+    RETURNING tid INTO v_new_tid;
+
+    -- Duplicate dc_table_map
+    INSERT INTO dc_table_map (tid, dest_type, schema_name, table_name, mod_column, table_filter, schema_preserve_case, table_preserve_case)
+    SELECT v_new_tid, dest_type, schema_name, table_name, mod_column, table_filter, schema_preserve_case, table_preserve_case
+    FROM dc_table_map
+    WHERE tid = p_tid;
+
+    -- Duplicate dc_table_column and dc_table_column_map
+    FOR r_column IN
+        SELECT column_id, column_alias, enabled
+        FROM dc_table_column
+        WHERE tid = p_tid
+    LOOP
+        -- Insert the column under the new tid. column_id is GENERATED ALWAYS, so capture the generated value.
+        INSERT INTO dc_table_column (tid, column_alias, enabled)
+        VALUES (v_new_tid, r_column.column_alias, r_column.enabled)
+        RETURNING column_id INTO v_new_cid;
+
+        -- Duplicate dc_table_column_map
+        INSERT INTO dc_table_column_map (tid, column_id, column_origin, column_name, data_type, data_class, data_length, number_precision,
+            number_scale, column_nullable, column_primarykey, map_expression, supported, preserve_case, map_type)
+        SELECT v_new_tid, v_new_cid, column_origin, column_name, data_type, data_class, data_length, number_precision,
+            number_scale, column_nullable, column_primarykey, map_expression, supported, preserve_case, map_type
+        FROM dc_table_column_map
+        WHERE tid = p_tid
+        AND column_id = r_column.column_id;
+    END LOOP;
+
+    RETURN v_new_tid;
+END;
+$function$
+;

pgcompare.properties.sample

Lines changed: 0 additions & 7 deletions
@@ -103,9 +103,6 @@ repo-sslmode=disable
 # Database user to use for authentication.
 # source|target-password:
 # Database password.
-# source|target-database-hash:
-# Determines whether to generate the hash on the database (true) or by the application (false).
-# Default: true
 # source|target-sslmode:
 # SSL mode to use when connecting to the database (disable|prefer|require).
 # Default: disable
@@ -121,7 +118,6 @@ source-port=1521
 source-dbname=hr
 source-user=appuser
 source-password=welcome1
-source-database-hash=false
 source-sslmode=disable
 source-schema=appuser
 source-schema=appuser
@@ -132,7 +128,6 @@ source-schema=appuser
 #source-dbname=hr
 #source-user=postgres
 #source-password=welcome1
-#source-database-hash=true
 #source-sslmode=disable
 #source-schema=appuser

@@ -146,7 +141,6 @@ target-port=5432
 target-dbname=hr
 target-user=postgres
 target-password=welcome1
-target-database-hash=true
 target-sslmode=disable
 target-schema=public
 # Oracle Example
@@ -156,7 +150,6 @@ target-schema=public
 #target-dbname=hr
 #target-user=appuser
 #target-password=welcome1
-#target-database-hash=true
 #target-sslmode=disable
 #target-schema=appuser

pom.xml

Lines changed: 13 additions & 7 deletions
@@ -21,13 +21,13 @@
 <dependency>
 <groupId>com.ibm.db2</groupId>
 <artifactId>jcc</artifactId>
-<version>11.5.8.0</version>
+<version>12.1.2.0</version>
 </dependency>
 <!-- https://mvnrepository.com/artifact/org.postgresql/postgresql -->
 <dependency>
 <groupId>org.postgresql</groupId>
 <artifactId>postgresql</artifactId>
-<version>42.7.4</version>
+<version>42.7.7</version>
 </dependency>
 <!-- https://mvnrepository.com/artifact/commons-cli/commons-cli -->
 <dependency>
@@ -45,25 +45,25 @@
 <dependency>
 <groupId>com.oracle.database.jdbc</groupId>
 <artifactId>ojdbc11</artifactId>
-<version>23.4.0.24.05</version>
+<version>23.8.0.25.04</version>
 </dependency>
 <!-- https://mvnrepository.com/artifact/org.json/json -->
 <dependency>
 <groupId>org.json</groupId>
 <artifactId>json</artifactId>
-<version>20241224</version>
+<version>20250517</version>
 </dependency>
 <!-- https://mvnrepository.com/artifact/com.mysql/mysql-connector-j -->
 <dependency>
 <groupId>com.mysql</groupId>
 <artifactId>mysql-connector-j</artifactId>
-<version>8.4.0</version>
+<version>9.3.0</version>
 </dependency>
 <!-- https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc -->
 <dependency>
 <groupId>com.microsoft.sqlserver</groupId>
 <artifactId>mssql-jdbc</artifactId>
-<version>12.6.2.jre11</version>
+<version>12.10.0.jre11</version>
 </dependency>
 <!-- https://mvnrepository.com/artifact/org.apache.commons/commons-lang3 -->
 <dependency>
@@ -75,7 +75,7 @@
 <dependency>
 <groupId>org.mariadb.jdbc</groupId>
 <artifactId>mariadb-java-client</artifactId>
-<version>3.5.2</version>
+<version>3.5.3</version>
 </dependency>
 </dependencies>
 </dependencyManagement>
@@ -127,6 +127,12 @@
 <groupId>org.mariadb.jdbc</groupId>
 <artifactId>mariadb-java-client</artifactId>
 </dependency>
+<!-- https://mvnrepository.com/artifact/net.snowflake/snowflake-jdbc -->
+<dependency>
+<groupId>net.snowflake</groupId>
+<artifactId>snowflake-jdbc</artifactId>
+<version>3.24.2</version>
+</dependency>
 </dependencies>

 <build>
