Commit a53bad4

Handle duplicate tars (#378)
* Handle duplicate tars
* Test database inspection
* Clean up code
* Get tests on Perlmutter passing
* Address comments
* Improve wording in docs
1 parent 620c622 commit a53bad4

8 files changed, with 573 additions and 59 deletions.

docs/source/usage.rst

Lines changed: 6 additions & 0 deletions
@@ -56,6 +56,8 @@ Additional optional arguments:
   than MAXSIZE except when individual input files exceed MAXSIZE (as
   individual files are never split up between different tar files).
 * ``--non-blocking`` Zstash will submit a Globus transfer and immediately create a subsequent tarball. That is, Zstash will not wait until the transfer completes to start creating a subsequent tarball. On machines where it takes more time to create a tarball than transfer it, each Globus transfer will have one file. On machines where it takes less time to create a tarball than transfer it, the first transfer will have one file, but the number of tarballs in subsequent transfers will grow, dynamically finding the optimal number of tarballs per transfer. NOTE: zstash is currently always non-blocking.
+* ``--error-on-duplicate-tar`` FOR ADVANCED USERS ONLY: Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash's behavior will depend on whether or not the ``--overwrite-duplicate-tars`` flag is set.
+* ``--overwrite-duplicate-tars`` FOR ADVANCED USERS ONLY: If a duplicate tar is encountered, overwrite the existing database record with the new one (i.e., it will assume the latest tar is the correct one). If this flag is not set, zstash will permit multiple entries for the same tar in its database.
 * ``-v`` increases output verbosity.

 Local tar files as well as the sqlite3 index database (index.db) will be stored
@@ -153,6 +155,7 @@ where
   an incomplete tar file, then the archive you're checking
   must have been created using ``zstash >= v1.1.0``.
 * ``--tars`` to specify specific tars to check. See below for example usage.
+* ``--error-on-duplicate-tar`` FOR ADVANCED USERS ONLY: Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash will check whether the size matches the *most recent* entry.
 * ``-v`` increases output verbosity.
 * ``[files]`` is a list of files to check (standard wildcards supported).

@@ -240,6 +243,8 @@ where
   they have been extracted from the archive. Normally, they are deleted after
   successful transfer.
 * ``--non-blocking`` Zstash will submit a Globus transfer and immediately create a subsequent tarball. That is, Zstash will not wait until the transfer completes to start creating a subsequent tarball. On machines where it takes more time to create a tarball than transfer it, each Globus transfer will have one file. On machines where it takes less time to create a tarball than transfer it, the first transfer will have one file, but the number of tarballs in subsequent transfers will grow, dynamically finding the optimal number of tarballs per transfer. NOTE: zstash is currently always non-blocking.
+* ``--error-on-duplicate-tar`` FOR ADVANCED USERS ONLY: Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash's behavior will depend on whether or not the ``--overwrite-duplicate-tars`` flag is set.
+* ``--overwrite-duplicate-tars`` FOR ADVANCED USERS ONLY: If a duplicate tar is encountered, overwrite the existing database record with the new one (i.e., it will assume the latest tar is the correct one). If this flag is not set, zstash will permit multiple entries for the same tar in its database.
 * ``-v`` increases output verbosity.

 Note: in the event that an update includes revisions to files previously archived, ``zstash update``
@@ -319,6 +324,7 @@ where
   an incomplete tar file, then the archive you're extracting from
   must have been created using ``zstash >= v1.1.0``.
 * ``--tars`` to specify specific tars to extract. See "Check" above for example usage.
+* ``--error-on-duplicate-tar`` FOR ADVANCED USERS ONLY: Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash will check whether the size matches the *most recent* entry.
 * ``-v`` increases output verbosity.
 * ``[files]`` is a list of files to be extracted (standard wildcards supported).

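The flag descriptions for ``zstash create`` and ``zstash update`` amount to a three-way policy when a tar name is already recorded: exit with an error, overwrite the existing record, or add another entry. The following is a minimal sketch of that policy, not zstash's actual code. It assumes a hypothetical ``tars`` table with ``id``, ``name`` and ``size`` columns; the names ``record_tar``, ``sizes_for``, ``new_db`` and ``DuplicateTarError`` are illustrative only.

```python
import sqlite3


class DuplicateTarError(Exception):
    """Raised when a tar name is already recorded and duplicates are forbidden."""


def new_db():
    # Hypothetical schema for illustration; the real index.db may differ.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tars (id INTEGER PRIMARY KEY, name TEXT, size INTEGER)"
    )
    return conn


def record_tar(conn, name, size, error_on_duplicate=False, overwrite_duplicate=False):
    """Record a tar, applying the duplicate policy described in the docs."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM tars WHERE name = ?", (name,))
    already_present = cur.fetchone()[0] > 0
    if already_present and error_on_duplicate:
        # The error flag takes precedence: stop before touching the database.
        raise DuplicateTarError(f"{name} is already in the database")
    if already_present and overwrite_duplicate:
        # Assume the latest tar is the correct one.
        cur.execute("UPDATE tars SET size = ? WHERE name = ?", (size, name))
    else:
        # First entry, or duplicates are permitted: add a new row.
        cur.execute("INSERT INTO tars (name, size) VALUES (?, ?)", (name, size))
    conn.commit()


def sizes_for(conn, name):
    """Return recorded sizes for a tar name, oldest entry first."""
    rows = conn.execute("SELECT size FROM tars WHERE name = ? ORDER BY id", (name,))
    return [row[0] for row in rows]
```

With neither flag set, recording ``000000.tar`` twice leaves two rows behind, which is the situation the ``zstash check`` and ``zstash extract`` flag is meant to surface.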
Lines changed: 325 additions & 0 deletions
@@ -0,0 +1,325 @@
+setup()
+{
+    echo "##########################################################################################################"
+    local case_name="${1}"
+    local src_prefix="${2}"
+    echo "Testing: ${case_name}"
+    rm -rf ${src_prefix}_check
+    mkdir -p ${src_prefix}_check
+    rm -rf ${src_prefix}_create
+    mkdir -p ${src_prefix}_create
+    cd ${src_prefix}_create
+
+    mkdir zstash_demo
+    mkdir zstash_demo/empty_dir
+    mkdir zstash_demo/dir
+    echo -n '' > zstash_demo/file_empty.txt
+    echo 'file0 stuff' > zstash_demo/dir/file0.txt
+}
+
+run_test_cases()
+{
+    # Before running the first time:
+    # globus.org
+    # Authenticate into LCRC Improv DTN, NERSC Perlmutter
+    # Run toy problem:
+    #
+    # source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh
+    # cd /lcrc/group/e3sm/ac.forsyth2/zstash_testing/test_20250729
+    # mkdir zstash_demo; echo 'file0 stuff' > zstash_demo/file0.txt
+    # Chrysalis: 15288284-7006-4041-ba1a-6b52501e49f1
+    # Perlmutter: 6bdc7956-fc0f-4ad2-989c-7aa5ee643a79
+    # zstash create --hpss=globus://6bdc7956-fc0f-4ad2-989c-7aa5ee643a79/global/homes/f/forsyth/zstash/tests/manual_run zstash_demo
+    # Will prompt for LCRC AND NERSC authentication, and then paste generated auth code one time.
+
+    # Before each run:
+    # Perlmutter:
+    # cd /global/homes/f/forsyth/zstash/tests/
+    # rm -rf test_database_corruption
+    #
+    # Chrysalis:
+    # cd ~/ez/zstash/
+    # conda activate zstash-377-20250728
+    # pre-commit run --all-files
+    # python -m pip install .
+    # cd tests/scripts
+    # ./database_corruption.bash
+
+    SRC_DIR=/lcrc/group/e3sm/ac.forsyth2/zstash_testing/test_database_corruption # Chrysalis
+    DST_DIR=globus://6bdc7956-fc0f-4ad2-989c-7aa5ee643a79/global/homes/f/forsyth/zstash/tests/test_database_corruption # Perlmutter
+
+    success_count=0
+    fail_count=0
+    review_str=""
+
+    # Test case explanations ##################################################
+    # 1. `zstash create`, then run `zstash check` from a different directory.
+    # 2. `zstash create`, then run `zstash check` from a directory that already has `zstash/index.db`.
+    # 3. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing" --error-on-duplicate-tar`. Errors out on create, so we don't even get to check.
+    # 4. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing_bad_size" --overwrite-duplicate-tars`. We see there's a duplicate tar and we overwrite it with the latest data. `zstash check` confirms the tar is correct.
+    # 5. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing"`. We simply add a duplicate tar, but `zstash check` with `--error-on-duplicate-tar` errors out because it finds two entries for the same tar.
+    # 6. `zstash create` with `--for-developers-force-database-corruption="simulate_no_correct_size"` to construct a very bad database: two entries for the same tar, both with incorrect sizes. `zstash check` confirms that no entries match the actual file size.
+    # 7. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing_bad_size"`. We add a duplicate tar, but with the wrong size. `zstash check` confirms that the other entry matches the actual file size, so it succeeds.
+    # 8. `zstash create` with `--for-developers-force-database-corruption="simulate_bad_size_for_most_recent"` to construct two entries for the same tar, the most recent of which has an incorrect size. `zstash check` fails because the most recent size does not match, but it does log that one of the entries matches the actual file size.
+
+    # Standard cases ##########################################################
+
+
+    # Case 1: zstash create, check from different directory
+    case_name="normal"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} zstash_demo 2>&1 | tee create.log
+    grep "INFO: Adding 000000.tar to the database." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    cd ${src_prefix}_check
+    zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
+    grep "INFO: 000000.tar: Found a single database entry." check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_check/check.log,"
+    else
+        ((success_count++))
+    fi
+
+
+    # Case 2: zstash create, check from same directory (i.e., an index.db already exists)
+    case_name="check_from_same_dir"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} zstash_demo 2>&1 | tee create.log
+    grep "INFO: Adding 000000.tar to the database." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    cd zstash_demo # Use a directory that already has a zstash/index.db!
+    zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
+    grep "INFO: 000000.tar: Found a single database entry." check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/zstash_demo/check.log," # Notice this is a different path!
+    else
+        ((success_count++))
+    fi
+
+
+    # Corrupted database cases ################################################
+    # --for-developers-force-database-corruption is set on `zstash create`
+
+
+    # Case 3: Duplicates detected! Error out on create. Don't even get to check.
+    case_name="error_on_create"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} --for-developers-force-database-corruption="simulate_row_existing" --error-on-duplicate-tar zstash_demo 2>&1 | tee create.log
+    grep "INFO: TESTING/DEBUGGING ONLY: Simulating row existing for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    grep "ERROR: Database corruption detected! 000000.tar is already in the database." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+
+
+    # Case 4: Duplicates detected! Overwrite them. Proceed with check, as usual.
+    case_name="overwrite_duplicate"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} --for-developers-force-database-corruption="simulate_row_existing_bad_size" --overwrite-duplicate-tars zstash_demo 2>&1 | tee create.log
+    grep "INFO: TESTING/DEBUGGING ONLY: Simulating row existing with bad size for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    grep "WARNING: Database corruption detected! 000000.tar is already in the database." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+    grep "WARNING: Updating existing tar 000000.tar to proceed." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+    cd ${src_prefix}_check
+    # We should have overwritten the wrong size with the real size, so check should pass.
+    zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
+    grep "INFO: 000000.tar: Found a single database entry." check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_check/check.log,"
+    else
+        ((success_count++))
+    fi
+
+
+    # Case 5: Duplicates detected! Allow them. Error out on check because duplicates are present.
+    case_name="check_detects_duplicate"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} --for-developers-force-database-corruption="simulate_row_existing" zstash_demo 2>&1 | tee create.log
+    grep "INFO: TESTING/DEBUGGING ONLY: Simulating row existing for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    grep "WARNING: Database corruption detected! 000000.tar is already in the database." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+    grep "WARNING: Adding a new entry for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+    cd ${src_prefix}_check
+    zstash check --hpss=${DST_DIR}/${case_name} --error-on-duplicate-tar 2>&1 | tee check.log
+    grep "ERROR: Database corruption detected! Found 2 database entries for 000000.tar, with sizes" check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_check/check.log,"
+    else
+        ((success_count++))
+    fi
+
+
+    # Case 6: Duplicates detected! Allow them. Error out on check because none of the sizes match.
+    case_name="check_finds_no_matching_size"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} --for-developers-force-database-corruption="simulate_no_correct_size" zstash_demo 2>&1 | tee create.log
+    grep "INFO: TESTING/DEBUGGING ONLY: Simulating no correct size for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    cd ${src_prefix}_check
+    zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
+    grep "WARNING: Database corruption detected! Found 2 database entries for 000000.tar, with sizes" check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_check/check.log,"
+    else
+        ((success_count++))
+    fi
+    grep "INFO: 000000.tar: No database entry matches the actual file size:" check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+
+
+    # Case 7: Duplicates detected! Allow them. Pass check because the most recent size matches.
+    case_name="check_finds_most_recent_size_matches"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} --for-developers-force-database-corruption="simulate_row_existing_bad_size" zstash_demo 2>&1 | tee create.log
+    grep "INFO: TESTING/DEBUGGING ONLY: Simulating row existing with bad size for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    grep "WARNING: Database corruption detected! 000000.tar is already in the database." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+    grep "WARNING: Adding a new entry for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+    cd ${src_prefix}_check
+    zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
+    grep "WARNING: Database corruption detected! Found 2 database entries for 000000.tar, with sizes" check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_check/check.log,"
+    else
+        ((success_count++))
+    fi
+    grep "INFO: 000000.tar: The most recent database entry has the same size as the actual file size:" check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+
+    # Case 8: Duplicates detected! Allow them. Error out on check because the most recent size doesn't match.
+    case_name="check_finds_most_recent_size_does_not_match"
+    src_prefix=${SRC_DIR}/${case_name}
+    setup ${case_name} ${src_prefix}
+    cd ${src_prefix}_create
+    zstash create --hpss=${DST_DIR}/${case_name} --for-developers-force-database-corruption="simulate_bad_size_for_most_recent" zstash_demo 2>&1 | tee create.log
+    grep "INFO: TESTING/DEBUGGING ONLY: Simulating bad size for most recent entry for 000000.tar." create.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_create/create.log,"
+    else
+        ((success_count++))
+    fi
+    cd ${src_prefix}_check
+    zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
+    grep "WARNING: Database corruption detected! Found 2 database entries for 000000.tar, with sizes" check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+        review_str+="${case_name}_check/check.log,"
+    else
+        ((success_count++))
+    fi
+    grep "INFO: 000000.tar: A database entry matches the actual file size," check.log
+    if [ $? != 0 ]; then
+        ((fail_count++))
+    else
+        ((success_count++))
+    fi
+
+
+    # Summary
+    echo "Success count: ${success_count}"
+    echo "Fail count: ${fail_count}"
+    echo "Review: ${review_str}"
+}
+
+run_test_cases
+
+# Success count: 25
+# Fail count: 0
+# Review:

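Cases 1 and 5 through 8 of the script above exercise how `zstash check` treats the database entries it finds for one tar. The decision those cases imply can be sketched as a pure function. This is an illustration inferred from the test expectations, not zstash's implementation: `classify_tar_entries` and its return messages are hypothetical, and "most recent" is assumed to mean the last entry in the list.

```python
def classify_tar_entries(db_sizes, actual_size, error_on_duplicate=False):
    """Decide whether a tar passes the size check.

    db_sizes: sizes recorded for one tar name, oldest entry first (assumed order).
    actual_size: size of the tar file actually found.
    Returns a (passed, reason) tuple.
    """
    if not db_sizes:
        return False, "no database entry"
    if len(db_sizes) == 1:
        # Cases 1, 2 and 4: a single entry, compared directly.
        return db_sizes[0] == actual_size, "single database entry"
    if error_on_duplicate:
        # Case 5: duplicates are themselves an error.
        return False, "duplicate entries are an error"
    if db_sizes[-1] == actual_size:
        # Case 7: the most recent entry matches.
        return True, "most recent entry matches"
    if actual_size in db_sizes:
        # Case 8: some older entry matches, but the most recent does not.
        return False, "an older entry matches, but not the most recent"
    # Case 6: nothing matches.
    return False, "no entry matches"
```

For example, `classify_tar_entries([999, 10240], 10240)` passes, while `classify_tar_entries([10240, 999], 10240)` fails even though one recorded size is correct, mirroring the difference between cases 7 and 8.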
tests/test_check.py

Lines changed: 12 additions & 9 deletions
@@ -184,15 +184,18 @@ def testCheckKeepTars(self):
             "{}zstash extract --hpss={}".format(zstash_path, self.hpss_path)
         )
         # Run `zstash check`
-        output, err = run_cmd(
-            "{}zstash check --hpss={}".format(zstash_path, self.hpss_path)
-        )
-        self.assertEqualOrStop(
-            output + err,
-            'For help, please see https://e3sm-project.github.io/zstash. Ask questions at https://github.com/E3SM-Project/zstash/discussions/categories/q-a.\nINFO: zstash/000000.tar exists. Checking expected size matches actual size.\nINFO: Opening tar archive {}/000000.tar\nINFO: Checking file1.txt\nINFO: Checking file2.txt\nINFO: No failures detected when checking the files. If you have a log file, run "grep -i Exception <log-file>" to double check.\n'.format(
-                self.cache
-            ),
-        )
+        zstash_cmd: str = f"{zstash_path}zstash check --hpss={self.hpss_path}"
+        output, err = run_cmd(zstash_cmd)
+        expected_present = [
+            "For help, please see https://e3sm-project.github.io/zstash. Ask questions at https://github.com/E3SM-Project/zstash/discussions/categories/q-a.",
+            "INFO: zstash/000000.tar exists. Checking expected size matches actual size.",
+            f"INFO: Opening tar archive {self.cache}/000000.tar",
+            "INFO: Checking file1.txt",
+            "INFO: Checking file2.txt",
+            'INFO: No failures detected when checking the files. If you have a log file, run "grep -i Exception <log-file>" to double check.',
+        ]
+        expected_absent = []
+        self.check_strings(zstash_cmd, output + err, expected_present, expected_absent)
         # Check that tar and db files were not deleted
         files = os.listdir("{}/".format(self.cache))
         if not compare(files, ["000000.tar", "index.db"]):

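The hunk above replaces an exact-match comparison of the whole output with a call to `check_strings`, whose definition is not shown in this diff. A plausible reading is that it asserts on substrings, so new log lines (such as the duplicate-tar messages) no longer break the test. The standalone function below is a hypothetical sketch of that idea; the real helper is a method on the test class and may behave differently.

```python
def check_strings(command, output, expected_present, expected_absent):
    """Fail if any expected substring is missing or any forbidden one appears.

    command is only used to make the failure message easier to trace.
    """
    missing = [s for s in expected_present if s not in output]
    unexpected = [s for s in expected_absent if s in output]
    if missing or unexpected:
        raise AssertionError(
            f"Command: {command}\n"
            f"Missing from output: {missing}\n"
            f"Unexpectedly present: {unexpected}"
        )
```

Because only substrings are checked, the order of log lines and any additional lines in the output do not affect the result.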