Commit f9c63b3

Clean up code
1 parent 2a2b845 commit f9c63b3

3 files changed: 32 additions, 7 deletions

docs/source/usage.rst

Lines changed: 6 additions & 0 deletions
@@ -56,6 +56,8 @@ Additional optional arguments:
   than MAXSIZE except when individual input files exceed MAXSIZE (as
   individual files are never split up between different tar files).
 * ``--non-blocking`` Zstash will submit a Globus transfer and immediately create a subsequent tarball. That is, Zstash will not wait until the transfer completes to start creating a subsequent tarball. On machines where it takes more time to create a tarball than transfer it, each Globus transfer will have one file. On machines where it takes less time to create a tarball than transfer it, the first transfer will have one file, but the number of tarballs in subsequent transfers will grow finding dynamically the most optimal number of tarballs per transfer. NOTE: zstash is currently always non-blocking.
+* ``--error-on-duplicate-tar`` Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash's behavior will depend on whether or not the ``--overwrite-duplicate-tars`` flag is set.
+* ``--overwrite-duplicate-tars`` If a duplicate tar is encountered, overwrite the existing tar file with the new one (i.e., it will assume the latest tar is the correct one). If this flag is not set, zstash will permit multiple entries for the same tar in its database.
 * ``-v`` increases output verbosity.

 Local tar files as well as the sqlite3 index database (index.db) will be stored
@@ -153,6 +155,7 @@ where
   an incomplete tar file, then the archive you're checking
   must have been created using ``zstash >= v1.1.0``.
 * ``--tars`` to specify specific tars to check. See below for example usage.
+* ``--error-on-duplicate-tar`` Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash will check if the sizes and md5sums match *at least one* of the tars.
 * ``-v`` increases output verbosity.
 * ``[files]`` is a list of files to check (standard wildcards supported).

@@ -240,6 +243,8 @@ where
   they have been extracted from the archive. Normally, they are deleted after
   successful transfer.
 * ``--non-blocking`` Zstash will submit a Globus transfer and immediately create a subsequent tarball. That is, Zstash will not wait until the transfer completes to start creating a subsequent tarball. On machines where it takes more time to create a tarball than transfer it, each Globus transfer will have one file. On machines where it takes less time to create a tarball than transfer it, the first transfer will have one file, but the number of tarballs in subsequent transfers will grow finding dynamically the most optimal number of tarballs per transfer. NOTE: zstash is currently always non-blocking.
+* ``--error-on-duplicate-tar`` Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash's behavior will depend on whether or not the ``--overwrite-duplicate-tars`` flag is set.
+* ``--overwrite-duplicate-tars`` If a duplicate tar is encountered, overwrite the existing tar file with the new one (i.e., it will assume the latest tar is the correct one). If this flag is not set, zstash will permit multiple entries for the same tar in its database.
 * ``-v`` increases output verbosity.

 Note: in the event that an update includes revisions to files previously archived, ``zstash update``
@@ -319,6 +324,7 @@ where
   an incomplete tar file, then the archive you're extracting from
   must have been created using ``zstash >= v1.1.0``.
 * ``--tars`` to specify specific tars to extract. See "Check" above for example usage.
+* ``--error-on-duplicate-tar`` Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash will check if the sizes and md5sums match *at least one* of the tars.
 * ``-v`` increases output verbosity.
 * ``[files]`` is a list of files to be extracted (standard wildcards supported).
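
Reviewer note: the check and extract wording above (without ``--error-on-duplicate-tar``, a tar passes when its size and md5sum match *at least one* database entry) can be illustrated with a minimal Python sketch. The function name, the exception class, and the ``(size, md5)`` tuple layout are hypothetical illustrations and are not taken from the zstash source.

```python
from typing import List, Tuple


class DuplicateTarError(Exception):
    """Hypothetical error type; the exception zstash raises may differ."""


def tar_matches_database(
    actual_size: int,
    actual_md5: str,
    entries: List[Tuple[int, str]],
    error_on_duplicate_tar: bool = False,
) -> bool:
    """Return True if at least one (size, md5) entry matches the actual tar.

    `entries` stands in for all database rows that share one tar name.
    """
    # With the flag set, more than one row for the same tar is an error.
    if error_on_duplicate_tar and len(entries) > 1:
        raise DuplicateTarError(f"{len(entries)} database entries for one tar")
    # Otherwise a single matching row is enough for the check to pass.
    return any(
        size == actual_size and md5 == actual_md5 for size, md5 in entries
    )
```

Under these assumptions, a database holding one wrong-size row and one correct row still passes, while two wrong-size rows fail.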

tests/scripts/database_corruption.bash

Lines changed: 12 additions & 7 deletions
@@ -1,8 +1,4 @@
-# TODO: Commit these changes
-# TODO: Factor out functions in this script
-# TODO: Run the unit tests too
-# TODO: Add any functionality added to create to update as well.
-# TODO: Add any new parameters to the usage docs
+# TODO: Run the unit tests on Perlmutter

 setup()
 {
@@ -58,6 +54,15 @@ run_test_cases()
     fail_count=0
     review_str=""

+    # Test case explanations ##################################################
+    # 1. `zstash create`, then run `zstash check` from a different directory.
+    # 2. `zstash create`, then run `zstash check` from a directory that already has `zstash/index.db`.
+    # 3. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing" --error-on-duplicate-tar`. Errors out on create, so we don't even get to check.
+    # 4. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing_bad_size" --overwrite-duplicate-tars`. We see there's a duplicate tar and we overwrite it with the latest data. `zstash check` confirms the tar is correct.
+    # 5. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing"`. We simply add a duplicate tar, but `zstash check` with `--error-on-duplicate-tar` errors out because it finds two entries for the same tar.
+    # 6. `zstash create` with `--for-developers-force-database-corruption="simulate_no_correct_size"` to construct a very bad database: two entries for the same tar, both with incorrect sizes. `zstash check` confirms that no entries match the actual file size.
+    # 7. `zstash create` with `--for-developers-force-database-corruption="simulate_row_existing_bad_size"`. We add a duplicate tar, but with the wrong size. `zstash check` confirms that the other entry matches the actual file size, so it succeeds.
+
     # Standard cases ##########################################################


@@ -98,12 +103,12 @@ run_test_cases()
     else
         ((success_count++))
     fi
-    # Do NOT change directory
+    cd zstash_demo # Use a directory that already has a zstash/index.db!
     zstash check --hpss=${DST_DIR}/${case_name} 2>&1 | tee check.log
     grep "INFO: 000000.tar: Found a single database entry." check.log
     if [ $? != 0 ]; then
         ((fail_count++))
-        review_str+="${case_name}_check/check.log,"
+        review_str+="${case_name}_create/zstash_demo/check.log," # Notice this is a different path!
     else
         ((success_count++))
     fi

zstash/update.py

Lines changed: 14 additions & 0 deletions
@@ -105,6 +105,16 @@ def setup_update() -> Tuple[argparse.Namespace, str]:
         action="store_true",
         help="do not wait for each Globus transfer until it completes.",
     )
+    optional.add_argument(
+        "--error-on-duplicate-tar",
+        action="store_true",
+        help="Raise an error if a tar file with the same name already exists in the database. If this flag is set, zstash will exit if it sees a duplicate tar. If it is not set, zstash's behavior will depend on whether or not the --overwrite-duplicate-tars flag is set.",
+    )
+    optional.add_argument(
+        "--overwrite-duplicate-tars",
+        action="store_true",
+        help="If a duplicate tar is encountered, overwrite the existing tar file with the new one (i.e., it will assume the latest tar is the correct one). If this flag is not set, zstash will permit multiple entries for the same tar in its database.",
+    )
     optional.add_argument(
         "-v", "--verbose", action="store_true", help="increase output verbosity"
     )
@@ -265,6 +275,8 @@ def update_database(  # noqa: C901
                 keep,
                 args.follow_symlinks,
                 non_blocking=args.non_blocking,
+                error_on_duplicate_tar=args.error_on_duplicate_tar,
+                overwrite_duplicate_tars=args.overwrite_duplicate_tars,
             )
         except FileNotFoundError:
             raise Exception("Archive update failed due to broken symlink.")
@@ -279,6 +291,8 @@ def update_database(  # noqa: C901
             keep,
             args.follow_symlinks,
             non_blocking=args.non_blocking,
+            error_on_duplicate_tar=args.error_on_duplicate_tar,
+            overwrite_duplicate_tars=args.overwrite_duplicate_tars,
         )

     # Close database
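
Reviewer note: the help strings added in this file imply an order of precedence between the two new flags (error first, then overwrite, otherwise permit multiple entries). The self-contained sketch below shows one way that reading could be expressed. The parser name, the `duplicate_tar_action` helper, and the returned labels are hypothetical; the commit itself only forwards the two booleans to the existing archiving call.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Stand-alone parser carrying only the two flags added in this commit."""
    parser = argparse.ArgumentParser(prog="zstash-update-sketch")
    parser.add_argument("--error-on-duplicate-tar", action="store_true")
    parser.add_argument("--overwrite-duplicate-tars", action="store_true")
    return parser


def duplicate_tar_action(args: argparse.Namespace) -> str:
    """Pick what to do when a tar name already exists in the database.

    Hypothetical helper: the labels are illustrative, not zstash identifiers.
    """
    if args.error_on_duplicate_tar:
        return "error"  # exit as soon as a duplicate is seen
    if args.overwrite_duplicate_tars:
        return "overwrite"  # assume the latest tar is the correct one
    return "append"  # permit multiple entries for the same tar
```

For example, `duplicate_tar_action(build_parser().parse_args(["--overwrite-duplicate-tars"]))` selects the overwrite branch, and passing both flags selects the error branch under this reading.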
