
Commit 10a5b8f

Merge pull request #355 from E3SM-Project/non-block-testing
Non block testing
2 parents 082afc7 + 4a2a8df commit 10a5b8f

File tree

12 files changed: +372 -26 lines changed

tests/base.py

Lines changed: 1 addition & 1 deletion
@@ -325,7 +325,7 @@ def add_files(self, use_hpss, zstash_path, keep=False, cache=None):
             expected_present = ["Transferring file to HPSS"]
         else:
             expected_present = ["put: HPSS is unavailable"]
-        expected_present += ["INFO: Creating new tar archive"]
+        expected_present += ["Creating new tar archive"]
         # Make sure none of the old files or directories are moved.
         expected_absent = ["ERROR", "file0", "file_empty", "empty_dir"]
         self.check_strings(cmd, output + err, expected_present, expected_absent)

tests3/README_TEST_BLOCKING

Lines changed: 103 additions & 0 deletions
This document outlines the procedures conducted to test the zstash blocking
and non-blocking behavior.

Note: As it was intended to test blocking with regard to archive tar-creation
vs Globus transfers, it was convenient to have both source and destination be
the same Globus endpoint. Effectively, we are employing Globus merely to move
tar archive files from one directory to another on the same file system.

The core intent in implementing zstash blocking is to address a potential
"low-disk" condition, where tar-files created to archive source files could
add substantially to the disk load. To avoid disk exhaustion, in "blocking"
mode ("--non-blocking" is absent from the command line) tar-file creation
pauses to wait for the previous tar-file's Globus transfer to complete, so
that the local copy can be deleted before the next tar-file is created.

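A quick way to observe the difference during a run (a sketch only; it assumes
the directory layout described in Section I below) is to count the tar files
sitting in the local cache:

    ls src_data/zstash/*.tar 2>/dev/null | wc -l
    # BLOCKING: count stays at or near 1, since each local tar is removed
    #           once its transfer completes
    # NON_BLOCKING: count can grow while Globus transfers lag behind
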
I. File System Setup
====================

As one may want, or need, to re-conduct testing under varied conditions, the
test script:

    test_zstash_blocking.sh

will establish the following directory structure in the operator's current
working directory:

[CWD]/src_data/

    - contains files to be tar-archived. One can experiment
      with different sizes of files to trigger behaviors.

[CWD]/src_data/zstash/

    - default location of tar-files produced. This directory is
      created automatically by zstash unless "--cache" indicates
      an alternate location.

[CWD]/dst_data/

    - destination for Globus transfer of archives.

[CWD]/tmp_cache/

    - [Optional] alternative location for tar-file generation.

Note: It may be convenient to create a "hold" directory to store files of
various sizes that can be easily produced by running the supplied scripts
(see the staging sketch below):

    gen_data.sh
    gen_data_runner.sh

The files to be used for a given test must be moved or copied to the src_data
directory before a test is initiated.

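For example, test data might be staged like this (a sketch only; the file
sizes and names are arbitrary choices, not part of the supplied scripts):

    # generate random-content files of two sizes into the "hold" area,
    # then copy the ones wanted for this run into src_data
    mkdir -p hold src_data
    ./gen_data.sh 1000000   hold/small_01_1M     # ~1 MB
    ./gen_data.sh 100000000 hold/medium_01_100M  # ~100 MB
    cp hold/* src_data/
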
Note: It never hurts to run the supplied script:

    reset_test.sh

before a test run. This will delete any archives in the src_data/zstash
cache and the receiving dst_data and tmp_cache directories, and delete the
src_data/zstash directory itself if it exists. This ensures a clean restart
for testing. The raw data files placed into src_data are not affected.

II. Running the Test Script
===========================

The test script "test_zstash_blocking.sh" accepts two positional parameters:

    test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) [NEW_CREDS]

On an initial run, or whenever Globus complains of authentication failures,
add "NEW_CREDS" as the second parameter. This will delete your cached Globus
credentials and trigger prompts for you to paste login URLs into your browser
(generally one per endpoint), which requires that you conduct a login
sequence and then paste a returned key value at the bash command prompt.
After both keys are accepted, you can re-run the test script without
"NEW_CREDS" until the credentials expire (usually 24 hours).

If "BLOCKING" is selected, zstash will run in default mode, waiting for
each tar file to complete its transfer before generating another tar file.

If "NON_BLOCKING" is selected, the zstash flag "--non-blocking" is supplied
on the zstash command line, and tar files continue to be created in parallel
with running Globus transfers.

It is suggested that you run the test script with

    test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) > your_logfile 2>&1

so that all output is captured to a logfile; you can then monitor progress
(for example, from a second terminal) with

    snapshot.sh

which will provide a view of both the tar-file cache and the destination
directory for delivered tar files. It is also suggested that you name your
logfile to reflect the date and whether the run was BLOCKING or not.

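As a concrete illustration (the logfile naming scheme is only a suggestion),
a backgrounded run with a dated, mode-tagged logfile could be launched and
then checked periodically like this:

    ./test_zstash_blocking.sh BLOCKING > blocking_$(date +%Y%m%d).log 2>&1 &
    ./snapshot.sh    # re-run as desired to watch tar files appear and move
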
FINAL NOTE: In the zstash code, the tar-file "MINSIZE" parameter is taken
to be (int) multiples of 1 GB. During testing, this had been changed to
multiples of 100 KB for rapid testing. It may be useful to expose this as
a command-line parameter for debugging purposes.

tests3/gen_data.sh

Lines changed: 11 additions & 0 deletions
#!/bin/bash

if [[ $# -lt 2 ]]; then
    echo "Usage: gen_data.sh <bytes> <outputfile>"
    exit 0
fi

len=$1
out=$2

head -c $len </dev/urandom >$out
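
A usage sketch (the byte count and output file name are arbitrary):

    ./gen_data.sh 250000000 medium_01_250M   # writes ~250 MB of random bytes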

tests3/gen_data_runner.sh

Lines changed: 8 additions & 0 deletions
#!/bin/bash

i=1

while [[ $i -lt 12 ]]; do
    ./gen_data.sh 1000000 small_0${i}_1M
    i=$((i+1))
done

tests3/reset_test.sh

Lines changed: 5 additions & 0 deletions
#!/bin/bash

rm -rf src_data/zstash/
rm -f dst_data/*
rm -f tmp_cache/*

tests3/snapshot.sh

Lines changed: 14 additions & 0 deletions
#!/bin/bash

echo "dst_data:"
ls -l dst_data

echo ""
echo "src_data/zstash:"
ls -l src_data/zstash

echo ""
echo "tmp_cache:"
ls -l tmp_cache
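
To poll continuously during a long run, one option (assuming the common
"watch" utility is available on the test system) is:

    watch -n 30 ./snapshot.sh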

tests3/test_zstash_blocking.sh

Lines changed: 73 additions & 0 deletions
#!/bin/bash

if [[ $# -lt 1 ]]; then
    echo "Usage: test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) [NEW_CREDS]"
    echo " One of \"BLOCKING\" or \"NON_BLOCKING\" must be supplied as the"
    echo " first parameter."
    echo " Add \"NEW_CREDS\" if Globus credentials have expired."
    echo " This will cause Globus to prompt for new credentials."
    exit 0
fi

NON_BLOCKING=1

if [[ $1 == "BLOCKING" ]]; then
    NON_BLOCKING=0
elif [[ $1 == "NON_BLOCKING" ]]; then
    NON_BLOCKING=1
else
    echo "ERROR: Must supply \"BLOCKING\" or \"NON_BLOCKING\" as 1st argument."
    exit 0
fi

# Remove old auth data, if it exists, so that Globus will prompt us
# for new auth credentials in case they have expired:
if [[ $# -gt 1 ]]; then
    if [[ $2 == "NEW_CREDS" ]]; then
        rm -f ~/.globus-native-apps.cfg
    fi
fi

base_dir=`pwd`
base_dir=`realpath $base_dir`

# See if we are running the zstash we THINK we are:
echo "CALLING zstash version"
zstash version
echo ""

# Selectable Endpoint UUIDs
ACME1_GCSv5_UUID=6edb802e-2083-47f7-8f1c-20950841e46a
LCRC_IMPROV_DTN_UUID=15288284-7006-4041-ba1a-6b52501e49f1
NERSC_HPSS_UUID=9cd89cfd-6d04-11e5-ba46-22000b92c6ec

# 12 piControl ocean monthly files, 49 GB
SRC_DATA=$base_dir/src_data
DST_DATA=$base_dir/dst_data

SRC_UUID=$LCRC_IMPROV_DTN_UUID
DST_UUID=$LCRC_IMPROV_DTN_UUID

# Optional
TMP_CACHE=$base_dir/tmp_cache

mkdir -p $SRC_DATA $DST_DATA $TMP_CACHE

# Make maxsize 1 GB. This will create a new tar after every 1 GB of data.
# (Since individual files are 4 GB, we will get 1 tarfile per datafile.)

if [[ $NON_BLOCKING -eq 1 ]]; then
    echo "TEST: NON_BLOCKING:"
    zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize 1 --non-blocking $SRC_DATA
else
    echo "TEST: BLOCKING:"
    zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize 1 $SRC_DATA
    # zstash create -v --hpss=globus://$DST_UUID --maxsize 1 --non-blocking --cache $TMP_CACHE $SRC_DATA
fi

echo "Testing Completed"

exit 0
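
For reference (mirroring the procedure in README_TEST_BLOCKING; the logfile
names are arbitrary), typical invocations look like:

    # first run, or after Globus credentials have expired (credential
    # prompts appear on the terminal, so do not redirect this one):
    ./test_zstash_blocking.sh BLOCKING NEW_CREDS

    # later runs, while credentials remain valid:
    ./test_zstash_blocking.sh NON_BLOCKING > non_blocking_test.log 2>&1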

zstash/create.py

Lines changed: 13 additions & 5 deletions
@@ -19,6 +19,7 @@
     get_files_to_archive,
     run_command,
     tars_table_exists,
+    ts_utc,
 )
 
 
@@ -37,7 +38,7 @@ def create():
         raise TypeError("Invalid config.hpss={}".format(config.hpss))
 
     # Start doing actual work
-    logger.debug("Running zstash create")
+    logger.debug(f"{ts_utc()}: Running zstash create")
     logger.debug("Local path : {}".format(path))
     logger.debug("HPSS path : {}".format(hpss))
     logger.debug("Max size : {}".format(config.maxsize))
@@ -54,11 +55,13 @@ def create():
     if hpss != "none":
         url = urlparse(hpss)
         if url.scheme == "globus":
+            # identify globus endpoints
+            logger.debug(f"{ts_utc()}:Calling globus_activate(hpss)")
             globus_activate(hpss)
         else:
             # config.hpss is not "none", so we need to
             # create target HPSS directory
-            logger.debug("Creating target HPSS directory")
+            logger.debug(f"{ts_utc()}: Creating target HPSS directory {hpss}")
             mkdir_command: str = "hsi -q mkdir -p {}".format(hpss)
             mkdir_error_str: str = "Could not create HPSS directory: {}".format(hpss)
             run_command(mkdir_command, mkdir_error_str)
@@ -71,7 +74,7 @@ def create():
     run_command(ls_command, ls_error_str)
 
     # Create cache directory
-    logger.debug("Creating local cache directory")
+    logger.debug(f"{ts_utc()}: Creating local cache directory")
     os.chdir(path)
     try:
         os.makedirs(cache)
@@ -84,11 +87,14 @@ def create():
     # TODO: Verify that cache is empty
 
     # Create and set up the database
+    logger.debug(f"{ts_utc()}: Calling create_database()")
     failures: List[str] = create_database(cache, args)
 
     # Transfer to HPSS. Always keep a local copy.
+    logger.debug(f"{ts_utc()}: calling hpss_put() for {get_db_filename(cache)}")
     hpss_put(hpss, get_db_filename(cache), cache, keep=True)
 
+    logger.debug(f"{ts_utc()}: calling globus_finalize()")
     globus_finalize(non_blocking=args.non_blocking)
 
     if len(failures) > 0:
@@ -145,7 +151,7 @@ def setup_create() -> Tuple[str, argparse.Namespace]:
     optional.add_argument(
         "--non-blocking",
         action="store_true",
-        help="do not wait for each Globus transfer until it completes.",
+        help="do not wait for each Globus transfer to complete before creating additional archive files. This option will use more intermediate disk-space, but can increase throughput.",
     )
     optional.add_argument(
         "-v", "--verbose", action="store_true", help="increase output verbosity"
@@ -185,7 +191,7 @@ def setup_create() -> Tuple[str, argparse.Namespace]:
 
 def create_database(cache: str, args: argparse.Namespace) -> List[str]:
     # Create new database
-    logger.debug("Creating index database")
+    logger.debug(f"{ts_utc()}:Creating index database")
     if os.path.exists(get_db_filename(cache)):
         # Remove old database
         os.remove(get_db_filename(cache))
@@ -254,6 +260,7 @@ def create_database(cache: str, args: argparse.Namespace) -> List[str]:
             args.keep,
             args.follow_symlinks,
             skip_tars_md5=args.no_tars_md5,
+            non_blocking=args.non_blocking,
         )
     except FileNotFoundError:
         raise Exception("Archive creation failed due to broken symlink.")
@@ -268,6 +275,7 @@ def create_database(cache: str, args: argparse.Namespace) -> List[str]:
             args.keep,
             args.follow_symlinks,
             skip_tars_md5=args.no_tars_md5,
+            non_blocking=args.non_blocking,
         )
 
     # Close database
