@@ -3,93 +3,87 @@ This document outlines the procedures conducted to test the zstash blocking
and non-blocking behavior.

Note: As it was intended to test blocking with regard to archive tar-creations
vs Globus transfers, it was convenient to have both source and destination be
the same Globus endpoint. Effectively, we are employing Globus merely to move
tar archive files from one directory to another on the same file system.

The core intent in implementing zstash blocking is to address a potential
"low-disk" condition, where tar-files created to archive source files could
add substantially to the disk load. To avoid disk exhaustion, when "blocking"
(`--non-blocking` is absent on the command line), tar-file creation will
pause to wait for the previous tarfile globus transfer to complete, so that
the local copy can be deleted before the next tar-file is created.
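
The blocking behavior described above can be pictured with a small, self-contained simulation. This is only a sketch, not zstash code: the function name `simulate_blocking_archive`, the background `cp` standing in for a Globus transfer, and the zero-padded tar names are all assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical simulation of "blocking" archiving. No zstash or Globus calls
# are made; a background cp plays the role of the transfer.

simulate_blocking_archive() {
    local src_dir="$1" cache_dir="$2" dst_dir="$3"
    local prev_pid="" prev_tar="" n=0 f tar_path
    mkdir -p "$cache_dir" "$dst_dir"

    for f in "$src_dir"/*; do
        [[ -f "$f" ]] || continue

        # Blocking: wait for the previous "transfer" to finish, then delete
        # its local tar so that at most one tar occupies the cache.
        if [[ -n "$prev_pid" ]]; then
            wait "$prev_pid"
            rm -f "$prev_tar"
        fi

        # Create the next tar (illustrative naming only).
        tar_path=$(printf '%s/%06d.tar' "$cache_dir" "$n")
        tar -cf "$tar_path" -C "$src_dir" "$(basename "$f")"

        # Start the stand-in transfer in the background.
        cp "$tar_path" "$dst_dir"/ &
        prev_pid=$!
        prev_tar="$tar_path"
        n=$((n + 1))
    done

    # Drain the final transfer and remove its local copy.
    if [[ -n "$prev_pid" ]]; then
        wait "$prev_pid"
        rm -f "$prev_tar"
    fi
}
```

In a non-blocking variant of this sketch, the `wait` before each new tar would be skipped, so several tar files could sit in the cache at the same time.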

I. File System Setup
====================

As one may want, or need to re-conduct testing under varied conditions, the
test script `test_zstash_blocking.sh` will establish the following directory structure in the operator's current
working directory:

```
[CWD]/src_data/
# Contains files to be tar-archived.
# One can experiment with different sizes of files to trigger behaviors.

[CWD]/src_data/zstash/
# Default location of tarfiles produced.
# This directory is created automatically by zstash unless
# "--cache" indicates an alternate location.

[CWD]/dst_data/
# Destination for Globus transfer of archives.

[CWD]/tmp_cache/
# [Optional] alternative location for tar-file generation.
```

Note: It may be convenient to create a "hold" directory to store files of
various sizes that can be easily produced by running the supplied scripts.

```
gen_data.sh
gen_data_runner.sh
```

The files to be used for a given test must be moved or copied to the src_data
directory before a test is initiated.

Note: It never hurts to run the supplied script `reset_test.sh` before a test run. This will delete any archives in the src_data/zstash
cache and the receiving dst_data directories, and delete the src_data/zstash
directory itself if it exists. This ensures a clean restart for testing.
The raw data files placed into src_data are not affected.
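
As a rough illustration of the reset step described above, the following sketch clears the cache and destination while leaving the raw source files alone. It is not the contents of `reset_test.sh`; the function name `reset_test_dirs` and the choice to empty `dst_data` completely are assumptions.

```bash
#!/bin/bash
# Hypothetical sketch of a reset step for the directory layout above.

reset_test_dirs() {
    # Refuse to run without a base directory, so that an empty variable can
    # never expand to paths such as "/src_data/zstash".
    local base_dir="${1:?base directory required}"

    # Remove the zstash cache directory together with any tar archives in it.
    rm -rf -- "${base_dir}/src_data/zstash"

    # Empty the destination directory but keep the directory itself.
    if [[ -d "${base_dir}/dst_data" ]]; then
        find "${base_dir}/dst_data" -mindepth 1 -delete
    fi
}
```

Calling `reset_test_dirs "$(pwd)"` from the test's working directory would match the layout shown earlier.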

II. Running the Test Script
===========================

The test script "test_zstash_blocking.sh" accepts one positional parameter:

```
test_zstash_blocking.sh (BLOCKING|NON_BLOCKING)
```

If `BLOCKING` is selected, zstash will run in default mode, waiting for
each tar file to complete transfer before generating another tar file.

If `NON_BLOCKING` is selected, the zstash flag `--non-blocking` is supplied
to the zstash command line, and tar files continue to be created in parallel
to running Globus transfers.
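
One way to map the positional parameter onto the zstash flag is sketched below. The helper name `mode_to_flag` is hypothetical; only the `BLOCKING` and `NON_BLOCKING` values and the `--non-blocking` flag come from the text above.

```bash
#!/bin/bash
# Hypothetical helper: translate the test mode into the extra zstash flag.

mode_to_flag() {
    case "$1" in
        BLOCKING)
            # Default zstash behavior: no extra flag is needed.
            echo ""
            ;;
        NON_BLOCKING)
            echo "--non-blocking"
            ;;
        *)
            echo "ERROR: expected BLOCKING or NON_BLOCKING, got '$1'" >&2
            return 1
            ;;
    esac
}
```

The printed value can then be appended to the `zstash create` command line, and a non-zero return status signals an invalid mode.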

It is suggested that you run the test script with

```
./test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) 2>&1 | tee your_logfile
```

so that the output is shown on screen and saved to the logfile. From a second terminal you can monitor progress with `snapshot.sh`, which will provide a view of both the tarfile cache and the destination
directory for delivered tar files. It is also suggested that you name your
logfile to reflect the date, and whether BLOCKING or not was specified. Example:
```bash
./test_zstash_blocking.sh BLOCKING 2>&1 | tee test_zstash_blocking_20251020.log
```
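
If you prefer to generate the logfile name rather than type it, a small helper along these lines could be used. The name `make_log_name` and the exact filename pattern are assumptions; the only requirement taken from the text is that the name reflects the date and the mode.

```bash
#!/bin/bash
# Hypothetical helper: build a logfile name from the mode and a date stamp.

make_log_name() {
    local mode="$1"
    # Use today's date (YYYYMMDD) unless a stamp is passed explicitly.
    local stamp="${2:-$(date +%Y%m%d)}"
    printf 'test_zstash_blocking_%s_%s.log\n' "$mode" "$stamp"
}
```

For example, `./test_zstash_blocking.sh BLOCKING 2>&1 | tee "$(make_log_name BLOCKING)"` would write a log named for the current day and the selected mode.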

FINAL NOTE: In the zstash code, the tar file `MINSIZE` parameter is taken
to be (int) multiples of 1 GB. During testing, this had been changed to
"multiple of 100K" for rapid testing. It may be useful to expose this as
a command line parameter for debugging purposes.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

@@ -1,62 +1,115 @@
#!/bin/bash

if [[ $# -lt 1 ]]; then
    echo "Usage: test_zstash_blocking.sh (BLOCKING|NON_BLOCKING)"
    echo " One of \"BLOCKING\" or \"NON_BLOCKING\" must be supplied as the"
    echo " first parameter."
    # A missing argument is a usage error, so exit non-zero.
    exit 1
fi
# base_dir=`pwd`
# base_dir=`realpath $base_dir`
BASE_DIR="/home/ac.forsyth2/ez/zstash/tests/utils/test_blocking"

NON_BLOCKING=1

if [[ $1 == "BLOCKING" ]]; then
    NON_BLOCKING=0
elif [[ $1 == "NON_BLOCKING" ]]; then
    NON_BLOCKING=1
else
    echo "ERROR: Must supply \"BLOCKING\" or \"NON_BLOCKING\" as 1st argument." >&2
    # An invalid argument is an error, so exit non-zero.
    exit 1
fi


base_dir=`pwd`
base_dir=`realpath $base_dir`


# See if we are running the zstash we THINK we are:
echo "CALLING zstash version"
zstash version
echo ""
# Set up Globus Endpoint UUIDs ################################################

# Selectable Endpoint UUIDs
ACME1_GCSv5_UUID=6edb802e-2083-47f7-8f1c-20950841e46a
LCRC_IMPROV_DTN_UUID=15288284-7006-4041-ba1a-6b52501e49f1
NERSC_HPSS_UUID=9cd89cfd-6d04-11e5-ba46-22000b92c6ec

# 12 piControl ocean monthly files, 49 GB
SRC_DATA=$base_dir/src_data
DST_DATA=$base_dir/dst_data

SRC_UUID=$LCRC_IMPROV_DTN_UUID
DST_UUID=$LCRC_IMPROV_DTN_UUID

# Optional
TMP_CACHE=$base_dir/tmp_cache

mkdir -p $SRC_DATA $DST_DATA $TMP_CACHE
# Test assertion functions ####################################################
check_log_has()
{
    local expected_grep="${1}"
    local log_file="${2}"
    # Fail the test if the expected pattern is absent from the log.
    if ! grep -q -- "${expected_grep}" "${log_file}"; then
        echo "Expected grep '${expected_grep}' not found in ${log_file}. Test failed."
        exit 2
    fi
}

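check_log_does_not_have()
{
    local not_expected_grep="${1}"
    local log_file="${2}"
    # Fail the test if the unwanted pattern appears in the log.
    # Matching lines are printed (no -q) to help with debugging.
    if grep -- "${not_expected_grep}" "${log_file}"; then
        echo "Not-expected grep '${not_expected_grep}' was found in ${log_file}. Test failed."
        exit 2
    fi
}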

# Helper functions ############################################################
make_test_dirs() {
    # 12 piControl ocean monthly files, 49 GB
    SRC_DATA=$BASE_DIR/src_data
    DST_DATA=$BASE_DIR/dst_data

    mkdir -p "$SRC_DATA" "$DST_DATA"

    echo "src_data: $SRC_DATA"
    echo "dst_data: $DST_DATA"
}

generate_test_data() {
    local i=1
    local len=1000000 # in bytes
    local out
    # Writes one file of random bytes per iteration, for i = 1 through 11.
    while [[ $i -lt 12 ]]; do
        out=$SRC_DATA/small_0${i}_1M
        head -c "$len" </dev/urandom >"$out"
        i=$((i+1))
    done
}

snapshot() {
    echo "dst_data:"
    ls -l "$DST_DATA"

    echo ""
    echo "src_data/zstash:"
    ls -l "$SRC_DATA/zstash"
}

remove_test_dirs() {
    echo "Attempting to remove $SRC_DATA/zstash/ and $DST_DATA/*"
    # SRC_DATA
    if [[ -z "$SRC_DATA" ]]; then
        echo "Error: SRC_DATA must be defined to delete its zstash subdirectory."
    else
        rm -rf "$SRC_DATA/zstash/"
    fi
    # DST_DATA
    if [[ -z "$DST_DATA" ]]; then
        echo "Error: DST_DATA must be defined to delete it."
    else
        # The glob must sit outside the quotes; a quoted "*" is taken
        # literally and nothing would be removed.
        rm -f "$DST_DATA"/*
    fi
}

# Run tests ###################################################################

# Make maxsize 1 GB. This will create a new tar after every 1 GB of data.
# (Since individual files are 4 GB, we will get 1 tarfile per datafile.)

if [[ $NON_BLOCKING -eq 1 ]]; then
echo "TEST: NON_BLOCKING:"
zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize 1 --non-blocking $SRC_DATA
else
echo "TEST: BLOCKING:"
zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize 1 $SRC_DATA
# zstash create -v --hpss=globus://$DST_UUID --maxsize 1 --non-blocking --cache $TMP_CACHE $SRC_DATA
fi
MAXSIZE=1 # GB

remove_test_dirs # Start fresh

echo "TEST: NON_BLOCKING"
make_test_dirs
generate_test_data
case_name="zstash_create_non_blocking"
time zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize ${MAXSIZE} --non-blocking $SRC_DATA 2>&1 | tee ${case_name}.log
check_log_does_not_have "A transfer with identical paths has not yet completed" ${case_name}.log
snapshot
remove_test_dirs

# echo "TEST: BLOCKING"
# make_test_dirs
# generate_test_data
# time zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize ${MAXSIZE} $SRC_DATA
# snapshot
# remove_test_dirs

echo "Testing Completed"

echo "Go to https://app.globus.org/activity to confirm Globus transfers completed successfully."
# TODO:
# Currently getting dst_data index and dst_data 000000 still in progress after test completes in ~5 seconds...
exit 0
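
The TODO above notes that transfers can still be in progress when the script finishes. One possible way to narrow that gap is to poll the destination directory before taking the snapshot. The sketch below is an assumption-laden illustration: the helper name `wait_for_file_count` and its parameters are hypothetical, and a file appearing in the directory does not prove that Globus considers the transfer complete, so the Globus activity page remains the authoritative check.

```bash
#!/bin/bash
# Hypothetical helper: wait until a directory holds at least N regular files,
# or give up after a timeout.

wait_for_file_count() {
    local dir="$1" expected="$2" timeout_s="${3:-60}" interval_s="${4:-2}"
    local waited=0 count

    while true; do
        # Count regular files directly inside the directory.
        count=$(find "$dir" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' ')
        if [[ "$count" -ge "$expected" ]]; then
            return 0
        fi
        if [[ "$waited" -ge "$timeout_s" ]]; then
            echo "Timed out after ${waited}s: ${count}/${expected} files in ${dir}" >&2
            return 1
        fi
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done
}
```

A call such as `wait_for_file_count "$DST_DATA" 2 120` placed before `snapshot` would wait up to two minutes for two files to arrive; the count of two is only an example.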