This is a Python app designed for finding out whether files on a drive have already been ingested.
- Go to the Releases page
- Under the latest release, click on the `da-holding-verification-windows-64.zip` link to download the app
  - It should open up a "Save" prompt; save it to your computer
- Locate the `.zip` file in your file explorer and extract the contents from it
  - You should now see a folder named `holding_verification`
- Copy the `checksums_of_files_in_dri.db` database file to the folder
- Go inside the `holding_verification` folder and double-click the `holding_verification.exe` file to start the app
- Click here for guidance on how to use the app
This app consists of 3 files:
The first is a script which, using the environment variables CHECKSUM_DB_NAME, CHECKSUM_TABLE_NAME and CSV_FILE_WITH_CHECKSUMS:
- Takes a CSV with the headings:
  - FILEREF
  - FIXITYVALUE
  - ALGORITHMNAME
- Creates an SQLite Table
- Converts each CSV row into an SQLite row
- Creates an index with the fixity value
This script is only necessary if you only have the CSV version of the DB. Otherwise, skip to `holding_verification.py` and use the existing DB, or generate a new DB with the headings listed in the first step above.
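The CSV-to-SQLite steps above can be sketched as follows. This is a minimal illustration, not the project's actual script: the function name `csv_to_sqlite`, the `TEXT` column types and the index name are assumptions, and only the three CSV headings come from the description above.

```python
import csv
import sqlite3


def csv_to_sqlite(csv_path, db_path, table_name):
    """Load FILEREF/FIXITYVALUE/ALGORITHMNAME rows from a CSV into an SQLite table.

    Hypothetical helper for illustration; the real script may differ.
    """
    # Table names cannot be bound as SQL parameters, so only accept a plain identifier.
    if not table_name.isidentifier():
        raise ValueError(f"Unsafe table name: {table_name!r}")

    connection = sqlite3.connect(db_path)
    try:
        # Column types are an assumption; the source only names the headings.
        connection.execute(
            f'CREATE TABLE IF NOT EXISTS "{table_name}" '
            "(FILEREF TEXT, FIXITYVALUE TEXT, ALGORITHMNAME TEXT)"
        )
        with open(csv_path, newline="", encoding="utf-8") as csv_file:
            rows = (
                (row["FILEREF"], row["FIXITYVALUE"], row["ALGORITHMNAME"])
                for row in csv.DictReader(csv_file)
            )
            # Each CSV row becomes one SQLite row.
            connection.executemany(
                f'INSERT INTO "{table_name}" VALUES (?, ?, ?)', rows
            )
        # Index the fixity value so checksum lookups do not scan the whole table.
        connection.execute(
            f'CREATE INDEX IF NOT EXISTS "{table_name}_fixity_idx" '
            f'ON "{table_name}" (FIXITYVALUE)'
        )
        connection.commit()
    finally:
        connection.close()
```

In a setup like the one described, the three arguments would presumably be read from CSV_FILE_WITH_CHECKSUMS, CHECKSUM_DB_NAME and CHECKSUM_TABLE_NAME via `os.environ`.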
The second is `holding_verification.py` which, using the environment variables CHECKSUM_DB_NAME, CHECKSUM_TABLE_NAME and CSV_FILE_WITH_CHECKSUMS:
- Allows you to select 1 or more files or a folder, via GUI or CLI
- Opens each file and generates a checksum hash (fixity value) on the content
- Looks for that checksum hash in the DB
  - If not found, it will generate a checksum hash using another algorithm and look again, repeating this once more if the checksum is still not found
    - At most, it will generate 3 hashes (SHA256, SHA1 and MD5) and then give up
    - If a file was found, the next file's checksum hash will be generated using the checksum hash algorithm of the file that preceded it
  - If found, it will return the file reference(s), fixity value and algorithm name associated with the checksum in the DB
- It will write the information obtained from the DB, as well as the path, file size and a `True` or `False` value for whether the checksum was found
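The lookup-with-fallback behaviour described above might look roughly like the sketch below. The function names, the result tuple layout and the assumption that the DB stores lowercase hexadecimal digests are all illustrative guesses, not taken from `holding_verification.py`; the table name is assumed to be trusted.

```python
import hashlib
import sqlite3

# Order in which algorithms are tried when nothing has matched yet.
ALGORITHMS = ["sha256", "sha1", "md5"]


def file_checksum(path, algorithm, chunk_size=1024 * 1024):
    """Hash a file's content in chunks so large files are not loaded into memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as file:
        for chunk in iter(lambda: file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_in_db(connection, table_name, path, preferred="sha256"):
    """Try the preferred algorithm first, then the others; give up after three."""
    order = [preferred] + [name for name in ALGORITHMS if name != preferred]
    for algorithm in order:
        checksum = file_checksum(path, algorithm)
        rows = connection.execute(
            f'SELECT FILEREF, FIXITYVALUE, ALGORITHMNAME FROM "{table_name}" '
            "WHERE FIXITYVALUE = ?",
            (checksum,),
        ).fetchall()
        if rows:
            return rows, algorithm
    return [], None


def verify_files(connection, table_name, paths):
    """Return (path, found, rows) per file, reusing the last algorithm that matched."""
    preferred = ALGORITHMS[0]
    results = []
    for path in paths:
        rows, algorithm = find_in_db(connection, table_name, path, preferred)
        if algorithm is not None:
            preferred = algorithm  # the next file starts with this algorithm
        results.append((path, bool(rows), rows))
    return results
```

Starting each file with the algorithm that last matched means that, when a drive's files were all ingested with the same algorithm, most files need only one hash.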
The third file (called by `holding_verification.py`) allows you to select 1 or more files or a folder, via a GUI or a Command Line Interface (CLI).
- It will ask the user if they would like to use the GUI or CLI to select the file(s)/folder
- If the user:
  - Presses the "Enter" button, they will choose the GUI; a GUI appears with an option to:
    - Select file(s) - button
      - Clicking the button will open a dialog box where you can select 1 or more files
      - Once selected, the window will close
    - Select folder - button
      - Clicking the button will open a dialog box where you can select a folder
      - Once selected, the window will close
    - Drag and Drop either file(s) or a folder (but not both types) - empty box
      - Drag and drop 1 or more files or a single folder from your file explorer onto the box area
      - What you've dropped should be displayed on the box
      - Note: There is no ability to remove particular items; if you would like to remove the items dropped, you'd have to drop new items or close the app
      - Once you're happy with your selection, press the "confirm" button to confirm that these files/folder should be validated
  - Types `c` and then presses "Enter", the CLI option will be selected, which will then give an option to:
    - Type `f` for a single file (only one file supported at the moment) or `d` for a single directory
    - Once the user types either `f` or `d` and presses "Enter", they will be asked to add the full path to the file/folder
- What you've selected will appear in the command line window and the processing of the file(s) will start
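As a rough illustration of the CLI branch described above, the sketch below turns the `f`/`d` answer and a path into a list of files to process. The function names and prompt wording are hypothetical, and walking sub-folders recursively is an assumption; the source does not say how a selected folder is expanded.

```python
from pathlib import Path


def resolve_selection(kind, raw_path):
    """Turn a CLI answer ('f' or 'd') plus a path into a list of files to process."""
    # Strip whitespace and surrounding quotes, which often come along with pasted paths.
    path = Path(raw_path.strip().strip('"')).expanduser()
    if kind == "f":
        if not path.is_file():
            raise ValueError(f"Not a file: {path}")
        return [path]
    if kind == "d":
        if not path.is_dir():
            raise ValueError(f"Not a directory: {path}")
        # Assumption: every file under the folder, including sub-folders, is processed.
        return sorted(p for p in path.rglob("*") if p.is_file())
    raise ValueError("Type 'f' for a file or 'd' for a directory")


def prompt_for_selection(ask=input):
    """Ask the two CLI questions; `ask` can be swapped out for testing."""
    kind = ask("Type 'f' for a single file or 'd' for a single directory: ").strip().lower()
    raw_path = ask("Full path to the file/folder: ")
    return resolve_selection(kind, raw_path)
```

Keeping the path handling in `resolve_selection`, separate from the prompts, makes it possible to exercise the logic without typing into a terminal.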
The tests are located in `test/test_holding_verification.py`. In order to run the tests, run `python3 -m unittest` or `python -m unittest` from the root folder. If running from PyCharm, you might have to change the "Working Directory" to the root folder, as it might default to the `test` folder.
- This project needs to be run with Python 3.12 or higher
- A matched checksum doesn't necessarily mean that the ingested file had the same name
- Files that encountered errors are printed at the end but will look normal in the CSV
- The holding_verification.py is transformed into a .exe via GitHub Actions (check build.yml file) and added to the releases page so that there is no need to install Python on Windows