This project is actively under development. Reach out to [email protected] for questions and for credentials to run this code.
FAIReSheets generates the FAIRe eDNA Data Template in Google Sheets. The FAIRe template is a collaborative effort in the eDNA research field to standardize its complicated data and metadata. FAIReSheets replicates the template creation from the FAIRe-ator Repository from Dr. Miwa Takahashi and Dr. Stephen Formel, except FAIReSheets outputs the template to Google Sheets rather than a Microsoft Excel spreadsheet.
TLDR: Email [email protected] to be added to the user list and receive the link to the credentials file, create a blank Google Sheet, configure the .env
file with your Google Sheet ID and the Git Gist URL, specify your parameters in config.yaml
, run FAIReSheets and follow the authentication workflow.
python run.py
Before using FAIReSheets, you'll need to request access. This only needs to happen once:
- Email [email protected]
- Include your Google account email in the request (the one you'll use to access Google Sheets)
- You'll receive an email with a Gist URL to add to your
.env
file and confirmation that you've been added to the user list.
This project requires Python and a few additional Python libraries. We strongly recommend using the Anaconda method for installation as it handles dependencies automatically and avoids many common issues.
First, install Git to download the project:
- Download and install Git
- During installation, select "Git from the command line and also from 3rd-party software"
- We recommend using the "bundled OpenSSH" option during installation
Then, download the project code:
git clone https://github.com/baydenwillms/FAIReSheets.git
cd FAIReSheets
Now, let's setup the Python dependencies using one of the two options.
Anaconda is a scientific Python distribution that handles package dependencies and environment management automatically, helping users avoid common installation issues:
-
- IMPORTANT: During installation, check the box that says "Add Anaconda to PATH"
- This ensures Anaconda commands work from any directory
- If you don't check this box, you'll be limited to using only the Anaconda Prompt
-
After installation, you have two options for running commands:
Option A: Using Command Prompt (Windows)
- Open Command Prompt (not PowerShell)
- Navigate to the FAIReSheets directory
- Run the following commands:
conda init cmd.exe
- IMPORTANT: Close and reopen Command Prompt after running conda init
- Then run:
conda env create -f environment.yml conda activate fairesheets
Option B: Using Anaconda Prompt (Windows)
- Open Anaconda Prompt
- Navigate to the FAIReSheets directory
- Run:
conda env create -f environment.yml conda activate fairesheets
If you prefer not to use Anaconda, you can use pip (Python's package installer):
-
- IMPORTANT: During installation, check the box that says "Add Python to PATH"
- This ensures Python and pip commands work from any directory
- Pip will not work if Python is not added to PATH
-
Open Command Prompt (Windows) or Terminal (Mac/Linux)
-
Navigate to the FAIReSheets directory
-
Install the required packages:
pip install -r requirements.txt
If you see errors, try:
python -m pip install -r requirements.txt
- Create a new, empty Google Sheet in your account
- Copy the Spreadsheet ID from the URL:
- The ID is the long string between
/d/
and/edit
in the URL - For example:
https://docs.google.com/spreadsheets/d/1ABC123XYZ/edit
- The ID would be
1ABC123XYZ
- The ID is the long string between
.env
files store sensitive variables necessary for running the script. If these variables are properly stored in the .env
file, they will not be visible on GitHub if you push your local repository publically. Run the script once and it will create a template .env
file:
python run.py
Then edit the .env
file to add:
- Your Google Sheet ID
- The Gist URL provided to you by email
The .env
file should look like:
SPREADSHEET_ID=your_spreadsheet_id_here
GIST_URL=https://gist.githubusercontent.com/user/hash/raw/file.json
The GIST_URL will be sent to you when you've been granted access to FAIReSheets (see first section).
Edit the config.yaml
file to set your run parameters. Here you can specify things like project name, assays used, metabarcoding vs targeted approach, etc.
Refer to the config_TEMPLATE.yaml
for information on all the available options for input parameters. Just remember, config.yaml
is the file actually used by the code.
Before running FAIReSheets, configure the config.yaml
file given your desired parameters. Please refer to the config_TEMPLATE.yaml
for the available parameter options. These parameters determine the structure of your generated FAIRe template. You can also specify additional terms to add if you have relevant fields in your data that are not included in the FAIRe template. Just remember, config_TEMPLATE.yaml
is just for you to know all the parameter options, and config.yaml
is what the code actually uses.
Run FAIReSheets from the root project directory using:
python run.py
Or alternatively you can run the run.py
script in your IDE using the Play button. If you are missing things like a .env
file, authentication credentials, or your spreadsheet ID to edit, the script will prompt you to add those, and/or create a sample .env
file for you to edit.
Please note that you must run this on your own computer, and not on a virtual machine or remote server. This code opens a login page in your web browser, where you sign in to your Google Drive account. If this code is run on a virtual machine, there is no available browser to open that login page.
If you want to generate a new FAIRe template after you've already created one, you have two options:
-
Option 1: Restore the Google Sheet to blank
- Open your Google Sheet
- Click on "File" > "Version history" > "See version history"
- Find the version from before you ran FAIReSheets (when the sheet was blank)
- Click on that version and select "Restore this version"
- Run FAIReSheets again with the same spreadsheet ID
-
Option 2: Create a new Google Sheet
- Create a new, empty Google Sheet
- Copy the new Spreadsheet ID from the URL
- Update the
SPREADSHEET_ID
in your.env
file with the new ID - Run FAIReSheets again
When you run FAIReSheets for the first time, the following will happen:
- The tool will download the authentication credentials from the Git Gist URL sent to you via email
- A browser window will open asking you to sign in with your Google account
- You'll be asked to grant permission to FAIReSheets to access your Google Sheets
- After granting permission, the tool will save a token for future use
- Future runs won't require reauthentication
If you see a message saying "Google hasn't verified this app", click "Advanced" and then "Go to FAIReSheets (unsafe)" to proceed. This is normal for specialized tools that haven't gone through Google's verification process.
IMPORTANT: You must run this on your own computer, and NOT on a virtual machine or remote server. This code opens a login page in your web browser, where you sign in to your Google Drive account. If this code is run on a virtual machine, there is no available browser to open that login page.
If you're using MobaXterm or similar tools that create virtual Linux environments:
- These tools open in a fake Linux home directory (
/home/<username>
) that doesn't map to a real Windows folder - Instead, navigate to a real Windows path like
/drives/c/Users/<username>/Documents
before cloning - Avoid cloning into
/home/<username>
since it won't show up in File Explorer
-
Problem:
conda
command not recognized in Command Prompt- Solution: Make sure you checked "Add Anaconda to PATH" during installation
- If you didn't, reinstall Anaconda and check this option
-
Problem:
conda activate
fails with "conda init" message- Solution: Run
conda init cmd.exe
in Command Prompt, then close and reopen Command Prompt
- Solution: Run
-
Problem: Python version mismatch (e.g., you installed Anaconda with Python 3.12 but environment.yml requires 3.9)
- Solution: This is okay! Conda will create an environment with Python 3.9 as specified in the environment.yml file
- Problem:
git
command not recognized- Solution: Make sure Git is installed system-wide from git-scm.com
- During installation, select "Git from the command line and also from 3rd-party software"
- I highly recommend using Visual Studio Code, it also has a UI for using git
- Problem: Can't edit the Google Sheet
- Solution: Make sure you're using the Google account email you provided when requesting access, and that you checked the boxes to allow FAIReSheets to edit Google Sheets in your Google Drive.
- Problem: Errors when running FAIReSheets
- Solution: Make sure the Google Sheet that FAIReSheets is editing is empty. You can use Google Drive's built in Restore History button before running FAIReSheets again, or, make a new Google Sheet and replace the Spreadsheet ID in the
.env
file.
- Solution: Make sure the Google Sheet that FAIReSheets is editing is empty. You can use Google Drive's built in Restore History button before running FAIReSheets again, or, make a new Google Sheet and replace the Spreadsheet ID in the
To automatically track modifications in your Google Sheet, you can use the following Apps Script. This script updates the modification timestamp and user email in the README sheet whenever an edit is made to any sheet except "README" and "Drop-down values".
-
Open your Google Sheet.
-
Click on
Extensions
in the menu, then selectApps Script
. -
Delete any code in the script editor and copy-paste the following code:
function onEdit(e) { var sheet = e.source.getActiveSheet(); var sheetName = sheet.getName(); // Skip README and Drop-down values sheets if (sheetName === "README" || sheetName === "Drop-down values") { return; } var readmeSheet = e.source.getSheetByName("README"); if (!readmeSheet) { return; } // Find the Modification Timestamp section var data = readmeSheet.getDataRange().getValues(); var timestampRowStart = -1; for (var i = 0; i < data.length; i++) { if (data[i][0] === "Modification Timestamp:") { timestampRowStart = i + 2; // +2 to skip the header row break; } } if (timestampRowStart === -1) { return; } // Find the row for the current sheet var sheetRow = -1; for (var i = timestampRowStart; i < data.length; i++) { if (data[i][0] === sheetName) { sheetRow = i + 1; // +1 because arrays are 0-indexed but sheets are 1-indexed break; } } if (sheetRow === -1) { return; } // Get current time in ISO format var now = new Date(); var timestamp = now.toISOString(); // This gives format like "2025-01-29T13:49:09.123Z" // Update the timestamp and email readmeSheet.getRange(sheetRow, 2).setValue(timestamp); readmeSheet.getRange(sheetRow, 3).setValue(Session.getActiveUser().getEmail()); }
-
Click the disk icon or
File > Save
to save the script. -
Close the Apps Script editor.
Now, whenever you make an edit to any sheet (except "README" and "Drop-down values"), the modification timestamp and your email will be automatically updated in the README sheet.
This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an 'as is' basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.