Add a new dataset
Ben Bond-Lamberty edited this page Apr 9, 2020 · 1 revision
- Open up the submission in the Google Form
- Make sure they've clicked "Yes" to the first four questions
- Create a branch with the dataset name: generally "d" + date + last name (e.g. `d20200407_WANG`)
- Create a new issue with the dataset name, and give it a "data" label
- Copy (not move) the `/inst/extdata/TEMPLATE/` folder into `inst/extdata/datasets` and rename it to the dataset name
- If using Outlook, the email template is in `/misc/COSORE submission - [questions and] upload link for $DATASET.emltpl`. Open it; the dataset name goes in the subject line
- As you work through the following steps, note any questions in the email. Common questions: complex experimental design (ask them to explain fully), no publications listed (confirm), measurement instrument not given or unclear
- Open the `DESCRIPTION.txt`, `CONTRIBUTORS.txt` and `PORTS.txt` files
- Fill in the `CONTRIBUTORS.txt` entries from the "Contributor(s)" section of the form
- Fill in the first part of `DESCRIPTION.txt` from the entries in the "Site" section of the form. Note exceptions: "Primary species present" goes into `PORTS.txt`, and "Ecosystem age" goes into `ANCILLARY.csv`
- Fill in the last part of `DESCRIPTION.txt` from the entries in the "Publications" section of the form
- Under "Measurement protocols", "Measurement instrument", "Measurement length", and the two timestamp questions go into `DESCRIPTION.txt`
- Fill in various fields in `PORTS.txt` from the rest of the "Measurement protocols" section. Currently Yes/No map to TRUE/FALSE; this should be standardized
- Look at the ancillary data questions and keep them in mind
- Create a new Dropbox file request with the dataset name; copy the upload link to the email
- Make sure email has PI name, email, site name, etc., correctly filled in and send
- Note in GitHub issue "Metadata, upload link sent." or similar language
- Open the RStudio project file. Build the package. The new dataset should be listed at the end of the `list_datasets()` output. Confirm that `read_dataset()` will parse it; fix any errors if not
- Commit with the message "Metadata for #xxx" (fill in the issue number)
- Push and open a PR if you want
- Note that at this point you can use `csr_report_dataset()` to generate a report. This may be handy for checking the map, etc., but I usually do this only after data ingest
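
The branch-naming and template-copy steps above could be scripted roughly as follows. This is only a sketch: the helper names `dataset_name` and `copy_template` are hypothetical (they are not part of the COSORE package), and the `YYYYMMDD` date format, underscore, and upper-cased last name are inferred from the `d20200407_WANG` example rather than from any stated rule.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of COSORE.

# Build a dataset name: "d" + date + "_" + upper-cased last name.
# Assumes the date is given as YYYYMMDD, as in the d20200407_WANG example.
dataset_name() {
  local date="$1" last="$2"
  printf 'd%s_%s\n' "$date" "$(printf '%s' "$last" | tr '[:lower:]' '[:upper:]')"
}

# Copy (not move) the template folder to inst/extdata/datasets/<name>.
# $1 = repository root, $2 = dataset name
copy_template() {
  local repo="$1" name="$2"
  local src="$repo/inst/extdata/TEMPLATE"
  local dest="$repo/inst/extdata/datasets/$name"
  if [ ! -d "$src" ]; then
    echo "Template folder not found: $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    # Never clobber a dataset folder that already exists
    echo "Refusing to overwrite existing folder: $dest" >&2
    return 1
  fi
  mkdir -p "$repo/inst/extdata/datasets"
  cp -R "$src" "$dest"
}

# Example, run from the repository root:
#   name=$(dataset_name 20200407 wang)
#   git checkout -b "$name"
#   copy_template . "$name"
```

Refusing to overwrite an existing destination is a design choice of this sketch, meant to protect a dataset folder whose metadata files have already been filled in.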