Merged
8 changes: 3 additions & 5 deletions .gitignore
@@ -2,6 +2,8 @@
.Rhistory
.Rapp.history

venv/

# Session Data files
.RData

@@ -46,8 +48,4 @@ great_expectations/
**/**/.DS_Store

# MKDocs
scripts/__pycache__

# Working files and folders
mc2_all_attributes_map.csv
cds_vocab/
scripts/__pycache__
93 changes: 86 additions & 7 deletions README.md
@@ -76,12 +76,12 @@ When a new valid value needs to be added to the data model:
CSV in `./modules`. If not:

2. Add the valid value in the "attribute" column of the applicable CSV in
   the appropriate module folder. For example, to add a new tumor type, open
   `tumorType.csv` and add the new term in the attribute column. Fill out
   the remaining columns as completely as possible: the description,
   required, parent, source, and non-preferred terms columns, the ontology
   identifier, URL, NCIt code, and any notes. Please note who added the
   value and the date.

3. Be sure to look up any synonyms and add them to the "non preferred terms"
column. This will make annotating easier in the future.
@@ -103,7 +103,86 @@ Thank you helping us continuously improve the MC2 Center data models! To
contribute, please read our [contributing guidelines] on the docs site.


## Setup and Deployment Instructions

To get started with this project, follow these steps:

### 1. Install Python and Pip
Ensure you have Python installed on your system. If you don’t have it installed:

- Visit [Python's official website](https://www.python.org/) and download the latest stable release.
- On Windows, check the **Add Python to PATH** option during installation.

Next, ensure that `pip` is also installed and configured. You can check by running:

```bash
python --version
pip --version
```

If `pip` is missing, try `python -m ensurepip --upgrade`, or download and run `get-pip.py` as described in [pip's installation guide](https://pip.pypa.io/en/stable/installation/).

### 2. Set Up a Virtual Environment
Create a virtual environment to isolate dependencies for this project.

```bash
python -m venv venv
```

This command creates a virtual environment named `venv` in your project directory.

Activate the virtual environment:

- On macOS/Linux:
```bash
source venv/bin/activate
```
- On Windows:
```cmd
venv\Scripts\activate
```

Once activated, your terminal prompt should show the environment name (`venv`).
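
If the prompt does not change (some shells hide it), the check can also be done from Python itself. The following is a minimal sketch, not part of the project: the function name `in_virtualenv` is made up for illustration, and it relies only on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ when the interpreter runs inside a `venv`.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    state = "active" if in_virtualenv() else "not active"
    print(f"Virtual environment is {state}: {sys.prefix}")
```

Run it with the same `python` you intend to use for `pip install`; if it reports "not active", repeat the activation step above.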

### 3. Install Dependencies
With the virtual environment activated, install the necessary packages:

```bash
pip install mkdocs
pip install mkdocs-material
pip install mkdocs-table-reader-plugin
```
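
To confirm that the three packages landed in the active environment, a short check along these lines may help. It is a sketch only: `installed_versions` is a hypothetical helper, and the distribution names are simply those used in the `pip install` commands above.

```python
from importlib import metadata

# Distribution names taken from the pip commands above.
REQUIRED = ("mkdocs", "mkdocs-material", "mkdocs-table-reader-plugin")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any line reporting `NOT INSTALLED` usually means the install ran outside the virtual environment.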

### 4. Preview the Documentation Site
Run the following command to start a local server and preview the documentation site:

```bash
mkdocs serve
```

Open the displayed URL (usually `http://127.0.0.1:8000/`) in your browser to view the site.

### 5. Submit Built Pages to GitHub
To publish the documentation site on GitHub Pages, follow these steps:

1. Build the static site files:
```bash
mkdocs build
```
This will generate a `site/` directory containing the built HTML files.

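2. Commit the built files and push the `site/` subtree to the `gh-pages` branch. The command below assumes the default branch is `master`; substitute your branch name if it differs:
```bash
git add site/
git commit -m "Build site for deployment"
# Split out the history of site/ and force-push it as the gh-pages branch
git push origin "$(git subtree split --prefix site master)":gh-pages --force
```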

3. Verify that the site is live at your GitHub Pages URL (e.g., `https://<username>.github.io/<repository>`).
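
As an alternative to the manual steps above, MkDocs ships a `gh-deploy` command that builds the site and pushes it to a `gh-pages` branch in one step. The sketch below only assembles and (optionally) runs that command from Python; the helper names `gh_deploy_command` and `deploy` are hypothetical, and whether this fits the repository's deployment setup is an assumption to verify before use.

```python
import subprocess


def gh_deploy_command(remote_branch="gh-pages", force=False):
    """Build the argument list for MkDocs' built-in deploy command."""
    command = ["mkdocs", "gh-deploy", "--remote-branch", remote_branch]
    if force:
        # Force-pushes to the remote branch, like `git push --force`.
        command.append("--force")
    return command


def deploy(dry_run=True, **options):
    """Print the command, and run it only when dry_run is False."""
    command = gh_deploy_command(**options)
    print(" ".join(command))
    if not dry_run:
        # check=True raises CalledProcessError if mkdocs exits non-zero.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    deploy(dry_run=True)
```

The dry run is the default so that nothing is pushed by accident; call `deploy(dry_run=False)` from the repository root once the printed command looks right.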



[Cancer Complexity Knowledge Portal]: https://cancercomplexity.synapse.org/
[open a ticket]: https://github.com/mc2-center/data-models/issues/new?assignees=aditigopalan&labels=bug&projects=&template=bug-report.md&title=%5Bbug%5D+
[Data Curator App (DCA)]: https://dca.app.sagebionetworks.org/
[Contributing guidelines]: https://mc2-center.github.io/data-models/contributing/
Binary file added docs/assets/dca-tutorial/select_dcc.png
Binary file added docs/assets/dca-tutorial/select_folder.png
Binary file added docs/assets/dca-tutorial/select_project.png
Binary file added docs/assets/dca-tutorial/select_template.png
80 changes: 80 additions & 0 deletions docs/home/how-to-use-multiple-templates.md
@@ -0,0 +1,80 @@
# How to Use Multiple Templates

This tutorial will guide you through using the templates provided in the Data Models section with the **Data Curator App (DCA)** to organize and submit your data. The Data Curator App simplifies data formatting and ensures that submissions meet metadata standards.

Access the Data Curator App here: [Data Curator App](https://dca.app.sagebionetworks.org/)


## **Step 1: Select a Data Coordination Center (DCC)**
After launching the app, choose the relevant DCC from the dropdown menu. For example, select **DCA Demo**.

![Step 1: Select a DCC](../assets/dca-tutorial/select_dcc.png)

## **Step 2: Choose a Project**
Select the project you want to work with, such as **FAIR Demo Data**.

![Step 2: Select a Project](../assets/dca-tutorial/select_project.png)


## **Step 3: Select a Folder**
Choose the appropriate folder for your data. For example, select **A Biospecimen**.

![Step 3: Select a Folder](../assets/dca-tutorial/select_folder.png)

## **Step 4: Select a Template**
Choose a template that matches the type of data you're working with. The templates available in this app align with the templates provided in the Data Models section of this documentation. Download the selected template to your computer.

![Step 4: Select a Template](../assets/dca-tutorial/select_template.png)


The template opens as a Google Sheet. Save a copy to your device as a CSV file, then add your metadata to it.

![Step 4: Biospecimen Template](../assets/dca-tutorial/dca_demo_biospecimen_template.png)



## **Step 5: Populate the Template with Your Data**
Open the downloaded CSV template and fill in your data. Use the descriptions and examples provided in the Data Models section to guide you. Below is an example of a completed **Dataset** template:

| **Dataset Name** | **Dataset Alias** | **Dataset Description** | **Dataset Url** | **Dataset Assay** | **Dataset Species** | **Dataset Tumor Type** | **Dataset Tissue** | **Dataset File Formats** | **Dataset Grant Number** | **Dataset Pubmed Id** | **Dataset View** | **DatasetView_id** |
|-------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|------------------|---------------------|-------------------------|-------------------|---------------------------|--------------------------|------------------------|-----------------|---------------------------|
| RNA Sequencing of Lung Cancer Samples 2021 | GSE56789 | This dataset contains RNA sequencing data from 200 lung cancer samples... | [https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56789](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56789) | RNA Sequencing | Homo sapiens | Glioblastoma | Lung | CSV, PDF | CA209971 | Not applicable | Table | DatasetView_12345 |
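
If you prefer to fill the template from a script rather than by hand, the row above can be produced with Python's standard `csv` module. This is a minimal sketch under assumptions: the column names are copied from the example table and may not match the header row of the template you downloaded, and `write_rows` is a hypothetical helper, not part of the Data Curator App.

```python
import csv
import io

# Column names copied from the example table; the real template's header
# row may differ, so treat this list as an assumption.
COLUMNS = [
    "Dataset Name", "Dataset Alias", "Dataset Description", "Dataset Url",
    "Dataset Assay", "Dataset Species", "Dataset Tumor Type", "Dataset Tissue",
    "Dataset File Formats", "Dataset Grant Number", "Dataset Pubmed Id",
    "Dataset View", "DatasetView_id",
]


def write_rows(handle, rows):
    """Write rows under the template header, rejecting unknown column names."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, extrasaction="raise")
    writer.writeheader()
    for row in rows:
        # Columns missing from a row are written as empty cells.
        writer.writerow(row)


# A subset of the values shown in the example table.
example = {
    "Dataset Name": "RNA Sequencing of Lung Cancer Samples 2021",
    "Dataset Alias": "GSE56789",
    "Dataset Assay": "RNA Sequencing",
    "Dataset Species": "Homo sapiens",
    "Dataset Tissue": "Lung",
    "Dataset File Formats": "CSV, PDF",
    "Dataset Grant Number": "CA209971",
}

buffer = io.StringIO()
write_rows(buffer, [example])
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")`. A misspelled column name raises `ValueError` rather than silently creating a new column.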

!!! important
    Please note that to successfully upload this template, you will need to input the following for steps 1-4:

    1. DCC = Cancer Complexity Knowledge Portal - Database
    2. *Your Project name*
    3. *Your folder name*
    4. (optional) Skip to validation



## **Step 6: Upload and Validate Metadata**
Upload the completed CSV template back into the Data Curator App and proceed to validate your metadata. The app will check for errors and inconsistencies.

![Step 6: Upload CSV](../assets/dca-tutorial/validate_submit_metadata.png)
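
Before uploading, a quick local check can catch blank cells early. The sketch below is not the app's validation logic: `find_empty_required` is a hypothetical helper, and the choice of required columns is made up for illustration. The authoritative rules are those of the data model that the app validates against.

```python
import csv
import io

# Hypothetical choice of required columns, for illustration only; the
# authoritative list is defined by the data model, not by this sketch.
REQUIRED_COLUMNS = ("Dataset Name", "Dataset Description", "Dataset Grant Number")


def find_empty_required(handle, required=REQUIRED_COLUMNS):
    """Return (row_number, column) pairs where a required cell is blank."""
    problems = []
    reader = csv.DictReader(handle)
    # Row 1 is the header, so data rows are numbered from 2 as in a spreadsheet.
    for row_number, row in enumerate(reader, start=2):
        for column in required:
            if not (row.get(column) or "").strip():
                problems.append((row_number, column))
    return problems


sample = io.StringIO(
    "Dataset Name,Dataset Description,Dataset Grant Number\n"
    "RNA Sequencing of Lung Cancer Samples 2021,,CA209971\n"
)
print(find_empty_required(sample))  # → [(2, 'Dataset Description')]
```

A column that is missing from the header entirely is reported the same way as a blank cell, once per data row.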

## **Benefits of Using the Data Curator App**
- **Ease of Use:** Templates are pre-structured, making it easy to input your data correctly.
- **Metadata Consistency:** The app validates your data to ensure compliance with predefined models and metadata standards.
- **Efficient Submission:** By using the templates and app, you reduce the risk of errors during submission to data portals like the [**Cancer Complexity Knowledge Portal**](https://cancercomplexity.synapse.org/).

## **Next Steps**

Now that you've learned how to use the Data Curator App with templates, explore other data models to match the specific type of data you're working with. Each template is designed to help you organize and submit structured metadata for various research elements.

| **Data Model** | **When to Use This Template** | **Link** |
|---------------------------|---------------------------------------------------------------|------------------------------------------|
| Dataset Data Model | Use this to describe and organize datasets, including key details like study purpose, data type, and access links. | [Dataset Data Model](../model/dataset.md) |
| Dataset Sharing Plan | Use this to define your plan for sharing datasets, including permissions, licensing, and compliance with policies. | [Dataset Sharing Plan](../model/DataDSP.md) |
| Education Resource Model | Use this for metadata related to learning materials, training datasets, or educational tools shared with your research. | [Education Resource Data Model](../model/education.md) |
| File Data Model | Use this to catalog individual files within a dataset, including file format, processing level, and storage location. | [File Data Model](../model/file.md) |
| Grant Data Model | Use this to document funding sources and grants that support the research, including grant IDs and sponsors. | [Grant Data Model](../model/grant.md) |
| Person Data Model | Use this to capture details about individuals involved in the project, such as researchers, collaborators, or data submitters. | [Person Data Model](../model/person.md) |
| Publication Data Model | Use this to track publications related to the research, including journal articles, white papers, or reports. | [Publication Data Model](../model/publication.md) |
| Study Data Model | Use this to provide an overview of a research study, including objectives, design, and related datasets. | [Study Data Model](../model/study.md) |
| Tool Data Model | Use this to describe tools, software, or resources used for data collection, analysis, or visualization. | [Tool Data Model](../model/tool.md) |

Explore these links to find the right template for your data and continue with your submissions.

80 changes: 80 additions & 0 deletions docs/home/tutorial.md
@@ -0,0 +1,80 @@
# Using Data Models with the Data Curator App

This tutorial will guide you through using the templates provided in the Data Models section with the **Data Curator App (DCA)** to organize and submit your data. The Data Curator App simplifies data formatting and ensures that submissions meet metadata standards.

Access the Data Curator App here: [Data Curator App](https://dca.app.sagebionetworks.org/)


## **Step 1: Select a Data Coordination Center (DCC)**
After launching the app, choose the relevant DCC from the dropdown menu. For example, select **DCA Demo**.

![Step 1: Select a DCC](../assets/dca-tutorial/select_dcc.png)

## **Step 2: Choose a Project**
Select the project you want to work with, such as **FAIR Demo Data**.

![Step 2: Select a Project](../assets/dca-tutorial/select_project.png)


## **Step 3: Select a Folder**
Choose the appropriate folder for your data. For example, select **A Biospecimen**.

![Step 3: Select a Folder](../assets/dca-tutorial/select_folder.png)

## **Step 4: Select a Template**
Choose a template that matches the type of data you're working with. The templates available in this app align with the templates provided in the Data Models section of this documentation. Download the selected template to your computer.

![Step 4: Select a Template](../assets/dca-tutorial/select_template.png)


The template opens as a Google Sheet. Save a copy to your device as a CSV file, then add your metadata to it.

![Step 4: Biospecimen Template](../assets/dca-tutorial/dca_demo_biospecimen_template.png)



## **Step 5: Populate the Template with Your Data**
Open the downloaded CSV template and fill in your data. Use the descriptions and examples provided in the Data Models section to guide you. Below is an example of a completed **Dataset** template:

| **Dataset Name** | **Dataset Alias** | **Dataset Description** | **Dataset Url** | **Dataset Assay** | **Dataset Species** | **Dataset Tumor Type** | **Dataset Tissue** | **Dataset File Formats** | **Dataset Grant Number** | **Dataset Pubmed Id** | **Dataset View** | **DatasetView_id** |
|-------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|------------------|---------------------|-------------------------|-------------------|---------------------------|--------------------------|------------------------|-----------------|---------------------------|
| RNA Sequencing of Lung Cancer Samples 2021 | GSE56789 | This dataset contains RNA sequencing data from 200 lung cancer samples... | [https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56789](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56789) | RNA Sequencing | Homo sapiens | Glioblastoma | Lung | CSV, PDF | CA209971 | Not applicable | Table | DatasetView_12345 |
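
Some cells in the example, such as **Dataset File Formats** (`CSV, PDF`), hold more than one value. The sketch below tidies such cells before upload. It assumes that multiple values are separated by commas, which is inferred from the example row rather than stated by the data model, and both function names are hypothetical.

```python
def split_values(cell, separator=","):
    """Split a multi-valued cell into a list of trimmed, non-empty values."""
    return [part.strip() for part in cell.split(separator) if part.strip()]


def normalize_cell(cell, separator=","):
    """Rejoin the values with ', ' and drop exact duplicates, keeping order."""
    seen = []
    for value in split_values(cell, separator):
        if value not in seen:
            seen.append(value)
    return ", ".join(seen)


print(normalize_cell(" CSV,PDF , CSV,"))  # → CSV, PDF
```

Only exact duplicates are removed; values that differ in case (for example `csv` and `CSV`) are kept as written.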

!!! important
    Please note that to successfully upload this template, you will need to input the following for steps 1-4:

    1. DCC = Cancer Complexity Knowledge Portal - Database
    2. *Your Project name*
    3. *Your folder name*
    4. (optional) Skip to validation



## **Step 6: Upload and Validate Metadata**
Upload the completed CSV template back into the Data Curator App and proceed to validate your metadata. The app will check for errors and inconsistencies.

![Step 6: Upload CSV](../assets/dca-tutorial/validate_submit_metadata.png)
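
A common cause of upload problems is a header row that was changed while the file was being edited. As a rough local check (a sketch only, separate from the app's own validation), the filled file's header can be compared with the header of the template as downloaded. `compare_headers` is a hypothetical helper and the column names in the demo are illustrative.

```python
import csv
import io


def compare_headers(template_handle, filled_handle):
    """Return (missing, unexpected) column names of the filled file."""
    # Read only the first row of each file; an empty file yields [].
    template_header = next(csv.reader(template_handle), [])
    filled_header = next(csv.reader(filled_handle), [])
    missing = [name for name in template_header if name not in filled_header]
    unexpected = [name for name in filled_header if name not in template_header]
    return missing, unexpected


template = io.StringIO("Dataset Name,Dataset Alias,Dataset Url\n")
filled = io.StringIO("Dataset Name,Dataset URL,Dataset Alias\n")
print(compare_headers(template, filled))  # → (['Dataset Url'], ['Dataset URL'])
```

The comparison is case-sensitive and ignores column order, so a renamed column shows up once as missing and once as unexpected.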

## **Benefits of Using the Data Curator App**
- **Ease of Use:** Templates are pre-structured, making it easy to input your data correctly.
- **Metadata Consistency:** The app validates your data to ensure compliance with predefined models and metadata standards.
- **Efficient Submission:** By using the templates and app, you reduce the risk of errors during submission to data portals like the [**Cancer Complexity Knowledge Portal**](https://cancercomplexity.synapse.org/).

## **Next Steps**

Now that you've learned how to use the Data Curator App with templates, explore other data models to match the specific type of data you're working with. Each template is designed to help you organize and submit structured metadata for various research elements.

| **Data Model** | **When to Use This Template** | **Link** |
|---------------------------|---------------------------------------------------------------|------------------------------------------|
| Dataset Data Model | Use this to describe and organize datasets, including key details like study purpose, data type, and access links. | [Dataset Data Model](../model/dataset.md) |
| Dataset Sharing Plan | Use this to define your plan for sharing datasets, including permissions, licensing, and compliance with policies. | [Dataset Sharing Plan](../model/DataDSP.md) |
| Education Resource Model | Use this for metadata related to learning materials, training datasets, or educational tools shared with your research. | [Education Resource Data Model](../model/education.md) |
| File Data Model | Use this to catalog individual files within a dataset, including file format, processing level, and storage location. | [File Data Model](../model/file.md) |
| Grant Data Model | Use this to document funding sources and grants that support the research, including grant IDs and sponsors. | [Grant Data Model](../model/grant.md) |
| Person Data Model | Use this to capture details about individuals involved in the project, such as researchers, collaborators, or data submitters. | [Person Data Model](../model/person.md) |
| Publication Data Model | Use this to track publications related to the research, including journal articles, white papers, or reports. | [Publication Data Model](../model/publication.md) |
| Study Data Model | Use this to provide an overview of a research study, including objectives, design, and related datasets. | [Study Data Model](../model/study.md) |
| Tool Data Model | Use this to describe tools, software, or resources used for data collection, analysis, or visualization. | [Tool Data Model](../model/tool.md) |

Explore these links to find the right template for your data and continue with your submissions.
