HTAN Data Portal

This repo contains the code for the Human Tumor Atlas Network (HTAN) Data Portal.

Framework

This is a Next.js project bootstrapped with create-next-app.

Backend

All data comes from Synapse. A Python script generates a JSON file that contains all the metadata. There is currently no backend: the site is fully static, so all filtering happens on the frontend.
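To illustrate what "all filtering happens on the frontend" means in practice, here is a minimal sketch of in-memory filtering over records loaded once from a static JSON file. It is written in Python only for consistency with the data scripts; the portal's actual filtering code is part of the Next.js frontend and is not shown in this README. The record fields (id, center, assay) and the filter_records helper are hypothetical, chosen for illustration.

```python
import json

# Hypothetical records: the real JSON schema is not described in this README.
SAMPLE_JSON = """
[
  {"id": "file-1", "center": "Center A", "assay": "scRNA-seq"},
  {"id": "file-2", "center": "Center B", "assay": "Imaging"},
  {"id": "file-3", "center": "Center A", "assay": "Imaging"}
]
"""


def filter_records(records, **criteria):
    """Return the records whose fields equal every given criterion."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# The whole dataset is loaded once, like a static asset, and then every
# query is answered from memory without contacting a server.
records = json.loads(SAMPLE_JSON)
center_a_imaging = filter_records(records, center="Center A", assay="Imaging")
print([record["id"] for record in center_a_imaging])
```

The trade-off of this design is that the client must download the full metadata file up front, in exchange for not having to run or maintain a query service.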

Data Updates

Category	Update Event	Process
Publication	HTAN center has new publication	Update Publications JSON. Alex has script to generate this
Tool	HTAN center has new tool	Update Tools JSON. Add entry manually
Data	New major or point data release	Follow steps in Update Data Files section. Dar'ya manages BigQuery tables
Data Access	CRDC-GC releases level1-2 data	Follow steps in Update Data Files section (particularly crdcgc_drs mapping update)
Viewers	CellXGene releases single cell data	Update cellxgene viewers JSON. Add entry manually
Viewers	New Minerva Viewers generated	Adam generates the Minerva viewers. Follow steps in Update Data Files section

Update Data Files

Update release information

Only certain metadata rows and data files on Synapse are released. We track this information in Google BigQuery. To get the latest dump, run these commands (requires access to the htan-dcc Google Cloud project):

bq extract --destination_format CSV released.entities_v7_0 gs://htan-release-files/entities_v7_0.csv
bq extract --destination_format CSV released.metadata_v7_0 gs://htan-release-files/metadata_v7_0.csv
bq extract --destination_format NEWLINE_DELIMITED_JSON released.cds_drs_mapping_V2 gs://htan-release-files/crdcgc_drs_mapping.json
gsutil cp gs://htan-release-files/entities_v7_0.csv data/entities_v7_0.csv
gsutil cp gs://htan-release-files/metadata_v7_0.csv data/metadata_v7_0.csv
gsutil cp gs://htan-release-files/crdcgc_drs_mapping.json packages/data-portal-commons/src/assets/crdcgc_drs_mapping.json
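
After copying the files, a quick sanity check can catch an empty or truncated export before the later steps run. The sketch below is not part of the repo; count_csv_rows and count_ndjson_records are hypothetical helpers, and it assumes the CSV exports include a header row. It only counts rows and confirms that each line of the newline-delimited JSON parses.

```python
import csv
import json
from pathlib import Path


def count_csv_rows(path):
    """Count data rows in a CSV file, excluding the header line."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row (assumed to be present)
        return sum(1 for _ in reader)


def count_ndjson_records(path):
    """Count records in a newline-delimited JSON file, parsing each line."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                json.loads(line)  # raises ValueError on a malformed line
                count += 1
    return count


if __name__ == "__main__":
    # Paths are the gsutil cp destinations used in the commands above.
    for name in ("data/entities_v7_0.csv", "data/metadata_v7_0.csv"):
        path = Path(name)
        print(name, count_csv_rows(path) if path.exists() else "missing")
    mapping = Path("packages/data-portal-commons/src/assets/crdcgc_drs_mapping.json")
    print(mapping, count_ndjson_records(mapping) if mapping.exists() else "missing")
```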

Pull files from Synapse and Process for ingestion

# Run the script that pulls all the HTAN metadata
# It outputs a JSON in public/syn_data.json and a JSON with links to metadata in data/syn_metadata.json
python get_syn_data.py
cd ..
# Do additional data transformations and generate the processed JSON
yarn run updateData

Export to bucket

At the moment metadata is hosted on S3 for production. To update it:

1. Gzip all the files in the metadata directory (note that these files are not stored in the repo).
2. Remove the ".gz" extension from the gzipped files so they keep their original .csv names.
3. Upload the files to the S3 bucket "htanfiles/metadata" (part of schultz AWS org).
4. Set two metadata headers on each file: Content-Encoding=gzip and Content-Type=text/csv.

Or run steps 1-4 as commands:

MY_AWS_PROFILE=203403084713
MY_AWS_USERNAME=htan_service_account
yarn gzipMetadata
saml2aws login --force --session-duration=28800 -a ${MY_AWS_PROFILE} --username=${MY_AWS_USERNAME}
aws s3 cp metadata_gzip s3://sc-203403084713-pp-5kti2c6hsoc5c-bucket-qarb8wed4umr/metadata --recursive --profile=${MY_AWS_PROFILE} --content-encoding gzip --content-type=text/csv
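
For readers who want to see what the gzip and rename steps amount to, here is a minimal Python sketch. It is an illustration under stated assumptions, not the repo's gzipMetadata script, whose implementation is not shown in this README. The gzip_directory helper is hypothetical; the directory names metadata and metadata_gzip are taken from the steps and the aws s3 cp command above.

```python
import gzip
import shutil
from pathlib import Path


def gzip_directory(src_dir, dest_dir):
    """Gzip every file in src_dir into dest_dir, keeping the original names."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for source_file in sorted(src.iterdir()):
        if not source_file.is_file():
            continue
        # Same file name with no ".gz" suffix, so the object is still
        # requested as a .csv file after upload.
        target = dest / source_file.name
        with open(source_file, "rb") as raw, gzip.open(target, "wb") as packed:
            shutil.copyfileobj(raw, packed)
        written.append(target)
    return written


if __name__ == "__main__":
    if Path("metadata").is_dir():
        for path in gzip_directory("metadata", "metadata_gzip"):
            print(path)
```

Because the uploaded files keep the .csv name while holding gzipped bytes, the Content-Encoding=gzip header is what tells the browser to decompress them transparently.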

Testing

There are currently no automated tests other than building the project, so be careful when merging to master.

Getting Started

First, make sure you have the latest processed JSON file:

yarn gunzip

Run the development server:

npm run dev
# or
yarn dev

Open http://localhost:3000 with your browser to see the result.

You can start editing any page. The page auto-updates as you edit the file.

Debugging processSynapseJSON

Add debugger; somewhere in the code. Then run:

node --openssl-legacy-provider ./node_modules/.bin/ncc build --source-map --no-source-map-register data/processSynapseJSON.ts

Followed by:

node --inspect-brk dist/index.js

Now you can attach a debugger to it, e.g. from VS Code.

Learn More about Next.js

To learn more about Next.js, take a look at the following resources:

Next.js Documentation - learn about Next.js features and API.
Learn Next.js - an interactive Next.js tutorial.

Deployment

The app is deployed using Vercel (formerly the ZEIT Now Platform), from the creators of Next.js.
