Add a dataset
Not necessarily in this order:
-
Make sure that your annotations file exists on the LIS data store.
-
Make sure your genetic marker files (in .gff3.gz and .gff3.gz.tbi format), and raw GWAS and/or QTL data files (in .tsv.gz format), exist on DSCensor, with a canonical_type of mrk/gwas/qtl respectively. ZZBrowse will use these to generate datasets from your raw data.
[To do: add something about the file specifications and how to list files on DSCensor.]
-
If your organism file does not already exist, create it in the organisms subdirectory.
Line 1 - the organism display name
Line 2 - its chromosome lengths, either numeric or in the form name:length
Line 3 - forms of the organism name: Genus species,G.species,Gensp
Line 4 - URL or local file path of the annotations file (from step 1)
Line 5 - chromosome name formats: (1) format for display, (2) full format in annotations file, (3) regex format for validating chromosome names returned by the Genome Context Viewer
Line 6 - base URL for Services API genomic linkage queries
Line 7 - tags for constructing annotations table: strand column name, forward strand code, reverse strand code, start-of-gene column name, end-of-gene column name, URL format for returning gene links, gene id column name (to plug into URL format), gene name column name, chromosome column name, gene description column name
-
In www/config/datasetProperties.csv, add a line for each of your new GWAS and/or QTL datasets.
dataset = the dataset's display name.
chrColumn = which column in the dataset contains the chromosome name. Note that this must begin with "chr" (case-insensitive).
bpColumn = which column contains the SNP position (for GWAS data) or interval center position (for QTL data).
traitCol = which column contains the trait or phenotype.
yAxisColumn = which column contains the p-value (or other significance value or score).
logP = whether to use -log10(yAxisColumn) in the charts (generally TRUE for p-values, FALSE for others).
axisLim = whether to specify hard y-axis limits on the charts (always FALSE for our data).
axisMin = hard bottom of y-axis (or 0 if axisLim = FALSE).
axisMax = hard top of y-axis (or 1 if axisLim = FALSE).
organism = the species to which the dataset refers.
plotAll = whether all data are for the same trait (probably always FALSE for our data).
supportInterval = whether to support interval data, as for QTL data. Set the remaining columns to something meaningful if supportInterval is TRUE:
SIyAxisColumn = which column contains the significance value for interval data ("val" for those we generate on the fly).
SIbpStart = which column contains the start position for interval data.
SIbpEnd = which column contains the end position for interval data.
SIaxisLimBool = whether to specify hard y-axis limits for interval data (always FALSE for our data).
SIaxisMin = hard bottom of interval y-axis (or 0 if SIaxisLimBool = FALSE).
SIaxisMax = hard top of interval y-axis (or 1 if SIaxisLimBool = FALSE).
-
Tell ZZBrowse where to find your data:
buildGWAS.R - add its lis.datastore.info
buildQTL.R - add its lis.datastore.info
server.R - add it to lis.datastore.gwas or lis.datastore.qtl
For GWAS data that live elsewhere than DSCensor: in buildGWAS.R, add any remote GWAS URLs, specify their column names, and do any special handling.
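The column rules in the datasetProperties.csv step can be checked before editing the file by hand. The following is a minimal Python sketch, not part of ZZBrowse. It assumes the CSV columns appear in the order they are listed above and that booleans are written as TRUE/FALSE. The example dataset name, organism, and data column names (chromosome, center, trait, start, end) are hypothetical placeholders.

```python
import csv
import io

# Column names as listed in the datasetProperties.csv step.
# ASSUMPTION: the real file uses this same column order.
COLUMNS = [
    "dataset", "chrColumn", "bpColumn", "traitCol", "yAxisColumn",
    "logP", "axisLim", "axisMin", "axisMax", "organism", "plotAll",
    "supportInterval", "SIyAxisColumn", "SIbpStart", "SIbpEnd",
    "SIaxisLimBool", "SIaxisMin", "SIaxisMax",
]
SI_COLUMNS = ["SIyAxisColumn", "SIbpStart", "SIbpEnd"]


def check_row(row):
    """Return a list of problems found in one datasetProperties row (a dict of strings)."""
    problems = [f"missing column: {c}" for c in COLUMNS if c not in row]
    if problems:
        return problems
    # Without hard limits, the doc says axisMin is 0 and axisMax is 1.
    if row["axisLim"] == "FALSE" and (row["axisMin"], row["axisMax"]) != ("0", "1"):
        problems.append("axisLim is FALSE, so axisMin/axisMax should be 0/1")
    if row["supportInterval"] == "TRUE":
        # Interval columns must be meaningful when supportInterval is TRUE.
        for c in SI_COLUMNS:
            if not row[c]:
                problems.append(f"supportInterval is TRUE but {c} is empty")
        if row["SIaxisLimBool"] == "FALSE" and (row["SIaxisMin"], row["SIaxisMax"]) != ("0", "1"):
            problems.append("SIaxisLimBool is FALSE, so SIaxisMin/SIaxisMax should be 0/1")
    return problems


def to_csv_line(row):
    """Format one row as a CSV line in COLUMNS order."""
    buf = io.StringIO()
    csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n").writerow(row)
    return buf.getvalue().rstrip("\n")


# Hypothetical QTL dataset; every value here is a placeholder.
example = {
    "dataset": "Example QTL (hypothetical)",
    "chrColumn": "chromosome", "bpColumn": "center", "traitCol": "trait",
    "yAxisColumn": "val", "logP": "FALSE",
    "axisLim": "FALSE", "axisMin": "0", "axisMax": "1",
    "organism": "Genus species", "plotAll": "FALSE",
    "supportInterval": "TRUE", "SIyAxisColumn": "val",
    "SIbpStart": "start", "SIbpEnd": "end",
    "SIaxisLimBool": "FALSE", "SIaxisMin": "0", "SIaxisMax": "1",
}

if __name__ == "__main__":
    print(check_row(example))
    print(to_csv_line(example))
```

If check_row returns an empty list, the line printed by to_csv_line can be compared against the existing rows in datasetProperties.csv before it is pasted in.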
- Other notes
To do: buildQTL.R needs some generalization.
Combined GWAS-QTL datasets: these must be built manually for now, though I am working on a script (merge-gwas-and-qtl.R) to do it.
Also to do: investigate eliminating legumeInfo.organisms (unused?) from server.R
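For reference, the seven-line organism file layout described under "Add a dataset" can be read back with a short script to confirm that a new file has the expected shape. This is a rough Python sketch, not code from ZZBrowse. It assumes that entries on lines 2 and 3 are comma-separated and that purely numeric chromosome lengths are named by position; lines 5 and 7 are kept as raw text because their exact delimiters are not specified here. The example file content and URLs are hypothetical.

```python
def parse_chromosome_lengths(text, sep=","):
    """Parse line 2 of an organism file.

    ASSUMPTION: entries are separated by `sep`; each entry is either a bare
    length or has the form name:length.
    """
    lengths = {}
    for position, item in enumerate(text.split(sep), start=1):
        item = item.strip()
        if ":" in item:
            name, length = item.rsplit(":", 1)
        else:
            # Bare length: name the chromosome by its position (assumption).
            name, length = str(position), item
        lengths[name.strip()] = int(length)
    return lengths


def parse_organism_file(text):
    """Split an organism file into its seven documented lines."""
    lines = text.splitlines()
    if len(lines) < 7:
        raise ValueError(f"expected 7 lines, found {len(lines)}")
    return {
        "display_name": lines[0].strip(),
        "chromosome_lengths": parse_chromosome_lengths(lines[1]),
        "name_forms": [s.strip() for s in lines[2].split(",")],
        "annotations": lines[3].strip(),
        "chromosome_formats": lines[4].strip(),  # kept raw
        "linkage_base_url": lines[5].strip(),
        "annotation_tags": lines[6].strip(),     # kept raw
    }


# Hypothetical file content; none of these values come from a real organism file.
example_text = "\n".join([
    "Example organism",
    "Chr01:1000,Chr02:2000",
    "Genus species,G.species,Gensp",
    "https://example.org/annotations.gff3.gz",
    "(chromosome name formats go here)",
    "https://example.org/services",
    "(annotation table tags go here)",
])

if __name__ == "__main__":
    info = parse_organism_file(example_text)
    for key, value in info.items():
        print(key, "=", value)
```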