Draft
Commits
38 commits
8f4469e
Initial sketch for the mriqc/fmriprep singularity based workflow
yarikoptic Jul 18, 2019
de0442f
DOC: added a few comments
yarikoptic Jul 18, 2019
a18521d
DOC: note on execution of mriqc
yarikoptic Jul 18, 2019
92388ed
ENH: make possible to quickly switch from reproman to datalad + some …
yarikoptic Aug 6, 2019
1588f76
ENH: text2git for mriqc output
yarikoptic Aug 6, 2019
5b95ded
BF: make get_participants_ids work + set -x for debugging
yarikoptic Aug 15, 2019
79c3a99
another perspective one on kwyk
yarikoptic Aug 21, 2019
57b6b6c
ENH/RF: shellcheck, group common bids-app logic into run_bids_app, ex…
yarikoptic Oct 10, 2019
fedb8b0
moved datalad install containers before working in containers subdir
chaselgrove Dec 12, 2019
dc5dc0b
Merge branch 'master' into doc-usecases
chaselgrove Dec 17, 2019
76450af
RF: compose proper call for fmriprep, inline querying participant labels
yarikoptic Dec 19, 2019
850b36f
RF: reordered commands so settings come first and then all the actions
yarikoptic Dec 19, 2019
7110ec6
BF+RF: improve handling of fs license, more TODO comments (seems to w…
yarikoptic Dec 19, 2019
a81e457
ENH: do not create bids app results dataset if directory exists already
yarikoptic Dec 20, 2019
d5e9028
defaulting RM_RESOURCE and RM_SUB to local but allowing overrides
chaselgrove Jan 3, 2020
924980c
Merge remote-tracking branch 'origin/master' into doc-usecases
yarikoptic Jan 8, 2020
d05430d
Merge branch 'doc-usecases' of https://github.com/yarikoptic/ReproNim…
yarikoptic Jan 8, 2020
5ab5a8b
clean the containers repo after freezing versions
chaselgrove Jan 14, 2020
7ae46ab
Revert "clean the containers repo after freezing versions"
chaselgrove Jan 28, 2020
a4af6ba
Merge branch 'master' into doc-usecases
chaselgrove Apr 27, 2020
5a2f0f3
Fix runscript regexp to work on Mac OS
chaselgrove Apr 29, 2020
b70144e
ENH: always set -x, add env vars to not require patching for FS licen…
yarikoptic May 21, 2020
82df248
Comments on which images must be prepropulated in containers repo and…
yarikoptic May 22, 2020
6009dd0
ENH: Script for reproducible rerun of the demo script
yarikoptic May 22, 2020
295d9d2
kyle1-ps4 setup
yarikoptic May 25, 2020
fc91ec2
Merge remote-tracking branch 'origin/master' into doc-usecases
yarikoptic May 25, 2020
8905cc4
BF: Fix failure on unset BIDS_APP with set -u
chaselgrove May 26, 2020
cffa3a3
BF: Add mac workaround in get_participant_ids
chaselgrove May 27, 2020
85a41dc
ENH: Update parallel install message for mac users
chaselgrove May 27, 2020
baab991
BF: export PS1 within -reproduce.sh
yarikoptic May 27, 2020
75fa815
ENH: Use temporary HOME, cp .gitconfig and .freesurfer-license, confi…
yarikoptic May 27, 2020
7920234
reproman-master setup for -reproduce and min datalad 0.12.7
yarikoptic May 27, 2020
29500a7
master reproman now has [datalad] installation target
yarikoptic May 28, 2020
3bf4e38
Merge remote-tracking branch 'origin/master' (needs datalad 0.13.0rc1…
yarikoptic May 28, 2020
d895cc4
ENH: add containers/licenses into --input, specify data/bids explicit…
yarikoptic May 28, 2020
e28d011
DOC: note that datalad runner group analysis probably does nothing
yarikoptic May 28, 2020
795eed8
ENH: point to subject specific input data
yarikoptic May 28, 2020
52c19fc
Merge remote-tracking branch 'yarik/doc-usecases' into doc-usecases
chaselgrove Jun 1, 2020
201 changes: 201 additions & 0 deletions docs/usecases/bids-fmriprep-workflow-NP.sh
@@ -0,0 +1,201 @@
#!/bin/bash
#emacs: -*- mode: shell-script; c-basic-offset: 4; tab-width: 4; indent-tabs-mode: t -*-
#ex: set sts=4 ts=4 sw=4 noet:
#
# This script is intended to demonstrate a sample workflow on a BIDS
# dataset using mriqc, fmriprep, and a custom analysis pipeline, mimicking the
# steps presented in an fmriprep paper currently under review but using
# DataLad, ReproNim/containers, and ReproMan.
#
# COPYRIGHT: Yaroslav Halchenko 2019
#
# LICENSE: MIT
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# Description
#
# Environment variables
# - RUNNER - datalad or reproman
# - CONTAINERS_REPO - an alternative (could be local) location for containers
# repository
# - INPUT_DATASET_REPO - an alternative (could be local) location for input
# BIDS dataset
#
# Sample invocations
# - Pointing to the existing local clones of input repositories for faster
# "get"
# RUNNER=datalad CONTAINERS_REPO=~/proj/repronim/containers \
# INPUT_DATASET_REPO=$PWD/bids-fmriprep-workflow-NP/ds000003-demo \
# ./bids-fmriprep-workflow-NP.sh bids-fmriprep-workflow-NP/out2

set -eu

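# $STUDY is the variable name used in the paper this workflow mimics.
# Fail with a usage message if no study directory was given.
STUDY="${1:?Usage: $0 STUDY_DIRECTORY}"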

# Create study dataset
datalad create -c text2git "$STUDY"
cd "$STUDY"

#
# Install containers dataset for guaranteed/unambiguous containers versioning
# and datalad containers-run
#
# TODO: specific version, TODO - reference datalad issue

# A local copy, referenced via the CONTAINERS_REPO env var, can be used to
# avoid heavy network traffic while testing locally
datalad install -d . -s "${CONTAINERS_REPO:-///repronim/containers}"

# possibly downgrade versions to match the ones used in the "paper"
# TODO see https://github.com/ReproNim/containers/issues/8 for relevant discussion
# and possibly providing some helper to accomplish that more easily
cd containers
echo -n "\
poldracklab-ds003-example 0.0.3
bids-mriqc 0.15.0
bids-fmriprep 1.4.1
"| while read -r img ver; do
# images live under images/<prefix>/, where <prefix> is the name up to the first hyphen
git config -f .datalad/config --replace-all "datalad.containers.$img.image" "images/${img%%-*}/${img}--${ver}.sing"
done
datalad save -d^ -m "Possibly downgraded containers versions to the ones in the paper" "$PWD/.datalad/config"
cd ..

#
# Install dataset to be analyzed (no data - analysis might run in the cloud or on HPC)
#
# In the original paper the dataset name was used as is and placed at the
# top level. Here, to make this demo easier to apply to other studies
# and to check on other datasets, we install the input dataset under a generic
# "data/bids" path. "data/" will also collect all other derivatives etc.
mkdir data

# For now we will work with a minimized version with only 2 subjects
# datalad install -d . -s ///openneuro/ds000003 data/bids
datalad install -d . -s "${INPUT_DATASET_REPO:-https://github.com/ReproNim/ds000003-demo}" data/bids


#
# Licenses
#

# we will not prepopulate this one
mkdir licenses/
echo freesurfer.txt > licenses/.gitignore

cat > licenses/README.md <<EOF

Freesurfer
----------

Place your FreeSurfer license into freesurfer.txt file in this directory.
Visit https://surfer.nmr.mgh.harvard.edu/registration.html to obtain one if
you don't have it yet - it is free.

EOF
datalad save -m "DOC: licenses/ directory stub" licenses/


#
# Execution.
#
# This is where access to a powerful resource (HPC etc.) would be useful.
# Each of those containerized apps might need custom options added.
#
#

# Define common parameters for the reproman run

# ReproMan orchestrator to be used - determines how data/results are
# transferred and how execution is recorded
# Use `reproman run --list orchestrators` to get an updated list
RM_ORC=datalad-pair-run # ,plain,datalad-pair,datalad-local-run

# Which batch processing system supported by ReproMan will be used
# Use reproman run --list submitters to get an updated list
# RM_SUB=condor,pbs,local

# Which resource to use
# This requires configuring (if not done before) the resource where
# execution will happen. For now we just use smaug below.
# TODO: provide pointers to doc ( ;-) )
# RM_RESOURCE=

#RM_RESOURCE=discovery
#RM_SUB=PBS
#
# Necessary modules to be loaded in that session:
# - singularity/2.4.2
# Necessary installations/upgrades to be done (TODO: contact John)
# - datalad (0.11.6, TODO: release first)
# - datalad-container

RM_RESOURCE=smaug
RM_SUB=condor
Member Author


so for local execution it could be

RM_RESOURCE=localshell
RM_SUB=local
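
One way to make these local values the fallback while still allowing overrides from the environment is sketched below. This is a minimal sketch, not the script's actual code: the function name `set_runner_defaults` is hypothetical, and it assumes a ReproMan resource named `localshell` has already been configured. It would replace the unconditional `RM_RESOURCE=smaug` / `RM_SUB=condor` assignments.

```shell
# Hypothetical helper: keep RM_RESOURCE/RM_SUB if the caller already set them,
# otherwise fall back to the local values suggested above.
set_runner_defaults () {
    : "${RM_RESOURCE:=localshell}"   # assumed name of a configured local resource
    : "${RM_SUB:=local}"
}

set_runner_defaults
echo "resource=$RM_RESOURCE submitter=$RM_SUB"
```

Because `:=` only assigns when the variable is unset or empty, exporting `RM_RESOURCE=smaug RM_SUB=condor` before calling the script would still select the remote setup, and the expansion stays safe under `set -u`.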


# TODO: at the reproman level, allow specifying ORC and SUB for a resource so that
# they need not be given for each invocation. This could be a new (meta) resource such as
# "smaug-condor", which would link the smaug physical resource with those parameters
# TODO: point to the issue in ReproMan

# 1. bids-mriqc -- QA

# Q/TODO: Is there a way to execute/reference the container?
# for now doing manually
datalad create -d . -c text2git data/mriqc

# Default to reproman unless RUNNER was set in the environment
: "${RUNNER:=reproman}"

unknown_runner () {
echo "ERROR: Unknown runner $RUNNER. Known runners: reproman, datalad" >&2
exit 1
}

# Sample run without any parallelization, and doing both levels (participant and group)
RUNNER_ARGS=( --input 'data/bids' --output data/mriqc )
MRIQC_ARGS=( "{inputs}" "{outputs}" participant group )
case "$RUNNER" in
reproman)
reproman run --follow -r "${RM_RESOURCE}" --sub "${RM_SUB}" --orc "${RM_ORC}" \
--jp container=containers/bids-mriqc "${RUNNER_ARGS[@]}" "${MRIQC_ARGS[@]}";;
datalad)
datalad containers-run -n containers/bids-mriqc \
"${RUNNER_ARGS[@]}" "${MRIQC_ARGS[@]}";;
*) unknown_runner;;
esac

exit 0 # done for now

# Ultimately we should be able to parallelize across subjects. Here is a sample
# invocation for subject 02:
# singularity run containers/images/bids/bids-mriqc--0.15.0.sing \
# data/bids/ data/mriqc/ participant --participant_label 02
# At the "group" level there should be no --participant_label option.


reproman run --follow -r "${RM_RESOURCE}" --sub "${RM_SUB}" --orc "${RM_ORC}" \
--bp 'thing=thing-*' \
--input '{p[thing]}' \
sh -c 'cat {p[thing]} {p[thing]} >doubled-{p[thing]}'
Contributor


With the latest push to run-subjobs (ac14277) checked out, try

reproman run --follow -r "${RM_RESOURCE}" --sub "${RM_SUB}" --orc "${RM_ORC}" \
  --jp container=containers/bids-mriqc \
  --bp 'pl=02,13' \
  --input data/bids \
  data/bids data/mriqc participant --participant_label '{p[pl]}'

I was able to get that [*] to successfully run via condor on smaug. As you've already experienced, the management of existing datasets is a bit rough, so you may want to use a fresh dataset.

[*] Or more specifically, this script:

#!/bin/sh
set -eu

# work in a fresh, uniquely named dataset directory
cd "$(mktemp -d --tmpdir=. ds-XXXX)"
datalad create -c text2git .
datalad install -d . ///repronim/containers
datalad install -d . -s https://github.com/ReproNim/ds000003-demo data/bids

mkdir licenses/
echo freesurfer.txt > licenses/.gitignore
cat > licenses/README.md <<EOF

Freesurfer
----------

Place your FreeSurfer license into freesurfer.txt file in this directory.
Visit https://surfer.nmr.mgh.harvard.edu/registration.html to obtain one if
you don't have it yet - it is free.

EOF
datalad save -m "DOC: licenses/ directory stub" licenses/

datalad create -d . data/mriqc

reproman run --resource sm --follow \
         --sub condor --orc datalad-pair-run \
         --jp container=containers/bids-mriqc --bp 'pl=02,13' \
         -i data/bids \
         data/bids data/mriqc participant --participant_label '{p[pl]}'
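
To make the effect of `--bp 'pl=02,13'` concrete: as used above, the batch parameter yields one subjob per listed value, with `{p[pl]}` replaced by that value. The sketch below only imitates that expansion in plain shell to show the resulting mriqc arguments. It does not call ReproMan, and the function name `expand_participants` is made up for illustration.

```shell
# Hypothetical helper: print the mriqc arguments for each participant label
# in a comma-separated list, one line per would-be subjob.
# The body runs in a subshell, so the IFS and set -f changes stay local.
expand_participants () (
    set -f       # no pathname expansion while splitting
    IFS=,        # split the list on commas
    for label in $1; do
        echo "data/bids data/mriqc participant --participant_label $label"
    done
)

expand_participants "02,13"
```

This prints one argument line for label 02 and one for 13, which mirrors the two participant-level jobs that the `reproman run` call above is expected to submit.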


# 2. bids-fmriprep -- preprocessing

# 3. poldracklab-ds003-example -- analysis

# X. Later? visualization etc - used nilearn
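
The version-pinning loop near the top of the script builds each image path from the container name and version using the `${img%%-*}` expansion, which strips everything from the first hyphen onward. A minimal sketch of just that derivation, runnable without DataLad (the function name `image_path` is hypothetical):

```shell
# Hypothetical helper: derive the image path that the pinning loop writes
# into .datalad/config for a given container name and version.
image_path () {
    img=$1
    ver=$2
    # ${img%%-*} drops the longest suffix starting at a hyphen,
    # e.g. "bids-mriqc" -> "bids"
    echo "images/${img%%-*}/${img}--${ver}.sing"
}

image_path bids-mriqc 0.15.0
image_path poldracklab-ds003-example 0.0.3
```

The first call yields `images/bids/bids-mriqc--0.15.0.sing`, the same path (relative to `containers/`) that appears in the commented-out `singularity run` example.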