Establish/document DataLad workflow for updating datasets on rolando with curation changes #17

@yarikoptic

HeudiConv/ReproIn-organized BIDS datasets are not curated. Ideally, labs should fetch them using DataLad as described in the instructions, and then curate and enhance them. As long as DataLad (or git/git-annex directly) is used, we can establish a clean and unambiguous workflow to propagate those changes back to rolando's centralized collection of datasets. Curation entails:

  • removal or renaming of __dups, with potential changes to IntendedFor in fmap .json files (under git) if populated, and changes to _scans.tsv files (under annex)
  • addressing TODOs as documented throughout the dataset
  • populating _events.tsv files
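The IntendedFor part of the first item could be sketched roughly as follows. This is only an illustration, not an existing HeudiConv or DataLad helper: the function name `update_intended_for`, the `renames` mapping, and the file names in the example are all hypothetical, and the sketch always writes IntendedFor back as a list.

```python
import json


def update_intended_for(sidecar, renames):
    """Return a copy of a fieldmap sidecar with IntendedFor adjusted.

    `renames` (hypothetical convention) maps an old relative path either to
    its new path, or to None if that file was removed during curation.
    """
    result = dict(sidecar)
    intended = result.get("IntendedFor")
    if intended is None:
        # IntendedFor was never populated, nothing to adjust
        return result
    # BIDS allows a single string or a list; normalize to a list
    targets = [intended] if isinstance(intended, str) else list(intended)
    updated = []
    for target in targets:
        new_target = renames.get(target, target)
        if new_target is not None:  # None means the file was removed
            updated.append(new_target)
    result["IntendedFor"] = updated
    return result


if __name__ == "__main__":
    # Hypothetical sidecar content and file names, for illustration only
    sidecar = {
        "EchoTime": 0.005,
        "IntendedFor": [
            "func/sub-01_task-rest_bold.nii.gz",
            "func/sub-01_task-rest_bold__dup-01.nii.gz",
        ],
    }
    renames = {"func/sub-01_task-rest_bold__dup-01.nii.gz": None}
    print(json.dumps(update_intended_for(sidecar, renames), indent=2))
```

Since the fmap .json files are under git, the rewritten sidecar can be committed directly; the matching _scans.tsv edits would need the annexed file to be unlocked first.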

To make it work we need to

  • establish git-annex storage for each dataset git repo, available both to DBIC personnel curating BIDS datasets (well, me ATM) and to researchers. We could
    - use GitHub with private repos, automatically creating and populating repositories here, with LFS for storing the annexed _scans.tsv files. That would also give us an issue tracker etc. But even though the repos would be private, I am not certain it is "kosher". Also, mapping rolando users to GitHub accounts might prove to be a pain.
    - "mirror" the entire hierarchy on rolando, giving researchers write access to push changes. E.g. it could be /inbox/BIDS-curated or alike (e.g. just a nearby bare git-annex repo with the needed permissions for each dataset, with a -curated suffix in the name; this is the simplest way ATM and could be done on a case-by-case basis). Cons: no issue tracker. Pros: simple/easy.
    - have a GitLab DBIC instance provided somewhere internally. Pros: very featureful, supports hierarchical organization. Cons: no git-annex support, so we would still need the "mirror".
    - have a gin (https://gin.g-node.org/) DBIC instance. Pros: has all the needed features of GitHub and supports git-annex.
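For the "mirror" option with a nearby bare repo, the steps might look roughly like the sketch below. It only builds and prints the commands rather than running them. The dataset path, the `curated` remote name, and the helper `curated_mirror_commands` are assumptions made for illustration; the git, git-annex, and DataLad invocations are standard ones, but whether `--shared=group` gives the right permissions on rolando would still need checking.

```python
import shlex
from pathlib import PurePosixPath


def curated_mirror_commands(dataset):
    """Build commands for a nearby bare '-curated' git-annex repo.

    `dataset` is the path of an existing dataset on the server
    (hypothetical in the example below).
    """
    path = PurePosixPath(dataset)
    name = path.name + "-curated"
    bare = str(path.with_name(name))
    return [
        # On the server: bare repo writable by members of the owning group
        ["git", "init", "--bare", "--shared=group", bare],
        # Make the bare repo a git-annex repository
        ["git", "-C", bare, "annex", "init", name],
        # In the curator's clone: register the bare repo as a remote
        ["git", "remote", "add", "curated", bare],
        # Push git history and annexed content to it
        ["datalad", "push", "--to", "curated"],
    ]


if __name__ == "__main__":
    # Hypothetical dataset location, for illustration only
    for cmd in curated_mirror_commands("/inbox/BIDS/somelab/somestudy"):
        print(shlex.join(cmd))
```

Printing the commands first keeps the case-by-case setup reviewable before anything is created on the server.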

@andycon WDYT about a GitLab or gin instance for DBIC? In both cases I am not sure how easy it would be to integrate with the user accounts/permissions already out there. Needs "research".
