
"[ERROR ] Remote repository ... is dirty" should maybe be checked earlier? #447

@yarikoptic

Trying to run #438 (as of v0.1.0-519-g1588f76) using #408 (as of v0.1.0-546-gac14277) errors out with

2019-08-07 23:34:04,969 [ERROR  ] Remote repository /home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2 is dirty [orchestrators.py:_assert_clean_repo:673] (OrchestratorError) 
full log
(git)hopa:~/proj/repronim/reproman-master[doc-usecases]docs/usecases
$> ./bids-fmriprep-workflow-NP.sh bids-fmriprep-workflow-NP/out3           
[INFO   ] Creating a new annex repo at /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3 
create(ok): /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3 (dataset)      
[INFO   ] Running procedure cfg_text2git 
[INFO   ] == Command start (output follows) ===== 
[INFO   ] == Command exit (modification check follows) ===== 
[INFO   ] Cloning http://datasets.datalad.org/repronim/containers [1 other candidates] into '/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/containers' 
install(ok): containers (dataset)                                                                               
action summary:
  add (ok: 2)
  install (ok: 1)
  save (ok: 1)
add(ok): .datalad/config (file)
save(ok): containers (dataset)
add(ok): containers (file)
save(ok): . (dataset)
action summary:
  add (ok: 2)
  save (ok: 2)
[INFO   ] Cloning https://github.com/ReproNim/ds000003-demo [1 other candidates] into '/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/bids' 
[INFO   ]   Remote origin not usable by git-annex; setting annex-ignore                                         
[INFO   ] access to 1 dataset sibling s3-PRIVATE not auto-enabled, enable with:
| 		datalad siblings -d "/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/bids" enable -s s3-PRIVATE 
install(ok): data/bids (dataset)
action summary:
  add (ok: 2)
  install (ok: 1)
  save (ok: 1)
add(ok): licenses/.gitignore (file)                                                                             
add(ok): licenses/README.md (file)
save(ok): . (dataset)
action summary:
  add (ok: 2)
  save (ok: 1)
[INFO   ] Creating a new annex repo at /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/mriqc 
create(ok): data/mriqc (dataset)                                                                                
[INFO   ] Running procedure cfg_text2git 
[INFO   ] == Command start (output follows) ===== 
[INFO   ] == Command exit (modification check follows) ===== 
action summary:
  add (ok: 2)
  create (ok: 1)
  save (ok: 1)
2019-08-07 23:33:47,131 [INFO   ] No root directory supplied for smaug; using '/home/yoh/.reproman/run-root' 
[INFO   ] Connecting ... 
ECDSA host key for IP address '129.170.233.9' not in list of known hosts.
[INFO   ] Considering to create a target dataset /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3 at /home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2 of smaug 
[INFO   ] Fetching updates for <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3> 
.: smaug(+) [ssh://smaug/home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2 (git)]                
[INFO   ] Adjusting remote git configuration 
[INFO   ] Considering to create a target dataset /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/containers at /home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2/containers of smaug 
[INFO   ] Fetching updates for <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/containers> 
.: smaug(+) [ssh://smaug/home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2/containers (git)]     
[INFO   ] Adjusting remote git configuration 
[INFO   ] Considering to create a target dataset /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/bids at /home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2/data/bids of smaug 
[INFO   ] Fetching updates for <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/bids> 
.: smaug(+) [ssh://smaug/home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2/data/bids (git)]      
[INFO   ] Adjusting remote git configuration 
[INFO   ] Considering to create a target dataset /home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/mriqc at /home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2/data/mriqc of smaug 
[INFO   ] Fetching updates for <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/mriqc> 
.: smaug(+) [ssh://smaug/home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2/data/mriqc (git)]     
[INFO   ] Adjusting remote git configuration 
[INFO   ] Running post-update hooks in all created siblings 
[INFO   ] Publishing <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/mriqc> to smaug 
[INFO   ] Publishing <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/data/bids> to smaug 
[INFO   ] Publishing <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3/containers> to smaug 
[INFO   ] Publishing <Dataset path=/home/yoh/proj/repronim/reproman-master/docs/usecases/bids-fmriprep-workflow-NP/out3> to smaug 
2019-08-07 23:34:04,969 [ERROR  ] Remote repository /home/yoh/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2 is dirty [orchestrators.py:_assert_clean_repo:673] (OrchestratorError) 

This is apparently due to datalad/datalad#3591, which leaves the super-dataset dirty if a subdataset was created with a -c procedure call.

So there is indeed an uncommitted difference in the state of mriqc even locally, which also shows up remotely, but reproman run complains about it only after "assuring" the same state on the remote end. I wonder whether the complaint should be issued upon analysis of the local space if there is a requirement to be "clean"?
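
To illustrate the idea of checking earlier, here is a minimal sketch of a local cleanliness check that could run before anything is published to the remote. It is not reproman's implementation: the names `LocalRepoDirtyError`, `dirty_entries`, and `assert_clean_local` are hypothetical, and the only external call is `git status --porcelain`, which prints one line per modified or untracked path (including a submodule with new commits, such as data/mriqc here).

```python
import subprocess
from pathlib import Path
from typing import List


class LocalRepoDirtyError(RuntimeError):
    """Hypothetical error type; reproman itself raises OrchestratorError."""


def dirty_entries(porcelain_output: str) -> List[str]:
    """Return the non-empty lines of `git status --porcelain` output.

    An empty list means the working tree is clean.
    """
    return [line for line in porcelain_output.splitlines() if line.strip()]


def assert_clean_local(repo: Path) -> None:
    """Fail fast if the local repository has uncommitted changes.

    Assumption: this would be called before any sibling is created
    or published, so no remote state is touched on failure.
    """
    result = subprocess.run(
        ["git", "-C", str(repo), "status", "--porcelain"],
        check=True,
        capture_output=True,
        text=True,
    )
    entries = dirty_entries(result.stdout)
    if entries:
        raise LocalRepoDirtyError(
            f"Local repository {repo} is dirty:\n" + "\n".join(entries)
        )
```

Calling `assert_clean_local(Path("bids-fmriprep-workflow-NP/out3"))` at the start of the run would then report the modified data/mriqc subdataset before connecting to smaug, rather than after the datasets have been published.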

the diff
(git)smaug:~/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2[master]
$> git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   data/mriqc (new commits)

no changes added to commit (use "git add" and/or "git commit -a")
1 10154.....................................:Wed 07 Aug 2019 11:40:01 PM EDT:.
(git)smaug:~/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2[master]
$> datalad diff -r
   modified(dataset): data/mriqc
      modified(file): data/mriqc/.gitattributes
1 10155.....................................:Wed 07 Aug 2019 11:40:18 PM EDT:.
(git)smaug:~/.reproman/run-root/4b3524f6-b98d-11e9-95c1-8019340ce7f2[master]
$> git -C data/mriqc show
commit 8df32f9288d31c58f3bcf12195f87527e491c662 (HEAD -> master)
Author: Yaroslav Halchenko <[email protected]>
Date:   Wed Aug 7 23:33:44 2019 -0400

    Instruct annex to add text files to Git

diff --git a/.gitattributes b/.gitattributes
index c3aaefe..8e9a246 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,3 +1,4 @@
 
 * annex.backend=MD5E
-**/.git* annex.largefiles=nothing
\ No newline at end of file
+**/.git* annex.largefiles=nothing
+* annex.largefiles=(not(mimetype=text/*))
\ No newline at end of file
