http://docs.datalad.org/projects/metalad/en/latest/user_guide/writing-extractors.html recommends using git ls-files under Tips.
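I think this should be replaced with git ls-tree. git ls-files reports the state of the index, which can include changes that have not been committed, whereas git ls-tree reports the content of a specific committed tree. I cannot come up with a use case where an extractor would want to process uncommitted work tree content. Recommending ls-files suggests that extractors should look at uncommitted state, which is likely to lead to faulty implementations.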
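
To illustrate what the Tips section could show instead, here is a minimal sketch of listing files from a committed tree. The helper names `parse_ls_tree` and `committed_files` are hypothetical and not part of MetaLad; the sketch only assumes that `git` is on the PATH and that the dataset has at least one commit.

```python
import subprocess
from pathlib import Path


def parse_ls_tree(raw: bytes) -> list[str]:
    """Split NUL-separated `git ls-tree -z --name-only` output into paths."""
    # -z terminates every entry with NUL, so drop the empty trailing chunk.
    return [
        chunk.decode("utf-8", errors="surrogateescape")
        for chunk in raw.split(b"\0")
        if chunk
    ]


def committed_files(repo: Path, treeish: str = "HEAD") -> list[str]:
    """List paths recorded in a committed tree (hypothetical helper).

    Only the given tree-ish is inspected; the index and the work tree
    are never consulted, so uncommitted changes cannot leak in.
    """
    result = subprocess.run(
        ["git", "-C", str(repo), "ls-tree", "-r", "-z", "--name-only", treeish],
        check=True,
        capture_output=True,
    )
    return parse_ls_tree(result.stdout)
```

An extractor could then call `committed_files(dataset_path, version)` with the version it was asked to describe, so that the reported file list always matches that version.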