Feature idea, relatively low priority:
Right now, adding a new extractor class requires modifying the source. There should be a way to do this dynamically, e.g. by calling TextExtractor.register(SomeCustomFileHandlerClass) or similar. Since the list of extractors would then be dynamic, we will also have to find a way to determine their ordering / precedence.
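
A rough sketch of what this could look like, assuming a Python codebase and not tied to the current code. Only the TextExtractor.register name comes from the idea above. Everything else (BaseExtractor, can_handle, the priority argument, find, and the two example handlers) is hypothetical and only there for illustration. Ordering is handled here with an explicit numeric priority, with ties broken by registration order; that is one option among several, not a settled design.

```python
from typing import List, Optional, Tuple, Type


class BaseExtractor:
    """Hypothetical base class; the real extractor interface may differ."""

    @classmethod
    def can_handle(cls, filename: str) -> bool:
        raise NotImplementedError


class TextExtractor:
    # Entries are (negated priority, registration index, extractor class).
    _registry: List[Tuple[int, int, Type[BaseExtractor]]] = []
    _counter = 0

    @classmethod
    def register(cls, extractor_cls: Type[BaseExtractor], priority: int = 0):
        """Add an extractor. Higher priority is tried first; ties go to
        whichever class was registered earlier."""
        cls._registry.append((-priority, cls._counter, extractor_cls))
        cls._counter += 1
        # Sort on the two integers only, so classes are never compared.
        cls._registry.sort(key=lambda entry: entry[:2])
        return extractor_cls  # lets register double as a class decorator

    @classmethod
    def find(cls, filename: str) -> Optional[Type[BaseExtractor]]:
        """Return the first registered extractor that accepts the file."""
        for _, _, extractor_cls in cls._registry:
            if extractor_cls.can_handle(filename):
                return extractor_cls
        return None


# Hypothetical handlers, for illustration only.
class CatchAllHandler(BaseExtractor):
    @classmethod
    def can_handle(cls, filename: str) -> bool:
        return True


class PlainTextHandler(BaseExtractor):
    @classmethod
    def can_handle(cls, filename: str) -> bool:
        return filename.endswith(".txt")


# The fallback is registered first but with a low priority,
# so more specific handlers still get the first look.
TextExtractor.register(CatchAllHandler, priority=-10)
TextExtractor.register(PlainTextHandler)
```

With this, TextExtractor.find("notes.txt") would pick PlainTextHandler even though CatchAllHandler was registered first, and anything else falls through to the catch-all. Because register returns the class, it could also be used as a decorator for handlers that are fine with the default priority.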