Sketch of using rayon for parallel directory traversal #1178

Open
@jessegrosjean

You've mentioned a few times that you'd like to move directory traversal to rayon. Here's a working but incomplete sketch of one approach to doing that:

https://github.com/jessegrosjean/walk

This may all be obvious, but it might be a useful resource for later development (and it was good learning for me). The basic design is to create a channel to use as a work queue, and then use rayon's par_bridge() to process that channel of work in parallel.
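To make the work-queue idea concrete, here is a minimal sketch that uses only the standard library. It is not the code from the linked repository: a small pool of std threads sharing the receiver behind a mutex stands in for rayon's par_bridge(), and the names `Work` and `parallel_walk` are made up for illustration. The part that matters is termination: every queued item carries its own clone of the sender, so the channel closes by itself once no directory is queued or in flight.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// One unit of work (hypothetical type): a directory to read, plus a sender
// that the worker uses to queue any subdirectories it finds.
struct Work {
    dir: PathBuf,
    tx: Sender<Work>,
}

// Walks `root` with `threads` workers and returns every path found, unsorted.
fn parallel_walk(root: &Path, threads: usize) -> Vec<PathBuf> {
    let (tx, rx) = channel::<Work>();

    // Seed the queue, then drop the original sender so that only queued or
    // in-flight work items keep the channel open.
    let _ = tx.send(Work { dir: root.to_path_buf(), tx: tx.clone() });
    drop(tx);

    let rx = Arc::new(Mutex::new(rx));
    let found = Arc::new(Mutex::new(Vec::new()));

    let workers: Vec<_> = (0..threads.max(1))
        .map(|_| {
            let rx = Arc::clone(&rx);
            let found = Arc::clone(&found);
            thread::spawn(move || loop {
                // The lock is held only while waiting for the next item.
                // recv() fails once every sender is gone, which ends the loop.
                let work = match rx.lock().unwrap().recv() {
                    Ok(work) => work,
                    Err(_) => break,
                };
                let entries = match fs::read_dir(&work.dir) {
                    Ok(entries) => entries,
                    Err(_) => continue, // unreadable directory: skip it
                };
                let mut batch = Vec::new();
                for entry in entries.flatten() {
                    let path = entry.path();
                    // file_type() does not follow symlinks, so symlinked
                    // directories are reported but not descended into.
                    if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
                        let _ = work.tx.send(Work { dir: path.clone(), tx: work.tx.clone() });
                    }
                    batch.push(path);
                }
                found.lock().unwrap().extend(batch);
                // `work` and its sender are dropped at the end of each iteration.
            })
        })
        .collect();

    for worker in workers {
        worker.join().unwrap();
    }
    let paths = std::mem::take(&mut *found.lock().unwrap());
    paths
}
```

Calling `parallel_walk(Path::new("."), 4)` returns the paths under the current directory in no particular order. With rayon, I would expect the hand-rolled worker loop to collapse into something like `rx.into_iter().par_bridge().for_each(...)`, keeping the same drop-the-sender trick to end the iteration.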

Performance is comparable to the ignore crate when walking the Linux source:

par_ignore_walk         time:   [72.235 ms 73.115 ms 73.753 ms]                           
                        change: [-3.4721% -2.1156% -0.7586%] (p = 0.01 < 0.05)
                        Change within noise threshold.

rayon_walk              time:   [60.600 ms 60.734 ms 60.917 ms]                      
                        change: [+0.0896% +0.5440% +1.0160%] (p = 0.04 < 0.05)
                        Change within noise threshold.

In this run I have the ignore crate doing no filtering, but it's still likely doing more than my rayon_walk, so I expect performance is about the same between the two even though rayon_walk runs faster here. Another benefit of this design is that it's fairly straightforward to get sorted results, also computed in parallel. That adds some complexity, though, so I figured I would post this without that code for now.

Lastly, you also mentioned wanting to rethink the ignore crate API. Currently it's callback based. I wonder if it would make more sense (again, if you are thinking long term about a redesign) to make it iterator based. The ignore crate would then just be responsible for generating an iterator over DirEntry results as quickly as possible. If you also wanted to do heavy processing on those entries, you could call par_bridge() again on those results and do your heavy processing (such as performing the ripgrep search) there.
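As a rough illustration of what an iterator-based entry point could look like, here is a std-only sketch. The function name `walk_iter` is hypothetical and is not part of the ignore crate; it yields `PathBuf` values rather than DirEntry results, and it uses a single background thread as the producer just to keep the example short.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::mpsc::{channel, IntoIter};
use std::thread;

// Hypothetical iterator-style API: returns immediately, and entries stream
// out of the iterator as the background walk discovers them.
fn walk_iter(root: &Path) -> IntoIter<PathBuf> {
    let (tx, rx) = channel();
    let root = root.to_path_buf();
    thread::spawn(move || {
        let mut stack = vec![root];
        while let Some(dir) = stack.pop() {
            let entries = match fs::read_dir(&dir) {
                Ok(entries) => entries,
                Err(_) => continue, // unreadable directory: skip it
            };
            for entry in entries.flatten() {
                let path = entry.path();
                if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
                    stack.push(path.clone());
                }
                // A failed send means the consumer dropped the iterator,
                // so the walk can stop early.
                if tx.send(path).is_err() {
                    return;
                }
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });
    rx.into_iter()
}
```

The caller then composes ordinary iterator adapters, for example `walk_iter(Path::new(".")).filter(|p| p.extension().map_or(false, |e| e == "rs")).count()`. Under the design proposed above, the heavy per-entry work would instead hang off `.par_bridge()` on this iterator.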
