Control output order #1706

Open
@krlmlr

I was looking for an alternative to gfind to walk a large tree, and I love the performance of fd.

For repeated searches, there is value in having a consistent output order. I can pipe the output of fd to gsort to achieve this, but then the entire traversal has to finish before the first item is printed or can be processed by subsequent steps in the pipeline.

I'd like fd -O to emit the search results ordered according to the current locale, while still emitting output as it is produced. A simple implementation could look like this:

  • All jobs write their output to a local priority queue, sorted by locale
  • Repeat until all jobs have terminated
    • Wait until all jobs either have at least one item in their PQ or have terminated
    • Find the PQ with the globally smallest item
    • Pop and print the item from that PQ

This assumes that, once an item is in the PQ, no smaller item is going to be added there. This might require finer control at the job level.
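To make the steps above concrete, here is a minimal sketch in Python rather than fd's own code. It leans on the assumption stated above: each job is modelled as an iterator that already yields its items in sorted order, so pulling the next item stands in for "wait until this job has an item or has terminated". The name merge_job_outputs and the use of locale.strxfrm as the sort key are illustrative choices, not anything fd provides.

```python
import heapq
import locale
from typing import Callable, Iterable, Iterator, List

_DONE = object()  # sentinel marking a job that has terminated


def merge_job_outputs(
    jobs: List[Iterable[str]],
    key: Callable[[str], str] = locale.strxfrm,
) -> Iterator[str]:
    """Lazily yield items from all jobs in globally sorted order.

    Assumption: every job yields its own items in ascending `key` order.
    """
    iterators = [iter(job) for job in jobs]
    heads = []  # heap of (sort_key, job_index, item), at most one per job

    def refill(index: int) -> None:
        # Blocks until the job produces an item or terminates.
        item = next(iterators[index], _DONE)
        if item is not _DONE:
            heapq.heappush(heads, (key(item), index, item))

    # Wait until every job has one item available or has terminated.
    for index in range(len(iterators)):
        refill(index)

    while heads:
        # The heap top is the globally smallest pending item.
        _, index, item = heapq.heappop(heads)
        yield item
        # Only the job we just consumed from needs a new head.
        refill(index)


if __name__ == "__main__":
    demo_jobs = [["a/1", "a/3"], ["a/2", "b/1"], []]
    for path in merge_job_outputs(demo_jobs):
        print(path)
```

The standard library's heapq.merge(*jobs, key=locale.strxfrm) performs the same lazy k-way merge. The hand-written version only spells out the wait, find-smallest, and pop steps from the list. A slow job still stalls output in either version, because nothing can be printed until that job reveals its next item.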

GNU parallel has this on by default. How do they do it?

Unlike #1305, this is not asking to control the order of the traversal, just the output order.
