First-class parallelism #1760

Open
@mitchgrout

Description

🤔 What's the problem you're trying to solve?

I have a suite of tests which takes a long time to execute, but which shares no common resources that would prevent the tests from running in parallel. While I can use parallel_tests to speed up execution, one irritating limitation is that each process produces its own formatted output. This means I cannot easily see information such as the total number of executed/failed steps or the re-run list of failed scenarios, and I cannot have a single HTML report generated. Further, since it randomly assigns features/scenarios to spawned processes, it's possible for one or more processes to be stuck with a large number of slow-to-run features, which can lead to sub-optimal test run times.
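
To illustrate the split problem, here is a minimal sketch comparing one unlucky up-front split against handing out work as workers become free. The durations and both helper names (`static_makespan`, `dynamic_makespan`) are made up for illustration; this does not model how parallel_tests actually groups files.

```ruby
# Hypothetical illustration only: durations are invented, and the static
# split below is simply "contiguous chunks", not parallel_tests' real logic.

# Total wall-clock time when scenarios are divided up front into equal-sized
# contiguous chunks, one chunk per worker. Assumes a non-empty list.
def static_makespan(durations, workers)
  chunk_size = (durations.size / workers.to_f).ceil
  durations.each_slice(chunk_size).map(&:sum).max
end

# Total wall-clock time when each scenario goes to whichever worker
# becomes free first (a simple simulation of a shared work queue).
def dynamic_makespan(durations, workers)
  free_at = Array.new(workers, 0)
  durations.each do |duration|
    next_free = free_at.index(free_at.min)
    free_at[next_free] += duration
  end
  free_at.max
end

durations = [10, 10, 1, 1, 1, 1] # seconds per scenario, invented
puts "static split:  #{static_makespan(durations, 2)}s"
puts "dynamic split: #{dynamic_makespan(durations, 2)}s"
```

With these invented numbers, the up-front split leaves one worker holding both slow scenarios, while the dynamic approach spreads them out.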

✨ What's your proposed solution?

A work-stealing parallelism mode, which schedules at the scenario level and provides cohesive output from a single process. This should address both the issue of multiple independent outputs and the issue of sub-optimal splits.
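
As a rough sketch of the idea, the following uses a single shared queue that workers pull scenarios from, which is a simpler stand-in for true work stealing (where each worker has its own deque). `Scenario` and `run_in_parallel` are hypothetical names, not part of Cucumber, and the block passed in stands in for whatever actually executes a scenario.

```ruby
# Hypothetical stand-in for a scenario; this is not a Cucumber type.
Scenario = Struct.new(:name, :cost)

# Runs every scenario on a pool of worker threads that pull from one shared
# queue, then returns all results to the caller so a single process can
# report on them. The block receives a scenario and returns true on success.
def run_in_parallel(scenarios, workers: 4)
  queue = Queue.new
  scenarios.each { |scenario| queue << scenario }
  queue.close # once drained, pop returns nil and workers exit their loop

  results = Queue.new
  threads = Array.new(workers) do |worker_id|
    Thread.new do
      while (scenario = queue.pop)
        passed = yield(scenario)
        results << { worker: worker_id, name: scenario.name, passed: passed }
      end
    end
  end
  threads.each(&:join)

  collected = []
  collected << results.pop until results.empty?
  collected
end

scenarios = [
  Scenario.new("slow checkout", 0.05),
  Scenario.new("login", 0.01),
  Scenario.new("broken search", 0.01),
  Scenario.new("logout", 0.01)
]

# Pretend to run each scenario; anything named "broken ..." fails.
results = run_in_parallel(scenarios, workers: 2) do |scenario|
  sleep(scenario.cost)
  !scenario.name.start_with?("broken")
end

failed = results.reject { |result| result[:passed] }
puts "#{results.size} scenarios, #{failed.size} failed"
failed.each { |result| puts "  rerun: #{result[:name]}" }
```

Because results flow back to one place, totals and a re-run list can be produced once, rather than once per process. Threads are used here only to keep the sketch short; a real implementation might well need separate processes.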

⛏ Have you considered any alternatives or workarounds?

I have had success with parallel_tests and a custom tool I wrote to merge .ndjson data files, which allowed me to continue using the standard HTML reporter. However, this requires some conditional config in my cucumber.yml to ensure .ndjson is emitted when running in parallel, and to suppress all other outputs. I have not yet found a way to resolve the issue of sub-optimal splits.
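
For context, a merge step along these lines might look like the sketch below. It is not the tool mentioned above. It assumes each line is a JSON envelope with a single top-level key such as `meta`, `testRunStarted`, or `testRunFinished` (my understanding of the Cucumber messages format), and `merge_ndjson` is a hypothetical name. Whether the HTML reporter accepts the resulting ordering is untested.

```ruby
require "json"

# Merges several .ndjson message streams (each an array of lines) into one.
# Assumption: run-level envelopes ("meta", "testRunStarted",
# "testRunFinished") should appear once, so duplicates from later streams
# are dropped and a single combined "testRunFinished" is emitted last.
def merge_ndjson(streams)
  merged = []
  finished = []
  seen_meta = false
  seen_started = false

  streams.each do |lines|
    lines.each do |line|
      next if line.strip.empty?

      envelope = JSON.parse(line)
      if envelope.key?("meta")
        merged << envelope unless seen_meta
        seen_meta = true
      elsif envelope.key?("testRunStarted")
        merged << envelope unless seen_started
        seen_started = true
      elsif envelope.key?("testRunFinished")
        finished << envelope["testRunFinished"]
      else
        merged << envelope
      end
    end
  end

  unless finished.empty?
    # Use the latest finish time; the run succeeded only if every part did.
    latest = finished.max_by do |f|
      timestamp = f["timestamp"] || {}
      [timestamp["seconds"].to_i, timestamp["nanos"].to_i]
    end
    success = finished.all? { |f| f["success"] }
    merged << { "testRunFinished" => latest.merge("success" => success) }
  end

  merged.map { |envelope| JSON.generate(envelope) }
end

# Usage: ruby merge.rb run1.ndjson run2.ndjson > merged.ndjson
if __FILE__ == $PROGRAM_NAME
  streams = ARGV.map { |path| File.readlines(path) }
  puts merge_ndjson(streams)
end
```

The merged file could then be fed to whatever turns .ndjson into the HTML report.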

📚 Any additional context?

There are a few open questions I'm unsure about, which could limit the feasibility of this feature:

  1. How would this interact with the wire protocol plugin? Typically it assumes a single connection through which all tests are streamed. Should parallelism be disabled if a wire connection is configured?
  2. How much rework would be required to implement this? Tests are orchestrated around the event bus, which, as I understand it, assumes a sequential workflow, so how could this be achieved?
  3. If parallelism is implemented, what would this mean for steps which require interactivity, such as those provided by aruba? Could these steps assert something like "not parallel?", or would this not be acceptable?

While I may be misreading the documentation, it appears that the Java implementation may also support some rudimentary parallel test execution.
