Asynchronous Execution of Kernels #7

@muellren

The current grCUDA prototype inserts a synchronization barrier (cudaDeviceSynchronize()) after every kernel execution. In CUDA, however, kernels execute asynchronously with respect to host code and to kernels or memory operations in other streams, so a barrier after each launch prevents the host from overlapping its own work with the device.

  • Implement asynchronous but non-deferred execution.
  • Track read and write dependencies in DeviceArray and automatically insert synchronization points.
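
To make the second item concrete, here is a minimal host-side sketch of the bookkeeping it implies, written in Python purely for illustration. None of these names (`Op`, `Tracker`, `launch`, `host_access`) come from grCUDA, and the `DeviceArray` class here is a stand-in, not the real one. The sketch only models the dependency logic: each array remembers its last writer and its in-flight readers, a launch is issued immediately and only records which earlier kernels it must wait for, and the host blocks only when it touches an array that still has pending work. In an actual implementation, the recorded dependencies would presumably map to CUDA events and calls such as cudaStreamWaitEvent() and cudaEventSynchronize() rather than a device-wide cudaDeviceSynchronize().

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Op:
    """One launched kernel. Hypothetical model, not a grCUDA class."""
    name: str
    waits_for: List["Op"] = field(default_factory=list)
    done: bool = False


@dataclass(eq=False)
class DeviceArray:
    """Stand-in array that remembers the in-flight kernels touching it."""
    name: str
    last_writer: Optional[Op] = None
    readers: List[Op] = field(default_factory=list)


class Tracker:
    def __init__(self) -> None:
        # Names of kernels the host actually had to wait on, in order.
        self.synced: List[str] = []

    def launch(self, name, reads=(), writes=()) -> Op:
        """Issue a kernel right away; only record what it must wait for."""
        op = Op(name)
        for arr in reads:                      # read-after-write
            self._depend(op, arr.last_writer)
        for arr in writes:                     # write-after-write
            self._depend(op, arr.last_writer)
            for reader in arr.readers:         # write-after-read
                self._depend(op, reader)
        # Update bookkeeping only after all dependencies are collected,
        # so a kernel that reads and writes one array never waits on itself.
        for arr in reads:
            arr.readers.append(op)
        for arr in writes:
            arr.last_writer = op
            arr.readers = []
        return op

    def host_access(self, arr, write=False) -> None:
        """Synchronization point: block only on work that touches `arr`."""
        pending = [arr.last_writer] if arr.last_writer else []
        if write:
            pending += arr.readers
        for op in pending:
            self._wait(op)

    def _depend(self, op, dep) -> None:
        if dep is not None and not dep.done and dep not in op.waits_for:
            op.waits_for.append(dep)

    def _wait(self, op) -> None:
        if op.done:
            return
        for dep in op.waits_for:               # dependencies finish first
            self._wait(dep)
        op.done = True
        self.synced.append(op.name)


t = Tracker()
x, y, z = DeviceArray("x"), DeviceArray("y"), DeviceArray("z")
k1 = t.launch("init_x", writes=[x])
k2 = t.launch("init_y", writes=[y])
k3 = t.launch("add", reads=[x, y], writes=[z])  # must wait for init_x, init_y
t.host_access(x)  # host reads x: waits on init_x only
t.host_access(z)  # host reads z: waits on init_y, then add
```

In this model the first host read of `x` waits only for `init_x`, while `init_y` and `add` stay in flight until the host reads `z`. That selective waiting is what a per-kernel device-wide barrier cannot provide.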

Labels: enhancement
