Investigate the performance issues and consider moving to GemmKernels.jl #2

@GiggleLiu

Sorry for the earlier mess; I thought these parts would not be published as part of the package.

The following changes have been made:

  • The .so file is now uploaded to a gist as an artifact, so there is no longer a binary in the repo.
  • I relocated all the files into the folders src, test and benchmark.
  • The scripts used for the benchmarks are provided, including the fallback implementation in CUDA.jl. However, I found something strange: CUDA.@sync does not seem to work when calling a function from a .so library, so I was unable to benchmark our code in Julia.

The new benchmark result is shown here:
image

Originally posted by @ArrogantGao in #1 (comment)
