@gislan gislan commented Oct 10, 2025

For some reason, libtorch is very slow when Tensor objects are constructed by assigning values one element at a time. On my box, just creating the Tensors for a batch of 128 connect-four states (no inference) could take 200ms on CUDA and 30ms on CPU. With this change, it takes <100us.
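For readers without the diff at hand, here is a minimal sketch of the general technique (not the code in this PR): fill one contiguous host buffer with plain C++ writes, then hand the whole buffer to libtorch in a single call. The likely reason per-element assignment is slow is that each indexed write goes through the tensor dispatcher and, on CUDA, becomes its own tiny host-to-device copy. The `Board` type, the two-plane encoding, and `encode_batch` are all hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical board layout: 6 rows x 7 columns,
// 0 = empty, 1 = first player, 2 = second player.
constexpr std::size_t kRows = 6;
constexpr std::size_t kCols = 7;
constexpr std::size_t kPlanes = 2;  // one 0/1 plane per player (an assumption)

using Board = std::array<std::array<std::uint8_t, kCols>, kRows>;

// Encode the batch into one flat float buffer laid out as
// [N, kPlanes, kRows, kCols] in row-major order. Only plain
// memory writes happen here; no tensor operations are involved.
std::vector<float> encode_batch(const std::vector<Board>& boards) {
    std::vector<float> buf(boards.size() * kPlanes * kRows * kCols, 0.0f);
    for (std::size_t n = 0; n < boards.size(); ++n) {
        for (std::size_t r = 0; r < kRows; ++r) {
            for (std::size_t c = 0; c < kCols; ++c) {
                const std::uint8_t cell = boards[n][r][c];
                assert(cell <= 2);
                if (cell == 0) continue;  // empty cells stay 0 in both planes
                const std::size_t plane = cell - 1;
                buf[((n * kPlanes + plane) * kRows + r) * kCols + c] = 1.0f;
            }
        }
    }
    return buf;
}

// With libtorch available, the buffer can then be wrapped and moved once:
//   auto batch = torch::from_blob(buf.data(), {N, 2, 6, 7}, torch::kFloat32)
//                    .clone()      // own the memory, since from_blob does not
//                    .to(device);  // a single transfer when device is CUDA
// where N is the batch size as int64_t and device is the target torch::Device.
```

The buffer-filling part has no libtorch dependency, so it can be timed and tested separately from the single `torch::from_blob` call sketched in the trailing comment.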
