Avoid unnecessary copy in TensorSource #8849
base: master
Conversation
Hi @ysiraichi, just following up on our offline discussion about the copy operation. PTAL at the PR, thanks!
As a side note, we can use the DLPack machinery for the CUDA to XLA:CUDA transfer (it wasn't implemented at the time I worked on this). I will open an issue for this.
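For reference, a minimal sketch of that idea using ATen's DLPack convertor (`at::toDLPack` from `ATen/DLConvertor.h`). The `cuda_tensor` here is a stand-in, and the PJRT import call is hypothetical, since the actual import API depends on the PJRT plugin:

```cpp
#include <ATen/ATen.h>
#include <ATen/DLConvertor.h>

// Wrap the CUDA tensor's existing storage in a DLPack capsule; no data is
// copied, the capsule just borrows the device pointer plus shape/stride info.
at::Tensor cuda_tensor = at::rand({2, 3}, at::kCUDA);
DLManagedTensor* capsule = at::toDLPack(cuda_tensor);

// Hypothetical: hand the capsule to a CUDA-capable PJRT client so the
// transfer stays on-device instead of bouncing through host memory.
// pjrt_client->ImportDlPackBuffer(capsule);

// Whoever consumes the capsule is responsible for invoking its deleter.
capsule->deleter(capsule);
```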
// The purposes of the copy are:
// 1. Ensure the memory is contiguous, which is expected by PJRT.
// 2. Move a CUDA tensor to CPU, since we cannot pass CUDA memory to PJRT yet.
// 3. Cast the data type.
// We can skip the copy when none of these apply.
if (tensor.device() == at::kCPU && tensor.is_contiguous() &&
    tensor.dtype() == target_torch_type) {
  tensor_ = std::move(tensor);
} else {
  // TODO(ysiraichi): check, first, if the tensor lives on a device that the
  // current PjRt client has access to. If so, we don't need to go through
  // the CPU.
  tensor_ = std::move(tensor.to(
      at::TensorOptions().device(at::kCPU).dtype(target_torch_type),
      /*non_blocking=*/false,
      /*copy=*/true, at::MemoryFormat::Contiguous));
}
As far as I understand it, tensor.to(...) (without the copy argument) already checks whether it should actually copy or not. So, what do you think of reverting to the old tensor.to(...) usage, but removing the copy argument instead?
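For context, the C++ Tensor::to overload that takes a TensorOptions defaults to copy=false, and in that mode it returns the input tensor unchanged when device, dtype, and memory format already match. A minimal sketch of the suggested usage, assuming the same tensor and target_torch_type as in the diff:

```cpp
// With copy=false (the default), to() returns `tensor` itself when it is
// already a contiguous CPU tensor of the target dtype; otherwise it
// allocates and copies, exactly like the else-branch above.
tensor_ = tensor.to(
    at::TensorOptions().device(at::kCPU).dtype(target_torch_type),
    /*non_blocking=*/false,
    /*copy=*/false, at::MemoryFormat::Contiguous);
```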
Hi @ysiraichi, I didn't find a tensor.to(...) overload without the copy arg in C++; is it only in Python?
Avoid the at::Tensor copy in TensorSource when it is not necessary. The copy is needed only in the cases spelled out in the code comment: the memory is not contiguous, the tensor lives on CUDA, or the dtype differs from the target. The copy operation needs to be blocking, since the transfer operation depends on the copied tensor.
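To illustrate why blocking matters (a sketch, not code from this PR): a non-blocking device-to-host copy may return before the data has actually landed in host memory, so a consumer that reads the buffer immediately, as the PJRT transfer does, must use a blocking copy.

```cpp
#include <ATen/ATen.h>

at::Tensor gpu = at::rand({4, 4}, at::kCUDA);

// Blocking copy: safe, the host buffer is fully populated on return.
at::Tensor host = gpu.to(at::TensorOptions().device(at::kCPU),
                         /*non_blocking=*/false);

// Non-blocking copy: the call may return while the transfer is still in
// flight, so reading the result right away (as the PJRT transfer would)
// can observe stale or partially written data.
// at::Tensor racy = gpu.to(at::TensorOptions().device(at::kCPU),
//                          /*non_blocking=*/true);
```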