Open
Labels: question (Further information is requested)
Description
I encountered the following warnings while using chakra link:
[2025-06-04 04:00:01,109] trace_linker.py:679 [WARNING]: No CUDA runtime operator found for correlation ID 17297502. This is not a common case, and there should be a corresponding CUDA runtime operator for a given GPU kernel operator. It can be a case where CUDA runtime operators are not properly identified and added to the map, kineto_correlation_cuda_runtime_map. Please manually check if the corresponding CUDA runtime operator with the correlation is dropped by mistake. It is likely that it is because of incomplete map, cuda_launch_operations, in is_kernel_launch_op. Please update the map properly to cover all CUDA runtime launch operators.
[2025-06-04 04:00:01,109] trace_linker.py:625 [WARNING]: Missing parent CPU operator for GPU op 'void at::native::(anonymous namespace)::multi_tensor_apply_kernel<at::native::(anonymous namespace)::TensorListScalarListMetadata<float, 3>, at::native::(anonymous namespace)::PointwiseOpScalarListFunctor<float, 3, 3, 0>, std::divides<float> >(at::native::(anonymous namespace)::TensorListScalarListMetadata<float, 3>, at::native::(anonymous namespace)::PointwiseOpScalarListFunctor<float, 3, 3, 0>, std::divides<float>)'. Orphaned GPU operator.
I am running distributed training with dp=4 on 4x A6000. The training code is in this repo. I'm not sure whether these warnings are caused by something I did wrong; I did not run into this problem when collecting traces while training with Megatron.
I have also raised this in the PyTorch community: issue
The rank 0 trace does not trigger these warnings, but some kernels in the traces of the other ranks do. I am not sure whether this is normal.
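One way to narrow this down might be to check the Kineto trace of an affected rank directly, before linking, and see whether the kernels named in the warnings really have no runtime event with the same correlation ID. The following is a minimal sketch, not Chakra's own logic. It assumes the Kineto trace is a Chrome-trace-style JSON file with a `traceEvents` list, that GPU events use the categories `kernel`, `gpu_memcpy`, or `gpu_memset`, that launch events use `cuda_runtime` or `cuda_driver`, and that the correlation ID is stored under `args["correlation"]`. The function names are hypothetical.

```python
import json

# Assumed category names in a Kineto (Chrome trace format) JSON file.
GPU_CATS = {"kernel", "gpu_memcpy", "gpu_memset"}
LAUNCH_CATS = {"cuda_runtime", "cuda_driver"}


def load_trace_events(path):
    """Load the event list from a Kineto JSON trace file."""
    with open(path) as f:
        data = json.load(f)
    # Some traces are a dict with "traceEvents", others a bare list.
    return data["traceEvents"] if isinstance(data, dict) else data


def find_orphan_gpu_events(trace_events):
    """Return GPU events whose correlation ID matches no launch event."""
    launch_ids = set()
    gpu_events = []
    for ev in trace_events:
        corr = (ev.get("args") or {}).get("correlation")
        if corr is None:
            continue  # events without a correlation ID cannot be matched
        cat = ev.get("cat")
        if cat in LAUNCH_CATS:
            launch_ids.add(corr)
        elif cat in GPU_CATS:
            gpu_events.append(ev)
    return [ev for ev in gpu_events
            if ev["args"]["correlation"] not in launch_ids]


def report(path):
    """Print each unmatched GPU event with its correlation ID."""
    for ev in find_orphan_gpu_events(load_trace_events(path)):
        print(ev["args"]["correlation"], ev.get("name", "")[:80])
```

Calling `report(...)` on the Kineto JSON of a non-zero rank and looking for correlation ID 17297502 would show whether the launch event is already missing from the raw trace (pointing at trace collection) or present under a category or name the linker does not recognize.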