[python] Unify specialize and launch in ModuleLauncher #4052

lmondada wants to merge 3 commits into NVIDIA:main
Conversation
Thanks @Renaud-K for the early review! I have settled on a design. To keep it easy to construct instances of this type, I've created a factory function that needs MLIR. My hope is that over time we can start using this class in other parts of the runtime, and merge execution paths between C++ and Python.
Signed-off-by: Luca Mondada <luca@mondada.net>
CUDA Quantum Docs Bot: A preview of the documentation can be found here.
Signed-off-by: Luca Mondada <luca@mondada.net>
@Renaud-K You were right — thanks a lot for the prototyping! I've made that change in the latest commit. Does this go in the right direction?

I would look into merging JitEngine and CompiledKernel in a separate PR, as that would involve changes to QPUs etc.
I am comfortable with the CompiledKernel changes; however, the QPU.cpp changes should maybe be reviewed by @schweitzpgi.

Note that I would still like to break this up further. That will also be a task for the next cycle.
@schweitzpgi Hey Eric, I'm starting to remove duplication between the two execution paths. After talking to Renaud, I think there is also an overlap with your "universal cache" idea. In the future, the compiled kernel type could be expanded to keep a hash of all launch arguments and be the main data type that we cache. Of course, more thinking will be required on this.

In terms of observable functionality, this PR is a no-op: the code that has been de-duplicated was identical across the two execution paths.
```cpp
#include "CompiledKernel.h"

namespace cudaq {
```

We should move to the LLVM style so as to better catch bugs, etc.

Understood, I'm removing the namespace {} block.
```cpp
namespace cudaq {

CompiledKernel::CompiledKernel(JitEngine engine, std::string kernelName,
```

Suggested change:

```diff
- CompiledKernel::CompiledKernel(JitEngine engine, std::string kernelName,
+ cudaq::CompiledKernel::CompiledKernel(JitEngine engine, std::string kernelName,
```
```cpp
    void *buff = const_cast<void *>(rawArgs.back());
    return reinterpret_cast<KernelThunkResultType (*)(void *, bool)>(funcPtr)(
        buff, /*client_server=*/false);
  } else {
```

nit: else after return.
https://clang.llvm.org/extra/clang-tidy/checks/readability/else-after-return.html
```cpp
void (*CompiledKernel::getEntryPoint() const)() { return entryPoint; }

const JitEngine CompiledKernel::getEngine() const { return engine; }
```

Why const on the return value? It's rather useless, eh?

```cpp
const int f() { return 0; }
int i = f();
```

is perfectly valid C++.
```cpp
// TODO: remove these two methods once the CompiledKernel is returned to
// Python.
void (*getEntryPoint() const)();
```

I'm dubious of this. CUDA-Q does not define entry point kernels as always having void(void) signatures. So this seems like a cul-de-sac.
```cpp
                               std::string kernelName,
                               bool hasResult) {
  std::string fullName = cudaq::runtime::cudaqGenPrefixName + kernelName;
  std::string entryName = hasResult ? kernelName + ".thunk" : fullName;
```

This code looks like the efficiency hack used in Python, i.e., unrelated to the C++ implementation. But it is appearing here in what looks like common code?

I'm trying to play catch-up here, so I appreciate your patience.

Thanks for combining the mostly duplicative paths. That seems like a good thing for sure.
We have moved the compiler.
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com> Signed-off-by: Luca Mondada <72734770+lmondada@users.noreply.github.com>
This PR is the start of a line of work aiming to unify the "specialize kernel" and "launch kernel" execution paths for kernels defined (and executed from) Python.
I have started by creating a `CompiledKernel` class that can be shared by the two execution paths at the bottom of their respective call stacks (see diagram). As we progress, we should be able to progressively unify the two call stacks into one.

```mermaid
flowchart TB
    subgraph unify["This PR"]
        module["ModuleLauncher::compileModule"]
        module --> compiled[CompiledKernel]
    end
    qpu["QPU::specializeModule"]
    qpu2["QPU::launchModule"]
    platform["quantum_platform::specializeModule"]
    platform2["quantum_platform::launchModule"]
    strm["cudaq::streamlinedSpecializeModule"]
    strm2["cudaq::streamlinedLaunchModule"]
    pyLaunchModule ~~~ strm
    pyLaunchModule["cudaq::pyLaunchModule"]
    clm["cudaq::clean_launch_module"]
    marshal["marshal_and_retain_module"]
    marshal2["marshal_and_launch_module"]
    subgraph cls["PyKernelDecorator"]
        python["beta_reduction"]
        python2["__call__"]
    end
    python --> marshal --> strm --> platform --> qpu --> module
    python2 --> marshal2 --> clm --> pyLaunchModule --> strm2 --> platform2 --> qpu2 --> module
    platform2 --> otherqpu
    subgraph other[other QPUs]
        otherqpu[...]
    end
```