@chentong319 has started looking at refactoring our runtime support as part of supporting the "light runtime" that enables us to execute a model with minimal dependencies on LLVM and other projects.
To refresh my memory, I just took a look at our current onnx-mlir/Runtime. Currently we have the following directories:
- `Runtime/jni`: the JNI support.
- `Runtime/omp`: CMake support to rebuild the omp library.
- `Runtime/python`: where all the Python support is located.
- `Runtime`: code for supporting the .so (compiler-generated calls) and the `ExecutionSession` that is used in the C++/Python interfaces (see the sketch after this list).
- `src/Compiler`: has code to compile, and even a Python interface file there.
- `include/OnnxMlirRuntime.h`: includes all the header files from `include/onnx-mlir/Runtime`.
- `include/OnnxMlirCompiler.h`
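For context on what that shared `Runtime` code provides, below is a minimal sketch of the C-level `OMTensor` creation/introspection calls declared in `include/OnnxMlirRuntime.h`, which both the generated .so path and `ExecutionSession` build on. The signatures here are from memory and should be checked against the current header; this is an illustration, not a definitive reference.

```c
/* Minimal sketch (verify signatures against OnnxMlirRuntime.h):
 * create an OMTensor over caller-owned data and inspect it. */
#include <stdint.h>
#include <stdio.h>
#include "OnnxMlirRuntime.h"

int main(void) {
  float data[6] = {1, 2, 3, 4, 5, 6};
  int64_t shape[2] = {2, 3};

  /* The tensor wraps the caller's buffer; it does not copy the data. */
  OMTensor *t = omTensorCreate(data, shape, 2, ONNX_TYPE_FLOAT);

  const int64_t *s = omTensorGetShape(t);
  printf("rank = %lld, shape = %lld x %lld\n",
      (long long)omTensorGetRank(t), (long long)s[0], (long long)s[1]);
  printf("first element = %f\n", ((float *)omTensorGetDataPtr(t))[0]);

  omTensorDestroy(t);
  return 0;
}
```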
My observations:
- In the `Runtime` dir, we have C and C++ versions of everything, but most of the functionality is only needed in C. This is because many of the files/functions here provide support ONLY for the generated .so; these are exclusively C.
- In the `Runtime` dir, we also have files/functions that are only needed to support `ExecutionSession` (used in the C++/Python interfaces). Some functions are also only needed for our debugging purposes (such as `omTensorCreateWithRandomData` and `omTensorAreTwoOmtsClose`).
- Because the C runtime is always used, and because using `ExecutionSession` also pulls in the C++ version of every function in the `Runtime` directory, plus we don't use `extern "C"` very much, we are likely to end up with two versions of the same functions (one in C, one in C++). In addition, many of the support functions do not use the `static` prefix. (A sketch of the `extern "C"` idiom follows this list.)
- The infrastructure to compile is found in `src/Compiler/OnnxMlirCompiler.cpp` and includes onnx/protobuf and file inputs. There are two distinct (but nearly identical) Python versions, one in `Runtime/python/PyOMCompileExecutionSession.hpp` and one in `src/Compiler/PyOMCompilerSession.hpp`.
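On that duplication point, a conventional fix is to declare the shared helpers once behind `extern "C"` guards and keep file-local helpers `static`, so C++ callers reuse the C symbols instead of instantiating mangled duplicates. The snippet below is a generic sketch of that idiom with made-up names, not an excerpt from our headers.

```c
/* --- hypothetical header (not an existing file) --- */
#ifndef OM_RUNTIME_HELPERS_H
#define OM_RUNTIME_HELPERS_H

#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Declared once; both C and C++ translation units link against the same
 * unmangled C symbol, so no second C++ copy of the function is built. */
int64_t omExampleComputeNumElems(const int64_t *shape, int64_t rank);

#ifdef __cplusplus
} /* extern "C" */
#endif

#endif /* OM_RUNTIME_HELPERS_H */

/* --- corresponding .c file --- */
/* Internal helpers are marked static so they stay out of the library's
 * exported symbol table. */
static int64_t nonNegative(int64_t v) { return v < 0 ? 0 : v; }

int64_t omExampleComputeNumElems(const int64_t *shape, int64_t rank) {
  int64_t n = 1;
  for (int64_t i = 0; i < rank; ++i)
    n *= nonNegative(shape[i]);
  return n;
}
```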
Looking forward to hearing what @chentong319 will suggest. I gathered my initial thoughts below.
1. `Runtime/CRuntime` dir that includes all that is required to run a model, i.e., every function (and helper) generated by the .so and introduced by the compiler. We may want to add all of the functionality needed to use the model in C (namely the creation/destruction/introspection of `OMTensor` and `OMTensorList`). The idea of the C interface is that a model .so is linked at compile time to a C binary, so there is no need to dynamically compile and/or dynamically load a model (see the sketch after this list). This is what we have here. This should be a "lightweight" version only.
2. `Runtime/CPPRuntime` dir where we add support for the C++ `ExecutionSession`. This adds the convenience of having a class where we can store pointers to dynamically loaded entry points. Technically, this could also be provided using a structure in the C interface... but we don't use it at this point, and it is probably OK to only have it in C++. Note that currently `OMTensor.inc` and `OMTensorList.inc` include both C and C++ functionality, so we would probably need to retain this common file in the `CRuntime` subdirectory, but only use it as a C++ version here. This should be a "lightweight" version only. We have indirect support to compile, but not as a class; such a class is at this time in the `Runtime/python` subdirectory. Maybe it could be pulled here for more general usability.
3. `Runtime/CPPRuntimeDebug` where we add the support for debugging (create random tensors, compare OMTensors, ...). This could be added to 2) above, but really it is only used by us in tests/CIs... so I think it can be in a different directory, or the same directory but a separate library (maybe better). Note that there is some debug support in `CRuntime` already, as the compiler can generate calls to print functions that are helpful for debugging/performance monitoring. This should be a "lightweight" version only.
4. `Runtime/jni` for Java-related stuff: mostly the Java interface for the basic runtime calls, plus some handling of the input/output data. This should be a "lightweight" version only.
5. `Runtime/python` for Python interfaces. Right now, we have support for the typical "execution session" whose functionality is implemented in `CPPRuntime`, augmented by conversions to/from Python data structures. Right now, we also have support there to compile a model, and to compile and run. We could have a "lightweight" version of compile that invokes docker, and, in a different package, the one that invokes it locally.
- We could also have a subdirectory to support ORT more directly, and/or PyTorch, based on the Python interface.
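To make the `CRuntime` idea in 1) concrete, here is a rough sketch of a C binary linked at compile time against a model .so. It assumes a hypothetical model with a single 1x10 float input and a `run_main_graph` entry point, with signatures as I remember them from `OnnxMlirRuntime.h`; tensor/list cleanup is elided since the ownership rules should be checked against the runtime docs.

```c
/* Sketch: a C program linked directly against a compiled model .so
 * (no dynamic loading, no compiler dependencies at run time).
 * The 1x10 float signature is a made-up example. */
#include <stdint.h>
#include <stdio.h>
#include "OnnxMlirRuntime.h"

/* Entry point emitted by the compiler into the model .so. */
extern OMTensorList *run_main_graph(OMTensorList *);

int main(void) {
  float input[1][10] = {{0}};
  int64_t shape[2] = {1, 10};
  OMTensor *x = omTensorCreate(input, shape, 2, ONNX_TYPE_FLOAT);

  OMTensor *inputs[1] = {x};
  OMTensorList *inList = omTensorListCreate(inputs, 1);

  /* Run the statically linked model. */
  OMTensorList *outList = run_main_graph(inList);

  OMTensor *y = omTensorListGetOmtByIndex(outList, 0);
  float *out = (float *)omTensorGetDataPtr(y);
  printf("first output = %f\n", out[0]);

  /* Cleanup elided; see the runtime's ownership rules. */
  return 0;
}
```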
Note that we may want to introduce more "OM"-prefixed names for our runtime libraries, as `CRuntime` is very generic.