
Cleanup of runtime support #3368

@AlexandreEichenberger

Description

@chentong319 has started looking at refactoring our runtime support as part of supporting the "light runtime" that enables us to execute a model with minimal dependencies on LLVM and other projects.

To refresh my memory, I just took a look at our current onnx-mlir/Runtime. We currently have the following directories:

Runtime/jni:  the JNI support
Runtime/omp: cmake support to rebuild the omp library.
Runtime/python: where all the python support is located
Runtime: code for supporting the .so (compiler generated calls) and
         the ExecutionSession that is used in C++/Python interfaces

src/Compiler: has the code to compile a model, and even a python interface file.

include/OnnxMlirRuntime.h: includes all the header files from include/onnx-mlir/Runtime
include/OnnxMlirCompiler.h 

My observations:

  1. In the Runtime dir, we have both C and C++ versions of everything, but most of the functionality is only needed in C. This is because many of the files/functions here provide support ONLY for the generated .so; these are exclusively C.
  2. In the Runtime dir, we also have files/functions that are only needed to support ExecutionSession (used in the C++/python interfaces). Some functions are only needed for our debugging purposes (such as omTensorCreateWithRandomData and omTensorAreTwoOmtsClose).
  3. Because the cruntime is always used, and because using ExecutionSession also pulls in the C++ version of every function in the Runtime directory (and we don't use extern "C" very much), we likely end up with 2 versions of the same functions (one in C, one in C++). In addition, many of the support functions do not use the static prefix.
  4. The infrastructure to compile is found in src/Compiler/OnnxMlirCompiler.cpp and handles onnx/protobuf and file inputs. There are 2 distinct (but nearly identical) python versions, one in Runtime/python/PyOMCompileExecutionSession.hpp and one in src/Compiler/PyOMCompilerSession.hpp.

Looking forward to hearing what @chentong319 will suggest. I gathered my initial thoughts below.

  1. Runtime/CRuntime dir that includes all that is required to run a model, i.e. every function (and helper) generated by the .so and introduced by the compiler. We may want to add all of the functionality needed to use the model in C (namely the creation/destruction/introspection of OMTensor and OMTensorList). The idea of the C interface is that a model .so is linked at compile time to a C binary, so there is no need to dynamically compile and/or dynamically load a model; this is what we have today (see the first sketch after this list). This should be a "lightweight" version only.
  2. Runtime/CPPRuntime dir where we add support for the C++ ExecutionSession (see the second sketch after this list). This adds the convenience of having a class where we can store pointers to the dynamically loaded entry points. Technically, this could also be provided using a structure in the C interface... but we don't use it at this point, and it's probably ok to only have it in C++. Note that currently OMTensor.inc and OMTensorList.inc include both C and C++ functionality, so we would probably need to retain this common file in the CRuntime subdirectory, but only use it as a C++ version here. This should be a "lightweight" version only. We have indirect support to compile, but not as a class; such a class is at this time in the Runtime/python subdirectory. Maybe it could be pulled here for more general usability.
  3. Runtime/CPPRuntimeDebug where we add the support for debugging (create random data, compare OMTensors, ...). This could be added to 2) above, but really it is only used by us in tests/CIs... so I think it can be in a different directory, or the same directory but a separate library (maybe better). Note that there is already some debug support in CRuntime, insofar as the compiler can generate calls to print functions that are helpful for debugging/performance monitoring. This should be a "lightweight" version only.
  4. Runtime/jni for java-related stuff. Mostly a Java interface for the basic Runtime calls, plus some handling of the input/output data. This should be a "lightweight" version only.
  5. Runtime/python for the python interfaces. Right now, we have support for the typical "execution session", whose functionality is implemented in CPPRuntime augmented with conversions to/from python data structures. We also have support there to compile a model, and to compile and run. We could have a "lightweight" version of compile that invokes docker, and in a different package the one that invokes it locally.
  6. We could have a subdirectory to support ORT more directly, and/or PyTorch based on the python interface.
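
To make item 1 concrete, here is a minimal sketch of the "statically linked" C use case, assuming the OMTensor/OMTensorList C API declared in include/OnnxMlirRuntime.h and the run_main_graph entry point that the compiler emits into the model .so; the 3x2 float input is just a placeholder.

    // C++ shown for illustration; in plain C the extern "C" is simply dropped.
    // The binary is linked at build time against the compiler-generated model .so.
    #include <OnnxMlirRuntime.h>
    #include <stdio.h>

    // Entry point emitted by the compiler into the model library.
    extern "C" OMTensorList *run_main_graph(OMTensorList *);

    int main() {
      // Placeholder 3x2 float input filled with ones.
      static float x1Data[] = {1., 1., 1., 1., 1., 1.};
      int64_t shape[] = {3, 2};
      OMTensor *x1 = omTensorCreate(x1Data, shape, /*rank=*/2, ONNX_TYPE_FLOAT);
      OMTensor *inputs[] = {x1};
      OMTensorList *inputList = omTensorListCreate(inputs, /*n=*/1);

      // Run the model and read back the first output tensor.
      OMTensorList *outputList = run_main_graph(inputList);
      OMTensor *y = omTensorListGetOmtByIndex(outputList, 0);
      float *yData = (float *)omTensorGetDataPtr(y);
      printf("y[0] = %f\n", yData[0]);
      // (Tensor/list cleanup omitted for brevity.)
      return 0;
    }

Everything used above belongs to the "run only" layer, so it is roughly the set of functions a Runtime/CRuntime library would have to carry.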
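For item 2, by contrast, here is a rough sketch of the dynamic-loading path that Runtime/CPPRuntime would own. The ExecutionSession usage below is paraphrased from the current src/Runtime/ExecutionSession.hpp; its exact constructor and run() signatures may differ, so treat the names as illustrative only.

    #include <OnnxMlirRuntime.h>
    // Paraphrased interface; see src/Runtime/ExecutionSession.hpp for the real one.
    #include "ExecutionSession.hpp"

    int main() {
      // The session dlopen()s the model and caches pointers to its entry points,
      // so nothing from the model .so needs to be known at build time.
      onnx_mlir::ExecutionSession session("./model.so");

      // Inputs are built with the same OMTensor C API as in the previous sketch.
      float x1Data[6] = {1., 1., 1., 1., 1., 1.};
      int64_t shape[] = {3, 2};
      OMTensor *x1 = omTensorCreate(x1Data, shape, /*rank=*/2, ONNX_TYPE_FLOAT);
      OMTensor *inputs[] = {x1};
      OMTensorList *inputList = omTensorListCreate(inputs, /*n=*/1);

      // Hypothetical run() taking/returning an OMTensorList, mirroring run_main_graph.
      OMTensorList *outputList = session.run(inputList);
      (void)outputList;
      return 0;
    }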

Note that we may want to introduce more "OM"-prefixed names for our runtime libraries, as CRuntime is a very generic name.
