
Accelerating inference: let javacpp-pytorch support AOTIModelPackageLoader, AOTIModelContainerRunnerCuda, and Profiler export #1729

@mullerhai

Description


Hi @saudet,

For me, AOTIModelPackageLoader, AOTIModelContainerRunnerCuda, and the Profiler are critical for future inference acceleration. I am confident that javacpp-pytorch could be used to build a large-model inference engine that rivals the performance of vLLM, SGLang, and DeepSpeed, and a Java-based inference engine could one day stand alongside Apache Spark and Flink as an ecosystem tool. You may consider this too far ahead of its time, especially since many Java programmers are still at a beginner level, stuck at working out how to use tensors or matrix calculations. Even so, from the standpoint of ecosystem completeness, AOTIModelPackageLoader, AOTIModelContainerRunnerCuda, and the Profiler are indispensable for javacpp-pytorch.

If you can implement this, I am willing to pay 200 US dollars. This is not just for me; it truly benefits the Java, Scala, and Kotlin ecosystems, and I believe that in the future more major companies will consider these capabilities as well. With AOTIModelPackageLoader, AOTIModelContainerRunnerCuda, and the Profiler in place, I even feel javacpp-pytorch would not need updates for the next five years, because it would be genuinely complete, with no need for other new, breakthrough features.

For most people, these topics are already quite advanced; many may never encounter them, or may never even have considered them. From the perspective of accelerating inference, however, they are essential requirements.
