Description
Hey
The Problem I'm Facing:
I’m using DJL Serving v0.31.0-SNAPSHOT in Docker with:
- 4GB shared memory
- GPU enabled
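For context, this is roughly how I start the container; the image tag, port, and model path below are placeholders, not my exact values:

```shell
# Sketch of my docker run invocation (hypothetical tag and paths):
#   --gpus all      exposes the GPU to the container
#   --shm-size=4g   gives the container 4GB of shared memory
#   -p 8080:8080    maps DJL Serving's default HTTP port
docker run -it --rm \
  --gpus all \
  --shm-size=4g \
  -p 8080:8080 \
  -v /path/to/models:/opt/ml/model \
  deepjavalibrary/djl-serving:0.31.0-pytorch-gpu
```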
I’m trying to use DJL Serving as a dynamic batcher for requests from my Java client to avoid performance issues. However, Serving keeps crashing with heap-space errors, which causes the Docker container to stop.
Logs:
2025-03-31 15:28:01 INFO ModelServer.txt
I’ve tried adjusting the memory allocation in Docker and experimenting with the WLM (workload manager) settings.
Error on WLM:
java.util.concurrent.ExecutionException: ai.djl.serving.wlm.util.WlmException: Receiver class ai.djl.pytorch.engine.PtNDArray does not define or inherit an implementation of the resolved method 'abstract java.nio.ByteBuffer toByteBuffer(boolean)' of interface ai.djl.ndarray.NDArray.
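In case this is version skew between the DJL API and the PyTorch engine, I’m planning to pin every DJL artifact to a single version via the DJL BOM. A minimal sketch of what I mean (Maven; the 0.31.0 version and the exact artifact list are assumptions about my setup):

```xml
<!-- Sketch: align all DJL artifact versions through the BOM (version is an assumption). -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>ai.djl</groupId>
      <artifactId>bom</artifactId>
      <version>0.31.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
<dependencies>
  <!-- No explicit versions here; the BOM supplies one consistent version. -->
  <dependency>
    <groupId>ai.djl</groupId>
    <artifactId>api</artifactId>
  </dependency>
  <dependency>
    <groupId>ai.djl.pytorch</groupId>
    <artifactId>pytorch-engine</artifactId>
    <scope>runtime</scope>
  </dependency>
</dependencies>
```

The idea is to rule out a mix of an older `pytorch-engine` (whose `PtNDArray` predates `toByteBuffer(boolean)`) with a newer `ai.djl:api` on the classpath.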
Regarding these issues, I have two questions:
- Are there known memory optimizations for Serve in Docker?
- Is the WLM error related to a PyTorch engine compatibility issue?
I’d greatly appreciate any guidance! Sorry for any confusion, and thank you for your time.