For QA tasks, it is better for the model to be stateful. Triton example: https://github.com/triton-inference-server/server/issues/1172
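
To make "stateful" concrete, here is a minimal Python sketch of a model wrapper that keeps per-conversation state between requests. It is not Triton's API and is not taken from the linked issue. The class `StatefulQAModel`, its `infer` method, and the `sequence_id`/`start`/`end` parameters are hypothetical names chosen for illustration, loosely modelled on the idea of tagging each request with a sequence id plus start and end flags.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulQAModel:
    """Toy stand-in for a QA model that keeps per-conversation state (hypothetical)."""

    # Maps a conversation (sequence) id to the questions seen so far.
    _history: dict = field(default_factory=dict)

    def infer(self, sequence_id, question, start=False, end=False):
        # A start flag resets any stale state for this conversation.
        if start:
            self._history[sequence_id] = []
        turns = self._history.setdefault(sequence_id, [])
        turns.append(question)
        # A real model would condition its answer on the earlier turns;
        # this sketch only reports the turn number to show that state persists.
        answer = f"turn {len(turns)}: {question}"
        # An end flag releases the state so memory does not grow without bound.
        if end:
            del self._history[sequence_id]
        return answer


if __name__ == "__main__":
    model = StatefulQAModel()
    print(model.infer(1, "Who wrote the paper?", start=True))
    print(model.infer(1, "When was it published?", end=True))
```

The point of the design is that the second request can be interpreted in light of the first, which a stateless model could only do if the client resent the whole conversation each time.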