I have set up X-codec-2.0 inference in streaming mode, but I get better performance in non-streaming mode than in streaming mode (60 tokens per streaming chunk).
If possible, please let me know why the performance differs between streaming and non-streaming. If I want the same performance in both modes, what approach should I take?