Is there any plan to support a varlen (variable-length) interface? It seems useful for batch generation, and some KV cache prefetch scheme could further improve performance.