The LLM generate API (https://github.com/NVIDIA/NeMo/blob/93a77b1d56df80ca7f772ad37563ab1ec39a21b6/nemo/collections/llm/api.py#L1037) fails with chat-format data.