I attempted to run the VLM application with:
Efficient-Large-Model/NVILA-Lite-2B
This model uses the qwen2 architecture, which is not currently supported by the mlc-llm library that this repo installs in its Dockerfile.
Any pointers on how we could add support for this architecture?
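For reference, the architecture a loader needs to support can be read from the checkpoint's `config.json` (the `model_type` / `architectures` fields, which is how Hugging Face-style loaders dispatch). A minimal sketch, with illustrative field values rather than an exact copy of the NVILA-Lite-2B config:

```python
import json

# Illustrative config.json fragment for a qwen2-based LLM backbone
# (field values are assumptions for demonstration, not copied verbatim
# from Efficient-Large-Model/NVILA-Lite-2B).
sample_config = json.loads("""
{
    "architectures": ["Qwen2ForCausalLM"],
    "model_type": "qwen2"
}
""")

def detect_architecture(config: dict) -> str:
    """Return the model_type string a runtime would need to support."""
    # Prefer the explicit model_type; fall back to the architectures list.
    return config.get("model_type") or config["architectures"][0]

print(detect_architecture(sample_config))  # qwen2
```

If mlc-llm rejects the model, this is the identifier its model registry would need a mapping for.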