vision_llm

vision_llm is intended to act as an inference engine for visual models.

Objectives:

- Run most visual models from a CLI
- Use the GPU when available for fast inference
- Expose an endpoint on a local port that accepts images and instructions
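To make the GPU and endpoint objectives concrete, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation. The JSON field names `image_b64` and `instruction`, the port `8000`, the helper names, and the use of PyTorch for GPU detection are all assumptions made for illustration, and `run_model` is a hypothetical placeholder where real model inference would go.

```python
import base64
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, otherwise "cpu"."""
    try:
        import torch  # assumed optional dependency, not confirmed by the project
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def run_model(image: bytes, instruction: str, device: str) -> str:
    """Hypothetical placeholder: a real engine would run a visual model here."""
    return f"[{device}] received {len(image)} image bytes for instruction: {instruction!r}"


def handle_payload(body: bytes, device: str) -> Tuple[int, dict]:
    """Validate a JSON request body and return (HTTP status, response dict)."""
    try:
        payload = json.loads(body)
        # Assumed request shape: {"image_b64": "<base64>", "instruction": "<text>"}
        image = base64.b64decode(payload["image_b64"], validate=True)
        instruction = payload["instruction"]
    except (ValueError, KeyError, TypeError) as exc:
        return 400, {"error": f"bad request: {exc}"}
    if not isinstance(instruction, str):
        return 400, {"error": "instruction must be a string"}
    return 200, {"output": run_model(image, instruction, device)}


class InferenceHandler(BaseHTTPRequestHandler):
    device = "cpu"  # overwritten by serve() before the server starts

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_payload(self.rfile.read(length), self.device)
        data = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Listen on localhost only; blocks until interrupted."""
    InferenceHandler.device = pick_device()
    HTTPServer(("127.0.0.1", port), InferenceHandler).serve_forever()
```

Calling `serve()` starts the listener on `127.0.0.1`, so the endpoint is reachable only from the local machine. Keeping the request validation in `handle_payload`, separate from the HTTP handler, lets it be exercised without opening a socket.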