Can MLCD-ViT-B-32-224px work with LLaVA-NeXT LLaVA-Video-7B-Qwen2?

I'm currently studying LLaVA-Video-7B-Qwen2, which uses siglip-so400m-patch14-384 as its vision model. Could you share how to switch the vision model to MLCD-ViT-B-32-224px?
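
For context, the two encoders seem to produce very different patch grids, which I assume is part of why a plain swap may not be enough. Here is a rough sketch of the arithmetic, reading the image size and patch size straight from the model names. The helper `patch_tokens` is just a name I made up for illustration, and it ignores any CLS token, pooling, or resampling the actual pipeline may apply:

```python
def patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches for a square input image.

    Hypothetical helper for illustration only; it is not part of LLaVA-NeXT
    or MLCD. Uses floor division, so pixels that do not fill a whole patch
    are ignored.
    """
    per_side = image_size // patch_size
    return per_side * per_side


# Sizes taken from the model names, not from the actual config files.
siglip_tokens = patch_tokens(image_size=384, patch_size=14)  # siglip-so400m-patch14-384
mlcd_tokens = patch_tokens(image_size=224, patch_size=32)    # MLCD-ViT-B-32-224px

print(f"siglip-so400m-patch14-384: {siglip_tokens} patch tokens per frame")
print(f"MLCD-ViT-B-32-224px: {mlcd_tokens} patch tokens per frame")
```

If these numbers are right, the token count per frame changes a lot, so I would guess the projector and any pooling settings need to change as well, but I am not sure.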