Recently, we have been looking into optimizing TVM models with tuning and quantization, so we need a way to handle the optimized models. I don't think we should tune models in the CI/CD pipeline because it is time-consuming: for example, tuning centerpoint takes around 1 hour. TVM quantization is similar, since it also requires retraining and tuning. I think these steps can be done offline for now, with the results uploaded manually.
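
As a rough sketch of what "handling" offline-tuned artifacts could look like, the snippet below writes a checksum manifest next to the tuned files before they are uploaded manually. CI could then verify the upload cheaply instead of re-running the tuning. The directory layout, the `manifest.json` name, and both function names are assumptions made for illustration; nothing here is an existing TVM or pipeline API.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical file name, not an existing convention


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(artifact_dir: Path, model_name: str) -> Path:
    """Record each file in artifact_dir with its checksum (run offline, after tuning)."""
    files = {
        p.name: sha256_of(p)
        for p in sorted(artifact_dir.iterdir())
        if p.is_file() and p.name != MANIFEST_NAME
    }
    out = artifact_dir / MANIFEST_NAME
    out.write_text(json.dumps({"model": model_name, "files": files}, indent=2))
    return out


def verify_manifest(artifact_dir: Path) -> bool:
    """Check that every listed file exists and matches its checksum (cheap enough for CI)."""
    manifest = json.loads((artifact_dir / MANIFEST_NAME).read_text())
    return all(
        (artifact_dir / name).is_file()
        and sha256_of(artifact_dir / name) == digest
        for name, digest in manifest["files"].items()
    )
```

With something like this, the slow steps (tuning, retraining for quantization) stay offline, and the pipeline only calls `verify_manifest` on the uploaded directory.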