Run local LLM models on Google Colab and access them remotely via API — ideal for lightweight, cost-effective development and testing using Ollama and Cloudflare Tunnel.
✅ Access your Colab-hosted LLM API from anywhere — even inside VS Code using the ROO Code extension!
- 🔥 Run advanced LLMs (like Qwen, LLaMA3, Mistral, DeepSeek) in Colab using Ollama
- 🌐 Expose the model via a secure public URL using `cloudflared`
- 🧑‍💻 Integrate with ROO Code in VS Code for seamless coding assistance
- ✅ Automatically detects and waits for Ollama to be ready before tunneling
- 💡 Simple, professional, and reusable setup
- A Google Colab account
- A GPU runtime (preferably T4 High-RAM or better)
- No installation or cloud account needed for Cloudflare tunneling
- Installs and launches Ollama in the background
- Pulls the selected model (e.g., `maryasov/qwen2.5-coder-cline:7b-instruct-q8_0`)
- Waits until Ollama is running and responsive
- Starts a Cloudflare tunnel to expose `http://localhost:11434`
- Prints a public `.trycloudflare.com` URL, ready to use
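
In shell terms, the notebook's cells boil down to roughly the following (a minimal sketch; the exact wait loop, log file names, and cloudflared download path are illustrative, not the notebook's literal code):

```bash
# Install Ollama and start the server in the background (Colab runs these with a leading "!")
curl -fsSL https://ollama.com/install.sh | sh
nohup ollama serve > ollama.log 2>&1 &

# Wait until the Ollama API answers on its default port (11434)
until curl -s http://localhost:11434 > /dev/null; do sleep 2; done

# Pull the selected model
ollama pull maryasov/qwen2.5-coder-cline:7b-instruct-q8_0

# Download cloudflared and open a quick tunnel to the Ollama port;
# the public *.trycloudflare.com URL appears in the tunnel log
wget -q https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -O cloudflared
chmod +x cloudflared
nohup ./cloudflared tunnel --url http://localhost:11434 > cloudflared.log 2>&1 &
```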
Follow these steps to get your local LLM running in Colab and accessible via public API:
- Import the `.ipynb` notebook into Google Colab
  - Open colab.research.google.com and upload the notebook.
- Choose the runtime as T4 GPU
  - Go to `Runtime > Change runtime type` and select:
    - Hardware accelerator: GPU
    - GPU type: T4
  - Note: Colab GPU sessions last up to ~3 hours before disconnecting; you can restart the runtime afterwards.
- Run all cells
  - Click `Runtime > Run all`
  - Wait for the cells to complete. The model download can take a few minutes.
- Verify the API is working in Step 7
  - You'll see a generated public `trycloudflare.com` URL
  - The cell will also run a test `curl` request
- Click the public link
  - You should see the message: “Ollama is running”
  - This confirms the API is live and ready to be used from tools like curl or ROO Code in VS Code (see the example below)
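
For instance, you can exercise the API from any machine once the tunnel is up (the hostname below is a placeholder; use the URL your notebook printed and the model tag you pulled):

```bash
# Liveness check: should print "Ollama is running"
curl https://bold-sky-1234.trycloudflare.com

# Send a prompt to the hosted model through Ollama's /api/generate endpoint
curl https://bold-sky-1234.trycloudflare.com/api/generate -d '{
  "model": "maryasov/qwen2.5-coder-cline:7b-instruct-q8_0",
  "prompt": "Write a Python one-liner that reverses a string.",
  "stream": false
}'
```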
- Install the ROO Code extension
- Open the extension settings
- Set the API Provider to Ollama
- Paste the public URL from Colab (e.g., `https://bold-sky-1234.trycloudflare.com`); do not include a trailing `/` at the end of the link
- Choose your model
- Done! You can now prompt your Colab-hosted model from your local VS Code 💬
Feel free to open issues, suggest improvements, or submit pull requests. Let's make local model hosting accessible for everyone!