- Navigate to LM Studio’s official website.
- Download and install the appropriate version for your operating system.
- Install ngrok (Windows example using Chocolatey):
choco install ngrok
- Configure your ngrok auth token:
ngrok config add-authtoken <YOUR_NGROK_AUTH_TOKEN>
Replace `<YOUR_NGROK_AUTH_TOKEN>` with your actual ngrok token.
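To confirm the token was saved, you can ask ngrok to validate its configuration file:

```
# Validates the ngrok config file and reports any problems
ngrok config check
```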
- Open LM Studio.
- In LM Studio, click the Settings button (gear icon).
- Adjust the following settings (example):
  - Server Port: `1234`
  - Enable CORS: ON
  - Serve on Local Network: ON
  - Just-in-Time Model Loading: ON (optional)
  - Auto unload unused models (optional)
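If you prefer the command line, newer LM Studio builds also ship an `lms` CLI that can start the same server headlessly. A rough sketch, assuming the `lms` CLI is installed and on your PATH and that your version supports these subcommands (CORS and network options still come from the GUI settings above):

```
# Start LM Studio's local server on port 1234 (flag names may vary by version)
lms server start --port 1234

# Confirm the server is running
lms server status
```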
- Start the server. Once running, LM Studio should display something like:
Ready
Reachable at: http://<YOUR_IP_ADDRESS>:1234
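Before exposing the server through a tunnel, it's worth confirming it responds locally (adjust the port if you changed it):

```
# Should return a JSON list of models if the LM Studio server is up
curl http://localhost:1234/v1/models
```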
- Open your terminal (Command Prompt / PowerShell / etc.).
- Run the following command to create an ngrok tunnel:
ngrok http 1234 --host-header="<YOUR_IP_ADDRESS>:1234"
- Replace `<YOUR_IP_ADDRESS>` with your machine's local IP address if different. `1234` should match the port you set in LM Studio.
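If you're unsure of your local IP address, on Windows (matching the Chocolatey example above) it appears as the IPv4 Address of your active adapter:

```
# Windows: show the IPv4 address(es) of your network adapters
ipconfig | findstr /i "IPv4"
```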
- Check the ngrok output. You should see a forwarding URL similar to:
Forwarding    https://<random-string>.ngrok-free.app -> http://localhost:1234
- Open a web browser and navigate to the ngrok forwarding URL you see in the terminal, for example:
https://<random-string>.ngrok-free.app/v1/models
- You should see a JSON response listing the available models, for example:
{
  "data": [
    {
      "id": "gpt4all-model-id",
      "object": "model",
      "owned_by": "organization_owner"
    },
    ...
  ]
}
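The same check works from a terminal; substitute your own forwarding URL for the placeholder below:

```
# Same models listing, but via the public ngrok tunnel
curl https://<random-string>.ngrok-free.app/v1/models
```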
- Other endpoints to test:
  - POST /v1/chat/completions
  - POST /v1/completions
  - GET /v1/models
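For example, a minimal chat completion through the tunnel (bash/curl syntax, adapt the quoting for PowerShell; the URL and model id are placeholders — use your own forwarding URL and an id returned by /v1/models, and make sure a model is loaded or Just-in-Time loading is enabled):

```
# Minimal OpenAI-style chat completion against the tunnel (placeholder URL and model id)
curl https://<random-string>.ngrok-free.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-2-7b-chat",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "temperature": 0.7
      }'
```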
- In LM Studio, select the Models panel on the right.
- Choose a model (e.g., `llama-2-7b-chat`, `llama-2-13b`, etc.) and click Load.
- Wait for the model to finish loading. You'll see a "Ready" status in the LM Studio logs.
- Open Cursor, then go to:
- File → Preferences → Settings (or your OS equivalent).
- In Cursor Settings, locate the OpenAI API Key section.
- Disable “Use OpenAI Key” (if you only want to use the local model).
- In the Override OpenAI Proxy URL (or similar field), enter your ngrok URL:
https://<random-string>.ngrok-free.app
- Click Verify (or any save/confirm button). Cursor should confirm it can reach the local model via ngrok.
- In Cursor, open a file or start a new code file.
- Begin typing or prompt the AI with a query (select an OpenAI turbo model in Cursor so the request is sent through the API).
- The request will be routed to your local LLM through LM Studio, exposed via ngrok.
- CORS Issues: Make sure “Enable CORS” is turned on in LM Studio if you see any cross-origin errors.
- Port Conflicts: If port `1234` is in use, choose another port in LM Studio and update your ngrok command accordingly (see the port check after this list).
- Firewall: Ensure your firewall allows inbound connections on the chosen port.
- Ngrok Free Plan Limits: If you experience connection drops, it may be related to ngrok’s free-tier session limits.
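To check whether another process is already holding port 1234 (Windows example, as referenced in the Port Conflicts note above):

```
# List any process currently listening on port 1234 (the owning PID is in the last column)
netstat -ano | findstr :1234
```

If a LISTENING entry appears, either stop that process or pick a different port in LM Studio and in your ngrok command.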
Enjoy running your local LLM in Cursor AI using LM Studio!