How to Run a Local LLM in Cursor AI Using LM Studio and ngrok

1. Install LM Studio

  1. Navigate to LM Studio’s official website.
  2. Download and install the appropriate version for your operating system.

2. Install and Configure ngrok

  1. Install ngrok (Windows example using Chocolatey):
choco install ngrok
  2. Configure your ngrok auth token:
ngrok config add-authtoken <YOUR_NGROK_AUTH_TOKEN>

Replace <YOUR_NGROK_AUTH_TOKEN> with your actual ngrok token.
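
To confirm the install and that the token was saved, you can optionally run the following (this assumes ngrok v3, where the config subcommand exists):

ngrok version
ngrok config check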


3. Run LM Studio Server

  1. Open LM Studio.
  2. In LM Studio, click the Settings button (gear icon).
  3. Adjust the following settings (example):
    • Server Port: 1234
    • Enable CORS: ON
    • Serve on Local Network: ON
    • Just-in-Time Model Loading: ON (optional)
    • Auto Unload Unused Models: optional (enable it if you want idle models unloaded automatically)

  4. Start the server. Once running, LM Studio should display something like:
Ready
Reachable at: http://<YOUR IP ADDRESS>:1234
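
Before exposing anything, you can sanity-check the server from the same machine. This is a minimal check; it assumes curl is available and that you kept port 1234:

curl http://localhost:1234/v1/models

If the server is running, it returns a JSON model list (the list may be empty until a model is loaded in Step 6).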


4. Expose LM Studio with ngrok

  1. Open your terminal (Command Prompt / PowerShell / etc.).

  2. Run the following command to create an ngrok tunnel:

ngrok http 1234 --host-header="http://<YOUR IP ADDRESS>:1234"
    • Replace <YOUR IP ADDRESS> with your machine's local IP address if different.
    • 1234 should match the port you set in LM Studio.

  3. Check the ngrok output. You should see a forwarding line similar to:

Forwarding   https://<random-string>.ngrok-free.app -> http://localhost:1234
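
If you want to read the forwarding URL programmatically, the ngrok agent's local inspection interface (on port 4040 by default) exposes it as JSON. This assumes the default web-interface settings:

curl http://127.0.0.1:4040/api/tunnels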


5. Verify the LM Studio Endpoints

  1. Open a web browser and navigate to the ngrok forwarding URL you see in the terminal, for example:
https://<random-string>.ngrok-free.app/v1/models
  2. You should see a JSON response listing the available models, for example:
{
  "data": [
    {
      "id": "gpt4all-model-id",
      "object": "model",
      "owned_by": "organization_owner"
    },
    ...
  ]
}

  3. Other endpoints to test (a curl sketch follows this list):
  • POST /v1/chat/completions
  • POST /v1/completions
  • GET /v1/models
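
A minimal chat-completions test with curl might look like the following. The model id is a placeholder; use an id returned by /v1/models and replace the ngrok URL with your own. The command is shown with bash-style quoting and line continuations; adjust it for PowerShell or cmd:

curl https://<random-string>.ngrok-free.app/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "<model-id-from-/v1/models>", "messages": [{"role": "user", "content": "Hello"}]}'

The response should be an OpenAI-style JSON object containing a "choices" array.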


6. Load a Model in LM Studio

  1. In LM Studio, select the Models panel on the right.
  2. Choose a model (e.g., llama-2-7b-chat, llama-2-13b, etc.) and click Load.
  3. Wait for the model to finish loading. You’ll see a “Ready” status in LM Studio logs.
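
If you prefer the terminal, recent LM Studio releases also ship an lms command-line tool. Assuming it is installed and on your PATH (command names come from the lms CLI and may vary between versions), one way to load a model and confirm it is:

lms load <model-key>
lms ps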


7. Configure Cursor AI to Use Your ngrok Endpoint

  1. Open Cursor, then go to:
    • File > Preferences > Settings (or your OS equivalent).
  2. In Cursor Settings, locate the OpenAI API Key section.
  3. Disable “Use OpenAI Key” (if you only want to use the local model).
  4. In the Override OpenAI Proxy URL (or similar field), enter your ngrok URL:
https://<random-string>.ngrok-free.app
  5. Click Verify (or any save/confirm button). Cursor should confirm it can reach the local model via ngrok.


8. Start Using Cursor with Your Local LLM

  1. In Cursor, open a file or start a new code file.
  2. Begin typing or prompt the AI with a query. (Select an OpenAI "turbo" model in Cursor's model picker so the request is sent through the overridden API.)
  3. The request will be routed to your local LLM through LM Studio, exposed via ngrok.

Troubleshooting Tips

  • CORS Issues: Make sure “Enable CORS” is turned on in LM Studio if you see any cross-origin errors.
  • Port Conflicts: If port 1234 is in use, choose another port in LM Studio and update your ngrok command accordingly (a quick way to check the port is shown after this list).
  • Firewall: Ensure your firewall allows inbound connections on the chosen port.
  • Ngrok Free Plan Limits: If you experience connection drops, it may be related to ngrok’s free-tier session limits.
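
For example, on Windows you can check whether something is already bound to port 1234 (this assumes the standard built-in tools):

netstat -ano | findstr :1234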

Enjoy running your local LLM in Cursor AI using LM Studio!
