I wanted the best of both worlds: local models for privacy and cost, plus the scaffolding that makes modern tools actually pleasant to use. The plan was straightforward: LM Studio running the model, LiteLLM as the OpenAI‑compatible face, and three clients in the mix (Cursor, Opencode, and Crush). No spend, no drama, just enough configuration to make things click.
The Toolkit
- LM Studio: Local model runner with an OpenAI-compatible API.
- LiteLLM: OpenAI‑format proxy/gateway so clients can call LLMs via the OpenAI API shape. Here it fronts LM Studio locally.
- Opencode & Crush: Clients that can talk to a local endpoint.
- Cursor: Powerful, but particular about how it reaches “local.”
- No‑IP: A free hostname so a domain resolves to my home IP.
Warming Up: Local Pieces First
I started with LM Studio. Model running, API on. Then LiteLLM, pointed at LM Studio. Nothing cute, just enough to hand out POST /v1/chat/completions like a polite OpenAI stand‑in.
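To keep debugging honest, I sanity-checked each hop with curl before touching any client. A minimal sketch; the LiteLLM port (8000) and the key are placeholders, so swap in your own:
# LM Studio answers OpenAI-style on its default port.
curl http://localhost:1234/v1/models

# Same request shape through LiteLLM once it's up.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-local-xxxx" \
  -d '{"model": "local-coder", "messages": [{"role": "user", "content": "ping"}]}'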
Opencode and Crush didn’t blink. I gave them LiteLLM on a local IP and port, and they answered like good citizens. Prompts in, responses out. No detours, no mystery boxes.
The Twist: Cursor Isn’t Local-Local
Cursor was different. It didn’t just want a public access point; it wanted a domain. The aha moment was seeing the traffic path: requests came through Cursor’s servers, not from my machine. If you want Cursor to “use” your local LLM, you’re really letting it reach you over the public internet.
Fine. I gave it a domain via No-IP for free. One hostname, pointed at my public IP, monthly reactivation ritual. In Cursor’s Settings I changed the OpenAI endpoint to that domain and used a fake key. After that, prompts went through Cursor → my domain → LiteLLM → LM Studio, and answers came back fast enough to keep me in flow.
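Before flipping the switch in Cursor, it's worth confirming the public path actually reaches LiteLLM. A quick check from outside your network (hostname and port here are placeholders; your router has to forward that port to the machine running LiteLLM):
curl http://yourname.ddns.net:8000/v1/models \
  -H "Authorization: Bearer sk-local-xxxx"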
Unexpected bonus: those runs didn’t burn my Cursor request usage. I’ll take it.
What I Configured
I kept it boring on purpose. Two local configs for Opencode and Crush, and a LiteLLM config that points to LM Studio. These are sanitized; swap placeholders for your environment.
litellm_config.yaml
model_list:
  - model_name: local-coder
    litellm_params:
      model: openai/local-coder
      api_base: http://localhost:1234/v1
      api_key: sk-local-xxxx
      timeout: 600
      stream: true
  - model_name: local-small
    litellm_params:
      model: openai/local-small
      api_base: http://localhost:1234/v1
      api_key: sk-local-xxxx
      timeout: 600
      stream: true
general_settings:
  default_model: local-coder
  completion_timeout: 600
  server_settings:
    streaming_supported: true
    proxy_server_timeout: 600
    uvicorn_params:
      timeout_keep_alive: 600
Note: LiteLLM expects model IDs in the provider/model form. The openai/ prefix is required here (e.g., openai/local-coder).
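Starting the proxy is one line; the port is my assumption and just needs to match whatever the clients below point at:
litellm --config litellm_config.yaml --port 8000
A curl to /v1/models afterward should list both local-coder and local-small.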
opencode.json
{
  "provider": {
    "litellm": {
      "name": "LiteLLM (Local)",
      "npm": "@ai-sdk/openai-compatible",
      "models": {
        "local-coder": {}
      },
      "options": {
        "apiKey": "sk-local-xxxx",
        "baseURL": "http://localhost:8001/v1"
      }
    }
  },
  "$schema": "https://opencode.ai/config.json"
}
crush.json
{
  "providers": {
    "litellm": {
      "type": "openai",
      "base_url": "http://localhost:8000/v1",
      "api_key": "sk-local-xxxx",
      "models": [
        { "id": "local-coder", "name": "Local Coder", "context_window": 200000, "default_max_tokens": 4096 },
        { "id": "local-small", "name": "Local Small", "context_window": 200000, "default_max_tokens": 4096 }
      ]
    }
  },
  "$schema": "https://opencode.ai/config.json"
}Cursor: Settings, Not a Config File
No JSON here. In Cursor’s Settings, change the OpenAI API endpoint to your No-IP domain and use a dummy key. Cursor wants a domain (not just an IP) and will proxy your prompts through its servers to reach you. That’s the toll for using local models inside Cursor’s workflows.
The Payoff
In the end, the editors and CLIs handled the scaffolding (snippets, structure, and flow) while the heavy lifting ran on models in my own setup. No paid plans. No heroics. Just a tiny detour to get Cursor on board with a free domain. I'll be passing on Cursor unless the need is dire, but the other tools are worth playing with.
TL;DR
- Opencode/Crush: Point them at http://localhost:<port>/v1 on LiteLLM. They just work.
- Cursor: In Settings, switch the OpenAI endpoint to your No-IP domain and use a fake key. Requests route through Cursor’s servers, but compute stays local and usage doesn’t tick up.
- LM Studio + LiteLLM: Enough to power all three.