LM Studio

✓ Editorially verified

Desktop app for discovering, downloading, and running open-weight LLMs locally with an OpenAI-compatible server.

Freemium· Free for personal and commercial use; paid LM Studio for Work / Enterprise tierAgentsMulti-model (gpt-oss, Qwen3, Gemma, DeepSeek-R1, Llama, others)

Visit website →

Best for

Pick LM Studio if you want the fastest path from 'I have a laptop' to 'I have a private OpenAI-compatible endpoint serving Qwen or gpt-oss.'

Skip if

Skip it if you need cloud-scale throughput, multi-tenant serving, or a fully open-source stack you can audit end-to-end.

LM Studio is a cross-platform desktop application (macOS, Windows, Linux) that turns any reasonably modern laptop or workstation into a private inference rig for open-weight language models. It bundles a model browser backed by Hugging Face, a chat UI, a llama.cpp and Apple MLX runtime, and a local OpenAI-compatible HTTP server so existing SDK code can be pointed at localhost with a one-line base URL change. Recent builds ship with first-class support for gpt-oss, Qwen3, Gemma, DeepSeek-R1, and other current open-weight families, plus an MCP client for tool use.

It is free for personal and commercial work, with a separate paid enterprise tier that adds centralized device management, SSO, and offline licensing. The audience is developers who want to prototype against local models without an API bill, privacy-sensitive teams handling regulated data, and tinkerers running quantized models on consumer GPUs or Apple Silicon. A headless `llmster` runtime and the `lms` CLI extend the same stack to servers, while official JavaScript and Python SDKs (npm/pip) wrap the local server for agent and RAG workloads.

LM Studio itself is not open source, but it leans heavily on open runtimes (llama.cpp, MLX) and only runs open-weight models you download yourself. The main caveats are hardware-bound performance, no first-party fine-tuning, and a desktop-first UX that still feels lighter than server-grade tooling like vLLM or Ollama-on-Kubernetes.

Editor's take

LM Studio is the friendliest on-ramp to local inference we know of, and the OpenAI-compatible server is the feature that makes it stick in real workflows. It's not a replacement for vLLM or TGI in production, but for desktop prototyping, offline demos, and privacy-bound work it's hard to beat at the price of zero.

— The AI Tool Bible editorial team