mini-SWE-agent
A 100-line open-source coding agent that scores 74%+ on SWE-bench Verified.
Pick mini-SWE-agent if you want a tiny, readable, model-agnostic coding-agent baseline to benchmark, extend, or embed in a research pipeline.
Skip it if you want a polished IDE assistant or a managed autonomous-engineer SaaS with dashboards and team features.
mini-SWE-agent is a deliberately minimal AI coding agent from the Princeton/Stanford team behind SWE-bench and SWE-agent. The entire core is roughly 100 lines of Python with a linear, append-only message history and a single tool: bash via subprocess. That brutal simplicity is the point — and it still cracks 74%+ on SWE-bench Verified, putting it in the same league as far more elaborate agent frameworks.
It's aimed at researchers, agent tinkerers, and engineers who want a hackable baseline rather than a productized assistant. Model-agnostic via litellm, OpenRouter, and Portkey, so you can plug in Claude, GPT, Gemini, or local models. Sandbox options span local, Docker, Podman, Singularity, and Bubblewrap, which makes it usable for batch inference on benchmark suites without root-level worries. Apache 2.0 and free.
There are Python bindings for embedding it in larger pipelines, but there's no hosted product, no UI, no SaaS billing, and no support contract. If you want a Cursor-style IDE or a managed autonomous engineer, look elsewhere; if you want a readable reference implementation that you can read in one sitting and extend in an afternoon, this is the one.
The most refreshing thing in agent-land in a while — a credible SWE-bench performer that fits on a single screen. We'd reach for this before any of the heavier autonomous-coder frameworks when the goal is to understand, modify, or benchmark agent behavior rather than ship a polished UX.
— The AI Tool Bible editorial team
Pros
- ✅ Roughly 100 lines of Python — trivially readable and hackable
- ✅ Scores 74%+ on SWE-bench Verified despite the minimal design
- ✅ Model-agnostic via litellm, OpenRouter, and Portkey
- ✅ Sandbox-friendly: Docker, Podman, Singularity, Bubblewrap, local
- ✅ Apache 2.0; backed by the SWE-bench and SWE-agent authors
Cons
- ⚠️ No GUI, IDE integration, or hosted product
- ⚠️ Bash-only tool surface — no built-in browser, search, or planning tools
- ⚠️ You build your own ops layer (logging, cost caps, retries)
- ⚠️ Aimed at researchers more than day-to-day end users
Use cases
Explore related
Compare with similar tools
All in Agents →LangGraph
FeaturedStateful, graph-based agent orchestration from LangChain.
CrewAI
FeaturedPython framework for multi-agent orchestration.
Claude Agent SDK
Anthropic's official SDK for building autonomous Claude agents.
Manus
Generalist agent for research, code, and web tasks.
Devin
Cognition Labs' "autonomous software engineer" agent.
AutoGPT
Open-source platform for building autonomous AI agents.