Langchain-Chatchat

Self-hostable RAG and agent framework that wires LangChain to any local open-source LLM and a knowledge base.

Free · Apache-2.0 open source; self-hosted, infra costs only · RAG · Multi-model (GLM-4, Qwen2, Llama 3, etc. via Xinference/Ollama/LocalAI/FastChat)


Best for

Pick Langchain-Chatchat if you need an open-source, on-prem RAG and agent scaffold that can drive local Qwen, GLM or Llama models against a private knowledge base.

Skip if

Skip it if you want a hosted, turnkey RAG product or a polished consumer chatbot, and would rather not manage Python, GPUs and a vector store yourself.

Langchain-Chatchat is an open-source RAG and agent application platform built on top of LangChain, designed to run fully offline against local LLMs. It bundles document ingestion, vectorization, retrieval, a FastAPI service and a Streamlit web UI so a team can stand up a private knowledge-base chatbot without piping documents through a third party. Out of the box it speaks to Xinference, Ollama, LocalAI, FastChat and One API, and works with GLM-4, Qwen2, Llama 3 and other open-weight models, plus BGE-class embedding models.
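
For readers new to RAG, the ingest, vectorize and retrieve flow described above can be sketched in a few lines of standard-library Python. This is an illustrative toy and not Langchain-Chatchat's own API: the function names (`chunk`, `embed`, `retrieve`, `build_prompt`) and the sample documents are hypothetical, and the bag-of-words `embed` merely stands in for a real embedding model such as a BGE-class one.

```python
import math
import re
from collections import Counter


def chunk(text, size=40):
    """Ingestion: split a document into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Vectorization: toy bag-of-words counts (assumption: a real stack
    would call an embedding model here instead)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, k=1):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, contexts):
    """Assemble the text that would be sent to a local LLM."""
    joined = "\n".join(contexts)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical private documents standing in for a knowledge base.
docs = [
    "The VPN gateway is restarted every Sunday at 02:00.",
    "Expense reports are due on the fifth day of each month.",
]

# The in-memory list plays the role of the vector store (FAISS, Milvus, ...).
index = [(c, embed(c)) for d in docs for c in chunk(d)]

query = "When are expense reports due?"
top = retrieve(query, index, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real deployment each piece is swapped for a heavier component: a document loader and splitter for `chunk`, an embedding model for `embed`, a vector database for the list, and a call to the locally served LLM with the assembled prompt.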

It is squarely aimed at developers and infrastructure teams who want a Chinese-and-English RAG stack they can air-gap on their own GPUs, not at end users buying a hosted SaaS. The project is Apache-2.0 and free; the only cost is your own compute (and any optional cloud LLM calls if you wire those up). With ~38k GitHub stars it is one of the most popular Chinese-language LangChain wrappers, and v0.3.x added a meaningful agent layer with tools for SQL chat, arXiv lookup, Wolfram, and text-to-image.

Caveats: this is an integration framework rather than a polished product, so expect to read code, manage Python and CUDA dependencies, and pick your own vector DB (FAISS, Milvus and others are supported). Documentation skews Chinese-first, and release cadence has slowed compared to the project's peak, so treat it as a strong scaffolding starter rather than a turnkey enterprise RAG appliance.

Editor's take

A pragmatic LangChain wrapper that solved the 'private GPT over my own docs' problem early and still holds up as a reference architecture. We would use it as a starting template rather than a finished product, and we would budget time for the Python and CUDA plumbing before any of the agent magic appears.

— The AI Tool Bible editorial team