📖 The AI Tool Bible

OneKE

Open-source multi-agent framework for schema-guided knowledge extraction from documents.

Free · MIT-licensed; you pay only for LLM API calls or self-hosted compute
RAG
Multi-model (OneKE-13B, LLaMA3, Qwen2.5, GPT, DeepSeek-R1)
Best for

Pick OneKE if you're building a domain knowledge graph and want a flexible, open-source extraction pipeline that runs against either API or local LLMs.

Skip if

Skip it if you need a managed SaaS extraction API, English-first docs, or a turnkey solution without DevOps work.

OneKE is a Dockerized, MIT-licensed knowledge extraction system from the ZJU-NLP lab and Ant Group's OpenSPG project. It uses a multi-agent LLM pipeline (schema agent, extraction agent, reflection agent) to pull structured facts out of raw text or HTML, PDF, Word, JSON, and TXT files, covering named entity recognition (NER), relation extraction, event extraction, triple extraction, and open-ended information extraction. Output can be assembled directly into a visualizable knowledge graph.
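To make the three-agent flow concrete, here is a minimal Python sketch of a schema, extraction, and reflection pass. It is not OneKE's actual API: every name in it (`Schema`, `schema_agent`, `extraction_agent`, `reflection_agent`, `run_pipeline`, `fake_llm`) is hypothetical, the LLM is a canned stub so the example runs offline, and the reflection step is reduced to a simple schema check rather than the case-based reflection the real project describes.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for any chat-completion call (API or local model).
LLM = Callable[[str], str]


@dataclass
class Schema:
    task: str          # e.g. "NER"
    fields: List[str]  # labels the caller wants back


def schema_agent(task: str, constraint: Optional[List[str]]) -> Schema:
    """Use the caller's predefined labels, else fall back to an assumed default set."""
    default = ["person", "organization", "location"]
    return Schema(task=task, fields=list(constraint) if constraint else default)


def extraction_agent(llm: LLM, schema: Schema, text: str) -> Dict[str, list]:
    """Ask the model for JSON keyed by the schema's fields and parse the reply."""
    prompt = (
        f"Task: {schema.task}\n"
        f"Return a JSON object with keys {schema.fields}.\n"
        f"Text: {text}"
    )
    return json.loads(llm(prompt))


def reflection_agent(schema: Schema, result: Dict[str, list]) -> Dict[str, list]:
    """Simplified check: drop keys outside the schema, fill in missing ones."""
    return {field: list(result.get(field, [])) for field in schema.fields}


def run_pipeline(llm: LLM, task: str, text: str,
                 constraint: Optional[List[str]] = None) -> Dict[str, list]:
    schema = schema_agent(task, constraint)
    raw = extraction_agent(llm, schema, text)
    return reflection_agent(schema, raw)


def fake_llm(prompt: str) -> str:
    """Canned reply for illustration only; includes an off-schema 'date' key."""
    return json.dumps(
        {"person": ["Ada Lovelace"], "organization": [], "date": ["1843"]}
    )


result = run_pipeline(
    fake_llm,
    task="NER",
    text="Ada Lovelace published her notes in 1843.",
    constraint=["person", "organization"],
)
print(result)
```

Swapping `fake_llm` for a real client call is the only change needed to try this against an API or local model; the schema and reflection steps stay the same, which is the point of separating the agents.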

The project's differentiator is flexibility on both ends: you can plug in OpenAI or DeepSeek-R1 via API, or run it fully locally against LLaMA3, Qwen2.5, ChatGLM4, MiniCPM3, or the bundled OneKE-13B model with optional vLLM acceleration. Schemas can be default, predefined, or self-deduced by the agent, and case-retrieval plus reflection loops let you trade speed for accuracy. It's aimed at researchers and engineers building domain-specific KGs who don't want to wire up extraction infrastructure from scratch.

Deployment is via Docker or Conda, with a Streamlit web UI for interactive runs and a HuggingFace Spaces demo. As an open-source, academic-led project, its polish lags behind commercial extraction APIs, and the Yuque-hosted user guide is mostly Chinese, but the breadth of supported tasks, models, and file types is rare among MIT-licensed tools.

Editor's take

OneKE is one of the more serious open-source attempts at productionizing LLM-based information extraction, and the multi-agent schema/reflection design is genuinely useful. The catch is that it's an academic-flavored release; expect to read Chinese docs and do real integration work. Worth it if you'd otherwise glue together LangChain agents yourself.

— The AI Tool Bible editorial team

Pros

  • Covers NER, RE, EE, and triple extraction in one framework
  • Works with API models or fully local LLMs, with optional vLLM acceleration
  • Ingests PDF, Word, HTML, JSON, and plain text out of the box
  • Multi-agent schema + reflection loop improves extraction quality
  • MIT license with Docker and Streamlit UI included

Cons

  • ⚠️ Documentation is primarily Chinese and scattered across Yuque/GitHub
  • ⚠️ Self-hosting and tuning agents is non-trivial for non-researchers
  • ⚠️ No managed cloud offering; you bring the infrastructure
  • ⚠️ Quality depends heavily on the underlying LLM you wire in

Use cases

knowledge-graph-construction · named-entity-recognition · relation-extraction · event-extraction · document-parsing
