RAG
Retrieval-augmented generation, vector stores, indexers.
70 tools
RAG isn't a model, it's an architecture: retrieve, augment, generate. The main choices are which framework orchestrates retrieval and which vector store sits underneath it.
Includes RAG frameworks (LlamaIndex, LangChain), managed vector databases (Pinecone), open-source vector stores (Weaviate, Chroma, Vespa), and hybrid-search engines.
Pick LlamaIndex when retrieval quality is the bottleneck. Pick Pinecone for zero-ops production. Pick Weaviate or Chroma for self-hosted or budget-conscious deployments. Pick Vespa when the corpus grows beyond a few million docs.
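To make the retrieve, augment, generate loop concrete, here is a minimal sketch in plain Python. It is not the API of any tool listed here: the function names (`retrieve`, `augment`, `generate`, `rag_answer`) are hypothetical, the retriever is a toy bag-of-words cosine ranking standing in for a real vector store, and `generate` is a stub where an actual LLM call would go.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Retrieve: rank documents by similarity to the query, keep the top k.

    A real system would use learned embeddings and a vector store here.
    """
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def augment(query, passages):
    """Augment: build a prompt that puts the retrieved passages before the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Answer using only the context.\n\nContext:\n{context}\n\nQuestion: {query}"


def generate(prompt):
    """Generate: stub standing in for an LLM call (assumption, no model is invoked)."""
    return f"(stub LLM output for a {len(prompt)}-character prompt)"


def rag_answer(query, docs, k=2):
    """Run the three stages end to end."""
    return generate(augment(query, retrieve(query, docs, k)))


if __name__ == "__main__":
    corpus = [
        "Pinecone is a managed vector database.",
        "Vespa handles large-scale hybrid search.",
        "Bananas are yellow.",
    ]
    print(rag_answer("Which vector database is managed?", corpus, k=1))
```

In this picture, the frameworks in this category mostly replace `retrieve` and `augment` with chunking, indexing, and prompt-assembly machinery, while the vector stores replace the in-memory ranking inside `retrieve`.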
Feast
Open-source feature store that serves consistent features to ML training and online inference, with RAG vector search built in.
FinChat (Fiscal.ai)
AI copilot for equity research that reads filings, transcripts, and KPI tables across 100,000+ public companies.
Firecrawl
Web scraping and crawling API that returns LLM-ready markdown, JSON, or structured data from any URL.
FutureHouse Platform
Multi-agent AI research stack for scientists, with retrieval over 175M+ papers, patents, and trials.
GaliChat
No-code AI chatbot builder that trains on your website content for support and lead capture.
Genei
AI research assistant that summarizes PDFs and web pages and answers questions across your document library.
Graphify
Open-source on-device knowledge graph engine that turns code, docs, papers, meetings and images into a queryable graph.
Graphiti
Open-source temporal knowledge graph framework for building agent memory that updates in real time.
Haystack
Open-source Python framework from deepset for building production RAG pipelines and LLM agents.
HelixDB
Unified graph-and-vector database built for AI agent memory and GraphRAG.
Humata.ai
Chat-with-your-documents RAG tool with citation-backed answers across uploaded PDFs and files.
Kotaemon
Open-source RAG UI for chatting with your own documents, locally or self-hosted.
LanceDB
Open-source multimodal lakehouse and vector database built for AI training and retrieval at petabyte scale.
LangExtract
Google's open-source Python library for LLM-driven structured extraction from unstructured text, with source-grounded outputs.
Langchain-Chatchat
Self-hostable RAG and agent framework that wires LangChain to any local open-source LLM and a knowledge base.
MaxKB
Open-source enterprise RAG and agent platform with built-in workflow engine and multi-LLM support.
NotebookLM
Google's source-grounded research notebook that turns your documents into chats, briefs, and AI-hosted podcasts.
OneKE
Open-source multi-agent framework for schema-guided knowledge extraction from documents.
OpenDataLoader PDF
Open-source PDF parser built for RAG pipelines, with reading-order detection, table extraction, and bounding-box citations.
PageIndex
Vectorless reasoning-based retrieval for long documents, with traceable, auditable answers.
Pathway
Live data framework for production RAG and streaming ETL pipelines in Python.
Perplexity AI
Conversational answer engine that cites its sources by default.
PostgresML
PostgreSQL extension that runs embeddings, vector search, and LLM inference inside your database.
PrivateGPT
Production-ready, air-gapped RAG framework for querying your documents with local LLMs.