Pathway

Live data framework for production RAG and streaming ETL pipelines in Python.

Freemium · Community free (BSL 1.1, 8 GB/4 cores); Scale and Enterprise tiers with license key · RAG · Multi-model

Best for

Pick Pathway if you're building production RAG over constantly changing sources (Drive, SharePoint, Kafka) and need freshness without rebuild jobs.

Skip if

Skip it if you just want a quick prototype RAG over static PDFs; LlamaIndex or a hosted vector DB will get you there faster.

Pathway is a Python-first framework for building real-time data pipelines, with a strong focus on production-grade Retrieval-Augmented Generation. Instead of stitching together a vector store, ingestion job, and orchestration glue, you describe the pipeline once and Pathway keeps it live: documents flowing in from S3, SharePoint, Google Drive, Kafka, or Postgres are continuously parsed, embedded, indexed, and served to your LLM, so answers reflect source changes shortly after they happen rather than after the next rebuild.
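To make "keeps it live" concrete, here is a minimal plain-Python sketch of incremental indexing. It is not Pathway's API: the LiveIndex class and its upsert, delete, and search methods are hypothetical names invented for illustration, and a toy keyword index stands in for real parsing and embedding. The point is only that each changed document touches its own entries, with no full rebuild.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LiveIndex:
    """Toy keyword index (hypothetical, not Pathway's API), updated per document."""

    docs: dict[str, str] = field(default_factory=dict)
    postings: dict[str, set[str]] = field(default_factory=dict)

    @staticmethod
    def _tokens(text: str) -> set[str]:
        # Stand-in for parsing and embedding: lowercase whitespace tokens.
        return set(text.lower().split())

    def upsert(self, doc_id: str, text: str) -> None:
        # Remove the previous version first so stale tokens never linger.
        self.delete(doc_id)
        self.docs[doc_id] = text
        for tok in self._tokens(text):
            self.postings.setdefault(tok, set()).add(doc_id)

    def delete(self, doc_id: str) -> None:
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for tok in self._tokens(old):
            ids = self.postings.get(tok)
            if ids is not None:
                ids.discard(doc_id)
                if not ids:
                    del self.postings[tok]

    def search(self, query: str) -> list[str]:
        # Rank documents by how many query tokens they contain.
        hits: dict[str, int] = {}
        for tok in self._tokens(query):
            for doc_id in self.postings.get(tok, ()):
                hits[doc_id] = hits.get(doc_id, 0) + 1
        return sorted(hits, key=lambda d: (-hits[d], d))


index = LiveIndex()
index.upsert("policy.pdf", "refunds are accepted within 30 days")
index.upsert("faq.md", "shipping takes 5 days")

# The source changes: one file is edited, another is removed.
index.upsert("policy.pdf", "refunds are accepted within 14 days")
index.delete("faq.md")

print(index.search("refunds 14"))
print(index.search("30"))
```

After the edit and the deletion, queries see only the current state of the sources. A rebuild-on-cron setup would keep serving the old "30 days" text until the next scheduled job.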

The Templates library is the practical entry point. It ships ready-made YAML and Python recipes for question-answering RAG, multimodal RAG over PDFs and images, adaptive RAG, private RAG with Ollama, and various ETL/anomaly-detection patterns. The engine itself is a Rust core with a Python API, licensed under BSL 1.1 for self-hosting, which makes it genuinely usable for teams who can't ship data to a hosted vector DB. Pricing scales from a free Community tier (8 GB RAM, 4 cores) through Scale and Enterprise tiers with managed deployment.

Pathway sits closer to the data-engineering end of the RAG stack than tools like LlamaIndex or LangChain. Native connectors cover Kafka, Delta Lake, Airbyte, Postgres, and most major object stores, and the same pipeline handles batch and streaming without rewrites. The trade-off is a learning curve: you're writing dataflow code, not stringing together prompt chains.
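The "same pipeline for batch and streaming" idea can be sketched in plain Python as well. This is an illustration under assumptions, not Pathway code: running_totals and fake_stream are hypothetical names, and a generator stands in for a live source such as a Kafka topic. The transformation is written once against an iterable of events, so it does not care whether the input is a finished list or arrives one record at a time.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Iterator


def running_totals(events: Iterable[tuple[str, int]]) -> Iterator[dict[str, int]]:
    """Yield an updated snapshot of per-key totals after every event."""
    totals: Counter[str] = Counter()
    for key, amount in events:
        totals[key] += amount
        yield dict(totals)


# Batch: the whole input is available up front.
batch = [("a", 1), ("b", 2), ("a", 3)]
batch_result = list(running_totals(batch))[-1]


def fake_stream() -> Iterator[tuple[str, int]]:
    # Hypothetical stand-in for a live connector delivering records over time.
    yield from batch


# Streaming: the same function, consumed snapshot by snapshot.
stream_result: dict[str, int] = {}
for snapshot in running_totals(fake_stream()):
    stream_result = snapshot

print(batch_result == stream_result)
```

Both runs converge on the same totals. The streaming run simply exposes every intermediate snapshot along the way, which is the dataflow mindset the learning curve refers to.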

Editor's take

Pathway is one of the few RAG frameworks that takes streaming seriously, and the live-indexing story is the real differentiator versus rebuild-on-cron setups. The BSL license and Python API make it a reasonable bet for teams who want to own their stack. Expect to write dataflow code, not glue.

— The AI Tool Bible editorial team