📖 The AI Tool Bible

Portkey AI Gateway

Open-source AI gateway that routes a single API call across 1,600+ LLMs with caching, fallbacks, and observability.

Freemium· Free tier available; paid Pro and Enterprise plans (contact for pricing)AgentsMulti-model (1,600+ LLMs)
Visit website →
Best for

Pick Portkey if you're shipping LLM features in production across multiple providers and need routing, caching, and cost visibility without writing it yourself.

Skip if

Skip it if you call one model from one provider in a single app and don't care about fallbacks, caching, or per-team usage tracking.

Portkey AI Gateway is a production-grade proxy layer that sits between your application and the LLM providers you actually use. With one unified API it gives you access to 1,600+ models across OpenAI, Anthropic, Google, Cohere, Mistral, Bedrock, Azure, open-source endpoints, and more, plus the routing primitives serious teams need: load balancing, automatic fallbacks, conditional routing, retries, timeouts, semantic and simple caching, and canary deployments for new models.

The gateway itself is open source (Apache-2.0, 10K+ GitHub stars) and can be self-hosted via Docker or `npx @portkey-ai/gateway`, but most teams run it through Portkey's managed cloud, which layers on a virtual-key vault, real-time observability, cost tracking, prompt management, and guardrails. It's aimed at engineering teams running multi-provider LLM workloads at scale, the kind of org that's tired of writing bespoke retry/fallback code for every SDK and wants budget controls and audit trails without building them in-house.

Pricing is freemium: a generous free tier covers solo developers and prototypes, paid plans add higher rate limits and team features, and enterprise tiers cover SSO, SOC 2, on-prem, and custom routing. It plays well with LangChain, LlamaIndex, Vercel AI SDK, and any OpenAI-compatible client by changing only the base URL.

Editor's take

Portkey is the most mature open-source AI gateway in the market right now, and the routing and observability story is genuinely useful once you have more than one provider in play. The self-host path is real, not theater, which makes it an easy recommendation even for teams that are skeptical of managed LLM middleware.

— The AI Tool Bible editorial team

Pros

  • Single unified API for 1,600+ models across every major provider
  • Open-source core (Apache-2.0) with optional managed cloud
  • Built-in load balancing, fallbacks, retries, and semantic caching
  • Real observability: latency, cost, and token usage per request
  • OpenAI-compatible, drops in by changing the base URL

Cons

  • ⚠️ Adds another network hop and a vendor dependency for managed users
  • ⚠️ Advanced features (guardrails, prompt mgmt) require the paid platform
  • ⚠️ Self-host setup still needs you to wire your own dashboards if you skip cloud

Use cases

llm-routingfallbacks-and-retriessemantic-cachingcost-observabilityprompt-managementmulti-provider-llm

Explore related

Compare with similar tools

All in Agents