Skyvern
✓ Editorially verified
AI browser agent that automates web workflows from natural-language instructions, with CAPTCHA and 2FA handling built in.
Pick Skyvern if you need to automate complex multi-step workflows on sites that lack APIs and you want an LLM-driven agent rather than brittle selector-based RPA.
Skip it if the target service already exposes a clean API, or if your compliance posture forbids AGPL source and you can't use the hosted plan.
Skyvern is an AI-powered browser automation platform that runs multi-step web tasks the way a human would, without needing fragile selectors or per-site scripts. You describe the workflow in natural language (or record a browser session, upload an SOP, or build it visually) and Skyvern's agent navigates pages, fills forms, extracts structured JSON/CSV, and handles CAPTCHAs and 2FA along the way. It exposes Python and TypeScript SDKs and is Model Context Protocol-ready, so it can plug into larger agent stacks.
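To make the "describe the workflow, get structured JSON back" idea concrete, here is a minimal Python sketch of what such a task description could look like. It does not call Skyvern's SDK: the `TaskRequest` class, its field names, and the example URL are hypothetical and chosen only for illustration, so check the official SDK documentation for the real request shape.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class TaskRequest:
    """Hypothetical task description (illustrative only, not Skyvern's actual API)."""

    prompt: str                      # natural-language instructions for the agent
    url: Optional[str] = None        # optional starting page
    # JSON Schema describing the structured data we would like extracted
    extraction_schema: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the request, rejecting an empty prompt."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        return json.dumps(asdict(self), indent=2)


# Example: an accounts-payable style task with a structured result.
request = TaskRequest(
    prompt="Log in to the vendor portal and download the latest invoice.",
    url="https://portal.example.com",  # placeholder URL
    extraction_schema={
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "amount_due": {"type": "number"},
        },
        "required": ["invoice_number", "amount_due"],
    },
)

payload = request.to_json()
print(payload)
```

The point of the sketch is the division of labor: the prompt says what to do in plain language, while the schema pins down the output shape so downstream code does not have to parse free text.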
It is aimed squarely at companies that need to automate against sites with no usable API: accounts-payable portals, insurance carriers, lead-gen sources, government forms, browser-based QA. Pricing starts free with 5,000 monthly credits and no card required, then scales to enterprise. The platform is SOC2 Type II and HIPAA compliant with a 99.9% uptime SLA, and the core engine is open source on GitHub under AGPL-3.0 (20k+ stars) and self-hostable via Docker.
Skyvern is LLM-agnostic, supporting OpenAI, Anthropic, Google Gemini, or local Ollama models, and ships native integrations with Zapier, Make, n8n, Clay, Salesforce, HubSpot, and Workday. The AGPL license is the main caveat to flag for commercial users embedding the source rather than using the hosted product.
Skyvern is one of the more credible browser-agent platforms: a serious open-source core, model-agnostic backend, and enterprise compliance check-marks rather than a thin demo. The CAPTCHA/2FA handling and SDK story make it a real alternative to legacy RPA for API-less workflows.
— The AI Tool Bible editorial team
Pros
- ✅ Handles CAPTCHA and 2FA natively, unlike most RPA tools
- ✅ Open source (AGPL-3.0) and self-hostable via Docker
- ✅ LLM-agnostic: works with OpenAI, Anthropic, Gemini, or Ollama
- ✅ SOC2 Type II + HIPAA with enterprise integrations (Salesforce, Workday)
- ✅ Free tier with 5,000 credits and no credit card required
Cons
- ⚠️ AGPL-3.0 license is restrictive for closed-source commercial embedding
- ⚠️ Credit-based pricing can get expensive at high volume
- ⚠️ Browser-agent runs are slower and less deterministic than real APIs
- ⚠️ Effectiveness still varies on heavily JS-driven or anti-bot-hardened sites
Compare with similar tools
LangGraph (Featured)
Stateful, graph-based agent orchestration from LangChain.
CrewAI (Featured)
Python framework for multi-agent orchestration.
Claude Agent SDK
Anthropic's official SDK for building autonomous Claude agents.
Manus
Generalist agent for research, code, and web tasks.
Devin
Cognition Labs' "autonomous software engineer" agent.
AutoGPT
Open-source platform for building autonomous AI agents.