Modal
Serverless GPUs and infra for training & serving ML.
Modal is a serverless platform for ML workloads: pip-install the `modal` package, decorate a Python function, and it runs on H100s with no infrastructure to manage. Great for fine-tuning runs, batch inference, and custom serving.
Pros
- ✅ Zero-ops GPU access
- ✅ Python-native
- ✅ Auto-scaling
Cons
- ⚠️ Cold start latency on big models
- ⚠️ Costs can spike unexpectedly at scale
Use cases
serverless GPU · fine-tuning · batch inference
Compare with similar tools
- Modal vs Together AI: side-by-side breakdown
- Modal vs Replicate: side-by-side breakdown
- Modal vs OpenAI Fine-tuning: side-by-side breakdown
| Tool | Category | Score | Description | Pricing | Tags |
|---|---|---|---|---|---|
| Together AI | Fine-tuning (featured) | 8.6 | Fine-tune & serve open-weight models (Llama, Mistral, DeepSeek). | Paid · pay-per-token; fine-tuning billed per token | open models, fine-tuning |
| Replicate | Fine-tuning | 8.5 | One-API platform for running and fine-tuning open-source models. | Paid · pay-per-second of GPU | model hosting, fine-tuning |
| OpenAI Fine-tuning | Fine-tuning · GPT-4o-mini / GPT-3.5 | 8.4 | Fine-tune GPT-4o-mini and friends on your own data. | Paid · training $25/1M tokens; usage at standard rates | style, format |
| Anyscale | Fine-tuning | 7.9 | Ray-powered platform for training, serving, and scaling LLMs. | Paid · enterprise/contact sales | distributed training, Ray |
| Lamini | Fine-tuning | 7.7 | Memory-tuning platform for grounding LLMs in your facts. | Paid · enterprise pricing | enterprise FT, factual recall |