Audio
Voice cloning, music generation, speech-to-text.
48 tools
AI audio has split cleanly into three lanes: speech synthesis (TTS and voice cloning), music generation, and speech-to-text, each with one or two clear front-runners.
Covers voice cloning and TTS (ElevenLabs, Resemble.ai, Murf), AI music generation (Suno, Udio), and speech-to-text (AssemblyAI, Whisper).
Pick ElevenLabs for voice quality. Pick Suno or Udio for AI music. Pick AssemblyAI when you need diarization and timestamps; pick Whisper when you can self-host and want to avoid per-minute fees (you still pay for your own compute).
LOVO AI
Text-to-speech and voice cloning platform with 500+ voices, an integrated video editor, and a developer API.
Loudly
AI music generator with royalty-free output, stem splitting, and distribution to Spotify and other streaming platforms.
MockingBird
Open-source Mandarin-first voice cloning that mimics a speaker from a 5-second sample.
Mubert
AI music generator that produces royalty-free background tracks for video, podcast, and app use.
Murf AI
Studio-grade text-to-speech and real-time voice agents with 200+ voices across 35+ languages.
Otter.ai
AI meeting notetaker that transcribes calls, summarizes them, and pulls out action items in real time.
Read AI
AI meeting copilot that transcribes, summarizes, and surfaces action items across Zoom, Meet, and Teams.
Remusic
All-in-one AI music studio that bundles song generation, voice cloning, stem splitting, and karaoke tools.
Respeecher
Studio-grade AI voice cloning and TTS used by Hollywood productions for speech-to-speech and dubbing work.
Scribbl
Bot-free AI meeting recorder, transcriber, and summarizer for Google Meet.
Sesame
Conversational voice AI aiming to cross the uncanny valley with context-aware, emotionally aware speech.
Soundful
Template-driven AI music generator that turns out royalty-free, commercially licensable tracks in seconds.
Soundraw
AI music generator that creates royalty-free tracks you can customize by genre and mood.
Stable Audio
Stability AI's generative audio model family for music and sound effects, with open weights for the smaller variants.
Transgate
Pay-as-you-go AI transcription and translation with summaries, highlights, and chat over your audio.
Veritone Voice
Enterprise-grade voice cloning and synthesis platform built for broadcasters, studios, and large media operations.
Vibe
Offline desktop transcription app powered by Whisper, with diarization, batch processing, and an HTTP API.
Voicebox
Open-source desktop voice studio for local cloning, dictation, and giving MCP agents a voice.
WellSaid
Enterprise-grade AI text-to-speech built on licensed voice actor recordings.
WellSaid Labs
Enterprise AI text-to-speech studio built on licensed voice-actor recordings, with a director-style editor for pacing and pronunciation.
WhisperAPI
Hosted OpenAI Whisper transcription with a pay-as-you-go API and drop-in web dashboard.
Wispr Flow
System-wide voice-to-text dictation that auto-edits filler words and learns your jargon.
ZenMic
Text-to-podcast generator with multi-speaker AI voices and RSS publishing.
iSpeech
Veteran cloud TTS and speech recognition API with broad SDK and language coverage.