Whisper API

AI Models

Fast cloud transcription via the OpenAI Whisper API

What You Can Do

Cloud transcription — Fast speech-to-text via OpenAI's API (often faster than local for large files)

Language specification — Set expected language for better accuracy on non-English audio

Custom prompts — Provide speaker names, technical terms, or jargon for improved accuracy

Format options — JSON (with timestamps) or plain text output

Auto file output — Saves transcription alongside the original audio file
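The features above can be sketched as a single request to OpenAI's transcription endpoint. This is a minimal illustration, not this tool's actual implementation. It assumes the official openai Python package, the "whisper-1" model, and that the "JSON (with timestamps)" option corresponds to the API's "verbose_json" response format. The helper names (build_request, output_path, transcribe) and the output naming rule are hypothetical.

```python
from pathlib import Path
from typing import Optional


def build_request(language: Optional[str] = None,
                  prompt: Optional[str] = None,
                  timestamps: bool = False) -> dict:
    """Collect keyword arguments for client.audio.transcriptions.create()."""
    kwargs = {
        "model": "whisper-1",
        # Assumption: "JSON" output maps to verbose_json, which carries
        # segment timestamps; "text" returns a plain string.
        "response_format": "verbose_json" if timestamps else "text",
    }
    if language:
        kwargs["language"] = language  # ISO-639-1 code, e.g. "es" for Spanish
    if prompt:
        kwargs["prompt"] = prompt      # speaker names, technical terms, jargon
    return kwargs


def output_path(audio_path: str, timestamps: bool = False) -> Path:
    """Hypothetical naming rule: same folder and stem as the audio file."""
    return Path(audio_path).with_suffix(".json" if timestamps else ".txt")


def transcribe(audio_path: str,
               language: Optional[str] = None,
               prompt: Optional[str] = None,
               timestamps: bool = False) -> Path:
    """Send the file to the API and save the result next to the audio."""
    # Imported lazily so the helpers above work without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(
            file=audio, **build_request(language, prompt, timestamps)
        )
    # Plain-text responses arrive as str; structured ones as a model object.
    body = result if isinstance(result, str) else result.model_dump_json(indent=2)
    out = output_path(audio_path, timestamps)
    out.write_text(body, encoding="utf-8")
    return out
```

Under these assumptions, a request such as "Transcribe this meeting in Spanish" would roughly correspond to transcribe("meeting.mp3", language="es"), and a request that supplies speaker names would pass them through the prompt argument.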

"Transcribe this interview.mp3 via the API"

"Transcribe this meeting in Spanish"

"Transcribe with context: speakers are Dr. Smith and Prof. Jones discussing quantum computing"

"Get JSON transcription with timestamps"

"Transcribe this earnings call with company-specific terminology hints"

Requires the OPENAI_API_KEY environment variable to be set

Custom prompts can noticeably improve accuracy for domain-specific content, particularly the spelling of names and technical terms

Language hints help when audio quality is poor or speakers have strong accents

The API is typically faster than local processing for recordings longer than about 10 minutes

Maximum upload size is 25 MB; use local Whisper for larger files

JSON format includes segment timestamps, which are useful for time-coded summaries
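Two of the tips above lend themselves to a short sketch: routing by file size and turning segment timestamps into time-coded lines. The code below is illustrative only. It assumes the 25 MB limit is counted as 25 * 1024 * 1024 bytes and that segments are dictionaries with "start" (seconds) and "text" keys, as they would be after loading a saved verbose JSON response. All function names are hypothetical.

```python
import os

# Assumption: the 25 MB upload limit is interpreted as 25 MiB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def choose_backend(size_bytes: int) -> str:
    """Return "api" when the file fits the upload limit, otherwise "local"."""
    return "api" if size_bytes <= MAX_UPLOAD_BYTES else "local"


def backend_for_file(audio_path: str) -> str:
    """Convenience wrapper that reads the size from disk."""
    return choose_backend(os.path.getsize(audio_path))


def timecode(seconds: float) -> str:
    """Format a segment offset in seconds as HH:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def time_coded_lines(segments: list) -> list:
    """Prefix each segment's text with the time at which it starts."""
    return [f"[{timecode(seg['start'])}] {seg['text'].strip()}" for seg in segments]
```

Lines in this form can be pasted into a summary or handed to a summarizer so that each point keeps a reference back to its position in the recording.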