Speech-to-Text
AI Model
Transcribe audio locally with no API costs
- Fully offline once the model is downloaded
- SRT, VTT, JSON, and plain-text output
- Translates any language into English
What it can do
Local transcription: Convert speech to text completely offline, no API key required
Multiple model sizes: tiny (fastest) → base → small → medium → large (most accurate)
Output formats: Plain text, SRT subtitles, VTT captions, or JSON with timestamps
Translation mode: Translate audio in any language directly to English text
Wide format support: WAV, MP3, M4A, FLAC, OGG, and more
Auto model caching: Downloads each model on first use and runs fully offline after that
Questions to try
"Transcribe this podcast.mp3 using the medium model"
"Convert this interview to SRT subtitles"
"Transcribe my voice memo and translate it to English"
"Generate VTT captions for this video's audio track"
"Use the large model for this important lecture recording"
"Get JSON output with word-level timestamps"
Pro tips
tiny = fast but rough, small = good balance, medium = professional quality, large = maximum accuracy
The first run downloads the model (40 MB to 3 GB depending on size); after that, everything runs offline
SRT/VTT formats include timestamps for subtitle syncing
Translation mode outputs English regardless of input language
JSON output includes segment-level and word-level timing data
No audio leaves your machine after the initial model download, which makes it a good fit for privacy-sensitive recordings
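
To make the difference between the SRT and VTT outputs concrete, here is a minimal Python sketch that renders timestamped segments in both formats. The `Segment` class and the function names are hypothetical, chosen only for illustration; they are not this tool's actual API or JSON schema. The sketch assumes each segment carries a start time, an end time (both in seconds), and its text.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span (hypothetical shape, not the tool's real schema)."""
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


def format_timestamp(seconds: float, decimal_marker: str) -> str:
    """Render seconds as HH:MM:SS<marker>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """SRT: numbered cues, with a comma before the milliseconds."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg.start, ",")
        end = format_timestamp(seg.end, ",")
        cues.append(f"{index}\n{start} --> {end}\n{seg.text.strip()}\n")
    return "\n".join(cues)


def to_vtt(segments: list[Segment]) -> str:
    """WebVTT: a 'WEBVTT' header, with a period before the milliseconds."""
    cues = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg.start, ".")
        end = format_timestamp(seg.end, ".")
        cues.append(f"{start} --> {end}\n{seg.text.strip()}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, " Hello there."), Segment(2.5, 5.0, "Welcome back.")]
    print(to_srt(demo))
    print(to_vtt(demo))
```

The two formats share the same `start --> end` cue line. SRT numbers each cue and uses a comma as the decimal marker, while WebVTT begins with a `WEBVTT` header and uses a period.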