Voice Transcribe

The Voice Transcribe skill brings OpenAI's Whisper speech recognition model to your Claude agent, enabling accurate transcription of voice messages, meeting recordings, podcast clips, and other audio files. It supports MP3, MP4, WAV, FLAC, OGG, WebM, and M4A files up to 25 MB each.
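A pre-flight check against those format and size limits can be sketched as follows. The function name `check_audio` is hypothetical and not part of the skill, and the sketch assumes the 25 MB limit means 25 × 1024 × 1024 bytes and that the format is identified by file extension alone.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".wav", ".flac", ".ogg", ".webm", ".m4a"}

# Assumption: "25 MB" is interpreted as 25 MiB.
MAX_BYTES = 25 * 1024 * 1024


def check_audio(name: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(no extension)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

Files that fail the size check would need to be split or re-encoded at a lower bitrate before transcription.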

Whisper handles 99 languages with automatic language detection, so there is no need to specify the language in advance. Accuracy is production-grade for most accents and recording conditions, including moderate background noise. The skill outputs plain text transcripts, timestamped segments (with start/end times per sentence), or speaker-separated transcripts when multiple distinct voices are present.
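To illustrate the timestamped-segment output, here is a minimal sketch that renders segments as readable lines. The `Segment` structure and the `[start - end] text` layout are assumptions made for illustration; the skill's actual output schema is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical segment shape: times in seconds, plus the spoken text."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Convert seconds to HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def render_segments(segments: list) -> str:
    """Render one line per segment, prefixed with its start and end times."""
    return "\n".join(
        f"[{format_timestamp(s.start)} - {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    )
```

A plain text transcript is then just the segment texts joined with spaces, so both output styles can come from the same data.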

The most common integration is Telegram: users receive voice messages from contacts and want text summaries. Set up an automation where incoming voice messages are piped through Voice Transcribe → Summarize, and the result is delivered as a text reply. Meeting recordings from Zoom or Google Meet can be dropped into a folder and auto-transcribed on a schedule, with summaries posted to Notion or Slack.
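The voice message flow described above can be sketched as a small pipeline. This is an illustration only: `handle_voice_message` is a hypothetical name, and the `transcribe`, `summarize`, and `reply` callables stand in for the Voice Transcribe skill, a summarization step, and the Telegram reply, none of which are implemented here.

```python
from typing import Callable


def handle_voice_message(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    summarize: Callable[[str], str],
    reply: Callable[[str], None],
) -> str:
    """Run one voice message through transcribe -> summarize -> reply."""
    transcript = transcribe(audio).strip()
    if not transcript:
        # Silence or unintelligible audio: skip the summarization step.
        message = "No speech detected in this voice message."
    else:
        message = summarize(transcript)
    reply(message)
    return message
```

Passing the three steps in as callables keeps the flow testable with stubs, and the same function could serve the scheduled folder workflow by swapping `reply` for a function that posts to Notion or Slack.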

The skill runs against the Whisper API endpoint, which is included in standard OpenAI API access, so no separate key is needed if you already use OpenAI. A local Whisper execution mode is available for privacy-sensitive audio; it processes audio on your machine without sending it to any external server. Journalists, researchers, and remote teams rely on this skill to make audio content searchable and actionable.
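The two execution modes correspond roughly to the following calls. This is a sketch of the underlying libraries, not the skill's own implementation: it assumes the `openai` Python package for the hosted endpoint and the `openai-whisper` package (which needs ffmpeg) for local execution. The `pick_backend` helper and its fallback to local mode when no key is set are illustrative choices.

```python
import os
from typing import Optional


def pick_backend(privacy_sensitive: bool, api_key: Optional[str]) -> str:
    """Choose "local" for sensitive audio or when no API key is available."""
    if privacy_sensitive or not api_key:
        return "local"
    return "api"


def transcribe_api(path: str) -> str:
    """Send the file to the hosted Whisper endpoint."""
    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
    return result.text


def transcribe_local(path: str, model_name: str = "base") -> str:
    """Run Whisper on this machine; no audio leaves it."""
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    return model.transcribe(path)["text"]


def transcribe(path: str, privacy_sensitive: bool = False) -> str:
    """Route the file to the hosted or local backend."""
    backend = pick_backend(privacy_sensitive, os.environ.get("OPENAI_API_KEY"))
    if backend == "local":
        return transcribe_local(path)
    return transcribe_api(path)
```

The third-party imports sit inside the functions so that only the backend actually used needs to be installed.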

Installation

clawhub install voice-transcribe
Tags: voice, transcription, Whisper, audio
