add-voice-transcription

Add voice message transcription to NanoClaw using OpenAI's Whisper API. Automatically transcribes WhatsApp voice notes so the agent can read and respond to them.
crypdick/pynchy · ★ 10 · AI & Automation · score 79

Install: claude install-skill crypdick/pynchy

# Add Voice Message Transcription

This skill adds automatic voice message transcription using OpenAI's Whisper API. When users send voice notes in WhatsApp, they'll be transcribed and the agent can read and respond to the content.

**UX Note:** When asking the user questions, prefer using the `AskUserQuestion` tool instead of just outputting text. This integrates with Claude's built-in question/answer system for a better experience.

## Prerequisites

**USER ACTION REQUIRED**

**Use the AskUserQuestion tool** to present this:

> You'll need an OpenAI API key for Whisper transcription.
>
> Get one at: https://platform.openai.com/api-keys
>
> Cost: ~$0.006 per minute of audio (~$0.003 per typical 30-second voice note)
>
> Once you have your API key, we'll configure it securely.

Wait for user to confirm they have an API key before continuing.

---

## Implementation

### Step 1: Add OpenAI Dependency

Read `package.json` and add the `openai` package to dependencies:

```json
"dependencies": {
  ...existing dependencies...
  "openai": "^4.77.0"
}
```

Then install it.

**IMPORTANT:** The OpenAI SDK requires Zod v3 as an optional peer dependency, but NanoClaw uses Zod v4. This conflict is guaranteed, so always use `--legacy-peer-deps`:

```bash
npm install --legacy-peer-deps
```

### Step 2: Create Transcription Configuration

Create a configuration file for transcription settings (without the API key):

Write to `.transcription.config.json`:

```json
{
  "provider": "openai",