add-voice-transcription
SolidAdd voice message transcription to NanoClaw using OpenAI's Whisper API. Automatically transcribes WhatsApp voice notes so the agent can read and respond to them.
AI & Automation 29,820 stars
12908 forks Updated today MIT
Install
Quality Score: 93/100
Stars 20%
Recency 20%
Frontmatter 20%
Documentation 15%
Issue Health 10%
License 10%
Description 5%
Skill Content
# Add Voice Transcription
This skill adds automatic voice message transcription to NanoClaw's WhatsApp channel using OpenAI's Whisper API. When a voice note arrives, it is downloaded, transcribed, and delivered to the agent as `[Voice: <transcript>]`.
## Phase 1: Pre-flight
### Check if already applied
Check if `src/transcription.ts` exists. If it does, skip to Phase 3 (Configure). The code changes are already in place.
### Ask the user
Use `AskUserQuestion` to collect information:
AskUserQuestion: Do you have an OpenAI API key for Whisper transcription?
If yes, collect it now. If no, direct them to create one at https://platform.openai.com/api-keys.
## Phase 2: Apply Code Changes
**Prerequisite:** WhatsApp must be installed first (`skill/whatsapp` merged). This skill modifies WhatsApp channel files.
### Ensure WhatsApp fork remote
```bash
git remote -v
```
If `whatsapp` is missing, add it:
```bash
git remote add whatsapp https://github.com/qwibitai/nanoclaw-whatsapp.git
```
### Merge the skill branch
```bash
git fetch whatsapp skill/voice-transcription
git merge whatsapp/skill/voice-transcription || {
git checkout --theirs package-lock.json
git add package-lock.json
git merge --continue
}
```
This merges in:
- `src/transcription.ts` (voice transcription module using OpenAI Whisper)
- Voice handling in `src/channels/whatsapp.ts` (isVoiceMessage check, transcribeAudioMessage call)
- Transcription tests in `src/channels/whatsapp.test.ts`
- `openai` npm de...
Details
- Author
- qwibitai
- Repository
- qwibitai/nanoclaw
- Created
- 4 months ago
- Last Updated
- today
- Language
- TypeScript
- License
- MIT
Integrates with
Similar Skills
Semantically similar based on skill content — not just same category
AI & Automation Listed
add-voice-transcription
Add voice message transcription to Deus using OpenAI's Whisper API. Automatically transcribes WhatsApp voice notes so the agent can read and respond to them.
43 Updated today
sliamh11 AI & Automation Solid
add-image-vision
Add image vision to NanoClaw agents. Resizes and processes WhatsApp image attachments, then sends them to Claude as multimodal content blocks.
29,820 Updated today
qwibitai AI & Automation Listed
use-local-whisper
Use when the user wants local voice transcription instead of OpenAI Whisper API. Switches to whisper.cpp running on Apple Silicon. WhatsApp only for now. Requires voice-transcription skill to be applied first.
43 Updated today
sliamh11