openai-whisper
Install: claude install-skill rkz91/coco
# OpenAI Whisper — Speech-to-Text
Transcribe audio files using OpenAI's Whisper model. Two modes are available, depending on your needs:
| Mode | Latency | Cost | Privacy | Setup |
|------|---------|------|---------|-------|
| Local CLI | Slower (on-device GPU/CPU) | Free | Audio never leaves machine | Install `whisper` binary |
| Cloud API | Fast | Per-minute pricing | Audio sent to OpenAI | `OPENAI_API_KEY` required |
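
The trade-offs in the table can be turned into a simple availability check. The sketch below is illustrative only: `choose_whisper_mode` is a hypothetical helper name, and preferring the local CLI whenever it is installed is an assumption (it favors privacy and cost over latency).

```bash
# Hypothetical helper: prints "local", "cloud", or "none" depending on
# what is available in the current environment.
choose_whisper_mode() {
  if command -v whisper >/dev/null 2>&1; then
    echo "local"   # whisper is on PATH; audio never leaves the machine
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "cloud"   # no local binary, but an API key is set
  else
    echo "none"    # neither mode is ready to use
  fi
}

echo "Selected mode: $(choose_whisper_mode)"
```

Swap the order of the first two branches if low latency matters more than keeping audio on the machine.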
---
## Mode 1: Local CLI
Run Whisper locally with no API key required. Models download to `~/.cache/whisper` on first run.
### Quick Start
```bash
whisper /path/audio.mp3 --model medium --output_format txt --output_dir .
```
### Common Commands
```bash
# Transcribe to text file
whisper /path/audio.mp3 --model medium --output_format txt --output_dir .
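# Translate non-English speech into English subtitles
# (the default turbo model is not trained for translation, so pick a multilingual model)
whisper /path/audio.m4a --model medium --task translate --output_format srt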
# Transcribe with a specific language
whisper /path/audio.wav --model large --language en --output_format json
```
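
The commands above handle one file at a time. For a folder of recordings, a loop along these lines may help. This is a minimal sketch: `transcribe_dir` and the `WHISPER_BIN` override are hypothetical names introduced here, and the extension list (`mp3`, `m4a`, `wav`) simply mirrors the examples above.

```bash
# Hypothetical helper: transcribe every mp3/m4a/wav file in a directory
# to a .txt transcript written next to the audio.
# Usage: transcribe_dir <directory> [model]
transcribe_dir() {
  local dir="$1"
  local model="${2:-medium}"
  local f
  for f in "$dir"/*.mp3 "$dir"/*.m4a "$dir"/*.wav; do
    [ -e "$f" ] || continue   # skip glob patterns that matched nothing
    # WHISPER_BIN is an assumed override, handy for dry runs (WHISPER_BIN=echo)
    "${WHISPER_BIN:-whisper}" "$f" --model "$model" --output_format txt --output_dir "$dir"
  done
}

# Example (uncomment to run):
# transcribe_dir ./recordings medium
```

Running it with `WHISPER_BIN=echo` prints the arguments that would be passed for each file without transcribing anything.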
### Model Selection
| Model | Speed | Accuracy | VRAM |
|-------|-------|----------|------|
| `tiny` | Fastest | Lowest | ~1 GB |
| `base` | Fast | Low | ~1 GB |
| `small` | Medium | Good | ~2 GB |
| `medium` | Slow | Better | ~5 GB |
| `large` | Slowest | Best | ~10 GB |
| `turbo` (default) | Fast | Good | ~6 GB |
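
One way to read the table is as a lookup from available VRAM to the most accurate model that fits. The sketch below assumes the approximate VRAM figures can be treated as hard thresholds in whole gigabytes; `pick_whisper_model` is a hypothetical helper, and it leaves out `turbo` because the table ranks `medium` as more accurate at a lower VRAM figure.

```bash
# Hypothetical helper: print the most accurate model from the table whose
# approximate VRAM requirement fits the given budget (whole GB).
pick_whisper_model() {
  local vram_gb="$1"
  if   [ "$vram_gb" -ge 10 ]; then echo "large"
  elif [ "$vram_gb" -ge 5 ];  then echo "medium"
  elif [ "$vram_gb" -ge 2 ];  then echo "small"
  elif [ "$vram_gb" -ge 1 ];  then echo "base"
  else
    echo "not enough VRAM for any listed model" >&2
    return 1
  fi
}

pick_whisper_model 8   # prints: medium
```

If speed matters more than accuracy, `turbo` is worth considering for any budget of about 6 GB or more.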
### Output Formats
- `txt` — Plain text transcript
- `srt` — SubRip subtitle format with timestamps
- `vtt` — WebVTT subtitle format
- `json` — Detailed JSON with word-level timesta