transcribe
Transcribe audio and video files using the configured speech-to-text provider.
AI & Automation, 648 stars, 94 forks, updated today, MIT license
Skill Content
Transcribe audio and video files using the configured speech-to-text provider. Supports multiple STT providers including OpenAI Whisper, Deepgram, and Google Gemini — the active provider is selected in Settings under Speech-to-Text (`services.stt`).
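As a rough illustration of how an active provider might be resolved from `services.stt`, and how a missing credential could turn into an error with setup instructions, consider the sketch below. Everything in it is an assumption made for illustration: the `AppConfig` shape, the provider identifiers, the `resolveSttProvider` helper, and the error wording are hypothetical and are not taken from the repository.

```typescript
// Hypothetical sketch: none of these names come from the actual codebase.
type SttProvider = "openai-whisper" | "deepgram" | "google-gemini";

// Assumed config shape; the real `services.stt` schema may differ.
interface AppConfig {
  services: { stt: { provider: SttProvider } };
}

type Resolution =
  | { ok: true; provider: SttProvider; apiKey: string }
  | { ok: false; error: string };

// Look up the configured provider and check that a credential exists for it.
function resolveSttProvider(
  config: AppConfig,
  credentials: Partial<Record<SttProvider, string>>,
): Resolution {
  const provider = config.services.stt.provider;
  const apiKey = credentials[provider];
  if (!apiKey) {
    // Return an error value with setup guidance instead of throwing.
    return {
      ok: false,
      error: `No credentials configured for "${provider}". Add them in Settings under Speech-to-Text.`,
    };
  }
  return { ok: true, provider, apiKey };
}
```

Returning a tagged result rather than throwing keeps the setup message available to whichever caller resolves the provider.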
## Usage Notes
- The tool accepts a `file_path` argument: the absolute path to the local audio or video file to transcribe.
- Supported formats: any video file (mp4, mov, etc.) or audio file (mp3, wav, m4a, etc.).
- For video files, audio is automatically extracted via ffmpeg before transcription.
- Large files are automatically split into chunks for processing.
- If no STT provider credentials are configured, the tool will return an error with setup instructions.
- The STT provider (`services.stt`) is shared between transcription and telephony call paths.
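
The notes on video extraction and chunking can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's actual code: the helper names (`classifyMedia`, `buildExtractAudioArgs`, `planChunks`), the 16 kHz mono WAV target, and the fixed-length chunking strategy are all hypothetical. Only the ffmpeg options themselves (`-i`, `-vn`, `-ac`, `-ar`, `-c:a`) are standard.

```typescript
// Hypothetical helpers; the real implementation may work differently.
type MediaKind = "audio" | "video";

// Only the extensions named in the usage notes; the real list is longer.
const AUDIO_EXTENSIONS = new Set(["mp3", "wav", "m4a"]);
const VIDEO_EXTENSIONS = new Set(["mp4", "mov"]);

// Classify a file by extension; returns undefined for unknown extensions.
function classifyMedia(filePath: string): MediaKind | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot < 0) return undefined;
  const ext = filePath.slice(dot + 1).toLowerCase();
  if (AUDIO_EXTENSIONS.has(ext)) return "audio";
  if (VIDEO_EXTENSIONS.has(ext)) return "video";
  return undefined;
}

// Build ffmpeg arguments that extract the audio track from a video file.
function buildExtractAudioArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-i", inputPath,      // input file
    "-vn",                // drop the video stream
    "-ac", "1",           // downmix to mono (assumed target)
    "-ar", "16000",       // resample to 16 kHz (assumed target)
    "-c:a", "pcm_s16le",  // encode as 16-bit PCM
    outputPath,
  ];
}

interface Chunk {
  startSec: number;
  durationSec: number;
}

// Split a recording of totalSec seconds into chunks of at most maxChunkSec.
function planChunks(totalSec: number, maxChunkSec: number): Chunk[] {
  if (!(totalSec >= 0) || !(maxChunkSec > 0)) {
    throw new RangeError("totalSec must be >= 0 and maxChunkSec must be > 0");
  }
  const chunks: Chunk[] = [];
  for (let start = 0; start < totalSec; start += maxChunkSec) {
    chunks.push({
      startSec: start,
      durationSec: Math.min(maxChunkSec, totalSec - start),
    });
  }
  return chunks;
}
```

For example, `planChunks(1500, 600)` yields three chunks starting at 0, 600, and 1200 seconds, the last one 300 seconds long. The argument array from `buildExtractAudioArgs` could be passed to a process spawner such as Node's `child_process.spawn("ffmpeg", args)`.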
## Maintenance
When adding or modifying an STT provider, follow the onboarding checklist at `assistant/docs/stt-provider-onboarding.md`. That document covers the daemon catalog, config schema, adapter wiring, client catalog parity, and required tests.
Details
- Author: vellum-ai
- Repository: vellum-ai/vellum-assistant
- Created: 4 months ago
- Last Updated: today
- Language: TypeScript
- License: MIT