omniroute-stt

Solid

Speech-to-text via OmniRoute using OpenAI /v1/audio/transcriptions format with auto-fallback across Whisper, AssemblyAI, Deepgram, Azure STT. Use when the user wants transcription of audio files or real-time speech recognition.

AI & Automation · 5,612 stars · 967 forks · Updated today · MIT

Quality Score: 91/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 54
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# OmniRoute: Speech-to-Text

Requires `OMNIROUTE_URL` and `OMNIROUTE_KEY`. See [entry-point SKILL](https://raw.githubusercontent.com/diegosouzapw/OmniRoute/main/skills/omniroute/SKILL.md) for setup.

## Endpoints

- `POST $OMNIROUTE_URL/v1/audio/transcriptions`: multipart upload, returns text
- `POST $OMNIROUTE_URL/v1/audio/translations`: transcribe + translate to English

## Discover

```bash
curl "$OMNIROUTE_URL/v1/models/stt" | jq '.data[]'
```

## Example

```bash
curl -X POST "$OMNIROUTE_URL/v1/audio/transcriptions" \
  -H "Authorization: Bearer $OMNIROUTE_KEY" \
  -F "file=@audio.mp3" \
  -F "model=whisper-1" \
  -F "response_format=verbose_json"
```

Response: `{ text, language, duration, segments?:[{ start, end, text }] }`

## Supported formats

Audio: `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `wav`, `webm`.

Response formats: `json`, `text`, `srt`, `verbose_json`, `vtt`.

## Errors

- `400 invalid_file_format` → unsupported audio format
- `400 file_too_large` → exceeds provider limit (usually 25 MB)
- `503` → provider unavailable; try another model in `/v1/models/stt`
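The error cases above suggest two client-side habits: check the file before uploading, and move to another model when a provider returns `503`. The following is a minimal sketch of both, not part of OmniRoute itself. The function names (`audio_ext`, `is_supported_format`, `is_within_size_limit`, `transcribe_with_fallback`) and the `MAX_BYTES` variable are hypothetical, the 25 MB default reflects only the "usually" in the error list, and treating HTTP `200` as the success status is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the MAX_BYTES default are assumptions,
# not part of the OmniRoute API.

# Audio formats listed under "Supported formats".
SUPPORTED_FORMATS="mp3 mp4 mpeg mpga m4a wav webm"
# Provider limit is "usually 25MB"; override MAX_BYTES if yours differs.
MAX_BYTES="${MAX_BYTES:-26214400}"

# Print the lowercased file extension, or nothing if the name has none.
audio_ext() {
  local name="${1##*/}"
  case "$name" in
    *.*) printf '%s' "${name##*.}" | tr '[:upper:]' '[:lower:]' ;;
  esac
}

# Return 0 if the extension is one of the supported audio formats.
is_supported_format() {
  local ext
  ext="$(audio_ext "$1")"
  [ -n "$ext" ] || return 1
  case " $SUPPORTED_FORMATS " in
    *" $ext "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Return 0 if the file exists and is no larger than MAX_BYTES.
is_within_size_limit() {
  local size
  [ -f "$1" ] || return 1
  size="$(wc -c < "$1" | tr -d '[:space:]')"
  [ "$size" -le "$MAX_BYTES" ]
}

# Try each model in order; move on to the next one only on HTTP 503.
# Usage: transcribe_with_fallback FILE MODEL [MODEL...]
transcribe_with_fallback() {
  local file="$1" model status body
  shift
  is_supported_format "$file" || { echo "unsupported audio format: $file" >&2; return 2; }
  is_within_size_limit "$file" || { echo "missing or too large: $file" >&2; return 2; }
  body="$(mktemp)"
  for model in "$@"; do
    # -w prints only the status code; the response body goes to the temp file.
    status="$(curl -sS -o "$body" -w '%{http_code}' \
      -X POST "$OMNIROUTE_URL/v1/audio/transcriptions" \
      -H "Authorization: Bearer $OMNIROUTE_KEY" \
      -F "file=@$file" \
      -F "model=$model" \
      -F "response_format=json")"
    if [ "$status" = "200" ]; then
      cat "$body"; rm -f "$body"; return 0
    fi
    if [ "$status" != "503" ]; then
      # Any non-503 failure (e.g. a 400) will not be fixed by another model.
      cat "$body" >&2; rm -f "$body"; return 1
    fi
    echo "model $model unavailable (503), trying next" >&2
  done
  rm -f "$body"
  return 1
}
```

After sourcing the file, a call might look like `transcribe_with_fallback audio.mp3 whisper-1 other-model-id`, where `other-model-id` is a placeholder for an ID taken from the `/v1/models/stt` listing. Stopping on any non-`503` error is a deliberate choice here: a `400` describes a problem with the file, so retrying it against another model would likely fail the same way.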

Details

Author: diegosouzapw
Repository: diegosouzapw/OmniRoute
Created: 3 months ago
Last Updated: today
Language: TypeScript
License: MIT

Integrates with

OpenAI (AI) · Anthropic (AI) · Azure (Cloud)

Similar Skills

Semantically similar based on skill content — not just same category

AI & Automation Solid

omniroute-tts

Text-to-speech via OmniRoute using OpenAI /v1/audio/speech format with auto-fallback across OpenAI TTS, ElevenLabs, Azure Neural, Google Cloud TTS. Use when the user wants spoken audio output from text.

5,612 Updated today

AI & Automation Solid

omniroute-chat

Chat / code generation via OmniRoute using OpenAI /v1/chat/completions or Anthropic /v1/messages format with SSE streaming, auto-fallback combos, RTK token saver, and 207+ providers. Use when the user wants to ask an LLM, generate code, summarize text, or run prompts through OmniRoute.

5,612 Updated today

AI & Automation Solid

omniroute

Entry point for OmniRoute — local/remote AI gateway with OpenAI-compatible REST for chat, image, TTS, STT, embeddings, web search, web fetch, MCP, A2A. Use when the user mentions OmniRoute, OMNIROUTE_URL, or wants AI without writing provider boilerplate. This skill covers setup + indexes capability skills; fetch the relevant capability SKILL.md from the URLs below when needed.

5,612 Updated today

AI & Automation Listed

speech-to-text

Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation, multi-language, timestamps. Use for: meeting transcription, subtitles, podcast transcripts, voice notes. Triggers: speech to text, transcription, whisper, audio to text, transcribe audio, voice to text, stt, automatic transcription, subtitles generation, transcribe meeting, audio transcription, whisper ai

335 Updated today

AI & Automation Solid

omniroute-embeddings

Embeddings via OmniRoute using OpenAI /v1/embeddings format with auto-fallback across text-embedding-3-large, Voyage, Cohere, Gemini embeddings, Jina. Use when the user needs vector embeddings for RAG, similarity search, or clustering.

5,612 Updated today