
podcast-generation

Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation from content, or integrating with Azure OpenAI Realtime API for real audio output. Covers full-stack implementation from React frontend to Python FastAPI backend with WebSocket streaming.
aiskillstore/marketplace · ★ 329 · AI & Automation · score 79
Install: claude install-skill aiskillstore/marketplace
# Podcast Generation with GPT Realtime Mini

Generate real audio narratives from text content using Azure OpenAI's Realtime API.

## Quick Start

1. Configure environment variables for Realtime API
2. Connect via WebSocket to Azure OpenAI Realtime endpoint
3. Send text prompt, collect PCM audio chunks + transcript
4. Convert PCM to WAV format
5. Return base64-encoded audio to frontend for playback

## Environment Configuration

```env
AZURE_OPENAI_AUDIO_API_KEY=your_realtime_api_key
AZURE_OPENAI_AUDIO_ENDPOINT=https://your-resource.cognitiveservices.azure.com
AZURE_OPENAI_AUDIO_DEPLOYMENT=gpt-realtime-mini
```

**Note**: Endpoint should NOT include `/openai/v1/` - just the base URL.

## Core Workflow

### Backend Audio Generation

```python
from openai import AsyncOpenAI
import base64

# Convert HTTPS endpoint to WebSocket URL
ws_url = endpoint.replace("https://", "wss://") + "/openai/v1"

client = AsyncOpenAI(
    websocket_base_url=ws_url,
    api_key=api_key
)

audio_chunks = []
transcript_parts = []

async with client.realtime.connect(model="gpt-realtime-mini") as conn:
    # Configure for audio-only output
    await conn.session.update(session={
        "output_modalities": ["audio"],
        "instructions": "You are a narrator. Speak naturally."
    })

    # Send text to narrate
    await conn.conversation.item.create(item={
        "type": "message",
        "role": "user",
        "content": [{"type": "input_text", "text": prompt}]
    })

    await conn.resp