Install: claude install-skill desmondc9/agent-skills
# Transcribing Meeting Recordings (with Speaker Identification)
## Overview
End-to-end pipeline: video/audio file → cleaned SRT transcript with timestamps and **real participant names**. Uses **whisperx** (faster-whisper + alignment + pyannote diarization) for ASR + speaker clustering, plus a Microsoft Teams / Zoom **video-frame trick** to map anonymized `SPEAKER_XX` clusters onto real names.
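The last step of the pipeline above, replacing anonymized cluster labels with real names, could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it assumes the labels appear literally as `SPEAKER_XX` in the SRT text, and the `SPEAKER_NAMES` mapping and the `rename_speakers` helper are hypothetical names introduced here.

```python
import re

# Hypothetical mapping; in practice it would come from the video-frame
# identification step (assumed here, not produced by this snippet).
SPEAKER_NAMES = {"SPEAKER_00": "Alice", "SPEAKER_01": "Bob"}

# Matches diarization labels such as SPEAKER_00, SPEAKER_12.
_LABEL = re.compile(r"\bSPEAKER_\d+\b")


def rename_speakers(srt_text: str, names: dict) -> str:
    """Replace SPEAKER_XX labels with real names.

    Labels that are missing from `names` are left untouched so that
    unidentified speakers stay visible in the transcript.
    """
    return _LABEL.sub(lambda m: names.get(m.group(0), m.group(0)), srt_text)


if __name__ == "__main__":
    sample = (
        "1\n"
        "00:00:01,000 --> 00:00:03,500\n"
        "[SPEAKER_00]: Let's get started.\n"
        "\n"
        "2\n"
        "00:00:03,600 --> 00:00:05,000\n"
        "[SPEAKER_02]: Sounds good.\n"
    )
    print(rename_speakers(sample, SPEAKER_NAMES))
```

Leaving unmapped labels in place, rather than dropping or guessing them, makes it easy to spot speakers that still need to be identified.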
## When to Use
- User provides a meeting recording (`.mp4` / `.mov` / `.mkv` / `.wav` / `.m4a` / `.mp3`) and wants a transcript with timestamps.
- User asks for SRT/VTT subtitles, captions, 字幕 (Chinese for "subtitles"), or a "transcript with speaker labels".
- The recording is multilingual, especially Chinese mixed with English technical terms; whisper large-v3 handles this well with the right prompt.
- User wants to know **who said what** in a meeting, not just the words.
**Do NOT use for:**
- Live / real-time / streaming transcription — whisperx is batch-only.
- Summarization without producing a transcript artifact.
- Audio-only files where speaker identification doesn't matter: transcribe them, but skip diarization and the Teams-frame step.
## Quick Reference
| Stage | Command |
| ------------------------------ | ---------------------------------------------------------------------------------------- |
| 1. Extract WAV                 | `ffmpeg -i in.mp4 -vn -ac 1 -ar 16000 -c:a pcm_s16le audio.wav`                          |