Download 📥, clip ✂️, transcribe 📝, and thumbnail 🎞️ YouTube videos—fast and reliably—using yt-dlp, ffmpeg, and Whisper.
- Download Video (MP4/WebM/MKV/AVI)
- Download Audio (MP3)
- Extract Clip at a start time for a set duration
- Download Thumbnail (auto) or extract a frame at a timecode
- Transcribe audio to text (OpenAI Whisper)
Outputs are saved to your Downloads folder:
- Video:
~/Downloads/<Title> [<ID>].<ext> - Audio (MP3):
~/Downloads/<Title>.mp3 - Clip:
~/Downloads/<VideoID>_clip.mp4 - Thumbnail:
~/Downloads/<VideoID>_thumbnail.jpg - Transcript:
~/Downloads/transcript.txtortranscript_<ID>.txt
Table of Contents
🚀 Quick Start
- Copy a YouTube Video ID (the part after
v=), e.g.,mkLjMu8SKqk. - Paste it into the YouTube Video ID box. A preview thumbnail should appear.
- Select what you want to do: Download Transcript Download Audio (MP3) Download Video Extract Clip Download Thumbnail
- Click Download and Transcribe to run the selected tasks sequentially.
🖥️ The Interface
- YouTube Video ID — enter the 11-character ID. The app builds the full URL internally.
- Status indicators — red while pending, green on success for: Thumbnail, Clip, Full Video, Audio, Transcript.
- Progress — a gauge and message line reflect the current step. The big log box keeps a running history.
🎛️ Modes & Options
1) Download Transcript 📝
Grabs audio, runs it through OpenAI Whisper (base model), and writes transcript.txt to your Downloads folder.
- Temporary MP3 is deleted when done to keep things tidy.
- If you want to keep the MP3 from transcription, that cleanup can be disabled in code.
2) Download Audio (MP3) 🎧
Prefers bestaudio[ext=m4a], falls back to bestaudio, and finally to best (muxed). FFmpeg extracts MP3 @ 192 kbps.
- Robust against “Requested format is not available.” because it accepts any viable source.
- Output lands in
~/Downloads.
3) Download Video 📥
Prefers an MP4 pair (up to 1080p), otherwise any video+audio pair, and as a final fallback a single best file. Tries to merge to MP4; if that isn’t feasible (e.g., WebM-only), it keeps a working container.
- Format menu: MP4 (best compatibility), WEBM (no re-encode if the streams are WebM), MKV, AVI (re-encodes; slower).
4) Extract Clip ✂️
Enter Start as MM:SS and Duration in seconds. Produces H.264/AAC MP4.
- Uses accurate seeking if the fast path can’t grab the exact frame.
5) Download Thumbnail 🎞️
Two options:
- No time provided → Downloads
maxresdefault.jpg, falling back tohqdefault.jpg. - Time provided (e.g.,
01:20) → Downloads a lightweight playable stream and extracts a frame at that time with FFmpeg.
🧠 Tips & Good Practices
- Keep ffmpeg on your system PATH. Muxing, audio extraction, and frame grabs depend on it.
- Don’t hard-code exact format IDs like
248+251in your own scripts. Let the preference chain pick what exists. - If a file comes out as .webm, it’s because only WebM streams were offered. That’s expected and safer than failing.
- For age-gated/region-blocked videos, use cookies (see Advanced).
🧯 Troubleshooting
“Requested format is not available.”
This app already uses resilient selectors and fallbacks. If you still see this in the log:
- The video may expose only niche codecs for your region/device; try again later or with cookies.
- Live/Upcoming streams often don’t have stable muxable formats yet.
- Network hiccups can look like format failures—retry usually resolves it.
“ffmpeg not found” or merge/extract failures
- Install ffmpeg and ensure it’s on PATH. On Windows, verify
ffmpeg.exeis reachable from a terminal. - If merge to MP4 fails, the app falls back to the source container when possible.
Transcription slow or fails
- The Whisper
basemodel runs locally; long videos take time. Use shorter clips or a smaller model if needed. - Make sure you have adequate disk space and CPU resources.
Thumbnails at a specific time
- Time format is MM:SS. The app converts this to seconds internally.
- If fast seek fails, an accurate seek is attempted automatically.
Nothing happens / empty log
- Check your network connection and try a different video.
- Some corporate networks or VPNs throttle YouTube aggressively.
🛠️ Advanced
Using cookies (age-gated/region-blocked videos)
The app supports two common approaches (off by default):
- Cookie file: set
cookiefile="cookies.txt"in yt-dlp opts. - Browser cookies:
cookiesfrombrowser=("firefox",)(requires a local browser profile).
Only enable these if you need access to restricted content you’re allowed to view.
Rate limiting & sleep
You’ll sometimes see “Sleeping X.XX seconds as required by the site…” in verbose logs. That’s yt-dlp’s own throttle. You don’t need to add manual sleeps.
Shorts & Live
- YouTube Shorts work like normal VODs.
- Live streams may be HLS/DASH only and not suitable for merging to MP4 during the live window.
📦 System Requirements
- Python 3.9+ (desktop app built with
wxPython) yt-dlp(recent version recommended)ffmpegon PATH- Optional:
pydub(already configured),openai-whisperfor transcription - Internet connection with enough bandwidth for your chosen formats
⚖️ Privacy & Legal
- Use this tool only for content you have the right to download, transform, or transcribe.
- Respect creator licenses, platform terms, and local laws.
- Transcripts are stored locally in your Downloads folder. No data is uploaded unless you modify the app to do so.
FAQ
Q: Why did my video save as .webm instead of .mp4?
A: The video only exposed WebM streams for your environment. The app prefers MP4 but won’t fail if only WebM exists.
Q: Can I choose exact resolutions (e.g., 720p)?
A: Not via the UI yet. The app prefers up to 1080p. If you need strict resolution selection, we can add a dropdown.
Q: Does “Extract Clip” re-encode?
A: Yes—H.264/AAC for compatibility. It’s reliable across source codecs.
Changelog (highlights)
- Resilient selectors for video, audio, clip, and frame grabs to avoid format errors.
- Accurate thumbnail frame-grab with fast-seek fallback.
- Sequential task runner with per-task status and progress gauge.
Built with ❤️ for pragmatic creators. If something breaks, the log panel usually tells you why.