YouTube Transcript Processing Pipeline

I don’t rewatch videos.
I built something that does it for me.

Long lectures, podcasts, talks with no transcript — content was piling up faster than it could be consumed. The specific problem wasn’t just length. It was the moments buried inside: the key insight at minute 34, the joke that made a concept stick, the story that was actually the point of the whole video. Rewatching to find them is not a workflow. This is.

What it produces

One output. Structured notes, key moments with timestamps, quotes worth saving, and extracted insights, all from a single URL. It has run daily since it was built.

The pipeline

Transcript fetch — fast path

The YouTube Transcript API is queried first. If a transcript exists, it’s retrieved directly, with no audio download or speech-to-text step needed. Fast, cheap, and no local compute.
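
The pipeline starts from a URL, while transcript lookups are typically keyed by video ID (the youtube-transcript-api Python package works that way). A minimal sketch of that first step, assuming only the common URL shapes matter; `extract_video_id` is a hypothetical helper name, not taken from the project:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in YOUTUBE_HOSTS:
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Path-based links: /embed/<id>, /shorts/<id>, /live/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
            return parts[1]

    return None
```

Returning `None` for anything unrecognized lets the caller reject bad input before any network call is made.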

Whisper fallback — no transcript, no problem

If the API returns nothing, the audio stream is downloaded and passed through Whisper for speech-to-text. The pipeline doesn’t ask; it just switches. The output format is identical either way.
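
The switch can be sketched as a function that tries the cheap path and falls through to the expensive one. This is an illustration of the pattern, not the project's actual code: `get_transcript`, `fetch_captions`, and `transcribe_audio` are hypothetical names, and the two callables stand in for the real transcript lookup and the Whisper step so the sketch stays runnable without network access or a model.

```python
from typing import Callable, Dict, List, Optional

# Assumed common shape for both paths: {"start": seconds, "text": str}.
Segment = Dict[str, object]
CaptionFetcher = Callable[[str], Optional[List[Segment]]]
Transcriber = Callable[[str], List[Segment]]


def get_transcript(
    video_id: str,
    fetch_captions: CaptionFetcher,
    transcribe_audio: Transcriber,
) -> List[Segment]:
    """Try the caption lookup first, then fall back to speech-to-text."""
    try:
        segments = fetch_captions(video_id)
    except Exception:
        # A failed lookup is treated the same as "no transcript available".
        segments = None

    if segments:
        return segments  # fast path: no audio download needed

    # Fallback: download the audio and transcribe it (Whisper in the pipeline).
    return transcribe_audio(video_id)


if __name__ == "__main__":
    # Stubs standing in for the real network and model calls.
    def no_captions(video_id: str) -> Optional[List[Segment]]:
        return None

    def fake_whisper(video_id: str) -> List[Segment]:
        return [{"start": 0.0, "text": "transcribed from audio"}]

    print(get_transcript("some-video-id", no_captions, fake_whisper))
```

Because both paths return the same segment shape, everything downstream can ignore which one ran.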

Post-processing — structure and insight

The transcript is passed through prompt-based processing to extract summaries, key moments, timestamps, and quotes. The post-processing layer is optional — the core pipeline works without it.
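
One way the prompt step might look, as a sketch under assumptions: segments carry a `start` time in seconds and a `text` field, and the prompt wording is invented here for illustration. `format_timestamp` and `build_prompt` are hypothetical names; the call to an actual language model is left out.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS, or H:MM:SS past the one-hour mark."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def build_prompt(segments: List[Dict[str, object]]) -> str:
    """Prefix each segment with its timestamp and wrap it in an instruction."""
    lines = [
        f"[{format_timestamp(float(s['start']))}] {s['text']}" for s in segments
    ]
    instruction = (
        "From the timestamped transcript below, extract a summary, key moments, "
        "and notable quotes. Cite the timestamp for each key moment and quote."
    )
    return instruction + "\n\n" + "\n".join(lines)
```

Keeping timestamps inline in the prompt is what lets the extracted key moments point back to a specific place in the video.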

Coming from C/C++

This was my first real Python project after working extensively in C and C++. The shift is significant: Python is a far more permissive language. Type annotations are optional, memory is managed, and the ecosystem hands you things instead of making you build them.

That looseness is disorienting at first when you’re used to the compiler catching everything. The adjustment was less about syntax and more about trusting a different set of guarantees — and knowing which ones no longer exist.

What this taught me

How to design a pipeline that degrades gracefully — fast path first, fallback second, output consistent regardless of which path ran. That pattern shows up everywhere in production systems. Building it from scratch made it concrete.

Also: a tool you actually use daily reveals problems that a tool you built and forgot never will. Daily use is the best stress test.

Status & next

Obsidian compatibility is live. Web UI and podcast/local file support are actively in development.

Active use: daily
Pipeline stages: 3