System Design FAQ: Top Questions
52. How would you design a Video Transcoding Pipeline like YouTube or Vimeo?
A video transcoding pipeline converts raw user-uploaded videos into multiple formats and resolutions (e.g., 480p, 720p, 1080p) suitable for streaming. The design should accept uploads quickly, scale horizontally, and store outputs durably.
Functional Requirements
- Allow user uploads via frontend or API
- Trigger background transcoding into HLS streams and MP4/MKV files
- Store multiple renditions and thumbnails
- Track job progress and notify frontend
Non-Functional Requirements
- Horizontal scalability and parallel encoding
- Durable storage with hot and cold tiers
- Resumable uploads and retryable jobs
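One common way to make jobs retryable is to re-run a failed step after a capped exponential backoff. The following is a minimal sketch under assumptions: the function names (backoffDelayMs, runWithRetries), the 1 second base delay, the 60 second cap, and the default of three attempts are all hypothetical and not part of the original design.

```javascript
// Sketch: retry a failing job step with capped exponential backoff.
// All names and default values here are illustrative assumptions.

// Delay before the next try after failed attempt number `attempt` (1-based):
// baseMs * 2^(attempt - 1), never more than maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Default sleep helper; injectable so callers can skip real waiting in tests.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task` up to `maxAttempts` times and rethrow the last error if all fail.
async function runWithRetries(task, maxAttempts = 3, sleep = defaultSleep) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is still allowed.
      if (attempt < maxAttempts) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Retries are only safe when the task is idempotent, for example when each attempt writes to the same output key instead of creating a new object.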
Core Components
- Uploader Service: Accepts multipart or resumable uploads
- Transcoder Worker: Executes FFmpeg jobs per resolution
- Job Queue: Stores encoding jobs (SQS, Kafka)
- Metadata Store: Tracks state (e.g., DynamoDB, Postgres)
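To illustrate how these components might connect, the sketch below fans one upload out into one queue message per rendition, which is what lets workers encode resolutions in parallel. The rendition ladder, the message fields, and the function name buildTranscodeMessages are assumptions made for illustration; the actual enqueue call depends on the queue chosen (SQS, Kafka) and is left out.

```javascript
// Hypothetical rendition ladder, using the resolutions mentioned in this answer.
const RENDITIONS = [
  { name: "480p", height: 480 },
  { name: "720p", height: 720 },
  { name: "1080p", height: 1080 },
];

// Build one queue message per rendition for an uploaded video.
// Renditions taller than the source are skipped to avoid upscaling.
function buildTranscodeMessages(videoId, inputUrl, sourceHeight) {
  return RENDITIONS
    .filter((r) => r.height <= sourceHeight)
    .map((r) => ({
      videoId,
      inputUrl,
      rendition: r.name,
      height: r.height,
      // Deterministic key: a retried job overwrites its own output
      // instead of producing a duplicate object.
      outputKey: `videos/${videoId}/output_${r.name}.mp4`,
    }));
}
```

Each message carries everything a worker needs, so workers can stay stateless and be scaled out independently of the uploader.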
Example Transcode Job Schema
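CREATE TABLE transcode_jobs (
    id UUID PRIMARY KEY,
    input_url TEXT,                        -- location of the raw upload
    output_720p TEXT,                      -- location of the 720p rendition, once stored
    output_1080p TEXT,                     -- location of the 1080p rendition, once stored
    status TEXT,                           -- current job state
    progress INT,                          -- progress indicator (e.g., percent complete)
    created_at TIMESTAMPTZ DEFAULT now()
);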
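The schema leaves the values of status and progress open. A minimal sketch of how a service might guard updates to those two columns follows. The status names (queued, processing, completed, failed), the allowed transitions, and the 0 to 100 progress range are assumptions, not something the schema specifies.

```javascript
// Hypothetical job states and the transitions allowed between them.
const TRANSITIONS = new Map([
  ["queued", ["processing"]],
  ["processing", ["completed", "failed"]],
  ["failed", ["queued"]], // a failed job may be re-queued for retry
  ["completed", []],      // terminal state
]);

// Return true if a job may move from status `from` to status `to`.
function canTransition(from, to) {
  return (TRANSITIONS.get(from) || []).includes(to);
}

// Clamp a reported progress value to 0..100 and never let it move backwards,
// so late or out-of-order worker reports cannot rewind the progress bar.
function nextProgress(current, reported) {
  const clamped = Math.max(0, Math.min(100, Math.round(reported)));
  return Math.max(current, clamped);
}
```

Checks like these would typically run just before the UPDATE on transcode_jobs, so the frontend never sees a job jump from completed back to processing.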
FFmpeg CLI Example
ffmpeg -i input.mp4 -vf "scale=-2:720" -c:v libx264 -crf 23 -preset fast output_720p.mp4
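A transcoder worker usually builds this command per rendition and runs it as a child process. The sketch below assumes an ffmpeg binary is available on the PATH; the helper names buildFfmpegArgs and runFfmpeg are hypothetical. The flags are the same ones used in the command above.

```javascript
const { spawn } = require("child_process");

// Build the argument list for one rendition, mirroring the CLI example:
// scale to the target height (width stays even via -2), encode with libx264.
function buildFfmpegArgs(inputPath, height, outputPath) {
  return [
    "-i", inputPath,
    "-vf", `scale=-2:${height}`,
    "-c:v", "libx264",
    "-crf", "23",
    "-preset", "fast",
    outputPath,
  ];
}

// Run ffmpeg and resolve when it exits with code 0.
// Passing arguments as an array avoids shell quoting problems in file names.
function runFfmpeg(args) {
  return new Promise((resolve, reject) => {
    const proc = spawn("ffmpeg", args, { stdio: "inherit" });
    proc.on("error", reject); // e.g. the ffmpeg binary was not found
    proc.on("close", (code) => {
      if (code === 0) resolve();
      else reject(new Error(`ffmpeg exited with code ${code}`));
    });
  });
}
```

A worker would then call something like runFfmpeg(buildFfmpegArgs("input.mp4", 720, "output_720p.mp4")) once for each rendition it is assigned.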
S3 Upload Example
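// Node.js: upload a transcoded rendition to S3 (AWS SDK for JavaScript v2)
const AWS = require('aws-sdk');
const fs = require('fs');
const s3 = new AWS.S3();
// Stream the file so it is not loaded into memory all at once
const fileStream = fs.createReadStream("output_720p.mp4");
s3.upload({
  Bucket: "video-bucket",
  Key: "videos/user123/output_720p.mp4",
  Body: fileStream
}).promise()
  .then((data) => console.log("Uploaded to", data.Location))
  .catch((err) => console.error("Upload failed:", err));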
HLS (HTTP Live Streaming) Packaging
ffmpeg -i input.mp4 -c:v libx264 -crf 20 -preset fast -f hls -hls_time 10 -hls_playlist_type vod output.m3u8
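The command above produces a single media playlist. For adaptive streaming across several renditions, players are usually pointed at a master playlist that lists one media playlist per rendition. A minimal sketch of generating one follows; the function name buildMasterPlaylist, the bandwidth figures, and the per-rendition URIs are illustrative assumptions.

```javascript
// Build an HLS master playlist from a list of variants.
// Each variant: { bandwidth (bits per second), width, height, uri }.
function buildMasterPlaylist(variants) {
  const lines = ["#EXTM3U", "#EXT-X-VERSION:3"];
  for (const v of variants) {
    // BANDWIDTH is required on every variant; RESOLUTION is optional
    // but helps players pick a rendition.
    lines.push(
      `#EXT-X-STREAM-INF:BANDWIDTH=${v.bandwidth},RESOLUTION=${v.width}x${v.height}`
    );
    lines.push(v.uri);
  }
  return lines.join("\n") + "\n";
}

// Example input with hypothetical bitrates and paths.
const exampleVariants = [
  { bandwidth: 1400000, width: 854, height: 480, uri: "480p/output.m3u8" },
  { bandwidth: 2800000, width: 1280, height: 720, uri: "720p/output.m3u8" },
];
console.log(buildMasterPlaylist(exampleVariants));
```

The master playlist is small and changes rarely, so it can be written to storage once all renditions for a video have finished.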
Observability
- Job duration by resolution
- Error rate by codec or container
- Storage utilization trends
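The first two metrics can be derived from finished job records. The sketch below groups jobs by a chosen field and reports average duration and error rate per group. The record shape (resolution, codec, status, durationSeconds) and the function name summarizeJobs are assumptions; a production system would more likely emit these values to a metrics backend than compute them in process.

```javascript
// Summarize finished jobs by a grouping field such as "resolution" or "codec".
// Returns { [groupValue]: { count, avgSeconds, errorRate } }.
function summarizeJobs(jobs, groupBy) {
  const groups = new Map();
  for (const job of jobs) {
    const key = job[groupBy];
    const stats = groups.get(key) || { count: 0, failed: 0, totalSeconds: 0 };
    stats.count += 1;
    stats.totalSeconds += job.durationSeconds;
    if (job.status === "failed") stats.failed += 1;
    groups.set(key, stats);
  }
  const summary = {};
  for (const [key, stats] of groups) {
    summary[key] = {
      count: stats.count,
      avgSeconds: stats.totalSeconds / stats.count,
      errorRate: stats.failed / stats.count,
    };
  }
  return summary;
}

// Example records with hypothetical values.
const sampleJobs = [
  { resolution: "720p", codec: "h264", status: "completed", durationSeconds: 30 },
  { resolution: "720p", codec: "h264", status: "failed", durationSeconds: 50 },
  { resolution: "1080p", codec: "h264", status: "completed", durationSeconds: 90 },
];
console.log(summarizeJobs(sampleJobs, "resolution"));
```

Grouping by a parameter keeps the same function usable for duration by resolution and for error rate by codec or container.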
Tools/Infra Used
- Storage: S3, GCS, or Azure Blob
- Transcoder: FFmpeg, AWS Elemental, Bitmovin
- Queue: SQS, Kafka, or Celery (a task queue that runs on a broker such as RabbitMQ or Redis)
Final Insight
A well-designed video pipeline decouples ingest, transcoding, and delivery. Use scalable queues and stateless workers for distributed encoding, paired with HLS packaging for playback flexibility.