Jellyfin Transcoding Architecture Documentation
Executive Summary
Jellyfin's transcoding system is a sophisticated media processing pipeline built around FFmpeg that provides real-time video and audio conversion for various client devices. This document details the core architecture, control flow, and implementation patterns valuable for building a similar system in Next.js.
Key Findings:
- Engine: FFmpeg managed through MediaEncoder class with hardware acceleration support
- Job Management: TranscodeManager orchestrates process lifecycle, throttling, and cleanup
- Session Control: Client ping system (10s/60s timeouts) with automatic kill timers
- Resource Management: Intelligent throttling based on playback position (+60s threshold)
- Segment Management: Automatic cleanup for HLS content >5 minutes duration
- Seeking Strategy: FFmpeg process restart required for large seeks due to linear encoding nature
Key Implementation Insights
1. FFmpeg Process Management Strategy
Core Principle: Each transcoding job is a linear, stateful FFmpeg process that cannot seek to arbitrary positions after encoding starts. This fundamental limitation drives the entire architecture design.
Implications:
- Large seeks require process restart: When users seek >30 seconds, kill current job and start new one
- Small seeks use player buffering: Let HLS players handle <30 second seeks naturally
- Process isolation: Each session gets independent FFmpeg process for resource control
2. Resource Management Philosophy
Throttling Strategy: Prevent transcoding from running too far ahead of playback
- Threshold: Pause encoding when >60 seconds ahead of client consumption
- Control Method: Send FFmpeg stdin commands ('p' for pause, 'u' for resume)
- Benefits: Reduces CPU/disk usage, prevents unnecessary work
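The throttling rule above comes down to a single comparison. The following is a minimal sketch under stated assumptions: `shouldThrottle` and its parameter names are hypothetical, positions are given in seconds, and the 60-second default is the threshold named in this section.

```typescript
// Hypothetical helper: both positions are seconds on the media timeline.
function shouldThrottle(
  transcodedSeconds: number,
  consumedSeconds: number,
  thresholdSeconds = 60,
): boolean {
  // Without both positions there is no gap to measure, so keep encoding.
  if (transcodedSeconds <= 0 || consumedSeconds <= 0) return false;
  // Pause only when the encoder is more than the threshold ahead of the client.
  return transcodedSeconds - consumedSeconds > thresholdSeconds;
}
```

A caller would evaluate this periodically and pause or resume the FFmpeg process when the result changes.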
Segment Cleanup Policy: Automatic disk space management for HLS
- Trigger Conditions: Video content with duration >5 minutes
- Retention: Configurable number of segments to keep
- Safety: Retry logic with graceful error handling
3. Session Lifecycle Management
Ping System Design: Keep-alive mechanism prevents orphaned processes
- Progressive streams: 10-second timeout (fast response needed)
- HLS streams: 60-second timeout (more buffering tolerance)
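- Client responsibility: Ping every 30-45 seconds during active HLS playback; the interval must stay below the session's timeout, so a progressive session with a 10-second timeout needs a correspondingly shorter interval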
Kill Timer Implementation: Automatic cleanup for abandoned sessions
- Grace periods: When the kill timer fires, it re-arms instead of terminating the job if a ping arrived within the timeout window
- Resource cleanup: Process termination + file cleanup + stream closure
4. FFmpeg Command Architecture
Standard HLS Command Structure:
ffmpeg [input_modifiers] -i input.mp4 [encoding_params] -f hls -hls_time 6 -hls_segment_filename "segment%d.ts" output.m3u8
Essential Parameters:
- -f hls: HLS output format
- -hls_time 6: 6-second segments for optimal seek performance
- -hls_segment_filename: Consistent naming pattern
- output.m3u8: Playlist file output
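When the command is launched from Node, building it as an argument array avoids shell quoting problems. This is a sketch only: `buildHlsArgs`, `HlsArgsOptions` and the output directory layout are hypothetical, and the flags are the ones listed above.

```typescript
// Hypothetical options object; only the flags named in this section are emitted.
interface HlsArgsOptions {
  inputPath: string;
  outputDir: string;
  segmentSeconds?: number; // defaults to the 6-second segments used in this document
  encodingArgs?: string[]; // codec/bitrate flags, passed through untouched
}

function buildHlsArgs(opts: HlsArgsOptions): string[] {
  const { inputPath, outputDir, segmentSeconds = 6, encodingArgs = [] } = opts;
  return [
    '-i', inputPath,
    ...encodingArgs,
    '-f', 'hls',
    '-hls_time', String(segmentSeconds),
    '-hls_segment_filename', `${outputDir}/segment%d.ts`,
    `${outputDir}/output.m3u8`, // the playlist path is the final positional argument
  ];
}
```

The resulting array can be passed directly as the second argument of `spawn('ffmpeg', args)`.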
Critical Design Decisions
1. When to Restart vs Continue Transcoding
Process Restart Required:
- Large seek operations (>30 seconds)
- Quality/resolution changes
- Audio track switching
- Subtitle track changes
Continue Existing Process:
- Small seeks (<30 seconds) - let player handle
- Pause/resume operations
- Client reconnections within timeout window
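The two lists above can be folded into one decision function. The sketch below is illustrative: the `PlaybackChange` type and `requiresRestart` are hypothetical names, and the 30-second cutoff is the heuristic stated in this document.

```typescript
// Hypothetical event type covering the cases listed above.
type PlaybackChange =
  | { kind: 'seek'; fromSeconds: number; toSeconds: number }
  | { kind: 'quality' | 'audioTrack' | 'subtitleTrack' }
  | { kind: 'pause' | 'resume' | 'reconnect' };

function requiresRestart(change: PlaybackChange, largeSeekSeconds = 30): boolean {
  switch (change.kind) {
    case 'seek':
      // Only seeks beyond the cutoff force a new FFmpeg process.
      return Math.abs(change.toSeconds - change.fromSeconds) > largeSeekSeconds;
    case 'quality':
    case 'audioTrack':
    case 'subtitleTrack':
      // These change the encoder's inputs or parameters, so the running job cannot continue.
      return true;
    default:
      // pause, resume and reconnect keep the existing process.
      return false;
  }
}
```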
2. Segment Duration Strategy
6-Second Standard: Optimal balance between:
- Seek performance: Reasonable granularity for user seeking
- Network efficiency: Not too many small requests
- Startup time: Quick initial buffering
- Storage overhead: Manageable file count
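To make the file-count side of this trade-off concrete, the segment count is simply the duration divided by the segment length, rounded up. `segmentCount` is a hypothetical helper used only for this arithmetic.

```typescript
// Number of HLS segments needed to cover a media file of the given duration.
function segmentCount(durationSeconds: number, segmentSeconds = 6): number {
  return Math.ceil(durationSeconds / segmentSeconds);
}

// A two-hour film (7200 s):
const sixSecondSegments = segmentCount(7200);    // 1200 segments
const twoSecondSegments = segmentCount(7200, 2); // 3600 segments
```

Shorter segments give finer seek granularity and faster startup, at the cost of three times as many files and requests in this example.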
3. Hardware Acceleration Integration
MediaEncoder Responsibilities:
- Capability detection: Probe available hardware encoders
- Process execution: Manage FFmpeg with hardware flags
- Fallback handling: Graceful degradation to software encoding
- Performance monitoring: Track encoding speeds and success rates
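Capability detection and fallback can be sketched as a lookup against the set of encoders the local FFmpeg build reports (for example, parsed from `ffmpeg -encoders`). The function name, the vendor-to-encoder table and the probing order are assumptions made for illustration; this is not Jellyfin's selection logic.

```typescript
type HardwareVendor = 'nvidia' | 'intel' | 'amd';

// H.264 encoder names as exposed by FFmpeg builds that include them.
const H264_ENCODERS: Record<HardwareVendor, string> = {
  nvidia: 'h264_nvenc',
  intel: 'h264_qsv',
  amd: 'h264_amf',
};

const VENDOR_ORDER: readonly HardwareVendor[] = ['nvidia', 'intel', 'amd'];

function pickH264Encoder(
  preference: HardwareVendor | 'auto' | 'none',
  availableEncoders: ReadonlySet<string>,
): string {
  if (preference === 'none') return 'libx264';
  const candidates =
    preference === 'auto'
      ? VENDOR_ORDER.map((vendor) => H264_ENCODERS[vendor])
      : [H264_ENCODERS[preference]];
  // Fall back to software encoding when no requested hardware encoder is present.
  return candidates.find((name) => availableEncoders.has(name)) ?? 'libx264';
}
```

An encoder being listed does not guarantee that the device works at runtime, so a runtime failure should still trigger the software retry described under error handling.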
Architecture Patterns for Next.js Implementation
1. Core Classes Structure
// Structural sketch: `declare class` lets the signatures stand without method bodies.
// TranscodingJob, TranscodingRequest, MediaInfo, HardwareCapabilities,
// TranscodingThrottler and SegmentCleaner are assumed to be defined elsewhere.
import type { ChildProcess } from 'node:child_process';

// Main orchestrator (equivalent to TranscodeManager)
declare class TranscodingOrchestrator {
  private activeJobs: Map<string, TranscodingJob>;
  private killTimers: Map<string, NodeJS.Timeout>;
  startTranscoding(request: TranscodingRequest): Promise<TranscodingJob>;
  pingSession(sessionId: string, isUserPaused?: boolean): void;
  killSession(sessionId: string): Promise<void>;
}

// FFmpeg interface (equivalent to MediaEncoder)
declare class MediaProcessor {
  private ffmpegPath: string;
  private hardwareCapabilities: HardwareCapabilities;
  probe(inputPath: string): Promise<MediaInfo>;
  startProcess(args: string[]): Promise<ChildProcess>;
  detectHardwareSupport(): HardwareCapabilities;
}

// Individual job tracking (equivalent to TranscodingJob)
declare class TranscodingSession {
  id: string;
  process: ChildProcess;
  lastPingDate: Date;
  throttler?: TranscodingThrottler;
  segmentCleaner?: SegmentCleaner;
  startKillTimer(callback: () => void): void;
  stopKillTimer(): void;
  pause(): Promise<void>;
  resume(): Promise<void>;
}
2. API Endpoint Design
// Next.js API routes structure
// POST /api/transcoding/start
export async function POST(request: Request) {
const { itemId, startTimeTicks, quality } = await request.json();
const job = await orchestrator.startTranscoding({
itemId,
startTimeTicks,
quality,
sessionId: generateSessionId()
});
return Response.json({ sessionId: job.id, playlistUrl: job.playlistPath });
}
// POST /api/transcoding/[sessionId]/ping
export async function POST(request: Request, { params }: { params: { sessionId: string } }) {
const { isUserPaused } = await request.json();
orchestrator.pingSession(params.sessionId, isUserPaused);
return Response.json({ success: true });
}
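// GET /api/hls/[sessionId]/[segmentId].ts
// Requires: import fs from 'node:fs'; import { Readable } from 'node:stream';
export async function GET(request: Request, { params }: { params: { sessionId: string, segmentId: string } }) {
const segmentPath = getSegmentPath(params.sessionId, params.segmentId);
// Response expects a web ReadableStream, so convert the Node stream instead of casting it
const stream = Readable.toWeb(fs.createReadStream(segmentPath)) as unknown as ReadableStream;
return new Response(stream, { headers: { 'Content-Type': 'video/mp2t' } });
}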
// DELETE /api/transcoding/[sessionId]
export async function DELETE(request: Request, { params }: { params: { sessionId: string } }) {
await orchestrator.killSession(params.sessionId);
return Response.json({ success: true });
}
3. Client Integration Patterns
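class AdaptiveMediaPlayer {
  private sessionId: string | null = null;
  private itemId: string | null = null;
  // ReturnType<typeof setInterval> is valid under both browser and Node typings
  private pingInterval: ReturnType<typeof setInterval> | null = null;
  // Records the last seek target only; it does not follow normal playback progress
  private lastPosition: number = 0;

  // `hls` is assumed to be an hls.js instance already attached to `videoElement`
  constructor(private hls: Hls, private videoElement: HTMLVideoElement) {}

  async startPlayback(itemId: string, startPosition: number = 0) {
    this.itemId = itemId;
    // Start transcoding session (positions are sent as 100 ns ticks)
    const response = await fetch('/api/transcoding/start', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ itemId, startTimeTicks: Math.round(startPosition * 10000000) })
    });
    const { sessionId, playlistUrl } = await response.json();
    this.sessionId = sessionId;
    this.startPinging();
    // Load HLS playlist
    this.hls.loadSource(playlistUrl);
  }

  private startPinging() {
    this.pingInterval = setInterval(() => {
      fetch(`/api/transcoding/${this.sessionId}/ping`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ isUserPaused: this.videoElement.paused })
      }).catch(() => { /* a failed ping is simply retried on the next tick */ });
    }, 30000); // Ping every 30 seconds
  }

  async seekTo(targetPosition: number) {
    const seekDistance = Math.abs(targetPosition - this.lastPosition);
    if (seekDistance > 30 && this.itemId) {
      // Large seek: restart transcoding
      await this.stopPlayback();
      await this.startPlayback(this.itemId, targetPosition);
    } else {
      // Small seek: let HLS player handle
      this.videoElement.currentTime = targetPosition;
    }
    this.lastPosition = targetPosition;
  }

  async stopPlayback() {
    if (this.pingInterval) {
      clearInterval(this.pingInterval);
      this.pingInterval = null;
    }
    if (this.sessionId) {
      await fetch(`/api/transcoding/${this.sessionId}`, { method: 'DELETE' });
      this.sessionId = null;
    }
  }
}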
4. Configuration Management
// Environment configuration
interface TranscodingConfig {
ffmpegPath: string;
transcodingTempPath: string;
maxConcurrentStreams: number;
segmentDuration: number; // 6 seconds recommended
segmentKeepCount: number;
throttleThresholdSeconds: number; // 60 seconds recommended
pingTimeoutProgressive: number; // milliseconds (10 seconds = 10000)
pingTimeoutHls: number; // milliseconds (60 seconds = 60000)
hardwareAcceleration: 'auto' | 'nvidia' | 'intel' | 'amd' | 'none';
}
// Load from environment
const config: TranscodingConfig = {
ffmpegPath: process.env.FFMPEG_PATH || '/usr/bin/ffmpeg',
transcodingTempPath: process.env.TRANSCODING_TEMP_PATH || '/tmp/transcoding',
maxConcurrentStreams: parseInt(process.env.MAX_CONCURRENT_STREAMS || '3'),
segmentDuration: 6,
segmentKeepCount: parseInt(process.env.SEGMENT_KEEP_COUNT || '10'),
throttleThresholdSeconds: 60,
pingTimeoutProgressive: 10000,
pingTimeoutHls: 60000,
hardwareAcceleration: (process.env.HARDWARE_ACCEL as any) || 'auto'
};
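One caveat with the loader above: `parseInt` returns `NaN` for a non-numeric value, so a typo in an environment variable would silently produce `NaN` limits. A small guard avoids that. `readIntEnv` is a hypothetical helper, not part of any library.

```typescript
// Parse an integer setting, falling back when the value is missing, non-numeric or below `min`.
function readIntEnv(raw: string | undefined, fallback: number, min = 1): number {
  const parsed = Number.parseInt(raw ?? '', 10);
  return Number.isNaN(parsed) || parsed < min ? fallback : parsed;
}
```

In the configuration object this would be used as, for example, `maxConcurrentStreams: readIntEnv(process.env.MAX_CONCURRENT_STREAMS, 3)`.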
Production Deployment Considerations
1. Performance Monitoring
interface TranscodingMetrics {
activeJobs: number;
averageStartupTime: number;
successRate: number;
cpuUsage: number;
memoryUsage: number;
diskIORate: number;
clientTimeouts: number;
}
class MetricsCollector {
collectMetrics(): TranscodingMetrics {
return {
activeJobs: this.orchestrator.getActiveJobCount(),
averageStartupTime: this.calculateAverageStartup(),
successRate: this.calculateSuccessRate(),
cpuUsage: process.cpuUsage().user / 1000000, // Cumulative user CPU seconds of this Node process (microseconds / 1e6); FFmpeg child processes are not included
memoryUsage: process.memoryUsage().heapUsed / 1024 / 1024, // MB
diskIORate: this.calculateDiskIO(),
clientTimeouts: this.timeoutCounter
};
}
}
2. Error Handling Strategy
class TranscodingErrorHandler {
async handleFFmpegFailure(job: TranscodingSession, error: Error) {
// Log error with context
logger.error('FFmpeg process failed', {
jobId: job.id,
inputPath: job.inputPath,
error: error.message,
exitCode: job.process.exitCode
});
// Attempt recovery strategies; fall through to cleanup if the retry does not succeed
if (this.isRetryableError(error) && await this.retryWithFallback(job)) {
return;
}
// Clean up resources
await this.cleanupFailedJob(job);
// Notify client
this.notifyClientError(job.sessionId, error);
}
private async retryWithFallback(job: TranscodingSession): Promise<boolean> {
// Try software encoding if hardware failed
if (job.hardwareAcceleration && this.isHardwareError(job.lastError)) {
logger.info('Retrying with software encoding', { jobId: job.id });
return this.restartWithSoftwareEncoding(job);
}
// Try lower quality settings
if (job.retryCount < 2) {
logger.info('Retrying with reduced quality', { jobId: job.id });
return this.restartWithReducedQuality(job);
}
return false;
}
}
3. Security Considerations
class SecurityValidator {
validateTranscodingRequest(request: TranscodingRequest): ValidationResult {
// Path traversal protection
if (this.containsPathTraversal(request.inputPath)) {
return { valid: false, error: 'Invalid input path' };
}
// File type validation
if (!this.isAllowedMediaType(request.inputPath)) {
return { valid: false, error: 'Unsupported media type' };
}
// Resource limits
if (this.exceedsResourceLimits(request)) {
return { valid: false, error: 'Resource limits exceeded' };
}
// Rate limiting per client
if (this.isRateLimited(request.clientId)) {
return { valid: false, error: 'Rate limit exceeded' };
}
return { valid: true };
}
private containsPathTraversal(path: string): boolean {
return path.includes('..') || path.includes('~') || path.startsWith('/');
}
private isAllowedMediaType(path: string): boolean {
const allowedExtensions = ['.mp4', '.mkv', '.avi', '.mov', '.m4v'];
return allowedExtensions.some(ext => path.toLowerCase().endsWith(ext));
}
}
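String checks such as `includes('..')` reject legitimate names (a file called `a~b.mkv`, for instance) and are easy to get subtly wrong. A common alternative is to resolve the requested path against a fixed media root and confirm the result is still inside it. The sketch below assumes a hypothetical `resolveInsideRoot` helper and uses `path.posix` so the behavior is the same on every platform; it does not follow symlinks, which would need an additional `fs.realpath` check.

```typescript
import * as path from 'node:path';

// Returns the absolute path if `requestedPath` stays inside `mediaRoot`, otherwise null.
function resolveInsideRoot(mediaRoot: string, requestedPath: string): string | null {
  const root = path.posix.resolve(mediaRoot);
  const resolved = path.posix.resolve(root, requestedPath);
  const relative = path.posix.relative(root, resolved);
  // An empty relative path is the root itself, which is not a media file.
  if (relative === '') return null;
  // A relative path that climbs out of the root starts with a ".." segment.
  if (relative === '..' || relative.startsWith('../')) return null;
  return resolved;
}
```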
4. Scaling Strategies
// Horizontal scaling with Redis coordination
class DistributedTranscodingOrchestrator {
private redis: Redis;
private nodeId: string;
async startTranscoding(request: TranscodingRequest): Promise<TranscodingJob> {
// Check global capacity
const globalLoad = await this.getGlobalLoad();
if (globalLoad > 0.8) {
throw new Error('System at capacity');
}
// Assign to least loaded node
const targetNode = await this.findLeastLoadedNode();
if (targetNode === this.nodeId) {
return this.startLocalTranscoding(request);
} else {
return this.delegateToNode(targetNode, request);
}
}
private async getGlobalLoad(): Promise<number> {
const nodes = await this.redis.smembers('transcoding:nodes');
let totalJobs = 0;
let totalCapacity = 0;
for (const node of nodes) {
const nodeLoad = await this.redis.hgetall(`transcoding:node:${node}`);
totalJobs += parseInt(nodeLoad.activeJobs || '0');
totalCapacity += parseInt(nodeLoad.maxCapacity || '0');
}
return totalCapacity > 0 ? totalJobs / totalCapacity : 1;
}
}
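`findLeastLoadedNode` is referenced above but not shown. One possible shape, assuming the per-node hashes have already been read from Redis into plain objects, is a scan for the lowest utilization ratio. The `NodeLoad` type and the rule of skipping full nodes are assumptions for this sketch.

```typescript
interface NodeLoad {
  nodeId: string;
  activeJobs: number;
  maxCapacity: number;
}

// Picks the node with the lowest activeJobs / maxCapacity ratio; full or zero-capacity nodes are skipped.
function findLeastLoadedNode(nodes: readonly NodeLoad[]): string | null {
  let best: NodeLoad | null = null;
  for (const node of nodes) {
    if (node.maxCapacity <= 0 || node.activeJobs >= node.maxCapacity) continue;
    if (
      best === null ||
      node.activeJobs / node.maxCapacity < best.activeJobs / best.maxCapacity
    ) {
      best = node;
    }
  }
  return best ? best.nodeId : null;
}
```

Because the load figures are read and acted on in separate steps, two requests can pick the same node at once; a real deployment would need an atomic reservation on top of this.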
Summary of Key Learnings
Architecture Principles
- Process Isolation: Each transcoding session gets independent FFmpeg process
- Linear Encoding: FFmpeg cannot seek backwards, requiring process restart for large seeks
- Resource Management: Proactive throttling and cleanup prevent resource exhaustion
- Client Lifecycle: Ping-based session management with automatic cleanup
Performance Optimizations
- Segment Strategy: 6-second HLS segments balance seek performance with efficiency
- Throttling Logic: Pause encoding when >60 seconds ahead of playback
- Cleanup Policies: Automatic segment removal for content >5 minutes
- Hardware Acceleration: Graceful fallback from hardware to software encoding
Implementation Strategy
- API Design: RESTful endpoints for session management and HLS segment delivery
- Client Integration: Adaptive seeking strategy based on seek distance
- Error Handling: Comprehensive retry logic with fallback options
- Security: Input validation, path traversal protection, rate limiting
Deployment Considerations
- Monitoring: Track active jobs, success rates, resource usage
- Scaling: Horizontal scaling with Redis coordination
- Storage: Temporary file management and cleanup
- Network: CDN integration for segment delivery
This architecture provides a robust foundation for building production-grade media transcoding systems with proper resource management, client lifecycle handling, and performance optimization.
Core Architecture
1. Main Components
TranscodeManager (TranscodeManager.cs)
- Role: Central orchestrator for all transcoding operations
- Key Responsibilities:
- FFmpeg process lifecycle management
- Job tracking and session management
- Resource cleanup and monitoring
- Client ping/keepalive handling
MediaEncoder (MediaEncoder.cs)
- Role: FFmpeg binary interface and process execution
- Key Responsibilities:
- FFmpeg path validation and capability detection
- Process creation and monitoring
- Hardware acceleration detection
- Encoder/decoder capability enumeration
TranscodingJob (TranscodingJob.cs)
- Role: Individual transcoding session state management
- Key Responsibilities:
- Process lifecycle tracking
- Resource usage monitoring (bytes, position, bitrate)
- Timer management for auto-cleanup
- Client connection state
2. Control Flow
graph TB
Client[Client Request] --> API[API Controller]
API --> StreamState[Create StreamState]
StreamState --> TranscodeManager[TranscodeManager.StartFfMpeg]
TranscodeManager --> FFmpeg[Launch FFmpeg Process]
FFmpeg --> Job[Create TranscodingJob]
Job --> Monitor[Job Monitoring]
Monitor --> Throttle[Throttling System]
Monitor --> Cleanup[Segment Cleanup]
Monitor --> Ping[Ping System]
Ping --> Timeout[Kill Timer]
Detailed Flow:
- Request Reception: API controllers (DynamicHlsController, VideosController, AudioController) receive transcoding requests
- State Creation: StreamingHelpers.GetStreamingState() analyzes media and creates StreamState
- Job Initialization: TranscodeManager.StartFfMpeg() creates and starts FFmpeg process
- Process Monitoring: TranscodingJob tracks process state and resource usage
- Client Management: Ping system keeps sessions alive, kill timers handle disconnections
- Resource Management: Throttling and segment cleaning optimize performance
- Cleanup: Automatic cleanup when sessions end or timeout
Key Systems
1. Ping System - Keep Transcoding Alive
Purpose: Prevents transcoding jobs from being killed when clients are actively consuming content.
Implementation:
// PingTranscodingJob method in TranscodeManager
public void PingTranscodingJob(string playSessionId, bool? isUserPaused)
{
var jobs = _activeTranscodingJobs.Where(j =>
string.Equals(playSessionId, j.PlaySessionId, StringComparison.OrdinalIgnoreCase))
.ToList();
foreach (var job in jobs)
{
if (isUserPaused.HasValue)
{
job.IsUserPaused = isUserPaused.Value;
}
PingTimer(job, true);
}
}
private void PingTimer(TranscodingJob job, bool isProgressCheckIn)
{
if (job.HasExited)
{
job.StopKillTimer();
return;
}
// Different timeouts for different job types
var timerDuration = job.Type != TranscodingJobType.Progressive ? 60000 : 10000;
job.PingTimeout = timerDuration;
job.LastPingDate = DateTime.UtcNow;
job.StartKillTimer(OnTranscodeKillTimerStopped);
}
Key Parameters:
- Progressive streams: 10 second timeout (10000ms)
- HLS streams: 60 second timeout (60000ms)
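- Ping frequency: Clients should ping every 30-45 seconds for HLS; the interval has to stay below the job's timeout, so it must be shorter for progressive streams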
- Auto-ping: Triggered on playback progress events
2. Kill Timers - Automatic Cleanup
Purpose: Automatically terminate abandoned transcoding jobs to free resources.
Implementation:
private async void OnTranscodeKillTimerStopped(object? state)
{
var job = state as TranscodingJob;
if (!job.HasExited && job.Type != TranscodingJobType.Progressive)
{
var timeSinceLastPing = (DateTime.UtcNow - job.LastPingDate).TotalMilliseconds;
if (timeSinceLastPing < job.PingTimeout)
{
// Reset timer if ping is still fresh
job.StartKillTimer(OnTranscodeKillTimerStopped, job.PingTimeout);
return;
}
}
// Kill the job
await KillTranscodingJob(job, true, path => true).ConfigureAwait(false);
}
Configuration:
- Grace period: When the timer fires on a non-progressive job whose last ping is newer than PingTimeout, the timer is re-armed instead of killing the job
- Progressive vs HLS: Different timeout strategies
- Resource cleanup: Process termination, file cleanup, live stream closure
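The re-arm-or-kill decision made when the timer fires can be isolated as a pure function, which makes it easy to test without real timers. This is a sketch with hypothetical names. Unlike the C# above, which re-arms with the full PingTimeout, this variant re-arms with only the time remaining until the last ping expires.

```typescript
type KillDecision = { action: 'kill' } | { action: 'rearm'; delayMs: number };

// Decide what to do when a job's kill timer fires.
function decideOnTimerFire(
  nowMs: number,
  lastPingMs: number,
  pingTimeoutMs: number,
): KillDecision {
  const elapsed = nowMs - lastPingMs;
  if (elapsed < pingTimeoutMs) {
    // The client pinged recently: wait until that ping would expire.
    return { action: 'rearm', delayMs: pingTimeoutMs - elapsed };
  }
  return { action: 'kill' };
}
```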
3. Throttling - Resource Management
Purpose: Controls transcoding speed to prevent resource waste and improve efficiency.
Conditions for Throttling:
private static bool EnableThrottling(StreamState state)
=> state.InputProtocol == MediaProtocol.File
&& state.RunTimeTicks.HasValue
&& state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks
&& state.IsInputVideo
&& state.VideoType == VideoType.VideoFile;
Throttling Logic:
private bool IsThrottleAllowed(TranscodingJob job, int thresholdSeconds)
{
var bytesDownloaded = job.BytesDownloaded;
var transcodingPositionTicks = job.TranscodingPositionTicks ?? 0;
var downloadPositionTicks = job.DownloadPositionTicks ?? 0;
var gapLengthInTicks = TimeSpan.FromSeconds(thresholdSeconds).Ticks;
if (downloadPositionTicks > 0 && transcodingPositionTicks > 0)
{
// HLS - time-based consideration
var gap = transcodingPositionTicks - downloadPositionTicks;
return gap > gapLengthInTicks;
}
// Progressive - byte-based consideration
// Calculate if transcoding is ahead enough to throttle
}
Control Mechanism:
- Pause command: Send 'p' (or 'c' for older FFmpeg) to stdin
- Resume command: Send 'u' (or newline for older FFmpeg) to stdin
- Threshold: Minimum 60 seconds ahead before throttling kicks in
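In a Node port, the control mechanism amounts to writing a single key to the child process's stdin. The sketch below takes the key values from the list above; whether a particular FFmpeg build honors them is an assumption that should be verified against the build in use. `FfmpegThrottle` and `WritableLike` are hypothetical names, and `WritableLike` is shaped so that `childProcess.stdin` can be passed in.

```typescript
// Minimal shape of a writable stream; a child process's stdin satisfies it.
interface WritableLike {
  write(chunk: string): boolean;
}

class FfmpegThrottle {
  private paused = false;
  private readonly stdin: WritableLike;
  private readonly legacyKeys: boolean;

  constructor(stdin: WritableLike, legacyKeys = false) {
    this.stdin = stdin;
    this.legacyKeys = legacyKeys;
  }

  pause(): void {
    if (this.paused) return; // avoid sending the key twice
    this.stdin.write(this.legacyKeys ? 'c' : 'p');
    this.paused = true;
  }

  resume(): void {
    if (!this.paused) return;
    this.stdin.write(this.legacyKeys ? '\n' : 'u');
    this.paused = false;
  }
}
```

Tracking the paused state locally keeps repeated throttle checks from flooding stdin with duplicate commands.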
4. Segment Cleaning - HLS Management
Purpose: Removes old HLS segments to prevent disk space exhaustion.
Conditions:
private static bool EnableSegmentCleaning(StreamState state)
=> state.InputProtocol is MediaProtocol.File or MediaProtocol.Http
&& state.IsInputVideo
&& state.TranscodingType == TranscodingJobType.Hls
&& state.RunTimeTicks.HasValue
&& state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks;
Implementation:
- Segment retention: Keeps last N segments (configurable)
- Cleanup frequency: Runs periodically during transcoding
- File patterns: Removes .ts or .mp4 segments and related files
- Safety: Includes retry logic and error handling
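The retention rule can be sketched as a filter over the segment indexes currently on disk. The function name and the exact rule (keep `keepCount` segments behind the last one the client requested, plus everything ahead of it) are assumptions for illustration and may differ from Jellyfin's cleaner.

```typescript
// Returns the segment indexes that are safe to delete, oldest first.
function segmentsToDelete(
  segmentIndexesOnDisk: readonly number[],
  lastRequestedIndex: number,
  keepCount: number,
): number[] {
  // Everything older than this index is behind the retention window.
  const oldestToKeep = lastRequestedIndex - keepCount;
  return segmentIndexesOnDisk
    .filter((index) => index < oldestToKeep)
    .sort((a, b) => a - b);
}
```

With segments 0 to 12 on disk, the client at segment 12 and `keepCount` set to 10, only segments 0 and 1 are removed.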
FFmpeg Command Structure
Complete Command Template
{inputModifier} {inputArgument} -map_metadata -1 -map_chapters -1 -threads {threads} {mapArgs} {videoArguments} {audioArguments} -copyts -avoid_negative_ts disabled -max_muxing_queue_size {maxMuxingQueueSize} -f hls -max_delay 5000000 -hls_time {segmentLength} -hls_segment_type {segmentFormat} -start_number {startNumber} -hls_segment_filename "{segmentPath}" {hlsArguments} -y "{outputPath}"
Parameter Breakdown
Input Modifiers
-re # Read input at native framerate
-hwaccel cuda # Hardware acceleration (optional)
-fflags +genpts # Generate presentation timestamps
-analyzeduration 5000000 # Analysis duration for streams
-readrate 10 # Input read rate limit (for segment deletion)
Core Parameters
-threads 0 # Auto-detect thread count
-map_metadata -1 # Strip metadata
-map_chapters -1 # Strip chapters
-copyts # Copy timestamps
-avoid_negative_ts disabled # Do not shift timestamps (keeps the values preserved by -copyts)
-max_muxing_queue_size 128 # Muxing queue size
HLS-Specific Parameters
-f hls # Output format
-hls_time 6 # Segment duration (seconds)
-hls_segment_type mpegts # Segment container (mpegts/fmp4)
-hls_playlist_type vod # Playlist type (vod/event)
-hls_list_size 0 # Keep all segments in playlist
-start_number 0 # Starting segment number
-hls_segment_filename "output%d.ts" # Segment naming pattern
-hls_base_url "hls/stream/" # Base URL for segments
Output Specification
"output.m3u8" # Output playlist file
Video Encoding Parameters
Quality Control
-c:v libx264 # Video codec
-preset veryfast # Encoding speed/quality trade-off
-crf 23 # Constant rate factor (quality)
-maxrate 2000k # Maximum bitrate
-bufsize 4000k # Buffer size
Resolution and Framerate
-vf "scale=1920:1080" # Scale to specific resolution
-r 30 # Output framerate
-pix_fmt yuv420p # Pixel format
Audio Encoding Parameters
Basic Audio
-c:a aac # Audio codec
-ab 128k # Audio bitrate
-ar 48000 # Sample rate
-ac 2 # Audio channels
Advanced Audio Processing
-af "volume=1.0" # Audio filters
-acodec copy # Copy audio stream
Implementation Guide for Next.js
1. Core Architecture
// Core interfaces
interface TranscodingJob {
id: string;
playSessionId: string;
process: ChildProcess;
lastPingDate: Date;
pingTimeout: number;
bytesDownloaded: number;
transcodingPositionTicks: number;
activeRequestCount: number;
isUserPaused: boolean;
hasExited: boolean;
}
interface StreamState {
outputFilePath: string;
segmentLength: number;
inputProtocol: string;
runTimeTicks: number;
isInputVideo: boolean;
transcodingType: 'hls' | 'progressive';
}
2. TranscodeManager Implementation
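import { spawn } from 'node:child_process';
import { randomUUID } from 'node:crypto';

class TranscodeManager {
  private activeJobs = new Map<string, TranscodingJob>();
  private killTimers = new Map<string, NodeJS.Timeout>();

  async startFfmpeg(state: StreamState, playSessionId: string): Promise<TranscodingJob> {
    const commandArgs = this.buildFfmpegCommand(state);
    // Named ffmpegProcess to avoid shadowing Node's global `process`
    const ffmpegProcess = spawn('ffmpeg', commandArgs);
    const job: TranscodingJob = {
      id: randomUUID(),
      playSessionId,
      process: ffmpegProcess,
      lastPingDate: new Date(),
      pingTimeout: state.transcodingType === 'progressive' ? 10000 : 60000,
      bytesDownloaded: 0,
      transcodingPositionTicks: 0,
      activeRequestCount: 1,
      isUserPaused: false,
      hasExited: false
    };
    // Record process exit so pings and timers stop acting on a dead job
    ffmpegProcess.once('exit', () => {
      job.hasExited = true;
      this.clearKillTimer(playSessionId);
    });
    this.activeJobs.set(playSessionId, job);
    this.startKillTimer(job);
    return job;
  }

  pingTranscodingJob(playSessionId: string, isUserPaused?: boolean) {
    const job = this.activeJobs.get(playSessionId);
    if (!job || job.hasExited) return;
    if (isUserPaused !== undefined) {
      job.isUserPaused = isUserPaused;
    }
    job.lastPingDate = new Date();
    // startKillTimer clears any pending timer first, so this resets the countdown
    this.startKillTimer(job);
  }

  private clearKillTimer(playSessionId: string) {
    const timer = this.killTimers.get(playSessionId);
    if (timer) clearTimeout(timer);
    this.killTimers.delete(playSessionId);
  }

  private startKillTimer(job: TranscodingJob) {
    this.clearKillTimer(job.playSessionId);
    const timer = setTimeout(() => {
      void this.checkAndKillJob(job);
    }, job.pingTimeout);
    this.killTimers.set(job.playSessionId, timer);
  }

  private async checkAndKillJob(job: TranscodingJob) {
    if (job.hasExited) return;
    const timeSinceLastPing = Date.now() - job.lastPingDate.getTime();
    if (timeSinceLastPing < job.pingTimeout) {
      // Reset timer if ping is still fresh
      this.startKillTimer(job);
      return;
    }
    // Kill the job
    await this.killTranscodingJob(job);
  }

  private async killTranscodingJob(job: TranscodingJob) {
    this.clearKillTimer(job.playSessionId);
    this.activeJobs.delete(job.playSessionId);
    if (!job.hasExited) job.process.kill('SIGKILL');
    // Removing the job's playlist and segment files is omitted from this sketch
  }

  // Placeholder: argument assembly is described under "FFmpeg Command Structure"
  private buildFfmpegCommand(state: StreamState): string[] {
    throw new Error('buildFfmpegCommand is not implemented in this sketch');
  }
}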
3. API Endpoints
// Next.js API routes
// /api/transcoding/[playSessionId]/ping
export async function POST(request: Request, { params }: { params: { playSessionId: string } }) {
const { isUserPaused } = await request.json();
transcodeManager.pingTranscodingJob(params.playSessionId, isUserPaused);
return Response.json({ success: true });
}
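// /api/hls/[...segments]
// Requires: import fs from 'node:fs'; import path from 'node:path'; import { Readable } from 'node:stream';
// `transcodingDir` is assumed to be a module-level constant pointing at the transcode output directory.
export async function GET(request: Request, { params }: { params: { segments: string[] } }) {
  const [playSessionId, segmentFile] = params.segments;
  if (!playSessionId || !segmentFile) {
    return new Response('Not found', { status: 404 });
  }
  // basename() strips directory components so a crafted name cannot escape transcodingDir
  const filePath = path.join(transcodingDir, path.basename(segmentFile));
  if (segmentFile.endsWith('.m3u8')) {
    // Return playlist
    const playlist = await fs.promises.readFile(filePath, 'utf8');
    return new Response(playlist, {
      headers: { 'Content-Type': 'application/vnd.apple.mpegurl' }
    });
  } else {
    // Return segment file as a web stream, which is what Response expects
    const fileStream = Readable.toWeb(fs.createReadStream(filePath)) as unknown as ReadableStream;
    return new Response(fileStream, { headers: { 'Content-Type': 'video/mp2t' } });
  }
}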
4. Client Integration
// Client-side ping implementation
class MediaPlayer {
private pingInterval: NodeJS.Timeout | null = null;
private playSessionId: string;
startPinging() {
this.pingInterval = setInterval(() => {
fetch(`/api/transcoding/${this.playSessionId}/ping`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ isUserPaused: this.isPaused })
});
}, 30000); // Ping every 30 seconds
}
stopPinging() {
if (this.pingInterval) {
clearInterval(this.pingInterval);
this.pingInterval = null;
}
}
}
Seeking Implementation
Jellyfin implements sophisticated seeking mechanisms for both progressive and HLS transcoding, handling different scenarios and optimizing for performance.
1. Core Seeking Logic
Central Method: GetFastSeekCommandLineParameter() in EncodingHelper.cs
public string GetFastSeekCommandLineParameter(EncodingJobInfo state, EncodingOptions options, string segmentContainer)
{
var time = state.BaseRequest.StartTimeTicks ?? 0;
var maxTime = state.RunTimeTicks ?? 0;
var seekParam = string.Empty;
if (time > 0)
{
// For direct streaming/remuxing, we seek at the exact position of the keyframe
// However, ffmpeg will seek to previous keyframe when the exact time is the input
// Workaround this by adding 0.5s offset to the seeking time to get the exact keyframe on most videos.
// This will help subtitle syncing.
var isHlsRemuxing = state.IsVideoRequest && state.TranscodingType is TranscodingJobType.Hls && IsCopyCodec(state.OutputVideoCodec);
var seekTick = isHlsRemuxing ? time + 5000000L : time;
// Seeking beyond EOF makes no sense in transcoding. Clamp the seekTick value to
// [0, RuntimeTicks - 5.0s], so that the muxer gets packets and avoid error codes.
if (maxTime > 0)
{
seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0));
}
seekParam += string.Format(CultureInfo.InvariantCulture, "-ss {0}", _mediaEncoder.GetTimeParameter(seekTick));
if (state.IsVideoRequest)
{
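// Add -noaccurate_seek for specific conditions.
// Note: segmentFormat and outputVideoCodec are not defined in this excerpt; they are presumably locals derived earlier in the method from segmentContainer and the encoding options.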
if (!string.Equals(state.InputContainer, "wtv", StringComparison.OrdinalIgnoreCase)
&& !string.Equals(segmentFormat, "ts", StringComparison.OrdinalIgnoreCase)
&& state.TranscodingType != TranscodingJobType.Progressive
&& !state.EnableBreakOnNonKeyFrames(outputVideoCodec)
&& (state.BaseRequest.StartTimeTicks ?? 0) > 0)
{
seekParam += " -noaccurate_seek";
}
}
}
return seekParam;
}
2. Seeking Types
A. Input Seeking (-ss before input)
- Purpose: Seek to position before decoding starts
- Advantages: Very fast, minimal CPU usage
- Disadvantages: Less accurate, seeks to nearest keyframe
- Usage: Primary method for initial positioning
B. Output Seeking (-ss after input)
- Purpose: Decode from beginning, then seek in output
- Advantages: Frame-accurate positioning
- Disadvantages: High CPU usage, slower startup
- Usage: When precision is critical
C. Accurate vs Fast Seeking
- Fast seeking (-noaccurate_seek): Seeks to the nearest preceding keyframe; fast but imprecise
- Accurate seeking: FFmpeg's default when transcoding; frame-precise but slower
- Dynamic selection: Based on container and transcoding type
3. HLS Segment-Based Seeking
Segment URL Structure
/hls/{itemId}/{playlistId}/{segmentId}.{container}?runtimeTicks={position}&actualSegmentLengthTicks={duration}
Key Parameters:
- runtimeTicks: Starting position of segment in media timeline
- actualSegmentLengthTicks: Precise duration of this specific segment
- segmentId: Sequential segment number (0-based)
Segment Calculation:
// Equal-length segments
var segmentLengthTicks = TimeSpan.FromSeconds(segmentLength).Ticks;
var wholeSegments = runtimeTicks / segmentLengthTicks;
var remainingTicks = runtimeTicks % segmentLengthTicks;
// Keyframe-aware segments (optimal)
var result = new List<double>();
var desiredSegmentLengthTicks = TimeSpan.FromMilliseconds(desiredSegmentLengthMs).Ticks;
foreach (var keyframe in keyframeData.KeyframeTicks)
{
if (keyframe >= desiredCutTime)
{
var currentSegmentLength = keyframe - lastKeyframe;
result.Add(TimeSpan.FromTicks(currentSegmentLength).TotalSeconds);
lastKeyframe = keyframe;
desiredCutTime += desiredSegmentLengthTicks;
}
}
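The C# fragments above omit their surrounding declarations, so here is a self-contained TypeScript sketch of both strategies. The initial values of `lastKeyframe` and `desiredCutTime`, the trailing remainder segment, and the function names are assumptions filled in for illustration. Ticks are 100 ns units, as elsewhere in this document.

```typescript
const TICKS_PER_SECOND = 10000000;

// Equal-length segments: N whole segments plus one shorter remainder, in seconds.
function equalLengthSegments(runtimeTicks: number, segmentSeconds: number): number[] {
  const segmentTicks = segmentSeconds * TICKS_PER_SECOND;
  const wholeSegments = Math.floor(runtimeTicks / segmentTicks);
  const remainingTicks = runtimeTicks % segmentTicks;
  const lengths = new Array<number>(wholeSegments).fill(segmentSeconds);
  if (remainingTicks > 0) lengths.push(remainingTicks / TICKS_PER_SECOND);
  return lengths;
}

// Keyframe-aware segments: cut at the first keyframe at or after each desired cut time.
function keyframeAlignedSegments(
  keyframeTicks: readonly number[],
  totalDurationTicks: number,
  desiredSegmentSeconds: number,
): number[] {
  const desiredTicks = desiredSegmentSeconds * TICKS_PER_SECOND;
  const lengths: number[] = [];
  let lastKeyframe = 0;
  let desiredCutTime = desiredTicks;
  for (const keyframe of keyframeTicks) {
    if (keyframe >= desiredCutTime) {
      lengths.push((keyframe - lastKeyframe) / TICKS_PER_SECOND);
      lastKeyframe = keyframe;
      desiredCutTime += desiredTicks;
    }
  }
  // Whatever follows the last cut becomes the final segment.
  if (totalDurationTicks > lastKeyframe) {
    lengths.push((totalDurationTicks - lastKeyframe) / TICKS_PER_SECOND);
  }
  return lengths;
}
```

For a 20-second file with 6-second segments, the first function yields lengths of 6, 6, 6 and 2 seconds. With keyframes at 0, 2, 4, 6.5, 8, 12 and 13 seconds in a 15-second file, the second yields 6.5, 5.5 and 3 seconds.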
4. Seek Optimizations
HLS Remuxing Offset
// Add 0.5s offset for HLS remuxing to hit exact keyframes
var isHlsRemuxing = state.IsVideoRequest && state.TranscodingType is TranscodingJobType.Hls && IsCopyCodec(state.OutputVideoCodec);
var seekTick = isHlsRemuxing ? time + 5000000L : time; // +0.5s
EOF Protection
// Prevent seeking beyond end of file
if (maxTime > 0)
{
seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0)); // -5s buffer
}
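The two adjustments above (remux offset, then EOF clamp) translate directly to TypeScript. This sketch mirrors the C# shown in this section; the function name and the `null` return for "no seek needed" are choices made here for illustration.

```typescript
const HALF_SECOND_TICKS = 5000000;
const FIVE_SECONDS_TICKS = 50000000;

// Returns the tick position to pass to -ss, or null when no seek argument is needed.
function computeSeekTicks(
  startTimeTicks: number,
  runTimeTicks: number,
  isHlsRemuxing: boolean,
): number | null {
  if (startTimeTicks <= 0) return null;
  // Remuxing: nudge forward 0.5 s so the seek lands on the intended keyframe.
  let seekTick = isHlsRemuxing ? startTimeTicks + HALF_SECOND_TICKS : startTimeTicks;
  if (runTimeTicks > 0) {
    // Clamp to [0, runtime - 5 s] so the muxer still receives packets.
    const upperBound = Math.max(runTimeTicks - FIVE_SECONDS_TICKS, 0);
    seekTick = Math.min(Math.max(seekTick, 0), upperBound);
  }
  return seekTick;
}
```

Note the order: the offset is applied before the clamp, so a remux seek near the end of the file is still held 5 seconds short of EOF.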
Container-Specific Rules
- WTV containers: Never use -noaccurate_seek (breaks seeking)
- MPEGTS segments: Disable -noaccurate_seek for client compatibility
- fMP4 containers: Require -noaccurate_seek for audio sync
5. Keyframe Extraction
Purpose: Generate precise segment boundaries aligned with keyframes
FFprobe Method:
ffprobe -loglevel error -skip_frame nokey -select_streams v:0 -show_entries packet=pts_time,flags -of csv=print_section=0 "input.mp4"
Matroska Method:
- Direct cue point extraction from container metadata
- Much faster than FFprobe for MKV files
- Reads cue tables for instant keyframe positions
6. Real-Time Seeking Scenarios
Progressive Transcoding
// Client seeks to new position
// Marked async because the kill request is awaited before the new stream starts
const seekTo = async (positionSeconds: number) => {
  // Kill current transcoding job
  await fetch(`/api/transcode/kill/${playSessionId}`, { method: 'POST' });
  // Start new transcoding from seek position
  const startTimeTicks = Math.round(positionSeconds * 10000000); // Convert to whole ticks
  const newUrl = `/api/videos/${itemId}/stream?startTimeTicks=${startTimeTicks}`;
  // Update video source
  videoElement.src = newUrl;
};
HLS Seeking
// HLS seeking is driven by the player, which requests the segment covering the target position
// Server generates segments with proper seek points
const hlsPlayer = new Hls();
hlsPlayer.loadSource(`/api/videos/${itemId}/master.m3u8`);
// The client does not restart transcoding jobs itself; when the requested segment has not
// been transcoded yet, restarting FFmpeg at that position is the server's responsibility
7. External Media Handling
External Subtitles
// Also seek external subtitle streams
var seekSubParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekSubParam))
{
arg.Append(' ').Append(seekSubParam);
}
arg.Append(" -i file:\"").Append(subtitlePath).Append('"');
External Audio
// Seek external audio streams to match video
var seekAudioParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekAudioParam))
{
arg.Append(' ').Append(seekAudioParam);
}
arg.Append(" -i \"").Append(state.AudioStream.Path).Append('"');
8. Implementation Guide for Next.js
Progressive Seeking
class ProgressiveTranscoder {
async seekTo(positionTicks: number): Promise<string> {
// Kill existing job
await this.killCurrentJob();
// Calculate seek parameters
const seekSeconds = positionTicks / 10000000;
const maxTime = this.mediaDuration;
// Clamp to safe bounds
const safeSeekTicks = Math.max(0, Math.min(positionTicks, maxTime - 50000000));
// Build FFmpeg command with seek
const args = [
'-ss', this.formatTime(safeSeekTicks),
'-i', this.inputPath,
'-c:v', 'libx264',
'-preset', 'veryfast',
// ... other encoding params
this.outputPath
];
return this.startTranscoding(args);
}
private formatTime(ticks: number): string {
const totalSeconds = ticks / 10000000;
const hours = Math.floor(totalSeconds / 3600);
const minutes = Math.floor((totalSeconds % 3600) / 60);
const seconds = totalSeconds % 60;
return `${hours}:${minutes.toString().padStart(2, '0')}:${seconds.toFixed(6).padStart(9, '0')}`;
}
}
HLS Seeking
class HLSTranscoder {
generateSegmentUrl(segmentId: number, runtimeTicks: number, segmentDurationTicks: number): string {
const params = new URLSearchParams({
runtimeTicks: runtimeTicks.toString(),
actualSegmentLengthTicks: segmentDurationTicks.toString()
});
return `/api/hls/${this.itemId}/${this.playlistId}/${segmentId}.ts?${params}`;
}
async generateSegment(segmentId: number, runtimeTicks: number): Promise<Buffer> {
const seekSeconds = runtimeTicks / 10000000;
const args = [
'-ss', this.formatTime(runtimeTicks),
'-i', this.inputPath,
'-t', this.segmentDuration.toString(),
'-c:v', 'libx264',
'-preset', 'veryfast',
'-force_key_frames', `expr:gte(t,n_forced*${this.segmentDuration})`,
'-f', 'mpegts',
'-'
];
return this.executeFFmpeg(args);
}
}
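The segment URLs produced by `generateSegmentUrl()` only help if a playlist lists them. The sketch below shows one way to build a VOD media playlist up front from the known runtime, so the player can request any segment before it has been transcoded. It assumes fixed-length segments with a shorter final segment. `segmentLengthsTicks` and `buildVodPlaylist` are hypothetical names, and `baseUrl` stands in for the `/api/hls/{itemId}/{playlistId}` prefix.

```typescript
const TICKS_PER_SECOND = 10000000;

// Splits the runtime into fixed-length segments; the last one takes the remainder.
function segmentLengthsTicks(runTimeTicks: number, segmentTicks: number): number[] {
  if (segmentTicks <= 0) throw new RangeError("segmentTicks must be positive");
  const lengths: number[] = [];
  for (let start = 0; start < runTimeTicks; start += segmentTicks) {
    lengths.push(Math.min(segmentTicks, runTimeTicks - start));
  }
  return lengths;
}

// Builds a VOD media playlist whose segment URLs carry the runtimeTicks and
// actualSegmentLengthTicks query parameters used by generateSegmentUrl().
function buildVodPlaylist(baseUrl: string, runTimeTicks: number, segmentTicks: number): string {
  const lines = [
    "#EXTM3U",
    "#EXT-X-PLAYLIST-TYPE:VOD",
    "#EXT-X-VERSION:3",
    `#EXT-X-TARGETDURATION:${Math.ceil(segmentTicks / TICKS_PER_SECOND)}`,
    "#EXT-X-MEDIA-SEQUENCE:0",
  ];
  let runtimeTicks = 0; // start position of the current segment
  segmentLengthsTicks(runTimeTicks, segmentTicks).forEach((length, index) => {
    const params = new URLSearchParams({
      runtimeTicks: runtimeTicks.toString(),
      actualSegmentLengthTicks: length.toString(),
    });
    lines.push(`#EXTINF:${(length / TICKS_PER_SECOND).toFixed(6)},`);
    lines.push(`${baseUrl}/${index}.ts?${params}`);
    runtimeTicks += length;
  });
  lines.push("#EXT-X-ENDLIST");
  return lines.join("\n");
}
```

Because the playlist is complete from the start, the player's seek bar spans the full duration even though only part of the content exists on disk.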
9. Performance Considerations
Seek Performance Tips:
- Use input seeking (-ss before -i) when possible
- Cache keyframe data for containers that support it
- Implement seek debouncing to prevent rapid job restarts
- Use appropriate segment duration (6s recommended for seek performance)
- Pre-generate keyframe indexes for frequently accessed content
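One way to use a cached keyframe index from the tips above: snap each seek target to the latest keyframe at or before it, so the encoder starts on a clean frame boundary. This is a sketch under the assumption that the index is a sorted array of keyframe timestamps in ticks. `snapToKeyframe` is a hypothetical helper, and how the index is produced is left open.

```typescript
// Returns the latest keyframe at or before the target, using binary search.
// keyframesTicks must be sorted ascending; it is assumed to come from a
// pre-generated, cached index.
function snapToKeyframe(keyframesTicks: number[], targetTicks: number): number {
  let low = 0;
  let high = keyframesTicks.length - 1;
  let best = 0; // fall back to the start when no keyframe precedes the target
  while (low <= high) {
    const mid = (low + high) >> 1;
    if (keyframesTicks[mid] <= targetTicks) {
      best = keyframesTicks[mid];
      low = mid + 1; // a later keyframe may still qualify
    } else {
      high = mid - 1;
    }
  }
  return best;
}
```

The lookup is O(log n), so it is cheap enough to run on every seek request.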
Client-Side Optimizations:
// Debounce seeking to prevent excessive requests
const debouncedSeek = debounce((position: number) => {
this.performSeek(position);
}, 300);
// Progressive seeking strategy (positions in seconds; 30 s threshold)
if (Math.abs(targetPosition - currentPosition) < 30) {
// Small seeks: let player buffer naturally
player.currentTime = targetPosition;
} else {
// Large seeks: restart transcoding
this.seekTo(targetPosition);
}
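The `debounce` call above is not defined in this document. A minimal trailing-edge version could look like the sketch below. The `Scheduler` parameter is an assumption added here so the timer can be swapped out in tests; in normal use the default `setTimeout`-based scheduler applies.

```typescript
type Cancel = () => void;
type Scheduler = (callback: () => void, delayMs: number) => Cancel;

// Default scheduler backed by setTimeout/clearTimeout.
const timeoutScheduler: Scheduler = (callback, delayMs) => {
  const handle = setTimeout(callback, delayMs);
  return () => clearTimeout(handle);
};

// Trailing-edge debounce: only the last call within delayMs is delivered.
function debounce<T>(
  fn: (value: T) => void,
  delayMs: number,
  schedule: Scheduler = timeoutScheduler
): (value: T) => void {
  let cancel: Cancel | undefined;
  return (value: T) => {
    cancel?.(); // drop the previously scheduled call, if any
    cancel = schedule(() => {
      cancel = undefined;
      fn(value);
    }, delayMs);
  };
}
```

While the user drags the seek bar, every intermediate position is discarded and only the final one triggers a transcoding restart.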
Progress Tracking and Seeking During Transcoding
The Challenge: Unlike direct play where duration and seek positions are straightforward, transcoding creates a "streaming-like" scenario where the real duration is not immediately available and progress tracking becomes complex.
1. Core Progress Tracking Architecture
Key Components:
- TranscodingPositionTicks: Where FFmpeg transcoding has currently reached
- DownloadPositionTicks: Where the client has consumed content to
- CompletionPercentage: Calculated progress based on runtime vs current position
- RunTimeTicks: Total media duration from metadata
Progress Calculation Logic:
// From JobLogger.ParseLogLine() - extracts progress from FFmpeg output
var totalMs = state.RunTimeTicks.HasValue
? TimeSpan.FromTicks(state.RunTimeTicks.Value).TotalMilliseconds
: 0;
var currentMs = /* parsed from FFmpeg time output */;
if (totalMs > 0)
{
percent = 100.0 * currentMs / totalMs;
transcodingPosition = TimeSpan.FromMilliseconds(currentMs);
}
2. Real-Time Progress Updates
FFmpeg Output Parsing:
// JobLogger monitors FFmpeg stderr output for progress
private void ParseLogLine(string line, EncodingJobInfo state)
{
// Parse: frame= 123 fps= 25 q=28.0 size= 1024kB time=00:01:23.45 bitrate= 512.0kbits/s
// Extract: time value for current transcoding position
// Extract: size value for bytes transcoded
// Extract: bitrate for current encoding rate
}
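For a Node-based port, the parsing step outlined above might look like the following sketch. It covers only the three fields named in the comments and mirrors the percentage formula from the Progress Calculation Logic. `parseProgressLine` and `completionPercent` are hypothetical names, and accepting both `kB` and `KiB` as the size unit is an assumption about differing FFmpeg output formats.

```typescript
interface FfmpegProgress {
  positionMs: number;
  sizeKb?: number;
  bitrateKbps?: number;
}

// Parses one FFmpeg status line such as:
//   frame= 123 fps= 25 q=28.0 size= 1024kB time=00:01:23.45 bitrate= 512.0kbits/s
// Returns null when the line carries no usable time= field (e.g. time=N/A).
function parseProgressLine(line: string): FfmpegProgress | null {
  const time = /time=\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(line);
  if (!time) return null;
  const seconds = Number(time[1]) * 3600 + Number(time[2]) * 60 + Number(time[3]);
  const size = /size=\s*(\d+)\s*(?:kB|KiB)/.exec(line);
  const bitrate = /bitrate=\s*([\d.]+)\s*kbits\/s/.exec(line);
  return {
    positionMs: Math.round(seconds * 1000),
    sizeKb: size ? Number(size[1]) : undefined,
    bitrateKbps: bitrate ? Number(bitrate[1]) : undefined,
  };
}

// Same formula as the C# excerpt: percent = 100 * currentMs / totalMs.
function completionPercent(positionMs: number, runTimeTicks?: number): number | undefined {
  if (!runTimeTicks || runTimeTicks <= 0) return undefined;
  const totalMs = runTimeTicks / 10000; // 10,000 ticks per millisecond
  return Math.min(100, (100 * positionMs) / totalMs);
}
```

Lines without a parseable `time=` value return `null`, so the caller can skip them instead of reporting a bogus position.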
Progress Reporting Chain:
// 1. FFmpeg outputs progress to stderr
// 2. JobLogger.ParseLogLine() extracts values
// 3. TranscodeManager.ReportTranscodingProgress() updates job state
// 4. SessionManager.ReportTranscodingInfo() updates client session
// 5. TranscodingInfo DTO sent to client via WebSocket/API
interface TranscodingInfo {
CompletionPercentage?: number; // 0-100 progress
Bitrate?: number; // Current bitrate
Framerate?: number; // Current FPS
Width?: number; // Video width
Height?: number; // Video height
AudioCodec: string; // Audio codec in use
VideoCodec: string; // Video codec in use
Container: string; // Output container
}
3. Client Progress Bar Implementation
Progressive Transcoding:
class ProgressiveTranscodingProgress {
private transcodingInfo: TranscodingInfo;
private mediaRunTimeTicks: number;
updateProgressBar(): void {
if (this.transcodingInfo?.CompletionPercentage) {
// Use transcoding percentage directly
const progress = this.transcodingInfo.CompletionPercentage / 100;
this.progressBar.value = progress;
// Estimate current playable duration
const availableDuration = this.mediaRunTimeTicks * progress;
this.updateSeekableRange(0, availableDuration);
}
}
handleSeek(targetPositionTicks: number): void {
const transcodedTicks = this.mediaRunTimeTicks * (this.transcodingInfo.CompletionPercentage / 100);
if (targetPositionTicks <= transcodedTicks) {
// Seek within transcoded content
this.player.currentTime = targetPositionTicks / 10000000;
} else {
// Restart transcoding from seek position
this.startTranscodingFromPosition(targetPositionTicks);
}
}
}
HLS Transcoding:
class HLSTranscodingProgress {
private segmentDuration: number = 6; // seconds
private totalSegments: number;
calculateProgress(): ProgressInfo {
// HLS progress based on segment availability
const availableSegments = this.getAvailableSegmentCount();
const progress = availableSegments / this.totalSegments;
return {
percentage: progress * 100,
availableDuration: availableSegments * this.segmentDuration,
seekableEnd: availableSegments * this.segmentDuration
};
}
updateSegmentDownloadPosition(): void {
// Update DownloadPositionTicks when segments are consumed
const segmentEndTicks = this.currentRuntimeTicks + this.actualSegmentLengthTicks;
this.transcodingJob.DownloadPositionTicks = Math.max(
this.transcodingJob.DownloadPositionTicks ?? segmentEndTicks,
segmentEndTicks
);
}
}
4. Solving the "Real Duration is Not Real" Problem
Duration Estimation Strategies:
1. Metadata-Based Duration:
// Use media file metadata as baseline
const estimatedDuration = mediaSource.RunTimeTicks;
if (estimatedDuration) {
this.totalDuration = estimatedDuration;
this.progressBar.max = estimatedDuration;
}
2. Progressive Duration Discovery:
// Back-calculate the total from the transcoded position and percentage.
// CompletionPercentage is itself derived from RunTimeTicks on the server, so
// this is only useful when the client has no metadata duration of its own.
// Only trust the estimate once more than 10% has been transcoded.
if (transcodingInfo.CompletionPercentage > 10) {
const currentTranscodedTicks = /* current position from transcoding */;
this.totalDuration = currentTranscodedTicks / (transcodingInfo.CompletionPercentage / 100);
}
3. HLS Segment-Based Calculation:
// For HLS, calculate from segment structure
const calculateHLSDuration = (segments: SegmentInfo[]): number => {
return segments.reduce((total, segment) => {
return total + segment.actualSegmentLengthTicks;
}, 0);
};
5. Advanced Progress Management
Buffering and Availability:
class TranscodingBuffer {
private bufferAheadSeconds: number = 30;
isPositionAvailable(targetPositionTicks: number): boolean {
const transcodedTicks = this.getTranscodedPositionTicks();
return targetPositionTicks <= transcodedTicks;
}
calculateSeekableRange(): { start: number; end: number } {
return {
start: 0,
// Never report a negative end while less than the buffer has been transcoded
end: Math.max(0, this.getTranscodedPositionTicks() - (this.bufferAheadSeconds * 10000000))
};
}
shouldThrottleTranscoding(): boolean {
const gap = this.transcodingPositionTicks - this.downloadPositionTicks;
const targetGap = this.bufferAheadSeconds * 10000000; // 30s in ticks
return gap > targetGap;
}
}
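`shouldThrottleTranscoding()` answers only "is the encoder too far ahead right now?". A server that actually pauses and resumes FFmpeg also has to avoid flapping when the gap hovers around the threshold. The sketch below adds a second, lower threshold for resuming. This hysteresis is an illustrative design choice, not something the source attributes to Jellyfin. The 60 s pause default follows the throttling threshold stated earlier in this document, while the 30 s resume default and the name `nextThrottleAction` are assumptions.

```typescript
type ThrottleAction = "pause" | "resume" | "none";

interface ThrottleState {
  transcodingPositionTicks: number;
  downloadPositionTicks: number;
  paused: boolean;
}

const TICKS_PER_SECOND = 10000000;

// Decides whether to pause or resume the encoder. Two thresholds (hysteresis)
// keep the job from flapping when the gap hovers around a single cut-off.
function nextThrottleAction(
  state: ThrottleState,
  pauseAheadSeconds = 60, // pause when further ahead than this
  resumeAheadSeconds = 30 // assumed value: resume once the gap shrinks below this
): ThrottleAction {
  const gapSeconds =
    (state.transcodingPositionTicks - state.downloadPositionTicks) / TICKS_PER_SECOND;
  if (!state.paused && gapSeconds > pauseAheadSeconds) return "pause";
  if (state.paused && gapSeconds < resumeAheadSeconds) return "resume";
  return "none";
}
```

The caller would map `"pause"` and `"resume"` to the FFmpeg stdin commands described under Resource Management Philosophy.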
Smooth Progress Updates:
class SmoothProgressUpdater {
private interpolationInterval: number = 1000; // 1 second
private lastKnownPosition: number = 0;
private lastUpdateTime: number = Date.now();
interpolateProgress(): number {
if (!this.isPlaying) return this.lastKnownPosition;
const now = Date.now();
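const elapsedMs = now - this.lastUpdateTime;
// lastKnownPosition is in ticks, so convert elapsed milliseconds (10,000 ticks per ms)
const estimatedProgress = this.lastKnownPosition + elapsedMs * 10000;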
// Don't exceed known transcoded position
const maxAvailable = this.getTranscodedPositionTicks();
return Math.min(estimatedProgress, maxAvailable);
}
}
6. Error Handling and Edge Cases
Transcoding Failures:
class TranscodingErrorHandler {
handleTranscodingError(error: TranscodingError): void {
switch (error.type) {
case 'SEEK_BEYOND_DURATION':
// Clamp seek to valid range
this.seekTo(Math.min(this.targetPosition, this.maxAvailablePosition));
break;
case 'TRANSCODING_STALLED':
// Restart transcoding
this.restartTranscodingFromLastKnownPosition();
break;
case 'INVALID_DURATION':
// Fall back to live estimation
this.enableLiveDurationEstimation();
break;
}
}
}
Network Issues:
class NetworkResilienceHandler {
private retryPolicy = {
maxRetries: 3,
backoffMs: [1000, 2000, 4000]
};
async handleProgressUpdateFailure(attempt: number): Promise<void> {
if (attempt < this.retryPolicy.maxRetries) {
await this.delay(this.retryPolicy.backoffMs[attempt]);
return this.fetchProgressUpdate();
} else {
// Switch to local time-based estimation
this.enableLocalProgressEstimation();
}
}
}
7. Next.js Implementation Guide
Complete Progress Management:
class NextJSTranscodingProgressManager {
private progressUpdateInterval: NodeJS.Timeout;
constructor(
private videoElement: HTMLVideoElement,
private wsConnection: WebSocket // connection opened by the caller
) {
this.setupWebSocketUpdates();
this.setupProgressInterpolation();
}
private setupWebSocketUpdates(): void {
this.wsConnection.onmessage = (event) => {
const message = JSON.parse(event.data);
if (message.MessageType === 'TranscodingInfo') {
this.updateTranscodingInfo(message.Data);
}
};
}
private updateTranscodingInfo(info: TranscodingInfo): void {
// Update progress bar (CompletionPercentage is 0-100, so progressBar.max is assumed to be 100)
if (info.CompletionPercentage != null) {
this.progressBar.value = info.CompletionPercentage;
}
// Update seekable range
const seekableEnd = this.calculateSeekableEnd(info);
this.videoElement.setAttribute('data-seekable-end', seekableEnd.toString());
// Update duration if we have better estimate
this.updateDurationEstimate(info);
}
async handleSeek(targetSeconds: number): Promise<void> {
const targetTicks = targetSeconds * 10000000;
const transcodedTicks = this.getTranscodedPositionTicks();
if (targetTicks <= transcodedTicks) {
// Seek within available content
this.videoElement.currentTime = targetSeconds;
} else {
// Show loading state
this.showBufferingState();
// Request new transcoding position
await this.requestTranscodingFromPosition(targetTicks);
// Update video source
this.videoElement.src = this.generateStreamUrl(targetTicks);
}
}
}
Important Configuration
Environment Variables
FFMPEG_PATH=/usr/bin/ffmpeg
TRANSCODING_TEMP_PATH=/tmp/jellyfin/transcoding
MAX_CONCURRENT_STREAMS=3
SEGMENT_KEEP_SECONDS=300
THROTTLE_DELAY_SECONDS=60
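In a Next.js server these variables would typically be read once at startup and validated, so a typo fails fast instead of surfacing mid-stream. The sketch below assumes the values shown above as defaults. `TranscodingConfig`, `readPositiveInt`, and `loadTranscodingConfig` are hypothetical names.

```typescript
interface TranscodingConfig {
  ffmpegPath: string;
  tempPath: string;
  maxConcurrentStreams: number;
  segmentKeepSeconds: number;
  throttleDelaySeconds: number;
}

type Env = Record<string, string | undefined>;

// Reads a positive integer from the environment, falling back to a default
// when the variable is unset and failing fast when it is malformed.
function readPositiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Defaults mirror the example values listed above.
function loadTranscodingConfig(env: Env): TranscodingConfig {
  return {
    ffmpegPath: env["FFMPEG_PATH"] ?? "/usr/bin/ffmpeg",
    tempPath: env["TRANSCODING_TEMP_PATH"] ?? "/tmp/jellyfin/transcoding",
    maxConcurrentStreams: readPositiveInt(env, "MAX_CONCURRENT_STREAMS", 3),
    segmentKeepSeconds: readPositiveInt(env, "SEGMENT_KEEP_SECONDS", 300),
    throttleDelaySeconds: readPositiveInt(env, "THROTTLE_DELAY_SECONDS", 60),
  };
}
```

In Node the function would be called with `process.env`; taking the environment as a parameter keeps it testable.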
Performance Tuning
- Thread count: Auto-detect based on CPU cores
- Buffer sizes: Adjust based on available memory
- Segment duration: 6 seconds for good seek performance
- Concurrent streams: Limit based on system resources
Security Considerations
- Input validation: Sanitize all file paths and parameters
- Resource limits: Prevent denial of service (DoS) through excessive transcoding
- Access control: Validate session ownership
- File cleanup: Remove orphaned files regularly
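As an illustration of the input-validation point, the dynamic parts of a segment route such as `/api/hls/{itemId}/{playlistId}/{segmentId}.ts` can be checked against a strict allow-list before they reach path construction or the FFmpeg argument list. The ID alphabet and length limits below are assumptions to adapt to the actual ID scheme. `parseSegmentRequest` is a hypothetical helper.

```typescript
interface SegmentRequest {
  itemId: string;
  playlistId: string;
  segmentId: number;
}

const ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/; // assumed ID alphabet and length
const SEGMENT_PATTERN = /^(\d{1,9})\.ts$/; // e.g. "42.ts"

// Validates the dynamic route parts. Rejecting anything outside the allow-list
// means values such as "../" never reach the file system or FFmpeg.
function parseSegmentRequest(
  itemId: string,
  playlistId: string,
  file: string
): SegmentRequest | null {
  const segment = SEGMENT_PATTERN.exec(file);
  if (!ID_PATTERN.test(itemId) || !ID_PATTERN.test(playlistId) || !segment) {
    return null;
  }
  return { itemId, playlistId, segmentId: Number(segment[1]) };
}
```

A route handler would respond with 400 when this returns `null`, and only then go on to check that the session owns the playlist.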
Monitoring and Logging
Key Metrics to Track
- Active transcoding jobs count
- Resource usage (CPU, memory, disk I/O)
- Average transcoding speed vs playback speed
- Client ping frequency and timeouts
- Segment cleanup efficiency
Error Handling
- FFmpeg process failures
- Disk space exhaustion
- Network timeouts
- Invalid media files
- Hardware acceleration failures
This architecture provides a robust foundation for building a media transcoding system with proper resource management, client lifecycle handling, and performance optimization.