Jellyfin Transcoding Architecture Documentation

Executive Summary

Jellyfin's transcoding system is a sophisticated media processing pipeline built around FFmpeg that provides real-time video and audio conversion for various client devices. This document details the core architecture, control flow, and implementation patterns valuable for building a similar system in Next.js.

Key Findings:

Engine: FFmpeg managed through MediaEncoder class with hardware acceleration support
Job Management: TranscodeManager orchestrates process lifecycle, throttling, and cleanup
Session Control: Client ping system (10s/60s timeouts) with automatic kill timers
Resource Management: Intelligent throttling based on playback position (+60s threshold)
Segment Management: Automatic cleanup for HLS content >5 minutes duration
Seeking Strategy: FFmpeg process restart required for large seeks due to linear encoding nature

Key Implementation Insights

1. FFmpeg Process Management Strategy

Core Principle: Each transcoding job is a linear, stateful FFmpeg process that cannot seek to arbitrary positions after encoding starts. This fundamental limitation drives the entire architecture design.

Implications:

Large seeks require process restart: When users seek >30 seconds, kill current job and start new one
Small seeks use player buffering: Let HLS players handle <30 second seeks naturally
Process isolation: Each session gets independent FFmpeg process for resource control

2. Resource Management Philosophy

Throttling Strategy: Prevent transcoding from running too far ahead of playback

Threshold: Pause encoding when >60 seconds ahead of client consumption
Control Method: Send FFmpeg stdin commands ('p' for pause, 'u' for resume)
Benefits: Reduces CPU/disk usage, prevents unnecessary work

Segment Cleanup Policy: Automatic disk space management for HLS

Trigger Conditions: Video content with duration >5 minutes
Retention: Configurable number of segments to keep
Safety: Retry logic with graceful error handling

3. Session Lifecycle Management

Ping System Design: Keep-alive mechanism prevents orphaned processes

Progressive streams: 10-second timeout (fast response needed)
HLS streams: 60-second timeout (more buffering tolerance)
Client responsibility: Ping every 30-45 seconds during active playback, which stays inside the 60-second HLS window

Kill Timer Implementation: Automatic cleanup for abandoned sessions

Grace periods: Multiple timeout intervals before termination
Resource cleanup: Process termination + file cleanup + stream closure

4. FFmpeg Command Architecture

Standard HLS Command Structure:

ffmpeg [input_modifiers] -i input.mp4 [encoding_params] -f hls -hls_time 6 -hls_segment_filename "segment%d.ts" output.m3u8

Essential Parameters:

-f hls: HLS output format
-hls_time 6: 6-second segments for optimal seek performance
-hls_segment_filename: Consistent naming pattern
output.m3u8: Playlist file output
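
The parameters above can be assembled programmatically. The following is a minimal sketch, assuming a hypothetical `buildHlsArgs` helper and `HlsOptions` shape (neither exists in Jellyfin); the codec choices are placeholders taken from the encoding sections of this document. Returning an argument array rather than a single string lets it be passed straight to `spawn` without shell quoting.

```typescript
// Hypothetical helper: names and defaults are illustrative, not Jellyfin's.
interface HlsOptions {
  inputPath: string;
  outputDir: string;
  segmentSeconds?: number; // defaults to the 6 s used throughout this document
  startSeconds?: number; // optional input seek position
}

function buildHlsArgs(opts: HlsOptions): string[] {
  const segmentSeconds = opts.segmentSeconds ?? 6;
  const args: string[] = [];

  // Input seeking: -ss goes before -i so FFmpeg seeks before decoding.
  if (opts.startSeconds !== undefined && opts.startSeconds > 0) {
    args.push("-ss", opts.startSeconds.toFixed(3));
  }

  args.push(
    "-i", opts.inputPath,
    "-c:v", "libx264", "-preset", "veryfast",
    "-c:a", "aac",
    "-f", "hls",
    "-hls_time", String(segmentSeconds),
    "-hls_segment_filename", `${opts.outputDir}/segment%d.ts`,
    `${opts.outputDir}/output.m3u8`, // playlist path is the final argument
  );
  return args;
}
```

A call such as `spawn('ffmpeg', buildHlsArgs({ inputPath, outputDir }))` would then start the job.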

Critical Design Decisions

1. When to Restart vs Continue Transcoding

Process Restart Required:

Large seek operations (>30 seconds)
Quality/resolution changes
Audio track switching
Subtitle track changes

Continue Existing Process:

Small seeks (<30 seconds) - let player handle
Pause/resume operations
Client reconnections within timeout window
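
The two lists above reduce to a single decision function. This is a sketch under stated assumptions: the `PlaybackChange` type and `requiresRestart` name are hypothetical, and the 30-second threshold is the figure quoted in this document rather than a value read from Jellyfin's source.

```typescript
// Hypothetical change descriptor; the field names are illustrative.
type PlaybackChange =
  | { kind: "seek"; fromSeconds: number; toSeconds: number }
  | { kind: "quality" | "audioTrack" | "subtitleTrack" }
  | { kind: "pause" | "resume" | "reconnect" };

const LARGE_SEEK_SECONDS = 30; // threshold quoted in the lists above

function requiresRestart(change: PlaybackChange): boolean {
  switch (change.kind) {
    case "seek":
      // Only seeks farther than the threshold force a new FFmpeg process.
      return Math.abs(change.toSeconds - change.fromSeconds) > LARGE_SEEK_SECONDS;
    case "quality":
    case "audioTrack":
    case "subtitleTrack":
      // These change the FFmpeg arguments, so the process must be replaced.
      return true;
    default:
      // pause, resume, and reconnect keep the current process.
      return false;
  }
}
```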

2. Segment Duration Strategy

6-Second Standard: Optimal balance between:

Seek performance: Reasonable granularity for user seeking
Network efficiency: Not too many small requests
Startup time: Quick initial buffering
Storage overhead: Manageable file count

3. Hardware Acceleration Integration

MediaEncoder Responsibilities:

Capability detection: Probe available hardware encoders
Process execution: Manage FFmpeg with hardware flags
Fallback handling: Graceful degradation to software encoding
Performance monitoring: Track encoding speeds and success rates
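
Capability detection and fallback can be sketched as a pure selection step. The function name `pickVideoEncoder` is hypothetical, and the `available` set is assumed to have been parsed from the output of `ffmpeg -encoders`. An encoder being listed there does not guarantee the device works at runtime, so a failed start should still fall back to software encoding.

```typescript
type HardwareAcceleration = "auto" | "nvidia" | "intel" | "amd" | "none";

// FFmpeg H.264 encoder names per vendor; libx264 is the software fallback.
const H264_ENCODERS: Record<"nvidia" | "intel" | "amd", string> = {
  nvidia: "h264_nvenc",
  intel: "h264_qsv",
  amd: "h264_amf",
};

function pickVideoEncoder(
  preference: HardwareAcceleration,
  available: ReadonlySet<string>,
): string {
  if (preference === "none") return "libx264";

  // "auto" tries every known hardware encoder in declaration order.
  const candidates =
    preference === "auto" ? Object.values(H264_ENCODERS) : [H264_ENCODERS[preference]];

  for (const candidate of candidates) {
    if (available.has(candidate)) return candidate;
  }
  return "libx264"; // graceful degradation to software encoding
}
```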

Architecture Patterns for Next.js Implementation

1. Core Classes Structure

// Main orchestrator (equivalent to TranscodeManager)
class TranscodingOrchestrator {
  private activeJobs = new Map<string, TranscodingJob>();
  private killTimers = new Map<string, NodeJS.Timeout>();
  
  async startTranscoding(request: TranscodingRequest): Promise<TranscodingJob>
  pingSession(sessionId: string, isUserPaused?: boolean): void
  async killSession(sessionId: string): Promise<void>
}

// FFmpeg interface (equivalent to MediaEncoder)
class MediaProcessor {
  private ffmpegPath: string;
  private hardwareCapabilities: HardwareCapabilities;
  
  async probe(inputPath: string): Promise<MediaInfo>
  async startProcess(args: string[]): Promise<ChildProcess>
  detectHardwareSupport(): HardwareCapabilities
}

// Individual job tracking (equivalent to TranscodingJob)
class TranscodingSession {
  id: string;
  process: ChildProcess;
  lastPingDate: Date;
  throttler?: TranscodingThrottler;
  segmentCleaner?: SegmentCleaner;
  
  startKillTimer(callback: () => void): void
  stopKillTimer(): void
  pause(): Promise<void>
  resume(): Promise<void>
}

2. API Endpoint Design

// Next.js API routes structure

// POST /api/transcoding/start
export async function POST(request: Request) {
  const { itemId, startTimeTicks, quality } = await request.json();
  const job = await orchestrator.startTranscoding({
    itemId,
    startTimeTicks,
    quality,
    sessionId: generateSessionId()
  });
  return Response.json({ sessionId: job.id, playlistUrl: job.playlistPath });
}

// POST /api/transcoding/[sessionId]/ping
export async function POST(request: Request, { params }: { params: { sessionId: string } }) {
  const { isUserPaused } = await request.json();
  orchestrator.pingSession(params.sessionId, isUserPaused);
  return Response.json({ success: true });
}

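// GET /api/hls/[sessionId]/[segmentId].ts
// Requires: import fs from 'node:fs'; import { Readable } from 'node:stream';
export async function GET(request: Request, { params }: { params: { sessionId: string, segmentId: string } }) {
  const segmentPath = getSegmentPath(params.sessionId, params.segmentId);
  // Convert the Node stream to a web ReadableStream, which Response accepts
  const stream = Readable.toWeb(fs.createReadStream(segmentPath)) as unknown as ReadableStream;
  return new Response(stream, { headers: { 'Content-Type': 'video/mp2t' } });
}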

// DELETE /api/transcoding/[sessionId]
export async function DELETE(request: Request, { params }: { params: { sessionId: string } }) {
  await orchestrator.killSession(params.sessionId);
  return Response.json({ success: true });
}

3. Client Integration Patterns

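class AdaptiveMediaPlayer {
  private sessionId: string = '';
  private itemId: string = '';
  // ReturnType<typeof setInterval> type-checks in both browser and Node typings
  private pingInterval?: ReturnType<typeof setInterval>;
  private lastPosition: number = 0;

  // hls is assumed to be an hls.js instance already attached to videoElement
  constructor(private hls: Hls, private videoElement: HTMLVideoElement) {}
  
  async startPlayback(itemId: string, startPosition: number = 0) {
    // Remember the item so a large seek can restart the same media
    this.itemId = itemId;

    // Start transcoding session (1 second = 10,000,000 ticks)
    const response = await fetch('/api/transcoding/start', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ itemId, startTimeTicks: Math.round(startPosition * 10000000) })
    });
    const { sessionId, playlistUrl } = await response.json();
    
    this.sessionId = sessionId;
    this.startPinging();
    
    // Load HLS playlist
    this.hls.loadSource(playlistUrl);
  }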
  
  private startPinging() {
    this.pingInterval = setInterval(() => {
      fetch(`/api/transcoding/${this.sessionId}/ping`, {
        method: 'POST',
        body: JSON.stringify({ isUserPaused: this.videoElement.paused })
      });
    }, 30000); // Ping every 30 seconds
  }
  
  async seekTo(targetPosition: number) {
    // Measure from the live playback position; lastPosition only changes on seeks
    const seekDistance = Math.abs(targetPosition - this.videoElement.currentTime);
    
    if (seekDistance > 30) {
      // Large seek: restart transcoding
      await this.stopPlayback();
      await this.startPlayback(this.itemId, targetPosition);
    } else {
      // Small seek: let HLS player handle
      this.videoElement.currentTime = targetPosition;
    }
    
    this.lastPosition = targetPosition;
  }
  
  async stopPlayback() {
    if (this.pingInterval) {
      clearInterval(this.pingInterval);
    }
    
    if (this.sessionId) {
      await fetch(`/api/transcoding/${this.sessionId}`, { method: 'DELETE' });
    }
  }
}

4. Configuration Management

// Environment configuration
interface TranscodingConfig {
  ffmpegPath: string;
  transcodingTempPath: string;
  maxConcurrentStreams: number;
  segmentDuration: number; // 6 seconds recommended
  segmentKeepCount: number;
  throttleThresholdSeconds: number; // 60 seconds recommended
  pingTimeoutProgressive: number; // milliseconds (10000 = 10 seconds)
  pingTimeoutHls: number; // milliseconds (60000 = 60 seconds)
  hardwareAcceleration: 'auto' | 'nvidia' | 'intel' | 'amd' | 'none';
}

// Load from environment
const config: TranscodingConfig = {
  ffmpegPath: process.env.FFMPEG_PATH || '/usr/bin/ffmpeg',
  transcodingTempPath: process.env.TRANSCODING_TEMP_PATH || '/tmp/transcoding',
  maxConcurrentStreams: parseInt(process.env.MAX_CONCURRENT_STREAMS || '3'),
  segmentDuration: 6,
  segmentKeepCount: parseInt(process.env.SEGMENT_KEEP_COUNT || '10'),
  throttleThresholdSeconds: 60,
  pingTimeoutProgressive: 10000,
  pingTimeoutHls: 60000,
  hardwareAcceleration: (process.env.HARDWARE_ACCEL as any) || 'auto'
};

Production Deployment Considerations

1. Performance Monitoring

interface TranscodingMetrics {
  activeJobs: number;
  averageStartupTime: number;
  successRate: number;
  cpuUsage: number;
  memoryUsage: number;
  diskIORate: number;
  clientTimeouts: number;
}

class MetricsCollector {
  collectMetrics(): TranscodingMetrics {
    return {
      activeJobs: this.orchestrator.getActiveJobCount(),
      averageStartupTime: this.calculateAverageStartup(),
      successRate: this.calculateSuccessRate(),
      cpuUsage: process.cpuUsage().user / 1000000, // Microseconds to seconds; covers this Node process only, not FFmpeg children
      memoryUsage: process.memoryUsage().heapUsed / 1024 / 1024, // MB
      diskIORate: this.calculateDiskIO(),
      clientTimeouts: this.timeoutCounter
    };
  }
}

2. Error Handling Strategy

class TranscodingErrorHandler {
  async handleFFmpegFailure(job: TranscodingSession, error: Error) {
    // Log error with context
    logger.error('FFmpeg process failed', {
      jobId: job.id,
      inputPath: job.inputPath,
      error: error.message,
      exitCode: job.process.exitCode
    });
    
    // Attempt recovery strategies; stop here only if a retry was actually started
    if (this.isRetryableError(error) && await this.retryWithFallback(job)) {
      return;
    }
    
    // Clean up resources
    await this.cleanupFailedJob(job);
    
    // Notify client
    this.notifyClientError(job.sessionId, error);
  }
  
  private async retryWithFallback(job: TranscodingSession): Promise<boolean> {
    // Try software encoding if hardware failed
    if (job.hardwareAcceleration && this.isHardwareError(job.lastError)) {
      logger.info('Retrying with software encoding', { jobId: job.id });
      return this.restartWithSoftwareEncoding(job);
    }
    
    // Try lower quality settings
    if (job.retryCount < 2) {
      logger.info('Retrying with reduced quality', { jobId: job.id });
      return this.restartWithReducedQuality(job);
    }
    
    return false;
  }
}

3. Security Considerations

class SecurityValidator {
  validateTranscodingRequest(request: TranscodingRequest): ValidationResult {
    // Path traversal protection
    if (this.containsPathTraversal(request.inputPath)) {
      return { valid: false, error: 'Invalid input path' };
    }
    
    // File type validation
    if (!this.isAllowedMediaType(request.inputPath)) {
      return { valid: false, error: 'Unsupported media type' };
    }
    
    // Resource limits
    if (this.exceedsResourceLimits(request)) {
      return { valid: false, error: 'Resource limits exceeded' };
    }
    
    // Rate limiting per client
    if (this.isRateLimited(request.clientId)) {
      return { valid: false, error: 'Rate limit exceeded' };
    }
    
    return { valid: true };
  }
  
  private containsPathTraversal(path: string): boolean {
    return path.includes('..') || path.includes('~') || path.startsWith('/');
  }
  
  private isAllowedMediaType(path: string): boolean {
    const allowedExtensions = ['.mp4', '.mkv', '.avi', '.mov', '.m4v'];
    return allowedExtensions.some(ext => path.toLowerCase().endsWith(ext));
  }
}
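
Substring checks such as `includes('..')` reject some legitimate file names and depend on the input never being normalized differently downstream. A common alternative, sketched here under the assumption that all media lives under one configured root directory, is to resolve the path and verify containment. The function name `resolveInsideRoot` is hypothetical. Note that this does not follow symlinks; resolving them with `fs.realpath` first would be a further hardening step.

```typescript
import * as path from "node:path";

// Returns the absolute path if `requested` stays inside `root`, otherwise null.
function resolveInsideRoot(root: string, requested: string): string | null {
  const rootAbs = path.resolve(root);
  const target = path.resolve(rootAbs, requested);
  const rel = path.relative(rootAbs, target);

  const escapes =
    rel === "" || // the root itself is not a media file
    rel === ".." ||
    rel.startsWith(".." + path.sep) || // climbs out of the root
    path.isAbsolute(rel); // different drive on Windows

  return escapes ? null : target;
}
```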

4. Scaling Strategies

// Horizontal scaling with Redis coordination
class DistributedTranscodingOrchestrator {
  private redis: Redis;
  private nodeId: string;
  
  async startTranscoding(request: TranscodingRequest): Promise<TranscodingJob> {
    // Check global capacity
    const globalLoad = await this.getGlobalLoad();
    if (globalLoad > 0.8) {
      throw new Error('System at capacity');
    }
    
    // Assign to least loaded node
    const targetNode = await this.findLeastLoadedNode();
    if (targetNode === this.nodeId) {
      return this.startLocalTranscoding(request);
    } else {
      return this.delegateToNode(targetNode, request);
    }
  }
  
  private async getGlobalLoad(): Promise<number> {
    const nodes = await this.redis.smembers('transcoding:nodes');
    let totalJobs = 0;
    let totalCapacity = 0;
    
    for (const node of nodes) {
      const nodeLoad = await this.redis.hgetall(`transcoding:node:${node}`);
      totalJobs += parseInt(nodeLoad.activeJobs || '0');
      totalCapacity += parseInt(nodeLoad.maxCapacity || '0');
    }
    
    return totalCapacity > 0 ? totalJobs / totalCapacity : 1;
  }
}

Summary of Key Learnings

Architecture Principles

Process Isolation: Each transcoding session gets independent FFmpeg process
Linear Encoding: FFmpeg cannot seek backwards, requiring process restart for large seeks
Resource Management: Proactive throttling and cleanup prevent resource exhaustion
Client Lifecycle: Ping-based session management with automatic cleanup

Performance Optimizations

Segment Strategy: 6-second HLS segments balance seek performance with efficiency
Throttling Logic: Pause encoding when >60 seconds ahead of playback
Cleanup Policies: Automatic segment removal for content >5 minutes
Hardware Acceleration: Graceful fallback from hardware to software encoding

Implementation Strategy

API Design: RESTful endpoints for session management and HLS segment delivery
Client Integration: Adaptive seeking strategy based on seek distance
Error Handling: Comprehensive retry logic with fallback options
Security: Input validation, path traversal protection, rate limiting

Deployment Considerations

Monitoring: Track active jobs, success rates, resource usage
Scaling: Horizontal scaling with Redis coordination
Storage: Temporary file management and cleanup
Network: CDN integration for segment delivery

This architecture provides a robust foundation for building production-grade media transcoding systems with proper resource management, client lifecycle handling, and performance optimization.

Core Architecture

1. Main Components

TranscodeManager (`TranscodeManager.cs`)

Role: Central orchestrator for all transcoding operations
Key Responsibilities:
- FFmpeg process lifecycle management
- Job tracking and session management
- Resource cleanup and monitoring
- Client ping/keepalive handling

MediaEncoder (`MediaEncoder.cs`)

Role: FFmpeg binary interface and process execution
Key Responsibilities:
- FFmpeg path validation and capability detection
- Process creation and monitoring
- Hardware acceleration detection
- Encoder/decoder capability enumeration

TranscodingJob (`TranscodingJob.cs`)

Role: Individual transcoding session state management
Key Responsibilities:
- Process lifecycle tracking
- Resource usage monitoring (bytes, position, bitrate)
- Timer management for auto-cleanup
- Client connection state

2. Control Flow

graph TB
    Client[Client Request] --> API[API Controller]
    API --> StreamState[Create StreamState]
    StreamState --> TranscodeManager[TranscodeManager.StartFfMpeg]
    TranscodeManager --> FFmpeg[Launch FFmpeg Process]
    FFmpeg --> Job[Create TranscodingJob]
    Job --> Monitor[Job Monitoring]
    Monitor --> Throttle[Throttling System]
    Monitor --> Cleanup[Segment Cleanup]
    Monitor --> Ping[Ping System]
    Ping --> Timeout[Kill Timer]

Detailed Flow:

Request Reception: API controllers (DynamicHlsController, VideosController, AudioController) receive transcoding requests
State Creation: StreamingHelpers.GetStreamingState() analyzes media and creates StreamState
Job Initialization: TranscodeManager.StartFfMpeg() creates and starts FFmpeg process
Process Monitoring: TranscodingJob tracks process state and resource usage
Client Management: Ping system keeps sessions alive, kill timers handle disconnections
Resource Management: Throttling and segment cleaning optimize performance
Cleanup: Automatic cleanup when sessions end or timeout

Key Systems

1. Ping System - Keep Transcoding Alive

Purpose: Prevents transcoding jobs from being killed when clients are actively consuming content.

Implementation:

// PingTranscodingJob method in TranscodeManager
public void PingTranscodingJob(string playSessionId, bool? isUserPaused)
{
    var jobs = _activeTranscodingJobs.Where(j => 
        string.Equals(playSessionId, j.PlaySessionId, StringComparison.OrdinalIgnoreCase))
        .ToList();
    
    foreach (var job in jobs)
    {
        if (isUserPaused.HasValue)
        {
            job.IsUserPaused = isUserPaused.Value;
        }
        PingTimer(job, true);
    }
}

private void PingTimer(TranscodingJob job, bool isProgressCheckIn)
{
    if (job.HasExited)
    {
        job.StopKillTimer();
        return;
    }

    // Different timeouts for different job types
    var timerDuration = job.Type != TranscodingJobType.Progressive ? 60000 : 10000;
    
    job.PingTimeout = timerDuration;
    job.LastPingDate = DateTime.UtcNow;
    
    job.StartKillTimer(OnTranscodeKillTimerStopped);
}

Key Parameters:

Progressive streams: 10 second timeout (10000ms)
HLS streams: 60 second timeout (60000ms)
Ping frequency: Clients should ping every 30-45 seconds
Auto-ping: Triggered on playback progress events

2. Kill Timers - Automatic Cleanup

Purpose: Automatically terminate abandoned transcoding jobs to free resources.

Implementation:

private async void OnTranscodeKillTimerStopped(object? state)
{
    var job = state as TranscodingJob;
    if (!job.HasExited && job.Type != TranscodingJobType.Progressive)
    {
        var timeSinceLastPing = (DateTime.UtcNow - job.LastPingDate).TotalMilliseconds;
        
        if (timeSinceLastPing < job.PingTimeout)
        {
            // Reset timer if ping is still fresh
            job.StartKillTimer(OnTranscodeKillTimerStopped, job.PingTimeout);
            return;
        }
    }
    
    // Kill the job
    await KillTranscodingJob(job, true, path => true).ConfigureAwait(false);
}

Configuration:

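Grace period: When the timer fires on an HLS job whose last ping is newer than PingTimeout, the timer is re-armed instead of killing the job
Progressive vs HLS: Progressive jobs skip that freshness check and are killed as soon as their timer fires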
Resource cleanup: Process termination, file cleanup, live stream closure

3. Throttling - Resource Management

Purpose: Controls transcoding speed to prevent resource waste and improve efficiency.

Conditions for Throttling:

private static bool EnableThrottling(StreamState state)
    => state.InputProtocol == MediaProtocol.File
       && state.RunTimeTicks.HasValue
       && state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks
       && state.IsInputVideo
       && state.VideoType == VideoType.VideoFile;

Throttling Logic:

private bool IsThrottleAllowed(TranscodingJob job, int thresholdSeconds)
{
    var bytesDownloaded = job.BytesDownloaded;
    var transcodingPositionTicks = job.TranscodingPositionTicks ?? 0;
    var downloadPositionTicks = job.DownloadPositionTicks ?? 0;
    
    var gapLengthInTicks = TimeSpan.FromSeconds(thresholdSeconds).Ticks;
    
    if (downloadPositionTicks > 0 && transcodingPositionTicks > 0)
    {
        // HLS - time-based consideration
        var gap = transcodingPositionTicks - downloadPositionTicks;
        return gap > gapLengthInTicks;
    }
    
    // Progressive - byte-based consideration (the comparison is elided in this excerpt)
    // Fall through: without usable position data, do not throttle
    return false;
}

Control Mechanism:

Pause command: Send 'p' (or 'c' for older FFmpeg) to stdin
Resume command: Send 'u' (or newline for older FFmpeg) to stdin
Threshold: Minimum 60 seconds ahead before throttling kicks in
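
For a Node.js port, the HLS branch of the check and the stdin control can be sketched as follows. The names `shouldThrottle`, `setEncoderPaused`, and `CommandSink` are hypothetical. The `p`/`u` keys are the ones listed above; whether a particular FFmpeg build honors them is an assumption that should be verified before relying on it.

```typescript
const TICKS_PER_SECOND = 10000000; // one tick is 100 ns

// True when the encoder is more than thresholdSeconds ahead of the client.
function shouldThrottle(
  transcodingPositionTicks: number,
  downloadPositionTicks: number,
  thresholdSeconds = 60,
): boolean {
  // Without both positions there is nothing to compare.
  if (transcodingPositionTicks <= 0 || downloadPositionTicks <= 0) return false;
  const gapTicks = transcodingPositionTicks - downloadPositionTicks;
  return gapTicks > thresholdSeconds * TICKS_PER_SECOND;
}

// Minimal writable shape; a ChildProcess's stdin satisfies it.
interface CommandSink {
  write(chunk: string): unknown;
}

// Sends the pause or resume key to FFmpeg's stdin.
function setEncoderPaused(stdin: CommandSink, paused: boolean): void {
  stdin.write(paused ? "p" : "u");
}
```

A periodic timer would call `shouldThrottle` with the job's current positions and call `setEncoderPaused` only when the result changes.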

4. Segment Cleaning - HLS Management

Purpose: Removes old HLS segments to prevent disk space exhaustion.

Conditions:

private static bool EnableSegmentCleaning(StreamState state)
    => state.InputProtocol is MediaProtocol.File or MediaProtocol.Http
       && state.IsInputVideo
       && state.TranscodingType == TranscodingJobType.Hls
       && state.RunTimeTicks.HasValue
       && state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks;

Implementation:

Segment retention: Keeps last N segments (configurable)
Cleanup frequency: Runs periodically during transcoding
File patterns: Removes .ts or .mp4 segments and related files
Safety: Includes retry logic and error handling
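
The selection step of such a cleaner can be sketched as a pure function. This version is an assumption rather than Jellyfin's algorithm: it keeps a window of `keepBehind` segments behind the last segment the client requested, so segments the client has not reached yet are never deleted. The name `segmentsToDelete` and the file-name pattern are illustrative.

```typescript
// Returns the segment files that are safe to delete.
// Files that do not look like numbered segments (playlists, init files) are ignored.
function segmentsToDelete(
  fileNames: readonly string[],
  lastRequestedIndex: number,
  keepBehind: number,
): string[] {
  const cutoff = lastRequestedIndex - keepBehind;
  const doomed: string[] = [];

  for (const name of fileNames) {
    // Matches names such as "segment12.ts" or "segment12.mp4".
    const match = /(\d+)\.(?:ts|mp4)$/.exec(name);
    if (match === null) continue;
    if (Number(match[1]) < cutoff) doomed.push(name);
  }
  return doomed;
}
```

The caller would then remove each returned file with `fs.promises.unlink`, retrying or skipping on errors, because FFmpeg or a concurrent reader may still hold a file open.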

FFmpeg Command Structure

Complete Command Template

{inputModifier} {inputArgument} -map_metadata -1 -map_chapters -1 -threads {threads} {mapArgs} {videoArguments} {audioArguments} -copyts -avoid_negative_ts disabled -max_muxing_queue_size {maxMuxingQueueSize} -f hls -max_delay 5000000 -hls_time {segmentLength} -hls_segment_type {segmentFormat} -start_number {startNumber} -hls_segment_filename "{segmentPath}" {hlsArguments} -y "{outputPath}"

Parameter Breakdown

Input Modifiers

-re                           # Read input at native framerate
-hwaccel cuda                 # Hardware acceleration (optional)
-fflags +genpts              # Generate presentation timestamps
-analyzeduration 5000000     # Analysis duration for streams
-readrate 10                 # Input read rate limit (for segment deletion)

Core Parameters

-threads 0                    # Auto-detect thread count
-map_metadata -1             # Strip metadata
-map_chapters -1             # Strip chapters
-copyts                      # Copy timestamps
-avoid_negative_ts disabled  # Leave negative timestamps unshifted
-max_muxing_queue_size 128   # Muxing queue size

HLS-Specific Parameters

-f hls                                    # Output format
-hls_time 6                              # Segment duration (seconds)
-hls_segment_type mpegts                 # Segment container (mpegts/fmp4)
-hls_playlist_type vod                   # Playlist type (vod/event)
-hls_list_size 0                         # Keep all segments in playlist
-start_number 0                          # Starting segment number
-hls_segment_filename "output%d.ts"      # Segment naming pattern
-hls_base_url "hls/stream/"              # Base URL for segments

Output Specification

"output.m3u8"                            # Output playlist file

Video Encoding Parameters

Quality Control

-c:v libx264                   # Video codec
-preset veryfast               # Encoding speed/quality trade-off
-crf 23                        # Constant rate factor (quality)
-maxrate 2000k                 # Maximum bitrate
-bufsize 4000k                 # Buffer size

Resolution and Framerate

-vf "scale=1920:1080"         # Scale to specific resolution
-r 30                         # Output framerate
-pix_fmt yuv420p              # Pixel format

Audio Encoding Parameters

Basic Audio

-c:a aac                      # Audio codec
-ab 128k                      # Audio bitrate
-ar 48000                     # Sample rate
-ac 2                         # Audio channels

Advanced Audio Processing

-af "volume=1.0"              # Audio filters
-acodec copy                  # Copy audio stream

Implementation Guide for Next.js

1. Core Architecture

// Core interfaces
interface TranscodingJob {
  id: string;
  playSessionId: string;
  process: ChildProcess;
  lastPingDate: Date;
  pingTimeout: number;
  bytesDownloaded: number;
  transcodingPositionTicks: number;
  activeRequestCount: number;
  isUserPaused: boolean;
  hasExited: boolean;
}

interface StreamState {
  outputFilePath: string;
  segmentLength: number;
  inputProtocol: string;
  runTimeTicks: number;
  isInputVideo: boolean;
  transcodingType: 'hls' | 'progressive';
}

2. TranscodeManager Implementation

import { spawn } from 'node:child_process';

class TranscodeManager {
  private activeJobs = new Map<string, TranscodingJob>();
  private killTimers = new Map<string, NodeJS.Timeout>();
  
  async startFfmpeg(state: StreamState, playSessionId: string): Promise<TranscodingJob> {
    const commandArgs = this.buildFfmpegCommand(state);
    // Named `child` so it does not shadow Node's global `process`
    const child = spawn('ffmpeg', commandArgs);
    
    const job: TranscodingJob = {
      id: crypto.randomUUID(),
      playSessionId,
      process: child,
      lastPingDate: new Date(),
      pingTimeout: state.transcodingType === 'progressive' ? 10000 : 60000,
      bytesDownloaded: 0,
      transcodingPositionTicks: 0,
      activeRequestCount: 1,
      isUserPaused: false,
      hasExited: false
    };

    // Record the exit so pings and kill timers stop treating the job as live
    child.once('exit', () => { job.hasExited = true; });
    
    this.activeJobs.set(playSessionId, job);
    this.startKillTimer(job);
    
    return job;
  }
  
  pingTranscodingJob(playSessionId: string, isUserPaused?: boolean) {
    const job = this.activeJobs.get(playSessionId);
    if (!job || job.hasExited) return;
    
    if (isUserPaused !== undefined) {
      job.isUserPaused = isUserPaused;
    }
    
    job.lastPingDate = new Date();
    this.startKillTimer(job); // Clears the pending timer and arms a fresh one
  }
  
  private startKillTimer(job: TranscodingJob) {
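    // Cancel any timer already pending for this session
    const existing = this.killTimers.get(job.playSessionId);
    if (existing) clearTimeout(existing);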
    
    const timer = setTimeout(() => {
      this.checkAndKillJob(job);
    }, job.pingTimeout);
    
    this.killTimers.set(job.playSessionId, timer);
  }
  
  private async checkAndKillJob(job: TranscodingJob) {
    const timeSinceLastPing = Date.now() - job.lastPingDate.getTime();
    
    if (timeSinceLastPing < job.pingTimeout) {
      // Reset timer if ping is still fresh
      this.startKillTimer(job);
      return;
    }
    
    // Kill the job
    await this.killTranscodingJob(job);
  }
}

3. API Endpoints

// Next.js API routes
// /api/transcoding/[playSessionId]/ping
export async function POST(request: Request, { params }: { params: { playSessionId: string } }) {
  const { isUserPaused } = await request.json();
  transcodeManager.pingTranscodingJob(params.playSessionId, isUserPaused);
  return Response.json({ success: true });
}

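// /api/hls/[...segments]
import fs from 'node:fs';
import path from 'node:path';
import { Readable } from 'node:stream';

export async function GET(request: Request, { params }: { params: { segments: string[] } }) {
  const [playSessionId, segmentFile] = params.segments;
  // transcodingDir is the configured temp directory (transcodingTempPath).
  // basename() strips directory components so a request cannot escape it.
  const filePath = path.join(transcodingDir, path.basename(segmentFile));
  
  if (segmentFile.endsWith('.m3u8')) {
    // Return the playlist that FFmpeg wrote to disk
    const playlist = await fs.promises.readFile(filePath, 'utf8');
    return new Response(playlist, {
      headers: { 'Content-Type': 'application/vnd.apple.mpegurl' }
    });
  } else {
    // Return segment file as a web stream (MPEG-TS segments assumed)
    const fileStream = Readable.toWeb(fs.createReadStream(filePath)) as unknown as ReadableStream;
    return new Response(fileStream, { headers: { 'Content-Type': 'video/mp2t' } });
  }
}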

4. Client Integration

// Client-side ping implementation
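class MediaPlayer {
  // ReturnType<typeof setInterval> type-checks in both browser and Node typings
  private pingInterval: ReturnType<typeof setInterval> | null = null;
  private isPaused = false; // Updated by the player's pause/play handlers

  constructor(private playSessionId: string) {}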
  
  startPinging() {
    this.pingInterval = setInterval(() => {
      fetch(`/api/transcoding/${this.playSessionId}/ping`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ isUserPaused: this.isPaused })
      });
    }, 30000); // Ping every 30 seconds
  }
  
  stopPinging() {
    if (this.pingInterval) {
      clearInterval(this.pingInterval);
      this.pingInterval = null;
    }
  }
}

Seeking Implementation

Jellyfin implements sophisticated seeking mechanisms for both progressive and HLS transcoding, handling different scenarios and optimizing for performance.

1. Core Seeking Logic

Central Method: GetFastSeekCommandLineParameter() in EncodingHelper.cs

public string GetFastSeekCommandLineParameter(EncodingJobInfo state, EncodingOptions options, string segmentContainer)
{
    var time = state.BaseRequest.StartTimeTicks ?? 0;
    var maxTime = state.RunTimeTicks ?? 0;
    var seekParam = string.Empty;

    if (time > 0)
    {
        // For direct streaming/remuxing, we seek at the exact position of the keyframe
        // However, ffmpeg will seek to previous keyframe when the exact time is the input
        // Workaround this by adding 0.5s offset to the seeking time to get the exact keyframe on most videos.
        // This will help subtitle syncing.
        var isHlsRemuxing = state.IsVideoRequest && state.TranscodingType is TranscodingJobType.Hls && IsCopyCodec(state.OutputVideoCodec);
        var seekTick = isHlsRemuxing ? time + 5000000L : time;

        // Seeking beyond EOF makes no sense in transcoding. Clamp the seekTick value to
        // [0, RuntimeTicks - 5.0s], so that the muxer gets packets and avoid error codes.
        if (maxTime > 0)
        {
            seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0));
        }

        seekParam += string.Format(CultureInfo.InvariantCulture, "-ss {0}", _mediaEncoder.GetTimeParameter(seekTick));

        if (state.IsVideoRequest)
        {
            // Add -noaccurate_seek for specific conditions
            // (segmentFormat and outputVideoCodec are locals whose definitions are elided in this excerpt)
            if (!string.Equals(state.InputContainer, "wtv", StringComparison.OrdinalIgnoreCase)
                && !string.Equals(segmentFormat, "ts", StringComparison.OrdinalIgnoreCase)
                && state.TranscodingType != TranscodingJobType.Progressive
                && !state.EnableBreakOnNonKeyFrames(outputVideoCodec)
                && (state.BaseRequest.StartTimeTicks ?? 0) > 0)
            {
                seekParam += " -noaccurate_seek";
            }
        }
    }

    return seekParam;
}

2. Seeking Types

A. Input Seeking (`-ss` before input)

Purpose: Seek to position before decoding starts
Advantages: Very fast, minimal CPU usage
Disadvantages: With stream copy or -noaccurate_seek, output starts at the preceding keyframe rather than the exact time
Usage: Primary method for initial positioning

B. Output Seeking (`-ss` after input)

Purpose: Decode from beginning, then seek in output
Advantages: Frame-accurate positioning
Disadvantages: High CPU usage, slower startup
Usage: When precision is critical

C. Accurate vs Fast Seeking

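Fast seeking (-noaccurate_seek): Starts output at the keyframe preceding the requested time; must be enabled explicitly
Accurate seeking: FFmpeg's default when transcoding; decodes and discards frames up to the requested time, so it is frame-precise but slower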
Dynamic selection: Based on container and transcoding type

3. HLS Segment-Based Seeking

Segment URL Structure

/hls/{itemId}/{playlistId}/{segmentId}.{container}?runtimeTicks={position}&actualSegmentLengthTicks={duration}

Key Parameters:

runtimeTicks: Starting position of segment in media timeline
actualSegmentLengthTicks: Precise duration of this specific segment
segmentId: Sequential segment number (0-based)

Segment Calculation:

// Equal-length segments
var segmentLengthTicks = TimeSpan.FromSeconds(segmentLength).Ticks;
var wholeSegments = runtimeTicks / segmentLengthTicks;
var remainingTicks = runtimeTicks % segmentLengthTicks;

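// Keyframe-aware segments (optimal)
var result = new List<double>();
var desiredSegmentLengthTicks = TimeSpan.FromMilliseconds(desiredSegmentLengthMs).Ticks;
long lastKeyframe = 0;                          // start of the segment being built
var desiredCutTime = desiredSegmentLengthTicks; // earliest acceptable cut point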
foreach (var keyframe in keyframeData.KeyframeTicks)
{
    if (keyframe >= desiredCutTime)
    {
        var currentSegmentLength = keyframe - lastKeyframe;
        result.Add(TimeSpan.FromTicks(currentSegmentLength).TotalSeconds);
        lastKeyframe = keyframe;
        desiredCutTime += desiredSegmentLengthTicks;
    }
}
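
The keyframe-aware calculation translates directly to TypeScript. This is a sketch of the same loop, with one addition that is an assumption here: a final entry for the tail between the last cut and the end of the media. The name `computeSegmentLengths` and its parameters are hypothetical.

```typescript
const TICKS_PER_SECOND = 10000000;

// Cuts at the first keyframe at or after each multiple of the desired length.
// Returns segment durations in seconds.
function computeSegmentLengths(
  keyframeTicks: readonly number[],
  totalDurationTicks: number,
  desiredSegmentSeconds: number,
): number[] {
  const desiredTicks = desiredSegmentSeconds * TICKS_PER_SECOND;
  const lengths: number[] = [];
  let lastKeyframe = 0;
  let desiredCutTime = desiredTicks;

  for (const keyframe of keyframeTicks) {
    if (keyframe >= desiredCutTime) {
      lengths.push((keyframe - lastKeyframe) / TICKS_PER_SECOND);
      lastKeyframe = keyframe;
      desiredCutTime += desiredTicks;
    }
  }

  // Remaining media after the last cut forms the final, shorter segment.
  if (totalDurationTicks > lastKeyframe) {
    lengths.push((totalDurationTicks - lastKeyframe) / TICKS_PER_SECOND);
  }
  return lengths;
}
```

With keyframes at 0, 2, 4, 7, 10, and 13 seconds in a 15-second file and a 6-second target, the cuts fall at 7 s and 13 s, giving segments of 7, 6, and 2 seconds.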

4. Seek Optimizations

HLS Remuxing Offset

// Add 0.5s offset for HLS remuxing to hit exact keyframes
var isHlsRemuxing = state.IsVideoRequest && state.TranscodingType is TranscodingJobType.Hls && IsCopyCodec(state.OutputVideoCodec);
var seekTick = isHlsRemuxing ? time + 5000000L : time; // +0.5s

EOF Protection

// Prevent seeking beyond end of file
if (maxTime > 0)
{
    seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0)); // -5s buffer
}
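
Both optimizations can be combined in one TypeScript helper, together with the conversion from ticks to the `HH:MM:SS.mmm` form that `-ss` accepts. This is a sketch: `computeSeekTicks` and `formatSeekArg` are hypothetical names, and the constants mirror the C# excerpts above.

```typescript
const HLS_REMUX_OFFSET_TICKS = 5000000; // +0.5 s
const EOF_GUARD_TICKS = 50000000; // 5 s

// Applies the remux offset, then clamps to [0, runtime - 5 s].
function computeSeekTicks(startTicks: number, runtimeTicks: number, isHlsRemux: boolean): number {
  if (startTicks <= 0) return 0; // no seek requested
  let seek = isHlsRemux ? startTicks + HLS_REMUX_OFFSET_TICKS : startTicks;
  if (runtimeTicks > 0) {
    const upper = Math.max(runtimeTicks - EOF_GUARD_TICKS, 0);
    seek = Math.min(Math.max(seek, 0), upper);
  }
  return seek;
}

// Formats ticks (100 ns units) as HH:MM:SS.mmm.
function formatSeekArg(ticks: number): string {
  const totalMs = Math.floor(ticks / 10000);
  const totalSeconds = Math.floor(totalMs / 1000);
  const pad = (value: number, width: number) => String(value).padStart(width, "0");
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor(totalSeconds / 60) % 60;
  const seconds = totalSeconds % 60;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(totalMs % 1000, 3)}`;
}
```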

Container-Specific Rules

WTV containers: Never use -noaccurate_seek (breaks seeking)
MPEGTS segments: Disable -noaccurate_seek for client compatibility
fMP4 containers: Require -noaccurate_seek for audio sync

5. Keyframe Extraction

Purpose: Generate precise segment boundaries aligned with keyframes

FFprobe Method:

ffprobe -loglevel error -skip_frame nokey -select_streams v:0 -show_entries packet=pts_time,flags -of csv=print_section=0 "input.mp4"
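
The output of that command can be turned into a keyframe list with a small parser. This sketch assumes each output line has the form `pts_time,flags` (for example `2.002000,K_`) and that a `K` in the flags field marks a keyframe packet; the exact flag string can differ between FFmpeg versions, so the check looks only for the `K` character. The name `parseKeyframeSeconds` is hypothetical.

```typescript
// Extracts keyframe timestamps (in seconds) from ffprobe CSV output.
function parseKeyframeSeconds(csv: string): number[] {
  const keyframes: number[] = [];
  for (const line of csv.split(/\r?\n/)) {
    const [ptsTime, flags] = line.trim().split(",");
    // Skip blank or malformed lines and packets that are not keyframes.
    if (!ptsTime || !flags || !flags.includes("K")) continue;
    const seconds = Number(ptsTime);
    if (Number.isFinite(seconds)) keyframes.push(seconds);
  }
  return keyframes;
}
```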

Matroska Method:

Direct cue point extraction from container metadata
Much faster than FFprobe for MKV files
Reads cue tables for instant keyframe positions

6. Real-Time Seeking Scenarios

Progressive Transcoding

// Client seeks to new position
const seekTo = async (positionSeconds: number) => {
  // Kill current transcoding job
  await fetch(`/api/transcode/kill/${playSessionId}`, { method: 'POST' });
  
  // Start new transcoding from seek position
  const startTimeTicks = Math.round(positionSeconds * 10000000); // Convert to whole ticks
  const newUrl = `/api/videos/${itemId}/stream?startTimeTicks=${startTimeTicks}`;
  
  // Update video source
  videoElement.src = newUrl;
};

HLS Seeking

// HLS seeking is handled by the player automatically
// Server generates segments with proper seek points
const hlsPlayer = new Hls();
hlsPlayer.loadSource(`/api/videos/${itemId}/master.m3u8`);
hlsPlayer.attachMedia(videoElement); // hls.js needs a media element to play into

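// Player handles seeking by requesting appropriate segments
// The client never restarts jobs itself; the server may restart FFmpeg
// when a requested segment is far from the current transcoding position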
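
The player can seek on its own because a VOD playlist lists every segment and its duration up front, before the segments exist on disk. A minimal sketch of building such a playlist from a list of segment lengths and mapping a seek position to a segment index; both function names and the URL callback are assumptions for illustration.

```typescript
// Build a VOD media playlist from per-segment durations (in seconds).
function buildVodPlaylist(
  segmentLengths: number[],
  segmentUrl: (index: number) => string
): string {
  // Target duration must be an integer no smaller than any segment length.
  const targetDuration = Math.ceil(Math.max(0, ...segmentLengths));
  const lines = [
    '#EXTM3U',
    '#EXT-X-PLAYLIST-TYPE:VOD',
    '#EXT-X-VERSION:3',
    `#EXT-X-TARGETDURATION:${targetDuration}`,
    '#EXT-X-MEDIA-SEQUENCE:0',
  ];
  segmentLengths.forEach((length, index) => {
    lines.push(`#EXTINF:${length.toFixed(6)},`);
    lines.push(segmentUrl(index));
  });
  lines.push('#EXT-X-ENDLIST');
  return lines.join('\n') + '\n';
}

// Find which segment contains a given playback position.
function segmentIndexForPosition(segmentLengths: number[], positionSeconds: number): number {
  let index = 0;
  let end = 0;
  for (const length of segmentLengths) {
    end += length;
    if (positionSeconds < end) return index;
    index++;
  }
  return Math.max(segmentLengths.length - 1, 0); // past the end: last segment
}
```

When the player seeks, it requests the segment returned by the equivalent of `segmentIndexForPosition`, and the server decides whether the running FFmpeg process will reach that segment soon enough.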

7. External Media Handling

External Subtitles

// Also seek external subtitle streams
var seekSubParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekSubParam))
{
    arg.Append(' ').Append(seekSubParam);
}
arg.Append(" -i file:\"").Append(subtitlePath).Append('"');

External Audio

// Seek external audio streams to match video
var seekAudioParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekAudioParam))
{
    arg.Append(' ').Append(seekAudioParam);
}
arg.Append(" -i \"").Append(state.AudioStream.Path).Append('"');
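
In FFmpeg, `-ss` placed before an `-i` applies only to that input, which is why the seek parameter is repeated for each external stream. A hedged TypeScript sketch of the same idea for a Node.js port; `buildSeekedInputArgs` is an illustrative name, and the arguments are returned as an array so they can be passed to a process spawner without shell quoting.

```typescript
// Repeat the input-side seek before every input so all streams start aligned.
function buildSeekedInputArgs(seekSeconds: number, inputPaths: string[]): string[] {
  const args: string[] = [];
  for (const inputPath of inputPaths) {
    if (seekSeconds > 0) {
      args.push('-ss', seekSeconds.toFixed(3));
    }
    args.push('-i', inputPath);
  }
  return args;
}
```

For a video with one external subtitle file, `buildSeekedInputArgs(90, ['movie.mkv', 'movie.en.srt'])` yields `-ss 90.000 -i movie.mkv -ss 90.000 -i movie.en.srt`.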

8. Implementation Guide for Next.js

Progressive Seeking

class ProgressiveTranscoder {
  async seekTo(positionTicks: number): Promise<string> {
    // Kill existing job
    await this.killCurrentJob();
    
    // Clamp to safe bounds (mediaDuration is assumed to be in ticks)
    const maxTimeTicks = this.mediaDuration;
    const safeSeekTicks = Math.max(0, Math.min(positionTicks, maxTimeTicks - 50000000)); // 5s EOF buffer
    
    // Build FFmpeg command with seek
    const args = [
      '-ss', this.formatTime(safeSeekTicks),
      '-i', this.inputPath,
      '-c:v', 'libx264',
      '-preset', 'veryfast',
      // ... other encoding params
      this.outputPath
    ];
    
    return this.startTranscoding(args);
  }
  
  private formatTime(ticks: number): string {
    const totalSeconds = ticks / 10000000;
    const hours = Math.floor(totalSeconds / 3600);
    const minutes = Math.floor((totalSeconds % 3600) / 60);
    const seconds = totalSeconds % 60;
    return `${hours}:${minutes.toString().padStart(2, '0')}:${seconds.toFixed(6).padStart(9, '0')}`;
  }
}

HLS Seeking

class HLSTranscoder {
  generateSegmentUrl(segmentId: number, runtimeTicks: number, segmentDurationTicks: number): string {
    const params = new URLSearchParams({
      runtimeTicks: runtimeTicks.toString(),
      actualSegmentLengthTicks: segmentDurationTicks.toString()
    });
    
    return `/api/hls/${this.itemId}/${this.playlistId}/${segmentId}.ts?${params}`;
  }
  
  async generateSegment(segmentId: number, runtimeTicks: number): Promise<Buffer> {
    const seekSeconds = runtimeTicks / 10000000;
    
    const args = [
      '-ss', seekSeconds.toFixed(6), // FFmpeg accepts plain seconds; this class defines no formatTime helper
      '-i', this.inputPath,
      '-t', this.segmentDuration.toString(),
      '-c:v', 'libx264',
      '-preset', 'veryfast',
      '-force_key_frames', `expr:gte(t,n_forced*${this.segmentDuration})`,
      '-f', 'mpegts',
      '-'
    ];
    
    return this.executeFFmpeg(args);
  }
}

9. Performance Considerations

Seek Performance Tips:

Use input seeking (-ss before -i) when possible
Cache keyframe data for containers that support it
Implement seek debouncing to prevent rapid job restarts
Use appropriate segment duration (6s recommended for seek performance)
Pre-generate keyframe indexes for frequently accessed content

Client-Side Optimizations:

// Debounce seeking to prevent excessive requests
const debouncedSeek = debounce((position: number) => {
  this.performSeek(position);
}, 300);

// Progressive seeking strategy
if (Math.abs(targetPosition - currentPosition) < 30) {
  // Small seeks: let player buffer naturally
  player.currentTime = targetPosition;
} else {
  // Large seeks: restart transcoding
  this.seekTo(targetPosition);
}
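
The snippet above calls `debounce` without defining it. A minimal sketch of one possible implementation follows; the `TimerApi` parameter is an assumption added so the helper can be exercised without real delays, and a utility library's debounce would work equally well.

```typescript
// Timer functions are injectable; by default the global timers are used.
interface TimerApi {
  set(callback: () => void, delayMs: number): unknown;
  clear(handle: unknown): void;
}

const defaultTimers: TimerApi = {
  set: (callback, delayMs) => setTimeout(callback, delayMs),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Delay calls to fn until waitMs has passed without a newer call.
// Only the most recent arguments are delivered.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number,
  timers: TimerApi = defaultTimers
): (...args: Args) => void {
  let pending: unknown = undefined;
  return (...args: Args) => {
    if (pending !== undefined) timers.clear(pending);
    pending = timers.set(() => {
      pending = undefined;
      fn(...args);
    }, waitMs);
  };
}
```

Dragging a seek bar across many positions then triggers a single transcoding restart for the final position rather than one restart per intermediate value.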

Progress Tracking and Seeking During Transcoding

The Challenge: Unlike direct play where duration and seek positions are straightforward, transcoding creates a "streaming-like" scenario where the real duration is not immediately available and progress tracking becomes complex.

1. Core Progress Tracking Architecture

Key Components:

TranscodingPositionTicks: Where FFmpeg transcoding has currently reached
DownloadPositionTicks: Where the client has consumed content to
CompletionPercentage: Calculated progress based on runtime vs current position
RunTimeTicks: Total media duration from metadata

Progress Calculation Logic:

// From JobLogger.ParseLogLine() - extracts progress from FFmpeg output
var totalMs = state.RunTimeTicks.HasValue
    ? TimeSpan.FromTicks(state.RunTimeTicks.Value).TotalMilliseconds
    : 0;

var currentMs = /* parsed from FFmpeg time output */;

if (totalMs > 0)
{
    percent = 100.0 * currentMs / totalMs;
    transcodingPosition = TimeSpan.FromMilliseconds(currentMs);
}

2. Real-Time Progress Updates

FFmpeg Output Parsing:

// JobLogger monitors FFmpeg stderr output for progress
private void ParseLogLine(string line, EncodingJobInfo state)
{
    // Parse: frame=  123 fps= 25 q=28.0 size=    1024kB time=00:01:23.45 bitrate= 512.0kbits/s
    // Extract: time value for current transcoding position
    // Extract: size value for bytes transcoded
    // Extract: bitrate for current encoding rate
}
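
A Node.js port can extract the same three values with regular expressions. This is a hedged sketch rather than Jellyfin's parser: the function and field names are illustrative, the size unit is assumed to mean 1024 bytes, and fields that are missing or unparseable (such as `time=N/A`) are simply left unset.

```typescript
interface FfmpegProgress {
  positionMs?: number;  // from time=HH:MM:SS.xx
  sizeBytes?: number;   // from size=NNNNkB (or KiB)
  bitrateBps?: number;  // from bitrate=NNN.Nkbits/s
}

function parseProgressLine(line: string): FfmpegProgress {
  const progress: FfmpegProgress = {};
  const time = /time=\s*(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/.exec(line);
  if (time) {
    const seconds = Number(time[1]) * 3600 + Number(time[2]) * 60 + Number(time[3]);
    progress.positionMs = Math.round(seconds * 1000);
  }
  const size = /size=\s*(\d+)\s*(?:kB|KiB)/.exec(line);
  if (size) progress.sizeBytes = Number(size[1]) * 1024;
  const bitrate = /bitrate=\s*([\d.]+)\s*kbits\/s/.exec(line);
  if (bitrate) progress.bitrateBps = Math.round(Number(bitrate[1]) * 1000);
  return progress;
}

// Same formula as the C# snippet: 100 * current / total, when the total is known.
function completionPercentage(positionMs: number, runTimeTicks?: number): number | undefined {
  if (!runTimeTicks || runTimeTicks <= 0) return undefined;
  const totalMs = runTimeTicks / 10000; // 10,000 ticks per millisecond
  return (100 * positionMs) / totalMs;
}
```

FFmpeg writes these status lines to stderr, so the parser would be attached to the child process's stderr stream, one line at a time.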

Progress Reporting Chain:

// 1. FFmpeg outputs progress to stderr
// 2. JobLogger.ParseLogLine() extracts values
// 3. TranscodeManager.ReportTranscodingProgress() updates job state
// 4. SessionManager.ReportTranscodingInfo() updates client session
// 5. TranscodingInfo DTO sent to client via WebSocket/API

interface TranscodingInfo {
  CompletionPercentage?: number;  // 0-100 progress
  Bitrate?: number;              // Current bitrate
  Framerate?: number;            // Current FPS
  Width?: number;                // Video width
  Height?: number;               // Video height
  AudioCodec: string;            // Audio codec in use
  VideoCodec: string;            // Video codec in use
  Container: string;             // Output container
}

3. Client Progress Bar Implementation

Progressive Transcoding:

class ProgressiveTranscodingProgress {
  private transcodingInfo: TranscodingInfo;
  private mediaRunTimeTicks: number;
  
  updateProgressBar(): void {
    if (this.transcodingInfo?.CompletionPercentage) {
      // Use transcoding percentage directly
      const progress = this.transcodingInfo.CompletionPercentage / 100;
      this.progressBar.value = progress;
      
      // Estimate current playable duration
      const availableDuration = this.mediaRunTimeTicks * progress;
      this.updateSeekableRange(0, availableDuration);
    }
  }
  
  handleSeek(targetPositionTicks: number): void {
    const transcodedTicks = this.mediaRunTimeTicks * ((this.transcodingInfo?.CompletionPercentage ?? 0) / 100);
    
    if (targetPositionTicks <= transcodedTicks) {
      // Seek within transcoded content
      this.player.currentTime = targetPositionTicks / 10000000;
    } else {
      // Restart transcoding from seek position
      this.startTranscodingFromPosition(targetPositionTicks);
    }
  }
}

HLS Transcoding:

class HLSTranscodingProgress {
  private segmentDuration: number = 6; // seconds
  private totalSegments: number;
  
  calculateProgress(): ProgressInfo {
    // HLS progress based on segment availability
    const availableSegments = this.getAvailableSegmentCount();
    const progress = availableSegments / this.totalSegments;
    
    return {
      percentage: progress * 100,
      availableDuration: availableSegments * this.segmentDuration,
      seekableEnd: availableSegments * this.segmentDuration
    };
  }
  
  updateSegmentDownloadPosition(): void {
    // Update DownloadPositionTicks when segments are consumed
    const segmentEndTicks = this.currentRuntimeTicks + this.actualSegmentLengthTicks;
    this.transcodingJob.DownloadPositionTicks = Math.max(
      this.transcodingJob.DownloadPositionTicks ?? segmentEndTicks,
      segmentEndTicks
    );
  }
}

4. Solving the "Real Duration is Not Real" Problem

Duration Estimation Strategies:

1. Metadata-Based Duration:

// Use media file metadata as baseline
const estimatedDuration = mediaSource.RunTimeTicks;
if (estimatedDuration) {
  this.totalDuration = estimatedDuration;
  this.progressBar.max = estimatedDuration;
}

2. Progressive Duration Discovery:

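// Update duration as transcoding progresses.
// Caveat: CompletionPercentage is itself computed from RunTimeTicks, so this back-calculation
// only adds information when the percentage comes from an independent source.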
if (transcodingInfo.CompletionPercentage > 0) {
  const currentTranscodedTicks = /* current position from transcoding */;
  const estimatedTotal = currentTranscodedTicks / (transcodingInfo.CompletionPercentage / 100);
  
  // Only update if estimate seems reliable (>10% transcoded)
  if (transcodingInfo.CompletionPercentage > 10) {
    this.totalDuration = estimatedTotal;
  }
}

3. HLS Segment-Based Calculation:

// For HLS, calculate from segment structure
const calculateHLSDuration = (segments: SegmentInfo[]): number => {
  return segments.reduce((total, segment) => {
    return total + segment.actualSegmentLengthTicks;
  }, 0);
};

5. Advanced Progress Management

Buffering and Availability:

class TranscodingBuffer {
  private bufferAheadSeconds: number = 30;
  
  isPositionAvailable(targetPositionTicks: number): boolean {
    const transcodedTicks = this.getTranscodedPositionTicks();
    return targetPositionTicks <= transcodedTicks;
  }
  
  calculateSeekableRange(): { start: number; end: number } {
    return {
      start: 0,
      end: Math.max(0, this.getTranscodedPositionTicks() - (this.bufferAheadSeconds * 10000000))
    };
  }
  
  shouldThrottleTranscoding(): boolean {
    const gap = this.transcodingPositionTicks - this.downloadPositionTicks;
    const targetGap = this.bufferAheadSeconds * 10000000; // 30s in ticks
    return gap > targetGap;
  }
}
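
`shouldThrottleTranscoding` only answers whether the encoder is too far ahead; a throttler also needs to remember whether FFmpeg is currently paused so it sends each command once. A minimal sketch of that decision as a pure function; the names are illustrative, and the threshold is a parameter because this class uses 30s while the throttling notes elsewhere in this document use 60s.

```typescript
type ThrottleAction = 'pause' | 'resume' | 'none';

// Decide what to tell FFmpeg, given the current pause state and both positions in ticks.
function nextThrottleAction(
  isPaused: boolean,
  transcodingPositionTicks: number,
  downloadPositionTicks: number,
  thresholdSeconds: number
): ThrottleAction {
  const gapTicks = transcodingPositionTicks - downloadPositionTicks;
  const thresholdTicks = thresholdSeconds * 10000000;
  const tooFarAhead = gapTicks > thresholdTicks;
  if (tooFarAhead && !isPaused) return 'pause';
  if (!tooFarAhead && isPaused) return 'resume';
  return 'none'; // already in the right state
}
```

Calling this on a timer and acting only on `'pause'` and `'resume'` avoids writing redundant commands to the FFmpeg process.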

Smooth Progress Updates:

class SmoothProgressUpdater {
  private interpolationInterval: number = 1000; // 1 second
  private lastKnownPosition: number = 0;
  private lastUpdateTime: number = Date.now();
  
  interpolateProgress(): number {
    if (!this.isPlaying) return this.lastKnownPosition;
    
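    const now = Date.now();
    // lastKnownPosition is in ticks, so convert elapsed ms to ticks (10,000 ticks per ms)
    const elapsedTicks = (now - this.lastUpdateTime) * 10000;
    const estimatedProgress = this.lastKnownPosition + elapsedTicks;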
    
    // Don't exceed known transcoded position
    const maxAvailable = this.getTranscodedPositionTicks();
    return Math.min(estimatedProgress, maxAvailable);
  }
}

6. Error Handling and Edge Cases

Transcoding Failures:

class TranscodingErrorHandler {
  handleTranscodingError(error: TranscodingError): void {
    switch (error.type) {
      case 'SEEK_BEYOND_DURATION':
        // Clamp seek to valid range
        this.seekTo(Math.min(this.targetPosition, this.maxAvailablePosition));
        break;
        
      case 'TRANSCODING_STALLED':
        // Restart transcoding
        this.restartTranscodingFromLastKnownPosition();
        break;
        
      case 'INVALID_DURATION':
        // Fall back to live estimation
        this.enableLiveDurationEstimation();
        break;
    }
  }
}

Network Issues:

class NetworkResilienceHandler {
  private retryPolicy = {
    maxRetries: 3,
    backoffMs: [1000, 2000, 4000]
  };
  
  async handleProgressUpdateFailure(attempt: number): Promise<void> {
    if (attempt < this.retryPolicy.maxRetries) {
      await this.delay(this.retryPolicy.backoffMs[attempt]);
      return this.fetchProgressUpdate();
    } else {
      // Switch to local time-based estimation
      this.enableLocalProgressEstimation();
    }
  }
}

7. Next.js Implementation Guide

Complete Progress Management:

class NextJSTranscodingProgressManager {
  private wsConnection: WebSocket;
  private progressUpdateInterval: NodeJS.Timeout;
  
  constructor(private videoElement: HTMLVideoElement) {
    this.setupWebSocketUpdates();
    this.setupProgressInterpolation();
  }
  
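  private setupWebSocketUpdates(): void {
    // Open the socket before attaching handlers; '/socket' is a placeholder endpoint
    const scheme = location.protocol === 'https:' ? 'wss' : 'ws';
    this.wsConnection = new WebSocket(`${scheme}://${location.host}/socket`);
    this.wsConnection.onmessage = (event) => {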
      const message = JSON.parse(event.data);
      
      if (message.MessageType === 'TranscodingInfo') {
        this.updateTranscodingInfo(message.Data);
      }
    };
  }
  
  private updateTranscodingInfo(info: TranscodingInfo): void {
    // Update progress bar
    if (info.CompletionPercentage) {
      this.progressBar.value = info.CompletionPercentage;
    }
    
    // Update seekable range
    const seekableEnd = this.calculateSeekableEnd(info);
    this.videoElement.setAttribute('data-seekable-end', seekableEnd.toString());
    
    // Update duration if we have better estimate
    this.updateDurationEstimate(info);
  }
  
  async handleSeek(targetSeconds: number): Promise<void> {
    const targetTicks = targetSeconds * 10000000;
    const transcodedTicks = this.getTranscodedPositionTicks();
    
    if (targetTicks <= transcodedTicks) {
      // Seek within available content
      this.videoElement.currentTime = targetSeconds;
    } else {
      // Show loading state
      this.showBufferingState();
      
      // Request new transcoding position
      await this.requestTranscodingFromPosition(targetTicks);
      
      // Update video source
      this.videoElement.src = this.generateStreamUrl(targetTicks);
    }
  }
}

Important Configuration

Environment Variables

FFMPEG_PATH=/usr/bin/ffmpeg
TRANSCODING_TEMP_PATH=/tmp/jellyfin/transcoding
MAX_CONCURRENT_STREAMS=3
SEGMENT_KEEP_SECONDS=300
THROTTLE_DELAY_SECONDS=60
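
These variables arrive as strings, so a server would typically parse and validate them once at startup. A hedged sketch of such a loader; the `TranscodingConfig` shape and function names are assumptions, and the defaults simply mirror the values listed above.

```typescript
interface TranscodingConfig {
  ffmpegPath: string;
  tempPath: string;
  maxConcurrentStreams: number;
  segmentKeepSeconds: number;
  throttleDelaySeconds: number;
}

// Parse a positive integer setting, falling back when it is unset or blank.
function readPositiveInt(value: string | undefined, fallback: number, key: string): number {
  if (value === undefined || value.trim() === '') return fallback;
  const parsed = Number(value);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${key} must be a positive integer, got "${value}"`);
  }
  return parsed;
}

function loadTranscodingConfig(env: Record<string, string | undefined>): TranscodingConfig {
  return {
    ffmpegPath: env['FFMPEG_PATH'] ?? '/usr/bin/ffmpeg',
    tempPath: env['TRANSCODING_TEMP_PATH'] ?? '/tmp/jellyfin/transcoding',
    maxConcurrentStreams: readPositiveInt(env['MAX_CONCURRENT_STREAMS'], 3, 'MAX_CONCURRENT_STREAMS'),
    segmentKeepSeconds: readPositiveInt(env['SEGMENT_KEEP_SECONDS'], 300, 'SEGMENT_KEEP_SECONDS'),
    throttleDelaySeconds: readPositiveInt(env['THROTTLE_DELAY_SECONDS'], 60, 'THROTTLE_DELAY_SECONDS'),
  };
}
```

In a Node.js process this would be called as `loadTranscodingConfig(process.env)`; failing fast on a malformed value is a design choice here, not something the source prescribes.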

Performance Tuning

Thread count: Auto-detect based on CPU cores
Buffer sizes: Adjust based on available memory
Segment duration: 6 seconds for good seek performance
Concurrent streams: Limit based on system resources

Security Considerations

Input validation: Sanitize all file paths and parameters
Resource limits: Prevent denial of service (DoS) through excessive transcoding
Access control: Validate session ownership
File cleanup: Remove orphaned files regularly
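
As an illustration of the input validation point, the segment endpoint's parameters can be checked against strict patterns before they reach any file path or FFmpeg argument. A minimal sketch, assuming the URL shape from the HLS section; the function names and the container allowlist are hypothetical.

```typescript
interface SegmentRequest {
  segmentId: number;
  container: string;
  runtimeTicks: number;
  actualSegmentLengthTicks: number;
}

const ALLOWED_CONTAINERS = ['ts', 'mp4']; // hypothetical allowlist

// Accept only plain decimal digits, short enough to stay a safe integer.
function parseNonNegativeInt(value: string | null | undefined): number | undefined {
  if (value == null || !/^\d{1,15}$/.test(value)) return undefined;
  return Number(value);
}

// segmentFile is the last path component, e.g. "12.ts".
function validateSegmentRequest(segmentFile: string, query: URLSearchParams): SegmentRequest | undefined {
  const match = /^(\d{1,9})\.([a-z0-9]{1,8})$/.exec(segmentFile);
  if (!match) return undefined; // rejects slashes, "..", and anything non-numeric
  const container = match[2] ?? '';
  if (ALLOWED_CONTAINERS.indexOf(container) === -1) return undefined;
  const runtimeTicks = parseNonNegativeInt(query.get('runtimeTicks'));
  const actualSegmentLengthTicks = parseNonNegativeInt(query.get('actualSegmentLengthTicks'));
  if (runtimeTicks === undefined || actualSegmentLengthTicks === undefined) return undefined;
  return { segmentId: Number(match[1]), container, runtimeTicks, actualSegmentLengthTicks };
}
```

A route handler would return a 400 response whenever this returns `undefined`, and session ownership would still need to be checked separately.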

Monitoring and Logging

Key Metrics to Track

Active transcoding jobs count
Resource usage (CPU, memory, disk I/O)
Average transcoding speed vs playback speed
Client ping frequency and timeouts
Segment cleanup efficiency

Error Handling

FFmpeg process failures
Disk space exhaustion
Network timeouts
Invalid media files
Hardware acceleration failures

This architecture provides a robust foundation for building a media transcoding system with proper resource management, client lifecycle handling, and performance optimization.
