# Jellyfin Transcoding Architecture Documentation

## Executive Summary

Jellyfin's transcoding system is a sophisticated media processing pipeline built around **FFmpeg** that provides real-time video and audio conversion for various client devices. This document details the core architecture, control flow, and implementation patterns valuable for building a similar system in Next.js.

**Key Findings:**

- **Engine**: FFmpeg managed through the MediaEncoder class with hardware acceleration support
- **Job Management**: TranscodeManager orchestrates process lifecycle, throttling, and cleanup
- **Session Control**: Client ping system (10s/60s timeouts) with automatic kill timers
- **Resource Management**: Intelligent throttling based on playback position (+60s threshold)
- **Segment Management**: Automatic cleanup for HLS content >5 minutes in duration
- **Seeking Strategy**: FFmpeg process restart required for large seeks due to the linear nature of encoding

## Key Implementation Insights

### 1. **FFmpeg Process Management Strategy**

**Core Principle**: Each transcoding job is a **linear, stateful FFmpeg process** that cannot seek to arbitrary positions after encoding starts. This fundamental limitation drives the entire architecture design.

**Implications**:

- **Large seeks require a process restart**: When users seek >30 seconds, kill the current job and start a new one
- **Small seeks use player buffering**: Let HLS players handle <30 second seeks naturally
- **Process isolation**: Each session gets an independent FFmpeg process for resource control
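The restart-vs-buffer decision above reduces to a small pure function. A minimal sketch, assuming a 30-second cutoff as described (the `SeekPlan` type and function name are illustrative, not from Jellyfin):

```typescript
type SeekPlan = 'restart-process' | 'player-buffer';

// Decide how to service a seek, given the distance in seconds between the
// current playback position and the seek target. The 30s cutoff mirrors
// the heuristic described above.
function planSeek(seekDistanceSeconds: number, cutoffSeconds = 30): SeekPlan {
  return Math.abs(seekDistanceSeconds) > cutoffSeconds
    ? 'restart-process'  // kill FFmpeg and restart with -ss at the target
    : 'player-buffer';   // let the HLS player request nearby segments
}
```

Keeping this as a pure function makes the cutoff easy to tune and test independently of the player and process-management code.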
### 2. **Resource Management Philosophy**

**Throttling Strategy**: Prevent transcoding from running too far ahead of playback

- **Threshold**: Pause encoding when >60 seconds ahead of client consumption
- **Control Method**: Send FFmpeg stdin commands ('p' for pause, 'u' for resume)
- **Benefits**: Reduces CPU/disk usage, prevents unnecessary work

**Segment Cleanup Policy**: Automatic disk space management for HLS

- **Trigger Conditions**: Video content with duration >5 minutes
- **Retention**: Configurable number of segments to keep
- **Safety**: Retry logic with graceful error handling

### 3. **Session Lifecycle Management**

**Ping System Design**: Keep-alive mechanism prevents orphaned processes

- **Progressive streams**: 10-second timeout (fast response needed)
- **HLS streams**: 60-second timeout (more buffering tolerance)
- **Client responsibility**: Ping every 30-45 seconds during active playback

**Kill Timer Implementation**: Automatic cleanup for abandoned sessions

- **Grace periods**: Multiple timeout intervals before termination
- **Resource cleanup**: Process termination + file cleanup + stream closure

### 4. **FFmpeg Command Architecture**

**Standard HLS Command Structure**:

```bash
ffmpeg [input_modifiers] -i input.mp4 [encoding_params] \
  -f hls -hls_time 6 -hls_segment_filename "segment%d.ts" output.m3u8
```

**Essential Parameters**:

- **`-f hls`**: HLS output format
- **`-hls_time 6`**: 6-second segments for optimal seek performance
- **`-hls_segment_filename`**: Consistent naming pattern
- **`output.m3u8`**: Playlist file output

## Critical Design Decisions

### 1. **When to Restart vs Continue Transcoding**

**Process Restart Required**:

- Large seek operations (>30 seconds)
- Quality/resolution changes
- Audio track switching
- Subtitle track changes

**Continue Existing Process**:

- Small seeks (<30 seconds): let the player handle them
- Pause/resume operations
- Client reconnections within the timeout window
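The stdin-based throttle control described earlier can be sketched as a pure "far enough ahead?" check plus a thin writer over the FFmpeg process. The `'p'`/`'u'` keystrokes follow the behaviour described above; the function names are illustrative:

```typescript
import { ChildProcess } from 'child_process';

// Pure check: is the encoder far enough ahead of playback to pause it?
// Mirrors the >60s threshold described above.
function shouldThrottle(
  transcodedSeconds: number,
  playbackSeconds: number,
  thresholdSeconds = 60
): boolean {
  return transcodedSeconds - playbackSeconds > thresholdSeconds;
}

// Send the pause/resume keystroke to FFmpeg's stdin, as described above.
// Assumes the process was spawned with stdin piped.
function setEncoderPaused(ffmpeg: ChildProcess, paused: boolean): void {
  ffmpeg.stdin?.write(paused ? 'p' : 'u');
}
```

A periodic timer comparing the encoder's progress against the client's consumed position can call `shouldThrottle` and toggle the process accordingly.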
### 2. **Segment Duration Strategy**

**6-Second Standard**: Optimal balance between:

- **Seek performance**: Reasonable granularity for user seeking
- **Network efficiency**: Not too many small requests
- **Startup time**: Quick initial buffering
- **Storage overhead**: Manageable file count

### 3. **Hardware Acceleration Integration**

**MediaEncoder Responsibilities**:

- **Capability detection**: Probe available hardware encoders
- **Process execution**: Manage FFmpeg with hardware flags
- **Fallback handling**: Graceful degradation to software encoding
- **Performance monitoring**: Track encoding speeds and success rates

## Architecture Patterns for Next.js Implementation

### 1. **Core Classes Structure**

```typescript
// Main orchestrator (equivalent to TranscodeManager)
class TranscodingOrchestrator {
  private activeJobs = new Map<string, TranscodingSession>();
  private killTimers = new Map<string, NodeJS.Timeout>();

  async startTranscoding(request: TranscodingRequest): Promise<TranscodingSession>
  pingSession(sessionId: string, isUserPaused?: boolean): void
  async killSession(sessionId: string): Promise<void>
}

// FFmpeg interface (equivalent to MediaEncoder)
class MediaProcessor {
  private ffmpegPath: string;
  private hardwareCapabilities: HardwareCapabilities;

  async probe(inputPath: string): Promise<MediaInfo>
  async startProcess(args: string[]): Promise<ChildProcess>
  detectHardwareSupport(): HardwareCapabilities
}

// Individual job tracking (equivalent to TranscodingJob)
class TranscodingSession {
  id: string;
  process: ChildProcess;
  lastPingDate: Date;
  throttler?: TranscodingThrottler;
  segmentCleaner?: SegmentCleaner;

  startKillTimer(callback: () => void): void
  stopKillTimer(): void
  pause(): Promise<void>
  resume(): Promise<void>
}
```
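The MediaProcessor's capability detection can be backed by parsing `ffmpeg -encoders` output. A sketch, assuming the process has already been run and its stdout captured; the family-to-encoder-name mapping reflects common FFmpeg builds and is an assumption, not Jellyfin's actual table:

```typescript
// Assumed mapping of hardware families to FFmpeg encoder names.
const HW_ENCODERS: Record<string, string[]> = {
  nvidia: ['h264_nvenc', 'hevc_nvenc'],
  intel: ['h264_qsv', 'hevc_qsv'],
  amd: ['h264_amf', 'hevc_amf'],
  vaapi: ['h264_vaapi', 'hevc_vaapi'],
};

// Given the text output of `ffmpeg -encoders`, report which hardware
// encoder families appear to be compiled in.
function detectHardwareFamilies(encodersOutput: string): string[] {
  return Object.entries(HW_ENCODERS)
    .filter(([, names]) => names.some((n) => encodersOutput.includes(n)))
    .map(([family]) => family);
}
```

Note that an encoder being listed only proves it was compiled in; a short trial encode is still needed to confirm the device actually works, which is why a software fallback path remains essential.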
### 2. **API Endpoint Design**

```typescript
// Next.js API routes structure

// POST /api/transcoding/start
export async function POST(request: Request) {
  const { itemId, startTimeTicks, quality } = await request.json();

  const job = await orchestrator.startTranscoding({
    itemId,
    startTimeTicks,
    quality,
    sessionId: generateSessionId()
  });

  return Response.json({
    sessionId: job.id,
    playlistUrl: job.playlistPath
  });
}

// POST /api/transcoding/[sessionId]/ping
export async function POST(request: Request, { params }: { params: { sessionId: string } }) {
  const { isUserPaused } = await request.json();
  orchestrator.pingSession(params.sessionId, isUserPaused);
  return Response.json({ success: true });
}

// GET /api/hls/[sessionId]/[segmentId].ts
export async function GET(request: Request, { params }: { params: { sessionId: string, segmentId: string } }) {
  const segmentPath = getSegmentPath(params.sessionId, params.segmentId);
  const stream = fs.createReadStream(segmentPath);
  return new Response(stream as any);
}

// DELETE /api/transcoding/[sessionId]
export async function DELETE(request: Request, { params }: { params: { sessionId: string } }) {
  await orchestrator.killSession(params.sessionId);
  return Response.json({ success: true });
}
```
### 3. **Client Integration Patterns**

```typescript
class AdaptiveMediaPlayer {
  private sessionId: string;
  private itemId: string;
  private hls: Hls;
  private videoElement: HTMLVideoElement;
  private pingInterval: NodeJS.Timeout;
  private lastPosition: number = 0;

  async startPlayback(itemId: string, startPosition: number = 0) {
    this.itemId = itemId;

    // Start transcoding session
    const response = await fetch('/api/transcoding/start', {
      method: 'POST',
      body: JSON.stringify({
        itemId,
        startTimeTicks: startPosition * 10000000
      })
    });

    const { sessionId, playlistUrl } = await response.json();
    this.sessionId = sessionId;
    this.startPinging();

    // Load HLS playlist
    this.hls.loadSource(playlistUrl);
  }

  private startPinging() {
    this.pingInterval = setInterval(() => {
      fetch(`/api/transcoding/${this.sessionId}/ping`, {
        method: 'POST',
        body: JSON.stringify({ isUserPaused: this.videoElement.paused })
      });
    }, 30000); // Ping every 30 seconds
  }

  async seekTo(targetPosition: number) {
    const seekDistance = Math.abs(targetPosition - this.lastPosition);

    if (seekDistance > 30) {
      // Large seek: restart transcoding
      await this.stopPlayback();
      await this.startPlayback(this.itemId, targetPosition);
    } else {
      // Small seek: let the HLS player handle it
      this.videoElement.currentTime = targetPosition;
    }

    this.lastPosition = targetPosition;
  }

  async stopPlayback() {
    if (this.pingInterval) {
      clearInterval(this.pingInterval);
    }
    if (this.sessionId) {
      await fetch(`/api/transcoding/${this.sessionId}`, { method: 'DELETE' });
    }
  }
}
```
### 4. **Configuration Management**

```typescript
// Environment configuration
interface TranscodingConfig {
  ffmpegPath: string;
  transcodingTempPath: string;
  maxConcurrentStreams: number;
  segmentDuration: number;          // 6 seconds recommended
  segmentKeepCount: number;
  throttleThresholdSeconds: number; // 60 seconds recommended
  pingTimeoutProgressive: number;   // 10 seconds
  pingTimeoutHls: number;           // 60 seconds
  hardwareAcceleration: 'auto' | 'nvidia' | 'intel' | 'amd' | 'none';
}

// Load from environment
const config: TranscodingConfig = {
  ffmpegPath: process.env.FFMPEG_PATH || '/usr/bin/ffmpeg',
  transcodingTempPath: process.env.TRANSCODING_TEMP_PATH || '/tmp/transcoding',
  maxConcurrentStreams: parseInt(process.env.MAX_CONCURRENT_STREAMS || '3'),
  segmentDuration: 6,
  segmentKeepCount: parseInt(process.env.SEGMENT_KEEP_COUNT || '10'),
  throttleThresholdSeconds: 60,
  pingTimeoutProgressive: 10000,
  pingTimeoutHls: 60000,
  hardwareAcceleration: (process.env.HARDWARE_ACCEL as any) || 'auto'
};
```

## Production Deployment Considerations

### 1. **Performance Monitoring**

```typescript
interface TranscodingMetrics {
  activeJobs: number;
  averageStartupTime: number;
  successRate: number;
  cpuUsage: number;
  memoryUsage: number;
  diskIORate: number;
  clientTimeouts: number;
}

class MetricsCollector {
  collectMetrics(): TranscodingMetrics {
    return {
      activeJobs: this.orchestrator.getActiveJobCount(),
      averageStartupTime: this.calculateAverageStartup(),
      successRate: this.calculateSuccessRate(),
      cpuUsage: process.cpuUsage().user / 1000000, // Convert to seconds
      memoryUsage: process.memoryUsage().heapUsed / 1024 / 1024, // MB
      diskIORate: this.calculateDiskIO(),
      clientTimeouts: this.timeoutCounter
    };
  }
}
```
### 2. **Error Handling Strategy**

```typescript
class TranscodingErrorHandler {
  async handleFFmpegFailure(job: TranscodingSession, error: Error) {
    // Log error with context
    logger.error('FFmpeg process failed', {
      jobId: job.id,
      inputPath: job.inputPath,
      error: error.message,
      exitCode: job.process.exitCode
    });

    // Attempt recovery strategies
    if (this.isRetryableError(error)) {
      return this.retryWithFallback(job);
    }

    // Clean up resources
    await this.cleanupFailedJob(job);

    // Notify client
    this.notifyClientError(job.sessionId, error);
  }

  private async retryWithFallback(job: TranscodingSession): Promise<boolean> {
    // Try software encoding if hardware failed
    if (job.hardwareAcceleration && this.isHardwareError(job.lastError)) {
      logger.info('Retrying with software encoding', { jobId: job.id });
      return this.restartWithSoftwareEncoding(job);
    }

    // Try lower quality settings
    if (job.retryCount < 2) {
      logger.info('Retrying with reduced quality', { jobId: job.id });
      return this.restartWithReducedQuality(job);
    }

    return false;
  }
}
```
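The handler above calls `isRetryableError` and `isHardwareError` without showing them. One plausible implementation classifies errors by message substrings; the patterns below are assumptions (real FFmpeg and driver error messages vary by platform and version), so treat them as a starting point:

```typescript
// Hypothetical classifiers backing the error handler above.
// The message substrings are assumptions, not an exhaustive list.
const HARDWARE_ERROR_PATTERNS = ['cuda', 'nvenc', 'qsv', 'vaapi', 'amf', 'device'];
const RETRYABLE_PATTERNS = [
  ...HARDWARE_ERROR_PATTERNS,
  'resource temporarily unavailable',
  'timeout',
];

function isHardwareError(error: Error): boolean {
  const msg = error.message.toLowerCase();
  return HARDWARE_ERROR_PATTERNS.some((p) => msg.includes(p));
}

function isRetryableError(error: Error): boolean {
  const msg = error.message.toLowerCase();
  return RETRYABLE_PATTERNS.some((p) => msg.includes(p));
}
```

A more robust approach is to also consider the FFmpeg exit code and which stage (device init vs encoding) produced the failure, rather than message text alone.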
### 3. **Security Considerations**

```typescript
class SecurityValidator {
  validateTranscodingRequest(request: TranscodingRequest): ValidationResult {
    // Path traversal protection
    if (this.containsPathTraversal(request.inputPath)) {
      return { valid: false, error: 'Invalid input path' };
    }

    // File type validation
    if (!this.isAllowedMediaType(request.inputPath)) {
      return { valid: false, error: 'Unsupported media type' };
    }

    // Resource limits
    if (this.exceedsResourceLimits(request)) {
      return { valid: false, error: 'Resource limits exceeded' };
    }

    // Rate limiting per client
    if (this.isRateLimited(request.clientId)) {
      return { valid: false, error: 'Rate limit exceeded' };
    }

    return { valid: true };
  }

  private containsPathTraversal(path: string): boolean {
    return path.includes('..') || path.includes('~') || path.startsWith('/');
  }

  private isAllowedMediaType(path: string): boolean {
    const allowedExtensions = ['.mp4', '.mkv', '.avi', '.mov', '.m4v'];
    return allowedExtensions.some(ext => path.toLowerCase().endsWith(ext));
  }
}
```
### 4. **Scaling Strategies**

```typescript
// Horizontal scaling with Redis coordination
class DistributedTranscodingOrchestrator {
  private redis: Redis;
  private nodeId: string;

  async startTranscoding(request: TranscodingRequest): Promise<TranscodingSession> {
    // Check global capacity
    const globalLoad = await this.getGlobalLoad();
    if (globalLoad > 0.8) {
      throw new Error('System at capacity');
    }

    // Assign to least loaded node
    const targetNode = await this.findLeastLoadedNode();

    if (targetNode === this.nodeId) {
      return this.startLocalTranscoding(request);
    } else {
      return this.delegateToNode(targetNode, request);
    }
  }

  private async getGlobalLoad(): Promise<number> {
    const nodes = await this.redis.smembers('transcoding:nodes');
    let totalJobs = 0;
    let totalCapacity = 0;

    for (const node of nodes) {
      const nodeLoad = await this.redis.hgetall(`transcoding:node:${node}`);
      totalJobs += parseInt(nodeLoad.activeJobs || '0');
      totalCapacity += parseInt(nodeLoad.maxCapacity || '0');
    }

    return totalCapacity > 0 ? totalJobs / totalCapacity : 1;
  }
}
```

## Summary of Key Learnings

### **Architecture Principles**

1. **Process Isolation**: Each transcoding session gets an independent FFmpeg process
2. **Linear Encoding**: FFmpeg cannot seek backwards, requiring a process restart for large seeks
3. **Resource Management**: Proactive throttling and cleanup prevent resource exhaustion
4. **Client Lifecycle**: Ping-based session management with automatic cleanup

### **Performance Optimizations**

1. **Segment Strategy**: 6-second HLS segments balance seek performance with efficiency
2. **Throttling Logic**: Pause encoding when >60 seconds ahead of playback
3. **Cleanup Policies**: Automatic segment removal for content >5 minutes
4. **Hardware Acceleration**: Graceful fallback from hardware to software encoding

### **Implementation Strategy**

1. **API Design**: RESTful endpoints for session management and HLS segment delivery
2. **Client Integration**: Adaptive seeking strategy based on seek distance
3. **Error Handling**: Comprehensive retry logic with fallback options
4. **Security**: Input validation, path traversal protection, rate limiting

### **Deployment Considerations**

1. **Monitoring**: Track active jobs, success rates, resource usage
2. **Scaling**: Horizontal scaling with Redis coordination
3. **Storage**: Temporary file management and cleanup
4. **Network**: CDN integration for segment delivery

This architecture provides a robust foundation for building production-grade media transcoding systems with proper resource management, client lifecycle handling, and performance optimization.

## Core Architecture

### 1. Main Components

#### TranscodeManager (`TranscodeManager.cs`)

- **Role**: Central orchestrator for all transcoding operations
- **Key Responsibilities**:
  - FFmpeg process lifecycle management
  - Job tracking and session management
  - Resource cleanup and monitoring
  - Client ping/keepalive handling

#### MediaEncoder (`MediaEncoder.cs`)

- **Role**: FFmpeg binary interface and process execution
- **Key Responsibilities**:
  - FFmpeg path validation and capability detection
  - Process creation and monitoring
  - Hardware acceleration detection
  - Encoder/decoder capability enumeration

#### TranscodingJob (`TranscodingJob.cs`)

- **Role**: Individual transcoding session state management
- **Key Responsibilities**:
  - Process lifecycle tracking
  - Resource usage monitoring (bytes, position, bitrate)
  - Timer management for auto-cleanup
  - Client connection state

### 2. Control Flow

```mermaid
graph TB
    Client[Client Request] --> API[API Controller]
    API --> StreamState[Create StreamState]
    StreamState --> TranscodeManager[TranscodeManager.StartFfMpeg]
    TranscodeManager --> FFmpeg[Launch FFmpeg Process]
    FFmpeg --> Job[Create TranscodingJob]
    Job --> Monitor[Job Monitoring]
    Monitor --> Throttle[Throttling System]
    Monitor --> Cleanup[Segment Cleanup]
    Monitor --> Ping[Ping System]
    Ping --> Timeout[Kill Timer]
```

#### Detailed Flow:
1. **Request Reception**: API controllers (`DynamicHlsController`, `VideosController`, `AudioController`) receive transcoding requests
2. **State Creation**: `StreamingHelpers.GetStreamingState()` analyzes media and creates `StreamState`
3. **Job Initialization**: `TranscodeManager.StartFfMpeg()` creates and starts the FFmpeg process
4. **Process Monitoring**: `TranscodingJob` tracks process state and resource usage
5. **Client Management**: Ping system keeps sessions alive, kill timers handle disconnections
6. **Resource Management**: Throttling and segment cleaning optimize performance
7. **Cleanup**: Automatic cleanup when sessions end or timeout

## Key Systems

### 1. Ping System - Keep Transcoding Alive

**Purpose**: Prevents transcoding jobs from being killed when clients are actively consuming content.

**Implementation**:

```csharp
// PingTranscodingJob method in TranscodeManager
public void PingTranscodingJob(string playSessionId, bool? isUserPaused)
{
    var jobs = _activeTranscodingJobs.Where(j =>
            string.Equals(playSessionId, j.PlaySessionId, StringComparison.OrdinalIgnoreCase))
        .ToList();

    foreach (var job in jobs)
    {
        if (isUserPaused.HasValue)
        {
            job.IsUserPaused = isUserPaused.Value;
        }

        PingTimer(job, true);
    }
}

private void PingTimer(TranscodingJob job, bool isProgressCheckIn)
{
    if (job.HasExited)
    {
        job.StopKillTimer();
        return;
    }

    // Different timeouts for different job types
    var timerDuration = job.Type != TranscodingJobType.Progressive
        ? 60000
        : 10000;

    job.PingTimeout = timerDuration;
    job.LastPingDate = DateTime.UtcNow;
    job.StartKillTimer(OnTranscodeKillTimerStopped);
}
```

**Key Parameters**:

- **Progressive streams**: 10 second timeout (10000ms)
- **HLS streams**: 60 second timeout (60000ms)
- **Ping frequency**: Clients should ping every 30-45 seconds
- **Auto-ping**: Triggered on playback progress events

### 2. Kill Timers - Automatic Cleanup

**Purpose**: Automatically terminate abandoned transcoding jobs to free resources.
**Implementation**:

```csharp
private async void OnTranscodeKillTimerStopped(object? state)
{
    var job = state as TranscodingJob;

    if (!job.HasExited && job.Type != TranscodingJobType.Progressive)
    {
        var timeSinceLastPing = (DateTime.UtcNow - job.LastPingDate).TotalMilliseconds;

        if (timeSinceLastPing < job.PingTimeout)
        {
            // Reset timer if ping is still fresh
            job.StartKillTimer(OnTranscodeKillTimerStopped, job.PingTimeout);
            return;
        }
    }

    // Kill the job
    await KillTranscodingJob(job, true, path => true).ConfigureAwait(false);
}
```

**Configuration**:

- **Grace period**: Jobs get multiple timeout intervals before termination
- **Progressive vs HLS**: Different timeout strategies
- **Resource cleanup**: Process termination, file cleanup, live stream closure

### 3. Throttling - Resource Management

**Purpose**: Controls transcoding speed to prevent resource waste and improve efficiency.

**Conditions for Throttling**:

```csharp
private static bool EnableThrottling(StreamState state)
    => state.InputProtocol == MediaProtocol.File
        && state.RunTimeTicks.HasValue
        && state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks
        && state.IsInputVideo
        && state.VideoType == VideoType.VideoFile;
```

**Throttling Logic**:

```csharp
private bool IsThrottleAllowed(TranscodingJob job, int thresholdSeconds)
{
    var bytesDownloaded = job.BytesDownloaded;
    var transcodingPositionTicks = job.TranscodingPositionTicks ?? 0;
    var downloadPositionTicks = job.DownloadPositionTicks ??
        0;
    var gapLengthInTicks = TimeSpan.FromSeconds(thresholdSeconds).Ticks;

    if (downloadPositionTicks > 0 && transcodingPositionTicks > 0)
    {
        // HLS - time-based consideration
        var gap = transcodingPositionTicks - downloadPositionTicks;
        return gap > gapLengthInTicks;
    }

    // Progressive - byte-based consideration
    // Calculate if transcoding is ahead enough to throttle
}
```

**Control Mechanism**:

- **Pause command**: Send 'p' (or 'c' for older FFmpeg) to stdin
- **Resume command**: Send 'u' (or newline for older FFmpeg) to stdin
- **Threshold**: Minimum 60 seconds ahead before throttling kicks in

### 4. Segment Cleaning - HLS Management

**Purpose**: Removes old HLS segments to prevent disk space exhaustion.

**Conditions**:

```csharp
private static bool EnableSegmentCleaning(StreamState state)
    => state.InputProtocol is MediaProtocol.File or MediaProtocol.Http
        && state.IsInputVideo
        && state.TranscodingType == TranscodingJobType.Hls
        && state.RunTimeTicks.HasValue
        && state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks;
```

**Implementation**:

- **Segment retention**: Keeps last N segments (configurable)
- **Cleanup frequency**: Runs periodically during transcoding
- **File patterns**: Removes `.ts` or `.mp4` segments and related files
- **Safety**: Includes retry logic and error handling

## FFmpeg Command Structure

### Complete Command Template

```bash
{inputModifier} {inputArgument} -map_metadata -1 -map_chapters -1 \
  -threads {threads} {mapArgs} {videoArguments} {audioArguments} \
  -copyts -avoid_negative_ts disabled -max_muxing_queue_size {maxMuxingQueueSize} \
  -f hls -max_delay 5000000 -hls_time {segmentLength} -hls_segment_type {segmentFormat} \
  -start_number {startNumber} -hls_segment_filename "{segmentPath}" {hlsArguments} -y "{outputPath}"
```

### Parameter Breakdown

#### Input Modifiers

```bash
-re                        # Read input at native framerate
-hwaccel cuda              # Hardware acceleration (optional)
-fflags +genpts            # Generate presentation timestamps
-analyzeduration 5000000   # Analysis duration for streams
-readrate 10               # Input read rate limit (for segment deletion)
```

#### Core Parameters

```bash
-threads 0                   # Auto-detect thread count
-map_metadata -1             # Strip metadata
-map_chapters -1             # Strip chapters
-copyts                      # Copy timestamps
-avoid_negative_ts disabled  # Handle negative timestamps
-max_muxing_queue_size 128   # Muxing queue size
```

#### HLS-Specific Parameters

```bash
-f hls                               # Output format
-hls_time 6                          # Segment duration (seconds)
-hls_segment_type mpegts             # Segment container (mpegts/fmp4)
-hls_playlist_type vod               # Playlist type (vod/event)
-hls_list_size 0                     # Keep all segments in playlist
-start_number 0                      # Starting segment number
-hls_segment_filename "output%d.ts"  # Segment naming pattern
-hls_base_url "hls/stream/"          # Base URL for segments
```

#### Output Specification

```bash
"output.m3u8"  # Output playlist file
```

### Video Encoding Parameters

#### Quality Control

```bash
-c:v libx264      # Video codec
-preset veryfast  # Encoding speed/quality trade-off
-crf 23           # Constant rate factor (quality)
-maxrate 2000k    # Maximum bitrate
-bufsize 4000k    # Buffer size
```

#### Resolution and Framerate

```bash
-vf "scale=1920:1080"  # Scale to specific resolution
-r 30                  # Output framerate
-pix_fmt yuv420p       # Pixel format
```

### Audio Encoding Parameters

#### Basic Audio

```bash
-c:a aac   # Audio codec
-b:a 128k  # Audio bitrate
-ar 48000  # Sample rate
-ac 2      # Audio channels
```

#### Advanced Audio Processing

```bash
-af "volume=1.0"  # Audio filters
-acodec copy      # Copy audio stream
```

## Implementation Guide for Next.js
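As a bridge from the FFmpeg parameters above to the Next.js code below, the documented flags can be composed into a spawn-ready argument list. A sketch, assuming software encoding; paths, quality values, and the option type are illustrative placeholders:

```typescript
// Compose the HLS parameters documented above into an argv array
// suitable for child_process.spawn('ffmpeg', args).
interface HlsArgsOptions {
  inputPath: string;
  outputDir: string;
  segmentSeconds?: number; // 6s recommended, per the discussion above
  startNumber?: number;
}

function buildHlsArgs(o: HlsArgsOptions): string[] {
  const seg = o.segmentSeconds ?? 6;
  return [
    '-i', o.inputPath,
    '-map_metadata', '-1',
    '-map_chapters', '-1',
    '-threads', '0',
    '-c:v', 'libx264', '-preset', 'veryfast', '-crf', '23',
    '-c:a', 'aac', '-b:a', '128k',
    '-f', 'hls',
    '-hls_time', String(seg),
    '-hls_segment_type', 'mpegts',
    '-start_number', String(o.startNumber ?? 0),
    '-hls_segment_filename', `${o.outputDir}/segment%d.ts`,
    '-y', `${o.outputDir}/output.m3u8`,
  ];
}
```

Building the argv as an array (rather than a shell string) also avoids shell-quoting issues with paths containing spaces or special characters.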
### 1. Core Architecture

```typescript
import { ChildProcess } from 'child_process';

// Core interfaces
interface TranscodingJob {
  id: string;
  playSessionId: string;
  process: ChildProcess;
  lastPingDate: Date;
  pingTimeout: number;
  bytesDownloaded: number;
  transcodingPositionTicks: number;
  activeRequestCount: number;
  isUserPaused: boolean;
  hasExited: boolean;
}

interface StreamState {
  outputFilePath: string;
  segmentLength: number;
  inputProtocol: string;
  runTimeTicks: number;
  isInputVideo: boolean;
  transcodingType: 'hls' | 'progressive';
}
```

### 2. TranscodeManager Implementation

```typescript
import { spawn } from 'child_process';

class TranscodeManager {
  private activeJobs = new Map<string, TranscodingJob>();
  private killTimers = new Map<string, NodeJS.Timeout>();

  async startFfmpeg(state: StreamState, playSessionId: string): Promise<TranscodingJob> {
    const commandArgs = this.buildFfmpegCommand(state);
    const process = spawn('ffmpeg', commandArgs);

    const job: TranscodingJob = {
      id: crypto.randomUUID(),
      playSessionId,
      process,
      lastPingDate: new Date(),
      pingTimeout: state.transcodingType === 'progressive' ? 10000 : 60000,
      bytesDownloaded: 0,
      transcodingPositionTicks: 0,
      activeRequestCount: 1,
      isUserPaused: false,
      hasExited: false
    };

    this.activeJobs.set(playSessionId, job);
    this.startKillTimer(job);
    return job;
  }

  pingTranscodingJob(playSessionId: string, isUserPaused?: boolean) {
    const job = this.activeJobs.get(playSessionId);
    if (!job || job.hasExited) return;

    if (isUserPaused !== undefined) {
      job.isUserPaused = isUserPaused;
    }

    job.lastPingDate = new Date();
    this.startKillTimer(job); // restart the kill timer
  }

  private startKillTimer(job: TranscodingJob) {
    this.clearKillTimer(job.playSessionId);

    const timer = setTimeout(() => {
      this.checkAndKillJob(job);
    }, job.pingTimeout);

    this.killTimers.set(job.playSessionId, timer);
  }

  private clearKillTimer(playSessionId: string) {
    const timer = this.killTimers.get(playSessionId);
    if (timer) {
      clearTimeout(timer);
      this.killTimers.delete(playSessionId);
    }
  }

  private async checkAndKillJob(job: TranscodingJob) {
    const timeSinceLastPing = Date.now() - job.lastPingDate.getTime();

    if (timeSinceLastPing < job.pingTimeout) {
      // Reset timer if ping is still fresh
      this.startKillTimer(job);
      return;
    }

    // Kill the job
    await this.killTranscodingJob(job);
  }
}
```
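The freshness check inside `checkAndKillJob` above reduces to a pure predicate, which makes the timing behavior trivially testable without real timers. A minimal sketch (the function name is illustrative):

```typescript
// Pure form of the kill-timer check above: a job is stale, and should be
// killed, only when no ping has arrived within its timeout window.
function isJobStale(lastPingMs: number, nowMs: number, pingTimeoutMs: number): boolean {
  return nowMs - lastPingMs >= pingTimeoutMs;
}
```

The timer callback then becomes `if (isJobStale(job.lastPingDate.getTime(), Date.now(), job.pingTimeout)) kill(); else restartTimer();`, separating scheduling from policy.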
### 3. API Endpoints

```typescript
// Next.js API routes

// /api/transcoding/[playSessionId]/ping
export async function POST(request: Request, { params }: { params: { playSessionId: string } }) {
  const { isUserPaused } = await request.json();
  transcodeManager.pingTranscodingJob(params.playSessionId, isUserPaused);
  return Response.json({ success: true });
}

// /api/hls/[...segments]
export async function GET(request: Request, { params }: { params: { segments: string[] } }) {
  const [playSessionId, segmentFile] = params.segments;

  if (segmentFile.endsWith('.m3u8')) {
    // Return playlist
    const playlist = await fs.promises.readFile(path.join(transcodingDir, segmentFile), 'utf8');
    return new Response(playlist, {
      headers: { 'Content-Type': 'application/vnd.apple.mpegurl' }
    });
  } else {
    // Return segment file
    const filePath = path.join(transcodingDir, segmentFile);
    const fileStream = fs.createReadStream(filePath);
    return new Response(fileStream as any);
  }
}
```

### 4. Client Integration

```typescript
// Client-side ping implementation
class MediaPlayer {
  private pingInterval: NodeJS.Timeout | null = null;
  private playSessionId: string;
  private isPaused = false;

  startPinging() {
    this.pingInterval = setInterval(() => {
      fetch(`/api/transcoding/${this.playSessionId}/ping`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ isUserPaused: this.isPaused })
      });
    }, 30000); // Ping every 30 seconds
  }

  stopPinging() {
    if (this.pingInterval) {
      clearInterval(this.pingInterval);
      this.pingInterval = null;
    }
  }
}
```

## Seeking Implementation

Jellyfin implements sophisticated seeking mechanisms for both progressive and HLS transcoding, handling different scenarios and optimizing for performance.

### 1. Core Seeking Logic

**Central Method**: `GetFastSeekCommandLineParameter()` in `EncodingHelper.cs`

```csharp
public string GetFastSeekCommandLineParameter(EncodingJobInfo state, EncodingOptions options, string segmentContainer)
{
    var time = state.BaseRequest.StartTimeTicks ?? 0;
    var maxTime = state.RunTimeTicks ??
        0;
    var seekParam = string.Empty;

    if (time > 0)
    {
        // For direct streaming/remuxing, we seek at the exact position of the keyframe
        // However, ffmpeg will seek to previous keyframe when the exact time is the input
        // Workaround this by adding 0.5s offset to the seeking time to get the exact keyframe on most videos.
        // This will help subtitle syncing.
        var isHlsRemuxing = state.IsVideoRequest
            && state.TranscodingType is TranscodingJobType.Hls
            && IsCopyCodec(state.OutputVideoCodec);
        var seekTick = isHlsRemuxing ? time + 5000000L : time;

        // Seeking beyond EOF makes no sense in transcoding. Clamp the seekTick value to
        // [0, RuntimeTicks - 5.0s], so that the muxer gets packets and avoid error codes.
        if (maxTime > 0)
        {
            seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0));
        }

        seekParam += string.Format(CultureInfo.InvariantCulture, "-ss {0}", _mediaEncoder.GetTimeParameter(seekTick));

        if (state.IsVideoRequest)
        {
            // Add -noaccurate_seek for specific conditions
            if (!string.Equals(state.InputContainer, "wtv", StringComparison.OrdinalIgnoreCase)
                && !string.Equals(segmentFormat, "ts", StringComparison.OrdinalIgnoreCase)
                && state.TranscodingType != TranscodingJobType.Progressive
                && !state.EnableBreakOnNonKeyFrames(outputVideoCodec)
                && (state.BaseRequest.StartTimeTicks ?? 0) > 0)
            {
                seekParam += " -noaccurate_seek";
            }
        }
    }

    return seekParam;
}
```

### 2. Seeking Types

#### A. **Input Seeking** (`-ss` before input)

- **Purpose**: Seek to position before decoding starts
- **Advantages**: Very fast, minimal CPU usage
- **Disadvantages**: Less accurate, seeks to nearest keyframe
- **Usage**: Primary method for initial positioning

#### B. **Output Seeking** (`-ss` after input)

- **Purpose**: Decode from beginning, then seek in output
- **Advantages**: Frame-accurate positioning
- **Disadvantages**: High CPU usage, slower startup
- **Usage**: When precision is critical
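The two seeking types above differ only in where `-ss` lands relative to `-i`. A minimal sketch making the ordering explicit (the function name is illustrative):

```typescript
// Illustrate the two -ss placements described above: input seeking puts
// -ss before -i (fast, keyframe-aligned); output seeking puts it after
// (frame-accurate, but decodes from the start of the file).
function buildSeekArgs(inputPath: string, seekSeconds: number, accurate: boolean): string[] {
  return accurate
    ? ['-i', inputPath, '-ss', String(seekSeconds)]  // output seeking
    : ['-ss', String(seekSeconds), '-i', inputPath]; // input seeking
}
```

For transcoding-on-seek, input seeking is almost always the right default, since the subsequent re-encode hides the keyframe-alignment imprecision from the client.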
#### C. **Accurate vs Fast Seeking**

- **Fast seeking** (`-noaccurate_seek`): Seeks to nearest keyframe (default)
- **Accurate seeking**: Frame-precise but slower
- **Dynamic selection**: Based on container and transcoding type

### 3. HLS Segment-Based Seeking

#### **Segment URL Structure**

```
/hls/{itemId}/{playlistId}/{segmentId}.{container}?runtimeTicks={position}&actualSegmentLengthTicks={duration}
```

#### **Key Parameters**:

- **`runtimeTicks`**: Starting position of the segment in the media timeline
- **`actualSegmentLengthTicks`**: Precise duration of this specific segment
- **`segmentId`**: Sequential segment number (0-based)

#### **Segment Calculation**:

```csharp
// Equal-length segments
var segmentLengthTicks = TimeSpan.FromSeconds(segmentLength).Ticks;
var wholeSegments = runtimeTicks / segmentLengthTicks;
var remainingTicks = runtimeTicks % segmentLengthTicks;

// Keyframe-aware segments (optimal)
var result = new List<double>();
var desiredSegmentLengthTicks = TimeSpan.FromMilliseconds(desiredSegmentLengthMs).Ticks;

foreach (var keyframe in keyframeData.KeyframeTicks)
{
    if (keyframe >= desiredCutTime)
    {
        var currentSegmentLength = keyframe - lastKeyframe;
        result.Add(TimeSpan.FromTicks(currentSegmentLength).TotalSeconds);
        lastKeyframe = keyframe;
        desiredCutTime += desiredSegmentLengthTicks;
    }
}
```

### 4. Seek Optimizations

#### **HLS Remuxing Offset**

```csharp
// Add 0.5s offset for HLS remuxing to hit exact keyframes
var isHlsRemuxing = state.IsVideoRequest
    && state.TranscodingType is TranscodingJobType.Hls
    && IsCopyCodec(state.OutputVideoCodec);
var seekTick = isHlsRemuxing ?
    time + 5000000L : time; // +0.5s
```

#### **EOF Protection**

```csharp
// Prevent seeking beyond end of file
if (maxTime > 0)
{
    seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0)); // -5s buffer
}
```

#### **Container-Specific Rules**

- **WTV containers**: Never use `-noaccurate_seek` (breaks seeking)
- **MPEGTS segments**: Disable `-noaccurate_seek` for client compatibility
- **fMP4 containers**: Require `-noaccurate_seek` for audio sync

### 5. Keyframe Extraction

**Purpose**: Generate precise segment boundaries aligned with keyframes

#### **FFprobe Method**:

```bash
ffprobe -loglevel error -skip_frame nokey -select_streams v:0 \
  -show_entries packet=pts_time,flags -of csv=print_section=0 "input.mp4"
```

#### **Matroska Method**:

- Direct cue point extraction from container metadata
- Much faster than FFprobe for MKV files
- Reads cue tables for instant keyframe positions

### 6. Real-Time Seeking Scenarios

#### **Progressive Transcoding**

```typescript
// Client seeks to new position
const seekTo = async (positionSeconds: number) => {
  // Kill current transcoding job
  await fetch(`/api/transcode/kill/${playSessionId}`, { method: 'POST' });

  // Start new transcoding from seek position
  const startTimeTicks = positionSeconds * 10000000; // Convert to ticks
  const newUrl = `/api/videos/${itemId}/stream?startTimeTicks=${startTimeTicks}`;

  // Update video source
  videoElement.src = newUrl;
};
```

#### **HLS Seeking**

```typescript
// HLS seeking is handled by the player automatically
// Server generates segments with proper seek points
const hlsPlayer = new Hls();
hlsPlayer.loadSource('/api/videos/{itemId}/master.m3u8');

// Player handles seeking by requesting appropriate segments
// No need to restart transcoding jobs
```
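The tick arithmetic used throughout (.NET ticks are 100ns units, so 10,000,000 ticks per second) and the EOF clamp above can be captured in small helpers. A sketch with illustrative names:

```typescript
const TICKS_PER_SECOND = 10_000_000; // .NET ticks: 100ns units
const EOF_BUFFER_TICKS = 5 * TICKS_PER_SECOND; // 5s buffer, as above

function secondsToTicks(seconds: number): number {
  return Math.round(seconds * TICKS_PER_SECOND);
}

function ticksToSeconds(ticks: number): number {
  return ticks / TICKS_PER_SECOND;
}

// Clamp a seek target into [0, runtime - 5s], mirroring the EOF
// protection shown above.
function clampSeekTicks(seekTicks: number, runtimeTicks: number): number {
  return Math.min(Math.max(seekTicks, 0), Math.max(runtimeTicks - EOF_BUFFER_TICKS, 0));
}
```

Centralizing these conversions avoids the easy off-by-10^7 mistakes when positions cross the client (seconds) / server (ticks) boundary.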
### 7. External Media Handling

#### **External Subtitles**

```csharp
// Also seek external subtitle streams
var seekSubParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekSubParam))
{
    arg.Append(' ').Append(seekSubParam);
}

arg.Append(" -i file:\"").Append(subtitlePath).Append('"');
```

#### **External Audio**

```csharp
// Seek external audio streams to match video
var seekAudioParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekAudioParam))
{
    arg.Append(' ').Append(seekAudioParam);
}

arg.Append(" -i \"").Append(state.AudioStream.Path).Append('"');
```

### 8. Implementation Guide for Next.js

#### **Progressive Seeking**

```typescript
class ProgressiveTranscoder {
  async seekTo(positionTicks: number) {
    // Kill existing job
    await this.killCurrentJob();

    // Clamp to safe bounds ([0, duration - 5s])
    const maxTime = this.mediaDuration;
    const safeSeekTicks = Math.max(0, Math.min(positionTicks, maxTime - 50000000));

    // Build FFmpeg command with seek
    const args = [
      '-ss', this.formatTime(safeSeekTicks),
      '-i', this.inputPath,
      '-c:v', 'libx264',
      '-preset', 'veryfast',
      // ... other encoding params
other encoding params this.outputPath ]; return this.startTranscoding(args); } private formatTime(ticks: number): string { const totalSeconds = ticks / 10000000; const hours = Math.floor(totalSeconds / 3600); const minutes = Math.floor((totalSeconds % 3600) / 60); const seconds = totalSeconds % 60; return `${hours}:${minutes.toString().padStart(2, '0')}:${seconds.toFixed(6).padStart(9, '0')}`; } } ``` #### **HLS Seeking** ```typescript class HLSTranscoder { generateSegmentUrl(segmentId: number, runtimeTicks: number, segmentDurationTicks: number): string { const params = new URLSearchParams({ runtimeTicks: runtimeTicks.toString(), actualSegmentLengthTicks: segmentDurationTicks.toString() }); return `/api/hls/${this.itemId}/${this.playlistId}/${segmentId}.ts?${params}`; } async generateSegment(segmentId: number, runtimeTicks: number): Promise { const seekSeconds = runtimeTicks / 10000000; const args = [ '-ss', this.formatTime(runtimeTicks), '-i', this.inputPath, '-t', this.segmentDuration.toString(), '-c:v', 'libx264', '-preset', 'veryfast', '-force_key_frames', `expr:gte(t,n_forced*${this.segmentDuration})`, '-f', 'mpegts', '-' ]; return this.executeFFmpeg(args); } } ``` ### 9. Performance Considerations #### **Seek Performance Tips**: 1. **Use input seeking** (`-ss` before `-i`) when possible 2. **Cache keyframe data** for containers that support it 3. **Implement seek debouncing** to prevent rapid job restarts 4. **Use appropriate segment duration** (6s recommended for seek performance) 5. 
**Pre-generate keyframe indexes** for frequently accessed content #### **Client-Side Optimizations**: ```typescript // Debounce seeking to prevent excessive requests const debouncedSeek = debounce((position: number) => { this.performSeek(position); }, 300); // Progressive seeking strategy if (Math.abs(targetPosition - currentPosition) < 30) { // Small seeks: let player buffer naturally player.currentTime = targetPosition; } else { // Large seeks: restart transcoding this.seekTo(targetPosition); } ``` ## Progress Tracking and Seeking During Transcoding **The Challenge**: Unlike direct play where duration and seek positions are straightforward, transcoding creates a "streaming-like" scenario where the real duration is not immediately available and progress tracking becomes complex. ### 1. Core Progress Tracking Architecture **Key Components**: - **`TranscodingPositionTicks`**: Where FFmpeg transcoding has currently reached - **`DownloadPositionTicks`**: Where the client has consumed content to - **`CompletionPercentage`**: Calculated progress based on runtime vs current position - **`RunTimeTicks`**: Total media duration from metadata #### **Progress Calculation Logic**: ```csharp // From JobLogger.ParseLogLine() - extracts progress from FFmpeg output var totalMs = state.RunTimeTicks.HasValue ? TimeSpan.FromTicks(state.RunTimeTicks.Value).TotalMilliseconds : 0; var currentMs = /* parsed from FFmpeg time output */; if (totalMs > 0) { percent = 100.0 * currentMs / totalMs; transcodingPosition = TimeSpan.FromMilliseconds(currentMs); } ``` ### 2. 
Real-Time Progress Updates #### **FFmpeg Output Parsing**: ```csharp // JobLogger monitors FFmpeg stderr output for progress private void ParseLogLine(string line, EncodingJobInfo state) { // Parse: frame= 123 fps= 25 q=28.0 size= 1024kB time=00:01:23.45 bitrate= 512.0kbits/s // Extract: time value for current transcoding position // Extract: size value for bytes transcoded // Extract: bitrate for current encoding rate } ``` #### **Progress Reporting Chain**: ```typescript // 1. FFmpeg outputs progress to stderr // 2. JobLogger.ParseLogLine() extracts values // 3. TranscodeManager.ReportTranscodingProgress() updates job state // 4. SessionManager.ReportTranscodingInfo() updates client session // 5. TranscodingInfo DTO sent to client via WebSocket/API interface TranscodingInfo { CompletionPercentage?: number; // 0-100 progress Bitrate?: number; // Current bitrate Framerate?: number; // Current FPS Width?: number; // Video width Height?: number; // Video height AudioCodec: string; // Audio codec in use VideoCodec: string; // Video codec in use Container: string; // Output container } ``` ### 3. 
Client Progress Bar Implementation #### **Progressive Transcoding**: ```typescript class ProgressiveTranscodingProgress { private transcodingInfo: TranscodingInfo; private mediaRunTimeTicks: number; updateProgressBar(): void { if (this.transcodingInfo?.CompletionPercentage) { // Use transcoding percentage directly const progress = this.transcodingInfo.CompletionPercentage / 100; this.progressBar.value = progress; // Estimate current playable duration const availableDuration = this.mediaRunTimeTicks * progress; this.updateSeekableRange(0, availableDuration); } } handleSeek(targetPositionTicks: number): void { const transcodedTicks = this.mediaRunTimeTicks * (this.transcodingInfo.CompletionPercentage / 100); if (targetPositionTicks <= transcodedTicks) { // Seek within transcoded content this.player.currentTime = targetPositionTicks / 10000000; } else { // Restart transcoding from seek position this.startTranscodingFromPosition(targetPositionTicks); } } } ``` #### **HLS Transcoding**: ```typescript class HLSTranscodingProgress { private segmentDuration: number = 6; // seconds private totalSegments: number; calculateProgress(): ProgressInfo { // HLS progress based on segment availability const availableSegments = this.getAvailableSegmentCount(); const progress = availableSegments / this.totalSegments; return { percentage: progress * 100, availableDuration: availableSegments * this.segmentDuration, seekableEnd: availableSegments * this.segmentDuration }; } updateSegmentDownloadPosition(): void { // Update DownloadPositionTicks when segments are consumed const segmentEndTicks = this.currentRuntimeTicks + this.actualSegmentLengthTicks; this.transcodingJob.DownloadPositionTicks = Math.max( this.transcodingJob.DownloadPositionTicks ?? segmentEndTicks, segmentEndTicks ); } } ``` ### 4. Solving the "Real Duration is Not Real" Problem #### **Duration Estimation Strategies**: **1. 
Metadata-Based Duration**: ```typescript // Use media file metadata as baseline const estimatedDuration = mediaSource.RunTimeTicks; if (estimatedDuration) { this.totalDuration = estimatedDuration; this.progressBar.max = estimatedDuration; } ``` **2. Progressive Duration Discovery**: ```typescript // Update duration as transcoding progresses if (transcodingInfo.CompletionPercentage > 0) { const currentTranscodedTicks = /* current position from transcoding */; const estimatedTotal = currentTranscodedTicks / (transcodingInfo.CompletionPercentage / 100); // Only update if estimate seems reliable (>10% transcoded) if (transcodingInfo.CompletionPercentage > 10) { this.totalDuration = estimatedTotal; } } ``` **3. HLS Segment-Based Calculation**: ```typescript // For HLS, calculate from segment structure const calculateHLSDuration = (segments: SegmentInfo[]): number => { return segments.reduce((total, segment) => { return total + segment.actualSegmentLengthTicks; }, 0); }; ``` ### 5. Advanced Progress Management #### **Buffering and Availability**: ```typescript class TranscodingBuffer { private bufferAheadSeconds: number = 30; isPositionAvailable(targetPositionTicks: number): boolean { const transcodedTicks = this.getTranscodedPositionTicks(); return targetPositionTicks <= transcodedTicks; } calculateSeekableRange(): { start: number; end: number } { return { start: 0, end: this.getTranscodedPositionTicks() - (this.bufferAheadSeconds * 10000000) }; } shouldThrottleTranscoding(): boolean { const gap = this.transcodingPositionTicks - this.downloadPositionTicks; const targetGap = this.bufferAheadSeconds * 10000000; // 30s in ticks return gap > targetGap; } } ``` #### **Smooth Progress Updates**: ```typescript class SmoothProgressUpdater { private interpolationInterval: number = 1000; // 1 second private lastKnownPosition: number = 0; private lastUpdateTime: number = Date.now(); interpolateProgress(): number { if (!this.isPlaying) return this.lastKnownPosition; const now = 
Date.now(); const elapsed = now - this.lastUpdateTime; const estimatedProgress = this.lastKnownPosition + elapsed; // Don't exceed known transcoded position const maxAvailable = this.getTranscodedPositionTicks(); return Math.min(estimatedProgress, maxAvailable); } } ``` ### 6. Error Handling and Edge Cases #### **Transcoding Failures**: ```typescript class TranscodingErrorHandler { handleTranscodingError(error: TranscodingError): void { switch (error.type) { case 'SEEK_BEYOND_DURATION': // Clamp seek to valid range this.seekTo(Math.min(this.targetPosition, this.maxAvailablePosition)); break; case 'TRANSCODING_STALLED': // Restart transcoding this.restartTranscodingFromLastKnownPosition(); break; case 'INVALID_DURATION': // Fall back to live estimation this.enableLiveDurationEstimation(); break; } } } ``` #### **Network Issues**: ```typescript class NetworkResilienceHandler { private retryPolicy = { maxRetries: 3, backoffMs: [1000, 2000, 4000] }; async handleProgressUpdateFailure(attempt: number): Promise { if (attempt < this.retryPolicy.maxRetries) { await this.delay(this.retryPolicy.backoffMs[attempt]); return this.fetchProgressUpdate(); } else { // Switch to local time-based estimation this.enableLocalProgressEstimation(); } } } ``` ### 7. 
Next.js Implementation Guide #### **Complete Progress Management**: ```typescript class NextJSTranscodingProgressManager { private wsConnection: WebSocket; private progressUpdateInterval: NodeJS.Timeout; constructor(private videoElement: HTMLVideoElement) { this.setupWebSocketUpdates(); this.setupProgressInterpolation(); } private setupWebSocketUpdates(): void { this.wsConnection.onmessage = (event) => { const message = JSON.parse(event.data); if (message.MessageType === 'TranscodingInfo') { this.updateTranscodingInfo(message.Data); } }; } private updateTranscodingInfo(info: TranscodingInfo): void { // Update progress bar if (info.CompletionPercentage) { this.progressBar.value = info.CompletionPercentage; } // Update seekable range const seekableEnd = this.calculateSeekableEnd(info); this.videoElement.setAttribute('data-seekable-end', seekableEnd.toString()); // Update duration if we have better estimate this.updateDurationEstimate(info); } async handleSeek(targetSeconds: number): Promise { const targetTicks = targetSeconds * 10000000; const transcodedTicks = this.getTranscodedPositionTicks(); if (targetTicks <= transcodedTicks) { // Seek within available content this.videoElement.currentTime = targetSeconds; } else { // Show loading state this.showBufferingState(); // Request new transcoding position await this.requestTranscodingFromPosition(targetTicks); // Update video source this.videoElement.src = this.generateStreamUrl(targetTicks); } } } ``` ## Important Configuration ### Environment Variables ```bash FFMPEG_PATH=/usr/bin/ffmpeg TRANSCODING_TEMP_PATH=/tmp/jellyfin/transcoding MAX_CONCURRENT_STREAMS=3 SEGMENT_KEEP_SECONDS=300 THROTTLE_DELAY_SECONDS=60 ``` ### Performance Tuning - **Thread count**: Auto-detect based on CPU cores - **Buffer sizes**: Adjust based on available memory - **Segment duration**: 6 seconds for good seek performance - **Concurrent streams**: Limit based on system resources ### Security Considerations - **Input validation**: Sanitize all 
file paths and parameters - **Resource limits**: Prevent DOS through excessive transcoding - **Access control**: Validate session ownership - **File cleanup**: Remove orphaned files regularly ## Monitoring and Logging ### Key Metrics to Track - Active transcoding jobs count - Resource usage (CPU, memory, disk I/O) - Average transcoding speed vs playback speed - Client ping frequency and timeouts - Segment cleanup efficiency ### Error Handling - FFmpeg process failures - Disk space exhaustion - Network timeouts - Invalid media files - Hardware acceleration failures This architecture provides a robust foundation for building a media transcoding system with proper resource management, client lifecycle handling, and performance optimization.
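**Worked example: EOF seek clamping.** The EOF-protection rule from the seeking section (clamp any requested seek to at least 5 seconds before end-of-file) translates directly to TypeScript. This is a minimal sketch of the same arithmetic shown in the C# snippet; the function name is an assumption, not an existing API.

```typescript
// Ticks are 100ns units, so 50,000,000 ticks = 5 seconds.
const TICKS_PER_SECOND = 10_000_000;
const EOF_BUFFER_TICKS = 5 * TICKS_PER_SECOND;

// Clamp a requested seek position so it never lands within 5s of EOF
// (mirrors the C# Math.Clamp snippet in the EOF Protection section).
function clampSeekTicks(seekTicks: number, maxTimeTicks: number): number {
  if (maxTimeTicks <= 0) return seekTicks; // unknown duration: leave as-is
  const upperBound = Math.max(maxTimeTicks - EOF_BUFFER_TICKS, 0);
  return Math.min(Math.max(seekTicks, 0), upperBound);
}
```

For a 60-second file (600,000,000 ticks), any seek past 55 seconds is pulled back to the 55-second mark, so FFmpeg always has material left to encode.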
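**Worked example: keyframe index parsing.** The FFprobe keyframe-extraction command in the seeking section emits one CSV line per packet (`pts_time,flags`). A small sketch of turning that output into a keyframe timestamp list; note the exact flags string (e.g. `K__`) varies between FFmpeg versions, so treat the sample lines here as assumptions.

```typescript
// Parse ffprobe "-show_entries packet=pts_time,flags -of csv=print_section=0"
// output into keyframe timestamps. Keyframe lines look like "4.170833,K__".
function parseKeyframes(csv: string): number[] {
  return csv
    .split('\n')
    .map((l) => l.trim())
    .filter((l) => l.length > 0)
    .map((l) => l.split(','))
    .filter(([, flags]) => (flags ?? '').startsWith('K')) // keyframes only
    .map(([pts]) => Number(pts));
}
```

A cached result of this parse is what makes the "pre-generate keyframe indexes" performance tip cheap to apply at segment-boundary time.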
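**Worked example: seek debouncing.** The client-side optimization snippet calls a `debounce` helper without defining it. A minimal trailing-edge implementation (an assumption; any standard debounce utility would do) is:

```typescript
// Trailing-edge debounce: collapse a burst of calls into a single call to
// `fn`, fired `waitMs` after the last call in the burst. Prevents a user
// scrubbing the seek bar from triggering a transcode restart per tick.
function debounce<T extends (...args: any[]) => void>(
  fn: T,
  waitMs: number
): (...args: Parameters<T>) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Parameters<T>) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```

With a 300ms window, dragging through ten positions kills and restarts FFmpeg once, for the final position only.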
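**Worked example: FFmpeg progress parsing.** The `JobLogger.ParseLogLine()` pseudocode in the progress-tracking section can be made concrete in TypeScript. This is a sketch, not Jellyfin's actual parser; the regexes target the sample stderr line shown in that section.

```typescript
interface FfmpegProgress {
  timeMs: number;        // current transcoding position in milliseconds
  sizeKb?: number;       // data written so far, in kB
  bitrateKbps?: number;  // current encoding bitrate
}

// Parse a stderr progress line such as:
// "frame=  123 fps= 25 q=28.0 size=    1024kB time=00:01:23.45 bitrate= 512.0kbits/s"
function parseProgressLine(line: string): FfmpegProgress | null {
  const timeMatch = line.match(/time=(\d+):(\d{2}):(\d{2})\.(\d+)/);
  if (!timeMatch) return null; // not a progress line
  const [, h, m, s, frac] = timeMatch;
  const timeMs =
    (Number(h) * 3600 + Number(m) * 60 + Number(s)) * 1000 +
    Number(frac.padEnd(3, '0').slice(0, 3)); // normalize fraction to ms
  const sizeMatch = line.match(/size=\s*(\d+)kB/);
  const bitrateMatch = line.match(/bitrate=\s*([\d.]+)kbits\/s/);
  return {
    timeMs,
    sizeKb: sizeMatch ? Number(sizeMatch[1]) : undefined,
    bitrateKbps: bitrateMatch ? Number(bitrateMatch[1]) : undefined,
  };
}
```

The extracted `timeMs` feeds the `percent = 100.0 * currentMs / totalMs` calculation from the progress-calculation snippet.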
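**Worked example: throttle command selection.** The throttling strategy (pause FFmpeg via `'p'` on stdin when encoding runs more than 60 seconds ahead of client consumption, resume via `'u'`) reduces to a small pure decision function. A sketch under the assumption that the caller tracks the paused state and writes the returned character to the FFmpeg process's stdin:

```typescript
const TICKS_PER_SEC = 10_000_000; // 100ns ticks, as used throughout this doc

// Decide which stdin command (if any) to send to FFmpeg, given how far the
// encoder (TranscodingPositionTicks) is ahead of the client
// (DownloadPositionTicks). 'p' pauses, 'u' resumes, null means no change.
function throttleCommand(
  transcodingPositionTicks: number,
  downloadPositionTicks: number,
  thresholdSeconds = 60,
  currentlyPaused = false
): 'p' | 'u' | null {
  const aheadTicks = transcodingPositionTicks - downloadPositionTicks;
  const thresholdTicks = thresholdSeconds * TICKS_PER_SEC;
  if (!currentlyPaused && aheadTicks > thresholdTicks) return 'p'; // too far ahead: pause
  if (currentlyPaused && aheadTicks <= thresholdTicks) return 'u'; // caught up: resume
  return null; // steady state
}
```

Keeping the decision pure makes it easy to unit-test the 60-second threshold separately from process management.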