# Jellyfin Transcoding Architecture Documentation
## Executive Summary
Jellyfin's transcoding system is a sophisticated media processing pipeline built around **FFmpeg** that provides real-time video and audio conversion for various client devices. This document details the core architecture, control flow, and implementation patterns valuable for building a similar system in Next.js.
**Key Findings:**
- **Engine**: FFmpeg managed through MediaEncoder class with hardware acceleration support
- **Job Management**: TranscodeManager orchestrates process lifecycle, throttling, and cleanup
- **Session Control**: Client ping system (10s/60s timeouts) with automatic kill timers
- **Resource Management**: Intelligent throttling based on playback position (+60s threshold)
- **Segment Management**: Automatic cleanup for HLS content >5 minutes duration
- **Seeking Strategy**: Large seeks require restarting the FFmpeg process because encoding proceeds linearly
## Key Implementation Insights
### 1. **FFmpeg Process Management Strategy**
**Core Principle**: Each transcoding job is a **linear, stateful FFmpeg process** that cannot seek to arbitrary positions after encoding starts. This fundamental limitation drives the entire architecture design.
**Implications**:
- **Large seeks require process restart**: When users seek >30 seconds, kill current job and start new one
- **Small seeks use player buffering**: Let HLS players handle <30 second seeks naturally
- **Process isolation**: Each session gets independent FFmpeg process for resource control
### 2. **Resource Management Philosophy**
**Throttling Strategy**: Prevent transcoding from running too far ahead of playback
- **Threshold**: Pause encoding when >60 seconds ahead of client consumption
- **Control Method**: Send FFmpeg stdin commands ('p' for pause, 'u' for resume)
- **Benefits**: Reduces CPU/disk usage, prevents unnecessary work
**Segment Cleanup Policy**: Automatic disk space management for HLS
- **Trigger Conditions**: Video content with duration >5 minutes
- **Retention**: Configurable number of segments to keep
- **Safety**: Retry logic with graceful error handling
### 3. **Session Lifecycle Management**
**Ping System Design**: Keep-alive mechanism prevents orphaned processes
- **Progressive streams**: 10-second timeout (fast response needed)
- **HLS streams**: 60-second timeout (more buffering tolerance)
- **Client responsibility**: Ping every 30-45 seconds during active playback
**Kill Timer Implementation**: Automatic cleanup for abandoned sessions
- **Grace periods**: Multiple timeout intervals before termination
- **Resource cleanup**: Process termination + file cleanup + stream closure
### 4. **FFmpeg Command Architecture**
**Standard HLS Command Structure**:
```bash
ffmpeg [input_modifiers] -i input.mp4 [encoding_params] -f hls -hls_time 6 -hls_segment_filename "segment%d.ts" output.m3u8
```
**Essential Parameters**:
- **`-f hls`**: HLS output format
- **`-hls_time 6`**: 6-second segments for optimal seek performance
- **`-hls_segment_filename`**: Consistent naming pattern
- **`output.m3u8`**: Playlist file output
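These parameters map directly onto a `spawn` argument array in Node.js. The sketch below is a minimal illustration only: paths, codec choices, and the segment naming pattern are placeholder assumptions, not Jellyfin's exact command builder.
```typescript
import { spawn } from 'node:child_process';

// Minimal sketch of assembling the HLS parameters above for child_process.spawn.
// Input/output paths and codec settings are placeholders.
function buildHlsArgs(inputPath: string, outputDir: string, segmentSeconds = 6): string[] {
  return [
    '-i', inputPath,
    '-c:v', 'libx264', '-preset', 'veryfast', // example video encoding settings
    '-c:a', 'aac',
    '-f', 'hls',
    '-hls_time', String(segmentSeconds),
    '-hls_segment_filename', `${outputDir}/segment%d.ts`,
    `${outputDir}/output.m3u8`,
  ];
}

// Example: const ffmpeg = spawn('ffmpeg', buildHlsArgs('/media/movie.mp4', '/tmp/transcoding/abc'));
```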
## Critical Design Decisions
### 1. **When to Restart vs Continue Transcoding**
**Process Restart Required**:
- Large seek operations (>30 seconds)
- Quality/resolution changes
- Audio track switching
- Subtitle track changes
**Continue Existing Process**:
- Small seeks (<30 seconds) - let player handle
- Pause/resume operations
- Client reconnections within timeout window
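These rules can be captured in a small predicate. The sketch below is illustrative only; the 30-second threshold comes from the list above, while the request shape is an assumption.
```typescript
// Hypothetical change descriptor; field names are assumptions for illustration.
interface PlaybackChange {
  seekDistanceSeconds?: number;
  qualityChanged?: boolean;
  audioTrackChanged?: boolean;
  subtitleTrackChanged?: boolean;
}

// Mirrors the restart/continue rules listed above.
function requiresProcessRestart(change: PlaybackChange): boolean {
  if (change.qualityChanged || change.audioTrackChanged || change.subtitleTrackChanged) {
    return true; // output parameters changed: re-encode from the new position
  }
  // Seeks beyond 30 seconds restart the job; smaller seeks are left to the HLS player
  return (change.seekDistanceSeconds ?? 0) > 30;
}
```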
### 2. **Segment Duration Strategy**
**6-Second Standard**: Optimal balance between:
- **Seek performance**: Reasonable granularity for user seeking
- **Network efficiency**: Not too many small requests
- **Startup time**: Quick initial buffering
- **Storage overhead**: Manageable file count
### 3. **Hardware Acceleration Integration**
**MediaEncoder Responsibilities**:
- **Capability detection**: Probe available hardware encoders
- **Process execution**: Manage FFmpeg with hardware flags
- **Fallback handling**: Graceful degradation to software encoding
- **Performance monitoring**: Track encoding speeds and success rates
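A minimal capability probe can shell out to `ffmpeg -encoders` and look for well-known hardware encoder names. The sketch below assumes that approach; the encoder list and default binary path are assumptions, and an empty result means "fall back to software encoding".
```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Probes `ffmpeg -encoders` output for common hardware H.264 encoders.
// Encoder names and the default binary path are illustrative assumptions.
async function detectHardwareEncoders(ffmpegPath = 'ffmpeg'): Promise<string[]> {
  const { stdout } = await execFileAsync(ffmpegPath, ['-hide_banner', '-encoders']);
  const candidates = ['h264_nvenc', 'h264_qsv', 'h264_amf', 'h264_videotoolbox'];
  return candidates.filter((encoder) => stdout.includes(encoder));
}

// An empty array means no hardware encoder was found; callers fall back to libx264.
```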
## Architecture Patterns for Next.js Implementation
### 1. **Core Classes Structure**
```typescript
// Main orchestrator (equivalent to TranscodeManager)
class TranscodingOrchestrator {
private activeJobs = new Map<string, TranscodingJob>();
private killTimers = new Map<string, NodeJS.Timeout>();
async startTranscoding(request: TranscodingRequest): Promise<TranscodingJob>
pingSession(sessionId: string, isUserPaused?: boolean): void
async killSession(sessionId: string): Promise<void>
}
// FFmpeg interface (equivalent to MediaEncoder)
class MediaProcessor {
private ffmpegPath: string;
private hardwareCapabilities: HardwareCapabilities;
async probe(inputPath: string): Promise<MediaInfo>
async startProcess(args: string[]): Promise<ChildProcess>
detectHardwareSupport(): HardwareCapabilities
}
// Individual job tracking (equivalent to TranscodingJob)
class TranscodingSession {
id: string;
process: ChildProcess;
lastPingDate: Date;
throttler?: TranscodingThrottler;
segmentCleaner?: SegmentCleaner;
startKillTimer(callback: () => void): void
stopKillTimer(): void
pause(): Promise<void>
resume(): Promise<void>
}
```
### 2. **API Endpoint Design**
```typescript
// Next.js API routes structure
// POST /api/transcoding/start
export async function POST(request: Request) {
const { itemId, startTimeTicks, quality } = await request.json();
const job = await orchestrator.startTranscoding({
itemId,
startTimeTicks,
quality,
sessionId: generateSessionId()
});
return Response.json({ sessionId: job.id, playlistUrl: job.playlistPath });
}
// POST /api/transcoding/[sessionId]/ping
export async function POST(request: Request, { params }: { params: { sessionId: string } }) {
const { isUserPaused } = await request.json();
orchestrator.pingSession(params.sessionId, isUserPaused);
return Response.json({ success: true });
}
// GET /api/hls/[sessionId]/[segmentId].ts
export async function GET(request: Request, { params }: { params: { sessionId: string, segmentId: string } }) {
const segmentPath = getSegmentPath(params.sessionId, params.segmentId);
const stream = fs.createReadStream(segmentPath);
return new Response(stream as any);
}
// DELETE /api/transcoding/[sessionId]
export async function DELETE(request: Request, { params }: { params: { sessionId: string } }) {
await orchestrator.killSession(params.sessionId);
return Response.json({ success: true });
}
```
### 3. **Client Integration Patterns**
```typescript
class AdaptiveMediaPlayer {
private sessionId: string;
private itemId: string;
private pingInterval: NodeJS.Timeout;
private lastPosition: number = 0;
private videoElement: HTMLVideoElement;
private hls: Hls; // hls.js instance (import Hls from 'hls.js')
async startPlayback(itemId: string, startPosition: number = 0) {
this.itemId = itemId; // remembered for restarts triggered by large seeks
// Start transcoding session
const response = await fetch('/api/transcoding/start', {
method: 'POST',
body: JSON.stringify({ itemId, startTimeTicks: startPosition * 10000000 })
});
const { sessionId, playlistUrl } = await response.json();
this.sessionId = sessionId;
this.startPinging();
// Load HLS playlist
this.hls.loadSource(playlistUrl);
}
private startPinging() {
this.pingInterval = setInterval(() => {
fetch(`/api/transcoding/${this.sessionId}/ping`, {
method: 'POST',
body: JSON.stringify({ isUserPaused: this.videoElement.paused })
});
}, 30000); // Ping every 30 seconds
}
async seekTo(targetPosition: number) {
const seekDistance = Math.abs(targetPosition - this.lastPosition);
if (seekDistance > 30) {
// Large seek: restart transcoding
await this.stopPlayback();
await this.startPlayback(this.itemId, targetPosition);
} else {
// Small seek: let HLS player handle
this.videoElement.currentTime = targetPosition;
}
this.lastPosition = targetPosition;
}
async stopPlayback() {
if (this.pingInterval) {
clearInterval(this.pingInterval);
}
if (this.sessionId) {
await fetch(`/api/transcoding/${this.sessionId}`, { method: 'DELETE' });
}
}
}
```
### 4. **Configuration Management**
```typescript
// Environment configuration
interface TranscodingConfig {
ffmpegPath: string;
transcodingTempPath: string;
maxConcurrentStreams: number;
segmentDuration: number; // 6 seconds recommended
segmentKeepCount: number;
throttleThresholdSeconds: number; // 60 seconds recommended
pingTimeoutProgressive: number; // 10 seconds
pingTimeoutHls: number; // 60 seconds
hardwareAcceleration: 'auto' | 'nvidia' | 'intel' | 'amd' | 'none';
}
// Load from environment
const config: TranscodingConfig = {
ffmpegPath: process.env.FFMPEG_PATH || '/usr/bin/ffmpeg',
transcodingTempPath: process.env.TRANSCODING_TEMP_PATH || '/tmp/transcoding',
maxConcurrentStreams: parseInt(process.env.MAX_CONCURRENT_STREAMS || '3'),
segmentDuration: 6,
segmentKeepCount: parseInt(process.env.SEGMENT_KEEP_COUNT || '10'),
throttleThresholdSeconds: 60,
pingTimeoutProgressive: 10000,
pingTimeoutHls: 60000,
hardwareAcceleration: (process.env.HARDWARE_ACCEL as any) || 'auto'
};
```
## Production Deployment Considerations
### 1. **Performance Monitoring**
```typescript
interface TranscodingMetrics {
activeJobs: number;
averageStartupTime: number;
successRate: number;
cpuUsage: number;
memoryUsage: number;
diskIORate: number;
clientTimeouts: number;
}
class MetricsCollector {
collectMetrics(): TranscodingMetrics {
return {
activeJobs: this.orchestrator.getActiveJobCount(),
averageStartupTime: this.calculateAverageStartup(),
successRate: this.calculateSuccessRate(),
cpuUsage: process.cpuUsage().user / 1000000, // Convert to seconds
memoryUsage: process.memoryUsage().heapUsed / 1024 / 1024, // MB
diskIORate: this.calculateDiskIO(),
clientTimeouts: this.timeoutCounter
};
}
}
```
### 2. **Error Handling Strategy**
```typescript
class TranscodingErrorHandler {
async handleFFmpegFailure(job: TranscodingSession, error: Error) {
// Log error with context
logger.error('FFmpeg process failed', {
jobId: job.id,
inputPath: job.inputPath,
error: error.message,
exitCode: job.process.exitCode
});
// Attempt recovery strategies
if (this.isRetryableError(error)) {
return this.retryWithFallback(job);
}
// Clean up resources
await this.cleanupFailedJob(job);
// Notify client
this.notifyClientError(job.sessionId, error);
}
private async retryWithFallback(job: TranscodingSession): Promise<boolean> {
// Try software encoding if hardware failed
if (job.hardwareAcceleration && this.isHardwareError(job.lastError)) {
logger.info('Retrying with software encoding', { jobId: job.id });
return this.restartWithSoftwareEncoding(job);
}
// Try lower quality settings
if (job.retryCount < 2) {
logger.info('Retrying with reduced quality', { jobId: job.id });
return this.restartWithReducedQuality(job);
}
return false;
}
}
```
### 3. **Security Considerations**
```typescript
class SecurityValidator {
validateTranscodingRequest(request: TranscodingRequest): ValidationResult {
// Path traversal protection
if (this.containsPathTraversal(request.inputPath)) {
return { valid: false, error: 'Invalid input path' };
}
// File type validation
if (!this.isAllowedMediaType(request.inputPath)) {
return { valid: false, error: 'Unsupported media type' };
}
// Resource limits
if (this.exceedsResourceLimits(request)) {
return { valid: false, error: 'Resource limits exceeded' };
}
// Rate limiting per client
if (this.isRateLimited(request.clientId)) {
return { valid: false, error: 'Rate limit exceeded' };
}
return { valid: true };
}
private containsPathTraversal(path: string): boolean {
return path.includes('..') || path.includes('~') || path.startsWith('/');
}
private isAllowedMediaType(path: string): boolean {
const allowedExtensions = ['.mp4', '.mkv', '.avi', '.mov', '.m4v'];
return allowedExtensions.some(ext => path.toLowerCase().endsWith(ext));
}
}
```
### 4. **Scaling Strategies**
```typescript
// Horizontal scaling with Redis coordination
class DistributedTranscodingOrchestrator {
private redis: Redis;
private nodeId: string;
async startTranscoding(request: TranscodingRequest): Promise<TranscodingJob> {
// Check global capacity
const globalLoad = await this.getGlobalLoad();
if (globalLoad > 0.8) {
throw new Error('System at capacity');
}
// Assign to least loaded node
const targetNode = await this.findLeastLoadedNode();
if (targetNode === this.nodeId) {
return this.startLocalTranscoding(request);
} else {
return this.delegateToNode(targetNode, request);
}
}
private async getGlobalLoad(): Promise<number> {
const nodes = await this.redis.smembers('transcoding:nodes');
let totalJobs = 0;
let totalCapacity = 0;
for (const node of nodes) {
const nodeLoad = await this.redis.hgetall(`transcoding:node:${node}`);
totalJobs += parseInt(nodeLoad.activeJobs || '0');
totalCapacity += parseInt(nodeLoad.maxCapacity || '0');
}
return totalCapacity > 0 ? totalJobs / totalCapacity : 1;
}
}
```
## Summary of Key Learnings
### **Architecture Principles**
1. **Process Isolation**: Each transcoding session gets independent FFmpeg process
2. **Linear Encoding**: A running FFmpeg job cannot jump to an arbitrary position, so large seeks require a process restart
3. **Resource Management**: Proactive throttling and cleanup prevent resource exhaustion
4. **Client Lifecycle**: Ping-based session management with automatic cleanup
### **Performance Optimizations**
1. **Segment Strategy**: 6-second HLS segments balance seek performance with efficiency
2. **Throttling Logic**: Pause encoding when >60 seconds ahead of playback
3. **Cleanup Policies**: Automatic segment removal for content >5 minutes
4. **Hardware Acceleration**: Graceful fallback from hardware to software encoding
### **Implementation Strategy**
1. **API Design**: RESTful endpoints for session management and HLS segment delivery
2. **Client Integration**: Adaptive seeking strategy based on seek distance
3. **Error Handling**: Comprehensive retry logic with fallback options
4. **Security**: Input validation, path traversal protection, rate limiting
### **Deployment Considerations**
1. **Monitoring**: Track active jobs, success rates, resource usage
2. **Scaling**: Horizontal scaling with Redis coordination
3. **Storage**: Temporary file management and cleanup
4. **Network**: CDN integration for segment delivery
This architecture provides a robust foundation for building production-grade media transcoding systems with proper resource management, client lifecycle handling, and performance optimization.
## Core Architecture
### 1. Main Components
#### TranscodeManager (`TranscodeManager.cs`)
- **Role**: Central orchestrator for all transcoding operations
- **Key Responsibilities**:
- FFmpeg process lifecycle management
- Job tracking and session management
- Resource cleanup and monitoring
- Client ping/keepalive handling
#### MediaEncoder (`MediaEncoder.cs`)
- **Role**: FFmpeg binary interface and process execution
- **Key Responsibilities**:
- FFmpeg path validation and capability detection
- Process creation and monitoring
- Hardware acceleration detection
- Encoder/decoder capability enumeration
#### TranscodingJob (`TranscodingJob.cs`)
- **Role**: Individual transcoding session state management
- **Key Responsibilities**:
- Process lifecycle tracking
- Resource usage monitoring (bytes, position, bitrate)
- Timer management for auto-cleanup
- Client connection state
### 2. Control Flow
```mermaid
graph TB
Client[Client Request] --> API[API Controller]
API --> StreamState[Create StreamState]
StreamState --> TranscodeManager[TranscodeManager.StartFfMpeg]
TranscodeManager --> FFmpeg[Launch FFmpeg Process]
FFmpeg --> Job[Create TranscodingJob]
Job --> Monitor[Job Monitoring]
Monitor --> Throttle[Throttling System]
Monitor --> Cleanup[Segment Cleanup]
Monitor --> Ping[Ping System]
Ping --> Timeout[Kill Timer]
```
#### Detailed Flow:
1. **Request Reception**: API controllers (`DynamicHlsController`, `VideosController`, `AudioController`) receive transcoding requests
2. **State Creation**: `StreamingHelpers.GetStreamingState()` analyzes media and creates `StreamState`
3. **Job Initialization**: `TranscodeManager.StartFfMpeg()` creates and starts FFmpeg process
4. **Process Monitoring**: `TranscodingJob` tracks process state and resource usage
5. **Client Management**: Ping system keeps sessions alive, kill timers handle disconnections
6. **Resource Management**: Throttling and segment cleaning optimize performance
7. **Cleanup**: Automatic cleanup when sessions end or timeout
## Key Systems
### 1. Ping System - Keep Transcoding Alive
**Purpose**: Prevents transcoding jobs from being killed when clients are actively consuming content.
**Implementation**:
```csharp
// PingTranscodingJob method in TranscodeManager
public void PingTranscodingJob(string playSessionId, bool? isUserPaused)
{
var jobs = _activeTranscodingJobs.Where(j =>
string.Equals(playSessionId, j.PlaySessionId, StringComparison.OrdinalIgnoreCase))
.ToList();
foreach (var job in jobs)
{
if (isUserPaused.HasValue)
{
job.IsUserPaused = isUserPaused.Value;
}
PingTimer(job, true);
}
}
private void PingTimer(TranscodingJob job, bool isProgressCheckIn)
{
if (job.HasExited)
{
job.StopKillTimer();
return;
}
// Different timeouts for different job types
var timerDuration = job.Type != TranscodingJobType.Progressive ? 60000 : 10000;
job.PingTimeout = timerDuration;
job.LastPingDate = DateTime.UtcNow;
job.StartKillTimer(OnTranscodeKillTimerStopped);
}
```
**Key Parameters**:
- **Progressive streams**: 10 second timeout (10000ms)
- **HLS streams**: 60 second timeout (60000ms)
- **Ping frequency**: Clients should ping every 30-45 seconds
- **Auto-ping**: Triggered on playback progress events
### 2. Kill Timers - Automatic Cleanup
**Purpose**: Automatically terminate abandoned transcoding jobs to free resources.
**Implementation**:
```csharp
private async void OnTranscodeKillTimerStopped(object? state)
{
var job = state as TranscodingJob;
if (!job.HasExited && job.Type != TranscodingJobType.Progressive)
{
var timeSinceLastPing = (DateTime.UtcNow - job.LastPingDate).TotalMilliseconds;
if (timeSinceLastPing < job.PingTimeout)
{
// Reset timer if ping is still fresh
job.StartKillTimer(OnTranscodeKillTimerStopped, job.PingTimeout);
return;
}
}
// Kill the job
await KillTranscodingJob(job, true, path => true).ConfigureAwait(false);
}
```
**Configuration**:
- **Grace period**: Jobs get multiple timeout intervals before termination
- **Progressive vs HLS**: Different timeout strategies
- **Resource cleanup**: Process termination, file cleanup, live stream closure
### 3. Throttling - Resource Management
**Purpose**: Controls transcoding speed to prevent resource waste and improve efficiency.
**Conditions for Throttling**:
```csharp
private static bool EnableThrottling(StreamState state)
=> state.InputProtocol == MediaProtocol.File
&& state.RunTimeTicks.HasValue
&& state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks
&& state.IsInputVideo
&& state.VideoType == VideoType.VideoFile;
```
**Throttling Logic**:
```csharp
private bool IsThrottleAllowed(TranscodingJob job, int thresholdSeconds)
{
var bytesDownloaded = job.BytesDownloaded;
var transcodingPositionTicks = job.TranscodingPositionTicks ?? 0;
var downloadPositionTicks = job.DownloadPositionTicks ?? 0;
var gapLengthInTicks = TimeSpan.FromSeconds(thresholdSeconds).Ticks;
if (downloadPositionTicks > 0 && transcodingPositionTicks > 0)
{
// HLS - time-based consideration
var gap = transcodingPositionTicks - downloadPositionTicks;
return gap > gapLengthInTicks;
}
// Progressive - byte-based consideration
// Calculate if transcoding is ahead enough to throttle
}
```
**Control Mechanism**:
- **Pause command**: Send 'p' (or 'c' for older FFmpeg) to stdin
- **Resume command**: Send 'u' (or newline for older FFmpeg) to stdin
- **Threshold**: Minimum 60 seconds ahead before throttling kicks in
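A minimal sketch of that control mechanism in Node.js, assuming the FFmpeg process was spawned with stdin piped; only the modern 'p'/'u' commands are shown, not the 'c'/newline variants for older FFmpeg builds.
```typescript
import type { ChildProcess } from 'node:child_process';

// Pauses/resumes a running FFmpeg process by writing single-character
// commands to its stdin, per the control mechanism described above.
class StdinThrottler {
  private paused = false;

  constructor(private readonly ffmpeg: ChildProcess) {}

  pause(): void {
    if (!this.paused && this.ffmpeg.stdin?.writable) {
      this.ffmpeg.stdin.write('p'); // suspend encoding
      this.paused = true;
    }
  }

  resume(): void {
    if (this.paused && this.ffmpeg.stdin?.writable) {
      this.ffmpeg.stdin.write('u'); // continue encoding
      this.paused = false;
    }
  }
}
```
The throttler is driven by the gap check above: pause when transcoding is more than the threshold ahead of the download position, resume once the client catches up.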
### 4. Segment Cleaning - HLS Management
**Purpose**: Removes old HLS segments to prevent disk space exhaustion.
**Conditions**:
```csharp
private static bool EnableSegmentCleaning(StreamState state)
=> state.InputProtocol is MediaProtocol.File or MediaProtocol.Http
&& state.IsInputVideo
&& state.TranscodingType == TranscodingJobType.Hls
&& state.RunTimeTicks.HasValue
&& state.RunTimeTicks.Value >= TimeSpan.FromMinutes(5).Ticks;
```
**Implementation**:
- **Segment retention**: Keeps last N segments (configurable)
- **Cleanup frequency**: Runs periodically during transcoding
- **File patterns**: Removes `.ts` or `.mp4` segments and related files
- **Safety**: Includes retry logic and error handling
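A sketch of such a cleaner, assuming the `segment%d.ts` naming pattern used elsewhere in this document; the retention count and directory layout are assumptions.
```typescript
import { promises as fs } from 'node:fs';
import path from 'node:path';

// Deletes all but the newest `keepCount` segments in a session's output
// directory. Assumes segments are named segment<N>.ts (or .mp4).
async function cleanOldSegments(outputDir: string, keepCount: number): Promise<void> {
  const entries = await fs.readdir(outputDir);
  const segments = entries
    .filter((name) => /^segment\d+\.(ts|mp4)$/.test(name))
    .sort((a, b) => Number(a.match(/\d+/)![0]) - Number(b.match(/\d+/)![0]));
  const stale = segments.slice(0, Math.max(segments.length - keepCount, 0));
  for (const name of stale) {
    try {
      await fs.unlink(path.join(outputDir, name));
    } catch {
      // The segment may still be served or written; it will be retried next pass.
    }
  }
}
```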
## FFmpeg Command Structure
### Complete Command Template
```bash
{inputModifier} {inputArgument} -map_metadata -1 -map_chapters -1 -threads {threads} {mapArgs} {videoArguments} {audioArguments} -copyts -avoid_negative_ts disabled -max_muxing_queue_size {maxMuxingQueueSize} -f hls -max_delay 5000000 -hls_time {segmentLength} -hls_segment_type {segmentFormat} -start_number {startNumber} -hls_segment_filename "{segmentPath}" {hlsArguments} -y "{outputPath}"
```
### Parameter Breakdown
#### Input Modifiers
```bash
-re # Read input at native framerate
-hwaccel cuda # Hardware acceleration (optional)
-fflags +genpts # Generate presentation timestamps
-analyzeduration 5000000 # Analysis duration for streams
-readrate 10 # Input read rate limit (for segment deletion)
```
#### Core Parameters
```bash
-threads 0 # Auto-detect thread count
-map_metadata -1 # Strip metadata
-map_chapters -1 # Strip chapters
-copyts # Copy timestamps
-avoid_negative_ts disabled # Handle negative timestamps
-max_muxing_queue_size 128 # Muxing queue size
```
#### HLS-Specific Parameters
```bash
-f hls # Output format
-hls_time 6 # Segment duration (seconds)
-hls_segment_type mpegts # Segment container (mpegts/fmp4)
-hls_playlist_type vod # Playlist type (vod/event)
-hls_list_size 0 # Keep all segments in playlist
-start_number 0 # Starting segment number
-hls_segment_filename "output%d.ts" # Segment naming pattern
-hls_base_url "hls/stream/" # Base URL for segments
```
#### Output Specification
```bash
"output.m3u8" # Output playlist file
```
### Video Encoding Parameters
#### Quality Control
```bash
-c:v libx264 # Video codec
-preset veryfast # Encoding speed/quality trade-off
-crf 23 # Constant rate factor (quality)
-maxrate 2000k # Maximum bitrate
-bufsize 4000k # Buffer size
```
#### Resolution and Framerate
```bash
-vf "scale=1920:1080" # Scale to specific resolution
-r 30 # Output framerate
-pix_fmt yuv420p # Pixel format
```
### Audio Encoding Parameters
#### Basic Audio
```bash
-c:a aac # Audio codec
-ab 128k # Audio bitrate
-ar 48000 # Sample rate
-ac 2 # Audio channels
```
#### Advanced Audio Processing
```bash
-af "volume=1.0" # Audio filters
-acodec copy # Copy audio stream
```
## Implementation Guide for Next.js
### 1. Core Architecture
```typescript
// Core interfaces
interface TranscodingJob {
id: string;
playSessionId: string;
process: ChildProcess;
lastPingDate: Date;
pingTimeout: number;
bytesDownloaded: number;
transcodingPositionTicks: number;
activeRequestCount: number;
isUserPaused: boolean;
hasExited: boolean;
}
interface StreamState {
outputFilePath: string;
segmentLength: number;
inputProtocol: string;
runTimeTicks: number;
isInputVideo: boolean;
transcodingType: 'hls' | 'progressive';
}
```
### 2. TranscodeManager Implementation
```typescript
import { spawn } from 'node:child_process';
import crypto from 'node:crypto';

class TranscodeManager {
private activeJobs = new Map<string, TranscodingJob>();
private killTimers = new Map<string, NodeJS.Timeout>();
async startFfmpeg(state: StreamState, playSessionId: string): Promise<TranscodingJob> {
const commandArgs = this.buildFfmpegCommand(state);
const process = spawn('ffmpeg', commandArgs);
const job: TranscodingJob = {
id: crypto.randomUUID(),
playSessionId,
process,
lastPingDate: new Date(),
pingTimeout: state.transcodingType === 'progressive' ? 10000 : 60000,
bytesDownloaded: 0,
transcodingPositionTicks: 0,
activeRequestCount: 1,
isUserPaused: false,
hasExited: false
};
this.activeJobs.set(playSessionId, job);
this.startKillTimer(job);
return job;
}
pingTranscodingJob(playSessionId: string, isUserPaused?: boolean) {
const job = this.activeJobs.get(playSessionId);
if (!job || job.hasExited) return;
if (isUserPaused !== undefined) {
job.isUserPaused = isUserPaused;
}
job.lastPingDate = new Date();
this.resetKillTimer(job);
}
private startKillTimer(job: TranscodingJob) {
this.clearKillTimer(job.playSessionId);
const timer = setTimeout(() => {
this.checkAndKillJob(job);
}, job.pingTimeout);
this.killTimers.set(job.playSessionId, timer);
}
private async checkAndKillJob(job: TranscodingJob) {
const timeSinceLastPing = Date.now() - job.lastPingDate.getTime();
if (timeSinceLastPing < job.pingTimeout) {
// Reset timer if ping is still fresh
this.startKillTimer(job);
return;
}
// Kill the job
await this.killTranscodingJob(job);
}
}
```
### 3. API Endpoints
```typescript
// Next.js API routes
// /api/transcoding/[playSessionId]/ping
export async function POST(request: Request, { params }: { params: { playSessionId: string } }) {
const { isUserPaused } = await request.json();
transcodeManager.pingTranscodingJob(params.playSessionId, isUserPaused);
return Response.json({ success: true });
}
// /api/hls/[...segments]
export async function GET(request: Request, { params }: { params: { segments: string[] } }) {
const [playSessionId, segmentFile] = params.segments;
if (segmentFile.endsWith('.m3u8')) {
// Return playlist
return new Response(playlist, {
headers: { 'Content-Type': 'application/vnd.apple.mpegurl' }
});
} else {
// Return segment file
const filePath = path.join(transcodingDir, segmentFile);
const fileStream = fs.createReadStream(filePath);
return new Response(fileStream as any);
}
}
```
### 4. Client Integration
```typescript
// Client-side ping implementation
class MediaPlayer {
private pingInterval: NodeJS.Timeout | null = null;
private playSessionId: string;
startPinging() {
this.pingInterval = setInterval(() => {
fetch(`/api/transcoding/${this.playSessionId}/ping`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ isUserPaused: this.isPaused })
});
}, 30000); // Ping every 30 seconds
}
stopPinging() {
if (this.pingInterval) {
clearInterval(this.pingInterval);
this.pingInterval = null;
}
}
}
```
## Seeking Implementation
Jellyfin implements sophisticated seeking mechanisms for both progressive and HLS transcoding, handling different scenarios and optimizing for performance.
### 1. Core Seeking Logic
**Central Method**: `GetFastSeekCommandLineParameter()` in `EncodingHelper.cs`
```csharp
public string GetFastSeekCommandLineParameter(EncodingJobInfo state, EncodingOptions options, string segmentContainer)
{
var time = state.BaseRequest.StartTimeTicks ?? 0;
var maxTime = state.RunTimeTicks ?? 0;
var seekParam = string.Empty;
if (time > 0)
{
// For direct streaming/remuxing, we seek at the exact position of the keyframe
// However, ffmpeg will seek to previous keyframe when the exact time is the input
// Workaround this by adding 0.5s offset to the seeking time to get the exact keyframe on most videos.
// This will help subtitle syncing.
var isHlsRemuxing = state.IsVideoRequest && state.TranscodingType is TranscodingJobType.Hls && IsCopyCodec(state.OutputVideoCodec);
var seekTick = isHlsRemuxing ? time + 5000000L : time;
// Seeking beyond EOF makes no sense in transcoding. Clamp the seekTick value to
// [0, RuntimeTicks - 5.0s], so that the muxer gets packets and avoid error codes.
if (maxTime > 0)
{
seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0));
}
seekParam += string.Format(CultureInfo.InvariantCulture, "-ss {0}", _mediaEncoder.GetTimeParameter(seekTick));
if (state.IsVideoRequest)
{
// Add -noaccurate_seek for specific conditions
if (!string.Equals(state.InputContainer, "wtv", StringComparison.OrdinalIgnoreCase)
&& !string.Equals(segmentFormat, "ts", StringComparison.OrdinalIgnoreCase)
&& state.TranscodingType != TranscodingJobType.Progressive
&& !state.EnableBreakOnNonKeyFrames(outputVideoCodec)
&& (state.BaseRequest.StartTimeTicks ?? 0) > 0)
{
seekParam += " -noaccurate_seek";
}
}
}
return seekParam;
}
```
### 2. Seeking Types
#### A. **Input Seeking** (`-ss` before input)
- **Purpose**: Seek to position before decoding starts
- **Advantages**: Very fast, minimal CPU usage
- **Disadvantages**: Less accurate, seeks to nearest keyframe
- **Usage**: Primary method for initial positioning
#### B. **Output Seeking** (`-ss` after input)
- **Purpose**: Decode from beginning, then seek in output
- **Advantages**: Frame-accurate positioning
- **Disadvantages**: High CPU usage, slower startup
- **Usage**: When precision is critical
#### C. **Accurate vs Fast Seeking**
- **Fast seeking** (`-noaccurate_seek`): Seeks to nearest keyframe (default)
- **Accurate seeking**: Frame-precise but slower
- **Dynamic selection**: Based on container and transcoding type
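The distinction between the two modes is simply where `-ss` sits relative to `-i` in the argument list. A hedged sketch (codec flags and paths are placeholders):
```typescript
// Illustration of input vs. output seeking: only the position of -ss relative
// to -i changes. Codec settings and paths are placeholders.
const seekSeconds = 90;

// A. Input seeking: fast, keyframe-aligned (seek happens before decoding).
const inputSeekArgs = [
  '-ss', String(seekSeconds),
  '-i', 'input.mp4',
  '-c:v', 'libx264', '-c:a', 'aac',
  'output.m3u8',
];

// B. Output seeking: frame-accurate, but decodes from the start of the file.
const outputSeekArgs = [
  '-i', 'input.mp4',
  '-ss', String(seekSeconds),
  '-c:v', 'libx264', '-c:a', 'aac',
  'output.m3u8',
];
```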
### 3. HLS Segment-Based Seeking
#### **Segment URL Structure**
```
/hls/{itemId}/{playlistId}/{segmentId}.{container}?runtimeTicks={position}&actualSegmentLengthTicks={duration}
```
#### **Key Parameters**:
- **`runtimeTicks`**: Starting position of segment in media timeline
- **`actualSegmentLengthTicks`**: Precise duration of this specific segment
- **`segmentId`**: Sequential segment number (0-based)
#### **Segment Calculation**:
```csharp
// Equal-length segments
var segmentLengthTicks = TimeSpan.FromSeconds(segmentLength).Ticks;
var wholeSegments = runtimeTicks / segmentLengthTicks;
var remainingTicks = runtimeTicks % segmentLengthTicks;
// Keyframe-aware segments (optimal)
var result = new List<double>();
var desiredSegmentLengthTicks = TimeSpan.FromMilliseconds(desiredSegmentLengthMs).Ticks;
foreach (var keyframe in keyframeData.KeyframeTicks)
{
if (keyframe >= desiredCutTime)
{
var currentSegmentLength = keyframe - lastKeyframe;
result.Add(TimeSpan.FromTicks(currentSegmentLength).TotalSeconds);
lastKeyframe = keyframe;
desiredCutTime += desiredSegmentLengthTicks;
}
}
```
### 4. Seek Optimizations
#### **HLS Remuxing Offset**
```csharp
// Add 0.5s offset for HLS remuxing to hit exact keyframes
var isHlsRemuxing = state.IsVideoRequest && state.TranscodingType is TranscodingJobType.Hls && IsCopyCodec(state.OutputVideoCodec);
var seekTick = isHlsRemuxing ? time + 5000000L : time; // +0.5s
```
#### **EOF Protection**
```csharp
// Prevent seeking beyond end of file
if (maxTime > 0)
{
seekTick = Math.Clamp(seekTick, 0, Math.Max(maxTime - 50000000L, 0)); // -5s buffer
}
```
#### **Container-Specific Rules**
- **WTV containers**: Never use `-noaccurate_seek` (breaks seeking)
- **MPEGTS segments**: Disable `-noaccurate_seek` for client compatibility
- **fMP4 containers**: Require `-noaccurate_seek` for audio sync
### 5. Keyframe Extraction
**Purpose**: Generate precise segment boundaries aligned with keyframes
#### **FFprobe Method**:
```bash
ffprobe -loglevel error -skip_frame nokey -select_streams v:0 -show_entries packet=pts_time,flags -of csv=print_section=0 "input.mp4"
```
#### **Matroska Method**:
- Direct cue point extraction from container metadata
- Much faster than FFprobe for MKV files
- Reads cue tables for instant keyframe positions
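The FFprobe command prints one CSV line per packet (`pts_time,flags`, e.g. `12.512000,K__`), where a leading `K` marks a keyframe. A small parser sketch, using the 10,000,000-ticks-per-second convention from the rest of this document:
```typescript
// Parses ffprobe CSV output ("pts_time,flags" per line) into keyframe positions
// expressed in ticks (1 second = 10,000,000 ticks).
function parseKeyframeTicks(ffprobeCsv: string): number[] {
  return ffprobeCsv
    .split('\n')
    .map((line) => line.trim().split(','))
    .filter((fields) => fields.length >= 2 && fields[1].startsWith('K'))
    .map((fields) => Math.round(parseFloat(fields[0]) * 10_000_000));
}
```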
### 6. Real-Time Seeking Scenarios
#### **Progressive Transcoding**
```typescript
// Client seeks to new position
const seekTo = async (positionSeconds: number) => {
// Kill current transcoding job
await fetch(`/api/transcode/kill/${playSessionId}`, { method: 'POST' });
// Start new transcoding from seek position
const startTimeTicks = positionSeconds * 10000000; // Convert to ticks
const newUrl = `/api/videos/${itemId}/stream?startTimeTicks=${startTimeTicks}`;
// Update video source
videoElement.src = newUrl;
};
```
#### **HLS Seeking**
```typescript
// HLS seeking is handled by the player automatically
// Server generates segments with proper seek points
const hlsPlayer = new Hls();
hlsPlayer.loadSource(`/api/videos/${itemId}/master.m3u8`); // itemId supplied by the caller
// Player handles seeking by requesting appropriate segments
// No need to restart transcoding jobs
```
### 7. External Media Handling
#### **External Subtitles**
```csharp
// Also seek external subtitle streams
var seekSubParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekSubParam))
{
arg.Append(' ').Append(seekSubParam);
}
arg.Append(" -i file:\"").Append(subtitlePath).Append('"');
```
#### **External Audio**
```csharp
// Seek external audio streams to match video
var seekAudioParam = GetFastSeekCommandLineParameter(state, options, segmentContainer);
if (!string.IsNullOrEmpty(seekAudioParam))
{
arg.Append(' ').Append(seekAudioParam);
}
arg.Append(" -i \"").Append(state.AudioStream.Path).Append('"');
```
### 8. Implementation Guide for Next.js
#### **Progressive Seeking**
```typescript
class ProgressiveTranscoder {
async seekTo(positionTicks: number): Promise<string> {
// Kill existing job
await this.killCurrentJob();
// Calculate seek parameters
const seekSeconds = positionTicks / 10000000;
const maxTime = this.mediaDuration;
// Clamp to safe bounds
const safeSeekTicks = Math.max(0, Math.min(positionTicks, maxTime - 50000000));
// Build FFmpeg command with seek
const args = [
'-ss', this.formatTime(safeSeekTicks),
'-i', this.inputPath,
'-c:v', 'libx264',
'-preset', 'veryfast',
// ... other encoding params
this.outputPath
];
return this.startTranscoding(args);
}
private formatTime(ticks: number): string {
const totalSeconds = ticks / 10000000;
const hours = Math.floor(totalSeconds / 3600);
const minutes = Math.floor((totalSeconds % 3600) / 60);
const seconds = totalSeconds % 60;
return `${hours}:${minutes.toString().padStart(2, '0')}:${seconds.toFixed(6).padStart(9, '0')}`;
}
}
```
#### **HLS Seeking**
```typescript
class HLSTranscoder {
generateSegmentUrl(segmentId: number, runtimeTicks: number, segmentDurationTicks: number): string {
const params = new URLSearchParams({
runtimeTicks: runtimeTicks.toString(),
actualSegmentLengthTicks: segmentDurationTicks.toString()
});
return `/api/hls/${this.itemId}/${this.playlistId}/${segmentId}.ts?${params}`;
}
async generateSegment(segmentId: number, runtimeTicks: number): Promise<Buffer> {
const seekSeconds = runtimeTicks / 10000000;
const args = [
'-ss', this.formatTime(runtimeTicks),
'-i', this.inputPath,
'-t', this.segmentDuration.toString(),
'-c:v', 'libx264',
'-preset', 'veryfast',
'-force_key_frames', `expr:gte(t,n_forced*${this.segmentDuration})`,
'-f', 'mpegts',
'-'
];
return this.executeFFmpeg(args);
}
}
```
### 9. Performance Considerations
#### **Seek Performance Tips**:
1. **Use input seeking** (`-ss` before `-i`) when possible
2. **Cache keyframe data** for containers that support it
3. **Implement seek debouncing** to prevent rapid job restarts
4. **Use appropriate segment duration** (6s recommended for seek performance)
5. **Pre-generate keyframe indexes** for frequently accessed content
#### **Client-Side Optimizations**:
```typescript
// Debounce seeking to prevent excessive requests
const debouncedSeek = debounce((position: number) => {
this.performSeek(position);
}, 300);
// Progressive seeking strategy
if (Math.abs(targetPosition - currentPosition) < 30) {
// Small seeks: let player buffer naturally
player.currentTime = targetPosition;
} else {
// Large seeks: restart transcoding
this.seekTo(targetPosition);
}
```
## Progress Tracking and Seeking During Transcoding
**The Challenge**: Unlike direct play where duration and seek positions are straightforward, transcoding creates a "streaming-like" scenario where the real duration is not immediately available and progress tracking becomes complex.
### 1. Core Progress Tracking Architecture
**Key Components**:
- **`TranscodingPositionTicks`**: Where FFmpeg transcoding has currently reached
- **`DownloadPositionTicks`**: Where the client has consumed content to
- **`CompletionPercentage`**: Calculated progress based on runtime vs current position
- **`RunTimeTicks`**: Total media duration from metadata
#### **Progress Calculation Logic**:
```csharp
// From JobLogger.ParseLogLine() - extracts progress from FFmpeg output
var totalMs = state.RunTimeTicks.HasValue
? TimeSpan.FromTicks(state.RunTimeTicks.Value).TotalMilliseconds
: 0;
var currentMs = /* parsed from FFmpeg time output */;
if (totalMs > 0)
{
percent = 100.0 * currentMs / totalMs;
transcodingPosition = TimeSpan.FromMilliseconds(currentMs);
}
```
### 2. Real-Time Progress Updates
#### **FFmpeg Output Parsing**:
```csharp
// JobLogger monitors FFmpeg stderr output for progress
private void ParseLogLine(string line, EncodingJobInfo state)
{
// Parse: frame= 123 fps= 25 q=28.0 size= 1024kB time=00:01:23.45 bitrate= 512.0kbits/s
// Extract: time value for current transcoding position
// Extract: size value for bytes transcoded
// Extract: bitrate for current encoding rate
}
```
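A hedged TypeScript equivalent of that parsing step: it pulls the `time=` field out of a stderr line in the format shown in the comments above and converts it to milliseconds.
```typescript
// Extracts the current position from an FFmpeg stderr progress line such as:
//   frame=  123 fps= 25 q=28.0 size=    1024kB time=00:01:23.45 bitrate= 512.0kbits/s
// Returns milliseconds, or null if the line carries no time= field.
function parseFfmpegTimeMs(line: string): number | null {
  const match = line.match(/time=(\d+):(\d{2}):(\d{2}(?:\.\d+)?)/);
  if (!match) return null;
  const [, hours, minutes, seconds] = match;
  return Math.round(
    (Number(hours) * 3600 + Number(minutes) * 60 + Number(seconds)) * 1000,
  );
}

// With RunTimeTicks known: percent = 100 * currentMs / (runTimeTicks / 10_000)
```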
#### **Progress Reporting Chain**:
```typescript
// 1. FFmpeg outputs progress to stderr
// 2. JobLogger.ParseLogLine() extracts values
// 3. TranscodeManager.ReportTranscodingProgress() updates job state
// 4. SessionManager.ReportTranscodingInfo() updates client session
// 5. TranscodingInfo DTO sent to client via WebSocket/API
interface TranscodingInfo {
CompletionPercentage?: number; // 0-100 progress
Bitrate?: number; // Current bitrate
Framerate?: number; // Current FPS
Width?: number; // Video width
Height?: number; // Video height
AudioCodec: string; // Audio codec in use
VideoCodec: string; // Video codec in use
Container: string; // Output container
}
```
### 3. Client Progress Bar Implementation
#### **Progressive Transcoding**:
```typescript
class ProgressiveTranscodingProgress {
private transcodingInfo: TranscodingInfo;
private mediaRunTimeTicks: number;
updateProgressBar(): void {
if (this.transcodingInfo?.CompletionPercentage) {
// Use transcoding percentage directly
const progress = this.transcodingInfo.CompletionPercentage / 100;
this.progressBar.value = progress;
// Estimate current playable duration
const availableDuration = this.mediaRunTimeTicks * progress;
this.updateSeekableRange(0, availableDuration);
}
}
handleSeek(targetPositionTicks: number): void {
const transcodedTicks = this.mediaRunTimeTicks * (this.transcodingInfo.CompletionPercentage / 100);
if (targetPositionTicks <= transcodedTicks) {
// Seek within transcoded content
this.player.currentTime = targetPositionTicks / 10000000;
} else {
// Restart transcoding from seek position
this.startTranscodingFromPosition(targetPositionTicks);
}
}
}
```
#### **HLS Transcoding**:
```typescript
class HLSTranscodingProgress {
private segmentDuration: number = 6; // seconds
private totalSegments: number;
calculateProgress(): ProgressInfo {
// HLS progress based on segment availability
const availableSegments = this.getAvailableSegmentCount();
const progress = availableSegments / this.totalSegments;
return {
percentage: progress * 100,
availableDuration: availableSegments * this.segmentDuration,
seekableEnd: availableSegments * this.segmentDuration
};
}
updateSegmentDownloadPosition(): void {
// Update DownloadPositionTicks when segments are consumed
const segmentEndTicks = this.currentRuntimeTicks + this.actualSegmentLengthTicks;
this.transcodingJob.DownloadPositionTicks = Math.max(
this.transcodingJob.DownloadPositionTicks ?? segmentEndTicks,
segmentEndTicks
);
}
}
```
### 4. Solving the "Real Duration is Not Real" Problem
#### **Duration Estimation Strategies**:
**1. Metadata-Based Duration**:
```typescript
// Use media file metadata as baseline
const estimatedDuration = mediaSource.RunTimeTicks;
if (estimatedDuration) {
this.totalDuration = estimatedDuration;
this.progressBar.max = estimatedDuration;
}
```
**2. Progressive Duration Discovery**:
```typescript
// Update duration as transcoding progresses
if (transcodingInfo.CompletionPercentage > 0) {
const currentTranscodedTicks = /* current position from transcoding */;
const estimatedTotal = currentTranscodedTicks / (transcodingInfo.CompletionPercentage / 100);
// Only update if estimate seems reliable (>10% transcoded)
if (transcodingInfo.CompletionPercentage > 10) {
this.totalDuration = estimatedTotal;
}
}
```
**3. HLS Segment-Based Calculation**:
```typescript
// For HLS, calculate from segment structure
const calculateHLSDuration = (segments: SegmentInfo[]): number => {
return segments.reduce((total, segment) => {
return total + segment.actualSegmentLengthTicks;
}, 0);
};
```
### 5. Advanced Progress Management
#### **Buffering and Availability**:
```typescript
class TranscodingBuffer {
private bufferAheadSeconds: number = 30;
isPositionAvailable(targetPositionTicks: number): boolean {
const transcodedTicks = this.getTranscodedPositionTicks();
return targetPositionTicks <= transcodedTicks;
}
calculateSeekableRange(): { start: number; end: number } {
return {
start: 0,
end: this.getTranscodedPositionTicks() - (this.bufferAheadSeconds * 10000000)
};
}
shouldThrottleTranscoding(): boolean {
const gap = this.transcodingPositionTicks - this.downloadPositionTicks;
const targetGap = this.bufferAheadSeconds * 10000000; // 30s in ticks
return gap > targetGap;
}
}
```
#### **Smooth Progress Updates**:
```typescript
class SmoothProgressUpdater {
private interpolationInterval: number = 1000; // 1 second
private lastKnownPosition: number = 0;
private lastUpdateTime: number = Date.now();
interpolateProgress(): number {
if (!this.isPlaying) return this.lastKnownPosition;
const now = Date.now();
const elapsed = now - this.lastUpdateTime;
const estimatedProgress = this.lastKnownPosition + elapsed * 10000; // elapsed ms -> ticks
// Don't exceed known transcoded position
const maxAvailable = this.getTranscodedPositionTicks();
return Math.min(estimatedProgress, maxAvailable);
}
}
```
### 6. Error Handling and Edge Cases
#### **Transcoding Failures**:
```typescript
class TranscodingErrorHandler {
handleTranscodingError(error: TranscodingError): void {
switch (error.type) {
case 'SEEK_BEYOND_DURATION':
// Clamp seek to valid range
this.seekTo(Math.min(this.targetPosition, this.maxAvailablePosition));
break;
case 'TRANSCODING_STALLED':
// Restart transcoding
this.restartTranscodingFromLastKnownPosition();
break;
case 'INVALID_DURATION':
// Fall back to live estimation
this.enableLiveDurationEstimation();
break;
}
}
}
```
#### **Network Issues**:
```typescript
class NetworkResilienceHandler {
private retryPolicy = {
maxRetries: 3,
backoffMs: [1000, 2000, 4000]
};
async handleProgressUpdateFailure(attempt: number): Promise<void> {
if (attempt < this.retryPolicy.maxRetries) {
await this.delay(this.retryPolicy.backoffMs[attempt]);
return this.fetchProgressUpdate();
} else {
// Switch to local time-based estimation
this.enableLocalProgressEstimation();
}
}
}
```
### 7. Next.js Implementation Guide
#### **Complete Progress Management**:
```typescript
class NextJSTranscodingProgressManager {
private wsConnection: WebSocket;
private progressUpdateInterval: NodeJS.Timeout;
constructor(private videoElement: HTMLVideoElement, wsUrl: string) {
this.wsConnection = new WebSocket(wsUrl); // server pushes TranscodingInfo updates
this.setupWebSocketUpdates();
this.setupProgressInterpolation();
}
private setupWebSocketUpdates(): void {
this.wsConnection.onmessage = (event) => {
const message = JSON.parse(event.data);
if (message.MessageType === 'TranscodingInfo') {
this.updateTranscodingInfo(message.Data);
}
};
}
private updateTranscodingInfo(info: TranscodingInfo): void {
// Update progress bar
if (info.CompletionPercentage) {
this.progressBar.value = info.CompletionPercentage;
}
// Update seekable range
const seekableEnd = this.calculateSeekableEnd(info);
this.videoElement.setAttribute('data-seekable-end', seekableEnd.toString());
// Update duration if we have better estimate
this.updateDurationEstimate(info);
}
async handleSeek(targetSeconds: number): Promise<void> {
const targetTicks = targetSeconds * 10000000;
const transcodedTicks = this.getTranscodedPositionTicks();
if (targetTicks <= transcodedTicks) {
// Seek within available content
this.videoElement.currentTime = targetSeconds;
} else {
// Show loading state
this.showBufferingState();
// Request new transcoding position
await this.requestTranscodingFromPosition(targetTicks);
// Update video source
this.videoElement.src = this.generateStreamUrl(targetTicks);
}
}
}
```
## Important Configuration
### Environment Variables
```bash
FFMPEG_PATH=/usr/bin/ffmpeg
TRANSCODING_TEMP_PATH=/tmp/jellyfin/transcoding
MAX_CONCURRENT_STREAMS=3
SEGMENT_KEEP_SECONDS=300
THROTTLE_DELAY_SECONDS=60
```
### Performance Tuning
- **Thread count**: Auto-detect based on CPU cores
- **Buffer sizes**: Adjust based on available memory
- **Segment duration**: 6 seconds for good seek performance
- **Concurrent streams**: Limit based on system resources
### Security Considerations
- **Input validation**: Sanitize all file paths and parameters
- **Resource limits**: Prevent DOS through excessive transcoding
- **Access control**: Validate session ownership
- **File cleanup**: Remove orphaned files regularly
## Monitoring and Logging
### Key Metrics to Track
- Active transcoding jobs count
- Resource usage (CPU, memory, disk I/O)
- Average transcoding speed vs playback speed
- Client ping frequency and timeouts
- Segment cleanup efficiency
### Error Handling
- FFmpeg process failures
- Disk space exhaustion
- Network timeouts
- Invalid media files
- Hardware acceleration failures
This architecture provides a robust foundation for building a media transcoding system with proper resource management, client lifecycle handling, and performance optimization.