The Anatomy of a .ts File and Why Your Merged HLS Chunks are Failing

You've encountered a common and technically nuanced problem in the world of video streaming. Your intuition is correct: the simple act of merging individual HTTP Live Streaming (HLS) .ts (Transport Stream) files, as a download manager like IDM likely does, and then trying to play them back as a single HLS stream, is indeed breaking fundamental aspects of the .ts file protocol. This leads to the parsing errors you're seeing in hls.js.

Here is a detailed breakdown of the .ts file protocol, why your current approach is failing, and how to correctly handle this situation for your personal web video site.

Understanding the MPEG Transport Stream (.ts) Protocol

At its core, a .ts file is a container format designed for transmitting and storing audio, video, and data in a way that is resilient to errors. It's the standard for broadcast systems like DVB and ATSC, and a cornerstone of HLS. The key to understanding your issue lies in the structure of these files: they are composed of a series of 188-byte packets.

Each of these small packets contains a header and a payload. The header is crucial as it contains vital information for the video player to correctly decode and synchronize the audio and video streams. The most important elements for your case are:

  • Sync Byte: Every packet starts with a synchronization byte (0x47), which allows the player to find the beginning of a packet in the stream.
  • Packet Identifier (PID): This is a 13-bit number that identifies the type of data the packet's payload contains. For example, there will be a specific PID for the video stream, another for the audio stream, and others for metadata like the Program Map Table (PMT) which tells the player which PIDs belong to which streams.
  • Continuity Counter: This is a 4-bit counter that increments by one, wrapping from 15 back to 0, for each payload-carrying packet belonging to the same PID. If a player sees a jump in this counter (e.g., it goes from 2 to 4), it assumes a packet has been lost and will try to compensate, which can lead to glitches.
  • Timestamps (PTS and DTS): The PES headers carried in the payload of some packets contain Presentation Timestamps (PTS) and Decoding Timestamps (DTS). These are 33-bit values on a 90 kHz clock that tell the player exactly when to decode and present a video frame or audio sample. In HLS, these timestamps are typically continuous across segments.
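
To make the header fields above concrete, here is a minimal sketch that decodes the fixed 4-byte header of a single packet. The names TsHeader and parseTsHeader are illustrative and do not come from hls.js or any other library; the bit layout follows the standard Transport Stream packet header.

```typescript
// Sketch only: TsHeader and parseTsHeader are illustrative names, not a library API.
const TS_SYNC_BYTE = 0x47;

interface TsHeader {
  payloadUnitStart: boolean;   // a new PES packet or PSI section starts in this packet
  pid: number;                 // 13-bit Packet Identifier
  hasAdaptationField: boolean;
  hasPayload: boolean;
  continuityCounter: number;   // 4-bit value, wraps from 15 back to 0
}

function parseTsHeader(packet: Uint8Array): TsHeader {
  if (packet.byteLength < 4) throw new Error("need at least the 4 header bytes");
  // Read the header as one big-endian 32-bit word:
  // sync(8) | TEI(1) PUSI(1) priority(1) PID(13) | scrambling(2) adaptation(2) counter(4)
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const word = view.getUint32(0);
  if (word >>> 24 !== TS_SYNC_BYTE) throw new Error("missing sync byte 0x47");
  return {
    payloadUnitStart: ((word >>> 22) & 1) === 1,
    pid: (word >>> 8) & 0x1fff,
    hasAdaptationField: (word & 0x20) !== 0,
    hasPayload: (word & 0x10) !== 0,
    continuityCounter: word & 0x0f,
  };
}
```

For the header bytes 47 41 00 1A, this yields PID 256, a payload-only packet that starts a new unit, and continuity counter 10.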

Why Merging .ts Files with IDM Breaks HLS Playback

When a website serves a video via HLS, it provides a manifest file (usually with an .m3u8 extension) that lists the individual .ts segments in the correct order. Each of these segments is a self-contained, playable piece of the video, but they are designed to be played sequentially by an HLS-aware player.

When a download manager like IDM downloads these .ts chunks and merges them, it often performs a simple file concatenation, essentially pasting the files together one after another. This crude merging process can break the delicate structure of the Transport Stream in several critical ways from the perspective of an HLS player like hls.js:

  1. Discontinuous Timestamps: Each .ts segment carries PTS and DTS values taken from its own position in the overall timeline of the video. When the source stream is well formed and every segment is present and in order, those values run on smoothly across the joins (for example, 0 to 6 seconds, then 6 to 12 seconds). The merged file breaks when that assumption fails: the source playlist marked a discontinuity (an ad break or encoder restart, signalled with EXT-X-DISCONTINUITY), a segment was skipped, or segments were written out of order. Concatenation bakes the resulting jump into a single file, and the playlist tag that told the player to expect it is gone. FFmpeg reports this condition as "non-monotonous DTS", and it is a significant issue that can cause players, including hls.js, to fail.

  2. Reset Continuity Counters: Some packagers restart the continuity counter for each PID at the beginning of every .ts segment. When such chunks are concatenated, the player sees each restart as a gap, as if packets had been lost, leading to continuity counter errors and possibly playback failure.

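  3. Redundant Metadata: Each .ts segment usually begins with its own copy of metadata tables like the PAT and PMT. Repeated tables are normal in a Transport Stream, but if a later copy differs from the first (for example, different PIDs or codecs after a discontinuity), the change arrives mid-file without warning and can confuse the player.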

Your experience of the first chunk playing and then encountering an error is a classic symptom of this problem. Hls.js loads the initial part of the merged file, plays it, but as it continues to buffer and parse the data, it hits the point where the first segment was joined with the second. At this boundary, it encounters the timestamp and continuity counter discontinuities and throws a parsing error because the stream is no longer valid from its perspective.
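
One way to confirm that a merged file has this problem is to walk its packets and compare each continuity counter with the previous one for the same PID. The sketch below is a simplified checker under stated assumptions: findContinuityErrors is a hypothetical helper, the input is assumed to start on a packet boundary, and the discontinuity_indicator flag in the adaptation field is ignored.

```typescript
// Sketch only: a simplified continuity check, not how hls.js implements it.
interface ContinuityError {
  packetIndex: number; // index of the offending 188-byte packet
  pid: number;
  expected: number;
  found: number;
}

function findContinuityErrors(data: Uint8Array): ContinuityError[] {
  const PACKET = 188;
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const lastCounter = new Map<number, number>(); // PID -> last counter seen
  const errors: ContinuityError[] = [];

  for (let offset = 0, index = 0; offset + PACKET <= data.byteLength; offset += PACKET, index++) {
    const word = view.getUint32(offset);
    if (word >>> 24 !== 0x47) throw new Error(`lost sync at byte ${offset}`);
    const pid = (word >>> 8) & 0x1fff;
    const hasPayload = (word & 0x10) !== 0;
    // Null packets (PID 0x1fff) and payload-less packets do not advance the counter.
    if (pid === 0x1fff || !hasPayload) continue;

    const found = word & 0x0f;
    const last = lastCounter.get(pid);
    if (last !== undefined) {
      const expected = (last + 1) & 0x0f;
      // A single repeated counter is a legal duplicate packet, so it is not flagged.
      if (found !== expected && found !== last) {
        errors.push({ packetIndex: index, pid, expected, found });
      }
    }
    lastCounter.set(pid, found);
  }
  return errors;
}
```

An empty result does not prove the file is healthy, since timestamp jumps are a separate problem, but a cluster of errors at regular intervals is a strong hint that segment boundaries were glued together.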

The Correct Approach: Re-segmenting with FFmpeg

Your idea to virtually split the large .ts file is on the right track, but a simple byte-range split won't work because it doesn't respect the packet structure and will likely create fragmented, unplayable chunks.

To properly serve this downloaded video on your personal HLS site, you need to re-process the merged .ts file to create new, valid HLS segments with a corresponding .m3u8 playlist. The industry-standard tool for this is FFmpeg.

FFmpeg can take your single large .ts file as input and correctly segment it for HLS. It will re-mux the file, which means it will repackage the audio and video streams into new .ts segments, generating new, continuous timestamps and correct continuity counters in the process.

Here is a basic FFmpeg command to achieve this:

ffmpeg -i your_single_large_file.ts -c:v copy -c:a copy -f hls -hls_time 10 -hls_list_size 0 -hls_segment_filename "segment%03d.ts" playlist.m3u8

Let's break down this command:

  • -i your_single_large_file.ts: Specifies your large, merged .ts file as the input.
  • -c:v copy -c:a copy: This tells FFmpeg to copy the video and audio streams without re-encoding them, which is fast and preserves the original quality.
  • -f hls: Specifies that the output format should be HLS.
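  • -hls_time 10: This sets the target duration of each segment in seconds (Apple recommends around 6 seconds). Because the streams are copied rather than re-encoded, FFmpeg can only cut at keyframes, so actual segment lengths will vary around this target.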
  • -hls_list_size 0: This creates a playlist that includes all the segments, making it a complete video-on-demand (VOD) playlist.
  • -hls_segment_filename "segment%03d.ts": This defines the naming pattern for the output .ts files (e.g., segment000.ts, segment001.ts, etc.).
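  • playlist.m3u8: This is the name of the media playlist file that FFmpeg will generate. It lists the segments directly; a master playlist, which points at several quality variants, is not needed for a single-quality stream.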

After running this command, you will have a set of correctly segmented .ts files and a playlist.m3u8 file. You can then upload all of these files to your web server, and point your hls.js player to the playlist.m3u8 file. This will provide hls.js with a valid HLS stream that it can parse and play correctly.

Implementation Plan for NextAV

Overview

Based on the technical analysis above, we need to implement a dynamic HLS segmentation system that:

  1. Detects merged .ts files that need re-segmentation
  2. Uses FFmpeg to create proper HLS segments with continuous timestamps
  3. Manages temporary files with lifecycle management
  4. Serves the generated segments through existing HLS endpoints
  5. Cleans up resources when streaming session ends

Architecture Components

1. TS Segmentation Service (/src/lib/ts-segmentation-service.ts)

Purpose: Core service that handles FFmpeg-based segmentation of merged .ts files

Key Features:

  • Detects if a .ts file needs re-segmentation (check for timestamp discontinuities)
  • Creates temporary directory for each video session
  • Executes FFmpeg command to generate proper HLS segments
  • Returns metadata about generated segments

API:

interface SegmentationSession {
  videoId: number;
  sessionId: string;
  tempDir: string;
  playlistPath: string;
  segmentCount: number;
  totalDuration: number;
  createdAt: Date;
  lastAccessed: Date;
}

// Signatures only: `declare` keeps this listing valid TypeScript without method bodies.
declare class TSSegmentationService {
  createSegmentationSession(videoId: number, videoPath: string): Promise<SegmentationSession>;
  getSession(videoId: number): Promise<SegmentationSession | null>;
  getSegmentPath(videoId: number, segmentIndex: number): Promise<string | null>;
  cleanupSession(videoId: number): Promise<void>;
  cleanupExpiredSessions(): Promise<void>;
}
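
The segmentCount and totalDuration fields could be filled in by reading back the playlist that FFmpeg wrote. The following is a minimal sketch of that idea; summarizePlaylist and PlaylistSummary are hypothetical names, and only the #EXTINF and #EXT-X-ENDLIST tags are interpreted.

```typescript
// Sketch only: derives session metadata from a generated media playlist.
interface PlaylistSummary {
  segmentCount: number;
  totalDuration: number; // seconds, summed from #EXTINF entries
  ended: boolean;        // true once #EXT-X-ENDLIST has been written
}

function summarizePlaylist(m3u8: string): PlaylistSummary {
  let segmentCount = 0;
  let totalDuration = 0;
  let ended = false;

  for (const raw of m3u8.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("#EXTINF:")) {
      // "#EXTINF:6.000000," -> parseFloat stops at the comma
      const seconds = parseFloat(line.slice("#EXTINF:".length));
      if (Number.isFinite(seconds)) {
        segmentCount++;
        totalDuration += seconds;
      }
    } else if (line === "#EXT-X-ENDLIST") {
      ended = true;
    }
  }
  return { segmentCount, totalDuration, ended };
}
```

The ended flag is useful as a completion signal: until FFmpeg writes #EXT-X-ENDLIST, the segmentation job is still running and the counts are provisional.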

2. Session Management (/src/lib/hls-session-manager.ts)

Purpose: Manages lifecycle of HLS segmentation sessions

Key Features:

  • Tracks active segmentation sessions in memory
  • Implements TTL (Time To Live) for sessions
  • Handles cleanup of expired sessions
  • Provides session heartbeat mechanism

Session Lifecycle:

  1. Creation: When first HLS request is made for a .ts file
  2. Active: While segments are being requested
  3. Idle: After last segment request (with TTL timer)
  4. Cleanup: Remove temporary files and session data
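
The lifecycle above can be sketched as a small in-memory tracker. This is an illustration under assumptions, not the planned implementation: the class and method names are hypothetical, sessions are keyed by videoId only, and the clock is injectable so that expiry can be tested without waiting.

```typescript
// Sketch only: in-memory session tracking with an inactivity TTL.
interface TrackedSession {
  videoId: number;
  tempDir: string;
  lastAccessed: number; // epoch milliseconds
}

class HlsSessionManager {
  private readonly sessions = new Map<number, TrackedSession>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Creation: called on the first HLS request for a .ts file.
  create(videoId: number, tempDir: string): TrackedSession {
    const session: TrackedSession = { videoId, tempDir, lastAccessed: this.now() };
    this.sessions.set(videoId, session);
    return session;
  }

  // Heartbeat: every segment request extends the session's life.
  touch(videoId: number): TrackedSession | null {
    const session = this.sessions.get(videoId);
    if (!session) return null;
    session.lastAccessed = this.now();
    return session;
  }

  // Cleanup: drops idle sessions and returns them so the caller can delete tempDir.
  removeExpired(): TrackedSession[] {
    const cutoff = this.now() - this.ttlMs;
    const expired: TrackedSession[] = [];
    this.sessions.forEach((session, videoId) => {
      if (session.lastAccessed < cutoff) {
        this.sessions.delete(videoId);
        expired.push(session);
      }
    });
    return expired;
  }
}
```

Returning the expired sessions instead of deleting files inside the manager keeps filesystem work out of the bookkeeping, which makes both halves easier to test.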

3. Enhanced HLS API Routes

Modified Routes:

  • /api/stream/hls/[id]/playlist.m3u8/route.ts - Check if segmentation needed, create session
  • /api/stream/hls/[id]/segment/[segment]/route.ts - Serve from temp directory or create session

New Route:

  • /api/stream/hls/[id]/cleanup/route.ts - Manual cleanup endpoint

4. Background Cleanup Service (/src/lib/cleanup-scheduler.ts)

Purpose: Periodic cleanup of expired sessions

Features:

  • Runs every 5 minutes to check for expired sessions
  • Configurable TTL (default: 30 minutes of inactivity)
  • Graceful shutdown handling

Implementation Details

FFmpeg Command Strategy

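ffmpeg -i input.ts \
  -c:v copy -c:a copy \
  -f hls \
  -hls_time 6 \
  -hls_list_size 0 \
  -hls_playlist_type vod \
  -hls_segment_filename "segment_%03d.ts" \
  -y playlist.m3u8

Parameters Explained:

  • -hls_time 6: 6-second target segments (Apple recommended); with stream copy, cuts land on keyframes, so lengths vary
  • -hls_list_size 0: Include all segments in playlist
  • -hls_playlist_type vod: Marks the playlist as a complete VOD listing. The live-oriented flags delete_segments and append_list are deliberately left out: the first removes segments that fall out of a sliding playlist, and the second appends to an existing playlist, which would duplicate entries when a session is regenerated
  • -y: Overwrite existing files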
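
In the service, a command like this would more safely be passed to FFmpeg as an argument array than as a shell string, so that file paths with spaces or quotes need no escaping. The sketch below assumes hypothetical helper names (buildHlsArgs, runFfmpeg), writes segments into a per-session output directory, and marks the playlist as VOD with -hls_playlist_type vod, which is a choice made for this sketch.

```typescript
import { spawn } from "node:child_process";
import { join } from "node:path";

// Sketch only: builds the FFmpeg argument list for one segmentation job.
function buildHlsArgs(input: string, outDir: string, segmentSeconds = 6): string[] {
  return [
    "-y",
    "-i", input,
    "-c:v", "copy", "-c:a", "copy",           // remux only, no re-encode
    "-f", "hls",
    "-hls_time", String(segmentSeconds),
    "-hls_list_size", "0",                     // keep every segment in the playlist
    "-hls_playlist_type", "vod",
    "-hls_segment_filename", join(outDir, "segment_%03d.ts"),
    join(outDir, "playlist.m3u8"),
  ];
}

// Runs ffmpeg without a shell; rejects with the tail of stderr on failure.
function runFfmpeg(args: string[]): Promise<void> {
  return new Promise((resolve, reject) => {
    const child = spawn("ffmpeg", args, { stdio: ["ignore", "ignore", "pipe"] });
    let stderrTail = "";
    child.stderr.on("data", (chunk: Buffer) => {
      stderrTail = (stderrTail + chunk.toString()).slice(-4000);
    });
    child.on("error", reject); // e.g. ffmpeg is not installed
    child.on("close", (code) => {
      if (code === 0) resolve();
      else reject(new Error(`ffmpeg exited with code ${code}: ${stderrTail}`));
    });
  });
}
```

A caller would create the output directory first and then await runFfmpeg(buildHlsArgs(videoPath, tempDir)). Keeping the argument builder pure also makes the "FFmpeg command generation" unit test straightforward.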

Directory Structure

/tmp/nextav-hls/
├── video-{videoId}-{sessionId}/
│   ├── playlist.m3u8
│   ├── segment_000.ts
│   ├── segment_001.ts
│   └── ...
└── cleanup.log
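
Because the segment route receives the segment identifier from the URL, the mapping from request to file path should never trust that value as a path component. The following sketch shows one way to resolve paths inside this layout; the function names are hypothetical, and sessionId is assumed to be generated server-side.

```typescript
import { join } from "node:path";

// Sketch only: maps (videoId, sessionId, segment) onto the layout shown above.
function sessionDir(root: string, videoId: number, sessionId: string): string {
  return join(root, `video-${videoId}-${sessionId}`);
}

// Mirrors FFmpeg's "segment_%03d.ts" pattern: zero-padded to at least 3 digits.
function segmentFileName(index: number): string {
  return `segment_${String(index).padStart(3, "0")}.ts`;
}

// Returns null for anything that is not a plain non-negative integer,
// which also rejects traversal attempts such as "../playlist".
function resolveSegmentPath(
  root: string,
  videoId: number,
  sessionId: string,
  segment: string,
): string | null {
  if (!/^\d+$/.test(segment)) return null;
  return join(sessionDir(root, videoId, sessionId), segmentFileName(Number(segment)));
}
```

Rebuilding the file name from a validated number, rather than appending the raw URL parameter, means a crafted request cannot escape the session directory.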

Session TTL Strategy

  1. Creation TTL: 5 minutes to complete initial segmentation
  2. Active TTL: 30 minutes of inactivity before cleanup
  3. Heartbeat: Each segment request extends TTL
  4. Force Cleanup: Manual cleanup API for immediate removal

Error Handling Strategy

Segmentation Failures

  • FFmpeg Error: Log error, fall back to direct streaming
  • Disk Space: Check available space before segmentation
  • Permission Error: Use fallback temp directory

Session Management Failures

  • Session Not Found: Create new session on-demand
  • Corrupted Temp Files: Clean up and regenerate
  • Concurrent Access: Use file locking for session creation

Performance Optimizations

Caching Strategy

  • Session Cache: Keep session metadata in memory
  • Segment Cache: HTTP cache headers for segments (1 hour)
  • Playlist Cache: Short cache for playlist (30 seconds)
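
As a sketch of how these cache lifetimes might be applied, the helper below returns response headers for each resource type. The function name is hypothetical; the MIME types are the ones conventionally used for HLS playlists and MPEG-TS segments.

```typescript
// Sketch only: response headers per HLS resource type.
type HlsResource = "playlist" | "segment";

function hlsResponseHeaders(kind: HlsResource): Record<string, string> {
  return kind === "playlist"
    ? { "Content-Type": "application/vnd.apple.mpegurl", "Cache-Control": "max-age=30" }
    : { "Content-Type": "video/mp2t", "Cache-Control": "max-age=3600" };
}
```

In a route handler this could be passed directly as the headers of the response, for example new Response(body, { headers: hlsResponseHeaders("segment") }).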

Resource Management

  • Concurrent Segmentation: Limit to 2 simultaneous FFmpeg processes
  • Disk Space Monitoring: Prevent segmentation if disk space < 1GB
  • Memory Management: Use streams for large file operations
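
The limit on simultaneous FFmpeg processes can be enforced with a small in-process queue. The sketch below is one possible shape, assuming a single Node.js process; JobLimiter is a hypothetical name. A finishing job hands its slot directly to the next waiter, so the active count never exceeds the limit even if new jobs arrive at the same moment.

```typescript
// Sketch only: caps how many async jobs run at once within one process.
class JobLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  get activeCount(): number { return this.active; }
  get queuedCount(): number { return this.waiting.length; }

  async run<T>(job: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Wait until a finishing job hands over its slot (active stays unchanged).
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await job();
    } finally {
      const next = this.waiting.shift();
      if (next) next();      // pass this slot straight to the next waiter
      else this.active--;    // nobody waiting: free the slot
    }
  }
}
```

With const limiter = new JobLimiter(2), each segmentation would be wrapped as limiter.run(() => segmentVideo(path)), where segmentVideo stands in for whatever function launches FFmpeg.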

API Integration Points

Video Format Detector

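// Enhanced detection for merged .ts files
import { execFile } from "node:child_process";
import { promisify } from "node:util";
async function detectTSSegmentationNeeded(videoPath: string): Promise<boolean> {
  // Use ffprobe to list the DTS (in seconds) of every video packet; long files produce a lot of output
  const { stdout } = await promisify(execFile)("ffprobe", ["-v", "error", "-select_streams", "v:0",
    "-show_entries", "packet=dts_time", "-of", "csv=p=0", videoPath], { maxBuffer: 256 * 1024 * 1024 });
  const dts = stdout.split("\n").map(parseFloat).filter(Number.isFinite);
  // Return true if re-segmentation is needed, i.e. some DTS steps backwards (assumed heuristic)
  return dts.some((t, i) => i > 0 && t < dts[i - 1]);
}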

HLS Route Updates

// In playlist.m3u8 route
if (await detectTSSegmentationNeeded(videoPath)) {
  const session = await segmentationService.createSegmentationSession(videoId, videoPath);
  return serveGeneratedPlaylist(session.playlistPath);
} else {
  return serveDirectHLS(videoPath); // Existing virtual segmentation
}

Configuration

interface HLSSegmentationConfig {
  tempDir: string;                    // Default: '/tmp/nextav-hls'
  segmentDuration: number;            // Default: 6 seconds
  sessionTTL: number;                 // Default: 30 minutes
  maxConcurrentJobs: number;          // Default: 2
  minDiskSpace: number;               // Default: 1GB
  cleanupInterval: number;            // Default: 5 minutes
  enableAutoCleanup: boolean;         // Default: true
}

Monitoring and Logging

Metrics to Track

  • Number of active segmentation sessions
  • Segmentation success/failure rates
  • Average segmentation time
  • Disk space usage in temp directories
  • Session cleanup statistics

Log Events

  • Session creation/destruction
  • FFmpeg command execution
  • Cleanup operations
  • Error conditions

Testing Strategy

Unit Tests

  • FFmpeg command generation
  • Session lifecycle management
  • Cleanup scheduling
  • Error handling scenarios

Integration Tests

  • End-to-end HLS playback
  • Concurrent session handling
  • Resource cleanup verification
  • Performance under load

Deployment Considerations

Requirements

  • FFmpeg installed on server
  • Writable temporary directory
  • Sufficient disk space for temporary files
  • Background process for cleanup

Environment Variables

HLS_TEMP_DIR=/tmp/nextav-hls
HLS_SEGMENT_DURATION=6
HLS_SESSION_TTL=1800
HLS_MAX_CONCURRENT_JOBS=2
HLS_CLEANUP_INTERVAL=300
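
These variables could be read into a typed object with defaults, as in the sketch below. It is an assumption of this sketch that the TTL and cleanup interval are expressed in seconds, matching the values 1800 and 300 above; the type and function names are hypothetical. Invalid or missing values fall back to the defaults instead of propagating NaN.

```typescript
// Sketch only: environment-driven configuration with safe fallbacks.
interface HlsEnvConfig {
  tempDir: string;
  segmentDurationSeconds: number;
  sessionTtlSeconds: number;
  maxConcurrentJobs: number;
  cleanupIntervalSeconds: number;
}

// Accepts only finite numbers greater than zero; anything else yields the fallback.
function positiveNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) && value > 0 ? value : fallback;
}

function loadHlsConfig(env: Record<string, string | undefined>): HlsEnvConfig {
  return {
    tempDir: env["HLS_TEMP_DIR"] || "/tmp/nextav-hls",
    segmentDurationSeconds: positiveNumber(env["HLS_SEGMENT_DURATION"], 6),
    sessionTtlSeconds: positiveNumber(env["HLS_SESSION_TTL"], 1800),
    maxConcurrentJobs: Math.floor(positiveNumber(env["HLS_MAX_CONCURRENT_JOBS"], 2)),
    cleanupIntervalSeconds: positiveNumber(env["HLS_CLEANUP_INTERVAL"], 300),
  };
}
```

At startup this would be called once as loadHlsConfig(process.env), and the resulting object passed to the services that need it.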

Migration Path

Phase 1: Core Implementation

  1. Implement TSSegmentationService
  2. Add session management
  3. Update HLS routes
  4. Basic error handling

Phase 2: Production Features

  1. Background cleanup service
  2. Performance optimizations
  3. Monitoring and logging
  4. Configuration management

Phase 3: Advanced Features

  1. Load balancing for multiple FFmpeg processes
  2. Distributed session management
  3. Advanced caching strategies
  4. Real-time monitoring dashboard

This implementation plan provides a robust, scalable solution for handling merged .ts files while maintaining good performance and resource management.