Playlist Monitor Service - Architecture & Design Document

Executive Summary

This document outlines the architecture for a Playlist Monitor Service that extends MeTube's capabilities by adding automated playlist monitoring, periodic checking, and intelligent download management. The service will monitor YouTube playlists, track which videos have been downloaded, and automatically download new videos using MeTube as the download engine.


1. Current MeTube Capabilities (Base Service)

1.1 Core Features

  • Video Download Engine: Uses yt-dlp to download videos from YouTube and 100+ sites
  • REST API:
    • POST /add - Add download with parameters (url, quality, format, folder, etc.)
    • POST /delete - Cancel/clear downloads
    • POST /start - Start pending downloads
    • GET /history - Get download history
  • WebSocket Events: Real-time updates (added, updated, completed, canceled, cleared)
  • Queue Management:
    • Sequential, concurrent, or limited concurrent download modes
    • Persistent queue storage using shelve
    • Pending, active, and completed download tracking
  • Cookie Support: Authentication via cookie files stored in STATE_DIR/cookies/
  • Playlist Support: Can download entire playlists with item limits
  • Custom Output Templates: Flexible file naming and directory structure

1.2 Key Technical Components

  • Backend: Python 3.13, aiohttp, python-socketio, yt-dlp
  • Frontend: Angular, TypeScript
  • Storage: Shelve-based persistent queues
  • Download Info Structure:
    DownloadInfo(id, title, url, quality, format, folder, 
                 custom_name_prefix, error, entry, playlist_item_limit)
    

1.3 API Integration Points

# Add download to MeTube
POST /add
{
  "url": "https://youtube.com/watch?v=...",
  "quality": "best",
  "format": "mp4",
  "folder": "playlist_name",
  "auto_start": true
}

2. New Playlist Monitor Service - Architecture

2.1 Service Overview

The Playlist Monitor Service is a separate microservice that:

  1. Manages playlist subscriptions
  2. Periodically checks playlists for new videos
  3. Tracks download status of each video
  4. Delegates actual downloads to MeTube
  5. Maintains persistent state across restarts

2.2 High-Level Architecture

┌─────────────────────────────────────────────────────────────┐
│                    User Interface (Web)                      │
│  - Add/Remove Playlists                                     │
│  - Configure Check Intervals                                │
│  - Set Start Points                                         │
│  - View Video Status                                        │
│  - Manual Re-download                                       │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│           Playlist Monitor Service (REST API)                │
│  ┌──────────────────┐  ┌─────────────────┐                 │
│  │ Playlist Manager │  │ Video Tracker   │                 │
│  └──────────────────┘  └─────────────────┘                 │
│  ┌──────────────────┐  ┌─────────────────┐                 │
│  │ Scheduler Engine │  │ State Manager   │                 │
│  └──────────────────┘  └─────────────────┘                 │
└────────────────────┬────────────────────┬───────────────────┘
                     │                    │
                     ▼                    ▼
         ┌───────────────────┐  ┌──────────────────┐
         │  MeTube Service   │  │ Database/Storage │
         │  (Download Engine)│  │ (SQLite/Shelve)  │
         └───────────────────┘  └──────────────────┘
                     │
                     ▼
         ┌───────────────────┐
         │  Download Files   │
         │  (Filesystem)     │
         └───────────────────┘

2.3 Component Breakdown

2.3.1 Playlist Manager

  • Responsibilities:

    • Add/remove/update playlist subscriptions
    • Configure playlist-specific settings (check interval, quality, format, folder)
    • Fetch playlist metadata using yt-dlp
    • Extract video list from playlists
  • Data Model:

    from dataclasses import dataclass
    from datetime import datetime
    
    @dataclass
    class PlaylistSubscription:
        id: str                    # Unique playlist ID
        url: str                   # Playlist URL
        title: str                 # Playlist title
        check_interval: int        # Minutes between checks
        last_checked: datetime     # Last check timestamp
        start_point: str           # Video ID/index to start from
        quality: str               # Download quality (best, 1080, etc.)
        format: str                # Download format (mp4, any, etc.)
        folder: str                # Download folder
        enabled: bool              # Active/paused
        created_at: datetime
        updated_at: datetime
    

2.3.2 Video Tracker

  • Responsibilities:

    • Track status of each video in each playlist
    • Persist video metadata and download status
    • Handle file movement tracking (detached from filesystem)
    • Prevent re-downloads unless explicitly requested
  • Data Model:

    from dataclasses import dataclass
    from datetime import datetime
    from enum import Enum
    
    # Defined before VideoRecord because the class annotation refers to it
    class VideoStatus(str, Enum):
        PENDING = "PENDING"           # Not yet downloaded
        DOWNLOADING = "DOWNLOADING"   # Currently being downloaded
        COMPLETED = "COMPLETED"       # Successfully downloaded
        FAILED = "FAILED"             # Download failed
        SKIPPED = "SKIPPED"           # Before start_point or manually skipped
    
    @dataclass
    class VideoRecord:
        id: str                    # Unique video ID
        playlist_id: str           # Foreign key to playlist
        video_url: str             # Direct video URL
        video_id: str              # YouTube video ID
        title: str                 # Video title
        playlist_index: int        # Position in playlist
        upload_date: datetime      # Video upload date
    
        # Download tracking
        status: VideoStatus        # PENDING, DOWNLOADING, COMPLETED, FAILED, SKIPPED
        download_requested_at: datetime
        download_completed_at: datetime
        metube_download_id: str    # Reference to MeTube download
    
        # File tracking (decoupled from actual file)
        original_filename: str     # Filename when downloaded
        file_moved: bool           # Whether user moved the file
        file_location_note: str    # Optional note about file location
    
        # Error handling
        error_message: str
        retry_count: int
        last_error_at: datetime
    
        created_at: datetime
        updated_at: datetime
    

2.3.3 Scheduler Engine

  • Responsibilities:

    • Periodic task execution (check playlists)
    • Background job queue management
    • Rate limiting and retry logic
    • Health monitoring
  • Implementation Options:

    • Option A: APScheduler (Python async scheduler)
    • Option B: Celery + Redis (production-grade)
    • Recommended: APScheduler for simplicity
  • Tasks:

    # Scheduled Tasks
    1. check_playlist(playlist_id)     # Check single playlist for new videos
    2. check_all_playlists()           # Check all enabled playlists
    3. retry_failed_downloads()        # Retry failed downloads
    4. cleanup_old_records()           # Archive old data
    5. sync_metube_status()            # Sync status from MeTube
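As an illustration of the `check_all_playlists()` task, here is a minimal sketch of how the service might decide which subscriptions are due for a check. It assumes `check_interval` is in minutes, as in the PlaylistSubscription data model. The `Subscription` class and the `due_playlists` helper are hypothetical names used only for this example; they are not part of MeTube or APScheduler.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Subscription:
    """Hypothetical, trimmed-down stand-in for PlaylistSubscription."""
    id: str
    check_interval: int                # minutes between checks
    last_checked: Optional[datetime]   # None if the playlist was never checked
    enabled: bool = True


def due_playlists(subs, now):
    """Return the ids of enabled subscriptions whose check interval has elapsed."""
    due = []
    for sub in subs:
        if not sub.enabled:
            continue  # paused playlists are never checked
        if sub.last_checked is None:
            due.append(sub.id)  # never checked: check right away
        elif now - sub.last_checked >= timedelta(minutes=sub.check_interval):
            due.append(sub.id)
    return due


now = datetime(2025, 1, 19, 12, 0)
subs = [
    Subscription("a", 60, datetime(2025, 1, 19, 10, 30)),        # checked 90 min ago
    Subscription("b", 60, datetime(2025, 1, 19, 11, 30)),        # checked 30 min ago
    Subscription("c", 30, None),                                 # never checked
    Subscription("d", 30, datetime(2025, 1, 19, 9, 0), False),   # paused
]
print(due_playlists(subs, now))  # ['a', 'c']
```

With per-playlist scheduler jobs this selection would be implicit; the explicit form is mainly useful if a single periodic job sweeps all playlists.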
    

2.3.4 State Manager

  • Responsibilities:

    • Persistent storage of playlists and videos
    • Database migrations
    • Data backup and recovery
    • Import/export functionality
  • Storage Options:

    • Option A: SQLite (recommended for simplicity)
    • Option B: PostgreSQL (for scalability)
    • Option C: Shelve (consistency with MeTube)
  • Recommended: SQLite with SQLAlchemy ORM

2.3.5 MeTube Client

  • Responsibilities:

    • HTTP client to communicate with MeTube API
    • WebSocket client to receive real-time updates
    • Status synchronization
    • Error handling and retry logic
  • Implementation:

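    import aiohttp
    import socketio
    
    
    class MeTubeClient:
        def __init__(self, base_url: str):
            self.base_url = base_url.rstrip("/")
            # Created lazily so the session is bound to the running event loop
            self.session: aiohttp.ClientSession | None = None
    
        async def _get_session(self) -> aiohttp.ClientSession:
            if self.session is None or self.session.closed:
                self.session = aiohttp.ClientSession()
            return self.session
    
        async def add_download(self, url, quality, format, folder):
            session = await self._get_session()
            payload = {
                "url": url,
                "quality": quality,
                "format": format,
                "folder": folder,
                "auto_start": True,
            }
            async with session.post(f"{self.base_url}/add", json=payload) as response:
                response.raise_for_status()
                # content_type=None: parse the body as JSON whatever Content-Type is declared
                return await response.json(content_type=None)
    
        async def get_history(self):
            session = await self._get_session()
            async with session.get(f"{self.base_url}/history") as response:
                response.raise_for_status()
                return await response.json(content_type=None)
    
        async def listen_to_events(self, callback):
            # Socket.IO connection to receive real-time updates
            sio = socketio.AsyncClient()
    
            @sio.on('completed')
            async def on_completed(data):
                await callback('completed', data)
    
            @sio.on('updated')
            async def on_updated(data):
                await callback('updated', data)
    
            await sio.connect(self.base_url)
            # Keep this coroutine alive until the connection ends
            await sio.wait()
    
        async def close(self):
            if self.session is not None and not self.session.closed:
                await self.session.close()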
    

3. Core Workflows

3.1 Add Playlist Workflow

User → Add Playlist (URL, settings)
  ↓
Validate URL & extract playlist info via yt-dlp
  ↓
Create PlaylistSubscription record
  ↓
Fetch current video list
  ↓
For each video:
  - Create VideoRecord
  - If before start_point: status = SKIPPED
  - If at or after start_point: status = PENDING
  ↓
Schedule periodic check job
  ↓
Return playlist info to user
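The status assignment in this workflow can be sketched as a small pure function. This is an illustration under stated assumptions, not the definitive rule: the start point is given as a video ID, the start-point video itself is downloaded, and the list is in the order the playlist reports it (which for some playlists is newest first). The name `initial_statuses` is hypothetical.

```python
def initial_statuses(video_ids, start_point=None):
    """Map each video id (in playlist order) to "SKIPPED" or "PENDING".

    Assumption: the start point is a video id and is itself downloaded.
    """
    if start_point is None:
        # No start point configured: everything is eligible for download
        return {vid: "PENDING" for vid in video_ids}
    if start_point not in video_ids:
        raise ValueError(f"start_point {start_point!r} is not in the playlist")
    start_index = video_ids.index(start_point)
    return {
        vid: "SKIPPED" if index < start_index else "PENDING"
        for index, vid in enumerate(video_ids)
    }


statuses = initial_statuses(["v1", "v2", "v3", "v4"], start_point="v3")
print(statuses)
# {'v1': 'SKIPPED', 'v2': 'SKIPPED', 'v3': 'PENDING', 'v4': 'PENDING'}
```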

3.2 Periodic Check Workflow

Scheduler → Trigger check_playlist(playlist_id)
  ↓
Fetch latest video list from YouTube (via yt-dlp)
  ↓
Compare with existing VideoRecords
  ↓
For each new video:
  - If at or after start_point: create VideoRecord with status=PENDING
  - Otherwise: create VideoRecord with status=SKIPPED
  ↓
For each PENDING video (respecting order):
  - Send download request to MeTube
  - Update status to DOWNLOADING
  - Store metube_download_id
  ↓
Update playlist.last_checked timestamp
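The "compare with existing VideoRecords" step amounts to a set difference that preserves playlist order. A minimal sketch, assuming videos are identified by their YouTube video ID; `find_new_videos` is a hypothetical helper name:

```python
def find_new_videos(fetched_ids, known_ids):
    """Return ids that appear in the fetched playlist but are not yet tracked.

    Playlist order is preserved and duplicates in the fetched list are dropped.
    """
    known = set(known_ids)
    seen = set()
    new = []
    for vid in fetched_ids:
        if vid not in known and vid not in seen:
            new.append(vid)
            seen.add(vid)
    return new


fetched = ["v1", "v2", "v3", "v4", "v4"]   # latest list from yt-dlp
known = ["v1", "v2"]                       # ids already stored as VideoRecords
print(find_new_videos(fetched, known))     # ['v3', 'v4']
```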

3.3 Download Completion Workflow

MeTube → WebSocket event: 'completed'
  ↓
Extract video URL from event
  ↓
Find VideoRecord by metube_download_id or video_url
  ↓
Update VideoRecord:
  - status = COMPLETED
  - download_completed_at = now()
  - original_filename = filename from MeTube
  ↓
Log completion
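A sketch of the matching and update steps above, using plain dicts shaped like VideoRecord. The event keys `id`, `url`, and `filename` are assumptions made for illustration; the actual fields of MeTube's `completed` payload should be checked against a running instance. `apply_completed_event` is a hypothetical name.

```python
from datetime import datetime, timezone


def apply_completed_event(records, event, now=None):
    """Mark the matching record COMPLETED and return it, or None if nothing matches.

    Matching tries metube_download_id first and falls back to video_url.
    The event keys used here are assumed, not a documented contract.
    """
    now = now or datetime.now(timezone.utc)
    match = None
    event_id = event.get("id")
    if event_id:
        match = next(
            (r for r in records if r.get("metube_download_id") == event_id), None
        )
    if match is None:
        match = next(
            (r for r in records if r["video_url"] == event.get("url")), None
        )
    if match is None:
        return None
    match["status"] = "COMPLETED"
    match["download_completed_at"] = now
    match["original_filename"] = event.get("filename")
    return match


records = [
    {"video_url": "https://youtube.com/watch?v=a", "metube_download_id": "m1",
     "status": "DOWNLOADING"},
    {"video_url": "https://youtube.com/watch?v=b", "metube_download_id": None,
     "status": "DOWNLOADING"},
]
done = apply_completed_event(
    records, {"url": "https://youtube.com/watch?v=b", "filename": "b.mp4"}
)
print(done["status"], done["original_filename"])  # COMPLETED b.mp4
```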

3.4 Manual Re-download Workflow

User → Request re-download for video_id
  ↓
Find VideoRecord
  ↓
Reset status to PENDING
  ↓
Clear error fields
  ↓
Send download request to MeTube
  ↓
Update status to DOWNLOADING
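The reset portion of this workflow could look like the sketch below, again on a dict shaped like VideoRecord. Two details are assumptions that go beyond the workflow text: a video that is currently DOWNLOADING is rejected, and the stale `metube_download_id` is cleared along with the error fields. `reset_for_redownload` is a hypothetical name.

```python
def reset_for_redownload(record):
    """Reset a VideoRecord-shaped dict so it can be sent to MeTube again."""
    if record["status"] == "DOWNLOADING":
        # Assumption: do not queue a second download while one is in flight
        raise ValueError("download already in progress")
    record["status"] = "PENDING"
    # Clear the error fields from the data model
    record["error_message"] = None
    record["retry_count"] = 0
    record["last_error_at"] = None
    # Assumption: the old MeTube reference no longer applies
    record["metube_download_id"] = None
    return record


failed = {
    "status": "FAILED",
    "error_message": "HTTP Error 403",
    "retry_count": 3,
    "last_error_at": "2025-01-19T10:00:00",
    "metube_download_id": "m1",
}
print(reset_for_redownload(failed)["status"])  # PENDING
```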

3.5 File Movement Handling

User moves file manually (outside app)
  ↓
User marks video as "file moved" in UI
  ↓
Update VideoRecord:
  - file_moved = True
  - file_location_note = optional note
  ↓
Status remains COMPLETED (prevents re-download)

4. API Design

4.1 Playlist Endpoints

# List all playlists
GET /api/playlists
Response: { playlists: [PlaylistSubscription, ...] }

# Get single playlist with videos
GET /api/playlists/{playlist_id}
Response: { 
  playlist: PlaylistSubscription,
  videos: [VideoRecord, ...],
  stats: { total, pending, completed, failed }
}

# Add new playlist
POST /api/playlists
Body: {
  url: string,
  check_interval: int (default: 60),
  start_point: string (video_id or index),
  quality: string (default: "best"),
  format: string (default: "mp4"),
  folder: string
}
Response: PlaylistSubscription

# Update playlist
PUT /api/playlists/{playlist_id}
Body: { check_interval, quality, format, enabled, ... }
Response: PlaylistSubscription

# Delete playlist
DELETE /api/playlists/{playlist_id}?delete_videos=true
Response: { status: "ok" }

# Trigger manual check
POST /api/playlists/{playlist_id}/check
Response: { new_videos: int, status: "ok" }

# Update start point
POST /api/playlists/{playlist_id}/start-point
Body: { video_id: string }
Response: { updated_videos: int }
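The `stats` object in the `GET /api/playlists/{playlist_id}` response could be derived from the playlist's video records roughly as follows. This is a sketch over dicts; in the service itself the same numbers would more likely come from a grouped SQL query. `playlist_stats` is a hypothetical name.

```python
from collections import Counter


def playlist_stats(videos):
    """Count videos by status for the stats block of the playlist response."""
    counts = Counter(video["status"] for video in videos)
    return {
        "total": len(videos),
        "pending": counts["PENDING"],      # Counter returns 0 for missing keys
        "completed": counts["COMPLETED"],
        "failed": counts["FAILED"],
    }


videos = [
    {"status": "COMPLETED"},
    {"status": "COMPLETED"},
    {"status": "PENDING"},
    {"status": "SKIPPED"},
]
print(playlist_stats(videos))
# {'total': 4, 'pending': 1, 'completed': 2, 'failed': 0}
```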

4.2 Video Endpoints

# List videos for playlist
GET /api/playlists/{playlist_id}/videos?status=PENDING&limit=50&offset=0
Response: { videos: [VideoRecord, ...], total: int }

# Get single video
GET /api/videos/{video_id}
Response: VideoRecord

# Request download/re-download
POST /api/videos/{video_id}/download
Response: { status: "ok", metube_id: string }

# Mark file as moved
POST /api/videos/{video_id}/file-moved
Body: { location_note: string }
Response: VideoRecord

# Skip video (mark as SKIPPED)
POST /api/videos/{video_id}/skip
Response: VideoRecord

# Reset video (back to PENDING)
POST /api/videos/{video_id}/reset
Response: VideoRecord

4.3 System Endpoints

# Get system status
GET /api/status
Response: {
  total_playlists: int,
  active_playlists: int,
  total_videos: int,
  pending_downloads: int,
  active_downloads: int,
  completed_downloads: int,
  failed_downloads: int,
  metube_status: { connected: bool, version: string }
}

# Get scheduler status
GET /api/scheduler/status
Response: {
  jobs: [{ id, playlist_id, next_run, ... }],
  running: bool
}

# Trigger manual sync with MeTube
POST /api/sync-metube
Response: { synced_videos: int, status: "ok" }

5. Database Schema (SQLite)

-- Playlists
CREATE TABLE playlists (
    id TEXT PRIMARY KEY,
    url TEXT NOT NULL UNIQUE,
    title TEXT,
    check_interval INTEGER DEFAULT 60,
    last_checked TIMESTAMP,
    start_point TEXT,
    quality TEXT DEFAULT 'best',
    format TEXT DEFAULT 'mp4',
    folder TEXT,
    enabled BOOLEAN DEFAULT 1,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Videos
CREATE TABLE videos (
    id TEXT PRIMARY KEY,
    playlist_id TEXT NOT NULL,
    video_url TEXT NOT NULL,
    video_id TEXT NOT NULL,
    title TEXT,
    playlist_index INTEGER,
    upload_date TIMESTAMP,
    
    status TEXT DEFAULT 'PENDING',
    download_requested_at TIMESTAMP,
    download_completed_at TIMESTAMP,
    metube_download_id TEXT,
    
    original_filename TEXT,
    file_moved BOOLEAN DEFAULT 0,
    file_location_note TEXT,
    
    error_message TEXT,
    retry_count INTEGER DEFAULT 0,
    last_error_at TIMESTAMP,
    
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (playlist_id) REFERENCES playlists(id) ON DELETE CASCADE
);

-- Indexes
CREATE INDEX idx_videos_playlist_id ON videos(playlist_id);
CREATE INDEX idx_videos_status ON videos(status);
CREATE INDEX idx_videos_video_id ON videos(video_id);
CREATE INDEX idx_videos_playlist_index ON videos(playlist_id, playlist_index);

-- Activity Log (optional, for audit trail)
CREATE TABLE activity_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    event_type TEXT,  -- playlist_added, video_downloaded, check_completed, etc.
    playlist_id TEXT,
    video_id TEXT,
    details TEXT,  -- JSON blob
    FOREIGN KEY (playlist_id) REFERENCES playlists(id) ON DELETE CASCADE,
    FOREIGN KEY (video_id) REFERENCES videos(id) ON DELETE CASCADE
);

CREATE INDEX idx_activity_log_timestamp ON activity_log(timestamp);
CREATE INDEX idx_activity_log_event_type ON activity_log(event_type);
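One practical note on the `ON DELETE CASCADE` clauses: SQLite typically ships with foreign key enforcement disabled, so each connection has to enable it before the cascade takes effect. The sketch below demonstrates this with Python's standard `sqlite3` module on a trimmed-down version of the schema (only the columns needed for the demonstration are kept).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign key enforcement is per connection and usually off by default
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE playlists (
    id TEXT PRIMARY KEY,
    url TEXT NOT NULL UNIQUE
);
CREATE TABLE videos (
    id TEXT PRIMARY KEY,
    playlist_id TEXT NOT NULL,
    status TEXT DEFAULT 'PENDING',
    FOREIGN KEY (playlist_id) REFERENCES playlists(id) ON DELETE CASCADE
);
""")

conn.execute(
    "INSERT INTO playlists (id, url) VALUES (?, ?)",
    ("p1", "https://example.com/playlist"),
)
conn.executemany(
    "INSERT INTO videos (id, playlist_id) VALUES (?, ?)",
    [("v1", "p1"), ("v2", "p1")],
)

before = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
conn.execute("DELETE FROM playlists WHERE id = ?", ("p1",))
after = conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]
print(before, after)  # 2 0
```

With SQLAlchemy, the same pragma is usually issued from a connection event listener so that every pooled connection gets it.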

6. Technology Stack

6.1 Backend

  • Framework: FastAPI (Python 3.13+)

    • Modern async support
    • Auto-generated OpenAPI docs
    • WebSocket support
    • Dependency injection
  • Database: SQLAlchemy + SQLite

    • ORM for easy data modeling
    • Migration support with Alembic
  • Scheduler: APScheduler

    • Async job scheduling
    • Persistent job storage
    • Cron-like intervals
  • MeTube Integration:

    • aiohttp (HTTP client)
    • python-socketio (WebSocket client)
  • Dependencies:

    fastapi
    uvicorn[standard]
    sqlalchemy
    alembic
    apscheduler
    aiohttp
    python-socketio[client]
    yt-dlp
    pydantic
    python-dotenv
    

6.2 Frontend (Optional - can reuse MeTube's Angular)

  • Option A: Extend MeTube's Angular UI

    • Add new routes/components for playlists
    • Integrate with existing UI
  • Option B: Separate Vue.js/React SPA

    • Independent frontend
    • Modern UI framework
  • Recommended: Extend MeTube's Angular for consistency

6.3 Deployment

  • Docker Compose:
    services:
      metube:
        image: ghcr.io/alexta69/metube
        ports: ["8081:8081"]
        volumes:
          - ./downloads:/downloads
    
      playlist-monitor:
        build: ./playlist-monitor
        ports: ["8082:8082"]
        environment:
          - METUBE_URL=http://metube:8081
          - DATABASE_URL=sqlite:////data/playlists.db  # four slashes: absolute path inside the /data volume
          - CHECK_INTERVAL=60
        volumes:
          - ./monitor-data:/data
        depends_on:
          - metube
    

7. Configuration

# config.yaml
metube:
  url: http://localhost:8081
  reconnect_interval: 5  # seconds

database:
  url: sqlite:///data/playlists.db
  echo: false

scheduler:
  enabled: true
  default_check_interval: 60  # minutes
  max_concurrent_downloads: 3
  retry_failed_after: 24  # hours

downloads:
  default_quality: best
  default_format: mp4
  default_folder: playlists/{playlist_title}

logging:
  level: INFO
  file: logs/playlist-monitor.log

server:
  host: 0.0.0.0
  port: 8082
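The Docker Compose file in section 6.3 passes `METUBE_URL`, `DATABASE_URL`, and `CHECK_INTERVAL` as environment variables, while this file holds the defaults. A minimal sketch of how the two could be combined, with the environment taking precedence; the `Settings` class and `load_settings` function are hypothetical, and only three of the settings are modeled:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical subset of the configuration, with config.yaml defaults."""
    metube_url: str = "http://localhost:8081"
    database_url: str = "sqlite:///data/playlists.db"
    check_interval: int = 60  # minutes


def load_settings(env=None):
    """Build Settings from defaults, overridden by environment variables."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        metube_url=env.get("METUBE_URL", defaults.metube_url),
        database_url=env.get("DATABASE_URL", defaults.database_url),
        check_interval=int(env.get("CHECK_INTERVAL", defaults.check_interval)),
    )


settings = load_settings({"METUBE_URL": "http://metube:8081", "CHECK_INTERVAL": "30"})
print(settings.metube_url, settings.check_interval)  # http://metube:8081 30
```

Passing the environment as a mapping keeps the function easy to test without touching `os.environ`.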

8. Implementation Plan (Phases)

Phase 1: Core Infrastructure (Week 1-2)

  • Set up FastAPI project structure
  • Database schema and SQLAlchemy models
  • Basic CRUD operations for playlists
  • MeTube client implementation
  • Configuration management

Phase 2: Playlist Management (Week 2-3)

  • Add playlist endpoint (with yt-dlp integration)
  • Fetch and parse playlist videos
  • Implement start_point logic
  • Basic video tracking

Phase 3: Scheduler & Automation (Week 3-4)

  • APScheduler integration
  • Periodic check implementation
  • Download triggering logic
  • Status synchronization with MeTube

Phase 4: Advanced Features (Week 4-5)

  • File movement tracking
  • Manual re-download
  • Error handling and retry logic
  • Activity logging

Phase 5: UI Integration (Week 5-6)

  • REST API documentation (OpenAPI)
  • Angular components (if extending MeTube UI)
  • Playlist listing and details view
  • Video status dashboard

Phase 6: Testing & Deployment (Week 6-7)

  • Unit tests
  • Integration tests
  • Docker containerization
  • Documentation and user guide

9. Key Design Decisions

9.1 Why Separate Service?

  • Modularity: Keep concerns separated
  • Independence: Can run without modifying MeTube core
  • Scalability: Can scale independently
  • Maintenance: Easier to update/maintain

9.2 Why SQLite?

  • Simplicity: No external DB server needed
  • Portability: Single file database
  • Sufficient: Adequate for single-user/small-team use
  • Upgrade Path: Can migrate to PostgreSQL if needed

9.3 File Movement Tracking

  • Decoupled Design: Don't check filesystem, rely on metadata
  • User Control: User explicitly marks files as moved
  • Prevents Re-downloads: Once marked completed, won't re-download
  • Flexibility: Files can be moved/organized freely

9.4 Start Point Implementation

  • Options:
    • Video ID (specific video)
    • Playlist index (position number)
    • Upload date (date cutoff)
  • Recommended: Video ID (most reliable)
  • Behavior: Videos before start point marked as SKIPPED

10. Future Enhancements

10.1 Advanced Features

  • Smart Download Scheduling: Download during off-peak hours
  • Bandwidth Management: Rate limiting per playlist
  • Multi-platform Support: Support non-YouTube playlists
  • Notification System: Email/webhook on new videos
  • Archive Mode: Download only specific date ranges
  • Duplicate Detection: Prevent duplicate downloads across playlists

10.2 UI Enhancements

  • Timeline View: Visual timeline of uploads
  • Bulk Operations: Batch skip/reset/download
  • Statistics Dashboard: Charts and graphs
  • Search & Filters: Advanced video filtering

10.3 Integration Features

  • Plex/Jellyfin Integration: Auto-update media libraries
  • RSS Feed: Generate RSS feeds for playlists
  • API Webhooks: Notify external systems
  • Cloud Sync: Backup/sync across instances

11. Security Considerations

11.1 Authentication

  • Implement API key or JWT authentication
  • Integrate with MeTube's auth (if available)
  • Rate limiting to prevent abuse
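
For the API key option, the comparison itself is worth getting right: a constant-time comparison avoids leaking how many leading characters matched. A minimal sketch using the standard library; `is_authorized` is a hypothetical helper, and how the key reaches the service (header name, storage) is left open here.

```python
import hmac


def is_authorized(provided_key, expected_key):
    """Return True only if a non-empty provided key equals the expected key."""
    if not provided_key or not expected_key:
        return False  # a missing key on either side is never accepted
    # compare_digest takes time independent of where the inputs differ
    return hmac.compare_digest(provided_key.encode(), expected_key.encode())


print(is_authorized("s3cret", "s3cret"))  # True
print(is_authorized("guess", "s3cret"))   # False
print(is_authorized(None, "s3cret"))      # False
```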

11.2 Data Privacy

  • Encrypt sensitive data (cookies, tokens)
  • Regular security audits
  • HTTPS only in production

11.3 Resource Management

  • Limit number of playlists per instance
  • Limit number of videos per playlist
  • Disk space monitoring

12. Monitoring & Observability

12.1 Metrics

  • Playlist check frequency
  • Download success/failure rates
  • MeTube API response times
  • Database query performance

12.2 Logging

  • Structured logging (JSON)
  • Log levels: DEBUG, INFO, WARNING, ERROR
  • Log rotation and retention
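
Structured JSON logging can be done with the standard `logging` module alone. The sketch below is one possible formatter; the convention of passing extra fields as `extra={"context": {...}}` and the logger name `playlist_monitor` are assumptions made for this example.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: structured fields arrive via extra={"context": {...}}
        context = getattr(record, "context", None)
        if context:
            payload.update(context)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("playlist_monitor")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("check completed", extra={"context": {"playlist_id": "p1", "new_videos": 2}})
```

Rotation and retention could then be handled by swapping the stream handler for `logging.handlers.RotatingFileHandler` or `TimedRotatingFileHandler`.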

12.3 Health Checks

  • Database connectivity
  • MeTube connectivity
  • Scheduler status
  • Disk space availability

13. Summary

This architecture provides a robust, scalable, and maintainable solution for automated playlist monitoring that:

  • Integrates with MeTube via REST API and WebSocket
  • Tracks video status independently of the filesystem
  • Prevents re-downloads even when files are moved
  • Supports flexible start points for granular control
  • Provides periodic scheduling with configurable intervals
  • Maintains persistent state across restarts
  • Offers a comprehensive API for programmatic control
  • Scales gracefully with proper database design

The service is designed to be deployed alongside MeTube as a companion microservice, leveraging MeTube's powerful download capabilities while adding intelligent playlist monitoring on top.


Appendix A: Example Use Cases

Use Case 1: YouTube Channel Monitoring

Scenario: Monitor a YouTube channel for new uploads
Setup:

  • Add channel's "Uploads" playlist
  • Set start_point to latest video
  • Check interval: 30 minutes
  • All new videos downloaded automatically

Use Case 2: Partial Playlist Download

Scenario: Download only recent videos from a large playlist
Setup:

  • Add playlist URL
  • Set start_point to video from 6 months ago
  • Older videos marked as SKIPPED
  • New videos downloaded as they appear

Use Case 3: Archive & Organize

Scenario: Download and organize by playlist
Setup:

  • Multiple playlists with custom folders
  • Quality: best
  • Format: mp4
  • After download, move files to NAS
  • Mark as "file moved" to prevent re-download

Appendix B: API Examples

Adding a Playlist

curl -X POST http://localhost:8082/api/playlists \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/playlist?list=PLxxxxxx",
    "check_interval": 60,
    "start_point": "dQw4w9WgXcQ",
    "quality": "1080",
    "format": "mp4",
    "folder": "my-playlist"
  }'

Listing Videos

curl "http://localhost:8082/api/playlists/{playlist_id}/videos?status=PENDING&limit=20"

Marking File as Moved

curl -X POST http://localhost:8082/api/videos/{video_id}/file-moved \
  -H "Content-Type: application/json" \
  -d '{
    "location_note": "Moved to /mnt/nas/videos/"
  }'

Document Version: 1.0
Last Updated: 2025-01-19
Author: AI Architecture Team
Status: Draft for Review