Library Scan Enhancement - Redesign Overview

📋 What Changed

The library scan enhancement has been completely redesigned, from a comprehensive multi-phase feature (estimated at 18-23 hours of implementation) to a focused, pragmatic solution (estimated at 6-8 hours) that addresses only the two core requirements you specified.


🎯 Original vs Redesigned Scope

Original Plan (Removed Features)

The original design included many advanced features that are NOT needed:

  • Real-time progress reporting with WebSocket updates
  • Scan session tracking and history database
  • Concurrent processing with worker threads
  • Incremental scanning (only changed files)
  • Content-based duplicate detection
  • Advanced error recovery mechanisms
  • Soft delete with rollback capability
  • Complex transaction management
  • Performance monitoring and metrics
  • Advanced reporting system
  • Progress UI components
  • New database tables and schema changes

Why removed: These features add significant complexity without addressing the core problems.

Redesigned Plan (Core Features Only)

The new design focuses exclusively on your two requirements:

  1. File Deletion Cleanup

    • Detect database records whose files no longer exist on disk
    • Remove those orphaned database records
    • Log cleanup actions
  2. Thumbnail Recovery

    • Check if thumbnail files exist for each media record
    • Regenerate missing thumbnails
    • Use fallback thumbnails on failure

Why better: Simple, focused, quick to implement, solves the actual problems.
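The orphan detection in feature 1 amounts to a set difference between the paths stored in the database and the paths found on disk. Below is a minimal sketch of that idea, not the project's actual code: the `MediaRecord` shape and the `findOrphanedRecords` name are assumptions made for illustration, and the real schema in scanner.ts may differ.

```typescript
// Hypothetical shape of a media row; the real schema may have more fields.
interface MediaRecord {
  id: number;
  path: string;
}

/**
 * Returns the records whose path is absent from the latest disk listing.
 * The listing is loaded into a Set once so each membership check is O(1).
 */
function findOrphanedRecords(
  records: MediaRecord[],
  filesOnDisk: string[],
): MediaRecord[] {
  const onDisk = new Set(filesOnDisk);
  return records.filter((record) => !onDisk.has(record.path));
}

// Example: b.mp4 was deleted from disk after the previous scan.
const records: MediaRecord[] = [
  { id: 1, path: "/media/a.mp4" },
  { id: 2, path: "/media/b.mp4" },
  { id: 3, path: "/media/c.mp4" },
];
const orphans = findOrphanedRecords(records, ["/media/a.mp4", "/media/c.mp4"]);
console.log(orphans.map((r) => r.path).join(", ")); // /media/b.mp4
```

One edge case worth deciding on before implementing: if the library root is temporarily unavailable (for example, an unmounted drive), the disk listing comes back empty and every record looks orphaned. A guard that skips cleanup when the listing is empty may be a reasonable safeguard.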


📊 Comparison Summary

Aspect              | Original Design                    | Redesigned
Scope               | 7 major features                   | 2 core features
Implementation Time | 18-23 hours                        | 6-8 hours
Code Changes        | Multiple files, new modules        | Single file (scanner.ts)
Database Changes    | New tables, schema updates         | None
Complexity          | High (worker threads, WebSockets)  | Low (simple functions)
Testing             | Comprehensive suite                | Basic manual tests
Documentation       | 4 detailed docs                    | 4 focused docs

🏗️ Technical Approach

Redesigned Architecture

Minimal changes to existing scanner:

// File: src/lib/scanner.ts

// Add 2 helper functions
async function cleanupDeletedFiles(...) { }
async function verifyAndRegenerateThumbnail(...) { }

// Enhance existing scanLibrary function
const scanLibrary = async (library) => {
  // 1. File discovery (existing)
  const mediaFiles = await glob(...);
  
  // 2. Cleanup deleted files (NEW)
  await cleanupDeletedFiles(db, library.id, mediaFiles);
  
  // 3. Process files (existing + enhanced)
  for (const file of mediaFiles) {
    const existing = db.get(file);
    
    if (existing) {
      // Verify thumbnail (NEW)
      await verifyAndRegenerateThumbnail(existing);
    } else {
      // Insert new file (existing)
    }
  }
};

That's it! No worker threads, no WebSockets, no new tables.
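To make the three-step flow concrete, here is a self-contained sketch that can run without a database or filesystem. Everything in it is an assumption for illustration: the in-memory `Map` stands in for the database, `ScanDeps` stands in for the glob and thumbnail calls, and the function is synchronous for brevity, whereas the real scanLibrary is async.

```typescript
// Hypothetical stand-ins; the real scanner.ts uses the project database and disk.
interface MediaRecord {
  path: string;
  thumbnailPath: string;
}

interface ScanDeps {
  listMediaFiles: () => string[]; // stands in for the glob call
  fileExists: (path: string) => boolean; // stands in for a filesystem check
  generateThumbnail: (mediaPath: string) => string; // returns the thumbnail path
}

interface ScanStats {
  processed: number;
  added: number;
  removed: number;
  regenerated: number;
}

function scanLibrary(db: Map<string, MediaRecord>, deps: ScanDeps): ScanStats {
  const stats: ScanStats = { processed: 0, added: 0, removed: 0, regenerated: 0 };

  // 1. File discovery
  const mediaFiles = deps.listMediaFiles();
  const onDisk = new Set(mediaFiles);

  // 2. Cleanup: drop records whose file is no longer on disk
  for (const path of Array.from(db.keys())) {
    if (!onDisk.has(path)) {
      db.delete(path);
      stats.removed++;
    }
  }

  // 3. Process files: verify thumbnails for known files, insert new ones
  for (const file of mediaFiles) {
    stats.processed++;
    const existing = db.get(file);
    if (existing) {
      if (!deps.fileExists(existing.thumbnailPath)) {
        existing.thumbnailPath = deps.generateThumbnail(file);
        stats.regenerated++;
      }
    } else {
      db.set(file, { path: file, thumbnailPath: deps.generateThumbnail(file) });
      stats.added++;
    }
  }
  return stats;
}

// Example: gone.mp4 was deleted, a.mp4 lost its thumbnail, new.mp4 is new.
const db = new Map<string, MediaRecord>([
  ["/media/a.mp4", { path: "/media/a.mp4", thumbnailPath: "/thumbs/a.jpg" }],
  ["/media/gone.mp4", { path: "/media/gone.mp4", thumbnailPath: "/thumbs/gone.jpg" }],
]);
const thumbsOnDisk = new Set<string>(); // empty: a.jpg is missing
const stats = scanLibrary(db, {
  listMediaFiles: () => ["/media/a.mp4", "/media/new.mp4"],
  fileExists: (p) => thumbsOnDisk.has(p),
  generateThumbnail: (p) => {
    const thumb = p.replace("/media/", "/thumbs/").replace(".mp4", ".jpg");
    thumbsOnDisk.add(thumb);
    return thumb;
  },
});
console.log(stats); // processed: 2, added: 1, removed: 1, regenerated: 1
```

Passing the filesystem and thumbnail operations in as functions is only a convenience for this sketch; it lets the flow be exercised without touching disk.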


📝 Documentation Updates

All 4 documentation files have been rewritten:

1. Requirements Document

  • Removed: 5 complex requirements with sub-requirements
  • Kept: 2 core requirements with clear acceptance criteria
  • Added: Non-requirements section (what's explicitly excluded)

2. Architecture Document

  • Removed: Complex multi-component architecture diagrams
  • Kept: Simple enhancement to existing scanner
  • Simplified: No worker pools, no WebSockets, no transactions

3. Implementation Plan

  • Removed: 4 phases over 18-23 hours
  • Kept: 4 simple steps over 6-8 hours
  • Focused: Actual code to add to scanner.ts

4. Summary Document

  • Updated: All metrics and timelines
  • Simplified: Feature comparison table
  • Clarified: Business impact focuses on data integrity

🎯 What You Get

Problem 1 Solution: File Deletion Cleanup

// When you delete files from disk and re-scan:
// Before: Their records stay in the database forever (orphaned records)
// After:  Their records are automatically removed from the database

// Console output:
// ✓ Removed orphaned record: /path/to/deleted/file.mp4
// 📊 Cleanup complete: 5 orphaned record(s) removed

Problem 2 Solution: Thumbnail Recovery

// When thumbnails are missing and you re-scan:
// Before: Thumbnails stay missing forever
// After:  Thumbnails automatically regenerated

// Console output:
// 🔄 Regenerating missing thumbnail for: video.mp4
// ✓ Successfully regenerated thumbnail: video.mp4
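The check, regenerate, and fallback sequence could look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual helper: the synchronous signature, the injected `ThumbnailDeps`, the `FALLBACK_THUMBNAIL` path, and the failure log line are all hypothetical. Only the two success log lines are taken from the console output above.

```typescript
// Hypothetical types and names; the real helper in scanner.ts is async.
interface ThumbRecord {
  path: string;
  thumbnailPath: string;
}

interface ThumbnailDeps {
  fileExists: (path: string) => boolean;
  generateThumbnail: (mediaPath: string, outPath: string) => void; // may throw
}

type ThumbnailStatus = "present" | "regenerated" | "fallback";

const FALLBACK_THUMBNAIL = "/thumbnails/fallback.jpg"; // assumed placeholder path

function verifyThumbnail(
  record: ThumbRecord,
  deps: ThumbnailDeps,
  log: (message: string) => void = (m) => console.log(m),
): ThumbnailStatus {
  // Nothing to do when the thumbnail file is still on disk.
  if (deps.fileExists(record.thumbnailPath)) return "present";

  const name = record.path.split("/").pop() ?? record.path;
  log(`🔄 Regenerating missing thumbnail for: ${name}`);
  try {
    deps.generateThumbnail(record.path, record.thumbnailPath);
    log(`✓ Successfully regenerated thumbnail: ${name}`);
    return "regenerated";
  } catch (error) {
    // Regeneration failed: point the record at the fallback image instead.
    const reason = error instanceof Error ? error.message : String(error);
    record.thumbnailPath = FALLBACK_THUMBNAIL;
    log(`✗ Using fallback thumbnail for ${name}: ${reason}`); // hypothetical log line
    return "fallback";
  }
}

// Example: the thumbnail is missing and regeneration succeeds.
const logs: string[] = [];
const status = verifyThumbnail(
  { path: "/media/video.mp4", thumbnailPath: "/thumbs/video.jpg" },
  { fileExists: () => false, generateThumbnail: () => {} },
  (m) => logs.push(m),
);
console.log(status); // regenerated
```

Returning a status rather than a boolean makes it easy for the caller to count regenerated and fallback cases separately in the scan statistics.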

Bonus: Enhanced Logging

// Scan statistics logged at end:
// 📊 Scan Complete:
//    Files Processed: 150
//    Files Added: 10
//    Files Removed: 5
//    Thumbnails Regenerated: 3
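A small formatter is enough to produce that summary from a set of counters. The sketch below assumes a hypothetical `ScanStats` shape and `formatScanSummary` name; only the label text is taken from the log output above.

```typescript
// Hypothetical counter shape; field names are illustrative.
interface ScanStats {
  processed: number;
  added: number;
  removed: number;
  regenerated: number;
}

/** Builds the end-of-scan summary as a single multi-line string. */
function formatScanSummary(stats: ScanStats): string {
  return [
    "📊 Scan Complete:",
    `   Files Processed: ${stats.processed}`,
    `   Files Added: ${stats.added}`,
    `   Files Removed: ${stats.removed}`,
    `   Thumbnails Regenerated: ${stats.regenerated}`,
  ].join("\n");
}

const summary = formatScanSummary({
  processed: 150,
  added: 10,
  removed: 5,
  regenerated: 3,
});
console.log(summary);
```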

Implementation Steps

Step 1: Add cleanupDeletedFiles() helper function (2-3 hours)
Step 2: Add verifyAndRegenerateThumbnail() helper function (2-3 hours)
Step 3: Enhance scanLibrary() to call these functions (1-2 hours)
Step 4: Test with real library (1 hour)

Total: 6-8 hours (note: the per-step ranges above add up to 6-9 hours)


🧪 Testing

Simple Manual Tests

Test 1: File Deletion

1. Add files to library and scan
2. Delete some files from disk
3. Re-scan
4. Verify: Records for the deleted files are removed from the database ✓

Test 2: Thumbnail Recovery

1. Add files to library and scan
2. Delete thumbnail files
3. Re-scan
4. Verify: Thumbnails regenerated ✓

Test 3: Error Handling

1. Add a corrupt media file to the library
2. Scan
3. Verify: Scan completes despite error ✓
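For Test 3 to pass, a failure on one file must not abort the loop. One common way to get that behavior is a try/catch around each file, as in the sketch below. The `processAll` helper and the `FileOutcome` type are hypothetical names used for illustration, not part of the existing scanner.

```typescript
// Hypothetical result type for a single file.
interface FileOutcome {
  file: string;
  ok: boolean;
  error?: string;
}

/**
 * Runs processFile on every file and records failures instead of rethrowing,
 * so one bad file cannot stop the rest of the scan.
 */
function processAll(
  files: string[],
  processFile: (file: string) => void,
): FileOutcome[] {
  const outcomes: FileOutcome[] = [];
  for (const file of files) {
    try {
      processFile(file);
      outcomes.push({ file, ok: true });
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      outcomes.push({ file, ok: false, error: message });
    }
  }
  return outcomes;
}

// Example: the middle file is corrupt, but the scan still reaches the last one.
const outcomes = processAll(["good.mp4", "corrupt.mp4", "also-good.mp4"], (file) => {
  if (file === "corrupt.mp4") throw new Error("unreadable media header");
});
console.log(outcomes.filter((o) => o.ok).length); // 2
```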

🔍 What's NOT Included

To keep this simple and focused, the following are explicitly excluded:

  • Progress bars or real-time UI updates
  • Scan history or session tracking
  • Performance optimizations (concurrent processing)
  • Incremental scanning (only changed files)
  • Duplicate file detection
  • Advanced error recovery
  • Database transactions
  • Soft delete functionality
  • WebSocket progress updates
  • New API endpoints
  • New database tables

Rationale: These features don't solve your two core problems and would add an estimated 12-15 hours of work.


📁 Documentation Files

All documentation has been rewritten and is ready to use:

  1. LIBRARY_SCAN_ENHANCEMENT_SUMMARY.md
    High-level overview of the redesigned feature

  2. LIBRARY_SCAN_ENHANCEMENT_REQUIREMENTS.md
    Focused requirements for the 2 core features

  3. LIBRARY_SCAN_ENHANCEMENT_ARCHITECTURE.md
    Simple technical design with code examples

  4. LIBRARY_SCAN_ENHANCEMENT_IMPLEMENTATION.md
    Step-by-step implementation guide with actual code


Next Steps

You can now proceed with implementation following the simplified plan:

  1. Read the Implementation Plan
  2. Implement Step 1: Add cleanupDeletedFiles() function
  3. Implement Step 2: Add verifyAndRegenerateThumbnail() function
  4. Implement Step 3: Enhance scanLibrary() function
  5. Test with your media library
  6. Deploy - it's a single file change!

🎉 Benefits of Redesign

Simpler: No complex architecture
Faster: 6-8 hours vs 18-23 hours
Focused: Solves actual problems
Maintainable: Single file change
Testable: Simple manual testing
Practical: No over-engineering


Document Status: Complete
Redesign Date: October 14, 2025
Ready to Implement: Yes

Questions? Review the detailed implementation plan for step-by-step guidance.