nextav/docs/active/library-clusters/LIBRARY_SCAN_ENHANCEMENT_IM...

16 KiB
Raw Permalink Blame History

Library Scan Enhancement Implementation Plan

🚀 Implementation Overview

This document provides a step-by-step implementation guide for the simplified library scan enhancement, focusing on two core features:

  1. File Deletion Cleanup - Remove orphaned database entries
  2. Thumbnail Recovery - Regenerate missing thumbnails

Estimated Total Time: 6-8 hours


📋 Implementation Steps

Step 1: Add Helper Function - File Deletion Cleanup

Estimated Time: 2-3 hours
File: src/lib/scanner.ts

Implementation:

import { promises as fs } from 'fs';

// Add this helper function after imports
async function cleanupDeletedFiles(
  db: DatabaseType,
  libraryId: number,
  currentFiles: string[]
): Promise<{ removed: number }> {
  // Get all media records for this library
  const dbRecords = db.prepare(
    "SELECT id, path FROM media WHERE library_id = ?"
  ).all(libraryId) as { id: number; path: string }[];

  // Create set of current file paths for fast lookup
  const currentFileSet = new Set(currentFiles);

  let removed = 0;

  // Check each database record
  for (const record of dbRecords) {
    // If file doesn't exist in current scan
    if (!currentFileSet.has(record.path)) {
      try {
        // Double-check file truly doesn't exist on disk
        await fs.access(record.path);
        // File exists but wasn't in scan - possibly outside glob pattern
        console.log(`File exists but not scanned: ${record.path}`);
        continue;
      } catch {
        // File doesn't exist - remove from database
        try {
          db.prepare("DELETE FROM media WHERE id = ?").run(record.id);
          console.log(`✓ Removed orphaned record: ${record.path}`);
          removed++;
        } catch (error) {
          console.error(`✗ Failed to remove record ${record.path}:`, error);
        }
      }
    }
  }

  if (removed > 0) {
    console.log(`📊 Cleanup complete: ${removed} orphaned record(s) removed`);
  }

  return { removed };
}

Testing:

// Manual test
// 1. Add files to library and scan
// 2. Delete some files manually
// 3. Re-scan
// 4. Verify records removed from database

Step 2: Add Helper Function - Thumbnail Verification

Estimated Time: 2-3 hours
File: src/lib/scanner.ts

Implementation:

import path from 'path';

// Add helper to convert thumbnail URL to file path
function getThumbnailPathFromUrl(url: string): string {
  // Convert URL like /thumbnails/ab/cd/file.png
  // to full path like /path/to/public/thumbnails/ab/cd/file.png
  return path.join(process.cwd(), 'public', url);
}

// Add thumbnail verification function
async function verifyAndRegenerateThumbnail(
  media: { id: number; path: string; type: string; thumbnail: string }
): Promise<{ regenerated: boolean }> {
  // Skip if using fallback thumbnail
  if (media.thumbnail.includes('/fallback/')) {
    return { regenerated: false };
  }

  // Get full path from URL
  const thumbnailPath = getThumbnailPathFromUrl(media.thumbnail);

  try {
    // Check if thumbnail file exists
    await fs.access(thumbnailPath);
    return { regenerated: false }; // Thumbnail exists, no action needed
  } catch {
    // Thumbnail missing - regenerate
    console.log(`🔄 Regenerating missing thumbnail for: ${path.basename(media.path)}`);

    try {
      const { folderPath, fullPath, url } = ThumbnailManager.getThumbnailPath(media.path);
      ThumbnailManager.ensureDirectory(folderPath);

      let regenerated = false;

      if (media.type === 'video') {
        await generateVideoThumbnail(media.path, fullPath);
        regenerated = true;
      } else if (media.type === 'photo') {
        await generatePhotoThumbnail(media.path, fullPath);
        regenerated = true;
      }

      if (regenerated) {
        // Update database with new thumbnail path
        const db = getDatabase();
        db.prepare("UPDATE media SET thumbnail = ? WHERE id = ?")
          .run(url, media.id);

        console.log(`✓ Successfully regenerated thumbnail: ${path.basename(media.path)}`);
        return { regenerated: true };
      }

      return { regenerated: false };

    } catch (error) {
      console.warn(`✗ Failed to regenerate thumbnail for ${path.basename(media.path)}:`, error);

      // Use fallback thumbnail
      const db = getDatabase();
      const fallbackUrl = ThumbnailManager.getFallbackThumbnailUrl(media.type);
      db.prepare("UPDATE media SET thumbnail = ? WHERE id = ?")
        .run(fallbackUrl, media.id);

      return { regenerated: false };
    }
  }
}

Testing:

// Manual test
// 1. Add files to library and scan
// 2. Delete thumbnail files manually (in public/thumbnails/)
// 3. Re-scan
// 4. Verify thumbnails regenerated

Step 3: Enhance Main Scan Function

Estimated Time: 1-2 hours
File: src/lib/scanner.ts

Modifications to scanLibrary function:

const scanLibrary = async (library: { id: number; path: string }) => {
  const db = getDatabase();
  
  // Initialize statistics tracking
  const stats = {
    filesProcessed: 0,
    filesAdded: 0,
    filesRemoved: 0,
    thumbnailsRegenerated: 0,
    errors: 0
  };
  
  console.log(`\n📚 Starting scan for library: ${library.path}`);
  
  // EXISTING: File discovery
  const allFiles = await glob(`${library.path}/**/*.*`, { nodir: true });
  
  // EXISTING: Filter files by extension
  const filteredVideoFiles = allFiles.filter(file => {
    const ext = path.extname(file).toLowerCase().replace('.', '');
    return VIDEO_EXTENSIONS.includes(ext);
  });
  
  const filteredPhotoFiles = allFiles.filter(file => {
    const ext = path.extname(file).toLowerCase().replace('.', '');
    return PHOTO_EXTENSIONS.includes(ext);
  });
  
  const filteredTextFiles = allFiles.filter(file => {
    const ext = path.extname(file).toLowerCase().replace('.', '');
    return TEXT_EXTENSIONS.includes(ext);
  });

  const mediaFiles = [...filteredVideoFiles, ...filteredPhotoFiles, ...filteredTextFiles];
  
  console.log(`📁 Found ${mediaFiles.length} media files`);
  
  // NEW: Cleanup deleted files
  console.log(`\n🧹 Checking for deleted files...`);
  try {
    const cleanupResult = await cleanupDeletedFiles(db, library.id, mediaFiles);
    stats.filesRemoved = cleanupResult.removed;
  } catch (error) {
    console.error('Error during cleanup:', error);
    stats.errors++;
  }
  
  // EXISTING + ENHANCED: Process each file
  console.log(`\n⚙  Processing files...`);
  for (const file of mediaFiles) {
    stats.filesProcessed++;
    
    try {
      const stats = fs.statSync(file);
      const title = path.basename(file);
      const ext = path.extname(file).toLowerCase();
      const cleanExt = ext.replace('.', '').toLowerCase();
      const isVideo = VIDEO_EXTENSIONS.some(v => v.toLowerCase() === cleanExt);
      const isPhoto = PHOTO_EXTENSIONS.some(p => p.toLowerCase() === cleanExt);
      const isText = TEXT_EXTENSIONS.some(t => t.toLowerCase() === cleanExt);
      
      const mediaType = isVideo ? "video" : isPhoto ? "photo" : "text";
      
      const { folderPath, fullPath, url } = ThumbnailManager.getThumbnailPath(file);
      
      const existingMedia = db.prepare("SELECT * FROM media WHERE path = ?").get(file);
      
      if (existingMedia) {
        // NEW: Verify thumbnail for existing media
        try {
          const thumbResult = await verifyAndRegenerateThumbnail(existingMedia);
          if (thumbResult.regenerated) {
            stats.thumbnailsRegenerated++;
          }
        } catch (error) {
          console.error(`Error verifying thumbnail for ${file}:`, error);
          stats.errors++;
        }
        continue;
      }

      // EXISTING: Process new files (thumbnail generation, insert, etc.)
      ThumbnailManager.ensureDirectory(folderPath);

      let finalThumbnailUrl = url;
      let thumbnailGenerated = false;

      try {
        if (isVideo) {
          await generateVideoThumbnail(file, fullPath);
          thumbnailGenerated = true;
        } else if (isPhoto) {
          await generatePhotoThumbnail(file, fullPath);
          thumbnailGenerated = true;
        }
      } catch (thumbnailError) {
        console.warn(`Thumbnail generation failed for ${file}:`, thumbnailError);
        finalThumbnailUrl = ThumbnailManager.getFallbackThumbnailUrl(mediaType);
      }

      // EXISTING: Analyze video codec
      let codecInfo = '{}';
      if (isVideo) {
        try {
          const videoInfo = await VideoAnalyzer.analyzeVideo(file);
          codecInfo = JSON.stringify({
            codec: videoInfo.codec,
            container: videoInfo.container,
            duration: videoInfo.duration,
            width: videoInfo.width,
            height: videoInfo.height,
            bitrate: videoInfo.bitrate,
            audioCodec: videoInfo.audioCodec,
            needsTranscoding: videoInfo.needsTranscoding
          });
        } catch (error) {
          console.warn(`Could not analyze video codec for ${file}:`, error);
        }
      }

      // EXISTING: Insert into database
      const media = {
        library_id: library.id,
        path: file,
        type: mediaType,
        title: title,
        size: stats.size,
        thumbnail: finalThumbnailUrl,
        codec_info: codecInfo,
      };

      db.prepare(
        "INSERT INTO media (library_id, path, type, title, size, thumbnail, codec_info) VALUES (?, ?, ?, ?, ?, ?, ?)"
      ).run(media.library_id, media.path, media.type, media.title, media.size, media.thumbnail, media.codec_info);

      stats.filesAdded++;
      console.log(`✓ Added ${mediaType}: ${title}${thumbnailGenerated ? ' with thumbnail' : ' with fallback'}`);
      
    } catch (error: any) {
      if (error.code !== "SQLITE_CONSTRAINT_UNIQUE") {
        console.error(`Error processing ${file}:`, error);
        stats.errors++;
      }
    }
  }
  
  // NEW: Log final statistics
  console.log(`\n📊 Scan Complete:`);
  console.log(`   Files Processed: ${stats.filesProcessed}`);
  console.log(`   Files Added: ${stats.filesAdded}`);
  console.log(`   Files Removed: ${stats.filesRemoved}`);
  console.log(`   Thumbnails Regenerated: ${stats.thumbnailsRegenerated}`);
  if (stats.errors > 0) {
    console.log(`   Errors: ${stats.errors}`);
  }
  
  return stats;
};

Key Changes:

  1. Added statistics tracking object
  2. Call cleanupDeletedFiles() before processing
  3. Call verifyAndRegenerateThumbnail() for existing files
  4. Enhanced console logging with emojis and clear formatting
  5. Return statistics object

Step 4: Update API Response (Optional)

Estimated Time: 30 minutes
File: src/app/api/scan/route.ts

Enhancement (if you want to return stats):

// In the scan API endpoint
export async function POST(request: Request) {
  try {
    const body = await request.json();
    const { libraryId } = body;

    let stats;
    if (libraryId) {
      stats = await scanSelectedLibrary(libraryId);
    } else {
      stats = await scanAllLibraries();
    }

    return NextResponse.json({
      success: true,
      message: 'Scan completed successfully',
      stats: stats || { filesProcessed: 0, filesAdded: 0, filesRemoved: 0, thumbnailsRegenerated: 0 }
    });
  } catch (error) {
    console.error('Scan error:', error);
    return NextResponse.json(
      { success: false, error: error.message },
      { status: 500 }
    );
  }
}

Update return type of scan functions:

// Update scanAllLibraries to aggregate stats
export const scanAllLibraries = async () => {
  const db = getDatabase();
  const libraries = db.prepare("SELECT * FROM libraries").all() as { id: number; path: string }[];
  
  const aggregateStats = {
    filesProcessed: 0,
    filesAdded: 0,
    filesRemoved: 0,
    thumbnailsRegenerated: 0,
    errors: 0
  };
  
  for (const library of libraries) {
    const stats = await scanLibrary(library);
    aggregateStats.filesProcessed += stats.filesProcessed;
    aggregateStats.filesAdded += stats.filesAdded;
    aggregateStats.filesRemoved += stats.filesRemoved;
    aggregateStats.thumbnailsRegenerated += stats.thumbnailsRegenerated;
    aggregateStats.errors += stats.errors;
  }
  
  return aggregateStats;
};

// scanSelectedLibrary already returns stats from scanLibrary

🧪 Testing Plan

Manual Testing Procedure

Test 1: File Deletion Cleanup

# 1. Setup test library
mkdir -p /tmp/test-library
cp some-videos.mp4 /tmp/test-library/

# 2. Add library in UI and scan
# Verify files appear in UI

# 3. Delete files from disk
rm /tmp/test-library/some-videos.mp4

# 4. Re-scan library
# Expected: Console shows "Removed orphaned record"
# Expected: Files no longer appear in UI
# Expected: Database query shows files removed

Test 2: Thumbnail Recovery

# 1. Setup test library and scan
mkdir -p /tmp/test-library
cp some-videos.mp4 /tmp/test-library/
# Scan via UI

# 2. Delete thumbnails
rm -rf public/thumbnails/*

# 3. Re-scan library
# Expected: Console shows "Regenerating missing thumbnail"
# Expected: Thumbnails regenerated
# Expected: Files display with thumbnails in UI

Test 3: Error Handling

# 1. Create file that causes thumbnail failure
# (corrupt video file)
cp corrupt.mp4 /tmp/test-library/

# 2. Scan library
# Expected: Scan completes despite error
# Expected: Error logged to console
# Expected: Fallback thumbnail used
# Expected: Other files processed normally

Unit Tests (Optional)

Create file: src/lib/__tests__/scanner.test.ts

import { describe, it, expect, beforeEach, afterEach } from '@jest/globals';
import fs from 'fs/promises';
import path from 'path';

describe('Enhanced Scanner', () => {
  const testLibraryPath = path.join(__dirname, 'test-library');
  
  beforeEach(async () => {
    // Setup test library
    await fs.mkdir(testLibraryPath, { recursive: true });
  });
  
  afterEach(async () => {
    // Cleanup test library
    await fs.rm(testLibraryPath, { recursive: true, force: true });
  });
  
  it('should remove orphaned database entries', async () => {
    // Test implementation
  });
  
  it('should regenerate missing thumbnails', async () => {
    // Test implementation
  });
  
  it('should continue scan on individual file errors', async () => {
    // Test implementation
  });
});

Acceptance Checklist

Before marking implementation complete, verify:

Functional Requirements

  • Deleted files are removed from database during scan
  • Missing thumbnails are regenerated during scan
  • Scan completes even with individual file errors
  • Statistics are logged to console
  • No regression in existing scan functionality

Code Quality

  • Code follows existing style and patterns
  • Error handling implemented for all new code
  • Console logging provides clear feedback
  • No new dependencies added

Testing

  • Manual test: File deletion cleanup works
  • Manual test: Thumbnail recovery works
  • Manual test: Error handling works
  • No TypeScript errors
  • No runtime errors in normal operation

Documentation

  • Code comments added for new functions
  • README updated if needed
  • This implementation guide completed

📊 Implementation Timeline

Step Task Time Status
1 Add cleanupDeletedFiles function 2-3h Pending
2 Add verifyAndRegenerateThumbnail function 2-3h Pending
3 Enhance scanLibrary function 1-2h Pending
4 Update API response (optional) 0.5h Pending
5 Testing & validation 1h Pending

Total Estimated Time: 6-8 hours


🐛 Troubleshooting

Common Issues

Issue: Thumbnails not regenerating
Solution: Check thumbnail path conversion logic, verify FFmpeg installed

Issue: Database errors during cleanup
Solution: Verify database connection, check for locked database

Issue: Scan takes too long
Solution: Normal for large libraries, consider running in background

Issue: TypeScript errors with fs.promises
Solution: Import as import { promises as fs } from 'fs';



Document Status: Complete
Implementation Complexity: Low-Medium
Ready to Begin: Yes
Last Updated: October 14, 2025

Next Steps: Begin Step 1 - Add cleanupDeletedFiles helper function