# Library Scan Enhancement Architecture

## 🏗️ **System Architecture Overview**

### **Simplified Scan Enhancement**

```
┌────────────────────────────────────────────────────────────┐
│                      Enhanced Scanner                      │
│                        (scanner.ts)                        │
├────────────────────────────────────────────────────────────┤
│  1. File Discovery (existing)                              │
│  2. File Deletion Cleanup (NEW)                            │
│  3. File Processing (existing)                             │
│  4. Thumbnail Verification (NEW)                           │
└────────────────────────────────────────────────────────────┘
```

**Design Philosophy**: Make minimal changes to the existing scanner and add two new steps: deletion cleanup and thumbnail verification.

---
## 🔧 **Component Enhancements**

### **Enhanced Scanner Flow**

```typescript
// File: src/lib/scanner.ts
const scanLibrary = async (library: { id: number; path: string }) => {
  const db = getDatabase();

  // 1. FILE DISCOVERY (existing)
  const allFiles = await glob(`${library.path}/**/*.*`, { nodir: true });
  // ... existing extension filtering of allFiles produces the three lists below ...
  const mediaFiles = [...filteredVideoFiles, ...filteredPhotoFiles, ...filteredTextFiles];

  // 2. FILE DELETION CLEANUP (NEW)
  await cleanupDeletedFiles(db, library.id, mediaFiles);

  // 3. FILE PROCESSING (existing + enhanced)
  for (const file of mediaFiles) {
    const existingMedia = db.prepare("SELECT * FROM media WHERE path = ?").get(file);

    if (existingMedia) {
      // 4. THUMBNAIL VERIFICATION (NEW)
      await verifyAndRegenerateThumbnail(existingMedia);
      continue;
    }

    // Existing: Insert new media with thumbnail generation
  }
};
```
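
The filtering that produces `filteredVideoFiles`, `filteredPhotoFiles` and `filteredTextFiles` is existing code that this document does not show. Purely as an illustration, a split by file extension could look like the sketch below. The function name `classifyMediaFiles` and all three extension lists are hypothetical and are not taken from `scanner.ts`.

```typescript
import * as path from "path";

// Hypothetical extension lists; the real scanner's lists are not shown in this document.
const VIDEO_EXTENSIONS = new Set([".mp4", ".mkv", ".webm"]);
const PHOTO_EXTENSIONS = new Set([".jpg", ".jpeg", ".png"]);
const TEXT_EXTENSIONS = new Set([".txt", ".md"]);

interface ClassifiedFiles {
  video: string[];
  photo: string[];
  text: string[];
}

// Split globbed paths by lowercased extension; files of unknown type are dropped.
function classifyMediaFiles(allFiles: string[]): ClassifiedFiles {
  const result: ClassifiedFiles = { video: [], photo: [], text: [] };
  for (const file of allFiles) {
    const ext = path.extname(file).toLowerCase();
    if (VIDEO_EXTENSIONS.has(ext)) result.video.push(file);
    else if (PHOTO_EXTENSIONS.has(ext)) result.photo.push(file);
    else if (TEXT_EXTENSIONS.has(ext)) result.text.push(file);
  }
  return result;
}

// Usage: the .zip file is ignored, the other two become scan candidates.
const classified = classifyMediaFiles(["/lib/a.MP4", "/lib/b.png", "/lib/c.zip"]);
const mediaFiles = [...classified.video, ...classified.photo, ...classified.text];
console.log(mediaFiles); // → [ '/lib/a.MP4', '/lib/b.png' ]
```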
### **1. File Deletion Cleanup** (NEW)
**Purpose**: Remove database entries for files that no longer exist on disk
**Implementation**:
```typescript
import { promises as fs } from "fs";

async function cleanupDeletedFiles(
  db: Database,
  libraryId: number,
  currentFiles: string[]
): Promise<{ removed: number }> {
  // Get all media records for this library
  const dbRecords = db.prepare(
    "SELECT id, path FROM media WHERE library_id = ?"
  ).all(libraryId) as { id: number; path: string }[];

  // Create set of current file paths for fast lookup
  const currentFileSet = new Set(currentFiles);
  let removed = 0;

  // Check each database record
  for (const record of dbRecords) {
    // If file doesn't exist in current scan
    if (!currentFileSet.has(record.path)) {
      try {
        // Verify file truly doesn't exist on disk
        await fs.access(record.path);
        // File exists but wasn't in scan - possibly outside glob pattern
        continue;
      } catch {
        // fs.access rejected: treat the file as gone and remove its record
        db.prepare("DELETE FROM media WHERE id = ?").run(record.id);
        console.log(`Removed orphaned record: ${record.path}`);
        removed++;
      }
    }
  }

  console.log(`Cleanup complete: ${removed} orphaned records removed`);
  return { removed };
}
```

**Key Features**:
- Double-checks file existence before deletion (safety)
- Handles cases where files exist but weren't scanned
- Logs each deletion for transparency
- Returns statistics for reporting
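
The double-check above treats any `fs.access` rejection as proof that the file is gone. A stricter variant, offered here as an assumption and not as part of the planned change, reports a file as missing only when the error code is `ENOENT` or `ENOTDIR`, so that a permission error (`EACCES`) or a transient I/O error (`EIO`) never leads to a deletion. The helper names `isMissingFileError` and `isGoneFromDisk` are hypothetical. In `cleanupDeletedFiles`, the `try`/`catch` could then shrink to a single `if (await isGoneFromDisk(record.path))` guard around the `DELETE`.

```typescript
import { promises as fs } from "fs";

// True only for error codes that mean "this path does not exist".
function isMissingFileError(error: unknown): boolean {
  const code = (error as NodeJS.ErrnoException | null | undefined)?.code;
  return code === "ENOENT" || code === "ENOTDIR";
}

// Resolves to true when the path is confirmed absent, and to false when it
// exists or when its state is unknown, so "unknown" never triggers a delete.
async function isGoneFromDisk(filePath: string): Promise<boolean> {
  try {
    await fs.access(filePath);
    return false; // file is reachable
  } catch (error) {
    return isMissingFileError(error);
  }
}
```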
### **2. Thumbnail Verification** (NEW)

**Purpose**: Detect and regenerate missing thumbnails for existing media

**Implementation**:

```typescript
import { promises as fs } from "fs";
import path from "path";

async function verifyAndRegenerateThumbnail(
  media: MediaRecord
): Promise<{ regenerated: boolean }> {
  // Skip if using fallback thumbnail
  if (media.thumbnail.includes('/fallback/')) {
    return { regenerated: false };
  }

  // Get full path from URL
  const thumbnailPath = getThumbnailPathFromUrl(media.thumbnail);

  try {
    // Check if thumbnail file exists
    await fs.access(thumbnailPath);
    return { regenerated: false }; // Thumbnail exists, no action needed
  } catch {
    // Thumbnail missing - regenerate
    console.log(`Regenerating missing thumbnail for: ${media.path}`);
    const db = getDatabase(); // same handle that scanLibrary obtains

    try {
      const { folderPath, fullPath, url } = ThumbnailManager.getThumbnailPath(media.path);
      ThumbnailManager.ensureDirectory(folderPath);

      if (media.type === 'video') {
        await generateVideoThumbnail(media.path, fullPath);
      } else if (media.type === 'photo') {
        await generatePhotoThumbnail(media.path, fullPath);
      }

      // Update database with new thumbnail path
      db.prepare("UPDATE media SET thumbnail = ? WHERE id = ?")
        .run(url, media.id);
      console.log(`Successfully regenerated thumbnail: ${media.path}`);
      return { regenerated: true };
    } catch (error) {
      console.warn(`Failed to regenerate thumbnail for ${media.path}:`, error);
      // Use fallback thumbnail
      const fallbackUrl = ThumbnailManager.getFallbackThumbnailUrl(media.type);
      db.prepare("UPDATE media SET thumbnail = ? WHERE id = ?")
        .run(fallbackUrl, media.id);
      return { regenerated: false };
    }
  }
}

function getThumbnailPathFromUrl(url: string): string {
  // Convert URL like /thumbnails/ab/cd/file.png
  // to full path like /path/to/public/thumbnails/ab/cd/file.png
  return path.join(process.cwd(), 'public', url);
}
```

**Key Features**:
- Verifies file existence before attempting regeneration
- Reuses existing thumbnail generation functions
- Updates database with new thumbnail path
- Falls back to type-based placeholder on failure
- Logs all actions for debugging
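
Because `getThumbnailPathFromUrl` joins a value stored in the database onto a filesystem root, a containment check may be worth considering. The sketch below is an assumption, not the planned implementation: `resolveThumbnailPath` is a hypothetical name, and it takes the public directory as a parameter instead of reading `process.cwd()` so that it can be exercised in isolation. It uses `path.posix` so the example behaves the same on every platform.

```typescript
import * as path from "path";

// Map a thumbnail URL such as /thumbnails/ab/cd/file.png to a file under
// publicDir. Returns null when the URL would resolve to publicDir itself
// or to a location outside of it.
function resolveThumbnailPath(publicDir: string, url: string): string | null {
  const root = path.posix.resolve(publicDir);
  // Prefixing "./" makes the URL's leading slash relative to root.
  const candidate = path.posix.resolve(root, "./" + url);
  const relative = path.posix.relative(root, candidate);
  const escapes =
    relative === "" || relative === ".." || relative.startsWith("../");
  return escapes ? null : candidate;
}

console.log(resolveThumbnailPath("/srv/app/public", "/thumbnails/ab/cd/file.png"));
// → /srv/app/public/thumbnails/ab/cd/file.png
console.log(resolveThumbnailPath("/srv/app/public", "/../../etc/passwd"));
// → null
```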
---
## 🔄 **Enhanced Scan Process Flow**
### **Detailed Process Steps**

```
┌─────────────────────────────────────────────────────────────┐
│ 1. START SCAN                                               │
│    ├── Receive library ID or scan all                       │
│    └── Initialize statistics counters                       │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ 2. FILE DISCOVERY (existing)                                │
│    ├── Glob library path for all media files                │
│    ├── Filter by video/photo/text extensions                │
│    └── Build list of current files                          │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ 3. FILE DELETION CLEANUP (NEW)                              │
│    ├── Get all database records for library                 │
│    ├── For each database record:                            │
│    │   ├── Check if file in current scan results            │
│    │   ├── If not: Verify file doesn't exist on disk        │
│    │   └── If missing: DELETE from database                 │
│    └── Log cleanup statistics                               │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ 4. FILE PROCESSING LOOP                                     │
│    For each discovered file:                                │
│    ├── Check if file already in database                    │
│    ├── If NEW:                                              │
│    │   ├── Generate thumbnail (existing)                    │
│    │   ├── Analyze video codec (existing)                   │
│    │   └── INSERT into database (existing)                  │
│    └── If EXISTS:                                           │
│        └── Verify & regenerate thumbnail (NEW) →            │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ 5. THUMBNAIL VERIFICATION (NEW)                             │
│    ├── Check if thumbnail file exists on disk               │
│    ├── If missing: Regenerate thumbnail                     │
│    ├── Update database with new path                        │
│    └── Log regeneration actions                             │
└─────────────────────────────────────────────────────────────┘
                               ↓
┌─────────────────────────────────────────────────────────────┐
│ 6. COMPLETE SCAN                                            │
│    ├── Log final statistics                                 │
│    └── Return success                                       │
└─────────────────────────────────────────────────────────────┘
```

---
## 🗄️ **Database Operations**
### **No Schema Changes Required**
Use existing tables and fields:
```sql
-- Existing media table (no changes)
CREATE TABLE media (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  library_id INTEGER,
  path TEXT NOT NULL UNIQUE,
  type TEXT NOT NULL,
  title TEXT,
  size INTEGER,
  thumbnail TEXT,
  codec_info TEXT DEFAULT '{}',
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (library_id) REFERENCES libraries (id)
);
```

### **Database Operations**

```typescript
// 1. Get all media for library
const records = db.prepare(
  "SELECT id, path, type, thumbnail FROM media WHERE library_id = ?"
).all(libraryId);

// 2. Delete orphaned record
db.prepare("DELETE FROM media WHERE id = ?").run(recordId);

// 3. Update thumbnail path
db.prepare("UPDATE media SET thumbnail = ? WHERE id = ?")
  .run(newThumbnailUrl, mediaId);
```

**No transactions needed**: each operation is a simple, independent statement.

---
## 📊 **Statistics Tracking**

### **Scan Statistics Object**

```typescript
interface ScanStats {
  filesProcessed: number;        // Total files scanned
  filesAdded: number;            // New files inserted (existing)
  filesRemoved: number;          // Orphaned records deleted (NEW)
  thumbnailsRegenerated: number; // Missing thumbnails recreated (NEW)
  errors: number;                // Total errors encountered
}
```

### **Statistics Collection**

```typescript
const scanLibrary = async (library: { id: number; path: string }) => {
  const stats: ScanStats = {
    filesProcessed: 0,
    filesAdded: 0,
    filesRemoved: 0,
    thumbnailsRegenerated: 0,
    errors: 0
  };

  // ... scan logic ...

  // Log final statistics
  console.log('Scan complete:', stats);
  return stats;
};
```
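
When every library is scanned in one run, the per-library `ScanStats` objects returned by `scanLibrary` could be summed into a single report. This is a minimal sketch under that assumption: `createScanStats` and `combineScanStats` are hypothetical helpers that are not part of the planned change, and the interface is repeated only so the block is self-contained.

```typescript
interface ScanStats {
  filesProcessed: number;
  filesAdded: number;
  filesRemoved: number;
  thumbnailsRegenerated: number;
  errors: number;
}

// Fresh, zeroed counters for one library scan.
function createScanStats(): ScanStats {
  return { filesProcessed: 0, filesAdded: 0, filesRemoved: 0, thumbnailsRegenerated: 0, errors: 0 };
}

// Sum per-library results into one total without mutating the inputs.
function combineScanStats(results: readonly ScanStats[]): ScanStats {
  const total = createScanStats();
  for (const stats of results) {
    total.filesProcessed += stats.filesProcessed;
    total.filesAdded += stats.filesAdded;
    total.filesRemoved += stats.filesRemoved;
    total.thumbnailsRegenerated += stats.thumbnailsRegenerated;
    total.errors += stats.errors;
  }
  return total;
}

// Usage: two libraries, one of which hit an error.
const total = combineScanStats([
  { ...createScanStats(), filesProcessed: 10, filesAdded: 2 },
  { ...createScanStats(), filesProcessed: 5, filesRemoved: 1, errors: 1 },
]);
// total: filesProcessed 15, filesAdded 2, filesRemoved 1, thumbnailsRegenerated 0, errors 1
console.log('All libraries scanned:', total);
```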
---
## 🔒 **Error Handling Strategy**
### **Error Tolerance Approach**
```typescript
// Principle: Individual failures should not stop entire scan

// File deletion cleanup
for (const record of dbRecords) {
  try {
    // Check and delete if needed
    await cleanupRecord(record);
  } catch (error) {
    console.error(`Error cleaning up ${record.path}:`, error);
    stats.errors++;
    // Continue to next record
  }
}

// Thumbnail verification
for (const file of mediaFiles) {
  try {
    const existingMedia = getExistingMedia(file);
    if (existingMedia) {
      await verifyAndRegenerateThumbnail(existingMedia);
    }
  } catch (error) {
    console.error(`Error processing ${file}:`, error);
    stats.errors++;
    // Continue to next file
  }
}
```

**Key Principles**:
- Wrap each file operation in try-catch
- Log errors but continue processing
- Track error count in statistics
- No transaction rollback (simple operations)
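
Both loops above follow the same pattern, so the principles could be captured once in a small helper. The following is a sketch only; `processTolerantly` is a hypothetical name and not part of the planned change. It processes items sequentially, reports each failure through a callback, and returns the number of failures, which a caller could add to `stats.errors`, for example `stats.errors += await processTolerantly(dbRecords, cleanupRecord, (r, e) => console.error(r.path, e))`.

```typescript
// Run handler for every item in order. A failing item is reported and
// counted, and processing continues with the next item.
async function processTolerantly<T>(
  items: readonly T[],
  handler: (item: T) => Promise<void>,
  onError: (item: T, error: unknown) => void
): Promise<number> {
  let errors = 0;
  for (const item of items) {
    try {
      await handler(item);
    } catch (error) {
      errors++;
      onError(item, error);
    }
  }
  return errors;
}
```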
---

## 🚀 **Implementation Strategy**

### **Code Changes Required**

**Single file modification**: `src/lib/scanner.ts`

```typescript
// Add helper functions at top of file (bodies as defined in the sections above)
async function cleanupDeletedFiles(/* ... */) { /* ... */ }
async function verifyAndRegenerateThumbnail(/* ... */) { /* ... */ }
function getThumbnailPathFromUrl(/* ... */) { /* ... */ }

// Modify existing scanLibrary function
const scanLibrary = async (library: { id: number; path: string }) => {
  const db = getDatabase();

  // Initialize statistics
  const stats: ScanStats = {
    filesProcessed: 0,
    filesAdded: 0,
    filesRemoved: 0,
    thumbnailsRegenerated: 0,
    errors: 0
  };

  // Existing file discovery code
  const allFiles = await glob(`${library.path}/**/*.*`, { nodir: true });
  // ... existing filtering logic ...

  // NEW: Cleanup deleted files
  const cleanupResult = await cleanupDeletedFiles(db, library.id, mediaFiles);
  stats.filesRemoved = cleanupResult.removed;

  // Existing file processing loop
  for (const file of mediaFiles) {
    stats.filesProcessed++;

    const existingMedia = db.prepare("SELECT * FROM media WHERE path = ?").get(file);

    if (existingMedia) {
      // NEW: Verify thumbnail for existing files
      const thumbResult = await verifyAndRegenerateThumbnail(existingMedia);
      if (thumbResult.regenerated) stats.thumbnailsRegenerated++;
      continue;
    }

    // Existing new file processing
    // ... existing thumbnail generation and insert logic ...
    stats.filesAdded++;
  }

  // Log final statistics
  console.log('Scan complete:', stats);
  return stats;
};
```

**No other files require changes**
---
## 🎯 **Performance Considerations**
### **Performance Impact**
| **Operation** | **Impact** | **Mitigation** |
|--------------|-----------|---------------|
| File existence checks | Low | Use fast `fs.access()` |
| Database queries | Low | Single query per library |
| Thumbnail regeneration | Medium | Only for missing thumbnails |
| Overall scan time | +10-20% (estimate) | Acceptable for data integrity |
### **Optimization Notes**
- File existence checks are fast I/O operations
- Database deletions are simple, indexed operations
- Thumbnail regeneration only happens for missing files
- Small additional memory overhead: one `Set` of scanned paths plus the library's `id`/`path` rows held during cleanup
- Sequential processing (same as current)
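
The +10-20% figure in the table above is best confirmed by timing each phase on a real library. The sketch below assumes a hypothetical `timePhase` helper that is not part of the planned change; it accumulates wall-clock milliseconds per label so the cleanup and verification phases can be compared against discovery and processing. A call could look like `await timePhase("cleanup", timings, () => cleanupDeletedFiles(db, library.id, mediaFiles))`.

```typescript
import { performance } from "perf_hooks";

// Await fn, add its wall-clock duration in milliseconds to timings[label],
// and return (or rethrow) whatever fn produced.
async function timePhase<T>(
  label: string,
  timings: Record<string, number>,
  fn: () => Promise<T>
): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    // Recorded even when fn throws, so failed phases still show up.
    timings[label] = (timings[label] ?? 0) + (performance.now() - start);
  }
}
```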
---
## 📈 **Testing Strategy**
### **Unit Testing**
```typescript
describe('cleanupDeletedFiles', () => {
  it('should remove records for deleted files', async () => {
    // Setup: Create DB records for files that don't exist
    // Execute: Run cleanup
    // Verify: Records removed from database
  });

  it('should keep records for existing files', async () => {
    // Setup: Create DB records for existing files
    // Execute: Run cleanup
    // Verify: Records still in database
  });
});

describe('verifyAndRegenerateThumbnail', () => {
  it('should skip if thumbnail exists', async () => {
    // Setup: Media record with existing thumbnail
    // Execute: Verify
    // Verify: No regeneration attempted
  });

  it('should regenerate if thumbnail missing', async () => {
    // Setup: Media record with missing thumbnail
    // Execute: Verify
    // Verify: Thumbnail regenerated and DB updated
  });
});
```

### **Integration Testing**

```typescript
describe('Enhanced Scanner', () => {
  it('should complete full scan with cleanup and verification', async () => {
    // Setup: Library with mixed scenarios (new, existing, deleted, missing thumbnails)
    // Execute: Full library scan
    // Verify: All scenarios handled correctly
  });
});
```
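
The "Setup" steps in these tests need real files that can be deleted mid-test. One possible fixture, sketched here with the hypothetical helpers `createTempLibrary` and `removeTempLibrary`, builds a throwaway library directory under the OS temp folder. A "deleted file" scenario is then created by inserting database records for the returned paths and calling `fs.unlink` on one of them before running the scan.

```typescript
import { promises as fs } from "fs";
import * as os from "os";
import * as path from "path";

// Create a throwaway library directory containing empty files with the
// given relative names; intermediate folders are created as needed.
async function createTempLibrary(
  fileNames: string[]
): Promise<{ root: string; files: string[] }> {
  const root = await fs.mkdtemp(path.join(os.tmpdir(), "scan-test-"));
  const files: string[] = [];
  for (const name of fileNames) {
    const filePath = path.join(root, name);
    await fs.mkdir(path.dirname(filePath), { recursive: true });
    await fs.writeFile(filePath, "");
    files.push(filePath);
  }
  return { root, files };
}

// Remove the fixture directory and everything in it.
async function removeTempLibrary(root: string): Promise<void> {
  await fs.rm(root, { recursive: true, force: true });
}
```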
---
## 🔗 **Related Documentation**
- [Requirements Document](LIBRARY_SCAN_ENHANCEMENT_REQUIREMENTS.md)
- [Implementation Plan](LIBRARY_SCAN_ENHANCEMENT_IMPLEMENTATION.md)
- [Summary](LIBRARY_SCAN_ENHANCEMENT_SUMMARY.md)
---
*Document Status*: ✅ **Complete**
*Architecture Type*: Minimal, focused enhancement
*Implementation Complexity*: Low-Medium
*Last Updated*: October 14, 2025