# Library Scan Enhancement Requirements

## 📋 **Current State Analysis**

### **✅ Existing Capabilities**

- **File Discovery**: Recursive scanning of library paths using glob patterns
- **Multi-format Support**: Videos (9 formats), Photos (8 formats), Text files (18 formats)
- **Thumbnail Generation**: FFmpeg-based with hashed folder structure
- **Video Analysis**: Codec detection and transcoding requirement analysis
- **Database Integration**: Complete media metadata storage with proper indexing
- **Batch Processing**: Both individual library and bulk scanning options

### **❌ Critical Gaps**

1. **No File Deletion Handling**: Deleted files remain in the database as orphaned records
2. **No Thumbnail Verification**: Missing/corrupted thumbnails aren't regenerated on re-scan

---

## 🎯 **Enhanced Requirements**

### **Requirement 1: File Deletion Cleanup**

**Description**: Automatically detect and remove database entries for files that no longer exist on disk

**Priority**: 🔴 **P0 - Critical**

**Acceptance Criteria**:

- [ ] Compare database records with the actual file system state
- [ ] Identify orphaned database entries (records that exist in the DB but whose files are no longer on disk)
- [ ] Remove orphaned entries from the database
- [ ] Log cleanup actions to the console
- [ ] Handle errors gracefully (continue the scan if cleanup fails)

**Technical Requirements**:

- File existence verification using `fs.access()` or `fs.stat()`
- Delete operation for each orphaned record
- Error logging for debugging
- No transaction rollback needed (simple delete operations)

**User Stories**:

- As a user, when I delete files from my library folder, I want them automatically removed from the database during the next scan
- As a user, I want the database to accurately reflect what's actually on disk

---

### **Requirement 2: Thumbnail Recovery**

**Description**: Detect and regenerate missing thumbnail files during library scan

**Priority**: 🔴 **P0 - Critical**

**Acceptance Criteria**:

- [ ] Verify thumbnail file existence for each media record
- [ ] Detect missing thumbnail files (path exists in the DB but the file is missing on disk)
- [ ] Regenerate missing thumbnails during the scan
- [ ] Continue processing if thumbnail generation fails (use fallback)
- [ ] Log thumbnail regeneration actions

**Technical Requirements**:

- Thumbnail file validation using `fs.stat()`
- Re-use existing thumbnail generation logic
- Handle thumbnail generation failures gracefully
- Use existing fallback thumbnail mechanism
- No additional database fields needed

**User Stories**:

- As a user, when thumbnails are accidentally deleted, I want them automatically regenerated during the next scan
- As a user, when thumbnail generation previously failed, I want the scan to retry automatically

---

## 🏗️ **Technical Architecture Requirements**

### **Database Schema**

**No schema changes required** - Use existing tables:

- `media` table already has `path` and `thumbnail` fields
- No new fields needed

### **Scan Process Flow**

```
1. File Discovery (existing)
   ├── Scan library path for media files
   └── Get existing database records

2. File Deletion Cleanup (NEW)
   ├── For each database record:
   │   ├── Check if file exists on disk
   │   └── If not: DELETE from database
   └── Log cleanup actions

3. File Processing (existing + enhanced)
   └── For each discovered file:
       ├── Check if already in database (existing)
       ├── If new: Insert and generate thumbnail (existing)
       └── If exists: Verify thumbnail (NEW)

4. Thumbnail Verification (NEW)
   └── For each existing media record:
       ├── Check if thumbnail file exists
       ├── If missing: Regenerate thumbnail
       ├── If generation fails: Use fallback
       └── Log regeneration actions
```

### **API Enhancements**

**No new API endpoints needed** - Enhance existing scan endpoint:

```typescript
// Use existing endpoint: POST /api/scan
// (the type names below are illustrative; only the field names come from this spec)

// No request body changes
interface ScanRequest {
  libraryId?: number; // Optional: specific library
}

// Response includes new statistics
interface ScanResponse {
  success: boolean; // true when the scan completed
  message: string;  // e.g. "Scan completed"
  stats: {
    filesProcessed: number;
    filesAdded: number;
    filesRemoved: number;          // NEW
    thumbnailsRegenerated: number; // NEW
  };
}
```

---

## 📊 **Implementation Priority**

| **Feature** | **Priority** | **Effort** | **Impact** |
|-------------|--------------|------------|------------|
| **File Deletion Detection** | 🔴 P0 | Medium (3-4h) | Critical |
| **Missing Thumbnail Regeneration** | 🔴 P0 | Medium (3-4h) | Critical |

**Total Estimated Time**: 6-8 hours

---

## 🎯 **Success Metrics**

### **Functional Metrics**

- **Database Accuracy**: 100% of deleted files removed from database
- **Thumbnail Recovery**: >90% of missing thumbnails regenerated successfully
- **Error Tolerance**: Scan completes even if individual files fail

### **Quality Metrics**

- **No Regressions**: Existing scan functionality works as before
- **Error Handling**: Individual file failures don't stop the entire scan
- **Logging**: All actions logged for debugging

---

## **Non-Requirements**

The following are **explicitly excluded** from this enhancement:

- ❌ Real-time progress reporting / WebSocket updates
- ❌ Scan session tracking / history
- ❌ Concurrent processing / worker threads
- ❌ Incremental scanning (only changed files)
- ❌ Content-based duplicate detection
- ❌ Advanced error recovery / retry mechanisms
- ❌ Soft delete / undo functionality
- ❌ Performance optimizations beyond current implementation
- ❌ UI changes / progress bars
- ❌ Database transactions (use simple operations)

---

## **Technical Constraints**

1. **Backward Compatibility**: Must work with existing database schema
2. **Simple Implementation**: No complex architectural changes
3. **Error Tolerance**: Individual failures should not stop the scan
4. **Minimal Dependencies**: Use existing libraries and utilities
5. **Code Reuse**: Leverage existing thumbnail generation code

---

## 🧪 **Testing Requirements**

### **Manual Testing Scenarios**

1. **File Deletion Test**
   - Add files to library and scan
   - Delete some files from disk
   - Re-scan library
   - Verify deleted files are removed from the database

2. **Thumbnail Recovery Test**
   - Add files to library and scan
   - Delete thumbnail files from disk
   - Re-scan library
   - Verify thumbnails are regenerated

3. **Error Handling Test**
   - Create files that cause thumbnail failures
   - Run scan
   - Verify scan completes despite failures

### **Unit Tests**

- Test file existence checking
- Test thumbnail file verification
- Test database deletion operations
- Test error handling

---

*Document Status*: ✅ **Complete**

*Implementation Scope*: Focused on 2 core requirements

*Estimated Time*: 6-8 hours

*Last Updated*: October 14, 2025

**Next Steps**: Review architecture design document for technical implementation details.
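
---

As a rough illustration of Requirements 1 and 2 (not a substitute for the architecture design document), the per-record checks could be sketched in TypeScript as below. Only the `path` and `thumbnail` fields and the `filesRemoved` / `thumbnailsRegenerated` counters come from this document. The `MediaRecord` and `ScanDeps` types, the `reconcileRecords` function, and the `deleteRecord` / `generateThumbnail` hooks are hypothetical stand-ins for the project's existing database and thumbnail code. The sketch also assumes a record's `thumbnail` may be `null` when earlier generation failed.

```typescript
import { promises as fs } from "fs";

// Hypothetical row shape: only `path` and `thumbnail` are named in the spec.
interface MediaRecord {
  id: number;
  path: string;
  thumbnail: string | null;
}

// Hypothetical hooks standing in for the existing database and thumbnail logic.
interface ScanDeps {
  deleteRecord(id: number): Promise<void>;
  generateThumbnail(record: MediaRecord): Promise<void>;
}

interface ReconcileStats {
  filesRemoved: number;
  thumbnailsRegenerated: number;
}

// Returns false only when the path is definitely missing (ENOENT).
// Other errors (e.g. EACCES) are rethrown so a record is not deleted by mistake.
async function fileExists(p: string): Promise<boolean> {
  try {
    await fs.access(p);
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "ENOENT") return false;
    throw err;
  }
}

// Walks the existing records once: removes orphans, then regenerates missing
// thumbnails. A failure on one record is logged and the loop continues.
async function reconcileRecords(
  records: MediaRecord[],
  deps: ScanDeps
): Promise<ReconcileStats> {
  const stats: ReconcileStats = { filesRemoved: 0, thumbnailsRegenerated: 0 };
  for (const record of records) {
    try {
      if (!(await fileExists(record.path))) {
        await deps.deleteRecord(record.id);
        stats.filesRemoved++;
        console.log(`Removed orphaned record: ${record.path}`);
        continue;
      }
      if (record.thumbnail === null || !(await fileExists(record.thumbnail))) {
        await deps.generateThumbnail(record);
        stats.thumbnailsRegenerated++;
        console.log(`Regenerated thumbnail for: ${record.path}`);
      }
    } catch (err) {
      // Error tolerance: log and move on to the next record.
      console.error(`Skipping ${record.path}:`, err);
    }
  }
  return stats;
}
```

Passing the database and thumbnail operations in as `deps` keeps the sketch independent of any particular ORM or FFmpeg wrapper, and lets unit tests substitute in-memory fakes. The fallback-thumbnail behaviour is assumed to live inside the existing generation code and is not modelled here.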