🎯 Key Finding: Rich AI-Generated Content Tags Available!
Beyond the basic file and album structure, the database contains 5,312 AI-generated content tags that provide valuable searchable metadata for the recovered files.
1. AI Content Tags (FilesTags Table)
- 5,312 tag instances across files
- All auto-generated (computer vision/AI analysis)
- Top content categories:
- People: person (2,692), child (481), baby (38)
- Architecture: building (78), architecture (81)
- Nature: tree (79), beach (67), sea (37), oceanside (37)
- Objects: book (268), document (202), car (63), chair (53), laptop (48)
- Places: screenshots (67), dining table (39), sidewalk (35), swing (34)
- 30 entities representing users and devices
- Timezone data: America/Los_Angeles, UTC
- Language preferences: en-US, en_US
- Authentication data: auth0 user IDs, device identifiers
Available in Files table for each file:
- Camera data: make, model, aperture, exposure, ISO, focal length
- GPS coordinates: latitude, longitude, altitude
- Location names: city, province, country (both image and video)
- Dimensions: width, height for images and videos
- Technical specs: codec, bitrate, framerate, duration
- Audio metadata: title, album, artist, composer, genre, year
4. Timestamps and Versioning
- Birth time: Original file creation
- Creation time: Database entry time
- Modification time: Last update
- Image/Video dates: Content capture dates
- Version tracking: File modification history
Practical Value
For Photo Organization:
- Smart search by content: Find all photos with “person”, “beach”, “building”
- Location-based organization: Group by GPS coordinates and city names
- Camera-based sorting: Organize by camera equipment used
- Time-based analysis: Track photo evolution over time
For Data Recovery:
- Content verification: AI tags help verify file content without opening
- Quality assessment: Technical metadata indicates file integrity
- Relationship mapping: User/device associations show file origins
- Duplicate detection: Multiple metadata points enable deduplication
The extract_files.py
script preserves file organization while export_standard_formats.py
handles metadata export:
- Complete JSON metadata for each file
- CSV reports for spreadsheet analysis
- Tag summaries showing content distribution
- Location reports for geographic analysis
- Camera equipment reports for technical analysis
Example Use Cases
Family Photo Management:
# Find all photos with children
grep "child" metadata/file_tags.csv
# Get all vacation photos by location
grep "beach\|ocean" metadata/file_tags.csv
# Photos by specific camera
grep "iPhone\|Canon" metadata/files_with_metadata.csv
Content Analysis:
- Most photographed subjects: person (2,692 photos), children (481 photos)
- Popular locations: Beach/ocean scenes (104 photos)
- Documentation: Books/documents (470 photos)
- Architecture: Building photography (159 photos)
This rich metadata makes the recovered files not just recoverable, but highly organized and searchable - far beyond what typical photo recovery provides!