A comprehensive file metadata extraction and analysis tool optimized for AI/LLM context injection. Extracts detailed metadata from images, videos, audio files, documents, code, archives, and more.
- Incremental Scanning: Skip unchanged files for faster re-scans
- Multi-format Support: Images, videos, audio, PDFs, Office docs, code, archives, fonts, markdown
- Dual Storage: SQLite database + JSON export
- LLM Optimization: Token-aware context generation for AI models
- Rich Metadata: EXIF, ID3, GPS formatting, code metrics, perceptual hashes, and more
- Magic Number Detection: Standards-compliant MIME type detection
- Dimensions, color space, bit depth
- EXIF/IPTC/XMP extraction (camera, GPS, copyright)
- GPS Coordinate Formatting: DMS, decimal, Google Maps links, GeoJSON
- Dominant color extraction
- Thumbnail generation
- Perceptual hash for similarity detection
- Magic number MIME type detection
- Duration, codec, bitrate, resolution
- Frame rate, aspect ratio
- Audio tracks and metadata
- ID3 tags (artist, album, genre)
- Embedded artwork detection
- PDF metadata and text extraction
- Markdown parsing and analysis
- Page/word counts
- Front matter extraction
- Document summaries
- Microsoft Office: DOCX, XLSX, PPTX (and legacy DOC, XLS, PPT)
- Word document analysis (word count, page count, images, tables)
- Excel spreadsheet analysis (sheet names, formulas, cell statistics)
- PowerPoint presentation analysis (slide count, images)
- LibreOffice/OpenOffice compatible formats
- Lines of code (total, code, comments, blank)
- Language detection (35+ languages)
- Cyclomatic complexity
- Import/dependency extraction
- Function and class detection
- ZIP, TAR, GZ, BZ2, 7Z formats
- File listing without extraction
- Compression ratio calculation
- Size analysis
- Font Formats: TTF, OTF, WOFF, WOFF2
- Font family, style, and weight detection
- Glyph count and character set analysis
- Language support detection
- OpenType feature extraction
- Token-aware context generation
- Selective metadata inclusion
- Configurable context windows (4K-128K tokens)
- Priority-based file ranking
- Multiple output formats (Markdown, JSON)
- ASCII directory trees
- Mermaid.js diagrams
- HTML interactive trees
- File type distributions
- Statistics and reports
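The magic-number MIME detection listed above works by inspecting a file's leading bytes rather than trusting its extension. A simplified sketch of the idea (the function name is illustrative and only a handful of well-known signatures are shown; the real detector covers many more):

```javascript
// Simplified magic-number sniffing: map well-known leading bytes to MIME types.
function sniffMime(buf) {
  const startsWith = (...bytes) =>
    bytes.every((b, i) => buf.length > i && buf[i] === b);

  if (startsWith(0xff, 0xd8, 0xff)) return 'image/jpeg';
  if (startsWith(0x89, 0x50, 0x4e, 0x47)) return 'image/png';
  if (startsWith(0x25, 0x50, 0x44, 0x46)) return 'application/pdf'; // "%PDF"
  if (startsWith(0x50, 0x4b)) return 'application/zip'; // also DOCX/XLSX/PPTX
  if (startsWith(0x1f, 0x8b)) return 'application/gzip';
  return null; // unknown: fall back to extension-based detection
}
```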
# Install dependencies
npm install
# Make CLI globally available (optional)
npm link

Requirements:

- Node.js 16.0.0 or higher
- ffprobe (for video analysis) - Install via ffmpeg:
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html
# Analyze a directory
fmao analyze /path/to/directory
# Query files
fmao query --category image --limit 10
# Show statistics
fmao stats
# Generate directory tree
fmao tree --format ascii
# Generate LLM context
fmao llm --max-tokens 8000 --output context.md
# Find duplicates
fmao duplicates

fmao analyze: Analyze a directory and extract metadata from all files.
Options:
- `-i, --incremental` - Use incremental scanning (default: true)
- `--no-incremental` - Force full rescan
- `-d, --max-depth <depth>` - Maximum directory depth
- `-c, --concurrency <num>` - Concurrent processes
Examples:
# Basic analysis
fmao analyze ./my-project
# Full rescan
fmao analyze ./my-project --no-incremental
# Limit depth
fmao analyze ./my-project --max-depth 3

fmao query: Query the metadata database.
Options:
- `-c, --category <category>` - Filter by category (image, video, audio, document, code, archive)
- `-e, --extension <ext>` - Filter by extension
- `--min-size <bytes>` - Minimum file size
- `--max-size <bytes>` - Maximum file size
- `-l, --limit <num>` - Limit results
- `-s, --search <term>` - Search term
- `--sort <field>` - Sort by field
- `-o, --output <format>` - Output format (json, table, markdown)
Examples:
# Find all images
fmao query --category image
# Find large videos
fmao query --category video --min-size 100000000
# Search for files
fmao query --search "vacation"
# Output as JSON
fmao query --category code --output json

fmao stats: Display file statistics.
Options:
-c, --category <category>- Filter by category
Examples:
# Overall stats
fmao stats
# Image stats
fmao stats --category image

fmao tree: Generate directory tree visualization.
Options:
- `-f, --format <format>` - Output format (ascii, mermaid, html)
- `-o, --output <file>` - Save to file
- `--no-size` - Hide file sizes
- `-c, --category <category>` - Filter by category
Examples:
# ASCII tree
fmao tree
# Mermaid diagram
fmao tree --format mermaid --output tree.mmd
# HTML interactive tree
fmao tree --format html --output tree.html
# Only show images
fmao tree --category image

fmao llm: Generate LLM-optimized context.
Options:
- `-t, --max-tokens <num>` - Maximum tokens (default: 32000)
- `-f, --format <format>` - Output format (markdown, json)
- `-o, --output <file>` - Save to file
- `-c, --category <category>` - Filter by category
- `--recent` - Prioritize recent files
Examples:
# Generate context for GPT-4
fmao llm --max-tokens 8000 --output context.md
# JSON format for API
fmao llm --format json --output context.json
# Only code files
fmao llm --category code --max-tokens 32000

fmao duplicates: Find duplicate files based on content hash.
Example:
fmao duplicates

Configuration is loaded from multiple sources (in priority order):
- Command-line arguments
- Environment variables (prefix: `FMAO_`)
- `config.json` (user config)
- `config.default.json` (defaults)
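Conceptually, the priority order means more specific sources override more general ones. A minimal sketch of that resolution (a shallow merge for illustration only; the actual loader presumably merges nested sections recursively):

```javascript
// Shallow-merge sketch of layered config resolution:
// defaults < user config < environment overrides < CLI arguments.
function resolveConfig(defaults, userConfig, envOverrides, cliArgs) {
  return { ...defaults, ...userConfig, ...envOverrides, ...cliArgs };
}

const effective = resolveConfig(
  { maxDepth: -1, maxConcurrency: 4 }, // config.default.json
  { maxConcurrency: 8 },               // config.json
  {},                                  // FMAO_* environment variables
  { maxDepth: 3 }                      // command-line arguments
);
// effective.maxConcurrency is 8, effective.maxDepth is 3
```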
{
"scanning": {
"maxDepth": -1,
"respectGitignore": true,
"maxConcurrency": 4
},
"storage": {
"type": "both",
"dbPath": "./data/metadata.db",
"jsonPath": "./data/metadata.json"
},
"extractors": {
"images": {
"enabled": true,
"extractExif": true,
"generateThumbnails": true
},
"videos": {
"enabled": true,
"ffprobePath": "ffprobe"
}
},
"llm": {
"contextWindow": 32000,
"tokenCountingModel": "gpt-4"
}
}

# Logging level
export FMAO_LOGGING_LEVEL=debug
# Storage type
export FMAO_STORAGE_TYPE=sqlite
# Max concurrency
export FMAO_SCANNING_MAXCONCURRENCY=8

file-metadata-ai-organizer/
├── cli.js # CLI entry point
├── src/
│ ├── MetadataAnalyzer.js # Main orchestrator
│ ├── processors/ # File type processors
│ │ ├── BaseProcessor.js
│ │ ├── ImageProcessor.js
│ │ ├── VideoProcessor.js
│ │ ├── AudioProcessor.js
│ │ ├── PDFProcessor.js
│ │ ├── CodeProcessor.js
│ │ ├── ArchiveProcessor.js
│ │ ├── MarkdownProcessor.js
│ │ ├── OfficeProcessor.js # NEW: DOCX/XLSX/PPTX support
│ │ └── FontProcessor.js # NEW: TTF/OTF/WOFF support
│ ├── storage/ # Data storage
│ │ ├── database.js
│ │ ├── schema.js
│ │ └── queryAPI.js
│ ├── formatters/ # Output formatters
│ │ └── LLMFormatter.js
│ ├── visualizers/ # Visualization tools
│ │ └── TreeVisualizer.js
│ └── utils/ # Utilities
│ ├── config.js
│ ├── logger.js
│ ├── hash.js
│ ├── scanner.js
│ ├── progress.js
│ └── gps.js # NEW: GPS coordinate utilities
├── data/ # Generated data
│ ├── metadata.db # SQLite database
│ └── metadata.json # JSON export
├── thumbnails/ # Generated thumbnails
└── logs/ # Application logs
# Generate context for code review
fmao analyze ./my-project
fmao llm --category code --max-tokens 16000 > code-context.md
# Use with Claude/GPT
cat code-context.md | pbcopy # Paste into LLM

# Analyze photo library
fmao analyze ~/Photos
# Find photos by camera
fmao query --category image | grep "Canon"
# Find similar images (duplicates)
fmao duplicates
# Generate visual index
fmao tree --category image --format html --output photo-index.html

# Analyze project
fmao analyze ./my-project
# Generate project overview
fmao stats > PROJECT_STATS.md
fmao tree --format markdown >> PROJECT_STATS.md
# Get code metrics
fmao query --category code --output json > code-metrics.json

# Analyze media
fmao analyze ~/Media
# Find videos without metadata
fmao query --category video --output json | jq '.[] | select(.metadata.video.tags == null)'
# Generate catalog
fmao llm --max-tokens 50000 --output media-catalog.md

Database tables:

- `files` - Basic file information
- `image_metadata` - Image-specific data
- `video_metadata` - Video-specific data
- `audio_metadata` - Audio-specific data
- `document_metadata` - Document-specific data
- `code_metadata` - Code analysis data
- `archive_metadata` - Archive information
- `tags` - File tags
- `relationships` - File relationships
- `exif_data` - EXIF data (JSON)
{
"version": "1.0.0",
"generatedAt": "2025-11-21T...",
"summary": {
"totalFiles": 1234,
"totalSize": 123456789,
"fileTypes": {...}
},
"files": [
{
"path": "/full/path/to/file.jpg",
"relativePath": "photos/vacation.jpg",
"category": "image",
"metadata": {
"image": {
"width": 1920,
"height": 1080,
"exif": {...},
"dominantColors": [...]
}
}
}
]
}

Performance:

- Incremental scanning reduces re-scan time by 90%+
- Parallel processing utilizes multiple CPU cores
- Streaming for memory-efficient large file handling
- Caching for expensive operations
- Hash-based change detection
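The parallel-processing point above can be illustrated with a concurrency-limited mapper, roughly the shape an analyzer might use to process many files without exhausting file handles (a sketch under that assumption, not the project's actual implementation):

```javascript
// Process items with at most `limit` tasks in flight at once.
async function mapConcurrent(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // claim the next index (safe: no await before increment)
      results[i] = await fn(items[i]);
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Usage (hypothetical): mapConcurrent(files, 4, f => extractMetadata(f))
```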
- Create a new processor in `src/processors/`:
const BaseProcessor = require('./BaseProcessor');
class MyProcessor extends BaseProcessor {
canProcess(fileInfo) {
return fileInfo.extension === 'myext';
}
async process(fileInfo) {
// Extract metadata
return fileInfo;
}
}

- Register it in `MetadataAnalyzer.js`
npm test

See DEVELOPMENT_PLAN.md for the complete feature roadmap.
Upcoming features:
- Machine learning-based image classification
- Audio waveform visualization
- Parallel processing with worker threads
- Advanced caching strategies
- Web UI for browsing metadata
- API server mode
FMAO can be used as a library in your Node.js applications:
const MetadataAnalyzer = require('./src/MetadataAnalyzer');
async function main() {
const analyzer = new MetadataAnalyzer({
storage: {
type: 'both', // 'sqlite', 'json', or 'both'
dbPath: './data/metadata.db',
jsonPath: './data/metadata.json'
},
scanning: {
maxDepth: -1, // -1 for unlimited
incremental: true,
maxConcurrency: 4
}
});
// Initialize
await analyzer.init();
// Analyze directory
const result = await analyzer.analyze('/path/to/directory');
console.log(`Processed ${result.filesProcessed} files`);
// Query files
const images = await analyzer.query({
category: 'image',
minSize: 1000000,
limit: 10
});
// Close connections
await analyzer.close();
}
main().catch(console.error);

Query API:

const queryAPI = require('./src/storage/queryAPI');
// Initialize database
await queryAPI.init({
type: 'sqlite',
dbPath: './data/metadata.db'
});
// Query by category
const images = await queryAPI.query({ category: 'image' });
// Query by extension
const jpegs = await queryAPI.query({ extension: 'jpg' });
// Size filters
const largeFiles = await queryAPI.query({
minSize: 10000000, // 10MB
maxSize: 100000000 // 100MB
});
// Full-text search
const results = await queryAPI.search('vacation photos');
// Statistics
const stats = await queryAPI.getStatistics();
console.log(`Total files: ${stats.totalFiles}`);
console.log(`Total size: ${stats.totalSize} bytes`);
// Find duplicates
const dupes = await queryAPI.findDuplicates();
dupes.forEach(group => {
console.log(`Hash: ${group.hash}`);
group.files.forEach(f => console.log(` - ${f.path}`));
});

Using processors directly:

const ImageProcessor = require('./src/processors/ImageProcessor');
const FontProcessor = require('./src/processors/FontProcessor');
// Process an image
const imageProc = new ImageProcessor({
thumbnailDir: './thumbnails',
thumbnailSizes: [150, 300, 600],
extractExif: true,
perceptualHash: true
});
const fileInfo = {
path: '/photos/IMG_001.jpg',
name: 'IMG_001.jpg',
category: 'image',
metadata: {}
};
await imageProc.process(fileInfo);
console.log(`Image: ${fileInfo.metadata.image.width}x${fileInfo.metadata.image.height}`);
console.log(`GPS: ${fileInfo.metadata.image.exif?.gps?.formatted}`);
// Process a font
const fontProc = new FontProcessor();
const fontInfo = {
path: '/fonts/Roboto-Regular.ttf',
name: 'Roboto-Regular.ttf',
category: 'font',
metadata: {}
};
await fontProc.process(fontInfo);
console.log(`Font: ${fontInfo.metadata.font.family}`);
console.log(`Weight: ${fontInfo.metadata.font.weight}`);
console.log(`Glyphs: ${fontInfo.metadata.font.glyphCount}`);

GPS utilities:

const gpsUtils = require('./src/utils/gps');
// Convert decimal to DMS
const dms = gpsUtils.decimalToDMS(43.467, 'N');
// Result: { degrees: 43, minutes: 28, seconds: 1.2, direction: 'N' }
// Format coordinates
const formatted = gpsUtils.formatCoordinates(43.467, 11.885, { format: 'DMS' });
// Result: '43°28\'1.2"N 11°53\'6.0"E'
// Generate map links
const googleMaps = gpsUtils.generateGoogleMapsLink(43.467, 11.885);
const osm = gpsUtils.generateOpenStreetMapLink(43.467, 11.885);
// GeoJSON
const geojson = gpsUtils.toGeoJSON(43.467, 11.885);
// Result: { type: 'Feature', geometry: { type: 'Point', coordinates: [11.885, 43.467] } }
// Calculate distance
const distance = gpsUtils.calculateDistance(43.467, 11.885, 43.500, 11.900);
console.log(`Distance: ${distance} km`);

Tree visualization:

const TreeVisualizer = require('./src/visualizers/TreeVisualizer');
const visualizer = new TreeVisualizer({
showSize: true,
maxDepth: 5,
categoryFilter: 'image'
});
// ASCII tree
const ascii = await visualizer.generateASCII('/path/to/dir');
console.log(ascii);
// Mermaid diagram
const mermaid = await visualizer.generateMermaid('/path/to/dir');
await fs.writeFile('tree.mmd', mermaid);
// HTML interactive tree
const html = await visualizer.generateHTML('/path/to/dir');
await fs.writeFile('tree.html', html);

LLM context formatting:

const LLMFormatter = require('./src/formatters/LLMFormatter');
const formatter = new LLMFormatter({
maxTokens: 8000,
format: 'markdown', // 'markdown' or 'json'
prioritize: 'recent', // 'recent', 'size', or 'complexity'
includeContent: false
});
// Generate context from query results
const files = await queryAPI.query({ category: 'code' });
const context = formatter.format(files);
// Save for LLM
await fs.writeFile('context.md', context);

Find visually similar images using perceptual hashes:
const dbManager = require('./src/storage/database');
// Get all images with perceptual hashes
const images = await queryAPI.query({ category: 'image' });
// Calculate Hamming distance between images
function hammingDistance(hash1, hash2) {
let distance = 0;
for (let i = 0; i < hash1.length; i++) {
const val1 = parseInt(hash1[i], 16);
const val2 = parseInt(hash2[i], 16);
const xor = val1 ^ val2;
distance += xor.toString(2).split('1').length - 1;
}
return distance;
}
// Find similar images
const targetImage = images[0];
const similar = images.filter(img => {
if (img.id === targetImage.id) return false;
const distance = hammingDistance(
targetImage.metadata.image.perceptualHash,
img.metadata.image.perceptualHash
);
return distance <= 5; // Very similar
});
console.log(`Found ${similar.length} similar images`);

Create a custom processor for a new file type:
const BaseProcessor = require('./src/processors/BaseProcessor');
class SVGProcessor extends BaseProcessor {
canProcess(fileInfo) {
return fileInfo.extension === 'svg' ||
fileInfo.mimeType === 'image/svg+xml';
}
async extractMetadata(fileInfo) {
const fs = require('fs').promises;
const content = await fs.readFile(fileInfo.path, 'utf8');
// Parse SVG
const widthMatch = content.match(/width="(\d+)"/);
const heightMatch = content.match(/height="(\d+)"/);
const viewBoxMatch = content.match(/viewBox="([\d\s.]+)"/);
fileInfo.metadata.svg = {
width: widthMatch ? parseInt(widthMatch[1]) : null,
height: heightMatch ? parseInt(heightMatch[1]) : null,
viewBox: viewBoxMatch ? viewBoxMatch[1] : null,
hasAnimations: content.includes('<animate'),
elementCount: (content.match(/<(circle|rect|path|line|polygon)/g) || []).length
};
}
getSupportedExtensions() {
return ['svg', 'svgz'];
}
getSupportedMimeTypes() {
return ['image/svg+xml'];
}
}
module.exports = SVGProcessor;

Then register it in your analyzer:
const analyzer = new MetadataAnalyzer(config);
const SVGProcessor = require('./processors/SVGProcessor');
analyzer.registerProcessor(new SVGProcessor());

Direct database access for advanced queries:
const Database = require('better-sqlite3');
const db = new Database('./data/metadata.db');
// Complex query with joins
const results = db.prepare(`
SELECT
f.path,
f.name,
f.size,
i.width,
i.height,
e.data as exif
FROM files f
LEFT JOIN image_metadata i ON f.id = i.file_id
LEFT JOIN exif_data e ON f.id = e.file_id
WHERE f.category = 'image'
AND i.width > 1920
ORDER BY f.size DESC
LIMIT 10
`).all();
// Aggregate statistics
const stats = db.prepare(`
SELECT
category,
COUNT(*) as count,
SUM(size) as total_size,
AVG(size) as avg_size,
MIN(size) as min_size,
MAX(size) as max_size
FROM files
GROUP BY category
`).all();
// Full-text search
const searchResults = db.prepare(`
SELECT * FROM files_fts
WHERE files_fts MATCH ?
ORDER BY rank
LIMIT 20
`).all('vacation photos beach');

MetadataAnalyzer - Main class for analyzing directories.
new MetadataAnalyzer(config)

Parameters:
- `config.storage` - Storage configuration
  - `type` - 'sqlite', 'json', or 'both'
  - `dbPath` - Path to SQLite database
  - `jsonPath` - Path to JSON file
- `config.scanning` - Scanning options
  - `maxDepth` - Maximum directory depth (-1 for unlimited)
  - `incremental` - Enable incremental scanning
  - `maxConcurrency` - Number of concurrent processors
  - `followSymlinks` - Follow symbolic links
  - `ignorePatterns` - Array of glob patterns to ignore
async init()
Initialize the analyzer and database connections.
async analyze(directory, options)
Analyze a directory and extract metadata.
async query(filters)
Query stored metadata with filters.
async close()
Close database connections and save data.
All processors extend BaseProcessor and implement:
- `canProcess(fileInfo)` - Returns true if the processor can handle the file
- `async process(fileInfo)` - Extract metadata and populate fileInfo.metadata
- `getSupportedExtensions()` - Return an array of supported extensions
- `getSupportedMimeTypes()` - Return an array of supported MIME types
Available processors:
- `ImageProcessor` - Images (JPEG, PNG, HEIC, WebP, TIFF, etc.)
- `VideoProcessor` - Videos (MP4, MKV, AVI, MOV, WebM)
- `AudioProcessor` - Audio (MP3, FLAC, WAV, M4A, OGG)
- `PDFProcessor` - PDF documents
- `CodeProcessor` - Source code (JS, TS, Python, Java, C++, etc.)
- `ArchiveProcessor` - Archives (ZIP, TAR, GZ, 7Z, RAR)
- `MarkdownProcessor` - Markdown files
- `OfficeProcessor` - Office documents (DOCX, XLSX, PPTX)
- `FontProcessor` - Fonts (TTF, OTF, WOFF, WOFF2)
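Given the canProcess contract above, dispatch presumably works by asking each registered processor in turn until one accepts the file. A minimal sketch (the selection logic and stub shapes are illustrative, not the project's actual code):

```javascript
// Pick the first registered processor whose canProcess() accepts the file.
function findProcessor(processors, fileInfo) {
  return processors.find(p => p.canProcess(fileInfo)) || null;
}

// Example with stub processors standing in for the real classes:
const stubs = [
  { name: 'image', canProcess: f => ['jpg', 'png'].includes(f.extension) },
  { name: 'code', canProcess: f => ['js', 'py'].includes(f.extension) },
];

const match = findProcessor(stubs, { extension: 'py' });
// match.name is 'code'; files with no matching processor yield null
```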
Database Manager (src/storage/database.js)
const dbManager = require('./src/storage/database');
// Initialize
await dbManager.init({
type: 'both',
dbPath: './data/metadata.db',
jsonPath: './data/metadata.json'
});
// Upsert file
await dbManager.upsertFile(fileData);
// Get file by path
const file = dbManager.getFile('/path/to/file.jpg');
// Query files
const files = dbManager.queryFiles({
category: 'image',
extension: 'jpg',
minSize: 1000000,
limit: 100
});
// Save JSON to disk
await dbManager.saveJSON();
// Close
await dbManager.close();

The project includes comprehensive tests for all major components.
# Run all tests
npm test
# Run specific test file
npm test -- tests/image-processor.test.js
# Run with coverage
npm run test:coverage
# Watch mode for development
npm test -- --watch

Tests cover:
- ✅ Image metadata extraction (EXIF, GPS, colors)
- ✅ Font metadata extraction (glyphs, features, character sets)
- ✅ Archive processing
- ✅ Code analysis
- ✅ GPS coordinate conversion and formatting
- ✅ Perceptual hashing
- ✅ Special character filenames (Unicode, emoji, spaces)
- ✅ Symlink handling with circular reference detection
- ✅ Database operations (SQLite and JSON)
- ✅ Query API
- ✅ Incremental scanning
An example test for the image processor:

const ImageProcessor = require('../src/processors/ImageProcessor');
const fs = require('fs').promises;
describe('ImageProcessor', () => {
let processor;
beforeEach(() => {
processor = new ImageProcessor({
extractExif: true,
perceptualHash: true
});
});
test('should extract basic image metadata', async () => {
const fileInfo = {
path: './test-samples/images/sample.jpg',
name: 'sample.jpg',
category: 'image',
metadata: {}
};
await processor.process(fileInfo);
expect(fileInfo.metadata.image.width).toBeGreaterThan(0);
expect(fileInfo.metadata.image.height).toBeGreaterThan(0);
expect(fileInfo.metadata.image.format).toBe('jpeg');
});
});

Incremental scanning dramatically reduces re-scan time by only processing new or modified files:
# First scan: processes all files
fmao analyze ./project
# Second scan: only processes changed files
fmao analyze ./project # 90%+ faster

The system uses a combination of:
- File modification time (mtime)
- File size
- Path-based tracking
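A minimal sketch of how that combination might decide whether a file needs re-processing (the record fields shown are hypothetical; the actual fields live in the scanner and storage layers):

```javascript
// Decide whether a file needs re-processing by comparing the stored record
// (from a previous scan of the same path) against fresh file stats.
function needsReprocessing(previous, current) {
  if (!previous) return true; // never seen at this path: process it
  return previous.size !== current.size ||
         previous.mtimeMs !== current.mtimeMs;
}

// Unchanged files short-circuit here, which is where the 90%+
// re-scan speedup comes from.
```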
Adjust concurrency based on your system:
# Low-end systems
fmao analyze ./project --concurrency 2
# High-end systems with SSDs
fmao analyze ./project --concurrency 8
# Auto (default: 4)
fmao analyze ./project

For very large directories (100K+ files):
const analyzer = new MetadataAnalyzer({
scanning: {
maxConcurrency: 2, // Reduce parallelism
batchSize: 100, // Process in batches
incremental: true // Skip unchanged files
},
storage: {
type: 'sqlite' // Use only SQLite (no in-memory JSON)
}
});

Disable expensive operations if not needed:
const analyzer = new MetadataAnalyzer({
processors: {
image: {
generateThumbnails: false, // Skip thumbnails
perceptualHash: false, // Skip p-hash
extractColors: false // Skip color analysis
},
video: {
enabled: false // Skip video processing entirely
}
}
});

Video analysis requires ffmpeg:
# macOS
brew install ffmpeg
# Ubuntu
sudo apt-get install ffmpeg

For large directories, reduce concurrency:
fmao analyze /path --concurrency 2

Enable incremental scanning (default) and ensure you're not rescanning unchanged files.
Ensure read permissions for all files:
# Check permissions
ls -la /path/to/directory
# Fix permissions (if appropriate)
chmod -R +r /path/to/directory

Close other connections to the database:
// Always close when done
await analyzer.close();

Install all required dependencies:
npm install

For optional features:
# Font processing
npm install fontkit
# Advanced image formats
npm install sharp

License: ISC
Contributions welcome! Please see DEVELOPMENT_PLAN.md for planned features and architecture.
# Clone repository
git clone <repository-url>
cd file-metadata-ai-organizer
# Install dependencies
npm install
# Run tests
npm test
# Lint code
npm run lint
# Format code
npm run format

- Create a feature branch
- Write tests for new functionality
- Ensure all tests pass
- Update documentation
- Submit pull request
See CHANGELOG.md for version history.
Built with:
- sharp - High-performance image processing
- exifr - EXIF metadata extraction
- fontkit - Font parsing and analysis
- better-sqlite3 - Fast SQLite database
- music-metadata - Audio metadata extraction