This document explains Auto-M4B's system design, component structure, and data flow.
Auto-M4B is a Python-based audiobook conversion pipeline that runs in Docker. It continuously monitors an inbox folder, processes audiobooks through various stages, and outputs chapterized M4B files.
┌──────────────┐
│  User adds   │
│ audiobook to │
│    inbox/    │
└──────┬───────┘
       │
       ▼
┌──────────────────────────────────────────────┐
│              Auto-M4B Container              │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │  1. Scanner (Inbox State)              │  │
│  │     - Detect new books                 │  │
│  │     - Track processing state           │  │
│  └──────────────┬─────────────────────────┘  │
│                 │                            │
│                 ▼                            │
│  ┌────────────────────────────────────────┐  │
│  │  2. Audiobook Parser                   │  │
│  │     - Extract metadata                 │  │
│  │     - Analyze structure                │  │
│  └──────────────┬─────────────────────────┘  │
│                 │                            │
│                 ▼                            │
│  ┌────────────────────────────────────────┐  │
│  │  3. Converter (m4b-tool)               │  │
│  │     - Merge audio files                │  │
│  │     - Create chapters                  │  │
│  │     - Apply metadata                   │  │
│  └──────────────┬─────────────────────────┘  │
│                 │                            │
│                 ▼                            │
│  ┌────────────────────────────────────────┐  │
│  │  4. Post-Processor                     │  │
│  │     - Move to output                   │  │
│  │     - Archive originals                │  │
│  │     - Cleanup                          │  │
│  └────────────────────────────────────────┘  │
│                                              │
└──────────────────────────────────────────────┘
       │
       ▼
┌──────────────┐
│  Converted   │
│ M4B ready in │
│  converted/  │
└──────────────┘
auto-m4b/
├── src/
│   ├── __main__.py          # CLI entry point
│   ├── auto_m4b.py          # Application loop
│   └── lib/
│       ├── audiobook.py     # Audiobook data model
│       ├── config.py        # Configuration management
│       ├── inbox_state.py   # State tracking & scanning
│       ├── inbox_item.py    # Individual book items
│       ├── run.py           # Main processing logic
│       ├── m4btool.py       # m4b-tool wrapper
│       ├── fs_utils.py      # File system utilities
│       ├── id3_utils.py     # Metadata extraction
│       ├── ffmpeg_utils.py  # Audio analysis
│       ├── parsers.py       # Path/filename parsing
│       ├── hasher.py        # File hashing for change detection
│       ├── logger.py        # Logging utilities
│       ├── term.py          # Terminal output formatting
│       └── typing.py        # Type definitions
├── Dockerfile               # Container definition
├── entrypoint.sh            # Container startup script
├── pyproject.toml           # Python dependencies
└── docs/                    # Documentation
The main application loop that:
- Initializes configuration
- Runs startup checks
- Continuously processes the inbox
- Handles errors and recovery
Key Functions:
- app(): Main entry point
- use_error_handler(): Error handling context manager
Flow:
    while infinite_loop or loop_counter <= max_loops:
        try:
            process_inbox()
        finally:
            loop_counter += 1
            sleep(SLEEP_TIME)

Manages all settings and environment variables.
Key Classes:
- Config: Singleton configuration manager
- AutoM4bArgs: Command-line arguments parser
Features:
- Environment variable loading
- Type conversion and validation
- Cached properties for performance
- Dynamic folder path resolution
Configuration Flow:
1. Load .env file (if specified)
2. Parse command-line arguments
3. Merge with environment variables
4. Apply defaults
5. Validate and resolve paths
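The configuration flow above can be sketched as a layered lookup. This is a minimal illustration, not the actual `Config` class: the helper name `resolve_setting`, the `INBOX_FOLDER` key, and the precedence order (command line over environment over defaults) are assumptions made for the example.

```python
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    cli_args: Mapping[str, Optional[str]],
    env: Mapping[str, str],
    defaults: Mapping[str, str],
) -> str:
    """Return the first value found in: CLI arguments, environment, defaults.

    The precedence order is an assumption for this sketch.
    """
    cli_value = cli_args.get(name)
    if cli_value is not None:
        return cli_value
    if name in env:
        return env[name]
    return defaults[name]


# Hypothetical inputs standing in for parsed arguments and os.environ.
defaults = {"INBOX_FOLDER": "/inbox", "SLEEP_TIME": "10"}
env = {"SLEEP_TIME": "30"}
cli = {"INBOX_FOLDER": "/data/inbox", "SLEEP_TIME": None}

inbox = resolve_setting("INBOX_FOLDER", cli, env, defaults)
# Type conversion happens after the raw string has been resolved.
sleep_time = float(resolve_setting("SLEEP_TIME", cli, env, defaults))
```

Keeping the lookup separate from type conversion makes each layer easy to test with plain dictionaries instead of a real environment.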
Tracks which books are in the inbox and their processing status.
Key Classes:
- InboxState: Singleton state manager (extends Hasher)
- InboxItem: Individual book tracking
Responsibilities:
- Scan inbox for new books
- Track processing status (pending, processing, failed)
- Detect file changes via hashing
- Prevent duplicate processing
State Tracking:
    InboxItem:
      - key: str                  # Unique identifier
      - status: InboxItemStatus   # pending|processing|failed
      - path: Path                # Book location
      - last_updated: float       # Timestamp
      - is_series_parent: bool    # Series detection

Represents an audiobook with metadata and structure information.
Key Classes:
Audiobook: Pydantic model for audiobooks
Properties:
- Metadata: title, artist, album, year, cover art
- File Info: format, size, duration, bitrate
- Structure: standalone, folder, multi-disc, series
- Paths: inbox, merge, build, converted locations
Book Structures:
standalone: Single audio file (m4b, mp3, etc.)
folder: Directory with multiple audio files
multi-disc: Multiple subdirectories (Disc 1, Disc 2, etc.)
series: Series of books in subdirectories
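As a rough illustration of how these four structures might be told apart, the sketch below classifies a path by looking at its subdirectories. The function name `classify_structure` and the disc-folder naming pattern are assumptions for this example; the project's real detection logic may use different rules.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming convention for disc folders: "Disc 1", "CD2", "disk 03", ...
DISC_RE = re.compile(r"^(disc|disk|cd)\s*\d+$", re.IGNORECASE)


def classify_structure(path: Path) -> str:
    """Guess the book structure from the layout on disk."""
    if path.is_file():
        return "standalone"
    subdirs = [p for p in path.iterdir() if p.is_dir()]
    if not subdirs:
        return "folder"
    if all(DISC_RE.match(d.name) for d in subdirs):
        return "multi-disc"
    # Any other set of subdirectories is treated as a series of books.
    return "series"


# Build a throwaway tree to exercise the classifier.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Book" / "Disc 1").mkdir(parents=True)
    (root / "Book" / "Disc 2").mkdir()
    (root / "Flat").mkdir()
    (root / "Flat" / "01.mp3").touch()
    (root / "Solo.m4b").touch()

    multi = classify_structure(root / "Book")
    flat = classify_structure(root / "Flat")
    solo = classify_structure(root / "Solo.m4b")
```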
The main processing logic that converts audiobooks.
Key Functions:
- process_inbox(): Main loop - scans and processes books
- convert_book(): Orchestrates conversion of a single book
- process_already_m4b(): Handles pre-converted M4B files
- process_book_folder(): Converts multi-file books
- fail_book(): Handles failures
Processing Flow:
┌─────────────────┐
│   Scan Inbox    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Filter Books   │  (Skip already processed, match filters)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ For Each Book:  │
└────────┬────────┘
         │
         ▼
┌────────────────────┐
│    Already M4B?    │
└─────┬──────┬───────┘
      │ Yes  │ No
      │      └───────────────┐
      │                      │
      ▼                      ▼
┌──────────────┐    ┌──────────────────┐
│   Move to    │    │  Parse Metadata  │
│  Converted   │    └────────┬─────────┘
└──────────────┘             │
                             ▼
                    ┌──────────────────┐
                    │  Extract Cover   │
                    └────────┬─────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │   Merge to M4B   │
                    │  (via m4b-tool)  │
                    └────────┬─────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │  Verify Output   │
                    └────────┬─────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │     Move to      │
                    │    Converted     │
                    └────────┬─────────┘
                             │
                             ▼
                    ┌──────────────────┐
                    │  Archive/Delete  │
                    │    Originals     │
                    └──────────────────┘
Interfaces with the sandreas/m4b-tool for audio conversion.
Key Classes:
M4bTool: Command builder and executor
Operations:
- merge: Combine audio files into M4B
- split: Split M4B by chapters
- chapters: Import/export chapter data
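A merge invocation could be assembled from Python roughly as follows. This sketch is not the `M4bTool` class itself: `build_merge_command` is a hypothetical helper, and it uses only the flags that appear in this document's example command.

```python
import shlex
from typing import List


def build_merge_command(
    input_dir: str,
    output_file: str,
    jobs: int = 4,
    bitrate: str = "64k",
    max_chapter_length: int = 1800,
    use_filenames_as_chapters: bool = True,
) -> List[str]:
    """Build an argument list for `m4b-tool merge` (hypothetical helper)."""
    cmd = [
        "m4b-tool",
        "merge",
        input_dir,
        f"--output-file={output_file}",
        f"--jobs={jobs}",
        f"--audio-bitrate={bitrate}",
        f"--max-chapter-length={max_chapter_length}",
    ]
    if use_filenames_as_chapters:
        cmd.append("--use-filenames-as-chapters")
    return cmd


cmd = build_merge_command("input_dir/", "output.m4b")
# shlex.join produces a copy-pasteable string for logs.
printable = shlex.join(cmd)
```

Passing the list form to `subprocess.run(cmd, check=True)` avoids shell quoting problems with book titles that contain spaces or quotes.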
Example Command:
    m4b-tool merge "input_dir/" \
        --output-file="output.m4b" \
        --jobs=4 \
        --audio-bitrate=64k \
        --max-chapter-length=1800 \
        --use-filenames-as-chapters

Low-level file operations and audio file discovery.
Key Functions:
- find_book_dirs_in_inbox(): Discover books
- find_audio_files(): Locate audio files
- mv_file_to_dir(): Safe file moving with overwrite handling
- hash_path_audio_files(): Generate content hashes
- clean_dir(): Clean up working directories
Extract and parse metadata from files and paths.
Sources:
- ID3 tags: From audio files
- Filenames: Parse author, title, year
- Folder structure: Detect multi-disc, series
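Filename parsing of this kind is usually done with regular expressions. The sketch below is an assumption about how such patterns might look, not the contents of `parsers.py`; the function names `parse_book_name` and `parse_track_name` are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# "Author - Title (Year)", where the year suffix is optional.
BOOK_RE = re.compile(
    r"^(?P<author>.+?)\s+-\s+(?P<title>.+?)(?:\s+\((?P<year>\d{4})\))?$"
)
# "<prefix> NN - Chapter Name", e.g. "Book 01 - Chapter Name".
TRACK_RE = re.compile(r"^\D*(?P<track>\d+)\s+-\s+(?P<title>.+)$")


def parse_book_name(name: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a folder name into author, title and (optional) year."""
    match = BOOK_RE.match(name.strip())
    return match.groupdict() if match else None


def parse_track_name(name: str) -> Optional[Tuple[int, str]]:
    """Extract a track number and chapter title from a file stem."""
    match = TRACK_RE.match(name.strip())
    if not match:
        return None
    return int(match.group("track")), match.group("title")


book = parse_book_name("Author - Title (2020)")
track = parse_track_name("Book 01 - Chapter Name")
```

Names that match neither pattern return `None`, which leaves room to fall back to ID3 tags.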
Parsing Examples:
"Author - Title (Year)" → Author, Title, Year
"Book 01 - Chapter Name" → Track 1, "Chapter Name"
"Disc 1/Chapter 01.mp3" → Multi-disc structure
1. Discovery
   - User copies book to inbox/MyBook/
   - InboxState scanner detects new directory
   - Creates InboxItem with status pending
2. Pre-Processing
   - Hash files to detect completion
   - Wait WAIT_TIME for file transfers to finish
   - Create Audiobook object
   - Extract metadata from files and paths
3. Conversion
   - Copy files to working directory (/tmp/auto-m4b/build/)
   - Extract cover art if present
   - Call m4b-tool merge with appropriate flags
   - Monitor progress and logs
4. Post-Processing
   - Verify output M4B exists and is valid
   - Move M4B to converted/MyBook/MyBook.m4b
   - Archive originals to archive/MyBook/ (if configured)
   - Create backup in backup/ (if enabled)
   - Update InboxItem status to completed
   - Clean up working directories
5. Error Handling
   - On failure, mark InboxItem as failed
   - Log error details
   - Move book to failed state (currently stays in inbox)
   - Future: Retry logic (Phase 1.2)
inbox/
└── MyBook/                  # User's input
    ├── Chapter01.mp3
    └── Chapter02.mp3

/tmp/auto-m4b/
├── build/                   # Temporary build area
│   └── MyBook/
│       ├── Chapter01.mp3
│       ├── Chapter02.mp3
│       └── cover.jpg
├── merge/                   # (Currently unused)
└── trash/                   # Temporary cleanup

converted/
└── MyBook/                  # Final output
    ├── MyBook.m4b
    └── MyBook.chapters.txt

archive/
└── MyBook/                  # Archived originals
    ├── Chapter01.mp3
    └── Chapter02.mp3

backup/
└── MyBook/                  # Backup copy
    ├── Chapter01.mp3
    └── Chapter02.mp3
pending → Book detected, not yet processed
processing → Currently being converted
completed → Successfully converted and moved
failed → Error during processing
     ┌──────────┐
┌───►│ pending  │◄───┐
│    └────┬─────┘    │
│         │          │
│         ▼          │
│    ┌──────────┐    │
│    │processing│    │
│    └────┬─────┘    │
│         │          │
│    ┌────┴─────┐    │
│    │          │    │
│    ▼          ▼    │
│ ┌────────┐  ┌──────┴───┐
└─┤ failed │  │completed │
  └────────┘  └──────────┘
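The transitions in the diagram can be written down as a small lookup table. This is a sketch under assumptions: `Status` stands in for the project's `InboxItemStatus`, and the `advance` helper is hypothetical. The allowed moves are read directly off the diagram, including the arrows from failed and completed back to pending.

```python
from enum import Enum
from typing import Dict, Set


class Status(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves, as drawn in the state diagram.
ALLOWED: Dict[Status, Set[Status]] = {
    Status.PENDING: {Status.PROCESSING},
    Status.PROCESSING: {Status.COMPLETED, Status.FAILED},
    Status.FAILED: {Status.PENDING},
    Status.COMPLETED: {Status.PENDING},
}


def advance(current: Status, new: Status) -> Status:
    """Return the new status, or raise if the diagram has no such arrow."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new


state = Status.PENDING
state = advance(state, Status.PROCESSING)
state = advance(state, Status.FAILED)
state = advance(state, Status.PENDING)  # book re-added after a manual fix
```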
Auto-M4B uses file hashing to:
- Detect when file transfers are complete
- Prevent processing partial uploads
- Skip already-processed books
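One cheap way to build such a hash is to digest file names, sizes, and modification times rather than file contents. The sketch below illustrates that idea only; `hash_audio_dir` and the extension list are assumptions, and the project's own `hash_path_audio_files()` may work differently.

```python
import hashlib
import tempfile
from pathlib import Path

AUDIO_EXTS = {".mp3", ".m4a", ".m4b"}  # assumed subset of supported formats


def hash_audio_dir(path: Path) -> str:
    """Digest relative name, size and mtime of every audio file under path."""
    digest = hashlib.sha256()
    for f in sorted(path.rglob("*")):
        if f.is_file() and f.suffix.lower() in AUDIO_EXTS:
            stat = f.stat()
            entry = f"{f.relative_to(path)}|{stat.st_size}|{stat.st_mtime_ns}"
            digest.update(entry.encode("utf-8"))
    return digest.hexdigest()


# Simulate an upload that is still growing between two scans.
with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp)
    (book / "Chapter01.mp3").write_bytes(b"partial")
    first = hash_audio_dir(book)
    unchanged = hash_audio_dir(book)
    (book / "Chapter01.mp3").write_bytes(b"partial upload finished")
    second = hash_audio_dir(book)
```

A book is treated as stable once two consecutive scans produce the same digest; a digest that keeps changing means the transfer is still in progress.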
    # Hash all audio files in a directory
    current_hash = hash_path_audio_files(book_path)
    # Compare with the previous hash; a change means files are still arriving
    if current_hash != previous_hash:
        wait_for_stability()

Auto-M4B processes books sequentially (one at a time):
- Simpler error handling
- Prevents resource contention
- Easier to debug
Future enhancement (Phase 3.1): Parallel processing of multiple books.
- Caching: Extensive use of @cached_property for expensive operations
- Lazy Loading: Metadata extracted only when needed
- Hashing: Fast change detection without file comparison
- CPU Cores: m4b-tool uses multiple cores for audio encoding
- CPU: Configurable via CPU_CORES (default: all cores)
- Memory: Typically 500MB-2GB depending on book size
- Disk: Requires ~3x book size during processing:
  - 1x in inbox
  - 1x in working directory
  - 1x in output/archive/backup
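The disk estimate lends itself to a pre-flight check. The sketch below is hypothetical and not part of Auto-M4B; it assumes that all folders live on the same volume and that one of the three copies (the inbox copy) already exists, so only two more copies need free space.

```python
import shutil
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all files under path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def extra_bytes_needed(book_bytes: int, copies: int = 3) -> int:
    """Space still required: the inbox copy is already on disk."""
    return book_bytes * (copies - 1)


def has_room(book_bytes: int, free_bytes: int, copies: int = 3) -> bool:
    """True if the remaining copies fit into the free space."""
    return free_bytes >= extra_bytes_needed(book_bytes, copies)


# Free space on the volume holding the current directory.
free_now = shutil.disk_usage(Path.cwd()).free
ok_small = has_room(1_000, 2_500)
ok_large = has_room(1_000, 1_500)
```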
- User Errors: Invalid paths, missing permissions
  - Logged with clear messages
  - Processing skipped
- File Errors: Corrupted audio, missing files
  - Book marked as failed
  - Error details logged
  - Manual intervention required
- System Errors: Out of disk, m4b-tool crashes
  - Fatal error file created
  - Container stops (requires manual restart)
Currently manual recovery:
- Fix the issue (repair file, free space, etc.)
- Remove book from inbox
- Re-add book to inbox
- Processing resumes
Future (Phase 1.2): Automatic retry with exponential backoff.
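For readers unfamiliar with the technique, exponential backoff doubles the wait after each failed attempt. The sketch below illustrates the general idea only; it is not the planned Phase 1.2 implementation, and `retry_with_backoff` and its parameters are hypothetical.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, waiting base_delay * factor**attempt between failures."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted, surface the last error
            sleep(base_delay * factor**attempt)
            attempt += 1


# A stand-in for a conversion that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("transient failure")
    return "converted"


waits: List[float] = []
result = retry_with_backoff(flaky, sleep=waits.append)  # record waits instead of sleeping
```

Injecting the `sleep` function keeps the helper testable without real delays.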
- Pre-processors: Add logic before conversion in run.py
- Post-processors: Add logic after conversion in run.py
- Metadata Sources: Extend parsers.py or id3_utils.py
- Output Formats: Extend m4btool.py wrapper
Phase 3.2 will introduce:
- Hook system for pre/post processing
- Custom metadata enrichment
- Integration with external services (Audible, Goodreads)
- Python 3.9+: Runtime
- m4b-tool: Audio conversion (sandreas/m4b-tool)
- FFmpeg: Audio analysis and manipulation
- Docker: Container runtime
- pydantic: Data validation
- cachetools: Caching utilities
- mutagen: ID3 tag reading
- tinta: Terminal colors
Base: ubuntu:22.04
├── System packages (ffmpeg, curl, etc.)
├── PHP 8.2 & composer
├── m4b-tool (v0.5-prerelease)
├── Python 3.9+
├── gosu (for PUID/PGID switching)
└── Auto-M4B Python application

Test coverage is currently minimal. Future improvements (Phase 4.3):
- Unit tests for core logic
- Integration tests for full pipeline
- Fixture-based testing with sample audiobooks
- Container logs: docker-compose logs auto-m4b
- Global log: converted/auto-m4b.log
- Debug output: Console (when DEBUG=Y)
- INFO: Normal operations
- DEBUG: Detailed processing steps (DEBUG=Y)
- WARNING: Recoverable issues
- ERROR: Processing failures
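With Python's standard `logging` module, switching between these levels from the `DEBUG` variable might look like the sketch below. The helper `resolve_log_level` and the logger name are assumptions; the project's own `logger.py` may be organized differently.

```python
import logging
import os
from typing import Mapping


def resolve_log_level(env: Mapping[str, str]) -> int:
    """DEBUG=Y enables debug output; anything else falls back to INFO."""
    debug_on = env.get("DEBUG", "N").strip().upper() == "Y"
    return logging.DEBUG if debug_on else logging.INFO


# Hypothetical logger name used only for this sketch.
logger = logging.getLogger("auto_m4b_sketch")
logger.setLevel(resolve_log_level(os.environ))
logger.addHandler(logging.StreamHandler())

logger.info("Scanning inbox")            # INFO: normal operations
logger.debug("Hash unchanged, skipping") # DEBUG: shown only when DEBUG=Y
logger.warning("Cover art not found")    # WARNING: recoverable issue
logger.error("m4b-tool exited non-zero") # ERROR: processing failure
```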
- Container runs as specified PUID/PGID
- No root execution after entrypoint
- File ownership matches host user
- Read-only mounts possible for config
- Read-write required for processing folders
- No network ports exposed by default
- Future Web UI (Phase 2.1) will expose HTTP port
- Retry Logic (Phase 1.2): Automatic recovery from transient errors
- Metrics (Phase 1.5): Prometheus-compatible metrics endpoint
- Web UI (Phase 2.1): FastAPI-based dashboard
- Parallel Processing (Phase 3.1): Process multiple books simultaneously
- Plugin System (Phase 3.2): Extensibility for custom workflows
Current design works for:
- Single user/household
- Up to ~100 books/day
- Sequential processing
For higher throughput:
- Add parallel processing
- Distribute across multiple containers
- Add queue-based architecture (Celery, RabbitMQ)