feat: v2.4.2 - Claude enhancements, metadata translation architecture, and RTL/LTR ebook support #537

Closed
seidnerj wants to merge 5 commits into bookfere:master from seidnerj:master

@seidnerj

@seidnerj seidnerj commented Jan 3, 2026

Contributor

Summary

Major release (v2.4.2) adding advanced translation features, comprehensive metadata translation system with background processing, and RTL/LTR ebook formatting support.

Translation Engine Enhancements

Prompt Caching

  • Enable parallel section translation with full book context
  • 90% input cost reduction on subsequent sections
  • Works with existing merge_length for section sizing
  • UI checkbox (default: disabled)

Batch Translation API

  • New asynchronous bulk translation engine option
  • 50% cost reduction on all requests
  • Combined savings: Up to 84% total with caching
  • Trade-off: Processing time up to 24 hours (most <1 hour)

Dynamic Token Management

  • Automatic max_tokens scaling based on input length
  • Model-specific output limits (4K-128K)
  • Pre-translation warning if merge exceeds capacity
  • Prevents partial/truncated translations

Dynamic Timeout (Opt-in)

  • Content-based timeout scaling
  • Examples: 50K chars → 6.5min, 200K chars → 23min
  • UI checkbox (default: disabled)

Token Estimation Helper

  • Real-time estimates for merge translation
  • Language-aware character-to-token ratios
  • Shows percentage of context window
  • Updates dynamically
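
The estimate described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: the ratio table, the function names, and the 200,000-token default window are assumptions chosen for the example; only the fallback of 3 characters per token comes from the commit notes in this PR.

```python
# Illustrative character-to-token ratios. These values are assumptions,
# not the table shipped with the plugin.
CHARS_PER_TOKEN = {'en': 4.0, 'zh': 1.5, 'ja': 1.5}
DEFAULT_CHARS_PER_TOKEN = 3.0  # ratio quoted in the commit notes


def estimate_tokens(char_count, source_lang):
    """Estimate how many tokens `char_count` characters of source text use."""
    ratio = CHARS_PER_TOKEN.get(source_lang, DEFAULT_CHARS_PER_TOKEN)
    return int(char_count / ratio)


def context_usage_percent(char_count, source_lang, context_window=200_000):
    """Return the estimated share of the context window as a percentage."""
    tokens = estimate_tokens(char_count, source_lang)
    return 100.0 * tokens / context_window
```

Calling `context_usage_percent(merge_length, lang, context_window=1_000_000)` would model the extended-context setting; a UI would re-run the calculation whenever merge length, source language, or the context option changes.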

Beta Feature Controls

  • Extended output: 128K tokens (specific models)
  • Extended context: 1M token window (specific models)
  • Conditional UI visibility per model

Metadata Translation (9 Fields)

Individual Field Controls

  • Primary: title, creator, publisher, series
  • Advanced: creator file-as, rights, subject, contributor, description
  • Replace old "translate all" checkbox
  • Auto-migrate config on plugin load (v2.4.2 upgrade)

Background Processing

  • Translate in worker thread (no UI freeze)
  • Cache results for instant Output phase
  • Fallback: Live translation for old caches

Fields Translated

  • title → dc:title + calibre:title_sort
  • creator → dc:creator (all authors)
  • publisher → dc:publisher
  • series → calibre:series
  • creator file-as → opf:file-as
  • rights, subject, contributor, description
  • Set dc:language to target automatically

RTL/LTR Ebook Formatting

Complete RTL Support

  • primary-writing-mode meta tag (horizontal-rl/lr)
  • spine page-progression-direction attribute (rtl/ltr)
  • Conditional text-align inline styles
  • Only applied when source ≠ target direction

Language Support

  • RTL: Arabic, Hebrew, Farsi, Dari, Urdu, Yiddish, Pashto
  • LTR: English, Spanish, French, German, Italian, Portuguese, Russian, CJK
  • Hebrew code conversion (iw → he)
  • Language region handling (en-US → en)

Efficient Implementation

  • Calibre polish Container API
  • XPath-based OPF modification
  • Only updates modified files in ZIP

UI/UX Improvements

Streaming Enhancements

  • Immediate stop button response
  • Preserve scroll position during streaming
  • Prevent text corruption on click
  • Qt5/Qt6 compatibility

Cache Consistency

  • Single get_cache_id() helper function
  • Consistent across 3 call sites
  • Simplifies future modifications

Version

  • Bump to v2.4.2
  • Add ver242_upgrade() for config migration
  • Comprehensive CHANGELOG

Test Results

✅ Metadata translation (all 9 fields)
✅ RTL formatting (Hebrew, Arabic)
✅ Background processing (no UI freeze)
✅ Old cache fallback works
✅ Stop button immediate
✅ Scroll preservation

… features

- Add configuration options for 128K output tokens (Claude 3.7 Sonnet)
- Add configuration option for 1M context window (Claude Sonnet 4.0/4.5)
- Implement conditional UI checkboxes that show only for relevant models
- Refactor get_headers() to use config values instead of hardcoded logic
- Update default model to claude-sonnet-4-5 (official API alias)
- Add token estimate helper for merge translation feature
- Display estimates based on source language and model-specific context windows
- Show real-time updates when merge length, source language, or context settings change
- Include comprehensive mapping for all Claude 4.5, 4.x, and 3.x model context windows
- Properly hide entire rows (labels + checkboxes) when features not applicable
- Persist all beta feature settings to plugin configuration
…support

Metadata Translation Enhancements:
- Replace single 'translate all' checkbox with granular field controls
- Add individual checkboxes for: title, creator, creator file-as, publisher, series
- Set language code (dc:language) to target language by default
- Add special handling for calibre:series and author_sort fields
- Translate series and author_sort via direct translation calls post-conversion
- Maintain backward compatibility with old metadata_translation config flag
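
One way such a migration could look is sketched below. It is an assumption-laden illustration: the flag names follow the pattern of keys such as `translate_title` that appear in the `lib/conversion.py` excerpt later in this thread (`translate_subject` is assumed by analogy), and the policy of copying the legacy value into every per-field flag is a guess, not the confirmed behavior of `ver242_upgrade()`.

```python
# Per-field flags, modeled on the keys used in lib/conversion.py.
# 'translate_subject' is assumed by analogy with the others.
FIELD_FLAGS = (
    'translate_title', 'translate_creator', 'translate_creator_file_as',
    'translate_publisher', 'translate_series', 'translate_rights',
    'translate_subject', 'translate_contributor', 'translate_description',
)


def migrate_metadata_config(config):
    """Return a copy of `config` with the legacy flag expanded per field."""
    migrated = dict(config)
    fields = dict(migrated.get('ebook_metadata') or {})
    legacy = migrated.get('metadata_translation')
    if legacy is not None:
        for flag in FIELD_FLAGS:
            # Explicit per-field choices win over the legacy default.
            fields.setdefault(flag, bool(legacy))
    migrated['ebook_metadata'] = fields
    # The legacy key is left in place for backward compatibility.
    return migrated
```

Because the function returns a new dictionary and uses `setdefault`, running it more than once leaves an already migrated configuration unchanged.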

RTL/LTR Formatting Support:
- Add conditional text-align inline styles (only when source/target directions differ)
- Add RTL/LTR OPF metadata: primary-writing-mode and page-progression-direction
- Only apply formatting when translating between different text directions
- Support bidirectional translation (LTR→RTL and RTL→LTR)
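
The two OPF changes named above can be illustrated with a short sketch. Note the substitution: the PR states that it uses Calibre's polish Container API with XPath, whereas this example uses only the standard library's `xml.etree.ElementTree` so that it runs on its own. The function name `set_book_direction` is hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = 'http://www.idpf.org/2007/opf'
DC_NS = 'http://purl.org/dc/elements/1.1/'
# Keep the usual prefixes when the tree is serialized again.
ET.register_namespace('', OPF_NS)
ET.register_namespace('dc', DC_NS)


def set_book_direction(opf_xml, direction):
    """Return OPF text with writing mode and page progression set.

    `direction` is 'rtl' or 'ltr'.
    """
    ns = {'opf': OPF_NS}
    root = ET.fromstring(opf_xml)
    metadata = root.find('opf:metadata', ns)
    spine = root.find('opf:spine', ns)
    mode = 'horizontal-rl' if direction == 'rtl' else 'horizontal-lr'

    # Reuse an existing primary-writing-mode meta tag if there is one.
    meta = None
    for candidate in metadata.findall('opf:meta', ns):
        if candidate.get('name') == 'primary-writing-mode':
            meta = candidate
            break
    if meta is None:
        meta = ET.SubElement(
            metadata, '{%s}meta' % OPF_NS, {'name': 'primary-writing-mode'})
    meta.set('content', mode)

    spine.set('page-progression-direction', direction)
    return ET.tostring(root, encoding='unicode')
```

A caller would invoke this only when the source and target directions differ, matching the "only apply formatting when translating between different text directions" rule in the list above.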

Language Direction Support:
- Expand lang_directionality dictionary with comprehensive RTL language coverage
- Add: Farsi, Dari, Urdu, Yiddish, Pashto (RTL languages)
- Add: French, Portuguese, Russian, Chinese, Japanese, Korean (LTR languages)
- Support both legacy ('iw') and modern ('he') Hebrew codes
- Track source language in ElementHandler for direction comparison
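
A minimal sketch of the direction lookup follows. It assumes the lookup is keyed by language code; the plugin's actual `lang_directionality` dictionary may be keyed differently, and the names `normalize_lang`, `direction`, and `needs_direction_change` are hypothetical.

```python
# RTL languages listed in this PR, by ISO 639 code ('prs' is Dari).
RTL_LANGS = {'ar', 'he', 'fa', 'prs', 'ur', 'yi', 'ps'}
# Legacy codes mapped to their modern equivalents.
LEGACY_CODES = {'iw': 'he'}


def normalize_lang(code):
    """Strip the region ('en-US' -> 'en') and map legacy codes ('iw' -> 'he')."""
    base = code.replace('_', '-').split('-')[0].lower()
    return LEGACY_CODES.get(base, base)


def direction(code):
    """Return 'rtl' or 'ltr'; unknown languages are assumed to be LTR."""
    return 'rtl' if normalize_lang(code) in RTL_LANGS else 'ltr'


def needs_direction_change(source_lang, target_lang):
    """True only when source and target text directions differ."""
    return direction(source_lang) != direction(target_lang)
```

With this shape, `needs_direction_change('en-US', 'iw')` is true and `needs_direction_change('ar', 'he')` is false, so an RTL-to-RTL translation would leave the formatting untouched.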

Technical Improvements:
- Build inline styles with semicolon-separated properties
- Add robust error handling for OPF metadata modifications
- Log direction changes and metadata updates for debugging
- Use hasattr checks for OEB structure compatibility across Calibre versions
…timeout scaling

Dynamic max_tokens Calculation:
- Calculate max_tokens based on estimated output length (input / 3 chars per token)
- Use model-specific max output limits per official documentation:
  - Claude 3.7 Sonnet with extended output: 128K tokens
  - Claude 4.x models (Sonnet/Haiku/Opus 4.5, Opus 4.1, Sonnet 4.0): 64K tokens
  - Claude Opus 4.0: 32K tokens
  - Claude 3.x Haiku: 4K tokens
  - Other Claude models: 32K default
- Add 10% safety buffer to estimated output
- Minimum 4096 tokens maintained
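
The rules above can be written down as a small function. This is a sketch under stated assumptions rather than the PR's implementation: the function name is hypothetical, the model limit is passed in by the caller instead of being looked up from a model name, and "32K" is taken to mean 32,000 tokens.

```python
CHARS_PER_TOKEN = 3     # input / 3 chars per token, per the notes above
SAFETY_BUFFER = 1.10    # 10% safety buffer
MIN_TOKENS = 4096       # floor kept for short inputs


def dynamic_max_tokens(input_chars, model_limit=32_000):
    """Estimate max_tokens for a request from the input length in characters."""
    estimated = int(input_chars / CHARS_PER_TOKEN * SAFETY_BUFFER)
    # Apply the floor first, then never exceed the model's output limit.
    return min(model_limit, max(MIN_TOKENS, estimated))
```

For example, 30,000 input characters give an estimate of 11,000 tokens, while a very long input is capped at whatever `model_limit` the caller supplies (64,000 or 128,000 for the models listed above).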

Dynamic Request Timeout (Opt-in):
- Add UI checkbox to enable dynamic timeout (Claude only, default: disabled)
- When disabled: Uses user-configured timeout (default 30s)
- When enabled: Scales based on content length and token generation speed
- Formula: (estimated_output_tokens / 50) + 60s overhead
- Conservative 50 tokens/second estimate (Claude generates 65-120 tokens/sec)
- Timeout examples when enabled:
  - 10K chars → 127s (~2 min)
  - 50K chars → 393s (~6.5 min)
  - 200K chars → 1,393s (~23 min)
- Minimum 30 seconds, maximum 2 hours
- Restores original timeout after each request
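
The formula can be checked with a few lines of Python. The sketch assumes the output-token estimate is simply input characters divided by 3, without the 10% buffer, because that is what reproduces the 50K and 200K figures listed above; the function name is hypothetical.

```python
CHARS_PER_TOKEN = 3        # same ratio as the max_tokens estimate
TOKENS_PER_SECOND = 50     # conservative generation speed from the notes
OVERHEAD_SECONDS = 60
MIN_TIMEOUT = 30           # seconds
MAX_TIMEOUT = 2 * 60 * 60  # 2 hours


def dynamic_timeout(input_chars):
    """Return a request timeout in seconds scaled to the input length."""
    estimated_output_tokens = input_chars / CHARS_PER_TOKEN
    seconds = estimated_output_tokens / TOKENS_PER_SECOND + OVERHEAD_SECONDS
    return round(min(MAX_TIMEOUT, max(MIN_TIMEOUT, seconds)))
```

With these constants, 50,000 characters yield 393 seconds and 200,000 characters yield 1,393 seconds. The two-hour ceiling is reached at roughly one million characters.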

Test Fixes:
- Initialize source_lang attribute in Element class
- Set source_lang on elements during prepare_original (both handlers)
- Update test_get_metadata_elements to match new config access pattern
- Update Claude translate tests to expect default 30s timeout

Source: https://platform.claude.com/docs/en/about-claude/models/overview
Source: https://artificialanalysis.ai/models/claude-3-opus
@seidnerj seidnerj force-pushed the master branch 7 times, most recently from bf1fa8c to 8c45791 on January 4, 2026 11:59

Streaming Text Insertion:
- Disable widget updates during text insertion to prevent auto-scroll
- Append using separate QTextCursor on document (doesn't affect visible cursor)
- Only auto-scroll to bottom if user was already at bottom (watching stream)
- Preserve user's scroll position when they scroll up to read earlier content
- Qt version compatibility for QTextCursor.End enum (Qt5/Qt6)

Stop Button Fix:
- Add cancellation check inside streaming generator consumption loop
- Check cancel_request() on each character/chunk during streaming
- Raise TranslationCanceled immediately when stop is requested
- Apply to both single translation and batch mode streaming
- Fixes infinite 'Stopping...' state with large merged translations
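
The pattern behind this fix can be shown in isolation. In the sketch below, `TranslationCanceled` and `cancel_request` take their names from the list above, but the class definition and the `consume_stream` helper are stand-ins written for this example, not the plugin's code.

```python
class TranslationCanceled(Exception):
    """Raised when the user stops a translation that is in progress."""


def consume_stream(chunks, cancel_request):
    """Join streamed chunks, checking for cancellation before keeping each one.

    `chunks` is any iterable of strings (for example a streaming generator)
    and `cancel_request` is a zero-argument callable returning True once
    the user has pressed stop.
    """
    parts = []
    for chunk in chunks:
        if cancel_request():
            # Stop right away instead of draining the rest of the stream.
            raise TranslationCanceled('Translation canceled by user.')
        parts.append(chunk)
    return ''.join(parts)
```

Because the check runs once per chunk, a long merged translation stops at the next chunk boundary rather than after the whole response has been generated.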

Scroll Synchronization:
- Keep simple pixel-based scroll sync (most stable)
- Note: Minor drift can occur with word wrap when languages have different lengths
- Alternative sync methods (line-based, percentage-based) had worse issues
@seidnerj

seidnerj commented Jan 7, 2026

Contributor Author

@bookfere Apart from Claude prompt caching and the Claude Batch API, everything else has been tested over and over. Those two need additional testing, and I am hoping we can get the community to help test them. They have massive potential both to give better translations (full book context while translating ANY chunk, so discrepancies should not happen) and to reduce translation cost massively 🙏🙂🤞

@bookfere

bookfere commented Jan 8, 2026

Owner

@seidnerj

Thank you very much for contributing such a significant enhancement to the plugin! I’ll review the changes carefully over time, though this may take a while, as I haven’t had much time recently.

For the Claude feature, since I cannot use Claude in my location, I can only review the other components related to it.

By the way, for some business logic or pure functions, adding unit test cases would be appreciated. Although this work can be tedious, it can reduce some manual testing and also make it easier for others to understand the intent of your code :)

Comment thread lib/conversion.py Outdated
Comment on lines +250 to +306

# Translate metadata in background job before cache.done()
# This avoids UI freeze in translate_done() which runs in GUI thread
if not cache_only and convertor == convert_book:
    config = get_config()
    ebook_metadata_config = config.get('ebook_metadata') or {}
    if ebook_metadata_config:
        try:
            from calibre.ebooks.metadata.meta import get_metadata as read_metadata

            with open(output_path, 'r+b') as file:
                metadata = read_metadata(file, 'epub')

            # Disable streaming for metadata
            original_stream = translator.stream
            translator.stream = False

            def translate_and_cache(field_name, value):
                result = translator.translate(value)
                if hasattr(result, '__iter__') and not isinstance(result, str):
                    result = ''.join(result)
                if result and result.strip():
                    cache.set_info('translated_' + field_name, result.strip())

            # Translate each field if enabled
            if metadata.title and ebook_metadata_config.get('translate_title', False):
                translate_and_cache('title', metadata.title)
            if metadata.series and ebook_metadata_config.get('translate_series', False):
                translate_and_cache('series', metadata.series)
            if metadata.author_sort and ebook_metadata_config.get('translate_creator_file_as', False):
                translate_and_cache('author_sort', metadata.author_sort)
            if metadata.publisher and ebook_metadata_config.get('translate_publisher', False):
                translate_and_cache('publisher', metadata.publisher)
            if metadata.rights and ebook_metadata_config.get('translate_rights', False):
                translate_and_cache('rights', metadata.rights)
            if metadata.comments and ebook_metadata_config.get('translate_description', False):
                translate_and_cache('description', metadata.comments)
            if metadata.book_producer and ebook_metadata_config.get('translate_contributor', False):
                translate_and_cache('book_producer', metadata.book_producer)

            # Translate authors list
            if metadata.authors and ebook_metadata_config.get('translate_creator', False):
                translated_authors = []
                for author in metadata.authors:
                    result = translator.translate(author)
                    if hasattr(result, '__iter__') and not isinstance(result, str):
                        result = ''.join(result)
                    if result and result.strip():
                        translated_authors.append(result.strip())
                if translated_authors:
                    cache.set_info('translated_authors', '||'.join(translated_authors))

            translator.stream = original_stream
            log.info('Metadata translation completed in background')
        except Exception as e:
            log.warn('Failed to translate metadata in background: %s' % e)

Owner

The metadata is already stored in the cache with the book content (perhaps its representation in the UI could be improved), and users can fully control the translation progress. Is there a reason to implement separate translation logic for the metadata?

Contributor Author

My pleasure! My understanding is that there is no cache for the translated metadata values, or maybe I am mistaken or misunderstood what you meant?

Owner

Currently, the cache stores elements extracted from three types of pages, as follows:

elements.extend(get_metadata_elements(oeb.metadata))
elements.extend(get_toc_elements(oeb.toc.nodes, []))
elements.extend(get_page_elements(oeb.manifest.items))

The metadata elements to be translated are defined here:

names = (
    'title', 'creator', 'publisher', 'rights', 'subject', 'contributor',
    'description')

Once the metadata_translation configuration is enabled, these items will be translated.


I think enumerating all metadata names and making them selectable is unnecessary, because we cannot predict every case. Maintaining a metadata list should be sufficient.

It would be beneficial for all these elements to share a common translation logic, covering all scenarios, including applying user configurations and handling various exceptions.

@seidnerj

Contributor Author

Architecture Update: Separate Metadata/TOC Translation

Thank you for the feedback about maintaining a unified architecture. After extensive investigation and testing, I've implemented a hybrid solution that addresses your concerns while working around a fundamental Calibre limitation.

Investigation Results:

I thoroughly tested the unified element-based approach where metadata is translated during OEB processing alongside content. The findings:

  • Metadata elements ARE extracted and translated correctly
  • Translations ARE applied to element.content during OEB processing
  • However, none of the metadata translations appear in the final EPUB file

After examining Calibre's source code (plumber.py, oeb/base.py, conversion plugins), I confirmed that:

  • The same OEB instance flows through the entire pipeline
  • Metadata Item objects are references (modifications should persist)
  • to_opf2() writes element.value directly to the OPF file

Theoretically, the unified approach should work. But empirical testing shows it doesn't - metadata remains untranslated in the output file.

New Architecture:

The current implementation uses a hybrid approach:

  1. Content/TOC: Continue using the unified element system (unchanged)
  2. Metadata: Separate translation system with individual field handling

Why This Is Necessary:

  • Each metadata field stored separately in cache (not merged)
  • Visible in Advanced Mode UI with "Type" column indicator
  • Translates during OEB phase (same timing as content)
  • Bypasses ElementHandler to avoid merging
  • Only enabled fields appear in UI

Benefits:

  • ✅ Addresses the Calibre limitation that prevents unified metadata translation
  • ✅ Each metadata field individually cached and displayed
  • ✅ Works with old caches (automatic synchronization)
  • ✅ Clean UI separation (Metadata/TOC/Content types)
  • ✅ Maintains shared cache infrastructure

Cache Compatibility:

  • Automatically detects old cache structure
  • Adds missing metadata/TOC entries
  • Removes obsolete merged entries
  • Logs all migrations for transparency

The code is more complex than pure unification, but it's the only approach that actually works given Calibre's conversion behavior. I'm open to suggestions if there's a way to make the unified approach work with Calibre's converter.

…OC translation

Cache ID Refactoring:
- Extract get_cache_id() helper function in lib/utils.py
- Single source of truth for cache ID calculation
- Used in: convert_item, translate_done, PreparationWorker
- Formula: uid(input_path + engine_name + target_lang + merge_length + encoding_suffix)
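
As a rough sketch of the "single source of truth" idea, the helper could look like the following. The `uid` function here is a hypothetical stand-in (an MD5 digest over the concatenated parts); how the plugin's own `uid` hashes its input is not shown in this PR, so the exact digest is an assumption.

```python
import hashlib


def uid(*parts):
    """Hypothetical stand-in for the plugin's uid(): MD5 of the joined parts."""
    joined = ''.join(str(part) for part in parts)
    return hashlib.md5(joined.encode('utf-8')).hexdigest()


def get_cache_id(input_path, engine_name, target_lang, merge_length,
                 encoding_suffix=''):
    """Compute the cache ID from the inputs named in the formula above."""
    return uid(input_path, engine_name, target_lang, merge_length,
               encoding_suffix)
```

Routing every call site (`convert_item`, `translate_done`, `PreparationWorker`) through one function means a future change to the formula cannot leave two sites computing different IDs for the same book.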

Metadata/TOC Separate Translation Architecture:
- Translate each metadata field individually (not merged with content)
- Store in cache with page='content.opf' for UI display
- Only enabled fields appear in cache and Advanced Mode UI
- Auto-populate title_sort from title, author_sort from creator
- Works with both new and old caches via cache synchronization

TOC Translation:
- Merge all TOC entries into single paragraph
- Store with page='toc.ncx' for UI identification
- Translate as one unit, split back to individual elements
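
The merge-and-split step can be sketched as below. The separator, the function name, and the strict length check are assumptions made for illustration; the PR does not say which delimiter is used or how a mismatch is handled.

```python
SEPARATOR = '\n\n'  # assumed delimiter between TOC entries


def translate_toc(entries, translate):
    """Translate TOC entries as one unit and split the result back.

    `translate` is any callable that maps a source string to its translation.
    """
    merged = SEPARATOR.join(entries)
    translated = translate(merged)
    parts = [part.strip() for part in translated.split(SEPARATOR)]
    if len(parts) != len(entries):
        # The engine merged or dropped entries, so they cannot be realigned.
        raise ValueError(
            'Expected %d TOC entries, got %d' % (len(entries), len(parts)))
    return parts
```

The length check matters because a translation engine may collapse or reorder separators; failing loudly is one option, and falling back to per-entry translation would be another.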

Cache Synchronization:
- Detect old cache structure and add missing metadata/TOC entries
- Remove obsolete entries when structure changes
- Graceful migration from old merged format to new separated format

UI Improvements:
- Add Type column showing Metadata/TOC/Content in Advanced Mode
- Metadata and TOC rows appear before translation
- Alignment checks skip metadata/TOC (content only)
- Add log_content setting for verbose paragraph display

Bug Fixes:
- Fix Cache Manager delete for non-consecutive rows (IndexError)
- Collect selected rows before deletion to avoid index shifting
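
The index-shifting problem and its usual remedy can be shown with a plain list. This is a generic sketch with a hypothetical `delete_rows` helper, not the Cache Manager's Qt code.

```python
def delete_rows(rows, selected_indexes):
    """Delete the selected indexes from `rows` in place and return it.

    The indexes are collected first and removed from the highest down, so
    deleting one row never shifts the position of a row still to be deleted.
    """
    for index in sorted(set(selected_indexes), reverse=True):
        del rows[index]
    return rows
```

Deleting in ascending order instead would shift later rows down after each removal, which can remove the wrong row or raise an `IndexError` for non-consecutive selections.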

Settings:
- title_sort/author_sort follow parent fields (no separate settings)
- Remove translate_missing_metadata (not needed with new architecture)

Test Fixes:
- Add log_content to test defaults
- Update tests for new metadata architecture and 3-tuple structure
- Fix ElementHandlerMerge to use 5-tuple (no page_id)
@seidnerj seidnerj changed the title from "feat(anthropic): add UI controls for Claude extended output and context beta features" to "feat: v2.4.2 - Claude enhancements, metadata translation architecture, and RTL/LTR ebook support" on Jan 12, 2026
@bookfere

Owner

I appreciate your valuable time and effort. However, I would like to make some clarifications.

> Theoretically, the unified approach should work. But empirical testing shows it doesn't - metadata remains untranslated in the output file.

This is not true.

When testing metadata translation (on the master branch), you should enable the "Metadata Translation" option and make sure to delete any existing cache.

If the cache is generated while the option is disabled, the metadata element will be marked as ignored. Even if you re-enable the option, it will not be retranslated or shown in the UI while using the same cache (this unexpected behavior can be fixed).

The three element types I mentioned in the previous comment are all based on the Element class and implement the same methods, so the ElementHandler class can handle them seamlessly. Separating them is unnecessary.

Metadata should not be treated as a separate entity; it is simply a set of attributes stored in an XML file (.opf), which is no different from .ncx and .xhtml files. Cataloging them with the identifier content.opf as follows is sufficient.

Overall, the original business logic for metadata translation has no issues. We only need to make the metadata element more explicit in the UI and ensure that when a user re-enables the "Metadata Translation" option, it can be translated correctly.

I suggest that this PR only provides the Claude feature, and the other two changes be submitted as separate PRs.

@seidnerj

Contributor Author

Closing this PR to split into separate focused PRs as requested:

  1. Claude enhancements (prompt caching, batch API, extended output/context)
  2. RTL/LTR ebook formatting support
  3. Metadata translation architecture

This will make review easier and allow merging features independently.

@seidnerj seidnerj closed this Jan 12, 2026
@seidnerj

Contributor Author

Split this into 3 separate PRs:

#540
#541
#542

You can review each individually and we can discuss which changes might be needed. I think 1 and 2 are not controversial. Regarding the 3rd one, I honestly could never get metadata translation to work, no matter what I tried, so either I am missing something or one of us has incorrect assumptions about something. Happy to discuss - I just want this to work! :)


2 participants