feat: v2.4.2 - Claude enhancements, metadata translation architecture, and RTL/LTR ebook support #537
seidnerj wants to merge 5 commits into
… features
- Add configuration options for 128K output tokens (Claude 3.7 Sonnet)
- Add configuration option for 1M context window (Claude Sonnet 4.0/4.5)
- Implement conditional UI checkboxes that show only for relevant models
- Refactor get_headers() to use config values instead of hardcoded logic
- Update default model to claude-sonnet-4-5 (official API alias)
- Add token estimate helper for merge translation feature
- Display estimates based on source language and model-specific context windows
- Show real-time updates when merge length, source language, or context settings change
- Include comprehensive mapping for all Claude 4.5, 4.x, and 3.x model context windows
- Properly hide entire rows (labels + checkboxes) when features not applicable
- Persist all beta feature settings to plugin configuration
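The get_headers() refactoring described above could look roughly like the sketch below. The config keys (`enable_extended_output`, `enable_extended_context`) are hypothetical names chosen for illustration, not the plugin's actual settings, and the beta identifiers are the ones Anthropic has published for these two features at the time of writing; they should be checked against the current API documentation before use.

```python
def get_headers(api_key, config):
    """Build Claude API request headers from plugin config values.

    `config` is a plain dict; the two keys read here are hypothetical.
    """
    headers = {
        'x-api-key': api_key,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
    }
    betas = []
    # 128K output tokens (Claude 3.7 Sonnet), opt-in via config.
    if config.get('enable_extended_output'):
        betas.append('output-128k-2025-02-19')
    # 1M context window (Claude Sonnet 4.0/4.5), opt-in via config.
    if config.get('enable_extended_context'):
        betas.append('context-1m-2025-08-07')
    # Several beta features are sent as one comma-separated header value.
    if betas:
        headers['anthropic-beta'] = ','.join(betas)
    return headers
```

Driving the header from stored settings rather than from model-name checks keeps the opt-in decision with the user, which matches the "persist all beta feature settings" item.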
…support
Metadata Translation Enhancements:
- Replace single 'translate all' checkbox with granular field controls
- Add individual checkboxes for: title, creator, creator file-as, publisher, series
- Set language code (dc:language) to target language by default
- Add special handling for calibre:series and author_sort fields
- Translate series and author_sort via direct translation calls post-conversion
- Maintain backward compatibility with old metadata_translation config flag
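One way the backward compatibility with the old flag might work is sketched below. The `ebook_metadata` and `translate_<field>` keys mirror the diff shown later in this conversation; the fallback rule itself (the legacy `metadata_translation` flag acting as the default for any field without an explicit setting) is an assumption, not necessarily how the PR resolves it.

```python
# Field names follow the checkboxes listed above (illustrative subset).
METADATA_FIELDS = ('title', 'creator', 'creator_file_as', 'publisher', 'series')


def resolve_metadata_config(config):
    """Return {field: bool} telling which metadata fields to translate.

    An explicit per-field setting wins; otherwise the legacy
    'metadata_translation' flag is used as the default (assumed rule).
    """
    per_field = config.get('ebook_metadata') or {}
    legacy_default = bool(config.get('metadata_translation', False))
    return {
        field: bool(per_field.get('translate_' + field, legacy_default))
        for field in METADATA_FIELDS
    }
```

With this rule, a user who only ever set the old flag keeps the previous translate-everything behavior, while new per-field choices override it.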
RTL/LTR Formatting Support:
- Add conditional text-align inline styles (only when source/target directions differ)
- Add RTL/LTR OPF metadata: primary-writing-mode and page-progression-direction
- Only apply formatting when translating between different text directions
- Support bidirectional translation (LTR→RTL and RTL→LTR)
Language Direction Support:
- Expand lang_directionality dictionary with comprehensive RTL language coverage
- Add: Farsi, Dari, Urdu, Yiddish, Pashto (RTL languages)
- Add: French, Portuguese, Russian, Chinese, Japanese, Korean (LTR languages)
- Support both legacy ('iw') and modern ('he') Hebrew codes
- Track source language in ElementHandler for direction comparison
Technical Improvements:
- Build inline styles with semicolon-separated properties
- Add robust error handling for OPF metadata modifications
- Log direction changes and metadata updates for debugging
- Use hasattr checks for OEB structure compatibility across Calibre versions
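The direction comparison and inline-style building listed above can be sketched as follows. The dictionary contents, the use of ISO 639-1 codes as keys, and both function names are assumptions made for illustration; the plugin's lang_directionality mapping may be keyed differently.

```python
# Hypothetical excerpt of a direction table keyed by language code.
LANG_DIRECTIONALITY = {
    'ar': 'rtl', 'he': 'rtl', 'iw': 'rtl', 'fa': 'rtl',
    'ur': 'rtl', 'yi': 'rtl', 'ps': 'rtl',
    'en': 'ltr', 'fr': 'ltr', 'pt': 'ltr', 'ru': 'ltr',
    'zh': 'ltr', 'ja': 'ltr', 'ko': 'ltr',
}


def get_direction(lang_code):
    """Look up text direction; unknown codes default to LTR."""
    base = lang_code.lower().replace('_', '-').split('-')[0]
    return LANG_DIRECTIONALITY.get(base, 'ltr')


def build_inline_style(source_lang, target_lang, existing=''):
    """Return a semicolon-separated style string.

    text-align is appended only when source and target directions differ.
    """
    props = [p.strip() for p in existing.split(';') if p.strip()]
    target_dir = get_direction(target_lang)
    if get_direction(source_lang) != target_dir:
        align = 'right' if target_dir == 'rtl' else 'left'
        props.append('text-align: %s' % align)
    return '; '.join(props)
```

For example, `build_inline_style('en', 'he')` yields a right-aligned style, while an English-to-French translation leaves the existing style untouched.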
…timeout scaling
Dynamic max_tokens Calculation:
- Calculate max_tokens based on estimated output length (input / 3 chars per token)
- Use model-specific max output limits per official documentation:
  - Claude 3.7 Sonnet with extended output: 128K tokens
  - Claude 4.x models (Sonnet/Haiku/Opus 4.5, Opus 4.1, Sonnet 4.0): 64K tokens
  - Claude Opus 4.0: 32K tokens
  - Claude 3.x Haiku: 4K tokens
  - Other Claude models: 32K default
- Add 10% safety buffer to estimated output
- Minimum 4096 tokens maintained
Dynamic Request Timeout (Opt-in):
- Add UI checkbox to enable dynamic timeout (Claude only, default: disabled)
- When disabled: Uses user-configured timeout (default 30s)
- When enabled: Scales based on content length and token generation speed
- Formula: (estimated_output_tokens / 50) + 60s overhead
- Conservative 50 tokens/second estimate (Claude generates 65-120 tokens/sec)
- Timeout examples when enabled:
  - 10K chars → 93s (~1.5 min)
  - 50K chars → 393s (~6.5 min)
  - 200K chars → 1,393s (~23 min)
- Minimum 30 seconds, maximum 2 hours
- Restores original timeout after each request
Test Fixes:
- Initialize source_lang attribute in Element class
- Set source_lang on elements during prepare_original (both handlers)
- Update test_get_metadata_elements to match new config access pattern
- Update Claude translate tests to expect default 30s timeout
Source: https://platform.claude.com/docs/en/about-claude/models/overview
Source: https://artificialanalysis.ai/models/claude-3-opus
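The two calculations above can be written out as a small sketch. All names are hypothetical, the constants are taken from the commit message, and the model limit is passed in by the caller rather than looked up, since the model-matching logic is not shown here. It also assumes the 10% buffer applies to max_tokens only and not to the timeout estimate.

```python
import math

CHARS_PER_TOKEN = 3        # input / 3 chars per token
SAFETY_BUFFER = 1.10       # 10% safety buffer on estimated output
MIN_MAX_TOKENS = 4096      # minimum max_tokens
TOKENS_PER_SECOND = 50     # conservative generation speed
OVERHEAD_SECONDS = 60      # fixed overhead per request
MIN_TIMEOUT = 30           # seconds
MAX_TIMEOUT = 2 * 60 * 60  # 2 hours, in seconds


def estimate_output_tokens(text_length):
    """Estimate output tokens from the input length in characters."""
    return math.ceil(text_length / CHARS_PER_TOKEN)


def dynamic_max_tokens(text_length, model_limit):
    """Buffered estimate, clamped between the minimum and the model limit."""
    buffered = math.ceil(estimate_output_tokens(text_length) * SAFETY_BUFFER)
    return min(max(buffered, MIN_MAX_TOKENS), model_limit)


def dynamic_timeout(text_length):
    """Timeout in whole seconds: tokens / speed + overhead, clamped."""
    seconds = (estimate_output_tokens(text_length) / TOKENS_PER_SECOND
               + OVERHEAD_SECONDS)
    return int(min(max(seconds, MIN_TIMEOUT), MAX_TIMEOUT))
```

Under these assumptions, 50K characters gives 393 s and 200K characters gives 1,393 s, in line with the examples listed. The same reading gives roughly 127 s for 10K characters, so the 93 s figure presumably reflects a detail this sketch does not capture.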
bf1fa8c to 8c45791
Streaming Text Insertion:
- Disable widget updates during text insertion to prevent auto-scroll
- Append using separate QTextCursor on document (doesn't affect visible cursor)
- Only auto-scroll to bottom if user was already at bottom (watching stream)
- Preserve user's scroll position when they scroll up to read earlier content
- Qt version compatibility for QTextCursor.End enum (Qt5/Qt6)
Stop Button Fix:
- Add cancellation check inside streaming generator consumption loop
- Check cancel_request() on each character/chunk during streaming
- Raise TranslationCanceled immediately when stop is requested
- Apply to both single translation and batch mode streaming
- Fixes infinite 'Stopping...' state with large merged translations
Scroll Synchronization:
- Keep simple pixel-based scroll sync (most stable)
- Note: Minor drift can occur with word wrap when languages have different lengths
- Alternative sync methods (line-based, percentage-based) had worse issues
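The stop-button fix above amounts to checking for cancellation inside the loop that drains the streaming generator. A minimal sketch follows; the exception class is a stand-in for the plugin's TranslationCanceled, and `consume_stream`, `cancel_request`, and `on_chunk` are illustrative names with assumed signatures.

```python
class TranslationCanceled(Exception):
    """Stand-in for the plugin's exception of the same name."""


def consume_stream(chunks, cancel_request, on_chunk=None):
    """Drain a streaming generator, aborting as soon as a stop is requested.

    chunks:         iterable yielding pieces of translated text
    cancel_request: zero-argument callable returning True once stop is pressed
    on_chunk:       optional callback, e.g. to append text to the UI
    """
    parts = []
    for chunk in chunks:
        # Checking on every chunk keeps the stop latency to one chunk,
        # instead of waiting for the whole (possibly huge) response.
        if cancel_request():
            raise TranslationCanceled('Translation canceled by user.')
        parts.append(chunk)
        if on_chunk is not None:
            on_chunk(chunk)
    return ''.join(parts)
```

Because the check sits inside the loop, a large merged translation no longer has to finish streaming before the cancel request is noticed.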
@bookfere Apart from the Claude prompt caching and the Claude Batch API, everything else has been tested over and over. Those two need additional testing. I am hoping we could get the community to help test those. These have massive potential both to give better translations (full context while translating ANY chunk of the book, so discrepancies should not happen) and to massively reduce translation cost 🙏🙂🤞
Thank you very much for contributing such a significant enhancement to the plugin! I’ll review the changes carefully over time, though this may take a while, as I haven’t had much time recently. For the Claude feature, since I cannot use Claude in my location, I can only review the other components related to it. By the way, for some business logic or pure functions, adding unit test cases would be appreciated. Although this work can be tedious, it can reduce some manual testing and also make it easier for others to understand the intent of your code :)
# Translate metadata in background job before cache.done()
# This avoids UI freeze in translate_done() which runs in GUI thread
if not cache_only and convertor == convert_book:
    config = get_config()
    ebook_metadata_config = config.get('ebook_metadata') or {}
    if ebook_metadata_config:
        try:
            from calibre.ebooks.metadata.meta import get_metadata as read_metadata

            with open(output_path, 'r+b') as file:
                metadata = read_metadata(file, 'epub')

            # Disable streaming for metadata
            original_stream = translator.stream
            translator.stream = False

            def translate_and_cache(field_name, value):
                result = translator.translate(value)
                if hasattr(result, '__iter__') and not isinstance(result, str):
                    result = ''.join(result)
                if result and result.strip():
                    cache.set_info('translated_' + field_name, result.strip())

            # Translate each field if enabled
            if metadata.title and ebook_metadata_config.get('translate_title', False):
                translate_and_cache('title', metadata.title)
            if metadata.series and ebook_metadata_config.get('translate_series', False):
                translate_and_cache('series', metadata.series)
            if metadata.author_sort and ebook_metadata_config.get('translate_creator_file_as', False):
                translate_and_cache('author_sort', metadata.author_sort)
            if metadata.publisher and ebook_metadata_config.get('translate_publisher', False):
                translate_and_cache('publisher', metadata.publisher)
            if metadata.rights and ebook_metadata_config.get('translate_rights', False):
                translate_and_cache('rights', metadata.rights)
            if metadata.comments and ebook_metadata_config.get('translate_description', False):
                translate_and_cache('description', metadata.comments)
            if metadata.book_producer and ebook_metadata_config.get('translate_contributor', False):
                translate_and_cache('book_producer', metadata.book_producer)

            # Translate authors list
            if metadata.authors and ebook_metadata_config.get('translate_creator', False):
                translated_authors = []
                for author in metadata.authors:
                    result = translator.translate(author)
                    if hasattr(result, '__iter__') and not isinstance(result, str):
                        result = ''.join(result)
                    if result and result.strip():
                        translated_authors.append(result.strip())
                if translated_authors:
                    cache.set_info('translated_authors', '||'.join(translated_authors))

            translator.stream = original_stream
            log.info('Metadata translation completed in background')
        except Exception as e:
            log.warn('Failed to translate metadata in background: %s' % e)
The metadata is already stored in the cache with the book content (perhaps its representation in the UI could be improved), and users can fully control the translation progress. Is there a reason to implement separate translation logic for the metadata?
My pleasure! My understanding is that there is no cache for the translated metadata values, or maybe I am mistaken or misunderstood what you meant?
Currently, the cache stores elements extracted from three types of pages, as follows:
Ebook-Translator-Calibre-Plugin/lib/conversion.py
Lines 178 to 180 in 2232a79
The metadata elements to be translated are defined here:
Ebook-Translator-Calibre-Plugin/lib/element.py
Lines 843 to 845 in 2232a79
Once the metadata_translation configuration is enabled, these items will be translated.
I think enumerating all metadata names and making them selectable is unnecessary, because we cannot predict every case. Maintaining a metadata list should be sufficient.
It would be beneficial for all these elements to share a common translation logic, covering all scenarios, including applying user configurations and handling various exceptions.
Architecture Update: Separate Metadata/TOC Translation
Thank you for the feedback about maintaining a unified architecture. After extensive investigation and testing, I've implemented a hybrid solution that addresses your concerns while working around a fundamental Calibre limitation.
Investigation Results:
I thoroughly tested the unified element-based approach where metadata is translated during OEB processing alongside content. The findings:
After examining Calibre's source code (plumber.py, oeb/base.py, conversion plugins), I confirmed that:
Theoretically, the unified approach should work. But empirical testing shows it doesn't - metadata remains untranslated in the output file.
New Architecture:
The current implementation uses a hybrid approach:
Why This Is Necessary:
Benefits:
Cache Compatibility:
The code is more complex than pure unification, but it's the only approach that actually works given Calibre's conversion behavior. I'm open to suggestions if there's a way to make the unified approach work with Calibre's converter.
…OC translation
Cache ID Refactoring:
- Extract get_cache_id() helper function in lib/utils.py
- Single source of truth for cache ID calculation
- Used in: convert_item, translate_done, PreparationWorker
- Formula: uid(input_path + engine_name + target_lang + merge_length + encoding_suffix)
Metadata/TOC Separate Translation Architecture:
- Translate each metadata field individually (not merged with content)
- Store in cache with page='content.opf' for UI display
- Only enabled fields appear in cache and Advanced Mode UI
- Auto-populate title_sort from title, author_sort from creator
- Works with both new and old caches via cache synchronization
TOC Translation:
- Merge all TOC entries into single paragraph
- Store with page='toc.ncx' for UI identification
- Translate as one unit, split back to individual elements
Cache Synchronization:
- Detect old cache structure and add missing metadata/TOC entries
- Remove obsolete entries when structure changes
- Graceful migration from old merged format to new separated format
UI Improvements:
- Add Type column showing Metadata/TOC/Content in Advanced Mode
- Metadata and TOC rows appear before translation
- Alignment checks skip metadata/TOC (content only)
- Add log_content setting for verbose paragraph display
Bug Fixes:
- Fix Cache Manager delete for non-consecutive rows (IndexError)
- Collect selected rows before deletion to avoid index shifting
Settings:
- title_sort/author_sort follow parent fields (no separate settings)
- Remove translate_missing_metadata (not needed with new architecture)
Test Fixes:
- Add log_content to test defaults
- Update tests for new metadata architecture and 3-tuple structure
- Fix ElementHandlerMerge to use 5-tuple (no page_id)
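Two of the items above lend themselves to a short sketch: the shared cache ID helper and the fix for deleting non-consecutive rows. Both functions are illustrative. The `uid` shown here is an MD5-based stand-in rather than the plugin's own helper, the parameter names simply follow the formula in the commit message, and deleting in descending index order is one common way to avoid index shifting, not necessarily the approach taken in the PR.

```python
import hashlib


def uid(*parts):
    """Stand-in hash helper: hex MD5 over the string form of each part."""
    digest = hashlib.md5()
    for part in parts:
        digest.update(str(part).encode('utf-8'))
    return digest.hexdigest()


def get_cache_id(input_path, engine_name, target_lang, merge_length,
                 encoding_suffix=''):
    """Single place where the cache ID is derived from its inputs."""
    return uid(input_path + engine_name + target_lang
               + str(merge_length) + encoding_suffix)


def delete_rows(rows, selected_indexes):
    """Delete the selected rows in place and return the list.

    The indexes are collected up front and removed from highest to
    lowest, so earlier deletions never shift the ones still pending.
    """
    for index in sorted(set(selected_indexes), reverse=True):
        del rows[index]
    return rows
```

Calling the same helper from every site that needs a cache ID means a change to the formula cannot leave one caller looking up a different cache than another.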
I appreciate your valuable time and effort. However, I would like to make some clarifications.
This is not true. When testing metadata translation (on the master branch), you should enable the "Metadata Translation" option and make sure to delete any existing cache. If the cache is generated while the option is disabled, the metadata element will be marked as
The three element types I mentioned in the previous comment are all based on the
Metadata should not be treated as a separate entity; it is simply a set of attributes stored in an XML file (.opf), which is no different from .ncx and .xhtml files. Cataloging them with the identifier
Overall, the original business logic for metadata translation has no issues. We only need to make the metadata element more explicit in the UI and ensure that when a user re-enables the "Metadata Translation" option, it can be translated correctly.
I suggest that this PR only provides the Claude feature, and the other two changes be submitted as separate PRs.
|
Closing this PR to split into separate focused PRs as requested:
This will make review easier and allow merging features independently.
Split this into 3 separate PRs:
You can review each individually and we can discuss which changes might be needed. I think 1 and 2 are not controversial. Regarding the third one, I honestly could never get metadata translation to work, no matter what I tried, so either I am missing something or one of us has incorrect assumptions about something. Happy to discuss - I just want this to work! :)
Summary
Major release (v2.4.2) adding advanced translation features, comprehensive metadata translation system with background processing, and RTL/LTR ebook formatting support.
Translation Engine Enhancements
Prompt Caching
Batch Translation API
Dynamic Token Management
Dynamic Timeout (Opt-in)
Token Estimation Helper
Beta Feature Controls
Metadata Translation (9 Fields)
Individual Field Controls
Background Processing
Fields Translated
RTL/LTR Ebook Formatting
Complete RTL Support
Language Support
Efficient Implementation
UI/UX Improvements
Streaming Enhancements
Cache Consistency
Version
Test Results
✅ Metadata translation (all 9 fields)
✅ RTL formatting (Hebrew, Arabic)
✅ Background processing (no UI freeze)
✅ Old cache fallback works
✅ Stop button immediate
✅ Scroll preservation