Skip to content

feat: Claude enhancements - prompt caching, batch API, and translation improvements#540

Open
seidnerj wants to merge 4 commits intobookfere:masterfrom
seidnerj:pr/claude-enhancements
Open

feat: Claude enhancements - prompt caching, batch API, and translation improvements#540
seidnerj wants to merge 4 commits intobookfere:masterfrom
seidnerj:pr/claude-enhancements

Conversation

@seidnerj
Copy link
Copy Markdown
Contributor

@seidnerj seidnerj commented Jan 12, 2026

Summary

Advanced translation features specifically for the Claude (Anthropic) engine, including cost-saving optimizations and reliability improvements.

Translation Refusal Detection & Retry

  • When prompt caching provides full book context, Claude may refuse to translate content it identifies as copyrighted
  • Enhanced system prompt with anti-refusal language for both standard and cached modes
  • Two-stage refusal detection: fast heuristic pre-filter (pattern matching) followed by LLM classification to avoid false positives
  • Automatic retry (up to 3 attempts) when refusal is detected in translate_paragraph
  • Handles edge cases like translating copyright-related text to English without false positives

Prompt Caching (90% Cost Reduction)

  • Enable parallel section translation with full book context
  • Cache full book text as system message
  • 90% input token cost reduction on subsequent sections
  • Works with existing merge_length for section sizing
  • UI checkbox in Claude settings (default: disabled)

Batch Translation API (50% Cost Reduction)

  • New asynchronous bulk translation engine: ClaudeBatchTranslate
  • Process up to 100,000 messages per batch
  • 50% cost reduction compared to standard API
  • Combined savings: Up to 84% total cost with prompt caching
  • Trade-off: Processing time up to 24 hours (most complete <1 hour)
  • Automatic polling with progress logging

Dynamic Token Management

  • Automatic max_tokens scaling based on input length
  • Model-specific output limits (4K-128K depending on model)
  • Pre-translation warning if merge length exceeds model capacity
  • Prevents partial/truncated translations
  • Language-aware token estimation helper

Extended Output/Context (Beta Features)

  • Extended output: 128K tokens (Claude 3.7 Sonnet)
  • Extended context: 1M token window (Claude Sonnet 4.0/4.5)
  • Conditional UI visibility based on selected model
  • UI checkboxes for opt-in beta features

Dynamic Timeout (Opt-in)

  • Content-based timeout scaling for large translations
  • Example: 50K chars → 6.5min, 200K chars → 23min
  • Prevents timeout failures on very large merged sections
  • UI checkbox (default: disabled)

Streaming Improvements

  • Immediate stop button response during streaming
  • Preserve scroll position when adding streamed text
  • Prevent text corruption on clicks during streaming
  • Qt5/Qt6 compatibility

Testing

  • Prompt caching with full context
  • Batch API submission and retrieval
  • Dynamic max_tokens prevents truncation
  • Extended output/context settings
  • Streaming stop button works immediately
  • Timeout scaling handles large content
  • Refusal detection with copyrighted ebook content
  • Refusal retry loop completes successfully
  • No false positives when translating copyright-related text

Version

Bumped to v2.4.2 with comprehensive CHANGELOG

seidnerj and others added 2 commits March 14, 2026 15:30
…vements

- Add prompt caching support with configurable cache control headers
- Add batch API support for cost-efficient bulk translations
- Add UI controls for extended output and context beta features
- Improve streaming text insertion and stop button functionality
- Prevent partial translations with dynamic max_tokens and opt-in
  timeout scaling
- Bump version to 2.4.2
When prompt caching is enabled and the full book context is in the
system prompt, Claude sometimes refuses to translate content it
identifies as copyrighted. This adds:

- Anti-refusal language to the base and cached system prompts
- Two-stage refusal detection: heuristic pre-filter (pattern matching)
  followed by LLM classification to avoid false positives
- Automatic retry loop (up to 3 attempts) in translate_paragraph
- install.sh script for quick local Calibre plugin installation
@seidnerj seidnerj force-pushed the pr/claude-enhancements branch from 62ad7c5 to 2b65a60 Compare March 14, 2026 13:30
Add partial-translation refusal patterns (scope limiting, offering
alternatives) and bump refusal_max_retries from 3 to 5.
- Raise TranslationFailed when all refusal retries are exhausted instead
  of silently saving the refusal text as the translation
- Fix cache_enabled defaulting to disabled when config value is None
  (use `is not False` so None defaults to enabled)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant