arc53/fast-ebook

fast-ebook

Rust-powered EPUB2/EPUB3 library for Python. Fast reading, writing, validation, and Markdown conversion. MIT-licensed.

How fast?

Converting War and Peace (1.8 MB EPUB) to a single markdown document:

Library                           Time     Relative speed
ebooklib + html2text              375 ms   baseline
fast-ebook book.to_markdown()     56 ms    6.7x faster

from fast_ebook import epub

book = epub.read_epub('war_and_peace.epub')
markdown = book.to_markdown()

Other operations on the same book are faster too: reading and extracting every chapter is 3x faster, and get_item_with_id is 78x faster. Full numbers and methodology: docs/benchmarks.md.
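To check the numbers on your own machine, a small timing harness along these lines may help. It is only a sketch: best_ms, speedup, and convert_with_fast_ebook are hypothetical helper names, and the method (best of a few wall-clock runs) is an assumption, not necessarily the methodology used in docs/benchmarks.md.

```python
import time


def best_ms(fn, repeat=5):
    """Return the fastest wall-clock time of fn() over `repeat` runs, in milliseconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def convert_with_fast_ebook(path):
    """Read an EPUB and convert it to Markdown using the calls shown above."""
    # Imported lazily so the timing helpers work without the package installed.
    from fast_ebook import epub

    return epub.read_epub(path).to_markdown()
```

With the figures from the table, speedup(375, 56) is about 6.7. To time a real conversion, pass a zero-argument callable such as best_ms(lambda: convert_with_fast_ebook('war_and_peace.epub')).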

Installation

pip install fast-ebook

Quick Start

Migration from ebooklib

The public API mirrors ebooklib, so for most code you only need to change the import:

from fast_ebook import epub
import fast_ebook  # for ITEM_* constants

Or use the drop-in compatibility layer for a one-line change:

import fast_ebook.compat as ebooklib
from fast_ebook.compat import epub

Read

from fast_ebook import epub
import fast_ebook

book = epub.read_epub('book.epub')

print(book.get_metadata('DC', 'title'))

for img in book.get_items_of_type(fast_ebook.ITEM_IMAGE):
    print(img.get_name(), len(img.get_content()), 'bytes')

item = book.get_item_with_id('chapter1')

for entry in book.toc:
    print(entry.title, entry.href)

read_epub also accepts a Path, bytes, or BytesIO. See docs/api.md for the full reading API, the available options (lazy, ignore_ncx, ignore_nav), and the context manager form.
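As a rough illustration of the non-string inputs, the sketch below prepares the same file as a Path, as bytes, and as a BytesIO, then passes each one to read_epub. The helpers epub_sources and titles_from_all_sources are hypothetical names introduced here, and the exact return shape of get_metadata is not assumed.

```python
from io import BytesIO
from pathlib import Path


def epub_sources(path):
    """Return one EPUB file in the three in-memory/path forms read_epub is said to accept."""
    p = Path(path)
    data = p.read_bytes()
    return {"path": p, "bytes": data, "stream": BytesIO(data)}


def titles_from_all_sources(path):
    """Read the same book from each source kind and collect its title metadata."""
    # Imported lazily so epub_sources can be used without the package installed.
    from fast_ebook import epub

    return {
        kind: epub.read_epub(source).get_metadata('DC', 'title')
        for kind, source in epub_sources(path).items()
    }
```

Reading from bytes or BytesIO is handy when the EPUB arrives over the network or from a database and never touches disk.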

Write

from fast_ebook import epub

book = epub.EpubBook()
book.set_identifier('id123')
book.set_title('My Book')
book.set_language('en')
book.add_author('Author Name')

c1 = epub.EpubHtml(title='Intro', file_name='chap_01.xhtml', lang='en')
c1.content = '<h1>Hello</h1><p>World</p>'

book.add_item(c1)
book.add_item(epub.EpubNcx())
book.add_item(epub.EpubNav())

book.toc = [epub.Link('chap_01.xhtml', 'Introduction', 'intro')]
book.spine = ['nav', c1]

epub.write_epub('output.epub', book)
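The same calls extend naturally to several chapters. The following is a minimal sketch that uses only the API shown above; build_book and chapter_file_name are hypothetical helpers, and the Link uid scheme ('chap1', 'chap2', ...) is an arbitrary choice for illustration.

```python
def chapter_file_name(index):
    """File name for a 1-based chapter index, e.g. 1 -> 'chap_01.xhtml'."""
    return f"chap_{index:02d}.xhtml"


def build_book(identifier, title, author, chapters, language="en"):
    """Build an EpubBook from a list of (chapter_title, html) pairs."""
    # Imported lazily so chapter_file_name works without the package installed.
    from fast_ebook import epub

    book = epub.EpubBook()
    book.set_identifier(identifier)
    book.set_title(title)
    book.set_language(language)
    book.add_author(author)

    items, toc = [], []
    for i, (chapter_title, html) in enumerate(chapters, start=1):
        file_name = chapter_file_name(i)
        item = epub.EpubHtml(title=chapter_title, file_name=file_name, lang=language)
        item.content = html
        book.add_item(item)
        items.append(item)
        # One TOC entry per chapter; the uid is a made-up naming scheme.
        toc.append(epub.Link(file_name, chapter_title, f"chap{i}"))

    book.add_item(epub.EpubNcx())
    book.add_item(epub.EpubNav())
    book.toc = toc
    book.spine = ['nav', *items]
    return book
```

The result can then be written with epub.write_epub('output.epub', book), as in the single-chapter example.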

Convert to Markdown

md = epub.read_epub('book.epub').to_markdown()
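To save the result next to the source file, something like the following could work. It is a sketch that assumes to_markdown() returns a str; markdown_path and convert_file are hypothetical helper names.

```python
from pathlib import Path


def markdown_path(epub_path):
    """Output path for the converted file: same location, '.md' suffix."""
    return Path(epub_path).with_suffix(".md")


def convert_file(epub_path):
    """Convert one EPUB to Markdown and write it beside the original."""
    # Imported lazily so markdown_path works without the package installed.
    from fast_ebook import epub

    out = markdown_path(epub_path)
    # Assumption: to_markdown() returns text, so it is written as UTF-8.
    out.write_text(epub.read_epub(str(epub_path)).to_markdown(), encoding="utf-8")
    return out
```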

Parallel batch read

from fast_ebook import epub

books = epub.read_epubs(['a.epub', 'b.epub', 'c.epub'], workers=4)
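For a whole directory, the path list can be collected first and handed to read_epubs in one call. This sketch assumes read_epubs returns the books in the same order as the input paths, which the source does not state; find_epubs and read_library are hypothetical helpers.

```python
from pathlib import Path


def find_epubs(root):
    """Return all .epub files under root (recursively) as sorted path strings."""
    return sorted(str(p) for p in Path(root).rglob("*.epub"))


def read_library(root, workers=4):
    """Read every EPUB under root in parallel and map each path to its book."""
    # Imported lazily so find_epubs works without the package installed.
    from fast_ebook import epub

    paths = find_epubs(root)
    books = epub.read_epubs(paths, workers=workers)
    # Assumption: results come back in input order, so zip pairs them correctly.
    return dict(zip(paths, books))
```

Paths are converted to strings to match the call shown above.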

Documentation

License

MIT
