Changes (Release 5.0.0b1)
Summary (pypdfium2)
API changes
- Rendering / Bitmap
  - Removed PdfDocument.render() (see deprecation rationale in v4.25 changelog). Instead, use PdfPage.render() with a loop or process pool.
  - Removed PdfBitmap.get_info() and PdfBitmapInfo, which existed mainly on behalf of data transfer with PdfDocument.render(). Instead, take the info from the PdfBitmap object directly. (If using an adapter that copies, you may want to store the relevant info in variables to avoid holding a reference to the original buffer.)
  - PdfBitmap.fill_rect(): Changed argument order. The color parameter now goes first.
  - PdfBitmap.to_numpy(): If the bitmap is single-channel (grayscale), use a 2d shape to avoid needlessly wrapping each pixel value in a list.
  - PdfBitmap.from_pil(): Removed recopy parameter.
- Pageobjects
  - Renamed PdfObject.get_pos() to .get_bounds().
  - Renamed PdfImage.get_size() to .get_px_size().
  - PdfImage.extract(): Removed fb_render option because it does not fit in this API. If the image's rendered bitmap is desired, use .get_bitmap(render=True) in the first place.
- PdfDocument.get_toc(): Replaced PdfOutlineItem namedtuple with method-oriented wrapper classes PdfBookmark and PdfDest, so callers may retrieve only the properties they actually need. This is closer to pdfium's original API and exposes the underlying raw objects. Provides the signed count as-is rather than splitting it into n_kids and is_closed. Also distinguishes between dest is None and a dest with unknown mode.
- Renamed misleading PdfMatrix.mirror() parameters v, h to invert_x, invert_y, as the terms horizontal/vertical flip commonly refer to the transformation applied, not the axis around which the flip is performed (i.e. the previous v meant flipping around the Y axis, which is vertical, but the resulting transform inverts the X coordinates and is thus actually horizontal). No behavior change if you did not use keyword arguments.
- get_text_range(): Removed implicit translation of default calls to get_text_bounded(), as pdfium reverted FPDFText_GetText() to UCS-2, which resolves the allocation concern. However, callers are encouraged to explicitly use get_text_bounded() for full Unicode support.
- Removed legacy version flags.
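To make the renamed PdfMatrix.mirror() parameters concrete, here is a minimal pure-Python sketch of the coordinate effect described above. The function mirror_point is a hypothetical stand-in for illustration, not the pypdfium2 implementation; only the parameter names invert_x and invert_y come from the changelog.

```python
# Hypothetical helper for illustration only; not part of pypdfium2.
def mirror_point(point, invert_x=False, invert_y=False):
    """Mirror a 2D point by negating the selected coordinates."""
    x, y = point
    return (-x if invert_x else x, -y if invert_y else y)


point = (3, 2)

# Flipping around the (vertical) Y axis negates the X coordinate,
# which is what is commonly called a horizontal flip.
flipped_x = mirror_point(point, invert_x=True)

# Flipping around the (horizontal) X axis negates the Y coordinate.
flipped_y = mirror_point(point, invert_y=True)

print(flipped_x, flipped_y)
```

Naming the parameters after the coordinate that gets inverted sidesteps the ambiguity between "the axis flipped around" and "the direction of the flip".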
Improvements and new features
- Added PdfPosConv and PdfBitmap.get_posconv(page) helper for bidirectional translation between page and bitmap coordinates.
- Added PdfObject.get_quad_points() to get the corner points of an image or text object.
- Exposed PdfPage.flatten() (previously semi-private _flatten()), after having found out how to correctly use it. Added check and updated docs accordingly.
- With PdfImage.get_bitmap(render=True), added scale_to_original option (defaults to True) to temporarily scale the image to its pixel size. Thanks to Lei Zhang for the suggestion.
- Added context manager support to PdfDocument, so it can be used in a with-statement, because opening from a file path binds a file descriptor (usually on the C side), which should be released explicitly, given OS limits.
- If document loading failed, err_code is now assigned to the PdfiumError instance so callers may programmatically handle the error subtype.
- In PdfPage.render(), added a new option use_bgra_on_transparency. If there is page content with transparency, using BGR(x) may slow down PDFium. Therefore, it is recommended to set this option to True if dynamic (page-dependent) pixel format selection is acceptable. Alternatively, you might want to use only BGRA via force_bitmap_format=pypdfium2.raw.FPDFBitmap_BGRA (at the cost of occupying more memory compared to BGR).
- In PdfBitmap.new_*() methods, avoid use of .from_raw(), and instead call the constructor directly, as most parameters are already known on the caller side when creating a bitmap.
- In the rendering CLI, added --invert-lightness --exclude-images post-processing options to render with selective lightness inversion. This may be useful to achieve a "dark theme" for light PDFs while preserving different colors, but comes at the cost of performance. (PDFium also provides a color scheme option, but this only allows you to set colors for certain object types, which are then forced on all instances of the type in question. This may flatten different colors into one, leading to a loss of visual information.)
- Corrected some null pointer checks: we have to use bool(ptr) rather than ptr is None.
. - Improved startup performance by deferring imports of optional dependencies to the point where they are actually needed, to avoid overhead if you do not use them.
- Simplified version classes (no API change expected).
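The null pointer item above can be reproduced with the standard library alone. The following sketch uses plain ctypes rather than pypdfium2; the variable names are illustrative. It shows why an identity check against None does not detect a NULL ctypes pointer instance, while a truthiness check does.

```python
import ctypes

# A pointer type, and a NULL instance of it (created by calling it without arguments).
IntPtr = ctypes.POINTER(ctypes.c_int)
null_ptr = IntPtr()

# A valid pointer to an actual C int, for comparison.
value = ctypes.c_int(42)
valid_ptr = ctypes.pointer(value)

# The NULL pointer is still a Python object, so it is not None ...
null_is_none = null_ptr is None

# ... but ctypes defines NULL pointers to be falsy and valid ones to be truthy.
null_truthy = bool(null_ptr)
valid_truthy = bool(valid_ptr)

print(null_is_none, null_truthy, valid_truthy)
```

In other words, a check written as `ptr is None` would silently treat a NULL pointer object as valid, which is presumably the kind of mistake the corrected checks address.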
Platforms
- Experimental Android support added (cf. PEP 738). arm64_v8a, armeabi_v7a, x86_64, x86 are now handled in setup and should implicitly download the right binaries. We do not publish any Android wheels at this time (for one thing, PyPI/warehouse does not support them yet). However, we might want to package arm64_v8a (and maybe armeabi_v7a) wheels in the future. Note that Android support is provided on a best-effort basis and is largely untested (only arm64 Termux prior to PEP 738 has been tested on the author's phone). Please report success or failure.
- Experimental iOS support added as well (cf. PEP 730). arm64 device and simulator, and x86_64 simulator are now handled and should implicitly download the right binaries. However, this is untested and may not be enough to get all the way through. In particular, the PEP hints that the binary needs to be moved to a Frameworks location, in which case you'd also need to change the library search path. No iOS wheels will be provided at this time. However, if there are testers and an actual demand, iOS arm64 wheels may be enabled in the future.
- Note that we have no intent to package wheels for the simulators (android x86_64/x86, ios arm64_simu/x86_64), as they are only relevant to developers, and installing from source with implicit binary download should be roughly equivalent.
Setup
- Avoid needlessly calling _get_libc_ver(). Instead, call it only on Linux. A negative side effect of calling this unconditionally is that, on non-Linux platforms, an empty string may be returned, in which case the musllinux handler would be reached, which uses non-public API and isn't meant to be called on other platforms (though it seems to have passed).
- If packaging with PDFIUM_PLATFORM=sourcebuild, forward the platform tag determined by bdist_wheel's wrapper, rather than using the underlying sysconfig.get_platform() directly. This may provide more accurate results, e.g. on macOS.
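The platform guard described in the first Setup item can be sketched as follows. This is an assumption-laden illustration, not the project's setup code: get_libc_info is a hypothetical name, and the use of platform.libc_ver() as the underlying probe is assumed here rather than taken from the changelog.

```python
import platform
import sys


# Hypothetical helper; the real setup code uses its own _get_libc_ver().
def get_libc_info():
    """Return (libc name, version) on Linux, or None on other platforms."""
    if not sys.platform.startswith("linux"):
        # Skip the probe entirely, so an empty result can never be
        # mistaken for "not glibc, therefore musl" on non-Linux systems.
        return None
    return platform.libc_ver()


info = get_libc_info()
print(info)
```

Returning None outside Linux keeps the "no libc information" case distinct from an empty string, so a musl-specific code path cannot be reached by accident.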
Project
- Made the runfile fail fast and propagate errors via bash -eu. This is actually quite important to avoid potentially continuing in a broken state in CI.
- CI: Added Linux aarch64 (GH now provides free runners) and Python 3.13 to the test matrix.
- Merged tests_old/ back into tests/.
- Migrated from deprecated .reuse/dep5 / .reuse/dep5-wheel to more visible REUSE.toml / REUSE-wheel.toml.
- Docs: Improved logic when to include the unreleased version warning and upcoming changelog.
- Bumped minimum pdfium requirement in conda recipe to >6635 (effectively >=6638), due to new errchecks that are not version-guarded.
- Cleanly split out conda packaging into its own file, and confined it to the conda/ directory, to avoid polluting the main setup code.
pypdfium2 commit log
Commits between 4.30.1 and 5.0.0b1 (latest commit first):
- 8ea3c9f [autorelease main] update 5.0.0b1
- c439329 fix typos
- 6d04fac Take out android for now
- 28ef5b3 docs/conf.py: take out problematic assertion
- 936314f CI/debugging: add verbose: true
- 0882a90 conda_raw: fix upload path
- efcf088 conda_raw: temporarily comment out schedule
- e2a760e fix spelling
- 029327a v5 devel branch (#307)
- 9e55a45 Comment TestPyPI back in
PDFium commit log
Commits between 6899 and 6996 (latest commit first):
- 012fe571c Fix unnecessary tree traversal in SearchNameNodeByNameInternal()
- 3c2bfd785 Refactor SearchNameNodeByNameInternal()
- a9f2f0f33 Use CIDToGIDMap to fill font widths in FPDFText_LoadCidType2Font()
- 0d2d104ba Roll goldctl from 78856799f02f to 9389855cfb14
- d69e9855e Add even better compiler-support section to README.md
- 6c386f729 Always initialize CFX_SkiaDeviceDriver::m_bRgbByteOrder
- a78c76720 Add supported compilers section to README.md
- ef5fcdf6e Remove some MSVC-specific code
- fa6581277 Allow options and input files in any order in pdfium_test
- 170de1e03 Fix stack-use-after-scope in pdfium_test
- 2febc2869 Fix FPDFText_GetLooseCharBox() to handle rotation
- 4c7464b07 Add tests to show FPDFText_GetLooseCharBox() bug with rotated text
- 89a94c1b9 Fix test helper to get correct indices from rotated_text.pdf
- 603caea4e Add a helper to FPDFTextEmbedderTest for use with rotated_text.pdf
- da069983b Roll libpng from cf7c36ed084c to 28213bcabe21 (1 revision)
- 859f92a77 Check the font width array generated by FPDFText_LoadCidType2Font()
- efe66807a Add GetWidthsArrayForCidFont() helper to fpdf_edit_embeddertest.cpp
- f6da7d235 Add comment for subtle code in CPDF_StreamContentParser
- fab1b6d64 Add debugging data to help diagnose a hang in fread()
- 6be4f3be7 Rename pdfium_unsafe_buffers_paths.txt file
- e99f1e8d5 Avoid out of bounds crash when reading fonts
- 594caeb0e Avoid fixed-offset NULL-deref in XFA_Node::InsertChildAndNotify().
- aacaea19d [AGG] Only add positive dash lengths and gap lengths
- b4cf887f7 Add pixel test for negative dash scales
- 3cd0a262c Use AutoRestorer in CPDF_StreamParser::ReadInlineStream()
- 4bc397f60 Rename local variables in CPDF_StreamParser::ReadInlineStream()
- 7420dfeed Fix pdfium_test in Chromium builds when Skia is enabled by default
- d8b668c01 Making CPDF_SyntaxParser::FindTag(ByteStringView tag) robust
- 320fc870f Roll Zlib from 82a5fecf8aae to b763971bcaa3 (1 revision)
- e116b67b1 Fix bad refactoring in CXFA_TextParser::GetFont()
- 67a00b167 Add test showing copies do not happen in fxcrt::Zip().
- 20b8b48e4 Avoid UNSAFE_TODO() in AreColorIndicesOutOfBounds().
- 28cfa3a8a Remove distinction between input/output views in fxcrt::Zip().
- ea4eab892 Update documentation and tests for fxcrt::Zip()
- da206beb2 Make PDFium's compiler_specific.h use clang's UNSAFE_BUFFERS_BUILD
- 4adcb08d8 Roll build, clang, and rust
- 4886ee0d3 Update gn_version to c97a86a72105f3328a540f5a5ab17d11989ab7dd
- ed30f70b4 Roll buildtools and libc++
- 16df41e4b Roll v8/ 3e984a9e0..75be3dcb5 (277 commits)
- 0320375fa Add third_party/highway dependency
- ab72191db Roll third_party/freetype/src/ 0ae7e6073..afc7000ca (9 commits)
- 77e7dee60 Roll v8/ 313e6ed36..3e984a9e0 (189 commits)
- 8198c4e98 Roll third_party/libc++abi/src/ 6c4fa00e4..83dfa1f5b (12 commits)
- ac5bfacd6 Update reclient_version to re_client_version:0.172.0.3cf60ba5-gomaip
- d1e80cff3 Roll testing/scripts/rust/ 347b3c20a..6712dc59f (1 commit)
- 0af514970 Roll third_party/llvm-libc/src/ 4c70d6b5a..60b7db20a (87 commits)
- 328507313 Roll third_party/libunwind/src/ 5b01ea4a6..d1e95b102 (4 commits)
- 4ad60e37a Roll third_party/abseil-cpp/ 0b76dfe4f..72093794a (6 commits)
- 84acf3a55 Roll third_party/googletest/src/ d14403194..7d76a231b (6 commits)
- 7f588b3b4 Roll base/allocator/partition_allocator/ c551156ef..9cab8b0d1 (13 commits)
- b5d8c977c Roll third_party/icu/ 4239b1559..bbccc2f6e (5 commits)
- 1ef5cd32e Roll third_party/skia/ 3db026d62..975788ea9 (248 commits)
- 8d0676e4f Roll third_party/clang-format/script/ 37f6e68a1..1549a8dba (3 commits)
- c1992c827 Roll Catapult from 6a0960fe97ab to 86d6f8ee6130 (59 revisions)
- 994b9858b Roll Code Coverage from 719f1eba4379 to 5e7c277c0d8c (2 revisions)
- cbf0bb586 Roll Depot Tools from 8d20c1e0b56c to 58625e82c685 (39 revisions)
- 55a8262e3 Roll goldctl from b2da51fa8d3a to db814b551104
- a4cbdc9ed Update OpenJPEG to 2.5.3
- d48287fd9 add missing includes for the build with use_libcxx_modules
- b69783fd1 Begin marking unsafe libc functions as UNSAFE_BUFFERS().
- bea10144d Roll Instrumented Libraries from 69291a3c7c79 to 3cc43119a291 (2 revisions)
Edit: Removed the *.publish.attestation files that were inadvertently included in the GH release.