4.25.0

@github-actions github-actions released this 10 Dec 04:23
· 128 commits to main since this release

Changes (Release 4.25.0)

Summary (pypdfium2)

  • Removed multiprocessing from the deprecated PdfDocument.render() API, replacing it with linear rendering. See below for more info.
  • setup: Fixed blunder in headers cache logic that caused existing headers to always be reused regardless of version. Note that this did not affect release workflows, only local source re-installs.
  • Show path of linked binary in pypdfium2 -v.
  • conda: Improved installation docs and channel config.
  • conda/workflows: Added ability to (re-)build pypdfium2_raw bindings with any given version of pdfium. Fixes issue #279.
  • Made reference bindings more universal by including V8, XFA and Skia symbols. This is possible due to the dynamic symbol guards.
  • Instruct ctypesgen to exclude some unused alias symbols pulled in from struct tags.
  • Improved issue templates, added pull request template.
  • Improved ctypesgen (pypdfium2-team fork).

Rationale for PdfDocument.render() deprecation

  • The parallel rendering API unfortunately was an inherent design mistake: Multiprocessing is not meant to transfer large amounts of pixel data from workers to the main process.
  • This was such a heavy drawback that it basically outweighed the parallelization, so there was no real performance advantage, only higher memory load.
  • As a related problem, the worker pool produces bitmaps at an independent speed, regardless of where the receiving iteration might be, so bitmaps could queue up in memory, possibly causing an enormous rise in memory consumption over time. This effect was pronounced e.g. with PNG saving via PIL, as exhibited in Facebook's nougat project.
  • Instead, each bitmap should be processed (e.g. saved) in the job which created it. Only a minimal, final result should be sent back to the main process (e.g. a file path).
  • This means we cannot reasonably provide a generic parallel renderer; instead, it needs to be implemented by callers.
  • Historically, note that there had been even more faults in the implementation:
    • Prior to 4.22.0, the pool was always initialized with os.cpu_count() processes by default, even when rendering fewer pages.
    • Prior to 4.20.0, a full-scale input transfer was conducted on each job (rendering it unusable with bytes input). However, this can and should be done only once on process creation.
  • pypdfium2's rendering CLI cleanly re-implements parallel rendering to files. We may want to turn this into an API in the future.
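
The points above suggest a caller-side pattern along the following lines. This is only a minimal sketch under stated assumptions, not the implementation used by pypdfium2's CLI: the names page_filename, _init_worker, _render_job and render_to_files are hypothetical, the PNG naming scheme is made up for illustration, and saving relies on Pillow being installed. The input is opened once per worker process, each bitmap is saved in the job that created it, and only a file path travels back to the main process.

```python
import multiprocessing as mp
import os
from pathlib import Path

_pdf = None  # per-process document handle, set once by _init_worker()


def page_filename(index, n_pages):
    """Zero-padded output name for a 0-based page index (hypothetical scheme)."""
    width = len(str(n_pages))
    return f"page_{index + 1:0{width}d}.png"


def _init_worker(pdf_path):
    # Runs once per worker process, so the input is opened once per process
    # rather than once per job.
    import pypdfium2 as pdfium  # imported in the process that actually renders
    global _pdf
    _pdf = pdfium.PdfDocument(pdf_path)


def _render_job(args):
    index, n_pages, out_dir, scale = args
    bitmap = _pdf[index].render(scale=scale)
    out_path = Path(out_dir) / page_filename(index, n_pages)
    bitmap.to_pil().save(out_path)  # the bitmap is consumed inside the worker
    return str(out_path)  # only this small result is sent to the main process


def render_to_files(pdf_path, out_dir, scale=2, n_procs=None):
    """Render all pages of pdf_path to PNG files and return their paths."""
    import pypdfium2 as pdfium
    pdf = pdfium.PdfDocument(pdf_path)
    n_pages = len(pdf)
    pdf.close()
    if n_pages == 0:
        return []
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    # Never start more processes than there are pages to render.
    n_procs = min(n_procs or os.cpu_count() or 1, n_pages)
    jobs = [(i, n_pages, str(out_dir), scale) for i in range(n_pages)]
    with mp.Pool(n_procs, initializer=_init_worker, initargs=(pdf_path,)) as pool:
        return pool.map(_render_job, jobs)
```

A caller would invoke something like render_to_files("input.pdf", "out") from under an `if __name__ == "__main__":` guard, which multiprocessing requires on platforms that spawn worker processes.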

Due to the potential for serious issues as outlined above, we strongly recommend that end users update and dependants bump their minimum requirement to this version. Callers should move away from PdfDocument.render() and use PdfPage.render() instead.
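
For callers that do not need parallelism, the move to PdfPage.render() can be as small as a loop over the pages. The helper below is a hedged sketch, with render_each being a hypothetical name rather than part of pypdfium2; it only assumes a document object that supports len() and indexing and pages that offer render(), which is how PdfDocument and PdfPage behave.

```python
def render_each(pdf, **render_kwargs):
    """Yield one rendered bitmap per page, strictly one after another.

    Being a generator, it renders the next page only when the caller asks
    for it, so bitmaps cannot pile up in memory behind a slow consumer.
    """
    for index in range(len(pdf)):
        page = pdf[index]
        yield page.render(**render_kwargs)


# Intended use with pypdfium2 (not executed here):
#   import pypdfium2 as pdfium
#   pdf = pdfium.PdfDocument("input.pdf")
#   for i, bitmap in enumerate(render_each(pdf, scale=2)):
#       bitmap.to_pil().save(f"page_{i + 1}.png")  # requires Pillow
```

Each bitmap is processed before the next page is rendered, which keeps memory use bounded by roughly one page at a time.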

pypdfium2 commit log

Commits between 4.24.0 and 4.25.0 (latest commit first):

  • 45dbb2c [autorelease main] update 4.25.0
  • 6f5df31 workflows/main: minor comments cleanup
  • 4c0aee6 conda: get rid of effectively dead bundling code
  • 816f421 craft_packages: cut overly extensive comments
  • d140de2 Handle latest version separately for conda pdfium/pypdfium2_raw
  • 63a17ba conda: sync pypdfium2_raw schedule with pdfium-binaries
  • cdb1fe4 instruct ctypesgen to exclude some garbage symbols
  • adf89b0 docs correction
  • afc4740 reinstall schedule
  • 6024f15 Make document-level renderer harmless by linearization (#282)
  • 7c53bd1 temporarily inhibit schedule
  • 0fe8578 ctypesgen: add --no-macro-guards
  • 8f0a425 readme: update ABI bindings section
  • 6a0a67b get_text_range: slightly enhance docs
  • 5886522 improve changelog/template, add tasks
  • acd1719 conda/recipes: do not wrap env vars in quotes after all
  • f3cc2dd continue on refbindings
  • 0822682 continue on PR template (setup)
  • e26b24d Prepare changelog
  • f93532d setup: avoid explicit mention of clang
  • ad56b56 docs: continue on conda
  • 54d2ffc craft/conda: update a code comment
  • 59dc178 issue-templates: rename "package" to "install" in titles
  • 9a1dcd3 Inline build version handling in craft_packages
  • 3b418eb conda_raw: handle rebuilds (#280)
  • ecb77d2 Add skia to refbindings flags (+ thoughts on external headers)
  • 01727fc setup-miniconda: correct channel prio
  • a5f2c5b PR template nits
  • 100d7e4 Tighten issue template descriptions
  • ddc8abc refbindings: define feature flags
  • 136a655 setup: fix blunder in headers cache logic
  • 67de448 Continue on PR template
  • a1037a3 Add draft pull request template
  • 9f54a76 issue template generic: clarify checkbox 2 (CC #277)
  • c5c558a Add note on version info
  • 8b454c0 nits
  • 24839ea conda: slightly improve script and recipes
  • 9e73c75 wf/conda: try to make sure we install the built package
  • 8a438f8 readme: conda section again
  • 2fbac25 workflows/conda: remove a redundant layer of channels
  • c6dd1d1 readme: rework conda section again
  • 28649ad issues/conda: check python executable
  • 07d336e readme: revise conda instructions
  • 45e8f0c cli/version: show only libpath rather than whole loader info
  • 81f24de dep5-wheel: add helpers version file
  • 09d194f manifest: fix missing reuse/dep5 include blunder
  • 1cf9442 req: sunset defaults.txt
  • d95efed req: test implies converters
  • 9e29101 workflows/main: reinstall monthly schedule
  • a2c79ad slightly update readme
  • 15a35ee nit: move variable
  • e289d61 readme: explain state of system install option
  • a67456d version: dump library loader info
  • bdbf30d musl: update comment
  • d4aa65a update changelog
  • 2dfc4f1 slightly improve prev commit
  • d4c1ef3 Require explicit version with prepared target
  • 30c60af Revert "temporarily comment out testpypi upload"

PDFium commit log

Commits between 6110 and 6164 (latest commit first):

  • 7388bd02f Make WideStringToBuffer() call Utf16EncodeMaybeCopyAndReturnLength()
  • b0ab5e964 Remove unnecessary argument from FuseSurrogates().
  • dbf4b0a4c Tidy GN files by introducing group pdfium_pa.
  • d324c7218 Remove unused FlateEncode() / FlateDecode()
  • 4419e5152 Add comments about ByteString::GetBuffer() and ReleaseBuffer().
  • 069ebe4c9 Introduce result struct for IPWL_FillerNotify::OnBeyforeKeyStroke().
  • c65f45e6b Make CXFA_EventParam constructor take a type argument
  • 5b91a0148 Remove FX_UTF8Decode() in favor of WideString::FromUTF8()
  • d06523d84 Make WideString's FromUTF16BE / FromUTF16LE do surrogate fusing
  • e2704cba8 Use dedicated struct instead of std::pair in ExecuteBoolScript()
  • 28c6c6dc7 Remove out-parameter from CFXJSE_Context::ExecuteScript()
  • ce9900139 Simplify tests that use FORM_GetSelectedText()
  • 03c23083a Use spans in more places
  • 8b2380dce Add pdfium::as_byte_span() helper
  • bee6d0b15 Remove out parameter from DynPropGetterAdapter().
  • 40d92c45b Make WideString's FromUTF16LE(), FromUTF16BE() take bytes, not wchar_t
  • 8c2fc5da8 Roll base/allocator/partition_allocator/ 4d90e004b..6800d0930 (7 commits) https://chromium.googlesource.com/chromium/src/base/allocator/partition_allocator.git/+log/4d90e004b935..6800d0930f06
  • 6bef48cda Unify object type detection code inside cpdf_document.cpp.
  • 4d0aaaa07 Consistently name variables in CPDF_Document::InsertDeletePDFPage()
  • d1debc773 In PDF_DecodeText() UTF-16, do surrogate fusion before language code stripping
  • b98c5b4c0 Clean up CountPages()
  • 43d835b47 Fix object type detection in CPDF_Document::InsertDeletePDFPage()
  • e4424849c Update reclient_version to 0.120.1.f75cfb7-gomaip
  • 0db38df32 Add FPDFPPOEmbedderTest.ImportIntoDocWithWrongPageType test case
  • 8df2e8494 Roll Catapult from 47efdb4b1428 to f0228fa92b0a (63 revisions)
  • aa7d390b4 Add support for UTF-8 text strings
  • 9b8ac25af Simplify PDF_DecodeText() and new helper functions
  • 5adcad9d3 Roll third_party/skia/ 8e9e16841..3a79d7a61 (69 commits)
  • e2b69c4bb Sanity check the inputs to Blend()
  • 6e7f70b39 Replace implicit dependency on global SkFontMgr
  • 1e9d89db3 Extract language code stripping from PDF_DecodeText() into function
  • 445b54a73 Extract surrogate fusing from PDF_DecodeText() into own separate function
  • bb8fd49d1 Simplify Blend()
  • 61307c2ad Unit test Blend()
  • 2962e6ca2 Move Blend() function to its own file.
  • a5bb284fd Roll Zlib from dfc48fc4de8e to 5daffc716bb6 (6 revisions)
  • aae740cc1 Re-organize cppgc::Member<> members
  • 16b2fa3fc Clean up CPDF_PageOrganizer::Init()
  • acddfedb7 CHECK() the bitmap argument in CPDF_RenderStatus::CompositeDIBitmap()
  • ad80a04d3 Split out skcms_sources into multiple GN targets.
  • c512857c0 Roll third_party/skia/ 77aeee3b8..8e9e16841 (187 commits)
  • cba9a3c1d Properly support the use_system_libtiff GN build option
  • fa80feef8 Avoid setting the private tif_fd field in struct tiff
  • 25df6e84c Mention "document outline" in public/fpdf_doc.h
  • 8b1177f2d Fix undefined behavior in FXSYS_wcsnicmp()
  • 5636e90a8 Allow ProcessCrossRefV5Entry() to overwrite existing entries
  • 154e17543 Prefer ClearAndDelete() to delete ExtractAsDangling().
  • d82c698a1 Upgrade vpython3 and wheels
  • 5d87ac6ec Remove "six" python wheel
  • ea0263079 Move fonts used for pixel tests to their own directories
  • 56a444f10 Replace Copy() with operator=() in CPDF_{All,Graphic}States
  • 5746eb685 Rename CPDF_GraphicStates::DefaultStates() to SetDefaultStates().
  • 747015873 Stop inheriting from CPDF_GraphicStates
  • 08f11e596 Roll libpng from 7e1f7e7b1063 to 1db23788f5aa (1 revision)
  • 676f13456 Roll Depot Tools from 73b69b016703 to ea9bf7f343d3 (50 revisions)
  • 1bfae352b Roll Code Coverage from f06a56e5b449 to 61632b07bdc6 (2 revisions)
  • f3b5f3db3 Encapsulate CPDF_GraphicStates member variables
  • 3c2845720 Encapsulate CPDF_AllStates member variables
  • a2dc6ecce Ensure exactly 1 WCHAR_T_IS_*_BIT define is defined
  • 18a2f3c02 Add some unit tests for FXSYS_wcsnicmp()
  • a3ce9f4ae Roll base/allocator/partition_allocator/ 0d03e4082..4d90e004b (14 commits)
  • 2af0a20e3 Fix how FPDFText_LoadFont() detects font glyph count
  • cc923cac5 Add an embedder test to demonstrate a FPDFText_LoadFont() failure
  • 57d1f79d4 Avoid the NULL, 0 undefined behavior problem in CFX_GlyphCache
  • 4da226a07 Change GetFileContents() test utility to return a vector
  • 1bcae281c Roll build/ ab8815d43..292639dc3 (1 commit)
  • 4a4d922ba Roll build/ f4167331a..ab8815d43 (29 commits)
  • 7233e99fc Use span in Processor::ProcessPdf()
  • f0e6edf81 Use pdfium::base::checked_cast() in a few test files
  • 1225d9d80 Remove out-parameter from PathService::GetTestFilePath()
  • 16bebb03b Extend PDFEditImgTest.NewImageObjLoadJpeg to write out a PDF
  • c930b5516 Add missing stdint.h include in span.h
  • 42154f4c1 Rename some "remain" variables to "remaining"
  • 6a34da391 Fix misalignment between Redo and Undo after consecutive text pasting.
  • 5f814e878 Fulfill a TODO in cpdf_dib.cpp
  • 9be49d4b0 Add rust_build_tests to pdfium_all when enable_rust=true
  • ae960583a Roll Instrumented Libraries from 032e9c850ab9 to 48a6beefc1bb (2 revisions)
  • 7a7c86c68 Roll Memory Tools from 8b06a5370188 to bb03b820532d (1 revision)
  • d336e56dc Roll third_party/skia/ e8c78601e..77aeee3b8 (221 commits)
  • 650399315 Update reclient_version to 0.118.1.ae3c3be-gomaip
  • 75815063c Roll base/allocator/partition_allocator/ 6f90cb04a..0d03e4082 (16 commits)
  • f76cff73a Roll third_party/libunwind/src/ 7608093d2..69b8c6469 (6 commits)
  • 70062cf7a Roll v8/ 1fb69d9f5..06aba4270 (284 commits)
  • e21a7e389 Roll fuchsia_gn_sdk_revision and fuchsia sdk