4.25.0
Changes (Release 4.25.0)
Summary (pypdfium2)
- Removed multiprocessing from deprecated PdfDocument.render() API and replaced with linear rendering. See below for more info.
- setup: Fixed blunder in headers cache logic that would cause existing headers to be always reused regardless of version. Note, this did not affect release workflows, only local source re-installs.
- Show path of linked binary in pypdfium2 -v.
- conda: Improved installation docs and channel config.
- conda/workflows: Added ability to (re-)build pypdfium2_raw bindings with any given version of pdfium. Fixes #279.
- Made reference bindings more universal by including V8, XFA and Skia symbols. This is possible due to the dynamic symbol guards.
- Instruct ctypesgen to exclude some unused alias symbols pulled in from struct tags.
- Improved issue templates, added pull request template.
- Improved ctypesgen (pypdfium2-team fork).
Rationale for PdfDocument.render() deprecation
- The parallel rendering API unfortunately was an inherent design mistake: Multiprocessing is not meant to transfer large amounts of pixel data from workers to the main process.
- This was such a heavy drawback that it basically outweighed the parallelization, so there was no real performance advantage, only higher memory load.
- As a related problem, the worker pool produces bitmaps at an independent speed, regardless of where the receiving iteration might be, so bitmaps could queue up in memory, possibly causing an enormous rise in memory consumption over time. This effect was pronounced e.g. with PNG saving via PIL, as exhibited in Facebook's nougat project.
- Instead, each bitmap should be processed (e.g. saved) in the job which created it. Only a minimal, final result should be sent back to the main process (e.g. a file path).
- This means we cannot reasonably provide a generic parallel renderer; instead, it needs to be implemented by callers.
- Historically, note that there had been even more faults in the implementation:
  - Prior to 4.22.0, the pool was always initialized with os.cpu_count() processes by default, even when rendering fewer pages.
  - Prior to 4.20.0, a full-scale input transfer was conducted on each job (rendering it unusable with bytes input). However, this can and should be done only once on process creation.
- pypdfium2's rendering CLI cleanly re-implements parallel rendering to files. We may want to turn this into an API in the future.
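The points above (process each bitmap in the job that created it, open the input once per process, and do not start more workers than there are pages) could be implemented by a caller roughly as follows. This is a minimal sketch under stated assumptions, not the implementation used by pypdfium2's CLI: pool_size, _init_worker, _render_job and render_to_files are hypothetical names, saving PNG files via PIL is just one possible per-page task, and Pillow is assumed to be installed.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

_pdf = None  # per-process document handle, set once by _init_worker()


def pool_size(n_pages, max_procs=None):
    """Never start more workers than there are pages to render."""
    if max_procs is None:
        max_procs = os.cpu_count() or 1
    return max(1, min(max_procs, n_pages))


def _init_worker(pdf_path):
    # Runs once per worker process, so the input is opened here
    # instead of being transferred again with every job.
    # pypdfium2 is imported in the worker, which loads the library itself.
    import pypdfium2 as pdfium

    global _pdf
    _pdf = pdfium.PdfDocument(pdf_path)


def _render_job(args):
    index, out_dir, scale = args
    page = _pdf[index]
    bitmap = page.render(scale=scale)
    out_path = Path(out_dir) / f"page_{index + 1:04d}.png"
    # The bitmap is consumed inside the worker that created it ...
    bitmap.to_pil().save(out_path)
    # ... and only a short string travels back to the main process.
    return str(out_path)


def render_to_files(pdf_path, out_dir, page_indices, scale=1.0, max_procs=None):
    """Render the given pages to PNG files in parallel, return their paths."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    jobs = [(i, str(out_dir), scale) for i in page_indices]
    with ProcessPoolExecutor(
        max_workers=pool_size(len(jobs), max_procs),
        initializer=_init_worker,
        initargs=(str(pdf_path),),
    ) as pool:
        return list(pool.map(_render_job, jobs))
```

A caller might invoke this as render_to_files("doc.pdf", "out", range(n_pages)), where n_pages is the page count of the document. On platforms that spawn rather than fork worker processes, the call should sit under an `if __name__ == "__main__":` guard.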
Due to the potential for serious issues as outlined above, we strongly recommend that end users update and dependants bump their minimum requirement to this version. Callers should move away from PdfDocument.render() and use PdfPage.render() instead.
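For callers that do not need parallelism, moving to PdfPage.render() can be as simple as a page-by-page loop. The following is a minimal sketch, assuming an already opened PdfDocument is passed in and that Pillow is available for to_pil(); render_pages_linear and the output file naming are hypothetical choices made for illustration.

```python
from pathlib import Path


def render_pages_linear(pdf, out_dir, scale=1.0):
    """Render every page of an opened document, one after another."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(len(pdf)):
        page = pdf[index]                  # PdfPage
        bitmap = page.render(scale=scale)  # PdfBitmap
        out_path = out_dir / f"page_{index + 1:04d}.png"
        # Each bitmap is saved right away, so at most one is held at a time.
        bitmap.to_pil().save(out_path)
        paths.append(out_path)
    return paths
```

With pypdfium2 imported as pdfium, this could be called as render_pages_linear(pdfium.PdfDocument("doc.pdf"), "out").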
pypdfium2 commit log
Commits between 4.24.0 and 4.25.0 (latest commit first):
- 45dbb2c [autorelease main] update 4.25.0
- 6f5df31 workflows/main: minor comments cleanup
- 4c0aee6 conda: get rid of effectively dead bundling code
- 816f421 craft_packages: cut overly extensive comments
- d140de2 Handle latest version separately for conda pdfium/pypdfium2_raw
- 63a17ba conda: sync pypdfium2_raw schedule with pdfium-binaries
- cdb1fe4 instruct ctypesgen to exclude some garbage symbols
- adf89b0 docs correction
- afc4740 reinstall schedule
- 6024f15 Make document-level renderer harmless by linearization (#282)
- 7c53bd1 temporarily inhibit schedule
- 0fe8578 ctypesgen: add --no-macro-guards
- 8f0a425 readme: update ABI bindings section
- 6a0a67b get_text_range: slightly enhance docs
- 5886522 improve changelog/template, add tasks
- acd1719 conda/recipes: do not wrap env vars in quotes after all
- f3cc2dd continue on refbindings
- 0822682 continue on PR template (setup)
- e26b24d Prepare changelog
- f93532d setup: avoid explicit mention of clang
- ad56b56 docs: continue on conda
- 54d2ffc craft/conda: update a code comment
- 59dc178 issue-templates: rename "package" to "install" in titles
- 9a1dcd3 Inline build version handling in craft_packages
- 3b418eb conda_raw: handle rebuilds (#280)
- ecb77d2 Add skia to refbindings flags (+ thoughts on external headers)
- 01727fc setup-miniconda: correct channel prio
- a5f2c5b PR template nits
- 100d7e4 Tighten issue template descriptions
- ddc8abc refbindings: define feature flags
- 136a655 setup: fix blunder in headers cache logic
- 67de448 Continue on PR template
- a1037a3 Add draft pull request template
- 9f54a76 issue template generic: clarify checkbox 2 (CC #277)
- c5c558a Add note on version info
- 8b454c0 nits
- 24839ea conda: slightly improve script and recipes
- 9e73c75 wf/conda: try to make sure we install the built package
- 8a438f8 readme: conda section again
- 2fbac25 workflows/conda: remove a redundant layer of channels
- c6dd1d1 readme: rework conda section again
- 28649ad issues/conda: check python executable
- 07d336e readme: revise conda instructions
- 45e8f0c cli/version: show only libpath rather than whole loader info
- 81f24de dep5-wheel: add helpers version file
- 09d194f manifest: fix missing reuse/dep5 include blunder
- 1cf9442 req: sunset defaults.txt
- d95efed req: test implies converters
- 9e29101 workflows/main: reinstall monthly schedule
- a2c79ad slightly update readme
- 15a35ee nit: move variable
- e289d61 readme: explain state of system install option
- a67456d version: dump library loader info
- bdbf30d musl: update comment
- d4aa65a update changelog
- 2dfc4f1 slightly improve prev commit
- d4c1ef3 Require explicit version with prepared target
- 30c60af Revert "temporarily comment out testpypi upload"
PDFium commit log
Commits between 6110 and 6164 (latest commit first):
- 7388bd02f Make WideStringToBuffer() call Utf16EncodeMaybeCopyAndReturnLength()
- b0ab5e964 Remove unnecessary argument from FuseSurrogates().
- dbf4b0a4c Tidy GN files by introducing group pdfium_pa.
- d324c7218 Remove unused FlateEncode() / FlateDecode()
- 4419e5152 Add comments about ByteString::GetBuffer() and ReleaseBuffer().
- 069ebe4c9 Introduce result struct for IPWL_FillerNotify::OnBeyforeKeyStroke().
- c65f45e6b Make CXFA_EventParam constructor take a type argument
- 5b91a0148 Remove FX_UTF8Decode() in favor of WideString::FromUTF8()
- d06523d84 Make WideString's FromUTF16BE / FromUTF16LE do surrogate fusing
- e2704cba8 Use dedicated struct instead of std::pair in ExecuteBoolScript()
- 28c6c6dc7 Remove out-parameter from CFXJSE_Context::ExecuteScript()
- ce9900139 Simplify tests that use FORM_GetSelectedText()
- 03c23083a Use spans in more places
- 8b2380dce Add pdfium::as_byte_span() helper
- bee6d0b15 Remove out parameter from DynPropGetterAdapter().
- 40d92c45b Make WideString's FromUTF16LE(), FromUTF16BE() take bytes, not wchar_t
- 8c2fc5da8 Roll base/allocator/partition_allocator/ 4d90e004b..6800d0930 (7 commits) https://chromium.googlesource.com/chromium/src/base/allocator/partition_allocator.git/+log/4d90e004b935..6800d0930f06
- 6bef48cda Unify object type detection code inside cpdf_document.cpp.
- 4d0aaaa07 Consistently name variables in CPDF_Document::InsertDeletePDFPage()
- d1debc773 In PDF_DecodeText() UTF-16, do surrogate fusion before language code stripping
- b98c5b4c0 Clean up CountPages()
- 43d835b47 Fix object type detection in CPDF_Document::InsertDeletePDFPage()
- e4424849c Update reclient_version to 0.120.1.f75cfb7-gomaip
- 0db38df32 Add FPDFPPOEmbedderTest.ImportIntoDocWithWrongPageType test case
- 8df2e8494 Roll Catapult from 47efdb4b1428 to f0228fa92b0a (63 revisions)
- aa7d390b4 Add support for UTF-8 text strings
- 9b8ac25af Simplify PDF_DecodeText() and new helper functions
- 5adcad9d3 Roll third_party/skia/ 8e9e16841..3a79d7a61 (69 commits)
- e2b69c4bb Sanity check the inputs to Blend()
- 6e7f70b39 Replace implicit dependency on global SkFontMgr
- 1e9d89db3 Extract language code stripping from PDF_DecodeText() into function
- 445b54a73 Extract surrogate fusing from PDF_DecodeText() into own separate function
- bb8fd49d1 Simplify Blend()
- 61307c2ad Unit test Blend()
- 2962e6ca2 Move Blend() function to its own file.
- a5bb284fd Roll Zlib from dfc48fc4de8e to 5daffc716bb6 (6 revisions)
- aae740cc1 Re-organize cppgc::Member<> members
- 16b2fa3fc Clean up CPDF_PageOrganizer::Init()
- acddfedb7 CHECK() the bitmap argument in CPDF_RenderStatus::CompositeDIBitmap()
- ad80a04d3 Split out skcms_sources into multiple GN targets.
- c512857c0 Roll third_party/skia/ 77aeee3b8..8e9e16841 (187 commits)
- cba9a3c1d Properly support the use_system_libtiff GN build option
- fa80feef8 Avoid setting the private tif_fd field in struct tiff
- 25df6e84c Mention "document outline" in public/fpdf_doc.h
- 8b1177f2d Fix undefined behavior in FXSYS_wcsnicmp()
- 5636e90a8 Allow ProcessCrossRefV5Entry() to overwrite existing entries
- 154e17543 Prefer ClearAndDelete() to delete ExtractAsDangling().
- d82c698a1 Upgrade vpython3 and wheels
- 5d87ac6ec Remove "six" python wheel
- ea0263079 Move fonts used for pixel tests to their own directories
- 56a444f10 Replace Copy() with operator=() in CPDF_{All,Graphic}States
- 5746eb685 Rename CPDF_GraphicStates::DefaultStates() to SetDefaultStates().
- 747015873 Stop inheriting from CPDF_GraphicStates
- 08f11e596 Roll libpng from 7e1f7e7b1063 to 1db23788f5aa (1 revision)
- 676f13456 Roll Depot Tools from 73b69b016703 to ea9bf7f343d3 (50 revisions)
- 1bfae352b Roll Code Coverage from f06a56e5b449 to 61632b07bdc6 (2 revisions)
- f3b5f3db3 Encapsulate CPDF_GraphicStates member variables
- 3c2845720 Encapsulate CPDF_AllStates member variables
- a2dc6ecce Ensure exactly 1 WCHAR_T_IS_*_BIT define is defined
- 18a2f3c02 Add some unit tests for FXSYS_wcsnicmp()
- a3ce9f4ae Roll base/allocator/partition_allocator/ 0d03e4082..4d90e004b (14 commits)
- 2af0a20e3 Fix how FPDFText_LoadFont() detects font glyph count
- cc923cac5 Add an embedder test to demonstrate a FPDFText_LoadFont() failure
- 57d1f79d4 Avoid the NULL, 0 undefined behavior problem in CFX_GlyphCache
- 4da226a07 Change GetFileContents() test utility to return a vector
- 1bcae281c Roll build/ ab8815d43..292639dc3 (1 commit)
- 4a4d922ba Roll build/ f4167331a..ab8815d43 (29 commits)
- 7233e99fc Use span in Processor::ProcessPdf()
- f0e6edf81 Use pdfium::base::checked_cast() in a few test files
- 1225d9d80 Remove out-parameter from PathService::GetTestFilePath()
- 16bebb03b Extend PDFEditImgTest.NewImageObjLoadJpeg to write out a PDF
- c930b5516 Add missing stdint.h include in span.h
- 42154f4c1 Rename some "remain" variables to "remaining"
- 6a34da391 Fix misalignment between Redo and Undo after consecutive text pasting.
- 5f814e878 Fulfill a TODO in cpdf_dib.cpp
- 9be49d4b0 Add rust_build_tests to pdfium_all when enable_rust=true
- ae960583a Roll Instrumented Libraries from 032e9c850ab9 to 48a6beefc1bb (2 revisions)
- 7a7c86c68 Roll Memory Tools from 8b06a5370188 to bb03b820532d (1 revision)
- d336e56dc Roll third_party/skia/ e8c78601e..77aeee3b8 (221 commits)
- 650399315 Update reclient_version to 0.118.1.ae3c3be-gomaip
- 75815063c Roll base/allocator/partition_allocator/ 6f90cb04a..0d03e4082 (16 commits)
- f76cff73a Roll third_party/libunwind/src/ 7608093d2..69b8c6469 (6 commits)
- 70062cf7a Roll v8/ 1fb69d9f5..06aba4270 (284 commits)
- e21a7e389 Roll fuchsia_gn_sdk_revision and fuchsia sdk