Form rendering differences for textboxes and checkboxes when generated with pypdf and Acrobat #3115

Open
@alpepi

Description

I have a PDF form I am trying to autofill. The form was made with Adobe Acrobat Pro, and it contains various textboxes and checkboxes.

I am able to add text entries in the textboxes, and when I open that output PDF in Adobe Acrobat, the file opens and the entries "render" properly.

However, the issue arises when I also check checkboxes. When I fill both textboxes and checkboxes with pypdf and then open the PDF form in Adobe Acrobat, the checkboxes aren't truly checked and the formatting of all the textboxes is broken (i.e. not rendered). Let me explain further.

Text in pypdf-filled textboxes looks like this:
Image

The formatting only fixes itself if I manually modify the text entry with Adobe Acrobat:
Image

Also, visually the checkboxes are checked:
Image

However, if I manually click this checkbox once in Acrobat, the box remains checked (i.e. it was never ON in Acrobat's eyes?):
Image

A second click then unchecks the box. In contrast, checkboxes not touched by pypdf don't behave this way: the first click switches the checkbox to the opposite state, as expected.

Once I manually interact with the pypdf-filled checkboxes in Acrobat and save the document, all the textboxes render properly upon re-opening. So in summary, checking checkboxes with pypdf breaks the rendering of all textboxes in Acrobat, and it seems that Acrobat doesn't recognize the checkboxes as checked by pypdf.
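One thing that may be worth ruling out (an assumption, not a confirmed cause of the behaviour above) is whether the value written to the checkbox matches the on-state name the form itself defines. The sketch below lists, for every button field, the first appearance state that is not `/Off`. It relies on the `/_States_` entry that pypdf adds to the result of `get_fields()`; the helper names `pick_on_state` and `checkbox_on_states` are hypothetical.

```python
from typing import Dict, Iterable, Optional


def pick_on_state(states: Iterable[str]) -> Optional[str]:
    """Return the first appearance state that is not "/Off", or None."""
    for state in states:
        if str(state) != "/Off":
            return str(state)
    return None


def checkbox_on_states(pdf_path: str) -> Dict[str, Optional[str]]:
    """Map each button field name to its on-state name, as reported by pypdf."""
    # Imported lazily so pick_on_state can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    fields = reader.get_fields() or {}
    result: Dict[str, Optional[str]] = {}
    for name, field in fields.items():
        if field.get("/FT") == "/Btn":
            # "/_States_" is filled in by pypdf from the field's appearance states.
            result[name] = pick_on_state(field.get("/_States_", []))
    return result
```

Calling `checkbox_on_states("blank-form.pdf")` and comparing the entry for `Other` with the `/On` value used when filling would show whether the two agree. If they differ, passing the reported name instead is a cheap experiment.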

Environment

Python 3.10.11, pypdf 5.2.0
After pypdf outputs the PDF, I view in Adobe Acrobat.

Note that this doesn't seem to be an issue when opening the PDF with Chrome. But for my purposes I must use Adobe Acrobat.

Code + PDF

from pypdf import PdfReader, PdfWriter


def write_to_form(input_name: str, dict_to_write: dict, output_name: str, auto_regen: bool) -> None:
    reader = PdfReader(input_name)
    writer = PdfWriter()

    # Field dictionary, handy for inspecting field names; not used below.
    fields = reader.get_fields()

    # Copy all pages and the form definition from the source document.
    writer.append(reader)

    # Only fields with widgets on the first page are updated here.
    writer.update_page_form_field_values(
        writer.pages[0],
        dict_to_write,
        auto_regenerate=auto_regen,
    )

    with open(output_name, "wb") as output_stream:
        writer.write(output_stream)


textboxes = {"TextBox1": "my text entry."}

buttons = {"Other": "/On"}

# Merge both dictionaries (Python 3.9+ dict union).
to_write = buttons | textboxes
write_to_form("blank-form.pdf", to_write, "filled-out.pdf", False)
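The call above passes `auto_regenerate=False` and updates only `writer.pages[0]`. As an experiment, a variant that loops over every page and passes `auto_regenerate=True` (which, per the pypdf documentation, sets the form's NeedAppearances flag so the viewer is asked to rebuild field appearances) might behave differently in Acrobat. This is a sketch under that assumption and has not been checked against this form; the function name `fill_all_pages` is hypothetical.

```python
from typing import Dict


def fill_all_pages(input_name: str, values: Dict[str, str], output_name: str) -> None:
    """Fill form fields on every page and request appearance regeneration."""
    # Imported lazily so the module can be loaded without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    writer.append(PdfReader(input_name))

    for page in writer.pages:
        # Pages without annotations carry no form widgets to update.
        if "/Annots" not in page:
            continue
        writer.update_page_form_field_values(
            page,
            values,
            auto_regenerate=True,  # sets NeedAppearances on the form
        )

    with open(output_name, "wb") as output_stream:
        writer.write(output_stream)
```

It would be called the same way as the original function, for example `fill_all_pages("blank-form.pdf", to_write, "filled-out.pdf")`.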

Unfortunately I can't share the PDF file itself.
Here is a redacted version of my PDF form. I cleared the PDF, other than the boxes of interest. The behaviour described above is still present in the remaining boxes.
blank-form.pdf

Test PDF Form - pypdf filled.pdf
Test PDF Form - Acrobat filled.pdf
