Cloning errors when using context manager #2912

Open
@pubpub-zz

I agree with @alexaryn that the responsibility for closing a stream should rest purely with the caller of PdfWriter, not with PdfWriter itself. Calling pdf_writer.close() or using the context manager should only clean up the resources that PdfWriter itself created as part of its operation. I apologize for not catching the current behavior when I reviewed #1193. 😬
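To make the ownership rule concrete, here is a minimal sketch. `SketchWriter` and its `_owns_stream` flag are hypothetical names made up for illustration; this is not pypdf's implementation, only one way a writer could close just the streams it opened itself.

```python
import io
import os
import tempfile


class SketchWriter:
    """Hypothetical stand-in for a writer; NOT pypdf's PdfWriter."""

    def __init__(self, target):
        if isinstance(target, (str, os.PathLike)):
            # The writer opens this file itself, so it is responsible for closing it.
            self._stream = open(target, "wb")
            self._owns_stream = True
        else:
            # The stream belongs to the caller; the writer must leave it open.
            self._stream = target
            self._owns_stream = False

    def write(self, data):
        self._stream.write(data)

    def close(self):
        # Only release resources this object created.
        if self._owns_stream and not self._stream.closed:
            self._stream.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


# Caller-owned stream: still open and readable after the with-block.
caller_stream = io.BytesIO()
with SketchWriter(caller_stream) as w:
    w.write(b"%PDF-1.7\n")
caller_still_open = not caller_stream.closed

# Writer-owned file: closed by the writer on exit.
tmp_path = os.path.join(tempfile.mkdtemp(), "out.pdf")
with SketchWriter(tmp_path) as w2:
    w2.write(b"%PDF-1.7\n")
owned_closed = w2._stream.closed
```

With this split, the caller can keep using `caller_stream` after the block, while the file the writer opened from a path is cleaned up automatically.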

Looking at the code, there is also an inefficiency: PdfWriter.__init__ is called twice. It is also permissible to call PdfWriter with a PdfReader as the first argument (to simplify cloning), but doing so inside a context manager hits the following error:

>>> with PdfReader("./sample-files/001-trivial/minimal-document.pdf") as reader:
...   with PdfWriter(reader) as writer:
...     print(writer.fileobj)
...
__init__
__enter__
__init__
<pypdf._reader.PdfReader object at 0x10304e970>
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 373, in __exit__
    self.write(self.fileobj)
  File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 1396, in write
    self.write_stream(stream)
  File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 1367, in write_stream
    object_positions, free_objects = self._write_pdf_structure(stream)
  File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 1500, in _write_pdf_structure
    stream.write(self.pdf_header.encode() + b"\n")
AttributeError: 'PdfReader' object has no attribute 'write'
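The traceback suggests that the reader ends up stored as `fileobj` and is then written to on exit. As a rough sketch of one possible guard (all names here, including `FakeReader`, `SketchWriter`, and `clone_from`, are hypothetical and not pypdf's API or its eventual fix), the constructor could decide once whether the first argument is a write target or a clone source, so that `__enter__` needs no second `__init__` and `__exit__` only writes when there is a target:

```python
import io

HEADER = b"%PDF-1.7\n"


class FakeReader:
    """Hypothetical stand-in for a reader; it has no write() method."""


class SketchWriter:
    """Hypothetical writer used only to illustrate the guard."""

    def __init__(self, fileobj=None, clone_from=None):
        # Assumption: a path string or anything with write() is a target;
        # any other first argument is treated as a source to clone from.
        looks_like_target = isinstance(fileobj, str) or hasattr(fileobj, "write")
        if fileobj is not None and not looks_like_target:
            clone_from, fileobj = fileobj, None
        self.fileobj = fileobj
        self.clone_from = clone_from

    def write(self, target):
        if isinstance(target, str):
            with open(target, "wb") as stream:
                stream.write(HEADER)
        else:
            target.write(HEADER)

    def __enter__(self):
        # No re-initialisation here; __init__ already recorded everything.
        return self

    def __exit__(self, exc_type, exc, tb):
        # Write on exit only if a real target was supplied.
        if self.fileobj is not None:
            self.write(self.fileobj)


reader = FakeReader()
with SketchWriter(reader) as writer:
    pass  # exits cleanly: there is no target, so nothing is written

buffer = io.BytesIO()
with SketchWriter(buffer) as writer2:
    pass  # on exit the header is written to the caller's buffer
```

Whether a reader passed positionally should be cloned at all is exactly the open question here; the sketch only shows that the decision can be made once, in `__init__`.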

I'm not sure what the correct behavior should be here, especially with regard to cloning. For example, if I call:

with PdfWriter("foo.pdf") as writer:

and foo.pdf exists, do I expect it to be cloned into the writer? I'd think not. However, I think this does force the second __init__ call, as there's no way to tell in the first __init__ that we're in a context. I'd think this could be confusing for end users, but perhaps not, since no one has written in about this behavior.

Originally posted by @MasterOdin in #2905 (comment)

Labels: PdfWriter (The PdfWriter component is affected)