Description
I agree with @alexaryn that responsibility for closing a stream should rest purely with the caller of `PdfWriter`, and not with `PdfWriter` itself. Calling `pdf_writer.close()` or using the context manager should only clean up the resources that `PdfWriter` itself has created as part of its operation. I apologize for not catching the current behavior when I reviewed #1193. 😬

Looking at the code, there is also the inefficiency where `PdfWriter.__init__` is called twice. It's also permissible to call `PdfWriter` with a `PdfReader` as the first argument (to simplify cloning), but doing so inside a context manager hits the following error:

    >>> with PdfReader("./sample-files/001-trivial/minimal-document.pdf") as reader:
    ...     with PdfWriter(reader) as writer:
    ...         print(writer.fileobj)
    ...
    __init__
    __enter__
    __init__
    <pypdf._reader.PdfReader object at 0x10304e970>
    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
      File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 373, in __exit__
        self.write(self.fileobj)
      File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 1396, in write
        self.write_stream(stream)
      File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 1367, in write_stream
        object_positions, free_objects = self._write_pdf_structure(stream)
      File "/Users/mpeveler/code/github/pypdf/pypdf/_writer.py", line 1500, in _write_pdf_structure
        stream.write(self.pdf_header.encode() + b"\n")
    AttributeError: 'PdfReader' object has no attribute 'write'
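To make the failure easier to reason about without pypdf installed, here is a toy reproduction of the call order and the `AttributeError` seen in the traceback above. `ToyReader`, `ToyWriter`, and `reproduce` are hypothetical names, and the way `__enter__` re-runs `__init__` is my assumption based on the printed `__init__ __enter__ __init__` sequence. It is a sketch of the mechanism, not pypdf's actual code.

```python
class ToyReader:
    """Stand-in for a reader: deliberately has no write() method."""


class ToyWriter:
    """Toy model only. The first positional argument is remembered as the
    output target, which is what the traceback suggests happens."""

    def __init__(self, fileobj=""):
        # Keep a call log across repeated __init__ calls on the same object.
        self.calls = getattr(self, "calls", []) + ["__init__"]
        self.fileobj = fileobj

    def __enter__(self):
        self.calls.append("__enter__")
        target = self.fileobj
        self.__init__()        # assumed second initialisation, as in the trace
        self.fileobj = target  # restore the remembered target
        return self

    def __exit__(self, exc_type, exc, tb):
        if self.fileobj:
            # Placeholder header bytes; a reader has no write(), so this raises.
            self.fileobj.write(b"%PDF-1.3\n")
        return False


def reproduce():
    """Run the toy scenario and return (call log, error message or None)."""
    writer = ToyWriter(ToyReader())
    try:
        with writer:
            pass
    except AttributeError as exc:
        return writer.calls, str(exc)
    return writer.calls, None


calls, message = reproduce()
print(calls)
print(message)
```

The point of the toy is that the object passed for cloning doubles as the write target on exit, so any argument that is not a writable stream fails only when the `with` block ends.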
I'm not sure what the correct behavior here should be, especially with regard to cloning. For example, if I call:

    with PdfWriter("foo.pdf") as writer:

and `foo.pdf` exists, do I expect that it'll be cloned into the writer? I'd think not? However, I think this does force the second `__init__` call to be made, as there's no way to tell in the first `__init__` that we're in a context. I'd think this could be confusing for end users, but I guess not, since no one has written in about this behavior.
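As a rough sketch of the ownership rule argued for in the first paragraph, the class below closes a stream only when it opened that stream itself. `OwningWriter` and its `_owns_stream` flag are hypothetical and not part of pypdf. The sketch deliberately sidesteps the cloning question by always opening a path for writing, which is an assumption rather than a proposal for what `PdfWriter("foo.pdf")` should do.

```python
import io
import os
import tempfile


class OwningWriter:
    """Hypothetical writer that tracks whether it created its own stream."""

    def __init__(self, target):
        if isinstance(target, (str, os.PathLike)):
            # The writer opened this file, so the writer must close it.
            self._stream = open(target, "wb")
            self._owns_stream = True
        else:
            # The caller supplied the stream, so the caller closes it.
            self._stream = target
            self._owns_stream = False

    def write(self, data):
        self._stream.write(data)

    def close(self):
        # Clean up only the resources this object created.
        if self._owns_stream:
            self._stream.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


# Caller-owned stream: still open and readable after the with block.
buffer = io.BytesIO()
with OwningWriter(buffer) as writer:
    writer.write(b"data")
still_open = not buffer.closed

# Writer-owned stream: closed automatically on exit.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.bin")
    with OwningWriter(path) as writer:
        writer.write(b"data")
    path_stream_closed = writer._stream.closed
```

With this split, a caller who passes a `BytesIO` can still call `buffer.getvalue()` after the block, which is the behavior the ownership argument is asking for.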
Originally posted by @MasterOdin in #2905 (comment)