-
Notifications
You must be signed in to change notification settings - Fork 44
[Feature] Add. PdfDoc.saveToTargetPath() for Writing PDF Directly to File Path #87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
[Feature] Add. PdfDoc.saveToTargetPath() for Writing PDF Directly to File Path #87
Conversation
This lib is cross-platform, including web. How can this work? |
Ahh.. right! thanks for reviewing! 🙏🙏 |
By the way currently my PR Only might work only Node.js Environment |
Maybe for other environments we could just write an error in the console explaining that this feature is not available on that environment? |
That's an idea! Fist of all, i would try to handle all other environments include Node.js but, if it's hard to cover let me know you by next weekend @Sharcoux |
i had some issue of memory usage of this feature.. 😂 give me some time for working on that! |
Would this PR help with adding streaming support to this library? This library is great, but I've ran into a memory issue in a serverless environment since it doesn't support streams it consumes more memory than my serverless function is allocated. In my case, I'm merging pdfs and would love to be able to stream pdfs into a writeable stream and direct it to disk, allowing this library to merge larger documents while maintaining constant-ish memory usage. As per the feedback from @Sharcoux (WRT to compatbility), perhaps the Web streams api would be a better a more compatible approach which would allow the stream to be redirected to a file in node, or a blob or something in the browser. I'd be happy to assist, but might be more effective with a pointer as to what portion of the codebase to focus on / high level approach. |
It would not. We read the pdf before parsing it. Then you can write it. Streaming won't help much as long as we need the full data to do the parsing. |
But that PR might still improve memory perfs as highlighted by @YoungSeok-Choi I agree with the use of web stream api |
@Sharcoux thanks for the quick reply, is there any issue in the backlog for streaming support? I looked in this repo but didn't find anything, I think the original repo had a few things in there backlog / roadmap. Would be great to land have a discussion about approach / feasability. |
It's unlikely that we'll work on this, but feel free to open a PR and start discussing stuff. This is a community package. We'll just monitor the PRs and releases, and try to help when we can. |
What?
Currentiy there is only way to save an modified PDF File(
PDFDocument.save()
) which is creates a entire Copy of PDF size bufferthis PR add a new feature(
PDFDocument.saveToTargetPath()
) totally same functionality withPDFDocument.save()
but reduces memory usage by writing PDF data directly to specific directory path (without copying Unit8Array
)Why?
I created this feature to address high memory usage when working in serverless environments (e.g., AWS Lambda). When processing large PDFs (e.g., 1GB), the current implementation requires at least 2GB of memory, resulting in substantial resource costs.
This PR reduces memory requirements, which is especially important in memory-constrained environments.
For more details, please refer to the related issue I created:
How?
PDFDocument.save()
: Creates a Uint8Array for the entire size of the modified PDF file and writes this array.PDFDocument.saveToTargetPath()
: Writes the modified PDF file directly to a specified path using a Writable stream, avoiding the creation of a large Uint8Array.Testing?
outputPath
) options is valid.PDFDocument.save()
andPDFDocument.saveToTargetPath()
are identical.New Dependencies?
No Additional Dependency
Screenshots
Suggested Reading?
Yes
Anything Else?
Nothing!
Checklist