Fix: Decrease pressure via agent disposal / management #6380
base: master
Conversation
Pull Request Overview
This PR addresses HTTP agent resource leaks that cause timeouts after prolonged operation by implementing agent reuse and proper cleanup. Instead of creating new HTTP/HTTPS agents for every request, the code now creates reusable agents per monitor instance with keepAlive disabled to prevent connection pooling issues (see the sketch after the key changes below).
Key Changes:
- Added reusable HTTP/HTTPS agent instance variables to the Monitor class
- Modified HTTP/HTTPS/Steam/Docker monitor types to reuse agents instead of creating new ones per request
- Implemented agent cleanup in the stop() method to prevent resource leaks
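To make the pattern concrete, here is a minimal sketch of per-monitor agent reuse with cleanup on stop. The class shape, field names, and the axios call are illustrative assumptions for this discussion, not code copied from the PR:

```javascript
// Sketch of the reuse-and-cleanup pattern described above. The Monitor shape
// and the beat()/stop() methods are assumptions for illustration only, not
// the actual Uptime Kuma implementation.
const http = require("http");
const https = require("https");
const axios = require("axios");

class Monitor {
    constructor(url) {
        this.url = url;
        // One agent pair per monitor instance, reused across checks.
        // keepAlive is disabled so sockets are not pooled between requests.
        this.httpAgent = new http.Agent({ keepAlive: false });
        this.httpsAgent = new https.Agent({ keepAlive: false });
    }

    async beat() {
        // Reuse the same agents instead of constructing new ones per request.
        return axios.get(this.url, {
            httpAgent: this.httpAgent,
            httpsAgent: this.httpsAgent,
        });
    }

    stop() {
        // Destroy any sockets the agents still hold so nothing lingers
        // after the monitor is stopped.
        this.httpAgent.destroy();
        this.httpsAgent.destroy();
    }
}
```

The two key points are that agent construction moves from per-request to per-monitor, and that `stop()` explicitly calls `Agent.destroy()` instead of relying on garbage collection.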
I am working on testing this locally -- I am brand new to the project, so it may take a little while. Feedback is welcome!
So what was the result of the testing?
@CommanderStorm I'm doing this as a side-side-quest for my day job and have never touched this repo before, so it's going to take a little while to build, push, and use the Docker container in our existing setup amongst my other priorities. Hopefully a maintainer or others can weigh in on this PR in the interim.
Just read #275 again. I have the same concern that @steveiliop56 raised in #275 (comment). I have also checked the Node.js docs: https://nodejs.org/api/http.html#new-agentoptions
And actually I am not so familiar with ...
And since you mentioned Claude AI: did the idea of reusing the agent come from it? If yes, I think we had better stop here.
Great catch. I was taking a belt & suspenders approach. That one's on me; I hadn't read the docs thoroughly enough. I was coming in with a .NET mindset of disposing anything disposable. But from those same docs, right after the part you quoted:
So I'm thinking there could be something where enough sockets are kept open longer than they should be, causing an issue. This is a strange situation: I'm attempting to resolve a performance-related symptom whose root cause is difficult to diagnose, so I'm looking for a non-harmful practice that could also plausibly fix things, and this stuck out.

To be clear, I'm also shifting my thinking to agree that this is less about memory (and perhaps less about sockets) and more about pressure from object accumulation over a long time. I'm not seeing memory pressure on our container at all. My new theory is that enough of these agents are being created that we hit socket exhaustion faster than they're cleaned up, or that GC pauses get longer as objects accumulate. And since it's efficient and safe to reuse agents, nothing seems to be hurt by this change.

I did work with a colleague on a small test harness (which I don't think belongs in the larger codebase, though let me know if you want it). It just shows that we no longer accumulate agents with this PR, but that much is obvious from the code change anyway.

If you'd like, I can deploy this PR into our production environment and let it run for some time. That might be the ultimate proof: putting it in a production scenario and seeing whether the symptom is resolved.
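The harness itself was not shared in this thread, so the following is only a hypothetical sketch of what such a comparison could look like. The target URL, iteration counts, and labels are placeholders, and it assumes some local HTTP endpoint to hit:

```javascript
// Hypothetical sketch of an agent-accumulation comparison (the real harness
// was not included in this thread). It issues the same number of requests
// with per-request agents and with a single reused agent, then logs how many
// Agent objects were constructed plus rough heap growth in each mode.
const http = require("http");

const TARGET = "http://localhost:8080/"; // placeholder: any local HTTP endpoint

function probe(agent) {
    return new Promise((resolve) => {
        const req = http.get(TARGET, { agent }, (res) => {
            res.resume();          // drain the body so the socket is released
            res.on("end", resolve);
        });
        req.on("error", resolve);  // errors don't matter for this comparison
    });
}

async function run(label, iterations, agentPerRequest) {
    const heapBefore = process.memoryUsage().heapUsed;
    let agentsCreated = 0;
    let shared = null;

    for (let i = 0; i < iterations; i++) {
        let agent;
        if (agentPerRequest) {
            // Old pattern: a fresh Agent object for every single request.
            agent = new http.Agent({ keepAlive: false });
            agentsCreated++;
        } else {
            // PR pattern: one Agent created up front and reused throughout.
            if (!shared) {
                shared = new http.Agent({ keepAlive: false });
                agentsCreated++;
            }
            agent = shared;
        }
        await probe(agent);
    }

    const heapGrowth = process.memoryUsage().heapUsed - heapBefore;
    console.log(`${label}: ${agentsCreated} agents created, heap grew ~${Math.round(heapGrowth / 1024)} KiB`);
    if (shared) {
        shared.destroy(); // mirror the cleanup the PR adds in stop()
    }
}

(async () => {
    await run("per-request agents", 2000, true);
    await run("reused agent", 2000, false);
})();
```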
I've deployed this to an experimental container under my Docker Hub account, and we are now using it in our pre-prod and prod environments. I'll report back on findings; this can stay open in the interim. Based on past experience, it takes about 18 hours for the symptom to show up. I'll let this run for a few days, and if symptoms don't show up, I think we'll have a strong signal on this.
Observations from production so far:
Additional observations from production:
So I do think we have a strong signal that this is, at a minimum, helping with the symptom. I'm going to keep monitoring and reporting for a few more days.
ℹ️ NOTE: This fix was generated in concert with Claude AI, but with me in the driver's seat: starting from my own hunch, working through the analysis, and then looking at the possible fixes. I do not have a deep background in node/TS, but I have 20 years of experience and I also hate AI slop. 😄 Blunt feedback on the PR is welcome.
❗ Important Announcements
Click here for more details:
🚫 Please Avoid Unnecessary Pinging of Maintainers
We kindly ask you to refrain from pinging maintainers unless absolutely necessary. Pings are for critical/urgent pull requests that require immediate attention.
📋 Overview
🛠️ Type of change
📄 Checklist
📷 Screenshots or Visual Changes
N/A