perf: [3759] Memory fix #5562

Open
Deses wants to merge 25 commits into homarr-labs:dev from Deses:3759-memory-fix

@Deses Deses commented Apr 24, 2026


Homarr

Thank you for your contribution. Please ensure that your pull request meets the following requirements:

  • Builds without warnings or errors (pnpm build, autofix with pnpm format:fix)
  • Pull request targets dev branch
  • Commits follow the conventional commits guideline
  • No shorthand variable names are used (e.g. x, y, i or any abbreviation)
  • Documentation is up to date. Create a pull request here.

Continuation of some ramblings on: #3759 (comment)

Image

What I did!

1.- I started by lazy-loading the integration SDKs: packages/integrations/src/base/creator.ts was importing all 48 integration classes at startup. Even if you only use Proxmox and Pi-hole, as in my case, every single integration gets loaded, which is crazy.
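For readers following along, the lazy-loading idea in item 1 could look roughly like the sketch below. It is not the code from this PR: the `integrationLoaders` map, `createIntegration`, and the two stand-in classes are hypothetical names, and each loader would really be a dynamic `import()` of the integration's module rather than an inline function.

```typescript
// Minimal sketch of lazy-loading integrations by kind (all names are hypothetical).
interface Integration {
  readonly kind: string;
}

type IntegrationConstructor = new () => Integration;
type IntegrationLoader = () => Promise<IntegrationConstructor>;

// Stand-ins for real integration classes that would live in their own modules.
class PiHoleIntegration implements Integration {
  readonly kind = "piHole";
}
class ProxmoxIntegration implements Integration {
  readonly kind = "proxmox";
}

// Records which loaders actually ran, so the laziness is observable.
const loadedKinds: string[] = [];

// In a real codebase each entry would be something like
// () => import("./pi-hole").then((module) => module.PiHoleIntegration)
const integrationLoaders: Record<string, IntegrationLoader> = {
  piHole: async () => {
    loadedKinds.push("piHole");
    return PiHoleIntegration;
  },
  proxmox: async () => {
    loadedKinds.push("proxmox");
    return ProxmoxIntegration;
  },
};

// Cache the pending load so each integration module is loaded at most once.
const constructorCache = new Map<string, Promise<IntegrationConstructor>>();

async function createIntegration(kind: string): Promise<Integration> {
  const loader = integrationLoaders[kind];
  if (!loader) {
    throw new Error(`Unknown integration kind: ${kind}`);
  }
  let pendingConstructor = constructorCache.get(kind);
  if (!pendingConstructor) {
    pendingConstructor = loader();
    constructorCache.set(kind, pendingConstructor);
  }
  const IntegrationClass = await pendingConstructor;
  return new IntegrationClass();
}
```

With this shape, a setup that only ever asks for `piHole` never runs the `proxmox` loader, which is the behaviour the item describes.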

2.- Then I moved tasks and websockets into Next.js to remove 2 Node processes:
Both got merged into Next.js via instrumentation.ts, which Next.js already calls on startup, so that one process runs everything.
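As a rough illustration of item 2, an instrumentation.ts file exports a `register` function that Next.js calls once when the server starts. The sketch below is an assumption about the general shape, not the file from this PR: `startWebsocketServer` and `startTaskScheduler` are hypothetical stubs, and the runtime is passed in as a parameter only so the snippet runs on its own.

```typescript
// Sketch of starting background services from a Next.js-style register() hook.
// The two starter functions are hypothetical placeholders for the real
// websocket and task code, which would normally live in their own modules.
const startedServices: string[] = [];

async function startWebsocketServer(): Promise<void> {
  startedServices.push("websocket");
}

async function startTaskScheduler(): Promise<void> {
  startedServices.push("tasks");
}

// In a real instrumentation.ts, Next.js calls register() with no arguments and
// the runtime would be read from process.env.NEXT_RUNTIME. It is a parameter
// here so the sketch can run standalone.
export async function register(runtime?: string): Promise<void> {
  // Only the Node.js runtime can host long-lived servers and schedulers.
  if (runtime !== "nodejs") {
    return;
  }
  await startWebsocketServer();
  await startTaskScheduler();
}
```

Keeping the starters in separate modules and importing them from `register` keeps the hook itself short.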

3.- Another thing I changed was to stop the integrations from polling when the page is not being used. I might be overreaching, but IMO if the page is closed it should not keep refreshing the widgets, since no one is looking at them. I did this by checking the websocket and setting a 1-minute timer, after which the integrations stop polling. This will save more CPU than memory, but I guess it's nice to have.
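The presence check in item 3 can be pictured as a small tracker that counts connected websocket clients and pauses polling after a grace period with none. This is a hedged sketch under assumptions: `PresenceTracker`, `Scheduler`, and the pause/resume callbacks are hypothetical names, and the sketch does not distinguish jobs a user paused manually.

```typescript
// Sketch of presence-based pausing with a grace period (all names hypothetical).
interface Scheduler {
  schedule(callback: () => void, delayMs: number): unknown;
  cancel(handle: unknown): void;
}

// Default scheduler backed by real timers; tests can inject a fake one.
const timerScheduler: Scheduler = {
  schedule: (callback, delayMs) => setTimeout(callback, delayMs),
  cancel: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

class PresenceTracker {
  private readonly gracePeriodMs: number;
  private readonly onPause: () => void;
  private readonly onResume: () => void;
  private readonly scheduler: Scheduler;
  private connectedClients = 0;
  private pendingPause: unknown = null;
  private paused = false;

  constructor(
    gracePeriodMs: number,
    onPause: () => void,
    onResume: () => void,
    scheduler: Scheduler = timerScheduler,
  ) {
    this.gracePeriodMs = gracePeriodMs;
    this.onPause = onPause;
    this.onResume = onResume;
    this.scheduler = scheduler;
  }

  clientConnected(): void {
    this.connectedClients += 1;
    // A returning client cancels any pending pause.
    if (this.pendingPause !== null) {
      this.scheduler.cancel(this.pendingPause);
      this.pendingPause = null;
    }
    if (this.paused) {
      this.paused = false;
      this.onResume();
    }
  }

  clientDisconnected(): void {
    this.connectedClients = Math.max(0, this.connectedClients - 1);
    if (this.connectedClients > 0 || this.pendingPause !== null || this.paused) {
      return;
    }
    // Last client left: pause only after the grace period elapses.
    this.pendingPause = this.scheduler.schedule(() => {
      this.pendingPause = null;
      this.paused = true;
      this.onPause();
    }, this.gracePeriodMs);
  }
}
```

A usage along the lines of `new PresenceTracker(60_000, pausePolling, resumePolling)` would match the 1-minute timer mentioned above, with `pausePolling` and `resumePolling` supplied by the job system.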

4.- I also now force a heap limit and try to trigger garbage collection more often. It helps, but since Node doesn't seem to unload libraries once they are loaded, it isn't helping as much as I'd like.
That's why, if I reboot the container, the memory sits at 200MB, but as soon as I open the page for the first time it goes to 340MB or so and stays there.

5.- To address this I first tried doing a small hardcoded test reloading homarr after 5 minutes of inactivity and, when I saw it working, I moved on to adding a setting in the configuration so users can toggle the setting and choose how much inactivity before reloading.
This change REQUIRES that the docker --restart=unless-stopped flag is set or it'll just die and never restart.

image

I added translations for all the languages. I used DeepL to translate these where they had the language available and Claude Free for the rest. I can only personally verify the correctness of English, Spanish, Catalan, Portuguese and Italian. A German friend verified German and the rest... let's hope for the best.

In all honesty, this commit is a bit... idk... maybe you want to take it out. It did work for me in two environments, but I can see it leading to issues. That being said, this is the memory footprint after doing a reload:

Image

6.- A change you might not like or has to be tested more is the removal of isomorphic-dompurify from the SVG upload route. It alone needed an entire jsdom environment just to sanitize uploads which seems excessive. I replaced it with a small regexp that cleans user input.


Let me know what you think! I'm really excited about these changes! And I really needed these!! I have about 1 GB free on my Proxmox machine. 🫠

@Deses Deses requested a review from a team as a code owner April 24, 2026 01:38
@Deses Deses changed the title [3759] Memory fix perf: [3759] Memory fix Apr 24, 2026
@deepsource-io
Contributor

deepsource-io Bot commented Apr 24, 2026

DeepSource Code Review

We reviewed changes in b0f2c94...40ae6a7 on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.

See full review on DeepSource ↗

Code Review Summary

Analyzer Status Updated (UTC) Details
JavaScript May 1, 2026 10:06p.m. Review ↗

@manuel-rw
Member

Thanks, will review asap

@Meierschlumpf
Member

First of all, thank you for taking the time to tinker with the codebase. It seems like you might have found a good solution for combining the Node processes without the limited custom server for standalone Next.js, great 👍🏼

Some thoughts I currently have about this pull request:

  • It's great that it reduces the usage by about half
  • It's pretty bad that we have duplicated a lot of code (like widget definitions), but of course this can still be improved
  • Some type definitions probably need a few improvements (like integration creator)
  • Features like pausing jobs were already raised as an issue in feat(tasks): presence detection #3511. However, the current solution will cause issues for manually paused jobs, so I would suggest not making this part of these changes
  • It's an interesting idea to stop the server when nobody is using it, but I think this will most likely break some setups, and I'm not sure it actually improves memory usage by much
  • We should move the websocket and tasks code outside the instrumentation.ts file and import it, instead of the many await import calls in there
  • The change with dompurify probably needs to be reverted due to security issues with a simple regex (we might be able to optimize it however using some notes from npmjs.com/package/isomorphic-dompurify)
  • Translations should only be done for en.json, all other languages are translated on Crowdin by the community.

@manuel-rw
Member

Regarding the idle restart, I don't like it for several reasons:

  • It relies on the host restarting the server. In Docker, this requires the mentioned flag. This change may be breaking and might require downstream changes (e.g. proxmox, runtipi, unraid, ...)
  • I doubt it frees up many resources. The most frequent complaint is about the high idle RAM usage, which this PR is addressing. We can further reduce CPU usage by monitoring user presence and running jobs less frequently when no one is present.
  • It is not fixing the root problem but is trying to combat the symptoms. I'd rather focus on the root cause.

Member

As mentioned, I think we can revert this

Author

Yeah, I'll take it out, but I think it could be nice as an optional feature for people who know what they're doing or for seriously memory-constrained users.

import { medias } from "@homarr/db/schema";

// Lightweight SVG sanitizer (This replaces DOMPurify!!)
function sanitizeSvg(svg: string): string {

Member

  1. We should put this into a separate file; we may need to reuse it
  2. Does this roughly behave the same? We disclosed the vulnerability publicly and a regression in the protection against the exploit would be fatal.

Author

It does roughly behave the same, but only on a surface level.

Now that I understand what DOMPurify does (it analyzes the whole DOM of the SVG), I can see it is far more thorough at stripping anything bad an SVG could contain. I think the small code I wrote could be bypassed with enough persistence... plus the DOMPurify team actively maintains its code and addresses new security vulnerabilities, which is nice.

Replacing DOMPurify or making its footprint smaller belongs in another PR, but I'll keep what I wrote in a new file in case we want to reuse it in the future.
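To make the bypass concern concrete, here is a small demonstration with a deliberately naive regex stripper. It is a hypothetical example, not the `sanitizeSvg` function from this PR: `naiveStripScripts` only removes `<script>` blocks in a single pass, which is one common shape a regex sanitizer can take.

```typescript
// Hypothetical single-pass regex sanitizer, used only to show why this
// approach is fragile. It is NOT the sanitizer from the pull request.
const scriptBlockPattern = /<script\b[\s\S]*?<\/script>/gi;

function naiveStripScripts(svg: string): string {
  return svg.replace(scriptBlockPattern, "");
}

// Case 1: the straightforward payload is removed as intended.
const plainPayload = "<svg><script>alert(1)</script></svg>";
const plainResult = naiveStripScripts(plainPayload);
// plainResult === "<svg></svg>"

// Case 2: removing the inner empty script blocks stitches the outer
// fragments back together into a working script element.
const nestedPayload = "<scr<script></script>ipt>alert(1)</scr<script></script>ipt>";
const nestedResult = naiveStripScripts(nestedPayload);
// nestedResult === "<script>alert(1)</script>"

// Case 3: event-handler attributes are not script elements at all,
// so they pass through untouched.
const handlerPayload = '<svg onload="alert(1)"></svg>';
const handlerResult = naiveStripScripts(handlerPayload);
// handlerResult === handlerPayload
```

A parser-based sanitizer works on the parsed element tree rather than on raw text, which is why it is not fooled by the second and third cases in the same way.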

Comment thread apps/nextjs/src/instrumentation.ts Outdated
Comment on lines +94 to +98
restartTimer = setTimeout(() => {
  restartTimer = null;
  logger.info(`No clients for ${gracePeriodMinutes} minutes - restarting process to free memory`);
  exitProcess(0);
}, gracePeriodMinutes * 60_000);

Member

IMO can be reverted / removed.

Comment thread apps/nextjs/src/instrumentation.ts Outdated
});

// Periodically hint V8 to collect garbage that accumulates from cron jobs and
// request handlers. --expose-gc must be set in NODE_OPTIONS for this to work;

Member

Is this flag safe for production?
Also, as a .NET developer: isn't GC supposed to run in the background, managed by the runtime?

Author

I've reverted the GC changes too.
--expose-gc is used to enable the GC hint calls, which just nudge the GC to work, but it can also ignore them. You know how temperamental GC is.
--max-old-space-size=400 can help on constrained hosts by capping heap growth, but 400 MB could be a tad aggressive for a Next.js app and could cause OOM crashes. In my testing I haven't had problems, and the app did stretch its legs to 800 MB just to go back down after some time, but without extensive testing I can't vouch for it working for everyone. That's another option we could expose to a user who knows what they're doing.
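For context on the "silently skipped" behaviour discussed in this thread: when Node is started with --expose-gc, V8 exposes a global `gc` function, and without the flag that global does not exist. A guarded hint could be sketched as below. The function names are hypothetical and this is not the reverted code from the PR.

```typescript
// Sketch of an optional GC hint that degrades to a no-op when the
// runtime was not started with --expose-gc (function names are hypothetical).
type GarbageCollector = () => void;

function getExposedGarbageCollector(): GarbageCollector | undefined {
  // Reflect.get avoids depending on type declarations for a global `gc`.
  const candidate: unknown = Reflect.get(globalThis, "gc");
  return typeof candidate === "function" ? (candidate as GarbageCollector) : undefined;
}

// Returns true when a collection was requested, false when gc is unavailable.
function hintGarbageCollection(): boolean {
  const collect = getExposedGarbageCollector();
  if (!collect) {
    return false;
  }
  collect();
  return true;
}
```

Calling `hintGarbageCollection()` from a periodic timer would give the behaviour the code comment describes, with the return value available for logging whether the hint was possible at all.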

Comment thread apps/nextjs/src/instrumentation.ts Outdated

// Periodically hint V8 to collect garbage that accumulates from cron jobs and
// request handlers. --expose-gc must be set in NODE_OPTIONS for this to work;
// if not available it is silently skipped.

Member

Instead of describing what, can you describe why?
Why would it not be available? If the flag mentioned above is not set?

Member

Very interesting! Thank you for this change :)

Member

Interesting approach, @Meierschlumpf can you proofread this? I am not deep enough to fully review this code.

Member

Translations should never be done in a developer change, they are done by the community using Crowdin. You can leave them as is, the community can correct them if they are bad. For next time, you don't have to do this step :)

Author

I understand. I personally hate Crowdin because it takes a crazy amount of time to change each line when I can just use my IDE and change them directly... IIRC I tried to help translate speedtest-tracker once, got frustrated with the tool and just gave up.

It would be great to have the ability to submit translation changes through a PR and have others review them in Crowdin.
Does Crowdin pick up changes that are made on GitHub?

Member

Does this mean now we have duplicate definitions?

Author

Yeah, but it's intentional. The definition.ts files are server-side stubs so the server can compute default widget option values without dragging in React or UI dependencies. The full definitions in index.tsx are still used client-side as before. It's more files but it keeps the server bundle clean.

Before my changes the server was importing full widget modules (React components, Mantine, Tabler icons, client API etc etc) just to call createOptions(). The definition.ts stubs cut that down to just the schema. It's more of a bundle/startup improvement than a runtime heap reduction though. The big memory improvement comes from consolidating the websocket and tasks processes into Next.js instead of running them separately.
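The split described here can be illustrated with a schema-only stub from which the server derives default option values. The sketch is an assumption about the general idea, not the PR's definition.ts files: `clockWidgetDefinition`, its option names, and `createDefaultOptions` are all hypothetical.

```typescript
// Sketch of a server-side widget stub: plain data only, no React or UI imports.
// Every name below is hypothetical and chosen for illustration.
interface OptionDefinition {
  defaultValue: unknown;
}

type OptionsSchema = Record<string, OptionDefinition>;

interface WidgetDefinitionStub {
  kind: string;
  options: OptionsSchema;
}

// What a definition.ts-style stub might contain for one widget.
const clockWidgetDefinition: WidgetDefinitionStub = {
  kind: "clock",
  options: {
    is24HourFormat: { defaultValue: true },
    showSeconds: { defaultValue: false },
  },
};

// The server only needs the schema to compute default option values.
function createDefaultOptions(schema: OptionsSchema): Record<string, unknown> {
  const defaults: Record<string, unknown> = {};
  for (const optionName of Object.keys(schema)) {
    defaults[optionName] = schema[optionName].defaultValue;
  }
  return defaults;
}

const clockDefaults = createDefaultOptions(clockWidgetDefinition.options);
```

The cost of this layout is the duplication raised in the review: the stub and the full client-side definition have to be kept in sync.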

Comment thread SECURITY.md
Member

This change is somewhat unrelated, would be nice to have such changes separate next time :)

Author

@Deses Deses May 1, 2026

guilty as charged 🥲

@manuel-rw
Member

I also pushed some commits to fix the linting. Can you check those out too? You can run lint and format locally to check.
pnpm lint / pnpm format

@Deses
Author

Deses commented Apr 27, 2026

I'll check out your comments and make the changes as soon as possible! Probably between Thursday and Friday. I've been very busy and severely sleep deprived these past few days lol

@Meierschlumpf
Member

Meierschlumpf commented Apr 28, 2026

No pressure - now that we have a solution, we don't have to release it right away. Please take care of yourself; I hope you'll be getting enough sleep again soon.

@manuel-rw
Member

No worries, please sleep enough. Sleep is more important :)

@Deses
Author

Deses commented May 1, 2026

After addressing some of the changes you wanted me to make, this is how much memory Homarr is using now:

image

And still very little usage after just rebooting and not opening the page.

image

I think it's worth it to explore an auto-reset again in the future as an advanced user option.

I'll test a bit more before pushing :)

@manuel-rw
Member

Can you @ me as soon as we can discuss the bounty? You can also DM me on Discord so we can organise the payment (Manicr..)

Deses added 3 commits May 1, 2026 23:32
# Conflicts:
#	apps/tasks/package.json
#	packages/api/src/router/cron-jobs.ts
#	pnpm-lock.yaml
- conExpressionShcema no longer used
- stopJob was async because it used an HTTP client and now it calls JobManager directly so no async needed
@Deses Deses requested a review from manuel-rw May 1, 2026 22:12
@Meierschlumpf Meierschlumpf linked an issue May 1, 2026 that may be closed by this pull request
@Deses
Author

Deses commented May 2, 2026

> Can you @ me as soon as we can discuss the bounty? You can also DM me on Discord so we can organise the payment (Manicr..)

Oh btw, done! You have a friend request. :)

Development

Successfully merging this pull request may close these issues.

[Bounty Claimed] High Memory Usage

3 participants