Replies: 17 comments 23 replies
-
This is very helpful to use on google colab as well, good find 👍
-
Truly magical! This is a complete solution for me too, as I experienced a similar issue here: #7479
-
very cool, a few comments:
-
also @vladmandic I believe you've gotten confused with your comment on
-
what would an equivalent command on Windows be?
-
@mcmonkey4eva export affects the current context and all child processes, so if it's set in the shell before you run webui, any other processes you start before or after webui will see the same effect. But yeah, if that session is used only to start webui, then no issues. @gsgoldma Windows works completely differently; there is no equivalent.
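The scoping difference can be illustrated with hypothetical `DEMO_*` variables standing in for `LD_PRELOAD` (so nothing is actually preloaded):

```shell
# `export` makes the variable visible to every child process started from
# this shell afterwards, not just webui:
export DEMO_EXPORTED=from_export
sh -c 'echo "child sees: $DEMO_EXPORTED"'        # child sees: from_export

# a per-command assignment scopes the variable to that single process:
DEMO_SCOPED=only_here sh -c 'echo "child sees: $DEMO_SCOPED"'
sh -c 'echo "later child sees: [$DEMO_SCOPED]"'  # later child sees: []
```

So if you want the allocator swap limited to webui alone, the per-command form `LD_PRELOAD=libtcmalloc.so.4 ./webui.sh` is the safer choice.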
-
Thank you, looks like it worked for me. Running on WSL2, I had 8.5 GB taken after initial loading; swapping to another model then maxed out my memory, but the swap I created only took 300 MB, and after the model loaded, memory usage was back to 8.5 GB. Earlier I couldn't even load a second model. Great fix!
-
Edit: The fix did not work for me, and it left webui unable to run until I rebooted my computer. Are there any other ways to apply this fix?
-
How can I use this on colab? I am not a Linux wizard, so I would be interested in specific commands. Thanks in advance!
-
Wanted to set LD_PRELOAD when launching web-ui from the terminal. It turned out I needed to specify the full path to the library, like this:
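The exact command didn't survive in this copy of the thread; as an illustration, the path below is an assumption typical for Ubuntu x86_64 (locate yours with `dpkg -L libgoogle-perftools-dev | grep tcmalloc`):

```shell
# hypothetical full path; adjust to wherever your distro installs the library
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libtcmalloc.so.4 ./webui.sh
```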
-
Had this same RAM leak issue. I use a custom startup wrapper script for Automatic1111 to always enable some command-line options; I think this is cleaner than editing one of the provided scripts, so I added the LD_PRELOAD setting there. For others struggling with this: make sure the tcmalloc package is actually installed first.
-
well, here's a couple of mine...

import os

# framework/logging toggles
os.environ.setdefault('TF_CPP_MIN_LOG_LEVEL', '2')
os.environ.setdefault('ACCELERATE', 'True')
os.environ.setdefault('FORCE_CUDA', '1')
os.environ.setdefault('ATTN_PRECISION', 'fp16')
# CUDA allocator and runtime tuning
os.environ.setdefault('PYTORCH_CUDA_ALLOC_CONF', 'garbage_collection_threshold:0.9,max_split_size_mb:512')
os.environ.setdefault('CUDA_LAUNCH_BLOCKING', '0')
os.environ.setdefault('CUDA_CACHE_DISABLE', '0')
os.environ.setdefault('CUDA_AUTO_BOOST', '1')
os.environ.setdefault('CUDA_MODULE_LOADING', 'LAZY')
os.environ.setdefault('CUDA_DEVICE_DEFAULT_PERSISTING_L2_CACHE_PERCENTAGE_LIMIT', '0')
# app settings
os.environ.setdefault('GRADIO_ANALYTICS_ENABLED', 'False')
os.environ.setdefault('SAFETENSORS_FAST_GPU', '1')
os.environ.setdefault('NUMEXPR_MAX_THREADS', '16')
-
I am using Debian testing and I installed the libtcmalloc_minimal package. I patched my webui.sh file like this to make it preload the library:
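The patch itself wasn't captured in this copy of the thread; a minimal sketch of that kind of change, assuming the Debian multiarch path, might look like:

```shell
# near the top of webui.sh, before python is launched (path is an assumption)
TCMALLOC=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.so.4
if [ -e "$TCMALLOC" ]; then
    export LD_PRELOAD="$TCMALLOC"
fi
```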
HTH |
-
I had found a fix that used this on an automatic1111 notebook; it's similar to what's posted here. Been using it via google colab for weeks just fine until today. Today, for whatever reason, it just causes the google colab cell to stop before showing the URLs.
Commenting out the memfix code lines gets the WebUI to run, but with the memory issue. Any idea of what went wrong? SD version: 1.4.0 • python: 3.10.6 • torch: 2.0.1+cu118 • xformers: 0.0.20 • gradio: 3.32.0
-
So, I've read and re-read all this and I have tried multiple different ways of getting this right and making A1111 work correctly. I always installed 'libgoogle-perftools-dev' without issues and A1111 always worked like a charm. I decided to dual boot linux (ubuntu & Mint/edge) and now I see the following:

ERROR: ld.so: object 'libtcmalloc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

I'm not as educated with linux as I wish I was; however, I am learning at a rate that makes this frustrating to figure out. A1111 still works and all, but my mind will not let this go. Like I said, this has always worked without fail before. And since Mint/edge is basically ubuntu "Jammy", I figured everything would fall in line. If I try to downgrade xapp at all, it wants to take Cinnamon with it. What am I missing here? I've tried adjusting /etc/ld.so.conf and that doesn't seem to work. I've added lines to webui-user.sh and no dice there either. Would love some guidance, as I know I'm missing something, and google searching and trying different things has gotten me multiple reinstalls of mint. lol Learning is fun and all, but I figured it was time to ask for help. DeweyDecibel
-
that could be anything. first, make sure that libtcmalloc is set as an env variable ONLY for sd. it's NOT compatible as a system-wide allocator, so if you set it in .bashrc or something like that, it will result in exactly the errors you're seeing.
second, as with any system lib, my recommendation remains to make sure it's resolvable by the system and to avoid using absolute paths.
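One way to check resolvability, assuming a glibc system where `ldconfig` is on PATH:

```shell
# ask the dynamic linker cache whether tcmalloc resolves by bare soname;
# if nothing matches, LD_PRELOAD=libtcmalloc.so.4 will be ignored with a
# "cannot be preloaded" warning
ldconfig -p | grep tcmalloc || echo "tcmalloc not found in linker cache"
```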
-
I have configured and used it according to the above method (export LD_PRELOAD=libtcmalloc.so) and it runs normally. But when I use dwpose, an error occurs and the SD service is interrupted. I found that as long as an ".onnx" file is used, there will be a service interruption. How should I solve this problem? Thanks.
-
So, I use webui on XUbuntu, on a system with a Nvidia Optimus graphics card with only 2GB on it. As such, I need to run it on "lowvram" mode.
What frustrated me was an inconsistent pattern of memory leakage - each time I would generate an image, it would use up more memory, until my memory was full. And attempts at profiling to find where the memory was leaking failed repeatedly - none of the tools could see the excessively used memory (while the program was using more than 9 GB of memory, the profilers would only see about 250 MB of data).
While trying to figure out what was going on, I looked into optimisations, in the hopes it would give me ideas, and I came across the idea of using a different malloc to handle memory allocations.
And when I used tcmalloc... amazingly, the memory leak was gone. Not only that, but I got a maybe 2% speed boost in the process. Now I can run multiple batches, and don't need to worry about running out of memory!
To do it, you need to have the appropriate library installed - on Ubuntu 22.10, it's in libgoogle-perftools-dev (although it also seems to work pretty well with libtcmalloc-minimal4)...
Then, as an environment variable, add
LD_PRELOAD=libtcmalloc.so.4
(you may want to confirm the "4" at the end, I don't know if it varies from system to system). If you went with the minimal one, it'll be libtcmalloc_minimal rather than just libtcmalloc. You can do this on the command line as usual, or you can edit webui-user.sh to add export LD_PRELOAD=libtcmalloc.so.4
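Concretely, the two options read something like this (the `.4` soname is the common one shipped by gperftools packages, but confirm what's installed on your own system first):

```shell
# one-off, scoped to a single launch:
LD_PRELOAD=libtcmalloc.so.4 ./webui.sh

# or persistent, added to webui-user.sh:
export LD_PRELOAD=libtcmalloc.so.4
```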
This was such a substantial improvement, I'm considering putting it forward as a Feature Request Issue, to be added to webui-user.sh (commented out, ready to be uncommented) and mentioned in the wiki as an optimisation.