Replies: 17 comments 57 replies
I think we also need to look into minimizing the number of builds running, as far as that is possible without compromising test coverage.
-
How do we feel about disabling all automatic GitHub-hosted workflows for pull requests and leaving it to maintainers to decide manually which workflows to run, and when, for a given PR? The ...
-
I wonder if our main issue is that we have many long-running ...
-
Well, I was the one who set up the ... We also have way too many jobs in general. Aside from removing jobs, you can also try spreading the load between ARM and x86 machines where possible; some jobs, like the cross compiles or the webgpu runs, can probably be done on ARM. There's also the new ubuntu-slim machine, which they hopefully have more of and which we can use for simple jobs that only need 1 core and 5 GB of memory.
-
I can set up a dedicated Podman container with GPU access that starts automatically with my AI server (Ryzen 9 9950X3D, 96 GB DDR5). It won't interfere with my other workloads and can run CUDA and Vulkan workflows at full speed on a real GPU (RTX PRO 6000). It would be a clean pod built from a minimal Debian Containerfile / YAML with the latest CUDA/Vulkan, which anyone in our group could download and instantiate to run the pipeline.
-
I have a dedicated server with an AMD Ryzen 7 2700X (8C/16T), 32 GB DDR4, 4 TB of storage and an NVIDIA RTX 2060 GPU that is currently doing nothing. I believe it can run both CUDA and Vulkan CI workloads. Let me know if this configuration is suitable, and we can onboard the server as one of the self-hosted runners.
-
What needs to be updated is this: ... Not sure if this will break anything, though. This is not yet done upstream, so there is no use syncing our fork yet either.
-
@ggerganov
-
I guess I am currently on my way to making this worse with #20430; I have quite a few more checks I would like to add to this workflow in the future. I would be happy to have this restricted to running only when I press a button on PRs that affect the HIP backend.
-
The PR for the SYCL backend, #20446, is merged. It separates the SYCL backend CI from build.yml into build-sycl.yml and applies a cache to skip downloading and installing the oneAPI package. Now, if the changed code is not in the SYCL backend (ggml/src/ggml-sycl/*), the CI won't call build-sycl.yml, which reduces the workload. build.yml is mandatory for all code changes. If we move more backend builds out of it into build-xxx.yml files, build.yml will become smaller.
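To illustrate the idea of running a backend workflow only when its files change, here is a minimal Python sketch of the selection logic. It is not how GitHub Actions implements path filters: the `WORKFLOW_PATHS` table, the `workflows_to_run` helper and the example file names are assumptions made up for illustration, and Python's `fnmatch` lets `*` match across `/`, which differs from GitHub's own glob rules.

```python
from fnmatch import fnmatch

# Hypothetical mapping from workflow file to the path patterns that trigger it.
# Only the SYCL pattern comes from the comment above; build.yml runs for every change.
WORKFLOW_PATHS = {
    "build-sycl.yml": ["ggml/src/ggml-sycl/*"],
    "build.yml": ["*"],
}

def workflows_to_run(changed_files):
    """Return the sorted workflow names whose patterns match at least one changed file."""
    selected = set()
    for workflow, patterns in WORKFLOW_PATHS.items():
        if any(fnmatch(path, pattern) for path in changed_files for pattern in patterns):
            selected.add(workflow)
    return sorted(selected)

# Hypothetical changed-file lists, one outside and one inside the SYCL backend.
print(workflows_to_run(["src/llama.cpp"]))
print(workflows_to_run(["ggml/src/ggml-sycl/common.hpp"]))
```

With a table like this, every additional backend moved out of build.yml adds one entry, and a PR that touches none of the backend paths triggers only the mandatory build.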
-
I feel the CI has gotten really slow again over the last few days (i.e. very long queue times).
-
@ggerganov Is there any specific reason that ...
I did a local experiment with ... and ..., and got pretty decent results: ... and after ...
Note: the hardware is a GB10 (Asus Ascent GX10). The cache will be stored in the Docker image itself.
If this is something that is needed, I can create a PR.
-
Something that I realized today is that we actually have a limit of 20 hosted runners for the organization at a given time:
So we can have at most 20 jobs running on hosted runners at any given time.
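To make the effect of that cap concrete, here is a rough back-of-the-envelope sketch. Only the limit of 20 comes from the comment above; the job count and per-job duration are made-up numbers for illustration, not measurements from our CI, and the helper name is hypothetical.

```python
import math

HOSTED_RUNNER_LIMIT = 20  # organization-wide cap mentioned above

def ideal_wall_clock_minutes(num_jobs, minutes_per_job, max_concurrent=HOSTED_RUNNER_LIMIT):
    """Best-case wall-clock time when equal-length jobs share a fixed pool of runners."""
    # Jobs run in batches ("waves") of at most max_concurrent at a time.
    waves = math.ceil(num_jobs / max_concurrent)
    return waves * minutes_per_job

# Hypothetical load: 50 queued jobs of 30 minutes each need 3 waves.
print(ideal_wall_clock_minutes(50, 30))  # prints 90
```

Under these assumptions, anything beyond the first 20 jobs waits at least one full job duration before it even starts, which is why trimming the job count pays off directly in queue time.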
-
Hm, either GitHub Actions is having issues, or the ...
-
These lines from the log show that the slowness of that entire job was caused by the Vulkan device being set to the software renderer (llvmpipe), which means that the runner the job ran on had no GPU or no GPU drivers.
So ubuntu-24-cmake-vulkan must be moved to one of the GPU runners.
-
@ggerganov did you consider switching to nightly builds for releases? We currently have dozens of commits to master each day, which means a lot of release builds. We could feasibly trigger a single release build every 24 hours instead; for the vast majority of users that should be sufficient. The only concern I have is that it becomes harder to pin the introduction of a bug down to a single commit. But in my experience it is quite difficult to get this information from the kind of user who does not know how to compile the project in the first place. To make the whole thing a bit fancier, we could use a language model to summarize and link the PRs merged into master for each release. To my understanding, @am17an has already built something similar for his personal use.
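As a rough sketch of what "one release per day" would select, the snippet below keeps only the last commit of each calendar day from a list of commits. The `nightly_release_commits` helper and the commit data are hypothetical; a real setup would more likely use a scheduled workflow, so treat this purely as an illustration of the reduction in release builds.

```python
from datetime import datetime

def nightly_release_commits(commits):
    """Given (sha, ISO timestamp) pairs, return the last commit of each calendar day."""
    latest = {}  # maps a date to the (sha, datetime) of the newest commit seen that day
    for sha, timestamp in commits:
        when = datetime.fromisoformat(timestamp)
        day = when.date()
        if day not in latest or when > latest[day][1]:
            latest[day] = (sha, when)
    return [latest[day][0] for day in sorted(latest)]

# Hypothetical history: three commits over two days yield two release builds.
history = [
    ("a1", "2025-01-01T09:00:00"),
    ("b2", "2025-01-01T18:30:00"),
    ("c3", "2025-01-02T11:15:00"),
]
print(nightly_release_commits(history))  # prints ['b2', 'c3']
```

The commits skipped between two nightly releases are exactly the range a bug report would have to be bisected over, which is the trade-off raised above.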
-
Overview
With the GitHub Actions runners becoming slower and slower, we are close to not having a usable CI pipeline. I am opening this discussion to figure out what we can do about it.
Queue time for last 6 months
TODOs
- ... riscv workflows only on the master branch and not in PRs
- ... .yml files
- ... ccache-action: ci : discuss optimization strategies #20446 (reply in thread)
- ... ggml-ci-*-cpu-* workflows to self-hosted runners to reduce some of the GH cache ...
- ... llama.cpp. Currently, the whisper.cpp workflows are inefficient and can affect the global organization occupancy. The whisper.cpp Docker images should be created only upon releases. Started here: ci : refactor + optimize whisper.cpp#3847
- ... ios-xcode release job? It does not seem cache-able.

Upcoming self-hosted runners
One of the best ways to minimize the queue times is to move the workflows to self-hosted runners that are dedicated to running ggml workflows. This requires provisioning and hosting hardware.