feat: thread autoscaling #1266

Open
wants to merge 208 commits into
base: main


Alliballibaba2
Collaborator

I originally just wanted to create a PR that allows adding threads via the admin API, but once threads could scale automatically, that PR no longer made sense on its own.

So here is what this PR does:

It adds 4 Caddy admin endpoints:

POST     /frankenphp/workers/restart   # restarts workers (this can also be put into a smaller PR if necessary)
GET      /frankenphp/threads           # prints the current state of all threads (for debugging/caddytests)
PUT      /frankenphp/threads           # adds threads at runtime; accepts 'worker' and 'count' query parameters
DELETE   /frankenphp/threads           # removes threads at runtime; accepts 'worker' and 'count' query parameters

The PR also introduces a new directive in the config: max_threads.

frankenphp {
    max_threads 200
    num_threads 40
}

If max_threads is greater than num_threads, worker and regular threads will attempt to autoscale after a request when the following conditions are met:

  • no thread was available to immediately handle the request
  • the request was stalled for more than a few ms (15ms currently)
  • no other scaling is happening at that time
  • a CPU probe (50ms) successfully determines that PHP threads are consuming less than a predefined amount of CPU (80% currently)
  • we have not reached max_threads yet

This is all still a WIP. I'm not yet sure if max_threads is the best way to configure autoscaling or if it's even necessary to have the PUT/DELETE endpoints. Maybe it would also make sense to determine max_threads based on available memory.
I'll conduct some benchmarks showing that this approach performs better than default settings in a lot of different scenarios (and makes people worry less about thread configuration).

With regard to recent issues, spawning and destroying threads would also make the server more stable if we're experiencing timeouts (not sure yet how to safely destroy running threads).

@AlliBalliBaba
Collaborator

Alright, I think this is ready to merge. Not sure why the watcher installation is failing right now in asan and msan.

@AlliBalliBaba
Collaborator

@dunglas I'll merge this branch if that's ok.

It looks like the newest watcher version needs CLANG20 and CLANGXX20 (ubuntu-latest in the pipeline has CLANG18, but it's not a problem with GCC). I pinned the watcher version to 0.13.2 for msan and asan for now.

Also, not sure what's up with the sudden false positives in the markdown linter.

@furan917

furan917 commented Feb 4, 2025

@AlliBalliBaba Feel free to tell me I am wrong, as I am just passing by and skim-reading.

But isn't the linter failure a true failure with a hint of misconfiguration, i.e. the GHA is linting main and picking up that the RU docs are missing their EOLs?

@AlliBalliBaba
Collaborator

AlliBalliBaba commented Feb 5, 2025

@furan917 Oh you are right, I didn't realize we had RU docs.

Owner

@dunglas dunglas left a comment


Great work!! I made a bunch of comments, but this looks very close to being mergeable to me.

caddy/admin.go Show resolved Hide resolved
dev.Dockerfile Outdated Show resolved Hide resolved
caddy/caddy.go Show resolved Hide resolved
docs/performance.md Outdated Show resolved Hide resolved
internal/cpu/cpu_fallback.go Outdated Show resolved Hide resolved
testdata/performance/perf-test.md Outdated Show resolved Hide resolved
thread-inactive.go Show resolved Hide resolved
thread-regular.go Outdated Show resolved Hide resolved
thread-worker.go Show resolved Hide resolved
caddy/caddy.go Show resolved Hide resolved
@AlliBalliBaba
Copy link
Collaborator

Alright, the debug state is now a struct, and the endpoint is marked as 'experimental' and returns JSON. Additionally, I added support for a nested php.ini configuration like this, since I think it's a very convenient feature:

php_ini opcache.jit tracing

# or

php_ini {
	mysqlnd.collect_statistics Off
	opcache.jit tracing
}

@AlliBalliBaba AlliBalliBaba requested a review from dunglas February 7, 2025 18:20

6 participants