Define domain specific workers in php_server and php blocks #1509

Merged: 56 commits into php:main on May 5, 2025

@henderkes (Contributor) commented Apr 18, 2025

See: #1490

This enables:

{
    frankenphp
}

one.example.com {
    root /var/www/app/public/
    php_server {
        root /var/www/app/public/ # the root directive above is not possible to obtain inside this block, I think?
        env APP_ENV one
        worker index.php 16
    }
}
two.example.com {
    root /var/www/app/public/ # same path as one.example.com!
    php_server {
        root /var/www/app/public/
        env APP_ENV two
        worker {
            file index.php
            num 8
        }
    }
}

This is equivalent to:

{
    frankenphp {
        worker {
            file /var/www/app/public/index.php
            num 16
            env APP_ENV one
        }
        worker {
            file /var/www/symlinktoapp/public/index.php
            num 8
            env APP_ENV two
        }
    }
}

one.example.com {
    root /var/www/app/public/
    php_server {
        root /var/www/app/public/ 
        env APP_ENV one
    }
}
two.example.com {
    root /var/www/symlinktoapp/public/
    php_server {
        root /var/www/symlinktoapp/public/
        env APP_ENV two
    }
}

But we don't need to define symlinks anymore!

Adds the ability to define workers inside a domain block; they inherit their parent block's environment variables and absolute path.
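The inheritance described above can be sketched in Go roughly as follows. This is only an illustration under stated assumptions: `workerConfig` and `inheritFromBlock` are hypothetical names, not the PR's actual code, and letting worker-level variables override block-level ones is an assumption made for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// workerConfig is a hypothetical, simplified stand-in for a worker definition.
type workerConfig struct {
	FileName string
	Num      int
	Env      map[string]string
}

// inheritFromBlock returns a copy of wc whose relative FileName is resolved
// against the block's root and whose Env starts from the block's variables.
// Assumption: variables set on the worker itself win over inherited ones.
func inheritFromBlock(wc workerConfig, root string, blockEnv map[string]string) workerConfig {
	out := wc
	if !filepath.IsAbs(out.FileName) {
		out.FileName = filepath.Join(root, out.FileName)
	}
	out.Env = make(map[string]string, len(blockEnv)+len(wc.Env))
	for k, v := range blockEnv {
		out.Env[k] = v
	}
	for k, v := range wc.Env {
		out.Env[k] = v
	}
	return out
}

func main() {
	wc := inheritFromBlock(
		workerConfig{FileName: "index.php", Num: 16},
		"/var/www/app/public/",
		map[string]string{"APP_ENV": "one"},
	)
	fmt.Println(wc.FileName, wc.Num, wc.Env["APP_ENV"])
}
```

With this shape, two domains can point at the same root and still end up with distinct worker definitions, because the inherited environment differs.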

Surprisingly, my lack of Go knowledge didn't hurt as much as my lack of Caddy internals knowledge. As a consequence, there are many things that I'm very unhappy with. I'll list those in a review...

caddy/caddy.go Outdated
Comment on lines 35 to 36
// moduleWorkers is a package-level variable to store workers that can be accessed by both FrankenPHPModule and FrankenPHPApp
var moduleWorkers []workerConfig
Contributor Author

Yes, I absolutely hate this, but I haven't found another way to share information between FrankenPHPApp, where the workers live, and FrankenPHPModule.

Contributor

Yeah I'm not sure if there is a way to do this 'properly' with Caddy. The general issue stems from the fact that PHP has to be started as a singular global instance in the process, so we're probably not getting around some amount of globals anyways.
I think this is fine for now. In the future it might even make sense to refactor this into a global struct to make it possible to optionally omit the frankenphp directive in the Caddy configuration

@henderkes
Contributor Author

@AlliBalliBaba Let me know what you think about the pain points I've mentioned here.

@henderkes
Contributor Author

Quoting #1509 (comment): "It might also be possible to use the worker name instead of the ID. We just would have to ensure worker names are unique."

I can't comment directly under it, so here: that would make specifying a worker name mandatory, wouldn't it? That's why I don't particularly like it.

@henderkes
Contributor Author

Quoting #1509 (comment): "I think this is fine for now. In the future it might even make sense to refactor this into a global struct to make it possible to optionally omit the frankenphp directive in the Caddy configuration"

Actually a good idea, by the time Provision is called we already have the information if any of our stuff is called or not. I'll refactor this out into a global struct instance so we can make it optional easier in a future PR.
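A minimal sketch of what such a global struct could look like, in place of a bare package-level slice. All names here (`workerRegistry`, `add`, `drain`, `shared`) are hypothetical, and the "drain on start" behaviour is an assumption chosen for the example, not something the PR is confirmed to do.

```go
package main

import (
	"fmt"
	"sync"
)

// workerConfig is a simplified, hypothetical worker definition.
type workerConfig struct {
	Name     string
	FileName string
	Num      int
}

// workerRegistry is a hypothetical process-wide holder for worker
// definitions collected while per-site modules are provisioned.
type workerRegistry struct {
	mu      sync.Mutex
	workers []workerConfig
}

// add records a worker definition; safe for concurrent use.
func (r *workerRegistry) add(wc workerConfig) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.workers = append(r.workers, wc)
}

// drain hands the collected definitions to the caller and empties the
// registry, so a later config reload starts from a clean slate.
func (r *workerRegistry) drain() []workerConfig {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := r.workers
	r.workers = nil
	return out
}

// shared is the single global instance both sides would talk to.
var shared workerRegistry

func main() {
	shared.add(workerConfig{Name: "m#one", FileName: "/var/www/app/public/index.php", Num: 16})
	shared.add(workerConfig{Name: "m#two", FileName: "/var/www/app/public/index.php", Num: 8})
	for _, wc := range shared.drain() {
		fmt.Println(wc.Name, wc.FileName, wc.Num)
	}
	fmt.Println("left after drain:", len(shared.drain()))
}
```

Wrapping the slice in a struct keeps the global state in one place and gives it a lock, which matters if modules are provisioned concurrently.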

@henderkes force-pushed the workers branch 2 times, most recently from ecb8ef6 to 571ce92 (April 19, 2025 16:54)
@henderkes force-pushed the workers branch 3 times, most recently from 1d015bf to c7172d2 (April 19, 2025 17:17)
… newWorker with a filepath that already has a suitable worker, simply add number of threads
@@ -32,6 +32,7 @@ func TestStartAndStopTheMainThreadWithOneInactiveThread(t *testing.T) {
}

func TestTransitionRegularThreadToWorkerThread(t *testing.T) {
	workers = nil
Member

fyi; this doesn't actually stop the workers (assuming they aren't stopped) -- it just loses the references to them. In other words, it may mask test failures in other tests.

Contributor Author

Isn't Go a garbage-collected language that would automatically reclaim the resources and memory? If not, I don't think it matters too much, since this wouldn't happen outside of tests, but I'll look into shutting them down cleanly.

Contributor

This should be fine in this case, worker threads are stopped at the end of the test with drainPHPThreads() (I'll probably refactor this at some point)
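The point raised in this thread, that dropping references does not stop running workers, can be shown with a small standalone Go program. The `worker` type below is a hypothetical stand-in built on goroutines and channels; it is not FrankenPHP's worker, and `drainPHPThreads()` is not modelled here.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// running counts goroutines started by startWorker that have not exited yet.
var running int32

// worker is a hypothetical stand-in for a long-running worker.
type worker struct {
	stop chan struct{} // closed to ask the goroutine to exit
	done chan struct{} // closed by the goroutine once it has exited
}

// startWorker launches a goroutine that blocks until shutdown is called.
func startWorker() *worker {
	w := &worker{stop: make(chan struct{}), done: make(chan struct{})}
	atomic.AddInt32(&running, 1)
	go func() {
		defer close(w.done)                 // deferred first, so it runs last
		defer atomic.AddInt32(&running, -1) // runs before done is closed
		<-w.stop
	}()
	return w
}

// shutdown stops the goroutine and waits until it has exited.
func (w *worker) shutdown() {
	close(w.stop)
	<-w.done
}

// runningWorkers reports how many worker goroutines are still alive.
func runningWorkers() int {
	return int(atomic.LoadInt32(&running))
}

func main() {
	workers := []*worker{startWorker(), startWorker()}
	kept := workers // second reference, kept only so we can stop them later
	workers = nil   // the slice is gone, the goroutines are not
	fmt.Println("running after workers = nil:", runningWorkers())

	for _, w := range kept {
		w.shutdown()
	}
	fmt.Println("running after shutdown:", runningWorkers())
}
```

The garbage collector reclaims unreachable memory, but a blocked goroutine is never collected on its own; it has to be told to exit.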

caddy/caddy.go Outdated
Comment on lines 602 to 607
// Check if a worker with this name already exists
for _, existingWorker := range moduleWorkers {
	if existingWorker.Name == wc.Name {
		return fmt.Errorf("workers must not have duplicate names: %s", wc.Name)
	}
}
Contributor Author

I'm not 100% sure if we should allow this or not.

Reasons to allow it: automatically named workers, based on environment variables, should absolutely be able to reuse the same name.

one.example.com {
    php {
        root /path/to/app/
        worker index.php 2
    }
}
two.example.com {
    php {
        root /path/to/app/
        worker index.php 1
    }
}

Would absolutely work, because they both have the same (no) environment variables.

one.example.com {
    php {
        root /path/to/app/
        env APP_ENV one
        worker {
            name wrk
            file index.php
        }
    }
}
two.example.com {
    php {
        root /path/to/app/
        env APP_ENV two
        worker {
            name wrk
            file index.php
        }
    }
}
# or
two.example.com {
    php {
        root /path/to/app/
        worker {
            name wrk
            file index2.php
        }
    }
}

Should absolutely not work, because they have different environment variables.

Contributor Author

Current behaviour is that both would fail, but I'm not so happy about the first case. It should just create a worker with 3 threads.
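The behaviour wished for in this comment (merge thread counts when name, file, and environment match; reject when they conflict) could be sketched as below. This illustrates the idea only; `addWorker`, `sameEnv`, and `errConflict` are hypothetical names, and the later discussion in this thread suggests the merged PR keeps such workers separate instead.

```go
package main

import (
	"errors"
	"fmt"
)

// workerConfig is a simplified, hypothetical worker definition.
type workerConfig struct {
	Name     string
	FileName string
	Num      int
	Env      map[string]string
}

var errConflict = errors.New("workers with the same name must share file and environment")

// sameEnv reports whether two environments hold exactly the same pairs.
func sameEnv(a, b map[string]string) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

// addWorker appends wc, or folds its threads into an existing worker of the
// same name when file and environment are identical. A same-named worker
// with a different file or environment is rejected.
func addWorker(list []workerConfig, wc workerConfig) ([]workerConfig, error) {
	for i := range list {
		if list[i].Name != wc.Name {
			continue
		}
		if list[i].FileName != wc.FileName || !sameEnv(list[i].Env, wc.Env) {
			return list, errConflict
		}
		list[i].Num += wc.Num
		return list, nil
	}
	return append(list, wc), nil
}

func main() {
	var list []workerConfig
	list, _ = addWorker(list, workerConfig{Name: "wrk", FileName: "/path/to/app/index.php", Num: 2})
	list, _ = addWorker(list, workerConfig{Name: "wrk", FileName: "/path/to/app/index.php", Num: 1})
	fmt.Println(len(list), list[0].Num)

	_, err := addWorker(list, workerConfig{
		Name: "wrk", FileName: "/path/to/app/index.php", Num: 1,
		Env: map[string]string{"APP_ENV": "two"},
	})
	fmt.Println(err)
}
```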

Comment on lines 1 to 2
//go:build !nocaddy
// +build !nocaddy
Contributor Author

No, I've added them when creating and debugging the config_tests. I can remove the nocaddy tag.

caddy/caddy.go Outdated
Comment on lines 478 to 483
err = frankenphp.WithModuleWorker(workerName)(fc)
if err != nil {
	return caddyhttp.Error(http.StatusInternalServerError, err)
}

fr, err := frankenphp.NewRequestWithExistingContext(r, fc)
Contributor Author

I'll give it a shot! Still very new to the project, so a lot of these design decisions will not be up to standards yet. Thank you for the hint :)

Comment on lines 128 to 134
func WithModuleWorker(name string) RequestOption {
	return func(o *frankenPHPContext) error {
		o.workerName = name

		return nil
	}
}
Contributor Author

I named it "WithWorkerName" so that it wouldn't accidentally be confused with the WithWorkers(workeropts...) function. I also don't think "WithWorker" is a good name, because it would imply we're passing a worker - but we're not!
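For readers unfamiliar with the pattern under discussion, here is a self-contained sketch of Go's functional options as used by the snippet above. `requestContext` and `newRequestContext` are hypothetical stand-ins for the library's internal types; only the shape of `RequestOption` and the `WithWorkerName` name are taken from this thread.

```go
package main

import "fmt"

// requestContext is a hypothetical stand-in for the per-request context.
type requestContext struct {
	workerName string
}

// RequestOption mutates a requestContext and may fail.
type RequestOption func(*requestContext) error

// WithWorkerName selects a worker by name; it passes a name, not a worker.
func WithWorkerName(name string) RequestOption {
	return func(c *requestContext) error {
		c.workerName = name
		return nil
	}
}

// newRequestContext applies options in order and stops at the first error.
func newRequestContext(opts ...RequestOption) (*requestContext, error) {
	c := &requestContext{}
	for _, opt := range opts {
		if err := opt(c); err != nil {
			return nil, err
		}
	}
	return c, nil
}

func main() {
	c, err := newRequestContext(WithWorkerName("m#/path/to/app/index.php"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.workerName)
}
```

An option can also be applied to an already built context by calling it directly, which is what the `WithModuleWorker(workerName)(fc)` call in the reviewed code does.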

caddy/caddy.go Outdated
Comment on lines 478 to 483
err = frankenphp.WithModuleWorker(workerName)(fc)
if err != nil {
	return caddyhttp.Error(http.StatusInternalServerError, err)
}

fr, err := frankenphp.NewRequestWithExistingContext(r, fc)
Contributor Author

Unfortunately, I think it is necessary. root + path doesn't strip suffixes from the path information, doesn't follow the specified splitpath, and may not resolve the root correctly in the case of an embedded app. What could be done is to call the existing NewRequestWithContext twice and parse the filename out after the first call... I don't think that makes sense.

Or I factor the request creation logic out like I did and create a new method to pass the already parsed out context, in order not to perform the same logic twice. I chose the latter and I think that's still the better choice. Wasting performance just to save a public function isn't worth it, imo, especially because libraries may have a legitimate need to parse the filename and other information out, too.

worker.go Outdated
@@ -64,12 +67,25 @@ func initWorkers(opt []workerOpt) error {
	return nil
}

func getWorkerKey(name string, filename string) string {
	key := filename
	if strings.HasPrefix(name, "🧩 ") {
Contributor

Not sure if a Unicode character as a prefix is a good idea, since worker names are also used in metrics.

Contributor Author

@henderkes Apr 26, 2025

I took inspiration from the elephant in the logs. Is there something specific about metrics why this wouldn't be okay?

Contributor

I'm not using Prometheus metrics myself, maybe unicode characters are also fine @withinboredom wdyt?

Member

There are Prometheus tests already -- I'd maybe add a test for this case if it isn't covered already.

Member

Maybe we could use a # or something like that instead. ASCII compatibility is safer.

Contributor Author

Yeah I've used m# before, see 4cc8893

@IndraGunawan
Contributor

I might have lost track of the changes in this PR.

Why do we need to add an "identifier" to the worker name?

one.example.com {
    php {
        root /path/to/app/
        worker index.php 2
    }
}
two.example.com {
    php {
        root /path/to/app/
        worker index.php 1
    }
}

Will it create 2 different workers or 1 worker with 3 threads?

@henderkes
Contributor Author

It will create two different workers: m#/path/to/app/index.php with two threads and m#/path/to/app/index.php2 with one thread. The m# identifier is given to worker threads so that we can use the name in the slice rather than the filename without breaking BC for libraries that use global workers with names but expect the filename to be the key in the map.

Otherwise, I would have just reworked all code to always use the name, rather than the filename.
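The keying rule described here can be sketched as follows. The review snippets only show the first lines of `getWorkerKey`, so the body of `workerKey` below is an assumption about how it continues, and `register` is a hypothetical helper added to show the effect on a map of workers.

```go
package main

import (
	"fmt"
	"strings"
)

// workerKey picks the map key for a worker. Assumption: names carrying the
// "m#" prefix are used as the key, while all other workers keep the filename
// as key so existing library users see no change.
func workerKey(name, filename string) string {
	if strings.HasPrefix(name, "m#") {
		return name
	}
	return filename
}

// register is a hypothetical helper that stores a thread count under the
// worker's key and rejects duplicate keys.
func register(workers map[string]int, name, filename string, threads int) error {
	key := workerKey(name, filename)
	if _, exists := workers[key]; exists {
		return fmt.Errorf("duplicate worker key %q", key)
	}
	workers[key] = threads
	return nil
}

func main() {
	workers := map[string]int{}
	file := "/path/to/app/index.php"

	// Two prefixed workers may share one file because their names differ.
	fmt.Println(register(workers, "m#/path/to/app/index.php", file, 2))
	fmt.Println(register(workers, "m#/path/to/app/index.php2", file, 1))

	// Unprefixed workers are keyed by filename, so a second one on the
	// same file is rejected.
	fmt.Println(register(workers, "api", file, 4))
	fmt.Println(register(workers, "api2", file, 4))
}
```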

Member

@dunglas left a comment

Are we sure that changes in workers config will be taken into account?

@henderkes
Contributor Author

Quoting @dunglas: "Are we sure that changes in workers config will be taken into account?"

During runtime with the admin API or when?

@dunglas
Member

dunglas commented Apr 29, 2025

Yes, during runtime.

@henderkes
Contributor Author

I haven't looked into Caddy's source code, so I can't say for sure, but as I understand the documentation, the new configuration is parsed and all Caddy modules are Provision()'ed and Start()'ed. This would call FrankenPHPApp::Start() again, which creates the workers again:

// Add workers from FrankenPHPApp and FrankenPHPModule configurations
// f.Workers may have been set by JSON config, so keep them separate
for _, w := range append(f.Workers, moduleWorkerConfigs...) {
	opts = append(opts, frankenphp.WithWorkers(w.Name, repl.ReplaceKnown(w.FileName, ""), w.Num, w.Env, w.Watch))
}

frankenphp.Shutdown()
if err := frankenphp.Init(opts...); err != nil {
	return err
}

@dunglas
Member

dunglas commented Apr 29, 2025

That would be nice to try, just to be sure.

@henderkes
Contributor Author

I've added a test to load a module worker config dynamically.

Member

@dunglas left a comment

Can you give me push access, please @henderkes? I've made some minor changes.

@@ -64,12 +65,30 @@ func initWorkers(opt []workerOpt) error {
	return nil
}

func getWorkerKey(name string, filename string) string {
	key := filename
	if strings.HasPrefix(name, "m#") {
Member

Couldn't we always use the name if the filename is duplicated? This would remove a soft coupling with the Caddy module.

Contributor Author

We wanted to error out if the filename of global workers was duplicated, hence we'd still need a prefix check somewhere. I might be mistaken, but I don't see a way around it without infringing on BC in some way.

Member

It's not a BC break, but it mixes the responsibilities of the standalone library with those of the Caddy module. The notion of "module" is Caddy-specific and should not leak into the standalone lib.

@henderkes
Contributor Author

Oh, I thought you had push access because you're in the static-php organisation. Added you!

@dunglas merged commit 1d74b2c into php:main on May 5, 2025
42 checks passed
@dunglas
Member

dunglas commented May 5, 2025

Thank you! Great piece of work.

@henderkes
Contributor Author

🎉

@dunglas
Member

dunglas commented May 16, 2025

We should add this new syntax to the docs!

@henderkes
Contributor Author

I'll do that later today!

@henderkes
Contributor Author

Documentation added to #1571

6 participants