Skip to content

Define domain specific workers in php_server and php blocks #1509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 53 commits into
base: main
Choose a base branch
from

Conversation

henderkes
Copy link
Contributor

@henderkes henderkes commented Apr 18, 2025

See: #1490

This enables:

{
    frankenphp
}

one.example.com {
    root /var/www/app/public/
    php_server {
        root /var/www/app/public/ # the root directive above is not possible to obtain inside this block, I think?
        env APP_ENV one
        worker index.php 16
    }
}
two.example.com {
    root /var/www/app/public/ # same path as one.example.com!
    php_server {
        root /var/www/app/public/
        env APP_ENV two
        worker {
            file index.php
            num 8
        }
    }
}

This is equivalent to:

{
    frankenphp {
        worker {
            file /var/www/app/public/index.php
            num 16
            env APP_ENV one
        }
        worker {
            file /var/www/symlinktoapp/public/index.php
            num 8
            env APP_ENV two
        }
}

one.example.com {
    root /var/www/app/public/
    php_server {
        root /var/www/app/public/ 
        env APP_ENV one
    }
}
two.example.com {
    root /var/www/symlinktoapp/public/
    php_server {
        root /var/www/symlinktoapp/public/
        env APP_ENV two
    }
}

But we don't need to define symlinks anymore!

Adds the ability to define workers in a domain that inherit their parent blocks environment variables and absolute path.

Surprisingly, my lack of Go knowledge didn't hurt as much as my lack of Caddy internals knowledge. As a consequence, there are many things that I'm very unhappy with. I'll list those in a review...

caddy/caddy.go Outdated
Comment on lines 35 to 36
// moduleWorkers is a package-level variable to store workers that can be accessed by both FrankenPHPModule and FrankenPHPApp
var moduleWorkers []workerConfig
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I absolutely hate this, but I haven't found another way to share information between FrankenPHPApp, where the workers live, and FrankenPHPModule.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure if there is a way to do this 'properly' with Caddy. The general issue stems from the fact that PHP has to be started as a singular global instance in the process, so we're probably not getting around some amount of globals anyways.
I think this is fine for now. In the future it might even make sense to refactor this into a global struct to make it possible to optionally omit the frankenphp directive in the Caddy configuration

@henderkes
Copy link
Contributor Author

@AlliBalliBaba Let me know what you think about the pain points I've mentioned here.

caddy/caddy.go Outdated
Comment on lines 35 to 36
// moduleWorkers is a package-level variable to store workers that can be accessed by both FrankenPHPModule and FrankenPHPApp
var moduleWorkers []workerConfig
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm not sure if there is a way to do this 'properly' with Caddy. The general issue stems from the fact that PHP has to be started as a singular global instance in the process, so we're probably not getting around some amount of globals anyways.
I think this is fine for now. In the future it might even make sense to refactor this into a global struct to make it possible to optionally omit the frankenphp directive in the Caddy configuration

@henderkes
Copy link
Contributor Author

#1509 (comment)

It might also be possible to use the worker name instead of the ID. We just would have to ensure worker names are unique.

I can't comment directly under it, so here: that would make specifying a worker name mandatory, wouldn't it? That's why I don't particularly like it.

@henderkes
Copy link
Contributor Author

#1509 (comment)

I think this is fine for now. In the future it might even make sense to refactor this into a global struct to make it possible to optionally omit the frankenphp directive in the Caddy configuration

Actually a good idea, by the time Provision is called we already have the information if any of our stuff is called or not. I'll refactor this out into a global struct instance so we can make it optional easier in a future PR.

@henderkes henderkes force-pushed the workers branch 2 times, most recently from ecb8ef6 to 571ce92 Compare April 19, 2025 16:54
@henderkes henderkes force-pushed the workers branch 3 times, most recently from 1d015bf to c7172d2 Compare April 19, 2025 17:17
… newWorker with a filepath that already has a suitable worker, simply add number of threads
@henderkes
Copy link
Contributor Author

henderkes commented Apr 24, 2025

@AlliBalliBaba I've picked in the suggestions of your branch that I could get in. I simply cannot manage to register the FrankenPHPApp workers into the workerConfig instead of just f.Workers. Any time I do, I get too many different errors to count.

Edit: Ah, it's because workerConfigs isn't being reset on re-parsing a frankenphp directive, so in the tests where the server isn't shutdown in between, it keeps adding the worker again and again, leading to the various errors.

Anyway, I'm choosing not to add the FrankenPHPApp workers to the workerConfigs because they may be set by json, so we still need to keep f.Workers and parse tham during Start(). If we added them, we'd need to do an indexed iteration in the Start() method to not add global workers twice. Too much hassle for no benefit, as FrankenPHPModule's UnmarshalCaddyfile couldn't possibly conflict with global workers, anyway due to the m# prefix.

@henderkes
Copy link
Contributor Author

henderkes commented Apr 24, 2025

I think the last commit may break BC for library users tests, but I'm not sure. What do you think?

Also, what would the expected behaviour be for loading a config through the admin endpoint - should that reset workers and module workers or keep existing ones loaded?

@IndraGunawan
Copy link
Contributor

IndraGunawan commented Apr 25, 2025

@DubbleClick i just made the same as you do, maybe this can be a reference , here main...IndraGunawan:dunglas-frankenphp:site-worker

{
	frankenphp {
		worker {
			name svc
			file ./svc/index.php
			num 1
			env SERVICE svc
		}
	}
	log {
		level debug
	}
}

:8080 {
	route {
		root ./svc
		php_server {
			env SERVICE php_server8080
			# will use global worker
		}
	}
}

:8081 {
	route {
		root ./svc
		php_server {
			env SERVICE php_server8081
			worker {
				file ./svc/index.php
				env SERVICE "8081-worker"
				num 1
			}
		}
	}
}

@henderkes
Copy link
Contributor Author

Similar enough, this is already ready as far as I would say. I think your implementation is missing that workers inherit the root path and environment variables, though. I think that's an important part because in real scenarios, I cannot think of a situation where you'd want to boot your app with one environment and then handle requests with another.

@AlliBalliBaba
Copy link
Collaborator

I've been thinking, what are your specific use cases for this feature @henderkes, is it always 1 worker per module?
It would also be possible to just always fall back to the worker in that case. That would probably allow putting the worker outside of the public directory, which might be useful for most frameworks.

@henderkes
Copy link
Contributor Author

henderkes commented Apr 26, 2025

For most projects only one, but in one project I have three different workers (public, internal admin access, contractor access).

That being said I could live with one worker per module perfectly well. But I still think global workers shouldn't even be a thing at all, so this option should be able to replace them entirely, if at all possible.

I'm happy with the PR the way it is atm.

@@ -32,6 +32,7 @@ func TestStartAndStopTheMainThreadWithOneInactiveThread(t *testing.T) {
}

func TestTransitionRegularThreadToWorkerThread(t *testing.T) {
workers = nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi; this doesn't actually stop the workers (assuming they aren't stopped) -- it just loses the references to them. In other words, it may mask test failures in other tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't go a garbage collected language and would automatically claim back the resources and memory? If not, I don't think it matters all too much since this isn't something that would happen outside of tests, but I'll look into cleanly shutting them down.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fine in this case, worker threads are stopped at the end of the test with drainPHPThreads() (I'll probably refactor this at some point)

@henderkes
Copy link
Contributor Author

Isn't go a garbage collected language and would automatically claim back the resources and memory? If not, I don't think it matters all too much since this isn't something that would happen outside of tests, but I'll look into cleanly shutting them down.

caddy/caddy.go Outdated
Comment on lines 602 to 607
// Check if a worker with this name already exists
for _, existingWorker := range moduleWorkers {
if existingWorker.Name == wc.Name {
return fmt.Errorf("workers must not have duplicate names: %s", wc.Name)
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure if we should allow this or not.

Reasons to allow it: automatically named workers, based on environment variables, should absolutely be able to reuse the same name.

one.example.com {
    php {
        root /path/to/app/
        worker index.php 2
    }
}
two.example.com {
    php {
        root /path/to/app/
        worker index.php 1
    }
}

Would absolutely work, because they both have the same (no) environment variables.

one.example.com {
    php {
        root /path/to/app/
        env APP_ENV one
        worker {
            name wrk
            file index.php
        }
    }
}
two.example.com {
    php {
        root /path/to/app/
        env APP_ENV two
        worker {
            name wrk
            file index.php
        }
    }
}
# or
two.example.com {
    php {
        root /path/to/app/
        worker {
            name wrk
            file index2.php
        }
    }
}

Should absolutely not work, because they have different environment variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current behaviour is that both would fail, but I'm not so happy about the first case. It should just create a worker with 3 threads.

Comment on lines 1 to 2
//go:build !nocaddy
// +build !nocaddy
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I've added them when creating and debugging the config_tests. I can remove the nocaddy tag.

caddy/caddy.go Outdated
Comment on lines 478 to 483
err = frankenphp.WithModuleWorker(workerName)(fc)
if err != nil {
return caddyhttp.Error(http.StatusInternalServerError, err)
}

fr, err := frankenphp.NewRequestWithExistingContext(r, fc)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give it a shot! Still very new to the project, so a lot of these design decisions will not be up to standards yet. Thank you for the hint :)

Comment on lines +128 to +134
func WithModuleWorker(name string) RequestOption {
return func(o *frankenPHPContext) error {
o.workerName = name

return nil
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I named it "WithWorkerName" so that it wouldn't accidentally be confused with the WithWorkers(workeropts...) function. I also don't think "WithWorker" is a good name, because it would imply we're passing a worker - but we're not!

caddy/caddy.go Outdated
Comment on lines 478 to 483
err = frankenphp.WithModuleWorker(workerName)(fc)
if err != nil {
return caddyhttp.Error(http.StatusInternalServerError, err)
}

fr, err := frankenphp.NewRequestWithExistingContext(r, fc)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I think it is necessary. root + path doesn't strip out suffixes from the path information, doesn't follow the specified splitpath and may not correctly resolve the root correctly in case of an embedded app. What can be done is to call the existing NewRequestWithContext twice and parse the filename out after the first time... I don't think that makes sense.

Or I factor the request creation logic out like I did and create a new method to pass the already parsed out context, in order not to perform the same logic twice. I chose the latter and I think that's still the better choice. Wasting performance just to save a public function isn't worth it, imo, especially because libraries may have a legitimate need to parse the filename and other information out, too.

@@ -32,6 +32,7 @@ func TestStartAndStopTheMainThreadWithOneInactiveThread(t *testing.T) {
}

func TestTransitionRegularThreadToWorkerThread(t *testing.T) {
workers = nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't go a garbage collected language and would automatically claim back the resources and memory? If not, I don't think it matters all too much since this isn't something that would happen outside of tests, but I'll look into cleanly shutting them down.

worker.go Outdated
@@ -64,12 +67,25 @@ func initWorkers(opt []workerOpt) error {
return nil
}

func getWorkerKey(name string, filename string) string {
key := filename
if strings.HasPrefix(name, "🧩 ") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if a unicode character as prefix is a good idea since worker names are also used in metrics.

Copy link
Contributor Author

@henderkes henderkes Apr 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took inspiration from the elephant in the logs. Is there something specific about metrics why this wouldn't be okay?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not using Prometheus metrics myself, maybe unicode characters are also fine @withinboredom wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are Prometheus tests already -- I'd maybe add a test for this case if it isn't covered already.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe could we use a # or something like that instead. ASCII compatibility is safer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I've used m# before, see 4cc8893

@IndraGunawan
Copy link
Contributor

i might have lost track of the changes on this PR.

why do we need to add "identifier" to the worker name?

one.example.com {
    php {
        root /path/to/app/
        worker index.php 2
    }
}
two.example.com {
    php {
        root /path/to/app/
        worker index.php 1
    }
}

will it create 2 different workers or 1 worker with 3 thread?

@henderkes
Copy link
Contributor Author

it will create two different workers: m#/path/to/app/index.php with two threads and m#/path/to/app/index.php2 with one thread. The m# identifier is given to worker threads so that we can use the name in the slice rather than the filename without breaking BC for libraries that use global workers with names but expect the filename to be the key in the map.

Otherwise, I would have just reworked all code to always use the name, rather than the filename.

Copy link
Owner

@dunglas dunglas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that changes in workers config will be taken into account?

@henderkes
Copy link
Contributor Author

Are we sure that changes in workers config will be taken into account?

During runtime with the admin API or when?

@dunglas
Copy link
Owner

dunglas commented Apr 29, 2025

Yes, during runtime.

@henderkes
Copy link
Contributor Author

I haven't looked into caddy's source code so I can't say for sure, but the way I've found it explained in the documentation is that the new configuration is parsed, all caddy modules are Provision()'ed and Start()'ed. This would call FrankenPHPApp::Start() again, which will create workers again:

// Add workers from FrankenPHPApp and FrankenPHPModule configurations
// f.Workers may have been set by JSON config, so keep them separate
for _, w := range append(f.Workers, moduleWorkerConfigs...) {
	opts = append(opts, frankenphp.WithWorkers(w.Name, repl.ReplaceKnown(w.FileName, ""), w.Num, w.Env, w.Watch))
}

frankenphp.Shutdown()
if err := frankenphp.Init(opts...); err != nil {
	return err
}

@dunglas
Copy link
Owner

dunglas commented Apr 29, 2025

That would be nice to try, just to be sure.

@henderkes
Copy link
Contributor Author

I've added a test to load a module worker config dynamically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants