initial dev doc #10776

Open
wants to merge 7 commits into
base: main
80 changes: 80 additions & 0 deletions DEV.md
@@ -0,0 +1,80 @@
Architecture:
Contributor

Suggested change
Architecture:
# Architecture:


KGateway is a control plane for envoy based on the gateway-api. This means that we translate K8s objects like Gateways, HttpRoutes, Service, EndpointSlices and User policy into envoy configuration.
Contributor

Suggested change
KGateway is a control plane for envoy based on the gateway-api. This means that we translate K8s objects like Gateways, HttpRoutes, Service, EndpointSlices and User policy into envoy configuration.
Kgateway is a control plane for envoy based on the gateway-api. This means that we translate K8s objects like Gateways, HttpRoutes, Service, EndpointSlices and User policy into envoy configuration.

nit: the capitalized form is Kgateway with lowercase g, there are a few more instances below too

Contributor

s/gateway-api/Gateway API /


Our goals with the architecture of the project are to make it scalable and extendable.

To make the project scalable, its importnat to keep the computation minimal when changes occur. For example, when a pod changes, or a policy is updated, we don't do the minimum amount of computation to update the envoy configuration.
Contributor

By "we don't", do you mean "we do"? If not, the statement contradicts itself.

Contributor

Suggested change
To make the project scalable, its importnat to keep the computation minimal when changes occur. For example, when a pod changes, or a policy is updated, we don't do the minimum amount of computation to update the envoy configuration.
To make the project scalable, it's important to keep the computation minimal when changes occur. For example, when a pod changes, or a policy is updated, we do the minimum amount of computation to update the envoy configuration.

it says "we don't do the minimum amount" .. assume it should say "we do"?


With extendability, we KGateway to be the basis on-top of which users can easily add their own custom logic. to that end we have a plugin system that allows users to add their own custom logic to the control plane in a way that's opaque to the core code.
Contributor

s/we KGateway to be/KGateway is/



Going down further, to enable these goals we use a KRT-based system. KRT gives us a few advantages:
Contributor

For KRT, can you provide a link to the best reference for a user to gain additional context? Maybe this one: https://github.com/istio/istio/tree/master/pkg/kube/krt#krt-kubernetes-declarative-controller-runtime

- The ability to complement controllers of custom Intermediate representation (henceforth IR).
Contributor

s/Intermediate representation/Intermediate Representation/

Can you expand or reword this sentence to better explain your point?

- Automatically track object dependencies and changes and only invoke logic that depends on the object that changed.
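
To make the second point concrete, here is a minimal sketch of the idea of recomputing only what changed. It is a toy and does not use the KRT API; `Derived`, `Upsert`, and `Delete` are names invented for this illustration, and real dependency tracking across several collections is considerably more involved.

```go
package main

import "fmt"

// Derived is a toy derived collection: it maps each input key to an output
// computed by transform, and recomputes only the key that changed.
// Illustration only; this is NOT the krt API.
type Derived[I any, O any] struct {
	transform func(I) O
	outputs   map[string]O
	calls     int // number of transform invocations, kept for demonstration
}

// NewDerived builds an empty derived collection around a transform function.
func NewDerived[I any, O any](transform func(I) O) *Derived[I, O] {
	return &Derived[I, O]{transform: transform, outputs: map[string]O{}}
}

// Upsert handles an add/update event for a single input object.
func (d *Derived[I, O]) Upsert(key string, in I) {
	d.calls++
	d.outputs[key] = d.transform(in)
}

// Delete handles a delete event for a single input object.
func (d *Derived[I, O]) Delete(key string) { delete(d.outputs, key) }

func main() {
	irs := NewDerived(func(s string) string { return "ir:" + s })
	irs.Upsert("default/a", "policy-a")
	irs.Upsert("default/b", "policy-b")
	// Only "default/a" changed, so the transform runs once more, not twice.
	irs.Upsert("default/a", "policy-a-v2")
	fmt.Println(irs.outputs["default/a"], irs.calls) // prints: ir:policy-a-v2 3
}
```

The point of the sketch is that an update to one object triggers one transform call, while the output for the untouched object is reused.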

# CRD Journey
How does a user CRD make it into envoy?

We have 3 main translation lifecycles: Routes & Listeners, Clusters and Endpoints.

Let's focus on the first one - Routes and Listeners, as this is where the majority of the logic is.

Envoy Routes & Listeners are translated from Gateways, HttpRoutes, and user policies (e.g. RoutePolicy, ListenerPolicy).

## Policies
The first step is to convert each user policy into an IR form. This is done by creating a collection of these objects from k8s, and transforming that collection into its IR.

For policies, this is pluggable. Plugins can contribute a policy to kgateway. Contributing a policy means that we add a policy collection to kgateway. It's the plugin's responsibility to convert the policy CRD to the IR form. Ideally, the IR should look as close as possible to the envoy configuration, so that this translation work only happens when the policy CRD changes.

You can see in the Plugin interface a field called `ContributesPolicies` which is a map of GK -> `PolicyPlugin`.
The policy plugin contains a bunch of fields, but for out discussion we'll focus on these two:
Contributor

s/out/our/

I agree that not all fields should be discussed here, but it would be helpful to improve the godocs throughout, especially for internal/kgateway/extensions2/plugin/plugin.go and the IR types.


```go
type PolicyPlugin struct {
	Policies                  krt.Collection[ir.PolicyWrapper]
	NewGatewayTranslationPass func(ctx context.Context, tctx ir.GwTranslationCtx) ir.ProxyTranslationPass
	// ... other fields
}
```
`Policies` is the collection of policies that the plugin contributes. The plugin is responsible for creating
this collection, usually by started with a CRD collection, and then translating to a `ir.PolicyWrapper` struct.
Contributor

s/started/starting/


Let's look at the important fields in the PolicyWrapper:

```go
type PolicyWrapper struct {
	// The Group, Kind, Name, and Namespace of the original policy object.
	ObjectSource `json:",inline"`
	// The IR of the policy object, ideally with structural errors removed.
	// Opaque to us other than metadata.
	PolicyIR PolicyIR
	// Where to attach the policy. This usually comes from the policy CRD.
	TargetRefs []PolicyTargetRef
}
```

The system uses the target refs to attach the policy IR to Listeners and HttpRoutes. You then
get access to the IR during translation (more on that later).
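
As a rough illustration of the CRD-to-IR step, the sketch below converts a made-up policy object into a wrapper. All types are local stand-ins that only mirror the shape of the fields quoted above; `HeaderPolicy`, `headerIR`, `toWrapper`, and the `example.dev` group are hypothetical, and the real `PolicyIR` interface and krt collection plumbing are omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// Stand-ins mirroring the shape of the types quoted above. The real
// definitions live in the kgateway ir package and carry more fields.
type ObjectSource struct{ Group, Kind, Namespace, Name string }
type PolicyTargetRef struct{ Group, Kind, Name string }
type PolicyIR interface{}
type PolicyWrapper struct {
	ObjectSource
	PolicyIR   PolicyIR
	TargetRefs []PolicyTargetRef
}

// HeaderPolicy is a hypothetical policy CRD used only for this example.
type HeaderPolicy struct {
	Namespace, Name string
	TargetRoute     string
	AddHeaders      map[string]string
}

// headerIR is the hypothetical IR: a sorted header list that is already
// close to what would be written into the Envoy configuration.
type headerIR struct{ headers [][2]string }

// toWrapper does the per-policy work once, when the policy object changes.
func toWrapper(p HeaderPolicy) PolicyWrapper {
	keys := make([]string, 0, len(p.AddHeaders))
	for k := range p.AddHeaders {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output regardless of map order
	ir := headerIR{}
	for _, k := range keys {
		ir.headers = append(ir.headers, [2]string{k, p.AddHeaders[k]})
	}
	return PolicyWrapper{
		ObjectSource: ObjectSource{Group: "example.dev", Kind: "HeaderPolicy", Namespace: p.Namespace, Name: p.Name},
		PolicyIR:     ir,
		TargetRefs:   []PolicyTargetRef{{Group: "gateway.networking.k8s.io", Kind: "HTTPRoute", Name: p.TargetRoute}},
	}
}

func main() {
	w := toWrapper(HeaderPolicy{
		Namespace: "default", Name: "add-hdrs", TargetRoute: "my-route",
		AddHeaders: map[string]string{"x-b": "2", "x-a": "1"},
	})
	fmt.Println(w.Kind, w.Namespace+"/"+w.Name, "->", w.TargetRefs[0].Kind, w.TargetRefs[0].Name)
}
```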

The second field, `NewGatewayTranslationPass`, allocates a new translation state for the
Contributor

what's the lifecycle of the plugin (i.e. when does NewPlugin get called) vs. the translation pass?

Contributor Author

good point - NewPlugin is called just once; i'll update

gateway/route translation. This function is invoked during the Translation to xDS phase, so we'll expand on it later.

## HttpRoutes and Gateways

HttpRoutes and Gateways are handled by KGateway. Kgateway builds an IR for HttpRoutes and Gateways that looks very similar to
the original object, but additionally has an `AttachedPolicies` struct that contains all the policies attached to the object.

KGateway uses the `TargetRefs` field in the PolicyWrapper to opaquely attach the policy to the HttpRoute or Gateway.
Contributor

should we mention extensionRef too?
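
The targetRef-based attachment described above can be sketched as an index from target reference to policy IRs. This is a simplified model under stated assumptions: the types are local stand-ins with invented field layouts, `attach` is a hypothetical helper, and attachment via extensionRef (raised in the comment above) is not modelled.

```go
package main

import "fmt"

// Simplified stand-ins; the real IR types carry more metadata.
type PolicyTargetRef struct{ Kind, Name string }
type PolicyWrapper struct {
	Name       string
	PolicyIR   any
	TargetRefs []PolicyTargetRef
}
type AttachedPolicies struct{ Policies []any }
type RouteIR struct {
	Name             string
	AttachedPolicies AttachedPolicies
}

// attach copies each route and fills in the policy IRs whose target refs
// point at it. The policy contents stay opaque to this code.
func attach(routes []RouteIR, policies []PolicyWrapper) []RouteIR {
	byTarget := map[PolicyTargetRef][]any{}
	for _, p := range policies {
		for _, ref := range p.TargetRefs {
			byTarget[ref] = append(byTarget[ref], p.PolicyIR)
		}
	}
	out := make([]RouteIR, 0, len(routes))
	for _, r := range routes {
		r.AttachedPolicies.Policies = byTarget[PolicyTargetRef{Kind: "HTTPRoute", Name: r.Name}]
		out = append(out, r)
	}
	return out
}

func main() {
	routes := attach(
		[]RouteIR{{Name: "checkout"}, {Name: "search"}},
		[]PolicyWrapper{{
			Name:       "rate-limit",
			PolicyIR:   "opaque-ir",
			TargetRefs: []PolicyTargetRef{{Kind: "HTTPRoute", Name: "checkout"}},
		}},
	)
	for _, r := range routes {
		fmt.Println(r.Name, len(r.AttachedPolicies.Policies))
	}
}
```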


## Translation to xDS

When we reach this phase, we already ahve the Policy -> IR translation done; and we have all the HttpRoutes and Gateways in IR form with the policies attached to them.
Contributor

s/ahve/have/

Contributor

maybe we should group the 2 phases as sub-headings in this doc: 1) policy -> IR translation (when NewPlugin is called?), 2) IR to xDS (when the ApplyFor* funcs are called)

Contributor Author

the policy -> IR translation happens earlier; i'll expand on it in the earlier section


At the beginning of the translation to xDS, we take all the contributed policies and call `NewGatewayTranslationPass` for each of them. The returned pass holds the state for the duration of the translation.

This allows us, for example, to translate a route and, in response, record state that tells us to add an HTTP filter.

KGateway handles merging of HttpRoutes per the Gateway API spec. When it translates Gateway API route rules to
envoy routes, it reads the `AttachedPolicies`, calls the appropriate function in the `ProxyTranslationPass`, and passes in
the attached policy IR. This lets the policy plugin code modify the route or listener as needed, based on the policy IR.
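
The following sketch shows how a translation pass can carry state from route translation to filter emission. It assumes heavily simplified signatures: `Route` and `HTTPFilter` stand in for the Envoy protos, `corsPass`, `corsPolicyIR`, and the filter name `example.cors` are hypothetical, and the real methods also take a context and context structs.

```go
package main

import "fmt"

// Route and HTTPFilter are stand-ins for the Envoy route and filter protos.
type Route struct {
	Name            string
	PerFilterConfig map[string]string
}
type HTTPFilter struct {
	Name     string
	Disabled bool
}

// corsPolicyIR is a hypothetical policy IR.
type corsPolicyIR struct{ AllowOrigin string }

// corsPass is a hypothetical translation pass. One instance is allocated per
// gateway translation, so its fields can carry state between calls.
type corsPass struct{ filterNeeded bool }

const corsFilterName = "example.cors" // plugin-unique filter name

// ApplyForRoute writes per-route config and remembers that the filter is needed.
func (p *corsPass) ApplyForRoute(policy corsPolicyIR, out *Route) {
	if out.PerFilterConfig == nil {
		out.PerFilterConfig = map[string]string{}
	}
	out.PerFilterConfig[corsFilterName] = policy.AllowOrigin
	p.filterNeeded = true
}

// HttpFilters emits the filter only if some route asked for it. The filter is
// disabled at the listener level so that routes without config are unaffected.
func (p *corsPass) HttpFilters() []HTTPFilter {
	if !p.filterNeeded {
		return nil
	}
	return []HTTPFilter{{Name: corsFilterName, Disabled: true}}
}

func main() {
	pass := &corsPass{} // allocated once at the start of a translation
	route := Route{Name: "checkout"}
	pass.ApplyForRoute(corsPolicyIR{AllowOrigin: "https://example.com"}, &route)
	fmt.Println(route.PerFilterConfig[corsFilterName], len(pass.HttpFilters()))
}
```

Because the pass is freshly allocated for each translation, the `filterNeeded` flag never leaks from one translation into the next.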
32 changes: 21 additions & 11 deletions internal/kgateway/ir/iface.go
@@ -65,12 +65,13 @@ type ProxyTranslationPass interface {
pCtx *ListenerContext,
out *envoy_config_listener_v3.Listener,
)
// called 1 time per filter chain after listeners
// called 1 time per filter chain after listeners and allows tweaking HCM settings.
Contributor

instead of having each plugin adhere to this monolithic interface, have we considered defining different plugin types so each plugin only needs to implement the funcs that are relevant? (similar to how gloo did it)

Contributor Author

i did; i preferred this way so it's very clear which function is called in which order, and which functions are allowed to share state. I recently added a default impl, so that plugins only need to override the functions they need.

Contributor

can we somewhere (either in the readme or in code) list explicitly the order in which these functions are called?

Contributor Author

the order should be the order they appear in the interface; i can mention that in the code

ApplyHCM(ctx context.Context,
pCtx *HcmContext,
out *envoy_hcm.HttpConnectionManager) error

// called 1 time for all the routes in a filter chain.
// called 1 time for all the routes in a filter chain. Use this to set default PerFilterConfig
Contributor

When you state "called 1 time" means 1 time per translation pass, correct?

Contributor Author

yes, correct

// No policy is provided here.
ApplyRouteConfigPlugin(
ctx context.Context,
pCtx *RouteConfigContext,
@@ -81,32 +82,36 @@ type ProxyTranslationPass interface {
pCtx *VirtualHostContext,
out *envoy_config_route_v3.VirtualHost,
)
// called 0 or more times
// called 0 or more times (one for each route)
// Applies policy for an HTTPRoute that has a policy attached via a targetRef.
// The output configures the envoy_config_route_v3.Route
ApplyForRoute(
Contributor

doesn't need to be done in this PR, but i wonder if we can make some of these function names more self-explanatory, e.g. ApplyForRoute and ApplyForRouteBackend sound similar, just the attachment method is different. not sure what a better name is offhand..will have to think about it

Contributor

ApplyPolicyToHttpRoute, ApplyPolicyToBackendViaRoute, ApplyForBackend? (I think adding the policy part might make it more clear?)

ctx context.Context,
pCtx *RouteContext,
out *envoy_config_route_v3.Route) error
// runs for policy applied

// Appliesa policy attached to a specific Backend (via extensionRef on the BackendRef).
Contributor

s/Appliesa/Applies a/

ApplyForRouteBackend(
ctx context.Context,
policy PolicyIR,
pCtx *RouteBackendContext,
) error
// no policy applied
// no policy applied - this is called for every backend in a route.
// For this to work the backend needs to register itself as a policy. TODO: rethink this.
ApplyForBackend(
ctx context.Context,
pCtx *RouteBackendContext,
in HttpBackend,
out *envoy_config_route_v3.Route,
) error

// called 1 time per listener
// if a plugin emits new filters, they must be with a plugin unique name.
// any filter returned from route config must be disabled, so it doesnt impact other routes.
// called 1 time per filter-chain.
// If a plugin emits new filters, they must be with a plugin unique name.
// filters added to impact specific routes should be disabled on the listener level, so they don't impact other routes.
HttpFilters(ctx context.Context, fc FilterChainCommon) ([]plugins.StagedHttpFilter, error)

NetworkFilters(ctx context.Context) ([]plugins.StagedNetworkFilter, error)
// called 1 time (per envoy proxy). replaces GeneratedResources
// called 1 time (per envoy proxy). replaces GeneratedResources and allows adding clusters to the envoy.
ResourcesToAdd(ctx context.Context) Resources
}
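
The author's reply in the thread above mentions a default implementation that lets plugins override only the functions they need. One common Go way to get that effect is struct embedding; the sketch below uses a cut-down, hypothetical two-method interface (`Pass`, `NoopPass`, `routeOnlyPass`) and is not the actual kgateway default implementation.

```go
package main

import "fmt"

// Pass is a cut-down, hypothetical version of a translation-pass interface.
type Pass interface {
	ApplyListener(name string) string
	ApplyRoute(name string) string
}

// NoopPass is a hypothetical default implementation: every method does nothing.
type NoopPass struct{}

func (NoopPass) ApplyListener(string) string { return "" }
func (NoopPass) ApplyRoute(string) string    { return "" }

// routeOnlyPass embeds NoopPass and overrides only the method it cares about.
type routeOnlyPass struct{ NoopPass }

func (routeOnlyPass) ApplyRoute(name string) string { return "patched " + name }

func main() {
	var p Pass = routeOnlyPass{}
	// ApplyListener falls through to the embedded no-op; ApplyRoute is overridden.
	fmt.Printf("%q %q\n", p.ApplyListener("l1"), p.ApplyRoute("r1"))
}
```

With this pattern the single interface still documents the full call order, while each plugin's code contains only the hooks it actually implements.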

@@ -157,18 +162,21 @@ type PolicyIR interface {
}

type PolicyWrapper struct {
// A reference to the original policy object
ObjectSource `json:",inline"`
Policy metav1.Object
// The policy object itself. TODO: we can probably remove this
Policy metav1.Object

// Errors processing it for status.
// note: these errors are based on policy itself, regardless of whether it's attached to a resource.
// TODO: change for conditions
Errors []error

// original object. ideally with structural errors removed.
// The IR of the policy objects. ideally with structural errors removed.
Contributor

what does "with structural errors removed" mean?

// Opaque to us other than metadata.
PolicyIR PolicyIR

// Where to attach the policy. This usually comes from the policy CRD.
TargetRefs []PolicyTargetRef
}

@@ -199,6 +207,8 @@ var (
)

type PolicyRun interface {
// Allocate state for single listener+route translation pass.
NewGatewayTranslationPass(ctx context.Context, tctx GwTranslationCtx) ProxyTranslationPass
// Process cluster for a backend
ProcessBackend(ctx context.Context, in BackendObjectIR, out *envoy_config_cluster_v3.Cluster) error
}