Add dynamic configuration to a BootConfig message. #158

Open
wants to merge 4 commits into
base: main
from

Conversation

robshakir
Contributor

@robshakir robshakir commented May 19, 2025

 * (M) proto/bootz.proto
   - In some cases, the configuration that needs to be applied
     through the boot process is not immutable per the current
     specification of the bootz process. This change adds the
     ability to provide dynamic configuration during an initial
     boot that may be overwritten later in the process.

Change justification

Consider the scenario in which a device is being bootstrapped from a factory-reset (zero-ised) state. In such cases, the bootz server provides a BootConfig which contains the minimum manageable state of the device. This may include configuration such as ACLs that protect the control plane of the device. Currently, such an ACL becomes immutable, which means that to change it, a client needs to know that it must call gnoi.bootconfig.SetBootConfig as well as gnmi.Set. This is a workable solution -- but it adds some complexity to the calling client system.

An alternative design -- proposed herein -- is to allow two different types of configuration to be applied during the initial bootstrap operation:

  1. immutable (precedence 10) boot configuration -- which contains the static minimum-manageable-state configuration of the device.
  2. dynamic (precedence 100) configuration -- which contains dynamic config that is required during the initial bootstrap.

This simplifies the overall operation for configuration management, since configuration that is truly immutable boot configuration still remains within the bootz namespace, and configuration that is required at boot time but is dynamic remains within a namespace where it can be mutated.
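
A minimal sketch of the split described above, assuming a simple in-memory model. The class and method names (`DeviceConfig`, `bootstrap`, `gnmi_set`, `set_boot_config`) and the example paths are hypothetical; they only mirror the two RPC paths named in this description and are not the bootz implementation. The precedence values are the ones quoted in the list above. Rejecting a `gnmi_set` on a boot path is an assumption made for illustration.

```python
from dataclasses import dataclass, field

# Precedence values quoted in the proposal; how a NOS interprets them is
# outside the scope of this sketch.
IMMUTABLE_PRECEDENCE = 10
DYNAMIC_PRECEDENCE = 100


@dataclass
class DeviceConfig:
    """Hypothetical model of the two configuration namespaces."""

    boot: dict = field(default_factory=dict)     # immutable boot namespace
    dynamic: dict = field(default_factory=dict)  # mutable namespace

    def bootstrap(self, boot: dict, dynamic: dict) -> None:
        # Initial bootz operation: both kinds of config arrive together.
        self.boot = dict(boot)
        self.dynamic = dict(dynamic)

    def gnmi_set(self, path: str, value: str) -> None:
        # Stand-in for gnmi.Set: only the dynamic namespace may change.
        # (Assumption: writes to a boot-owned path are rejected.)
        if path in self.boot:
            raise PermissionError(f"{path} is boot configuration")
        self.dynamic[path] = value

    def set_boot_config(self, boot: dict) -> None:
        # Stand-in for gnoi.bootconfig.SetBootConfig: replaces boot config only.
        self.boot = dict(boot)


device = DeviceConfig()
device.bootstrap(
    boot={"/system/grpc-server": "enabled"},   # illustrative path
    dynamic={"/acl/control-plane": "v1"},      # illustrative path
)
# The ACL supplied at bootstrap can be updated without touching boot config.
device.gnmi_set("/acl/control-plane", "v2")
```

In this model the client that manages the ACL only ever needs the `gnmi_set` path, which is the simplification the proposal is after.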

Alternate designs

There are alternate designs that could be considered here.

Configuration management system calls gnoi.bootconfig.SetBootConfig as well as gnmi.Set.

Pros:

  • Requires no changes to the existing RPCs that are available from a network device.

Cons:

  • Confuses the concept of the boot configuration, which today is considered immutable.
  • Requires the configuration management system to have permissions to update the bootconfig -- which affects the device's base management reachability.

(Proposed) Allow for initial dynamic configuration to be supplied through bootz.

Pros:

  • Clearly maintains the idea of an immutable boot configuration, and a dynamic configuration that has one RPC that can be used to manage it.
  • Allows the immutable minimum-manageable configuration to be kept static -- via an RPC that can have more constrained security policies reflected against it.

Cons:

  • Requires caution to ensure that the dynamic configuration is well-understood. In the factory-fresh bootstrap case it is clearer -- since there is no dynamic configuration. However, gnoi.bootconfig.SetBootConfig can also be called later; in these cases, the dynamic_*_config fields must not be specified, otherwise they will override the dynamic configuration.
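
One way a client could guard against this is sketched below. It assumes the request is represented as a plain dict keyed by field name, and that the dynamic fields are `dynamic_oc_config` and `dynamic_vendor_config`; the helper names are hypothetical and this is not part of the proto or of any gNOI client library.

```python
# Field names assumed from the proposal's dynamic_*_config wording.
DYNAMIC_FIELDS = ("dynamic_oc_config", "dynamic_vendor_config")


def dynamic_fields_set(boot_config: dict) -> list:
    """Return the dynamic_*_config fields that carry a non-empty value."""
    return [name for name in DYNAMIC_FIELDS if boot_config.get(name)]


def check_post_bootstrap_update(boot_config: dict) -> None:
    """Reject an update that would override the dynamic configuration."""
    present = dynamic_fields_set(boot_config)
    if present:
        raise ValueError(
            "dynamic fields must be unset after bootstrap: " + ", ".join(present)
        )


# A later update that only touches the immutable fields passes the check.
check_post_bootstrap_update({"vendor_config": b"hw-profile a"})
```

A check like this would sit in the calling system before it issues a post-bootstrap SetBootConfig; whether the device should also enforce it is not decided here.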

@robshakir robshakir requested a review from marcushines May 19, 2025 21:17

@LimeHat LimeHat left a comment

Can you please indicate how the boot process should consume all these configs? In the previous design, it was guaranteed that the boot config is a full, valid configuration file that can be consumed by the system (at least during the initial boot).

With two separate configurations:

  • do you expect that the NOS should create a single "candidate" before performing any validation, or
  • can (or should) we validate boot config (vendor_config/oc_config) and commit it in an independent manner?

@robshakir
Contributor Author

In the current specification, there's already some need for combination -- since you might have oc_config and vendor_config specified -- i.e., the union of these two must be a valid configuration. (It's not clear to me what a "full" config means -- can you clarify please?)

The proposal here is that this union behaviour is extended, such that the 'candidate' becomes the union of (oc_config + vendor_config + dynamic_oc_config + dynamic_vendor_config). My expectation (and proposal) is that the contents of these two unions do not change in the initial process, but rather that the behaviour change is for subsequent configuration sets.

I'd expect that if this proposal is accepted that validation is done on the union that is created per the above.
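
The union described above can be sketched as follows. This is a simplification under stated assumptions: each field is modelled as a flat mapping of path to value (a real vendor_config is vendor-specific text), conflicting values for the same path are treated as an error, and `build_candidate` is a hypothetical helper name. The thread does not define how overlaps between the four fields are resolved.

```python
FIELDS = ("oc_config", "vendor_config", "dynamic_oc_config", "dynamic_vendor_config")


def build_candidate(boot_config: dict) -> dict:
    """Union the four config fields into a single candidate (path -> value)."""
    candidate = {}
    for name in FIELDS:
        for path, value in boot_config.get(name, {}).items():
            if path in candidate and candidate[path] != value:
                # Assumption: conflicting values make the union invalid.
                raise ValueError(f"{name} conflicts with an earlier field at {path}")
            candidate[path] = value
    return candidate


# Validation would then run once, on the combined candidate.
candidate = build_candidate({
    "oc_config": {"/system/grpc-server": "enabled"},
    "dynamic_oc_config": {"/acl/control-plane": "v1"},
})
```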

@LimeHat

LimeHat commented May 20, 2025

In the current specification, there's already some need for combination -- since you might have oc_config and vendor_config specified -- i.e., the union of these two must be a valid configuration.

that's true (in theory; to my knowledge oc_config is not used much)

(It's not clear to me what a "full" config means -- can you clarify please?)

A configuration blob that can pass validation and be consumed by the system independently from the dynamic configuration; I didn't mean it in terms of overall completeness.

The proposal here is that this union behaviour is extended, such that the 'candidate' becomes the union of (oc_config + vendor_config + dynamic_oc_config + dynamic_vendor_config).

Are we still targeting a practical objective?

If an operator knows that

  • a union of four configuration objects (speaking of complexity!) is valid at the boot time,
  • two of these objects can be modified at a later time,

What guarantees that a union of the two "immutable" objects is a valid configuration at a later point?
If there's no guarantee, then what is the benefit? I believe one of the stated goals was to prevent the device from becoming "unmanageable" due to configuration errors.

@robshakir
Contributor Author

In the current specification, there's already some need for combination -- since you might have oc_config and vendor_config specified -- i.e., the union of these two must be a valid configuration.

that's true (in theory; to my knowledge oc_config is not used much)

yet :-)

The proposal here is that this union behaviour is extended, such that the 'candidate' becomes the union of (oc_config + vendor_config + dynamic_oc_config + dynamic_vendor_config).

Are we still targeting a practical objective?

Yes -- here's the case that we need to deal with (although it can be simplified if vendors support this more coherently, but today, we find that the development cycle isn't fast enough to realise these simpler solutions):

  • there are knobs that are required to make a system act in a particular way that are not standard ("run the hardware in this way") -- today, multiple vendors need these to be supported as CLI, and they are immutable (vendor_config).
  • we have some standard configuration that can be applied across all vendors that is modelled (e.g., enabling gNMI), this is immutable as the box becomes unmanageable without it (oc_config).
  • we have some vendor-specific configuration because a vendor does not yet support a model that is required in dynamic config (dynamic_vendor_config)
  • we have some vendor-neutral configuration that is now supported in OC (dynamic_oc_config).
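
The four cases above reduce to two questions about a piece of configuration: is it modelled in OpenConfig, and is it immutable? A small sketch of that mapping, where the function name `target_field` is hypothetical and the returned field names are the ones used in this thread:

```python
def target_field(modelled_in_oc: bool, immutable: bool) -> str:
    """Pick the BootConfig field for a piece of configuration."""
    prefix = "" if immutable else "dynamic_"
    kind = "oc" if modelled_in_oc else "vendor"
    return f"{prefix}{kind}_config"


# Non-standard hardware knob, must never change: vendor_config.
hardware_knob = target_field(modelled_in_oc=False, immutable=True)
# ACL that is modelled in OC and updated on an ongoing basis: dynamic_oc_config.
acl = target_field(modelled_in_oc=True, immutable=False)
```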

Today, we have all of these cases. If we got rid of oc_config then maybe it'd be simpler, but this puts cost onto the config generation system for something that is supported in a vendor-neutral model, which then continues to maintain system complexity.

If an operator knows that

  • a union of four configuration objects (speaking of complexity!) is valid at the boot time,
  • two of these objects can be modified at a later time,

What guarantees that a union of the two "immutable" objects is a valid configuration at a later point? If there's no guarantee, then what is the benefit? I believe one of the stated goals was to prevent the device from becoming "unmanageable" due to configuration errors.

Testing? We can update the "immutable" config via gnoi.bootconfig.SetBootConfig to ensure that there is compatibility between these different configurations.

The practical problem we have today is that ACLs are needed at startup (we don't want to leave a device with no ACLs from a security perspective), but they also need to be updated on an ongoing basis -- i.e., they are not immutable.
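
As a rough illustration of the kind of test meant here, the sketch below checks that the immutable fields are valid on their own as well as together with the dynamic fields. Everything in it is an assumption for illustration: configs are flat path-to-value mappings, later fields win on overlap, and `validate` is a caller-supplied stand-in for whatever validation the NOS or a test harness actually provides.

```python
from typing import Callable

IMMUTABLE = ("oc_config", "vendor_config")
DYNAMIC = ("dynamic_oc_config", "dynamic_vendor_config")


def union(configs: dict, names: tuple) -> dict:
    """Merge the named fields; later fields win on overlap (an assumption)."""
    merged = {}
    for name in names:
        merged.update(configs.get(name, {}))
    return merged


def compatible(configs: dict, validate: Callable[[dict], bool]) -> bool:
    """True if the immutable config is valid alone and with the dynamic config."""
    immutable_only = union(configs, IMMUTABLE)
    everything = union(configs, IMMUTABLE + DYNAMIC)
    return validate(immutable_only) and validate(everything)


def needs_grpc(candidate: dict) -> bool:
    # Toy rule standing in for a real validator: the device must stay reachable.
    return candidate.get("/system/grpc-server") == "enabled"


ok = compatible(
    {
        "oc_config": {"/system/grpc-server": "enabled"},
        "dynamic_oc_config": {"/acl/control-plane": "v2"},
    },
    needs_grpc,
)
```

Running such a check whenever either the immutable or the dynamic config is regenerated would catch the case where the immutable objects only validate because of something in the dynamic ones.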

@LimeHat

LimeHat commented May 20, 2025

Testing?

Sure. I'm just skeptical that the existence of the boot namespace is worth the cost at this point.
If gNMI changes can easily render the device unmanageable (and ACLs are a good example of that!), then we can move one step forward and say that all config except gNSI is dynamic; and union_replace/mixed-origin are used to manage it.

@robshakir
Contributor Author

Testing?

Sure. I'm just skeptical that the existence of the boot namespace is worth the cost at this point. If gNMI changes can easily render the device unmanageable (and ACLs are a good example of that!), then we can move one step forward and say that all config except gNSI is dynamic; and union_replace/mixed-origin are used to manage it.

I'm not sure that this is the only concern though -- it's also whether there are different owners of this configuration, and the relationship between the two. In the scenario that there is a bootstrap configuration generator, then this can be used to store all the feature-flag/vendor-specific knobs that are required. I'd love to be able to get rid of these, but I do not yet see a path to do so. The boot namespace means that you can have these knobs be isolated, and ensure that they will always apply even if there's some SNAFU in the dynamic config generation.

I certainly see the concern -- but I think that this is something that should be handled outside of this change. If we don't have these new fields, the simplification that you're proposing is not actually possible in the future. (Especially given the bootz namespace is already implemented.)

@LimeHat

LimeHat commented May 20, 2025

but I think that this is something that should be handled outside of this change.

I can agree with that.

For this change, though, can you please add a note that the configs are always unioned before validation/commit? Either to the proto or to the README text. Thanks.

@robshakir
Contributor Author

but I think that this is something that should be handled outside of this change.

I can agree with that.

ACK -- happy to discuss it.

For this change, though, can you please add a note that the configs are always unioned before validation/commit? Either to the proto or to the README text. Thanks.

Yes, done in the proto.

@marcushines - PTAL.
