|
| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: "Spotlight on SIG Architecture: API Governance" |
| 4 | +slug: sig-architecture-api |
| 5 | +date: 2026-12-31 # placeholder |
| 6 | +draft: true |
| 7 | +author: "Frederico Muñoz (SAS Institute)" |
| 8 | +--- |
| 9 | + |
| 10 | +_This is the fifth interview in the SIG Architecture Spotlight series, which covers the SIG's different |
| 11 | +subprojects. This installment covers [SIG Architecture: API |
| 12 | +Governance](https://github.com/kubernetes/community/blob/master/sig-architecture/README.md#architecture-and-api-governance-1)._ |
| 13 | + |
| 14 | +In this SIG Architecture spotlight we talked with [Jordan Liggitt](https://github.com/liggitt), lead |
| 15 | +of the API Governance subproject. |
| 16 | + |
| 17 | +## Introduction |
| 18 | + |
| 19 | +**FM: Hello Jordan, thank you for your availability. Tell us a bit about yourself, your role and how |
| 20 | +you got involved in Kubernetes.** |
| 21 | + |
| 22 | +**JL**: My name is Jordan Liggitt. I'm a Christian, husband, father of four, software engineer at |
| 23 | +[Google](https://about.google/) by day, and [amateur musician](https://www.youtube.com/watch?v=UDdr-VIWQwo) by stealth. I was born in Texas (and still |
| 24 | +like to claim it as my point of origin), but I've lived in North Carolina for most of my life. |
| 25 | + |
| 26 | +I've been working on Kubernetes since 2014. At that time, I was working on authentication and |
| 27 | +authorization at Red Hat, and my very first pull request to Kubernetes attempted to [add an OAuth |
| 28 | +server](https://github.com/kubernetes/kubernetes/pull/2328) to the Kubernetes API server. It never |
| 29 | +exited work-in-progress status. I ended up going with a different approach that layered on top of |
| 30 | +the core Kubernetes API server in a different project (spoiler alert: this is foreshadowing), and I |
| 31 | +closed it without merging six months later. |
| 32 | + |
| 33 | +Undeterred by that start, I stayed involved, helped build Kubernetes authentication and |
| 34 | +authorization capabilities, and got involved in the definition and evolution of the core Kubernetes |
| 35 | +APIs, from early beta APIs like `v1beta3` to `v1`. I got tagged as an API reviewer in 2016 based on |
| 36 | +those contributions, and was added as an API approver in 2017. |
| 37 | + |
| 38 | +Today, I help lead the API Governance and code organization subprojects for SIG Architecture, and I |
| 39 | +am a tech lead for SIG Auth. |
| 40 | + |
| 41 | +**FM: And when did you get specifically involved in the API Governance project?** |
| 42 | + |
| 43 | +**JL**: Around 2019. |
| 44 | + |
| 45 | +## Goals and scope of API Governance |
| 46 | + |
| 47 | +**FM: How would you describe the main goals and areas of intervention of the subproject?** |
| 48 | + |
| 49 | +**JL**: The surface area includes all the various APIs Kubernetes has, and there are APIs that people do not |
| 50 | +always realize are APIs: command-line flags, configuration files, how binaries are run, how they |
| 51 | +talk to back-end components like the container runtime, and how they persist data. People often |
| 52 | +think of "the API" as only the [REST API](https://kubernetes.io/docs/reference/using-api/)... that |
| 53 | +is the biggest and most obvious one, and the one with the largest audience, but all of these other |
| 54 | +surfaces are also APIs. Their audiences are narrower, so there is more flexibility there, but they |
| 55 | +still require consideration. |
| 56 | + |
| 57 | +The goals are to be stable while still enabling innovation. Stability is easy if you never change |
| 58 | +anything, but that contradicts the goal of evolution and growth. So we balance "be stable" with |
| 59 | +"allow change". |
| 60 | + |
| 61 | +**FM: Speaking of changes, in terms of ensuring consistency and quality (which is clearly one of the |
| 62 | +reasons this project exists), what are the specific quality gates in the lifecycle of a Kubernetes |
| 63 | +change? Does API Governance get involved during the release cycle, prior to it through guidelines, |
| 64 | +or somewhere in between? At what points do you ensure the intended role is fulfilled?** |
| 65 | + |
| 66 | +**JL**: We have [guidelines and |
| 67 | +conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md), |
| 68 | +both for APIs in general and for how to change an API. These are living documents that we update as |
| 69 | +we encounter new scenarios. They are long and dense, so we also support them with involvement at |
| 70 | +either the design stage or the implementation stage. |
| 71 | + |
| 72 | +Sometimes, due to bandwidth constraints, teams move ahead with design work without feedback from [API Review](https://github.com/kubernetes/community/blob/master/sig-architecture/api-review-process.md). That’s fine, but it means the API review happens once implementation begins, |
| 73 | +and there may be substantial feedback. So we get involved when a new API is created or an existing |
| 74 | +API is changed, either at design or implementation. |
| 75 | + |
| 76 | +**FM: Is this during the Kubernetes Enhancement Proposal (KEP) process? Since KEPs are mandatory for |
| 77 | +enhancements, I assume part of the work intersects with API Governance?** |
| 78 | + |
| 79 | +**JL**: It can. [KEPs](https://github.com/kubernetes/enhancements/blob/master/keps/README.md) vary |
| 80 | +in how detailed they are. Some include literal API definitions. When they do, we can perform an API |
| 81 | +review at the design stage. Then implementation becomes a matter of checking fidelity to the design. |
| 82 | + |
| 83 | +Getting involved early is ideal. But some KEPs are conceptual and leave details to the |
| 84 | +implementation. That’s not wrong; it just means the implementation will be more exploratory. Then |
| 85 | +API Review gets involved later, possibly recommending structural changes. |
| 86 | + |
| 87 | +There’s a trade-off regardless: detailed design upfront versus iterative discovery during |
| 88 | +implementation. People and teams work differently, and we’re flexible and happy to consult early or |
| 89 | +at implementation time. |
| 90 | + |
| 91 | +**FM: This reminds me of what Fred Brooks wrote in "The Mythical Man-Month" about conceptual |
| 92 | +integrity being central to product quality... No matter how you structure the process, there must be |
| 93 | +a point where someone looks at what is coming and ensures conceptual integrity. Kubernetes uses APIs |
| 94 | +everywhere -- externally and internally -- so API Governance is critical to maintaining that |
| 95 | +integrity. How is this captured?** |
| 96 | + |
| 97 | +**JL**: Yes, the conventions document captures patterns we’ve learned over time: what to do in |
| 98 | +various situations. We also have automated linters and checks to ensure correctness around patterns |
| 99 | +like spec/status semantics. These automated tools help catch issues even when humans miss them. |
| 100 | + |
| 101 | +As new scenarios arise -- and they do constantly -- we think through how to approach them and fold |
| 102 | +the results back into our documentation and tools. Sometimes it takes a few attempts before we |
| 103 | +settle on an approach that works well. |
| 104 | + |
| 105 | +**FM: Exactly. Each new interaction improves the guidelines.** |
| 106 | + |
| 107 | +**JL**: Right. And sometimes the first approach turns out to be wrong. It may take two or three |
| 108 | +iterations before we land on something robust. |
| 109 | + |
| 110 | +## The impact of Custom Resource Definitions |
| 111 | + |
| 112 | +**FM: Is there any particular change, episode, or domain that stands out as especially noteworthy, |
| 113 | +complex, or interesting in your experience?** |
| 114 | + |
| 115 | +**JL**: The watershed moment was [Custom Resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/). |
| 116 | +Prior to that, every API was handcrafted by us and fully reviewed. There were inconsistencies, but |
| 117 | +we understood and controlled every type and field. |
| 118 | + |
| 119 | +When Custom Resources arrived, anyone could define anything. The first version did not even require |
| 120 | +a schema. That made it extremely powerful -- it enabled change immediately -- but it left us playing |
| 121 | +catch-up on stability and consistency. |
| 122 | + |
| 123 | +When Custom Resources graduated to General Availability (GA), schemas became required, but escape |
| 124 | +hatches still existed for backward compatibility. Since then, we’ve been working on giving CRD |
| 125 | +authors validation capabilities comparable to built-ins. Built-in validation rules for CRDs |
| 126 | +reached GA only within the last few releases. |
| 127 | + |
| 128 | +So CRDs opened the "anything is possible" era. Built-in validation rules are the second major |
| 129 | +milestone: bringing consistency back. |
| 130 | + |
| 131 | +The three major themes have been defining schemas, validating data, and handling pre-existing |
| 132 | +invalid data. With ratcheting validation (allowing data to improve without breaking existing |
| 133 | +objects), we can now guide CRD authors toward conventions without breaking the world. |
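
To make the ratcheting idea above concrete, here is a minimal Python sketch. It is not taken from the interview or from the Kubernetes code base: the names `validate`, `validate_update`, the `rules` mapping, and the sample fields are all hypothetical, and the sketch assumes flat objects with one rule per field. The point it illustrates is only that an update is not rejected for a validation failure on a field it left unchanged.

```python
from typing import Callable, Dict, Optional

# Hypothetical rule type: takes a field value, returns an error message or None.
Rule = Callable[[object], Optional[str]]


def validate(obj: dict, rules: Dict[str, Rule]) -> Dict[str, str]:
    """Return {field: error} for every field that fails its rule."""
    errors = {}
    for field, rule in rules.items():
        message = rule(obj.get(field))
        if message is not None:
            errors[field] = message
    return errors


def validate_update(old: dict, new: dict, rules: Dict[str, Rule]) -> Dict[str, str]:
    """Ratcheting: keep only errors on fields whose value differs from the old object."""
    errors = validate(new, rules)
    return {f: msg for f, msg in errors.items() if old.get(f) != new.get(f)}


# Illustrative rule, assumed to have been added after `stored` was persisted.
rules = {
    "replicas": lambda v: None
    if isinstance(v, int) and v >= 0
    else "must be a non-negative integer",
}

stored = {"replicas": -1, "image": "app:v1"}  # already invalid under the new rule

# Unrelated change: the pre-existing invalid value is tolerated.
unrelated = validate_update(stored, {"replicas": -1, "image": "app:v2"}, rules)

# Changing the invalid field to another invalid value is rejected.
still_bad = validate_update(stored, {"replicas": -2, "image": "app:v1"}, rules)

# Fixing the field is accepted; after that, the rule applies in full.
fixed = validate_update(stored, {"replicas": 3, "image": "app:v1"}, rules)
```

In this sketch an object can only move toward validity: existing objects keep working, while every change to a constrained field has to satisfy the newer rule.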
| 134 | + |
| 135 | +## API Governance in context |
| 136 | + |
| 137 | +**FM: How does API Governance relate to SIG Architecture and API Machinery?** |
| 138 | + |
| 139 | +**JL**: [API Machinery](https://github.com/kubernetes/apimachinery) provides the actual code and |
| 140 | +tools that people build APIs on. They don’t review APIs for storage, networking, scheduling, etc. |
| 141 | + |
| 142 | +SIG Architecture sets the overall system direction and works with API Machinery to ensure the system |
| 143 | +supports that direction. API Governance works with other SIGs building on that foundation to define |
| 144 | +conventions and patterns, ensuring consistent use of what API Machinery provides. |
| 145 | + |
| 146 | +**FM: Thank you. That clarifies the flow. Going back to [release cycles](https://kubernetes.io/releases/release/): do release phases -- enhancements freeze, code |
| 147 | +freeze -- change your workload? Or is API Governance mostly continuous?** |
| 148 | + |
| 149 | +**JL**: We get involved in two places: design and implementation. Design involvement increases |
| 150 | +before enhancements freeze; implementation involvement increases before code freeze. However, many |
| 151 | +efforts span multiple releases, so there is always some design and implementation happening, even |
| 152 | +for work targeting future releases. Between those intense periods, we often have time to work on |
| 153 | +long-term design work. |
| 154 | + |
| 155 | +An anti-pattern we see is teams thinking about a large feature for months and then presenting it |
| 156 | +three weeks before enhancements freeze, saying, "Here is the design, please review." For big changes |
| 157 | +with API impact, it’s much better to involve API Governance early. |
| 158 | + |
| 159 | +And there are good times in the cycle for this -- between freezes -- when people have bandwidth. |
| 160 | +That’s when long-term review work fits best. |
| 161 | + |
| 162 | +## Getting involved |
| 163 | + |
| 164 | +**FM: Clearly. Now, regarding team dynamics and new contributors: how can someone get involved in |
| 165 | +API Governance? What should they focus on?** |
| 166 | + |
| 167 | +**JL**: It’s usually best to follow a specific change rather than trying to learn everything at |
| 168 | +once. Pick a small API change, perhaps one someone else is making or one you want to make, and |
| 169 | +observe the full process: design, implementation, review. |
| 170 | + |
| 171 | +High-bandwidth review -- live discussion over video -- is often very effective. If you’re making or |
| 172 | +following a change, ask whether there’s a time to go over the design or PR together. Observing those |
| 173 | +discussions is extremely instructive. |
| 174 | + |
| 175 | +Start with a small change. Then move to a bigger one. Then maybe a new API. That builds |
| 176 | +understanding of conventions as they are applied in practice. |
| 177 | + |
| 178 | +**FM: Excellent. Any final comments, or anything we missed?** |
| 179 | + |
| 180 | +**JL**: Yes... the reason we care so much about compatibility and stability is for our users. It’s |
| 181 | +easy for contributors to see those requirements as painful obstacles preventing cleanup or requiring |
| 182 | +tedious work... but users have integrated with our system, and we made a promise to them: we want them to |
| 183 | +trust that we won’t break that contract. So even when it requires more work, moves slower, or |
| 184 | +involves duplication, we choose stability. |
| 185 | + |
| 186 | +We are not trying to be obstructive; we are trying to make life good for users. |
| 187 | + |
| 188 | +A lot of our questions focus on the future: you want to do something now... how will you evolve it |
| 189 | +later without breaking it? We assume we will know more in the future, and we want the design to |
| 190 | +leave room for that. |
| 191 | + |
| 192 | +We also assume we will make mistakes. The question then is: how do we leave ourselves avenues to |
| 193 | +improve while keeping compatibility promises? |
| 194 | + |
| 195 | +**FM: Exactly. Jordan, thank you, I think we’ve covered everything. This has been an insightful view |
| 196 | +into the API Governance project and its role in the wider Kubernetes project.** |
| 197 | + |
| 198 | +**JL**: Thank you. |