Commit 7280b58

Merge pull request #621 from fsmunoz/sig-arch-api-spotlight
SIG Architecture API Governance Spotlight interview.
2 parents bfab831 + 5885629 commit 7280b58

1 file changed

+198
-0
lines changed

@@ -0,0 +1,198 @@
1+
---
2+
layout: blog
3+
title: "Spotlight on SIG Architecture: API Governance"
4+
slug: sig-architecture-api
5+
date: 2026-12-31 # placeholder
6+
draft: true
7+
author: "Frederico Muñoz (SAS Institute)"
8+
---
9+
10+
_This is the fifth interview in the SIG Architecture Spotlight series, which covers the different
11+
subprojects. This time we cover [SIG Architecture: API
12+
Governance](https://github.com/kubernetes/community/blob/master/sig-architecture/README.md#architecture-and-api-governance-1)._
13+
14+
In this SIG Architecture spotlight we talked with [Jordan Liggitt](https://github.com/liggitt), lead
15+
of the API Governance sub-project.
16+
17+
## Introduction
18+
19+
**FM: Hello Jordan, thank you for making the time. Tell us a bit about yourself, your role, and how
20+
you got involved in Kubernetes.**
21+
22+
**JL**: My name is Jordan Liggitt. I'm a Christian, husband, father of four, software engineer at
23+
[Google](https://about.google/) by day, and [amateur musician](https://www.youtube.com/watch?v=UDdr-VIWQwo) by stealth. I was born in Texas (and still
24+
like to claim it as my point of origin), but I've lived in North Carolina for most of my life.
25+
26+
I've been working on Kubernetes since 2014. At that time, I was working on authentication and
27+
authorization at Red Hat, and my very first pull request to Kubernetes attempted to [add an OAuth
28+
server](https://github.com/kubernetes/kubernetes/pull/2328) to the Kubernetes API server. It never
29+
exited work-in-progress status. I ended up going with a different approach that layered on top of
30+
the core Kubernetes API server in a different project (spoiler alert: this is foreshadowing), and I
31+
closed it without merging six months later.
32+
33+
Undeterred by that start, I stayed involved, helped build Kubernetes authentication and
34+
authorization capabilities, and got involved in the definition and evolution of the core Kubernetes
35+
APIs, from early beta versions like `v1beta3` through to `v1`. I got tagged as an API reviewer in 2016 based on
36+
those contributions, and was added as an API approver in 2017.
37+
38+
Today, I help lead the API Governance and code organization subprojects for SIG Architecture, and I
39+
am a tech lead for SIG Auth.
40+
41+
**FM: And when did you get specifically involved in the API Governance project?**
42+
43+
**JL**: Around 2019.
44+
45+
## Goals and scope of API Governance
46+
47+
**FM: How would you describe the main goals and areas of intervention of the subproject?**
48+
49+
**JL**: The surface area includes all the various APIs Kubernetes has, and there are APIs that people do not
50+
always realize are APIs: command-line flags, configuration files, how binaries are run, how they
51+
talk to back-end components like the container runtime, and how they persist data. People often
52+
think of "the API" as only the [REST API](https://kubernetes.io/docs/reference/using-api/)... that
53+
is the biggest and most obvious one, and the one with the largest audience, but all of these other
54+
surfaces are also APIs. Their audiences are narrower, so there is more flexibility there, but they
55+
still require consideration.
56+
57+
The goals are to be stable while still enabling innovation. Stability is easy if you never change
58+
anything, but that contradicts the goal of evolution and growth. So we balance "be stable" with
59+
"allow change".
60+
61+
**FM: Speaking of changes, in terms of ensuring consistency and quality (which is clearly one of the
62+
reasons this project exists), what are the specific quality gates in the lifecycle of a Kubernetes
63+
change? Does API Governance get involved during the release cycle, prior to it through guidelines,
64+
or somewhere in between? At what points do you ensure the intended role is fulfilled?**
65+
66+
**JL**: We have [guidelines and
67+
conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md),
68+
both for APIs in general and for how to change an API. These are living documents that we update as
69+
we encounter new scenarios. They are long and dense, so we also support them with involvement at
70+
either the design stage or the implementation stage.
71+
72+
Sometimes, due to bandwidth constraints, teams move ahead with design work without feedback from [API Review](https://github.com/kubernetes/community/blob/master/sig-architecture/api-review-process.md). That’s fine, but it means that when implementation begins, the API review will happen then,
73+
and there may be substantial feedback. So we get involved when a new API is created or an existing
74+
API is changed, either at design or implementation.
75+
76+
**FM: Is this during the Kubernetes Enhancement Proposal (KEP) process? Since KEPs are mandatory for
77+
enhancements, I assume part of the work intersects with API Governance?**
78+
79+
**JL**: It can. [KEPs](https://github.com/kubernetes/enhancements/blob/master/keps/README.md) vary
80+
in how detailed they are. Some include literal API definitions. When they do, we can perform an API
81+
review at the design stage. Then implementation becomes a matter of checking fidelity to the design.
82+
83+
Getting involved early is ideal. But some KEPs are conceptual and leave details to the
84+
implementation. That’s not wrong; it just means the implementation will be more exploratory. Then
85+
API Review gets involved later, possibly recommending structural changes.
86+
87+
There’s a trade-off regardless: detailed design upfront versus iterative discovery during
88+
implementation. People and teams work differently, and we’re flexible and happy to consult early or
89+
at implementation time.
90+
91+
**FM: This reminds me of what Fred Brooks wrote in "The Mythical Man-Month" about conceptual
92+
integrity being central to product quality... No matter how you structure the process, there must be
93+
a point where someone looks at what is coming and ensures conceptual integrity. Kubernetes uses APIs
94+
everywhere -- externally and internally -- so API Governance is critical to maintaining that
95+
integrity. How is this captured?**
96+
97+
**JL**: Yes, the conventions document captures patterns we’ve learned over time: what to do in
98+
various situations. We also have automated linters and checks to ensure correctness around patterns
99+
like spec/status semantics. These automated tools help catch issues even when humans miss them.
100+
101+
As new scenarios arise -- and they do constantly -- we think through how to approach them and fold
102+
the results back into our documentation and tools. Sometimes it takes a few attempts before we
103+
settle on an approach that works well.
104+
105+
**FM: Exactly. Each new interaction improves the guidelines.**
106+
107+
**JL**: Right. And sometimes the first approach turns out to be wrong. It may take two or three
108+
iterations before we land on something robust.
109+
110+
## The impact of Custom Resource Definitions
111+
112+
**FM: Is there any particular change, episode, or domain that stands out as especially noteworthy,
113+
complex, or interesting in your experience?**
114+
115+
**JL**: The watershed moment was [Custom Resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/).
116+
Prior to that, every API was handcrafted by us and fully reviewed. There were inconsistencies, but
117+
we understood and controlled every type and field.
118+
119+
When Custom Resources arrived, anyone could define anything. The first version did not even require
120+
a schema. That made it extremely powerful -- it enabled change immediately -- but it left us playing
121+
catch-up on stability and consistency.
122+
123+
When Custom Resources graduated to General Availability (GA), schemas became required, but escape
124+
hatches still existed for backward compatibility. Since then, we’ve been working on giving CRD
125+
authors validation capabilities comparable to built-ins. Built-in validation rules for CRDs reached
126+
GA only within the last few releases.
127+
128+
So CRDs opened the "anything is possible" era. Built-in validation rules are the second major
129+
milestone: bringing consistency back.
130+
131+
The three major themes have been defining schemas, validating data, and handling pre-existing
132+
invalid data. With ratcheting validation (allowing data to improve without breaking existing
133+
objects), we can now guide CRD authors toward conventions without breaking the world.
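
To make the ratcheting idea concrete, here is a minimal sketch in Python. It is an illustration only, not how the Kubernetes API server implements ratcheting: the `Rule` type, the `non_negative` rule, and the `validate_update` helper are all hypothetical names invented for this example, and objects are modeled as flat dictionaries. The point it shows is that an update is rejected only when it introduces or changes an invalid value, while an invalid value that was already stored and is left untouched is tolerated.

```python
from typing import Any, Callable, Dict, List, Optional

# A rule inspects one field value and returns an error message, or None if valid.
# (Hypothetical type for this sketch; not a Kubernetes API.)
Rule = Callable[[Any], Optional[str]]


def non_negative(value: Any) -> Optional[str]:
    """Example rule: the value must be a non-negative integer."""
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return None
    return "must be a non-negative integer"


def validate_update(old: Dict[str, Any], new: Dict[str, Any],
                    rules: Dict[str, Rule]) -> List[str]:
    """Validate an update, tolerating invalid values that did not change."""
    errors: List[str] = []
    for field, rule in rules.items():
        problem = rule(new.get(field))
        if problem is None:
            continue  # the new value is valid
        if field in old and field in new and old[field] == new[field]:
            continue  # ratchet: pre-existing invalid value left untouched
        errors.append(f"{field}: {problem}")
    return errors
```

With `rules = {"replicas": non_negative}`, an object stored with `replicas: -1` before the rule existed can still have its other fields updated, but changing `replicas` to another invalid value, or creating a new object with an invalid value, produces an error. That is the sense in which data can improve without existing objects breaking.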
134+
135+
## API Governance in context
136+
137+
**FM: How does API Governance relate to SIG Architecture and API Machinery?**
138+
139+
**JL**: [API Machinery](https://github.com/kubernetes/apimachinery) provides the actual code and
140+
tools that people build APIs on. They don’t review APIs for storage, networking, scheduling, etc.
141+
142+
SIG Architecture sets the overall system direction and works with API Machinery to ensure the system
143+
supports that direction. API Governance works with other SIGs building on that foundation to define
144+
conventions and patterns, ensuring consistent use of what API Machinery provides.
145+
146+
**FM: Thank you. That clarifies the flow. Going back to [release cycles](https://kubernetes.io/releases/release/): do release phases -- enhancements freeze, code
147+
freeze -- change your workload? Or is API Governance mostly continuous?**
148+
149+
**JL**: We get involved in two places: design and implementation. Design involvement increases
150+
before enhancements freeze; implementation involvement increases before code freeze. However, many
151+
efforts span multiple releases, so there is always some design and implementation happening, even
152+
for work targeting future releases. Between those intense periods, we often have time to work on
153+
long-term design work.
154+
155+
An anti-pattern we see is teams thinking about a large feature for months and then presenting it
156+
three weeks before enhancements freeze, saying, "Here is the design, please review." For big changes
157+
with API impact, it’s much better to involve API Governance early.
158+
159+
And there are good times in the cycle for this -- between freezes -- when people have bandwidth.
160+
That’s when long-term review work fits best.
161+
162+
## Getting involved
163+
164+
**FM: Clearly. Now, regarding team dynamics and new contributors: how can someone get involved in
165+
API Governance? What should they focus on?**
166+
167+
**JL**: It’s usually best to follow a specific change rather than trying to learn everything at
168+
once. Pick a small API change, perhaps one someone else is making or one you want to make, and
169+
observe the full process: design, implementation, review.
170+
171+
High-bandwidth review -- live discussion over video -- is often very effective. If you’re making or
172+
following a change, ask whether there’s a time to go over the design or PR together. Observing those
173+
discussions is extremely instructive.
174+
175+
Start with a small change. Then move to a bigger one. Then maybe a new API. That builds
176+
understanding of conventions as they are applied in practice.
177+
178+
**FM: Excellent. Any final comments, or anything we missed?**
179+
180+
**JL**: Yes... the reason we care so much about compatibility and stability is for our users. It’s
181+
easy for contributors to see those requirements as painful obstacles preventing cleanup or requiring
182+
tedious work... but users have integrated with our system, and we made a promise to them: we want them to
183+
trust that we won’t break that contract. So even when it requires more work, moves slower, or
184+
involves duplication, we choose stability.
185+
186+
We are not trying to be obstructive; we are trying to make life good for users.
187+
188+
A lot of our questions focus on the future: you want to do something now... how will you evolve it
189+
later without breaking it? We assume we will know more in the future, and we want the design to
190+
leave room for that.
191+
192+
We also assume we will make mistakes. The question then is: how do we leave ourselves avenues to
193+
improve while keeping compatibility promises?
194+
195+
**FM: Exactly. Jordan, thank you, I think we’ve covered everything. This has been an insightful view
196+
into the API Governance project and its role in the wider Kubernetes project.**
197+
198+
**JL**: Thank you.
