feat(zrpc): add local kubeconfig fallback for k8s resolver#5383
feat(zrpc): add local kubeconfig fallback for k8s resolver#5383tangentgo wants to merge 2 commits intozeromicro:masterfrom
Conversation
tangentgo
commented
Jan 22, 2026
- Support automatic fallback to ~/.kube/config when not in cluster.
- Use errors.Join to aggregate configuration loading errors.
- Improve developer experience for local debugging of k8s:// targets.
There was a problem hiding this comment.
Pull request overview
This PR enhances the Kubernetes resolver to support local development by adding automatic fallback to ~/.kube/config when in-cluster configuration is unavailable. This improves developer experience when debugging services that use k8s:// targets outside of a Kubernetes cluster.
Changes:
- Added fallback logic in
kubeBuilder.Build()to load local kubeconfig whenInClusterConfig()fails - Used
errors.Jointo aggregate multiple configuration loading errors for better error reporting - Updated tests to properly isolate environment by clearing k8s service environment variables and setting HOME to temp directory
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| zrpc/resolver/internal/kubebuilder.go | Implements fallback to ~/.kube/config using clientcmd.BuildConfigFromFlags when InClusterConfig fails |
| zrpc/resolver/internal/kubebuilder_test.go | Refactored to table-driven tests with proper environment isolation for testing error paths |
| go.mod | Added indirect dependencies for clientcmd support (mergo, pflag) |
| go.sum | Updated checksums for new transitive dependencies |
| kubeconfig := filepath.Join(home, ".kube", "config") | ||
| localConfig, errLocal := clientcmd.BuildConfigFromFlags("", kubeconfig) | ||
| if errLocal != nil { | ||
| return nil, fmt.Errorf("k8s config load failed: %w", errors.Join(err, errLocal)) |
There was a problem hiding this comment.
The error wrapping here uses both fmt.Errorf with %w and errors.Join, which creates a nested wrapper. Consider simplifying to just errors.Join(err, errLocal) since the "k8s config load failed" message is generic and doesn't add much context. The joined errors already provide specific failure information from both InClusterConfig and BuildConfigFromFlags.
| return nil, fmt.Errorf("k8s config load failed: %w", errors.Join(err, errLocal)) | |
| return nil, errors.Join(err, errLocal) |
| @@ -51,7 +55,17 @@ func (b *kubeBuilder) Build(target resolver.Target, cc resolver.ClientConn, | |||
|
|
|||
There was a problem hiding this comment.
The fallback mechanism lacks documentation. Consider adding a comment before line 56 to explain the fallback behavior, such as: "Try in-cluster config first, then fall back to ~/.kube/config for local development." This would help developers understand the configuration loading strategy.
| // Try in-cluster config first, then fall back to ~/.kube/config for local development. |
| func TestKubeBuilder_Build(t *testing.T) { | ||
| t.Setenv("HOME", t.TempDir()) | ||
| t.Setenv("KUBERNETES_SERVICE_HOST", "") | ||
| t.Setenv("KUBERNETES_SERVICE_PORT", "") | ||
|
|
||
| var b kubeBuilder | ||
| u, err := url.Parse(fmt.Sprintf("%s://%s", KubernetesScheme, "a,b")) | ||
| assert.NoError(t, err) | ||
| cc := &mockedClientConn{} | ||
|
|
||
| _, err = b.Build(resolver.Target{ | ||
| URL: *u, | ||
| }, nil, resolver.BuildOptions{}) | ||
| assert.Error(t, err) | ||
| tests := []struct { | ||
| name string | ||
| input string | ||
| }{ | ||
| { | ||
| name: "invalid host", | ||
| input: fmt.Sprintf("%s://%s", KubernetesScheme, "a,b"), | ||
| }, | ||
| { | ||
| name: "bad endpoint format", | ||
| input: fmt.Sprintf("%s://%s:9100/a:b:c", KubernetesScheme, "a,b,c,d"), | ||
| }, | ||
| } | ||
|
|
||
| u, err = url.Parse(fmt.Sprintf("%s://%s:9100/a:b:c", KubernetesScheme, "a,b,c,d")) | ||
| assert.NoError(t, err) | ||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| u, err := url.Parse(tt.input) | ||
| assert.NoError(t, err) | ||
|
|
||
| _, err = b.Build(resolver.Target{ | ||
| URL: *u, | ||
| }, nil, resolver.BuildOptions{}) | ||
| assert.Error(t, err) | ||
| } | ||
| _, err = b.Build(resolver.Target{ | ||
| URL: *u, | ||
| }, cc, resolver.BuildOptions{}) | ||
| assert.Error(t, err) | ||
| }) | ||
| } | ||
| } No newline at end of file |
There was a problem hiding this comment.
The test isolates the environment correctly by setting HOME to a temp directory and clearing k8s service environment variables. However, there's no test coverage for the successful kubeconfig fallback path. While testing the full k8s client initialization is complex, consider adding a test case that verifies the fallback logic attempts to read from the correct path when InClusterConfig fails. This would help ensure the fallback mechanism works as intended during local development.
|
@copilot open a new pull request to apply changes based on the comments in this thread |
|
This way, when debugging locally, there’s no need to modify configuration files or use tools like Telepresence. |
0668123 to
8d94c79
Compare
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Thanks for the PR! The goal of improving local development experience is valuable.
However, I have a concern about the silent fallback behavior that could lead to unintended production access.
Security Concern
Consider this scenario:
- Developer has
~/.kube/configwith multiple contexts (dev, staging, prod) - Their
current-contexthappens to be set to production - They run a service locally for testing
InClusterConfig()fails (expected - not in k8s)- Fallback silently loads kubeconfig with prod context
- RPC calls unexpectedly go to production services
This could cause:
- Unintended data modifications in production
- Security/compliance issues
- Debugging confusion ("why is my local service hitting prod?")
Suggested Mitigations
Please consider the following approach:
Dev/Test Mode Only
Only enable fallback in non-production modes:
if mode == "dev" || mode == "test" {
// fallback logic
}The implicit nature of this change could surprise users who aren't aware their local testing is hitting real clusters. An explicit opt-in mechanism would make this feature both useful and safe.
39cc8b4 to
a9ba0e8
Compare
|
@kevwan Regarding the implementation: If we determine the mode through the schema, it seems I can only do so from the RPC client side. This would require adding a mode attribute to RpcClientConf, but I’m concerned this change might be a bit too intrusive for the current architecture. Would it be better to leave the decision to the developer instead? For instance, allowing them to define the mode via environment variables might offer more flexibility. What do you think? |
- Support automatic fallback to ~/.kube/config when not in cluster. - Use errors.Join to aggregate configuration loading errors. - Improve developer experience for local debugging of k8s:// targets.
a9ba0e8 to
dced64e
Compare
…llback and relevant tests