feat(zrpc): add local kubeconfig fallback for k8s resolver #5383

Open
tangentgo wants to merge 2 commits into zeromicro:master from tangentgo:feat/k8s-resolver-local-fallback

@tangentgo

  • Support automatic fallback to ~/.kube/config when not in cluster.
  • Use errors.Join to aggregate configuration loading errors.
  • Improve developer experience for local debugging of k8s:// targets.

Copilot AI review requested due to automatic review settings January 22, 2026 08:04
Contributor

Copilot AI left a comment

Pull request overview

This PR enhances the Kubernetes resolver to support local development by adding automatic fallback to ~/.kube/config when in-cluster configuration is unavailable. This improves developer experience when debugging services that use k8s:// targets outside of a Kubernetes cluster.

Changes:

  • Added fallback logic in kubeBuilder.Build() to load local kubeconfig when InClusterConfig() fails
  • Used errors.Join to aggregate multiple configuration loading errors for better error reporting
  • Updated tests to properly isolate environment by clearing k8s service environment variables and setting HOME to temp directory

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| zrpc/resolver/internal/kubebuilder.go | Implements fallback to ~/.kube/config using clientcmd.BuildConfigFromFlags when InClusterConfig fails |
| zrpc/resolver/internal/kubebuilder_test.go | Refactored to table-driven tests with proper environment isolation for testing error paths |
| go.mod | Added indirect dependencies for clientcmd support (mergo, pflag) |
| go.sum | Updated checksums for new transitive dependencies |

kubeconfig := filepath.Join(home, ".kube", "config")
localConfig, errLocal := clientcmd.BuildConfigFromFlags("", kubeconfig)
if errLocal != nil {
    return nil, fmt.Errorf("k8s config load failed: %w", errors.Join(err, errLocal))

Copilot AI Jan 22, 2026

The error wrapping here uses both fmt.Errorf with %w and errors.Join, which creates a nested wrapper. Consider simplifying to just errors.Join(err, errLocal) since the "k8s config load failed" message is generic and doesn't add much context. The joined errors already provide specific failure information from both InClusterConfig and BuildConfigFromFlags.

Suggested change
- return nil, fmt.Errorf("k8s config load failed: %w", errors.Join(err, errLocal))
+ return nil, errors.Join(err, errLocal)

@@ -51,7 +55,17 @@ func (b *kubeBuilder) Build(target resolver.Target, cc resolver.ClientConn,

Copilot AI Jan 22, 2026

The fallback mechanism lacks documentation. Consider adding a comment before line 56 to explain the fallback behavior, such as: "Try in-cluster config first, then fall back to ~/.kube/config for local development." This would help developers understand the configuration loading strategy.

Suggested change
// Try in-cluster config first, then fall back to ~/.kube/config for local development.

Comment on lines 17 to 50
 func TestKubeBuilder_Build(t *testing.T) {
+    t.Setenv("HOME", t.TempDir())
+    t.Setenv("KUBERNETES_SERVICE_HOST", "")
+    t.Setenv("KUBERNETES_SERVICE_PORT", "")

     var b kubeBuilder
-    u, err := url.Parse(fmt.Sprintf("%s://%s", KubernetesScheme, "a,b"))
-    assert.NoError(t, err)
+    cc := &mockedClientConn{}

-    _, err = b.Build(resolver.Target{
-        URL: *u,
-    }, nil, resolver.BuildOptions{})
-    assert.Error(t, err)
+    tests := []struct {
+        name  string
+        input string
+    }{
+        {
+            name:  "invalid host",
+            input: fmt.Sprintf("%s://%s", KubernetesScheme, "a,b"),
+        },
+        {
+            name:  "bad endpoint format",
+            input: fmt.Sprintf("%s://%s:9100/a:b:c", KubernetesScheme, "a,b,c,d"),
+        },
+    }

-    u, err = url.Parse(fmt.Sprintf("%s://%s:9100/a:b:c", KubernetesScheme, "a,b,c,d"))
-    assert.NoError(t, err)
+    for _, tt := range tests {
+        t.Run(tt.name, func(t *testing.T) {
+            u, err := url.Parse(tt.input)
+            assert.NoError(t, err)

-    _, err = b.Build(resolver.Target{
-        URL: *u,
-    }, nil, resolver.BuildOptions{})
-    assert.Error(t, err)
-}
+            _, err = b.Build(resolver.Target{
+                URL: *u,
+            }, cc, resolver.BuildOptions{})
+            assert.Error(t, err)
+        })
+    }
+}
\ No newline at end of file

Copilot AI Jan 22, 2026

The test isolates the environment correctly by setting HOME to a temp directory and clearing k8s service environment variables. However, there's no test coverage for the successful kubeconfig fallback path. While testing the full k8s client initialization is complex, consider adding a test case that verifies the fallback logic attempts to read from the correct path when InClusterConfig fails. This would help ensure the fallback mechanism works as intended during local development.

@tangentgo
Author

@copilot open a new pull request to apply changes based on the comments in this thread

@tangentgo
Author

This way, when debugging locally, there’s no need to modify configuration files or use tools like Telepresence.

@kevwan force-pushed the feat/k8s-resolver-local-fallback branch from 0668123 to 8d94c79 on January 24, 2026 07:21
@codecov

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| zrpc/resolver/internal/kubebuilder.go | 62.50% | 2 Missing and 1 partial ⚠️ |

Contributor

@kevwan left a comment

Thanks for the PR! The goal of improving local development experience is valuable.

However, I have a concern about the silent fallback behavior that could lead to unintended production access.

Security Concern

Consider this scenario:

  1. Developer has ~/.kube/config with multiple contexts (dev, staging, prod)
  2. Their current-context happens to be set to production
  3. They run a service locally for testing
  4. InClusterConfig() fails (expected - not in k8s)
  5. Fallback silently loads kubeconfig with prod context
  6. RPC calls unexpectedly go to production services

This could cause:

  • Unintended data modifications in production
  • Security/compliance issues
  • Debugging confusion ("why is my local service hitting prod?")

Suggested Mitigations

Please consider the following approach:

Dev/Test Mode Only

Only enable fallback in non-production modes:

if mode == "dev" || mode == "test" {
    // fallback logic
}

The implicit nature of this change could surprise users who aren't aware their local testing is hitting real clusters. An explicit opt-in mechanism would make this feature both useful and safe.
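A minimal sketch of the mode gate suggested above, written as a standalone predicate. The "dev" and "test" strings come from the snippet in this comment; the function name, the constant names, and the choice to deny every other value (including an empty mode) are assumptions made for the illustration.

```go
package main

import "fmt"

// Mode names follow the review snippet; the constants themselves are illustrative.
const (
	devMode  = "dev"
	testMode = "test"
)

// fallbackAllowed reports whether the local kubeconfig fallback may be used.
// Anything other than an explicit dev or test mode is denied, so an unset or
// unknown mode behaves like production.
func fallbackAllowed(mode string) bool {
	switch mode {
	case devMode, testMode:
		return true
	default:
		return false
	}
}

func main() {
	for _, mode := range []string{"dev", "test", "pro", ""} {
		fmt.Printf("mode=%q fallback allowed: %v\n", mode, fallbackAllowed(mode))
	}
}
```

Denying by default keeps the failure mode safe: a missing or misspelled mode produces the original in-cluster error instead of a silent connection to whatever cluster the current context points at.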

@tangentgo force-pushed the feat/k8s-resolver-local-fallback branch from 39cc8b4 to a9ba0e8 on January 24, 2026 14:43
@tangentgo marked this pull request as draft on January 26, 2026 02:26
@tangentgo
Author

@kevwan
Thank you so much for your guidance and the help you've provided! As a contributor, I really appreciate your patience.

Regarding the implementation: If we determine the mode through the schema, it seems I can only do so from the RPC client side. This would require adding a mode attribute to RpcClientConf, but I’m concerned this change might be a bit too intrusive for the current architecture.

Would it be better to leave the decision to the developer instead? For instance, allowing them to define the mode via environment variables might offer more flexibility. What do you think?
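The environment-variable idea could look roughly like the following. The variable name `ZRPC_K8S_LOCAL_FALLBACK` is invented for this sketch and does not exist in go-zero; the lookup function is injected so the check can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envLocalFallback is a hypothetical variable name chosen for this example.
const envLocalFallback = "ZRPC_K8S_LOCAL_FALLBACK"

// localFallbackEnabled returns true only when the variable is set to a value
// that strconv.ParseBool reads as true (for example "1" or "true").
// Unset, empty, or unparsable values leave the fallback disabled.
func localFallbackEnabled(lookup func(string) (string, bool)) bool {
	value, ok := lookup(envLocalFallback)
	if !ok {
		return false
	}
	enabled, err := strconv.ParseBool(value)
	return err == nil && enabled
}

func main() {
	// os.LookupEnv has the signature the helper expects.
	if localFallbackEnabled(os.LookupEnv) {
		fmt.Println("local kubeconfig fallback is enabled")
		return
	}
	fmt.Println("local kubeconfig fallback is disabled")
}
```

Compared with a mode field on the client config, this keeps the opt-in out of the config schema, at the cost of one more environment variable that has to be documented.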

@tangentgo force-pushed the feat/k8s-resolver-local-fallback branch from a9ba0e8 to dced64e on January 30, 2026 07:38
@tangentgo marked this pull request as ready for review on January 30, 2026 07:59
@tangentgo requested a review from kevwan on January 30, 2026 07:59