This document describes how KubeStellar Console provides cluster context and tool-calling capabilities to Kagenti agents.
Kagenti agents running in your Kubernetes clusters can now access cluster state through a bridge provided by the KubeStellar Console backend. When you send a message to a Kagenti agent through the console, the system automatically:
- Injects cluster context: a structured summary of all clusters, their health status, node counts, and pod counts is prepended to your message
- Exposes console tools: agents can invoke tools to query cluster state in real time
- Routes tool calls: tool invocations are proxied through the console backend to the appropriate Kubernetes API handlers
sequenceDiagram
participant User
participant Frontend
participant Console Backend
participant Kagenti Agent
participant K8s API
User->>Frontend: "How many pods are running?"
Frontend->>Console Backend: POST /api/kagenti-provider/chat
Console Backend->>K8s API: Get cluster summary
K8s API-->>Console Backend: Cluster list with health/nodes/pods
Console Backend->>Console Backend: Inject context into message
Console Backend->>Kagenti Agent: Invoke with enriched message
Kagenti Agent->>Kagenti Agent: Process message + context
Kagenti Agent->>Console Backend: (optional) Call tool: get_pod_list
Console Backend->>K8s API: Get pods from cluster
K8s API-->>Console Backend: Pod list
Console Backend-->>Kagenti Agent: Tool result
Kagenti Agent-->>Console Backend: Stream response
Console Backend-->>Frontend: SSE stream
Frontend-->>User: "You have 42 pods running across 3 clusters"
Every chat message sent to a Kagenti agent is automatically enriched with cluster context. The console backend prepends a system context block to the user's message.
--- SYSTEM CONTEXT ---
You have access to the following Kubernetes clusters:
Cluster: production-us-east
Status: Healthy
Nodes: 12
Pods: 287
Cluster: staging-eu-west
Status: Healthy
Nodes: 4
Pods: 58
Cluster: dev-local
Status: Unhealthy
Nodes: 1
Pods: 5
You can use the following tools to query cluster state:
- get_cluster_list: Returns detailed cluster information
- get_pod_list(cluster, namespace): Returns pods in a namespace
- get_events(cluster, namespace): Returns recent warning events
--- END CONTEXT ---
[User's original message here]
- User sends a message via the console UI
- Frontend calls POST /api/kagenti-provider/chat with the message
- Console backend:
  - Queries DeduplicatedClusters() to get cluster list
  - Builds context block with cluster names, health, node count, pod count
  - Prepends context block to the user's message
- Enriched message is forwarded to the Kagenti agent
- Agent receives both the context and the user's query in a single invocation
Cluster context gathering has a 10-second timeout. If cluster queries fail or time out, the original message is forwarded without context enrichment. This ensures agent availability even when cluster access is degraded.
Returns a list of all deduplicated Kubernetes clusters with health status, node count, and pod count.
Input Schema:
{
"type": "object",
"properties": {}
}
Example Response:
{
"tool": "get_cluster_list",
"result": [
{
"name": "production-us-east",
"context": "prod-us-east-ctx",
"healthy": true,
"nodeCount": 12,
"podCount": 287
},
{
"name": "staging-eu-west",
"context": "staging-eu-ctx",
"healthy": true,
"nodeCount": 4,
"podCount": 58
}
]
}
Returns a list of pods in a specific cluster and namespace.
Input Schema:
{
"type": "object",
"properties": {
"cluster": {
"type": "string",
"description": "Cluster name"
},
"namespace": {
"type": "string",
"description": "Kubernetes namespace (leave empty for all namespaces)"
}
},
"required": ["cluster"]
}
Example Invocation:
{
"tool": "get_pod_list",
"args": {
"cluster": "production-us-east",
"namespace": "default"
}
}
Example Response:
{
"tool": "get_pod_list",
"result": [
{
"name": "nginx-7d8b49c9bf-xjk2m",
"namespace": "default",
"cluster": "production-us-east",
"status": "Running",
"ready": 1,
"total": 1,
"restarts": 0,
"age": "2d3h",
"containers": [
{
"name": "nginx",
"image": "nginx:1.21",
"ready": true,
"state": "running"
}
]
}
]
}
Returns recent warning events from a specific cluster and namespace.
Input Schema:
{
"type": "object",
"properties": {
"cluster": {
"type": "string",
"description": "Cluster name"
},
"namespace": {
"type": "string",
"description": "Kubernetes namespace (leave empty for all namespaces)"
},
"limit": {
"type": "number",
"description": "Maximum number of events to return (default: 50)"
}
},
"required": ["cluster"]
}
Example Invocation:
{
"tool": "get_events",
"args": {
"cluster": "production-us-east",
"namespace": "kube-system",
"limit": 10
}
}
Example Response:
{
"tool": "get_events",
"result": [
{
"type": "Warning",
"reason": "BackOff",
"message": "Back-off restarting failed container",
"involvedObject": {
"kind": "Pod",
"name": "coredns-5d78c9869d-abc12",
"namespace": "kube-system"
},
"firstTimestamp": "2024-01-15T10:30:00Z",
"lastTimestamp": "2024-01-15T10:35:00Z",
"count": 5
}
]
}
Lists available console tools for Kagenti agents.
Response:
{
"tools": [
{
"name": "get_cluster_list",
"description": "Returns a list of all Kubernetes clusters with health status, node count, and pod count",
"inputSchema": {
"type": "object",
"properties": {}
}
},
{
"name": "get_pod_list",
"description": "Returns a list of pods in a specific cluster and namespace",
"inputSchema": {
"type": "object",
"properties": {
"cluster": {
"type": "string",
"description": "Cluster name"
},
"namespace": {
"type": "string",
"description": "Kubernetes namespace (leave empty for all namespaces)"
}
},
"required": ["cluster"]
}
},
{
"name": "get_events",
"description": "Returns recent warning events from a specific cluster and namespace",
"inputSchema": {
"type": "object",
"properties": {
"cluster": {
"type": "string",
"description": "Cluster name"
},
"namespace": {
"type": "string",
"description": "Kubernetes namespace (leave empty for all namespaces)"
},
"limit": {
"type": "number",
"description": "Maximum number of events to return (default: 50)"
}
},
"required": ["cluster"]
}
}
]
}
Routes tool calls to the appropriate console handlers.
Request:
{
"tool": "get_pod_list",
"args": {
"cluster": "production-us-east",
"namespace": "default"
}
}
Response:
{
"tool": "get_pod_list",
"result": [...]
}
Error Response:
{
"error": "cluster parameter is required"
}
Streams a Kagenti agent conversation via SSE with automatic cluster context injection.
Request:
{
"agent": "my-agent",
"namespace": "default",
"message": "How many pods are running?",
"contextId": "optional-session-id"
}
Response: SSE stream with data: [text] events and data: [DONE] terminator.
To add a new tool to the Kagenti tool bridge:
Add a new handler method in pkg/api/handlers/kagenti_provider_proxy.go:
// handleGetDeployments implements the get_deployments tool
func (h *KagentiProviderProxyHandler) handleGetDeployments(c *fiber.Ctx, args map[string]any) error {
	cluster, ok := args["cluster"].(string)
	if !ok || cluster == "" {
		return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "cluster parameter is required"})
	}
	namespace := ""
	if ns, ok := args["namespace"].(string); ok {
		namespace = ns
	}
	ctx, cancel := context.WithTimeout(c.Context(), clusterContextTimeout)
	defer cancel()
	deployments, err := h.k8sClient.GetDeployments(ctx, cluster, namespace)
	if err != nil {
		slog.Error("get_deployments failed", "error", err, "cluster", cluster, "namespace", namespace)
		return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{"error": "failed to fetch deployments"})
	}
	return c.JSON(fiber.Map{
		"tool": "get_deployments",
		"result": deployments,
	})
}
Update the GetTools() method to include your new tool:
tools = append(tools, map[string]any{
	"name": "get_deployments",
	"description": "Returns a list of deployments in a specific cluster and namespace",
	"inputSchema": map[string]any{
		"type": "object",
		"properties": map[string]any{
			"cluster": map[string]any{
				"type": "string",
				"description": "Cluster name",
			},
			"namespace": map[string]any{
				"type": "string",
				"description": "Kubernetes namespace (leave empty for all namespaces)",
			},
		},
		"required": []string{"cluster"},
	},
})
Add a case to the CallToolDirect switch statement:
switch req.Tool {
case "get_cluster_list":
	return h.handleGetClusterList(c)
case "get_pod_list":
	return h.handleGetPodList(c, req.Args)
case "get_events":
	return h.handleGetEvents(c, req.Args)
case "get_deployments":
	return h.handleGetDeployments(c, req.Args)
default:
	return c.Status(fiber.StatusNotFound).JSON(fiber.Map{"error": "unknown tool"})
}
If your tool should be advertised in the cluster context block, update enrichMessageWithClusterContext():
contextBuilder.WriteString("You can use the following tools to query cluster state:\n")
contextBuilder.WriteString("- get_cluster_list: Returns detailed cluster information\n")
contextBuilder.WriteString("- get_pod_list(cluster, namespace): Returns pods in a namespace\n")
contextBuilder.WriteString("- get_events(cluster, namespace): Returns recent warning events\n")
contextBuilder.WriteString("- get_deployments(cluster, namespace): Returns deployments\n")
Here's a full example showing the entire flow for a hypothetical get_services tool:
// In pkg/api/handlers/kagenti_provider_proxy.go
// handleGetServices implements the get_services tool
func (h *KagentiProviderProxyHandler) handleGetServices(c *fiber.Ctx, args map[string]any) error {
	cluster, ok := args["cluster"].(string)
	if !ok || cluster == "" {
		return c.Status(fiber.StatusBadRequest).JSON(fiber.Map{"error": "cluster parameter is required"})
	}
	namespace := ""
	if ns, ok := args["namespace"].(string); ok {
		namespace = ns
	}
	ctx, cancel := context.WithTimeout(c.Context(), clusterContextTimeout)
	defer cancel()
	// Assuming GetServices exists in k8s.MultiClusterClient
	services, err := h.k8sClient.GetServices(ctx, cluster, namespace)
	if err != nil {
		slog.Error("get_services failed", "error", err, "cluster", cluster, "namespace", namespace)
		return c.Status(fiber.StatusInternalServerError).JSON(fiber.Map{"error": "failed to fetch services"})
	}
	return c.JSON(fiber.Map{
		"tool": "get_services",
		"result": services,
	})
}
// In GetTools(), add:
tools = append(tools, map[string]any{
	"name": "get_services",
	"description": "Returns a list of Kubernetes services in a specific cluster and namespace",
	"inputSchema": map[string]any{
		"type": "object",
		"properties": map[string]any{
			"cluster": map[string]any{
				"type": "string",
				"description": "Cluster name",
			},
			"namespace": map[string]any{
				"type": "string",
				"description": "Kubernetes namespace (leave empty for all namespaces)",
			},
		},
		"required": []string{"cluster"},
	},
})
// In CallToolDirect(), add:
case "get_services":
	return h.handleGetServices(c, req.Args)
- In-cluster kubeconfig: The console must have access to an in-cluster kubeconfig or a kubeconfig with credentials for the target clusters
- RBAC permissions: The console's ServiceAccount must have permissions to list clusters, pods, events, and any other resources exposed via tools
- Network reachability: The console backend must be able to reach the Kubernetes API servers
- Tool callback URL: The Kagenti agent must be configured to call tools back to the console endpoint (/api/kagenti-provider/tools/call-direct)
- Agent-to-agent communication: If using the Kagenti A2A protocol, agents must have network access to the console backend
Mission Control can reach Kagenti in two supported in-cluster layouts:
- Controller mode (default)
  - The console talks to the Kagenti controller and discovers agents from GET /api/kagenti-provider/agents
  - Missions require at least one registered agent in that discovery list
  - If the controller is reachable but returns zero agents, Mission Control cannot start a mission
- Direct-agent mode
  - The console skips controller discovery and targets one agent service directly
  - Use this when you want Missions to always talk to a specific agent
Helm values
kagenti:
  directAgentUrl: "http://my-agent.my-namespace.svc:8080"
  directAgentName: "my-agent"
  directAgentNamespace: "my-namespace"
Equivalent environment variables
KAGENTI_AGENT_URL=http://my-agent.my-namespace.svc:8080
KAGENTI_AGENT_NAME=my-agent
KAGENTI_AGENT_NAMESPACE=my-namespace
- kagenti.directAgentUrl / KAGENTI_AGENT_URL enables direct-agent mode
- kagenti.directAgentName / KAGENTI_AGENT_NAME controls the displayed agent name in the console
- kagenti.directAgentNamespace / KAGENTI_AGENT_NAMESPACE controls the displayed namespace in the console
- If you stay in controller mode, make sure at least one Kagenti agent is registered before opening Mission Control
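The mode selection described above boils down to one check: whether KAGENTI_AGENT_URL is set. The following Go sketch illustrates that rule. The kagentiTarget type and resolveTarget function are hypothetical, and the console's real startup logic may apply additional validation.

```go
package main

import (
	"fmt"
	"os"
)

// kagentiTarget describes how the console would reach Kagenti.
type kagentiTarget struct {
	Mode      string // "direct-agent" or "controller"
	URL       string
	Name      string
	Namespace string
}

// resolveTarget picks direct-agent mode when KAGENTI_AGENT_URL is set and
// controller mode otherwise. getenv is injected so the rule can be exercised
// without touching the process environment.
func resolveTarget(getenv func(string) string) kagentiTarget {
	url := getenv("KAGENTI_AGENT_URL")
	if url == "" {
		return kagentiTarget{Mode: "controller"}
	}
	return kagentiTarget{
		Mode:      "direct-agent",
		URL:       url,
		Name:      getenv("KAGENTI_AGENT_NAME"),
		Namespace: getenv("KAGENTI_AGENT_NAMESPACE"),
	}
}

func main() {
	fmt.Printf("%+v\n", resolveTarget(os.Getenv))
}
```

Only the URL decides the mode in this sketch; the name and namespace affect what the console displays, matching the descriptions of the three settings.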
- Cluster context timeout: Context gathering has a 10-second timeout per chat invocation. For clusters with slow API servers, context may be omitted
- Tool call latency: Tool calls are synchronous HTTP requests to Kubernetes APIs. Large result sets or slow clusters can cause delays
- No caching: Cluster context is fetched fresh on every chat invocation. Future versions may add caching
- Privilege escalation: Tools run with the console backend's ServiceAccount privileges, not the user's. Ensure proper RBAC is configured
- Data exposure: Cluster state is exposed to Kagenti agents. Only deploy agents you trust
- No audit trail: Tool invocations are logged but not stored in an audit log. Monitor console backend logs for suspicious activity
Symptom: Agent responds with "I don't have access to cluster data" even though clusters are configured.
Diagnosis:
- Check console backend logs for "failed to fetch cluster list for kagenti context"
- Verify console has a valid kubeconfig: kubectl config view (inside console pod)
- Test cluster connectivity: curl -k https://<api-server>/api/v1/namespaces
Solution: Ensure the console's kubeconfig is correctly mounted and has cluster access.
Symptom: POST /api/kagenti-provider/tools/call-direct returns 503.
Diagnosis:
- Check if h.k8sClient is nil (console started without a kubeconfig)
- Verify KUBECONFIG environment variable or in-cluster config
Solution: Provide a valid kubeconfig to the console backend.
Symptom: Tool invocations hang or return errors after 10 seconds.
Diagnosis:
- Check if the target cluster is reachable from the console pod
- Review cluster health: POST /api/kagenti-provider/tools/call-direct with {"tool": "get_cluster_list"}
- Check for slow API servers or large namespaces
Solution: Increase clusterContextTimeout in pkg/api/handlers/kagenti_provider_proxy.go or optimize cluster queries.
Symptom: Mission Control says Kagenti is available, but no agents are discovered.
Diagnostic path:
- Call GET /api/kagenti-provider/status
  - If available=false, fix connectivity first (controller URL, Service, RBAC, or direct-agent URL)
- Call GET /api/kagenti-provider/agents
  - If the response is {"agents":[]}, the console can reach Kagenti but no agent is registered for Mission Control to use
- Check which deployment mode you intended:
  - Controller mode: confirm at least one agent is registered with the Kagenti controller
  - Direct-agent mode: confirm kagenti.directAgentUrl, kagenti.directAgentName, and kagenti.directAgentNamespace (or the KAGENTI_AGENT_* env vars) are set on the console deployment
- If using Helm, inspect the rendered Deployment and verify the KAGENTI_AGENT_URL, KAGENTI_AGENT_NAME, and KAGENTI_AGENT_NAMESPACE environment variables are present when direct-agent mode is expected
Solution: Register at least one Kagenti agent in controller mode, or configure direct-agent mode so the console can synthesize a single reachable agent.
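The first two diagnostic steps can be expressed as a small decision function over the two responses. The Go sketch below is illustrative only: diagnose is a hypothetical helper, and the "available" JSON key for the status response is an assumption inferred from the available=false hint, since the full response schema is not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusResponse assumes the status endpoint returns a JSON object with an
// "available" boolean. The key name is an assumption for this sketch.
type statusResponse struct {
	Available bool `json:"available"`
}

// agentsResponse matches the documented {"agents":[]} shape. Entries are kept
// raw because their fields are not specified in this document.
type agentsResponse struct {
	Agents []json.RawMessage `json:"agents"`
}

// diagnose classifies the two responses following the diagnostic path:
// connectivity first, then agent registration.
func diagnose(statusJSON, agentsJSON []byte) (string, error) {
	var st statusResponse
	if err := json.Unmarshal(statusJSON, &st); err != nil {
		return "", fmt.Errorf("parse status: %w", err)
	}
	if !st.Available {
		return "unreachable: fix connectivity first", nil
	}
	var ag agentsResponse
	if err := json.Unmarshal(agentsJSON, &ag); err != nil {
		return "", fmt.Errorf("parse agents: %w", err)
	}
	if len(ag.Agents) == 0 {
		return "reachable, but no agent is registered", nil
	}
	return fmt.Sprintf("ok: %d agent(s) discovered", len(ag.Agents)), nil
}

func main() {
	verdict, err := diagnose([]byte(`{"available":true}`), []byte(`{"agents":[]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(verdict)
}
```

The byte slices would come from the bodies of GET /api/kagenti-provider/status and GET /api/kagenti-provider/agents; the sample input in main reproduces the symptom of an available controller with an empty agent list.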
Symptom: Agent receives the user's message but not the --- SYSTEM CONTEXT --- block.
Diagnosis:
- Check console backend logs for "failed to fetch cluster list for kagenti context"
- Verify DeduplicatedClusters() returns non-empty results
- Check if cluster query timed out (10s limit)
Solution: Fix cluster access issues or increase timeout.
Planned improvements for the Kagenti tool integration:
- Context caching: Cache cluster summary for 60 seconds to reduce API load
- Tool discovery: Agents can query GET /api/kagenti-provider/tools to discover available tools dynamically
- Streaming tools: Support long-running tool invocations (e.g., watching pod logs)
- User-scoped tools: Execute tools with the user's kubeconfig instead of the console's ServiceAccount
- Audit logging: Store tool invocations in a structured audit log for compliance
- MCP bridge: Expose console tools as MCP servers for standardized tool calling