feat(eval-hub): add eval-hub MCP server configuration and reconciliation#734
feat(eval-hub): add eval-hub MCP server configuration and reconciliation#734julpayne wants to merge 5 commits into
Conversation
This commit introduces the EvalHubMCPSpec and EvalHubMCPStatus types to define the configuration and status of an optional MCP server deployment. It includes the implementation of reconciliation logic for the MCP server, including the creation and management of ConfigMaps, Deployments, Services, and Routes. Additionally, it updates the EvalHub CRD and sample configurations to support MCP deployment options, enhancing the overall functionality of the EvalHub component.
This commit refines the MCP handling in the EvalHub component by updating the GetMCPReplicas method to ensure proper defaults and encapsulation. It also modifies the RBAC configuration to include necessary permissions for managing jobs, namespaces, and pods, while removing redundant rules. Additionally, it updates the sample configuration to clarify database usage for different replica scenarios and improves logging for MCP deployment status updates. Unit tests are added to validate the new MCP configuration logic.
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
/test all |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
controllers/evalhub/mcp_deployment.go (1)
197-198: ⚡ Quick winUse
mcpConfigMapName(instance)for the mounted ConfigMap reference.Line 197 rebuilds the ConfigMap name manually. Reusing the helper avoids future name drift between reconcilers.
Suggested patch
ConfigMap: &corev1.ConfigMapVolumeSource{ LocalObjectReference: corev1.LocalObjectReference{ - Name: mcpDeploymentName(instance) + "-config", + Name: mcpConfigMapName(instance), }, },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controllers/evalhub/mcp_deployment.go` around lines 197 - 198, The ConfigMap reference is being built manually using mcpDeploymentName(instance) + "-config"—replace that with the helper mcpConfigMapName(instance) so the mounted ConfigMap name stays in sync; locate the place setting Name: mcpDeploymentName(instance) + "-config" (the ConfigMap volume/ConfigMapRef in the Deployment spec) and change it to Name: mcpConfigMapName(instance).controllers/evalhub/evalhub_controller.go (1)
248-265: ⚡ Quick winEmit Kubernetes events for MCP reconciliation failures.
Lines 248-265 add new fatal MCP failure exits but don’t emit events, which makes failures harder to diagnose from cluster events.
Suggested patch
if err := r.reconcileMCPConfigMap(ctx, instance); err != nil { log.Error(err, "Failed to reconcile MCP ConfigMap") instance.SetStatus("Ready", "Error", fmt.Sprintf("Failed to reconcile MCP ConfigMap: %v", err), corev1.ConditionFalse) + r.EventRecorder.Event(instance, corev1.EventTypeWarning, "MCPConfigMapReconcileFailed", err.Error()) r.Status().Update(ctx, instance) return RequeueWithError(err) } if err := r.reconcileMCPDeployment(ctx, instance); err != nil { log.Error(err, "Failed to reconcile MCP Deployment") instance.SetStatus("Ready", "Error", fmt.Sprintf("Failed to reconcile MCP Deployment: %v", err), corev1.ConditionFalse) + r.EventRecorder.Event(instance, corev1.EventTypeWarning, "MCPDeploymentReconcileFailed", err.Error()) r.Status().Update(ctx, instance) return RequeueWithError(err) } if err := r.reconcileMCPService(ctx, instance); err != nil { log.Error(err, "Failed to reconcile MCP Service") instance.SetStatus("Ready", "Error", fmt.Sprintf("Failed to reconcile MCP Service: %v", err), corev1.ConditionFalse) + r.EventRecorder.Event(instance, corev1.EventTypeWarning, "MCPServiceReconcileFailed", err.Error()) r.Status().Update(ctx, instance) return RequeueWithError(err) }As per coding guidelines,
Record events via recorder.Event(instance, EventType, reason, message) to emit Kubernetes events from reconcilers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controllers/evalhub/evalhub_controller.go` around lines 248 - 265, The three reconciliation failure branches (reconcileMCPConfigMap, reconcileMCPDeployment, reconcileMCPService) log the error and set instance status but do not emit Kubernetes events; update each failure branch to call the controller recorder.Event(instance, corev1.EventTypeWarning, "<Reason>", fmt.Sprintf("Failed to reconcile MCP %s: %v", "<Resource>", err)) (use distinct reasons like "MCPConfigMapReconcileFailed", "MCPDeploymentReconcileFailed", "MCPServiceReconcileFailed" and replace "<Resource>" accordingly) before calling instance.SetStatus, r.Status().Update(ctx, instance) and return RequeueWithError(err), so cluster events surface these failures alongside existing logs and status updates.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controllers/evalhub/evalhub_controller.go`:
- Around line 550-557: The MCP status block currently only sets
instance.Status.MCP when it is non-nil; change it so that whenever
instance.Spec.IsMCPEnabled() returns false you unconditionally set
instance.Status.MCP = &evalhubv1alpha1.EvalHubMCPStatus{Phase: "Disabled",
Ready: false} (i.e. remove the nil guard) so new resources also get an explicit
Disabled status; update the branch that returns after the check in the
controller handling where instance.Spec.IsMCPEnabled() is evaluated (refer to
instance.Spec.IsMCPEnabled(), instance.Status.MCP and
evalhubv1alpha1.EvalHubMCPStatus).
In `@controllers/evalhub/mcp_service.go`:
- Around line 50-70: The create branch currently sets up the Service object but
then falls through to calling r.Update, causing creation to fail; change the
logic so that when errors.IsNotFound(getErr) is true you call r.Create(ctx,
service) (after controllerutil.SetControllerReference(instance, service,
r.Scheme)) and return immediately on success or error, otherwise (existing
object) proceed to update via r.Update; reference the Service object named
service, desiredSpec, tlsSecretName, controllerutil.SetControllerReference,
r.Create and r.Update to locate and modify the code.
In `@controllers/evalhub/unit_test.go`:
- Around line 451-479: Convert the TestGenerateMCPConfigData testify-style unit
test into a Ginkgo v2 spec using Gomega matchers and the controller test suite
(envtest); replace func TestGenerateMCPConfigData with a
Describe("GenerateMCPConfigData", ...) and It blocks, use
Expect(err).ToNot(HaveOccurred()) and
Expect(data).To(HaveKey(mcpConfigFileName))/Expect(cfg.Transport).To(Equal("http-sse"))
or "http" as appropriate, create the EvalHub objects with
evalHubv1alpha1.EvalHub and call r.generateMCPConfigData(EvalHub) inside the
spec, and ensure the spec runs under the project's BeforeSuite/AfterSuite
envtest setup (or add it to the existing controller test suite) so
controller-runtime envtest and Gomega are used instead of testing/testify;
reference TestGenerateMCPConfigData, generateMCPConfigData, EvalHubReconciler,
MCPConfig and mcpConfigFileName when making the changes.
---
Nitpick comments:
In `@controllers/evalhub/evalhub_controller.go`:
- Around line 248-265: The three reconciliation failure branches
(reconcileMCPConfigMap, reconcileMCPDeployment, reconcileMCPService) log the
error and set instance status but do not emit Kubernetes events; update each
failure branch to call the controller recorder.Event(instance,
corev1.EventTypeWarning, "<Reason>", fmt.Sprintf("Failed to reconcile MCP %s:
%v", "<Resource>", err)) (use distinct reasons like
"MCPConfigMapReconcileFailed", "MCPDeploymentReconcileFailed",
"MCPServiceReconcileFailed" and replace "<Resource>" accordingly) before calling
instance.SetStatus, r.Status().Update(ctx, instance) and return
RequeueWithError(err), so cluster events surface these failures alongside
existing logs and status updates.
In `@controllers/evalhub/mcp_deployment.go`:
- Around line 197-198: The ConfigMap reference is being built manually using
mcpDeploymentName(instance) + "-config"—replace that with the helper
mcpConfigMapName(instance) so the mounted ConfigMap name stays in sync; locate
the place setting Name: mcpDeploymentName(instance) + "-config" (the ConfigMap
volume/ConfigMapRef in the Deployment spec) and change it to Name:
mcpConfigMapName(instance).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e102a94e-6b58-43be-ad21-ccafbbeb7063
📒 Files selected for processing (12)
api/evalhub/v1alpha1/evalhub_types.goapi/evalhub/v1alpha1/zz_generated.deepcopy.goconfig/components/evalhub/crd/trustyai.opendatahub.io_evalhubs.yamlconfig/samples/evalhub_v1alpha1_evalhub_with_mcp.yamlcontrollers/evalhub/constants.gocontrollers/evalhub/evalhub_controller.gocontrollers/evalhub/evaluation_job_failure_reconciler.gocontrollers/evalhub/mcp_configmap.gocontrollers/evalhub/mcp_deployment.gocontrollers/evalhub/mcp_route.gocontrollers/evalhub/mcp_service.gocontrollers/evalhub/unit_test.go
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
This commit introduces the EvalHubMCPSpec and EvalHubMCPStatus types to define the configuration and status of an optional MCP server deployment. It includes the implementation of reconciliation logic for the MCP server, including the creation and management of ConfigMaps, Deployments, Services, and Routes. Additionally, it updates the EvalHub CRD and sample configurations to support MCP deployment options, enhancing the overall functionality of the EvalHub component.
Summary by CodeRabbit
Release Notes