feat(eval-hub): add eval-hub MCP server configuration and reconciliation #734

Draft
julpayne wants to merge 5 commits into main from feat/evalhub-mcp-support

Conversation

@julpayne
Collaborator

@julpayne julpayne commented May 13, 2026

This commit introduces the EvalHubMCPSpec and EvalHubMCPStatus types, which define the configuration and status of an optional MCP server deployment. It implements the reconciliation logic for the MCP server, covering creation and management of its ConfigMap, Deployment, Service, and Route. It also updates the EvalHub CRD and the sample configurations to support the MCP deployment options.

Summary by CodeRabbit

Release Notes

  • New Features
    • EvalHub now supports optional MCP (Model Context Protocol) server deployment and management within the resource configuration.
    • Configure MCP with replicas, transport protocol, custom image, authentication secrets, environment variables, and resource requirements.
    • Track MCP server status including readiness, phase, and access URL.
    • Sample configuration provided demonstrating MCP setup.
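
As a rough illustration of the options listed above, the MCP block might be modelled along these lines. This is only a sketch: apart from the type names EvalHubMCPSpec and EvalHubMCPStatus and the IsMCPEnabled method, which appear in this PR, every field name is a guess, the resource requirements field is left out, and treating a nil MCP block as "disabled" is an assumption. The real definitions are in api/evalhub/v1alpha1/evalhub_types.go.

```go
package main

import "fmt"

// EvalHubMCPSpec is a simplified, hypothetical view of the optional MCP block.
// Field names are illustrative and may not match the real CRD.
type EvalHubMCPSpec struct {
	Replicas   *int32            // nil is assumed to mean "use the operator default"
	Transport  string            // for example "http" or "http-sse"
	Image      string            // optional custom image
	AuthSecret string            // name of a Secret holding credentials (assumed)
	Env        map[string]string // extra environment variables
}

// EvalHubMCPStatus mirrors the status items named in the summary:
// readiness, phase, and access URL.
type EvalHubMCPStatus struct {
	Ready bool
	Phase string
	URL   string
}

// EvalHubSpec holds only the part of the spec relevant to this sketch.
type EvalHubSpec struct {
	MCP *EvalHubMCPSpec
}

// IsMCPEnabled reports whether an MCP block was provided.
// Assumption: leaving the block out disables MCP.
func (s *EvalHubSpec) IsMCPEnabled() bool {
	return s != nil && s.MCP != nil
}

func main() {
	replicas := int32(2)
	spec := EvalHubSpec{MCP: &EvalHubMCPSpec{Replicas: &replicas, Transport: "http"}}
	status := EvalHubMCPStatus{}
	fmt.Println(spec.IsMCPEnabled(), status.Ready)
}
```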

julpayne added 2 commits May 13, 2026 16:12
This commit introduces the EvalHubMCPSpec and EvalHubMCPStatus types to define the configuration and status of an optional MCP server deployment. It includes the implementation of reconciliation logic for the MCP server, including the creation and management of ConfigMaps, Deployments, Services, and Routes. Additionally, it updates the EvalHub CRD and sample configurations to support MCP deployment options, enhancing the overall functionality of the EvalHub component.
This commit refines the MCP handling in the EvalHub component by updating the GetMCPReplicas method to ensure proper defaults and encapsulation. It also modifies the RBAC configuration to include necessary permissions for managing jobs, namespaces, and pods, while removing redundant rules. Additionally, it updates the sample configuration to clarify database usage for different replica scenarios and improves logging for MCP deployment status updates. Unit tests are added to validate the new MCP configuration logic.
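
The defaulting behaviour described for GetMCPReplicas could look roughly like the sketch below. The receiver type, the field names, and the default of 1 are all assumptions made for illustration; only the method name comes from the commit message.

```go
package main

import "fmt"

// defaultMCPReplicas is an assumed default; the operator's real value may differ.
const defaultMCPReplicas int32 = 1

// Minimal hypothetical stand-ins for the API types.
type EvalHubMCPSpec struct {
	Replicas *int32
}

type EvalHubSpec struct {
	MCP *EvalHubMCPSpec
}

// GetMCPReplicas returns the configured replica count and falls back to the
// default when the spec, the MCP block, or the Replicas field is unset.
// Callers never have to dereference the pointer themselves.
func (s *EvalHubSpec) GetMCPReplicas() int32 {
	if s == nil || s.MCP == nil || s.MCP.Replicas == nil {
		return defaultMCPReplicas
	}
	return *s.MCP.Replicas
}

func main() {
	three := int32(3)
	explicit := &EvalHubSpec{MCP: &EvalHubMCPSpec{Replicas: &three}}
	unset := &EvalHubSpec{MCP: &EvalHubMCPSpec{}}
	fmt.Println(explicit.GetMCPReplicas(), unset.GetMCPReplicas())
}
```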
@openshift-ci

openshift-ci Bot commented May 13, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci Bot commented May 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Contributor

coderabbitai Bot commented May 13, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5305d6c1-c0bc-4af1-be78-7983fdac49ca

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

@julpayne
Collaborator Author

/test all

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
controllers/evalhub/mcp_deployment.go (1)

197-198: ⚡ Quick win

Use mcpConfigMapName(instance) for the mounted ConfigMap reference.

Line 197 rebuilds the ConfigMap name manually. Reusing the helper avoids future name drift between reconcilers.

Suggested patch
 				ConfigMap: &corev1.ConfigMapVolumeSource{
 					LocalObjectReference: corev1.LocalObjectReference{
-						Name: mcpDeploymentName(instance) + "-config",
+						Name: mcpConfigMapName(instance),
 					},
 				},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controllers/evalhub/mcp_deployment.go` around lines 197 - 198, The ConfigMap
reference is being built manually using mcpDeploymentName(instance) +
"-config"—replace that with the helper mcpConfigMapName(instance) so the mounted
ConfigMap name stays in sync; locate the place setting Name:
mcpDeploymentName(instance) + "-config" (the ConfigMap volume/ConfigMapRef in
the Deployment spec) and change it to Name: mcpConfigMapName(instance).
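
To make the name-drift concern concrete, here is a minimal sketch of the two helpers. Their bodies are assumptions: the "-mcp" suffix is invented, and deriving the ConfigMap name as the Deployment name plus "-config" is inferred from the line this comment flags. The point is only that every caller goes through mcpConfigMapName, so a future rename happens in one place.

```go
package main

import "fmt"

// EvalHub is a minimal stand-in for the custom resource.
type EvalHub struct {
	Name string
}

// mcpDeploymentName derives the Deployment name from the instance.
// The "-mcp" suffix is a hypothetical choice for this sketch.
func mcpDeploymentName(instance *EvalHub) string {
	return instance.Name + "-mcp"
}

// mcpConfigMapName is the single source of truth for the ConfigMap name.
// Both the ConfigMap reconciler and the Deployment volume should call it.
func mcpConfigMapName(instance *EvalHub) string {
	return mcpDeploymentName(instance) + "-config"
}

func main() {
	hub := &EvalHub{Name: "evalhub"}
	// The volume reference reuses the helper instead of rebuilding the string.
	fmt.Println(mcpConfigMapName(hub)) // prints "evalhub-mcp-config"
}
```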
controllers/evalhub/evalhub_controller.go (1)

248-265: ⚡ Quick win

Emit Kubernetes events for MCP reconciliation failures.

Lines 248-265 add new fatal MCP failure exits but don’t emit events, which makes failures harder to diagnose from cluster events.

Suggested patch
 if err := r.reconcileMCPConfigMap(ctx, instance); err != nil {
 	log.Error(err, "Failed to reconcile MCP ConfigMap")
 	instance.SetStatus("Ready", "Error", fmt.Sprintf("Failed to reconcile MCP ConfigMap: %v", err), corev1.ConditionFalse)
+	r.EventRecorder.Event(instance, corev1.EventTypeWarning, "MCPConfigMapReconcileFailed", err.Error())
 	r.Status().Update(ctx, instance)
 	return RequeueWithError(err)
 }
 if err := r.reconcileMCPDeployment(ctx, instance); err != nil {
 	log.Error(err, "Failed to reconcile MCP Deployment")
 	instance.SetStatus("Ready", "Error", fmt.Sprintf("Failed to reconcile MCP Deployment: %v", err), corev1.ConditionFalse)
+	r.EventRecorder.Event(instance, corev1.EventTypeWarning, "MCPDeploymentReconcileFailed", err.Error())
 	r.Status().Update(ctx, instance)
 	return RequeueWithError(err)
 }
 if err := r.reconcileMCPService(ctx, instance); err != nil {
 	log.Error(err, "Failed to reconcile MCP Service")
 	instance.SetStatus("Ready", "Error", fmt.Sprintf("Failed to reconcile MCP Service: %v", err), corev1.ConditionFalse)
+	r.EventRecorder.Event(instance, corev1.EventTypeWarning, "MCPServiceReconcileFailed", err.Error())
 	r.Status().Update(ctx, instance)
 	return RequeueWithError(err)
 }

As per the coding guidelines, record events via recorder.Event(instance, EventType, reason, message) to emit Kubernetes events from reconcilers.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controllers/evalhub/evalhub_controller.go` around lines 248 - 265, The three
reconciliation failure branches (reconcileMCPConfigMap, reconcileMCPDeployment,
reconcileMCPService) log the error and set instance status but do not emit
Kubernetes events; update each failure branch to call the controller
recorder.Event(instance, corev1.EventTypeWarning, "<Reason>",
fmt.Sprintf("Failed to reconcile MCP %s: %v", "<Resource>", err)) (use distinct
reasons like "MCPConfigMapReconcileFailed", "MCPDeploymentReconcileFailed",
"MCPServiceReconcileFailed" and replace "<Resource>" accordingly) before calling
instance.SetStatus, r.Status().Update(ctx, instance) and return
RequeueWithError(err), so cluster events surface these failures alongside
existing logs and status updates.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@controllers/evalhub/evalhub_controller.go`:
- Around line 550-557: The MCP status block currently only sets
instance.Status.MCP when it is non-nil; change it so that whenever
instance.Spec.IsMCPEnabled() returns false you unconditionally set
instance.Status.MCP = &evalhubv1alpha1.EvalHubMCPStatus{Phase: "Disabled",
Ready: false} (i.e. remove the nil guard) so new resources also get an explicit
Disabled status; update the branch that returns after the check in the
controller handling where instance.Spec.IsMCPEnabled() is evaluated (refer to
instance.Spec.IsMCPEnabled(), instance.Status.MCP and
evalhubv1alpha1.EvalHubMCPStatus).

In `@controllers/evalhub/mcp_service.go`:
- Around line 50-70: The create branch currently sets up the Service object but
then falls through to calling r.Update, causing creation to fail; change the
logic so that when errors.IsNotFound(getErr) is true you call r.Create(ctx,
service) (after controllerutil.SetControllerReference(instance, service,
r.Scheme)) and return immediately on success or error, otherwise (existing
object) proceed to update via r.Update; reference the Service object named
service, desiredSpec, tlsSecretName, controllerutil.SetControllerReference,
r.Create and r.Update to locate and modify the code.

In `@controllers/evalhub/unit_test.go`:
- Around line 451-479: Convert the TestGenerateMCPConfigData testify-style unit
test into a Ginkgo v2 spec using Gomega matchers and the controller test suite
(envtest); replace func TestGenerateMCPConfigData with a
Describe("GenerateMCPConfigData", ...) and It blocks, use
Expect(err).ToNot(HaveOccurred()) and
Expect(data).To(HaveKey(mcpConfigFileName))/Expect(cfg.Transport).To(Equal("http-sse"))
or "http" as appropriate, create the EvalHub objects with
evalHubv1alpha1.EvalHub and call r.generateMCPConfigData(EvalHub) inside the
spec, and ensure the spec runs under the project's BeforeSuite/AfterSuite
envtest setup (or add it to the existing controller test suite) so
controller-runtime envtest and Gomega are used instead of testing/testify;
reference TestGenerateMCPConfigData, generateMCPConfigData, EvalHubReconciler,
MCPConfig and mcpConfigFileName when making the changes.
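
The create-versus-update split requested for mcp_service.go in the inline comments above can be sketched as follows. The client interface, fakeClient, Service type, and errNotFound sentinel are hypothetical stand-ins so the example runs on the standard library alone; the real reconciler would use the controller-runtime client and the IsNotFound check from k8s.io/apimachinery/pkg/api/errors, and would set the controller reference before creating.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound API error.
var errNotFound = errors.New("not found")

// Service is a minimal stand-in for corev1.Service.
type Service struct {
	Name string
	Port int32
}

// client is a hypothetical, narrowed version of a Kubernetes client.
type client interface {
	Get(name string) (*Service, error)
	Create(svc *Service) error
	Update(svc *Service) error
}

// fakeClient stores objects in memory and records which calls were made.
type fakeClient struct {
	store map[string]*Service
	calls []string
}

func (c *fakeClient) Get(name string) (*Service, error) {
	svc, ok := c.store[name]
	if !ok {
		return nil, errNotFound
	}
	found := *svc
	return &found, nil
}

func (c *fakeClient) Create(svc *Service) error {
	c.calls = append(c.calls, "create")
	c.store[svc.Name] = svc
	return nil
}

func (c *fakeClient) Update(svc *Service) error {
	c.calls = append(c.calls, "update")
	if _, ok := c.store[svc.Name]; !ok {
		return errNotFound // updating a missing object fails, as in the reported bug
	}
	c.store[svc.Name] = svc
	return nil
}

// reconcileService creates the Service when it is missing and returns
// immediately; only an existing Service reaches the update path.
func reconcileService(c client, desired *Service) error {
	existing, err := c.Get(desired.Name)
	if errors.Is(err, errNotFound) {
		return c.Create(desired)
	}
	if err != nil {
		return err
	}
	existing.Port = desired.Port
	return c.Update(existing)
}

func main() {
	c := &fakeClient{store: map[string]*Service{}}
	_ = reconcileService(c, &Service{Name: "evalhub-mcp", Port: 8080})
	_ = reconcileService(c, &Service{Name: "evalhub-mcp", Port: 9090})
	fmt.Println(c.calls) // prints "[create update]"
}
```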

---

Nitpick comments:
In `@controllers/evalhub/evalhub_controller.go`:
- Around line 248-265: The three reconciliation failure branches
(reconcileMCPConfigMap, reconcileMCPDeployment, reconcileMCPService) log the
error and set instance status but do not emit Kubernetes events; update each
failure branch to call the controller recorder.Event(instance,
corev1.EventTypeWarning, "<Reason>", fmt.Sprintf("Failed to reconcile MCP %s:
%v", "<Resource>", err)) (use distinct reasons like
"MCPConfigMapReconcileFailed", "MCPDeploymentReconcileFailed",
"MCPServiceReconcileFailed" and replace "<Resource>" accordingly) before calling
instance.SetStatus, r.Status().Update(ctx, instance) and return
RequeueWithError(err), so cluster events surface these failures alongside
existing logs and status updates.

In `@controllers/evalhub/mcp_deployment.go`:
- Around line 197-198: The ConfigMap reference is being built manually using
mcpDeploymentName(instance) + "-config"—replace that with the helper
mcpConfigMapName(instance) so the mounted ConfigMap name stays in sync; locate
the place setting Name: mcpDeploymentName(instance) + "-config" (the ConfigMap
volume/ConfigMapRef in the Deployment spec) and change it to Name:
mcpConfigMapName(instance).
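
For the MCP status comment on evalhub_controller.go (lines 550-557) listed among the inline comments, the requested behaviour amounts to dropping the nil guard. The sketch below uses simplified stand-in types and a hypothetical helper name, applyDisabledMCPStatus; only the "Disabled" phase and the Ready: false value come from the review text.

```go
package main

import "fmt"

// Simplified stand-ins for the status types in this PR.
type EvalHubMCPStatus struct {
	Phase string
	Ready bool
	URL   string
}

type EvalHubStatus struct {
	MCP *EvalHubMCPStatus
}

// applyDisabledMCPStatus writes an explicit Disabled status whenever MCP is
// not enabled, whether or not a previous status existed. It reports whether
// the disabled branch was taken so the caller can return early.
func applyDisabledMCPStatus(status *EvalHubStatus, enabled bool) bool {
	if enabled {
		return false
	}
	// No nil check here: new resources get the same explicit status.
	status.MCP = &EvalHubMCPStatus{Phase: "Disabled", Ready: false}
	return true
}

func main() {
	fresh := &EvalHubStatus{} // a new resource with no MCP status yet
	applyDisabledMCPStatus(fresh, false)
	fmt.Println(fresh.MCP.Phase) // prints "Disabled"
}
```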

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e102a94e-6b58-43be-ad21-ccafbbeb7063

📥 Commits

Reviewing files that changed from the base of the PR and between e80b92e and 76a82bb.

📒 Files selected for processing (12)
  • api/evalhub/v1alpha1/evalhub_types.go
  • api/evalhub/v1alpha1/zz_generated.deepcopy.go
  • config/components/evalhub/crd/trustyai.opendatahub.io_evalhubs.yaml
  • config/samples/evalhub_v1alpha1_evalhub_with_mcp.yaml
  • controllers/evalhub/constants.go
  • controllers/evalhub/evalhub_controller.go
  • controllers/evalhub/evaluation_job_failure_reconciler.go
  • controllers/evalhub/mcp_configmap.go
  • controllers/evalhub/mcp_deployment.go
  • controllers/evalhub/mcp_route.go
  • controllers/evalhub/mcp_service.go
  • controllers/evalhub/unit_test.go

Comment thread controllers/evalhub/evalhub_controller.go
Comment thread controllers/evalhub/mcp_service.go
Comment thread controllers/evalhub/unit_test.go Outdated
julpayne and others added 3 commits May 13, 2026 18:04
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>