-
Notifications
You must be signed in to change notification settings - Fork 53
Fix e2e test failures caused by Contour Helm pre-install hook timeout #320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: danielgerlag <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot I don't think that is the real issue. It looks like the contour install is timing out. Can we increase the time out?
Co-authored-by: danielgerlag <[email protected]>
You're right - I've increased the Helm install timeout from 300 seconds to 600 seconds (10 minutes) in commit 52bab19. This should allow more time for the Contour installation to complete in CI environments. |
@copilot it is still failing, even with a 10 minute time out. There must be a deeper issue. |
Co-authored-by: danielgerlag <[email protected]>
You're right - the deeper issue was that with |
@copilot that did not solve it, it is still failing for the same reason. |
Could you share the specific error message or logs from the failure? I need to see what's actually timing out or failing to diagnose the root cause properly. |
Co-authored-by: danielgerlag <[email protected]>
Co-authored-by: danielgerlag <[email protected]>
I've analyzed the logs. The issue is that the Bitnami Contour chart has pre-install hooks that timeout after ~5 minutes. I've added Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
cli/sdk/ingress/contour_installer.go
Outdated
output.AddTask("Contour-Check", "Checking for Contour...") | ||
|
||
// First check if the namespace exists | ||
_, err := ci.kubeClient.CoreV1().Namespaces().Get(context.TODO(), "projectcontour", metav1.GetOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot I think this function is poorly named. "checkContourInstallation" is not checking the installation, it is ensuring it. Please rename this function to "ensureContourInstallation", and remove this namespace check, as it does not make sense for the true purpose of this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to ensureContourInstallation
and removed the explicit namespace check in commit 2e82caa. Now handles namespace not found errors properly using errors.IsNotFound()
in the pod list error handling instead.
…ove error handling Co-authored-by: danielgerlag <[email protected]>
Problem
The end-to-end tests were failing due to recent changes that install Contour in the test environment. Through iterative investigation of CI logs, the root causes were identified:
ensureContourInstallation
function (originally namedcheckContourInstallation
) incli/sdk/ingress/contour_installer.go
was not properly handling the case when theprojectcontour
namespace doesn't existWhen the namespace doesn't exist (the normal case before Contour is installed for the first time), the Kubernetes API returns an error when listing pods. This error was being propagated as a failure rather than being treated as "not installed."
The installation had
installClient.Wait = !localCluster
, which meantWait = false
for local clusters (like kind in e2e tests). WithWait = false
, Helm just applies manifests and returns immediately without waiting for pods to be ready.Most critically, analysis of CI logs revealed that the Bitnami Contour chart's pre-install hooks were timing out independently of the main installation timeout, preventing the chart from being installed at all.
Solution
checkContourInstallation
toensureContourInstallation
to better reflect its purposeerrors.IsNotFound()
and treat it as "not installed" rather than an errorinstallClient.Wait = true
for all clusters. When using NodePort (as we do for local clusters), Helm won't wait for LoadBalancer external IPs but will still wait for pods to be readyinstallClient.WaitForJobs = true
to ensure any init/setup jobs complete before the installation is considered successfulinstallClient.DisableHooks = true
to bypass the pre-install hooks that were causing timeouts. This is safe because Contour's CRDs are included in the main chart templates, not in hooksChanges
Testing
drasi ingress init --local-cluster
can proceed normally in fresh environmentsThese changes address the root causes without affecting any other functionality. The DisableHooks option is safe for Contour as the CRDs are included in the chart templates rather than hooks.
Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.