Simplify install-konveyor.sh and improve debug output handling #512

@fabianvf

Description

The current hack/install-konveyor.sh script has several issues that make debugging difficult and can cause problems in CI:

  1. Browser tab freezing - The script outputs massive amounts of YAML to stdout, which can freeze the GitHub Actions web viewer
  2. Mixed concerns - The script handles OLM installation, operator deployment, CR creation, and debug output all in one place
  3. Duplicated debug code - The debug() function has separate branches for CI vs non-CI that largely duplicate the same commands
  4. Inconsistent waiting patterns - Mix of timeout, kubectl wait, and bash loops
  5. Verbose output - Full YAML dumps make logs hard to read on failure

Proposed changes:

  1. Always save debug output to files instead of stdout. Save to /tmp/konveyor-debug/:

    • namespace.yaml
    • all-resources.yaml
    • operator-resources.yaml
    • tackle.yaml
    • pod-{name}-describe.txt
    • pod-{name}-logs.txt
    • events.txt
  2. The install-konveyor action already uploads artifacts on failure. Make sure the upload step covers the whole debug directory:

    • name: konveyor-debug-logs-${{ github.run_id }}-${{ github.run_attempt }}
    • path: /tmp/konveyor-debug/
    • retention-days: 7
  3. Replace complex bash loops with kubectl wait. Instead of:

    timeout 2m bash -c "until kubectl --namespace ${NAMESPACE} get deployment/tackle-operator; do sleep 10; done"
    

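    Use the following once the deployment exists (kubectl wait returns an error immediately if the object has not been created yet, so a short existence check may still be needed first):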

    kubectl wait --for=condition=available deployment/tackle-operator -n "${NAMESPACE}" --timeout=2m
    
  4. Remove full YAML dumps and show status instead:

    • Line 77: Remove kubectl get csv -n "${NAMESPACE}" -o yaml
    • Line 111: Remove full tackle CR yaml dump
    • Line 128: Remove full deployments yaml dump

    Instead, print a one-line status: kubectl get tackles.tackle.konveyor.io/tackle -n "${NAMESPACE}" --no-headers
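
A rough sketch of how proposals 1 and 3 might look in the script. The helper names (save_debug, collect_debug, wait_for_deployment), the DEBUG_DIR variable, and the default namespace value are assumptions for illustration, not code from the current script. operator-resources.yaml is left out because its exact contents are not specified above. The wait helper keeps a small existence poll because, as far as I know, kubectl wait --for=condition=... fails right away when the object does not exist yet.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, DEBUG_DIR, and the default namespace are assumptions.
NAMESPACE="${NAMESPACE:-konveyor-tackle}"
DEBUG_DIR="${DEBUG_DIR:-/tmp/konveyor-debug}"

# Run a command and write its combined output to a file under DEBUG_DIR
# instead of stdout. Always succeeds, so debug collection cannot mask the
# original failure.
save_debug() {
  local outfile="$1"
  shift
  mkdir -p "${DEBUG_DIR}"
  "$@" > "${DEBUG_DIR}/${outfile}" 2>&1 || true
}

# Capture the files listed in proposal 1 (except operator-resources.yaml).
collect_debug() {
  local pod name
  save_debug namespace.yaml kubectl get namespace "${NAMESPACE}" -o yaml
  save_debug all-resources.yaml kubectl get all -n "${NAMESPACE}" -o yaml
  save_debug tackle.yaml kubectl get tackles.tackle.konveyor.io -n "${NAMESPACE}" -o yaml
  save_debug events.txt kubectl get events -n "${NAMESPACE}" --sort-by=.lastTimestamp
  # "kubectl get pods -o name" prints entries such as "pod/tackle-hub-abc".
  for pod in $(kubectl get pods -n "${NAMESPACE}" -o name 2>/dev/null); do
    name="${pod#pod/}"
    save_debug "pod-${name}-describe.txt" kubectl describe -n "${NAMESPACE}" "${pod}"
    save_debug "pod-${name}-logs.txt" kubectl logs -n "${NAMESPACE}" "${pod}" --all-containers
  done
  echo "Debug output saved to ${DEBUG_DIR}"
}

# Wait for a deployment to be created, then for it to become Available.
# Arguments: name, total timeout in seconds (default 120), poll interval (default 5).
wait_for_deployment() {
  local name="$1"
  local timeout_secs="${2:-120}"
  local interval="${3:-5}"
  local deadline remaining
  deadline=$(( $(date +%s) + timeout_secs ))
  until kubectl get "deployment/${name}" -n "${NAMESPACE}" >/dev/null 2>&1; do
    if [ "$(date +%s)" -ge "${deadline}" ]; then
      echo "Timed out waiting for deployment/${name} to be created" >&2
      return 1
    fi
    sleep "${interval}"
  done
  # Spend whatever is left of the budget waiting for the Available condition.
  remaining=$(( deadline - $(date +%s) ))
  [ "${remaining}" -gt 0 ] || remaining=1
  kubectl wait --for=condition=available "deployment/${name}" \
    -n "${NAMESPACE}" --timeout="${remaining}s"
}
```

A call site could then read `wait_for_deployment tackle-operator 120 || { collect_debug; exit 1; }`, which keeps the CI log down to a few status lines while the full dumps end up in the uploaded artifact.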

Benefits:

  • No frozen browser tabs
  • Downloadable debug artifacts for offline analysis
  • Cleaner CI logs with status messages
  • Backwards compatible
  • Organized debug file structure

Notes:

  • Many repos depend on this script via the install-konveyor action, so changes must be backwards compatible
  • Consider adding a VERBOSE flag for users who want the old behavior
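
One possible shape for the VERBOSE flag, as a sketch only. The function name dump_or_save and the DEBUG_DIR variable are hypothetical, and the flag is assumed to default to false so that CI logs stay quiet unless a caller opts in.

```shell
#!/usr/bin/env bash
# Sketch only: dump_or_save and DEBUG_DIR are hypothetical names.
VERBOSE="${VERBOSE:-false}"
DEBUG_DIR="${DEBUG_DIR:-/tmp/konveyor-debug}"

# With VERBOSE=true, run the command and let its output go to stdout
# (the old behavior). Otherwise write the output to a file under
# DEBUG_DIR and print a one-line pointer to it.
dump_or_save() {
  local outfile="$1"
  shift
  if [ "${VERBOSE}" = "true" ]; then
    "$@"
  else
    mkdir -p "${DEBUG_DIR}"
    "$@" > "${DEBUG_DIR}/${outfile}" 2>&1 || true
    echo "Saved to ${DEBUG_DIR}/${outfile}"
  fi
}
```

For example, `dump_or_save csv.yaml kubectl get csv -n "${NAMESPACE}" -o yaml` would print the full YAML only when the caller sets VERBOSE=true.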

Questions:

  1. Should we keep the current parameter handling for the Tackle CR fields (KAI_* variables)?
  2. Are there other specific debug outputs we should capture (PVC status, ingress details)?
  3. Should we add timestamps to the debug filenames for multiple runs?

Related to the LLM proxy testing work, where failures and massive logs are causing headaches.

Metadata

Labels

  • kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
  • priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Projects

Status

✅ Done
