Skip to content

Conversation

@sideangleside
Copy link
Member

Based upon a few requests I have gotten, I want to propose the following patch to check the health of a Foreman installation prior to registration. The general premise being:

If the services fail the health-check, exit, and remove the katello-ca-consumer-latest package.

Questions that I have:

  • is a response time of 1000 milliseconds an appropriate amount of time to consider a service too busy to service clients
  • should we finish up Validate user credendials before trying to bootstrap. #209 in this PR? (In that PR, we were attempting to validate creds before installing the katello-ca-consumer-latest RPM, which I don't think we need to do. (assuming we clean up if the creds fail).

print_generic("Ensuring services are online and accessible on within %s milliseconds" % options.service_timeout)
url = "https://" + options.foreman_fqdn + ":" + API_PORT + "/katello/api/ping"
features = get_json(url)['services']
svc_list = ["candlepin", "pulp", "pulp_auth", "candlepin_auth", "foreman_tasks"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this list will vary between Katello versions, why not iterating over features.keys() directly?

print("PUPPET CA PORT - %s" % options.puppet_ca_port)
print("IGNORE REGISTRATION FAILURES - %s" % options.ignore_registration_failures)
print("PRESERVE RHSM PROXY CONFIGURATION - %s" % options.preserve_rhsm_proxy)
print("SERVICE TIMEOUT - %s" % options.service_timeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KATELLO SERVICE TIMEOUT

url = "https://" + options.foreman_fqdn + ":" + API_PORT + "/katello/api/ping"
features = get_json(url)['services']
svc_list = ["candlepin", "pulp", "pulp_auth", "candlepin_auth", "foreman_tasks"]
for service in svc_list:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the next 10 lines are pretty repetitive, how about:

    for service in svc_list:
        service_error = None
        if features[service]['status'] != 'ok':
            service_error = "service %s reports a state of %s" % (service, features[service]['status'])
        elif int(features[service]['duration_ms']) >= options.service_timeout:
            service_error = "service %s reponds in %s ms which exceeds maximum of %s ms" % (service, features[service]['duration_ms'], str(options.service_timeout))
        if service_error:
            print_error("Error detected in the Service Health Check. Cleaning up install")
            clean_katello_agent()
            print_error(service_error)
            sys.exit(3)

@evgeni
Copy link
Member

evgeni commented Feb 19, 2019

I think 1000ms is enough, but I've long not seen heavy used katello installs 🤷‍♀️

One thing that bothers me with this setup: we do changes to the bootstraped system, before telling "oh, no, we won't bootstrap today". And while we do try to revert those changes (you clean the ca rpm), the system is not (might not) be bit-identical as before. So yes, I think piggy-backing this ontop of #209 and making it not even install the CA RPM would be awesome.

(My worry case is when migrating from one Katello to another, we already removed the old CA rpm, but we cannot reliably restore it and thus leave the machine without updates)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants