What keywords did you search in tectonic-installer issues before filing this one?
tectonic-identity. I also looked through recently opened and closed bugs.
Is this a BUG REPORT or FEATURE REQUEST?
BUG REPORT
After our cluster updated itself to 1.9.6-tectonic.2, we're getting a lot of alerts about tectonic-identity pods frequently restarting.
containerStatuses:
  - name: tectonic-identity
    state:
      running:
        startedAt: '2018-12-12T15:11:10Z'
    lastState:
      terminated:
        exitCode: 137
        reason: OOMKilled
        startedAt: '2018-12-12T15:05:39Z'
        finishedAt: '2018-12-12T15:10:47Z'
        containerID: >-
          docker://861e6545dc9ac82aca3d101cb4b0e4b9129e005b345612649b7a8cd001ec0b5d
    ready: true
    restartCount: 1018
    image: 'quay.io/coreos/dex:v2.8.1'
    imageID: >-
      docker-pullable://quay.io/coreos/dex@sha256:19510b560e851bce6a27023fcbab9b6b8a3928d493de11c026e06df854cb37e1
    containerID: >-
      docker://45ed2b5b0958ba6235e7d5567e5fc327a3530a87533d19308f9e481fa2185458

(This pattern repeats itself since upgrading to the 1.9.6-tectonic.2 release)
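For anyone trying to confirm the same pattern on their own cluster, here is a rough sketch that scans pod status for containers whose last termination was an OOM kill. It assumes the pod list has been dumped as JSON (for example with `kubectl get pods -o json` in the relevant namespace). The `find_oom_killed` helper and the pod name in the sample are hypothetical; the field names and values mirror the status shown above.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # timestamp layout used in the pod status


def find_oom_killed(pod_list):
    """Return one record per container whose last termination was an OOM kill."""
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "<unknown>")
        for cs in pod.get("status", {}).get("containerStatuses", []):
            terminated = cs.get("lastState", {}).get("terminated")
            if not terminated or terminated.get("reason") != "OOMKilled":
                continue
            started = datetime.strptime(terminated["startedAt"], TS_FORMAT)
            finished = datetime.strptime(terminated["finishedAt"], TS_FORMAT)
            hits.append({
                "pod": pod_name,
                "container": cs["name"],
                "restarts": cs.get("restartCount", 0),
                # How long the container ran before it was killed.
                "lifetime_seconds": int((finished - started).total_seconds()),
            })
    return hits


# Sample input built from the status in this report.
# The pod name is a placeholder (assumption), not taken from the cluster.
SAMPLE = {
    "items": [{
        "metadata": {"name": "tectonic-identity-example"},
        "status": {"containerStatuses": [{
            "name": "tectonic-identity",
            "restartCount": 1018,
            "lastState": {"terminated": {
                "exitCode": 137,
                "reason": "OOMKilled",
                "startedAt": "2018-12-12T15:05:39Z",
                "finishedAt": "2018-12-12T15:10:47Z",
            }},
        }]},
    }]
}

hits = find_oom_killed(SAMPLE)
for hit in hits:
    print(hit)
```

With the values from this report, the container ran for a little over five minutes before being killed.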
Versions
- Tectonic version (release or commit hash): 1.9.6-tectonic.2
- Terraform version (terraform version): Terraform v0.11.8
- Platform (aws|azure|openstack|metal): AWS
What happened?
The cluster has auto-update enabled and updated itself to 1.9.6-tectonic.2. Afterwards, the tectonic-identity pods started failing. While they are up and running, everything works fine and we can access and interact with the Tectonic console. While they are down, we get 503 errors when accessing the console.
What you expected to happen?
The tectonic-identity pods should remain stable after the update instead of being repeatedly OOMKilled and restarted.
How to reproduce it (as minimally and precisely as possible)?
Update to 1.9.6-tectonic.2
Anything else we need to know?
All other components of our cluster operate normally