Added information about Online project hibernation #4470
@damemi @sallyom @luciddreamz PTAL. Some questions:
Huge thanks!
@ahardin-rh
@damemi Okay, so the user could look in the project for the Also, what about # 2? What, if anything, should the user know about auto-idling? Even if there's nothing to set, how are they experiencing it and is there any guidance they need to follow?
@ahardin-rh the user will only see a As for auto-idling, @sallyom can probably speak in more detail, but it basically monitors network traffic to pods in the project, and if traffic falls below a certain threshold, every service in the project is idled. Again, there's no monitor to see whether they're approaching this threshold (it's all handled in the controller).
@damemi Thanks. Is that something that the user would know to check? Is that something worth documenting, or no? As mentioned in today's scrum, there's a lot of information on force-sleep/hibernation and auto-idling, but I need guidance from you and @sallyom to help me distill down what the user actually needs to know. It seems like a lot of this is behind the scenes and not focused on user action. However, I would like to know what else, if anything, impacts user experience and would be worth capturing in the docs.
I would say the user should know what triggers force-sleep, how to identify that a project is in force-sleep, and what causes it to be removed. You're right that it's very behind-the-scenes as of now, but I would refer to @abhgupta to find out what we want to publicly document about it.
@ahardin-rh I haven't followed this PR/discussion very closely. I will take a look at this tomorrow (on mostly-pto today).
ifdef::openshift-online[]
[[projects-hibenation]]
=== Project Hibernation
In {product-title} Starter, projects hibernate after 30 minutes of inactivity.
Something like this? "In {product-title} Starter, projects with 30 minutes of inactivity are placed in an idled state with resources scaled to zero.
Upon receiving network activity, project resources are un-idled (scaled back up). In addition to auto-idling, projects must hibernate 18 hours in a 72-hour period. During the hibernation, all project resources are given a hard quota of zero (they cannot be scaled up)."
@ahardin-rh @sallyom There is actually an idle_threshold: if the network activity received is less than the idle_threshold, the project will be idled. I think the doc might need to mention this.
- Wherever the hours are described in the design or the introduction, hibernation is 8 hours and the period is 24 hours, but here they are 18 of 72. Has our design changed to this?
- About the last sentence: from testing on devenv, the project is given an extra quota, force-sleep, which only hard-codes the pod limit to zero, not all resources (dc, rc, and pvc can still be created successfully). So during hibernation no pod is active (pods cannot be created or scaled up).
@ahardin-rh @yasun1 @damemi
"In {product-title} Starter, inactive projects are placed in an idled state with resources scaled to zero. A project is considered inactive when that project's cumulative network traffic received over 30 minutes is below a configured threshold. Upon receiving any network activity, an idled project's resources are un-idled (scaled back up)."
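For reviewers who want the rule in concrete terms, here is a minimal sketch of the inactivity check as described above. It is not the controller's code: the `TrafficSample` type, the `is_inactive` function, and the byte-based threshold are hypothetical names and units chosen for illustration. Only the 30-minute window and the existence of a configured threshold come from this thread.

```python
from dataclasses import dataclass
from typing import Iterable

# The 30-minute window is taken from the proposed doc text; everything else
# (names, byte units) is an assumption made for this sketch.
IDLE_WINDOW_MINUTES = 30


@dataclass
class TrafficSample:
    minutes_ago: float    # how long ago the sample was recorded
    bytes_received: int   # traffic received by the project's pods in that sample


def is_inactive(samples: Iterable[TrafficSample],
                idle_threshold_bytes: int,
                window_minutes: float = IDLE_WINDOW_MINUTES) -> bool:
    """Return True when cumulative traffic inside the window is below the threshold."""
    total = sum(s.bytes_received for s in samples if s.minutes_ago <= window_minutes)
    return total < idle_threshold_bytes
```

With this shape, a project that received only a trickle of traffic in the last 30 minutes counts as inactive even if it was busy an hour ago, which matches the "cumulative over 30 minutes" wording.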
@sallyom Thanks! This is now updated 🌟
LGTM!
an idled state with resources scaled to zero. Upon receiving network activity,
project resources are un-idled (scaled back up). In addition to auto-idling,
projects must hibernate 18 hours in a 72-hour period. During the hibernation,
all project resources are given a hard quota of zero (they cannot be scaled up).
Use numerals for the two "zeros", since the paragraph contains numerals greater than 10? (Stylepedia)
@ahardin-rh 'all project resources are given a...' should be changed to 'pods are given a...'. My bad, sorry :)
@damemi @ahardin-rh - how about 'projects will hibernate for a configurable amount of time in a configurable time period.' or perhaps, 'projects will hibernate for a configurable amount of time in a configurable time period, currently set to 18 hours in a 72-hour period.' (if we want to give more information)
edit: we should keep the sentence 'projects must hibernate 18 hours in a 72-hour period.' as is
I think to be clearer something like this:
Any project exceeding 54 cumulative quota-hours of usage in a rolling 72-hour period must hibernate for the next 18 hours. Quota-hours are calculated as the maximum between percentage of terminating and non-terminating pod resource quota consumed, multiplied by the running time of those pods. For example, 2 compute pods each using half of the available memory quota for 1 hour will be counted as 1 Quota-hour.
That is, if we want to be specific, since technically not all projects must hibernate 18 hours every 72 hours. The original description might just be easier to understand, though.
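The quota-hour arithmetic above can be checked with a short sketch. This is only an illustration of the wording in this comment, not the controller's implementation: the function names are hypothetical, and it assumes that the fractions of quota consumed by pods in the same class (terminating or non-terminating) are summed before the maximum is taken. The 54, 72, and 18 hour figures are the ones quoted in this thread.

```python
from typing import Iterable, Tuple

QUOTA_HOUR_LIMIT = 54.0     # cumulative quota-hours allowed (from this thread)
WINDOW_HOURS = 72.0         # rolling window length (from this thread)
HIBERNATION_HOURS = 18.0    # forced hibernation length (from this thread)


def quota_hours(terminating_fraction: float,
                non_terminating_fraction: float,
                hours: float) -> float:
    """Quota-hours for one interval: the larger quota fraction times running time."""
    return max(terminating_fraction, non_terminating_fraction) * hours


def must_hibernate(intervals: Iterable[Tuple[float, float, float]]) -> bool:
    """intervals: (terminating_fraction, non_terminating_fraction, hours) tuples
    that all fall inside the rolling 72-hour window (assumed pre-filtered)."""
    total = sum(quota_hours(t, n, h) for t, n, h in intervals)
    return total > QUOTA_HOUR_LIMIT


# The example from the comment: 2 compute pods, each using half of the memory
# quota, for 1 hour. Assumption: both count against the non-terminating quota.
example = quota_hours(terminating_fraction=0.0,
                      non_terminating_fraction=0.5 + 0.5,
                      hours=1.0)
```

Under these assumptions `example` is 1 quota-hour, and a project running at full quota would reach the limit only after more than 54 hours inside the window, which is why not every project hibernates 18 hours out of every 72.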
@ahardin-rh @sallyom @damemi
@yasun1 Thanks. This is updated.
@ahardin-rh getting closer but hibernation is not deployed in clusters yet. When it is, I'll update here, there may be more to add for docs, explanation on what to expect when a project is 'unidled' will need to be added.
@sallyom Okay, thanks! We'll stand by.
@ahardin-rh auto-idling is now deployed to starter clusters.
individual pods are deleted. All PVCs and PVs in the project are left untouched.
After the force-sleep period is over, a project is put in an idled state, where
the replica count is `0`, but the force-sleep resource quota is removed. Upon
receiving network traffic, the project's replica counts will be restored to
*** Maybe we should note: 'If network traffic does not restore a project's replica counts, then a user may have to manually scale up the deployment.' This is because we've been seeing issues with unidling times, and with unidling in general.
the replica count is `0`, but the force-sleep resource quota is removed. Upon
receiving network traffic, the project's replica counts will be restored to
their pre-sleep value and pods will be created.
Maybe note that in the web console, you'll see your deployment as 'Idled due to inactivity', whereupon you can manually scale the deployment back up.
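To make the sequence discussed in these comments easier to follow, here is a toy model of the project lifecycle: force-sleep, then idled, then woken by traffic or by a manual scale-up. It is a sketch under assumptions, not how the controller is written: the `Project` class, the `State` enum, and all method names are hypothetical. The behavior it encodes (pod quota of zero during force-sleep, replica count of `0` while idled, restore to the pre-sleep value on traffic, manual scaling as a fallback) is taken from the diff text and the review notes.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FORCE_SLEEP = "force-sleep"
    IDLED = "idled"


class Project:
    """Toy model of one project; all names here are illustrative only."""

    def __init__(self, replicas: int) -> None:
        self.state = State.RUNNING
        self.replicas = replicas
        self.pre_sleep_replicas = replicas
        self.pod_quota_zero = False  # stands in for the force-sleep resource quota

    def force_sleep(self) -> None:
        # Pods are removed and the pod quota is pinned to zero.
        self.pre_sleep_replicas = self.replicas
        self.replicas = 0
        self.pod_quota_zero = True
        self.state = State.FORCE_SLEEP

    def end_force_sleep(self) -> None:
        # The quota is lifted, but the project stays idled at 0 replicas.
        self.pod_quota_zero = False
        self.state = State.IDLED

    def on_network_traffic(self) -> None:
        # Traffic only wakes an idled project, not one in force-sleep.
        if self.state is State.IDLED:
            self.replicas = self.pre_sleep_replicas
            self.state = State.RUNNING

    def manual_scale(self, replicas: int) -> None:
        # Fallback for when traffic does not un-idle the project.
        if self.pod_quota_zero:
            raise RuntimeError("pods cannot be created during force-sleep")
        self.replicas = replicas
        self.state = State.RUNNING if replicas > 0 else State.IDLED
```

The point of the model is the ordering: traffic during force-sleep has no effect, and once the quota is removed either traffic or a manual scale-up brings the pods back.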
@ahardin-rh: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@abhgupta @sallyom - please take a look. @ahardin-rh - are you able to rebase?