Review content release logs for next-build #20815

Closed
@laflannery

Description

We need to determine what the impact of failing the build on network failure might be. To do this, we need to review the next-build content release logs and see how often a network failure occurs. Is this happening multiple times a day, once a week, once a month? What we find will help us see how disruptive failing the build completely would be.

Additional Info

  1. What specific impact are we trying to assess, aka are we just documenting these or brainstorming possible solutions too?
    • We just need to document the failures. The meeting the next-build team had 2-ish weeks ago was more for brainstorming solutions, and these were the steps that came out of it (this ticket and the 2 follow-ups).
  2. Are we only considering network failures that cause a build to fail, or all network-related failures in the system, and are we looking for specific types of network failures?
    • Currently network failures do NOT cause the build to fail. That's what we are going to determine if we want to change.
    • So I would say that we would be looking for any type of network failure. I think Tim/Chris might be able to tell you exactly what to look for. He kept referring to it as a network failure so that's why I said that in here but perhaps he can give you more specific details about what exactly to look for.
    • FWIW, on 2/17 around 3:30/4:00 ET there is an instance of the type of failure we are trying to avoid. I have no idea where these mysterious logs are, so I can't see what an example of this might look like, but maybe start there and see what that shows. That should give you a good example of what to look for in general.
    • From a practical standpoint, what happens is that next-build can't reach the CMS and therefore some of the next-build pages do not get built. That's the failure we are trying to prevent, but I don't necessarily know what that looks like exactly in the logs.
  3. How long should logs be reviewed? I know it says XX, so I was curious if you had a number in mind or if it's part of the ticket to determine that.
    • I think it might be a bit determined by the ticket.
    • For example, next-build launched on Dec 14, 2024, so we really only have 3 months of data to look at. But if you see multiple failures per day in the past month, we don't necessarily need to go back and look at months 2 and 3.
    • So I would say, look at the past month first. If there isn't significant data there, then go back and look at months 2 and 3.
  4. Will it be mostly an ongoing investigation?
    • I don't think so? The intention of this initial investigation is to determine if failing the build on network failure will be disruptive (this ticket)
    • I could see us potentially looking at the logs again if we want to revisit our decision to fail or not fail the build, though. That is why we have a 3rd ticket for the future, so that if we do need to investigate again, it will be easier and more visible to find what we need.

Acceptance Criteria

  • Logs are reviewed for a period of 1-3 months
  • Number of network failures within that period is documented
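
A minimal sketch of how the per-day count could be tallied once the logs are located. Everything specific here is an assumption, since the ticket does not say where the logs live or what a failure looks like in them: the line format (an ISO 8601 timestamp at the start of each line), the error strings in `NETWORK_ERROR_PATTERNS` (common Node.js network error codes, used as placeholders), and the function name are all hypothetical and would need to be replaced with whatever Tim/Chris identify as the actual signature of a network failure.

```python
import re
from collections import Counter

# ASSUMPTION: placeholder patterns. These are common Node.js network error
# strings, not confirmed to appear in the next-build content release logs.
NETWORK_ERROR_PATTERNS = re.compile(
    r"ECONNRESET|ETIMEDOUT|ECONNREFUSED|fetch failed", re.IGNORECASE
)

# ASSUMPTION: each log line starts with an ISO 8601 timestamp,
# e.g. "2025-02-17T20:42:10Z ...". Adjust to the real log format.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2})T")


def count_network_failures_by_day(lines):
    """Return a Counter mapping 'YYYY-MM-DD' to the number of matching lines."""
    counts = Counter()
    for line in lines:
        stamp = TIMESTAMP.match(line)
        # Lines without a recognizable timestamp are skipped, not guessed at.
        if stamp and NETWORK_ERROR_PATTERNS.search(line):
            counts[stamp.group(1)] += 1
    return counts
```

Passing an open log file (or any iterable of lines) returns the daily totals, which is enough to answer whether failures happen multiple times a day, weekly, or monthly. Note that this counts matching lines, so one failed build that logs several errors would be counted more than once.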

Metadata

Labels

  • CMS Team: CMS Product team that manages both editor exp and devops
  • next-build: FE Repository that will replace content-build. Uses NextJS, builds static pages.
