Next Build does not protect against missing CMS data during content release #20643

Open
0 of 3 issues completed
@timcosgrove

Description

User Story or Problem Statement

When the CMS is unavailable, Next Build should not attempt to build and deploy static content from CMS data.

Description or Additional Context

This is a follow-up to #20594. The situation is:

  1. Next Build makes calls to the CMS to retrieve content data as it is building static content
  2. The CMS goes through a daily deploy, during which the CMS content API is unavailable
  3. If the content data cannot be retrieved, Next Build does not build the page
  4. This 'missing' page then becomes part of the static content deploy, and the page is removed from the production site.

Next Build currently retries each Content API request up to 6 times before giving up, but it does not fail the build once that threshold is exceeded.
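To illustrate the gap, here is a minimal sketch of a retry helper that throws once its attempts are exhausted, so the build fails instead of silently skipping the page. All names (`fetchWithRetry`, `ContentApiUnavailableError`, the `request` callback) are hypothetical and are not taken from the Next Build codebase. The sketch also assumes that "6 retries" means 6 total attempts, and that a fixed delay between attempts is acceptable.

```typescript
// Hypothetical sketch; not Next Build's actual retry code.
class ContentApiUnavailableError extends Error {
  readonly attempts: number;
  readonly lastError: unknown;

  constructor(attempts: number, lastError: unknown) {
    super(`Content API request failed after ${attempts} attempts`);
    this.name = "ContentApiUnavailableError";
    this.attempts = attempts;
    this.lastError = lastError;
  }
}

// Runs `request` up to `maxAttempts` times, waiting `delayMs` between tries.
async function fetchWithRetry<T>(
  request: () => Promise<T>,
  maxAttempts = 6, // assumption: 6 total attempts
  delayMs = 1000, // assumption: fixed delay, no backoff
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await request();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Throwing here, rather than returning an empty result, is what
  // would cause the build to fail instead of omitting the page.
  throw new ContentApiUnavailableError(maxAttempts, lastError);
}
```

For this to fail a content release, the page-building code would need to let the error propagate rather than catch it and treat the page as missing.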

#20594 was a first step: it makes the Next Build content release wait until the CMS is available before proceeding with the static content build. However, there are still cases where content could be inadvertently removed:

  • Transient failures in a Content API request or response could prevent individual pages from being built.
  • A CMS deploy can still start while a Next Build content release is underway, taking the Content API offline mid-build.

Considerations

A mixture of the following could help this situation:

  • Next Build can and should fail the build when a Content API request fails 6 times; however, this will require shoring up the Content API connections for CI, lower-environment builds, etc.
  • Next Build could be prevented from starting a content release around the time of the CMS deploy; this is fuzzy and brittle but may be sufficient until we get to on-demand publishing
  • Next Build could be prevented from deleting content upon sync; but, in this case, we need a different mechanism to delete pages on production when they are archived on the CMS
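The second and third options above could be sketched as two small guards. This is only an illustration under stated assumptions: the function names, the idea of expressing the CMS deploy as a daily UTC window, and the `archivedPaths` input are all hypothetical. In particular, a list of archived paths presumes the "different mechanism" for deletions mentioned above already exists.

```typescript
// Hypothetical sketch; names and inputs are illustrative only.

const MINUTES_PER_DAY = 24 * 60;

// True if `now` falls inside a daily blackout window, expressed as a
// start time in UTC minutes since midnight plus a duration in minutes.
function isInDeployWindow(
  now: Date,
  startMinutesUtc: number,
  durationMinutes: number,
): boolean {
  const minutesNow = now.getUTCHours() * 60 + now.getUTCMinutes();
  // Modular arithmetic keeps this correct for windows that cross midnight.
  const elapsed =
    (minutesNow - startMinutesUtc + MINUTES_PER_DAY) % MINUTES_PER_DAY;
  return elapsed < durationMinutes;
}

// Returns paths that were live before, are missing from the new build,
// and were not reported as archived by the CMS.
function findUnexpectedRemovals(
  previousPaths: string[],
  builtPaths: string[],
  archivedPaths: string[],
): string[] {
  const built = new Set(builtPaths);
  const archived = new Set(archivedPaths);
  return previousPaths.filter((p) => !built.has(p) && !archived.has(p));
}
```

A content release could refuse to start while `isInDeployWindow` is true, and refuse to sync when `findUnexpectedRemovals` returns a non-empty list. The window check inherits the brittleness noted above, since it depends on the CMS deploy running on schedule.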

There may be other solutions that would help the issue as well.

Steps for Implementation

  • Review content release logs for network failures to see how often this happens
  • Push logs into DataDog for easier visibility
  • From the logs, confirm that failing the build on network failure would not have a serious detrimental impact (i.e. are we seeing failures multiple times a day, once a week, never, etc.)
  • Also discussed: consider increasing the number of retries if the network request fails outright (see detailed discussion notes below)
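As a rough sketch of the log review, the following groups network failures by day so the "multiple times a day, once a week, never" question can be answered from counts. The `RequestLogEntry` shape is an assumption: the actual content release log format is not described in this issue, so entries would first need to be parsed into something like it.

```typescript
// Hypothetical record shape; the real log format is not specified here.
interface RequestLogEntry {
  timestamp: string; // ISO 8601, e.g. "2024-01-01T03:00:00Z"
  networkFailure: boolean; // true if the request failed at the network level
}

// Counts network failures per UTC calendar day (keyed as YYYY-MM-DD).
function networkFailuresPerDay(
  entries: RequestLogEntry[],
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    if (!entry.networkFailure) continue;
    const day = new Date(entry.timestamp).toISOString().slice(0, 10);
    counts.set(day, (counts.get(day) ?? 0) + 1);
  }
  return counts;
}
```

Days with no failures do not appear in the result, so the number of keys relative to the number of days reviewed gives a quick sense of how often a fail-on-error policy would have stopped a release.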

Acceptance Criteria

  • Next Build content release is modified to fail upon multiple Content API failures for a given request
  • Other solutions are also looked at and implemented (this is not a great AC; we may want to split this into multiple tickets)

Metadata

Assignees

No one assigned

    Labels

    • CMS Team: CMS Product team that manages both editor exp and devops
    • Needs refining: Issue status
    • next-build: FE Repository that will replace content-build. Uses NextJS, builds static pages.
