Show new VCR failures separately from preexisting ones in VCR status reports #12365

Open
@rileykarson

After a run of our presubmit "VCR" tests, we display a summary of the tests that failed after punching through to the live APIs. We'd expected the number of failures to be near-zero, but in practice, we tend to accumulate a couple failures a week that we burn through intermittently. We haven't built a process to systematically address these failures, and struggle to find the cycles to do so. These failures generally start because:

  • Tests that don't work with VCR due to parallel execution of identical resources are not tagged as VCR-unfriendly
  • A quota problem will get introduced by our nightlies, and then a client library update will force a re-recording of many tests when the API message we send changes.
  • An API changes its behaviour, and then a client library update will force a re-recording of many tests when the API message we send changes.

As a result, when new contributors interact with the repo, the Magician (un)helpfully reports a large number of test failures to them. In cases like GoogleCloudPlatform/magic-modules#6412 (comment), most of the failures are completely unrelated to the user's change.

We should display more pertinent information to users in the VCR status report, highlighting tests that are newly failing as of their change. For example:

Tests passed during RECORDING mode:
TestAccCloudfunctions2function_cloudfunctions2BasicAuditlogsExample
TestAccFirebaserulesRelease_BasicRelease

Tests newly failing during RECORDING mode:
TestAccContainerCluster_withNodeConfigReservationAffinitySpecific
Please fix these to complete your PR

<details>
<summary>Already-failing tests</summary>

Tests already failing:
TestAccComputeInstance_networkPerformanceConfig
TestAccComputeInstance_soleTenantNodeAffinities
TestAccComputeGlobalForwardingRule_internalLoadBalancing
TestAccCloudRunService_cloudRunServiceStaticOutboundExample
TestAccPrivatecaCertificateAuthority_privatecaCertificateAuthoritySubordinateExample
TestAccSqlDatabaseInstance_withPrivateNetwork_withAllocatedIpRange
</details>

View the [build log](https://storage.cloud.google.com/ci-vcr-logs/beta/refs/heads/auto-pr-6412/artifacts/6bfc7900-b1fa-417b-b6cb-9838383c26e1/build-log/recording_test.log) or the [debug log](https://console.cloud.google.com/storage/browser/ci-vcr-logs/beta/refs/heads/auto-pr-6412/artifacts/6bfc7900-b1fa-417b-b6cb-9838383c26e1/recording) for each test
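As a rough illustration of how a comment in the shape above could be assembled, here is a minimal sketch. The function name `render_report` and its three list arguments are hypothetical and not part of the Magician today; it assumes the tests have already been split into passed, newly failing, and already failing.

```python
def render_report(passed, newly_failing, already_failing):
    """Build the status comment body from three lists of test names.

    Hypothetical helper: the section headings mirror the example report,
    and the already-failing tests are collapsed in a <details> element.
    """
    lines = []
    if passed:
        lines.append("Tests passed during RECORDING mode:")
        lines.extend(passed)
        lines.append("")
    if newly_failing:
        lines.append("Tests newly failing during RECORDING mode:")
        lines.extend(newly_failing)
        lines.append("Please fix these to complete your PR")
        lines.append("")
    if already_failing:
        lines.append("<details>")
        lines.append("<summary>Already-failing tests</summary>")
        lines.append("")
        lines.append("Tests already failing:")
        lines.extend(already_failing)
        lines.append("</details>")
    return "\n".join(lines).rstrip() + "\n"


report = render_report(
    passed=["TestAccFirebaserulesRelease_BasicRelease"],
    newly_failing=["TestAccContainerCluster_withNodeConfigReservationAffinitySpecific"],
    already_failing=["TestAccComputeInstance_networkPerformanceConfig"],
)
print(report)
```

Sections with no tests are omitted entirely, so a PR with no new failures would not show the "Please fix these" line.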

We can likely use a pretty simple heuristic here by comparing against main.

We'd add a REPLAY run step that runs on each commit merge, costing around half an hour of machine time per commit, and stores the list of failing tests in a GCS bucket. Each PR submitted against the repo has a branch point from main (its merge base), and we can look up that commit's results in the GCS bucket to determine which tests were already failing.
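The comparison itself could be as small as a set difference. The sketch below is only an assumption about how it might look: `classify_failures` is a hypothetical name, and the baseline list stands in for whatever would be read from the GCS bucket for the merge-base commit.

```python
def classify_failures(pr_failures, baseline_failures):
    """Split a PR's failing tests into newly failing and already failing.

    pr_failures: test names that failed on the PR run.
    baseline_failures: test names that failed on the merge-base commit
        (assumed to have been loaded from the stored post-submit results).
    """
    baseline = set(baseline_failures)
    newly_failing = sorted(t for t in pr_failures if t not in baseline)
    already_failing = sorted(t for t in pr_failures if t in baseline)
    return newly_failing, already_failing


pr_failures = [
    "TestAccContainerCluster_withNodeConfigReservationAffinitySpecific",
    "TestAccComputeInstance_networkPerformanceConfig",
]
baseline_failures = [
    "TestAccComputeInstance_networkPerformanceConfig",
    "TestAccComputeInstance_soleTenantNodeAffinities",
]

newly_failing, already_failing = classify_failures(pr_failures, baseline_failures)
print(newly_failing)    # ['TestAccContainerCluster_withNodeConfigReservationAffinitySpecific']
print(already_failing)  # ['TestAccComputeInstance_networkPerformanceConfig']
```

Tests that failed on main but pass on the PR are simply not reported as failures; whether to call those out separately is left open here.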

There is a slight timing issue, as folks could open a PR before the post-submit replay has finished. That's generally unlikely, as half an hour is pretty short, but we could choose a strategy to handle it: return a warning to the user and avoid filtering on that PR; wait for the commit's results (with some timeout, say an hour); or step back through commits until we find one with stored results.
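The "step back in commits" option, with the warning as a fallback, might look roughly like the sketch below. Everything in it is hypothetical: `find_baseline` is an invented name, `stored_results` is a plain dict standing in for the GCS bucket (commit SHA mapped to failing test names), and `ancestors` is assumed to be the merge base followed by its ancestors on main, newest first.

```python
def find_baseline(ancestors, stored_results, max_steps=10):
    """Return (sha, failing_tests, steps_back) for the nearest commit with results.

    ancestors: commit SHAs starting at the merge base and walking back
        through main's history, newest first.
    stored_results: mapping of commit SHA -> list of failing test names
        (a stand-in for results stored by the post-submit replay run).
    max_steps: how many commits to inspect before giving up.

    Returns (None, None, None) when no inspected commit has results, in
    which case the caller could warn the user and skip filtering.
    """
    for steps_back, sha in enumerate(ancestors[:max_steps]):
        if sha in stored_results:
            return sha, stored_results[sha], steps_back
    return None, None, None


# The merge base "c3" has no results yet, so we fall back to its parent "c2".
ancestors = ["c3", "c2", "c1"]
stored_results = {
    "c2": ["TestAccComputeInstance_networkPerformanceConfig"],
    "c1": [],
}

sha, baseline_failures, steps_back = find_baseline(ancestors, stored_results)
if sha is None:
    print("Warning: no baseline results found; showing all failures unfiltered")
else:
    print(f"Using results from {sha} ({steps_back} commit(s) behind the merge base)")
```

Stepping back trades some accuracy for availability: a test that started failing in a skipped commit would be reported as new, so the report could reasonably mention how far back the baseline is.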
