[Bug]: Some file replication jobs are failing #3068

@tw4l

Description

Browsertrix Version

v1.20.1-6f1c342

What did you expect to happen? What happened instead?

I expect file replication jobs for crawls and browser profiles to succeed. Recently, some of them appear to be failing. We should determine why, fix the issue, and then retry the failed jobs.

Reproduction instructions

  1. Run a Browsertrix instance with a replica storage location configured
  2. Create browser profiles
  3. Verify that some browser profile replication jobs fail and an email is sent to the superadmin email

Screenshots / Video

No response

Environment

No response

Additional details

There appear to be a few distinct issues:

  1. Replication jobs are being attempted for failed crawls, and these typically fail (solution: wait until a crawl succeeds before replicating its crawl files).
  2. Replication jobs for profiles occasionally fail after profiles are re-uploaded at the end of crawls. In at least one case, the file already existed in the replication location with the same checksum, so there was nothing to transfer. Because we pass the --error-on-no-transfer flag, rclone returned exit code 9, which Browsertrix then handled as a failure.
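
A minimal sketch of how the two cases above could be handled, not the actual Browsertrix code. The function names and the set of successful crawl states are hypothetical. The only external fact relied on is that rclone documents exit code 9 as "operation successful, but no files transferred", returned only when --error-on-no-transfer is set.

```python
# rclone exit codes relevant here. Code 9 means the operation succeeded but
# nothing was transferred; rclone only returns it with --error-on-no-transfer.
RCLONE_EXIT_OK = 0
RCLONE_EXIT_NO_TRANSFER = 9

# Assumption: crawl states that count as successful. The real state names
# in Browsertrix may differ.
SUCCESSFUL_CRAWL_STATES = frozenset({"complete"})


def should_replicate_crawl(crawl_state: str) -> bool:
    """Hypothetical guard: only replicate files of crawls that succeeded."""
    return crawl_state in SUCCESSFUL_CRAWL_STATES


def replication_succeeded(exit_code: int) -> bool:
    """Hypothetical check: treat "nothing to transfer" as success.

    If the file already exists in the replica location with the same
    checksum, rclone has nothing to copy and exits with code 9.
    """
    return exit_code in (RCLONE_EXIT_OK, RCLONE_EXIT_NO_TRANSFER)
```

Treating exit code 9 as success is one option. Dropping --error-on-no-transfer from the job would be another, if no other part of the job depends on that flag.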

Related to #3067, except the more recent failures should have success and finished set in the database, so we might want to retry these separately.

Metadata

Assignees

Labels

bug: Something isn't working

Type

Projects

Status

In Review

Milestone

No milestone
