Description
We have now lost thousands of dollars' worth of data due to COS/OSF instability. DataPipe is doing its job perfectly, but it's increasingly clear to me that OSF is the 'weak link' in the stack and has just been losing data without being able to recover it. This really undermines the 'born open' model. I think a lot of people are beginning to migrate away from OSF, and it would be helpful to have more trustworthy endpoints as an option.
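To make the request concrete, here is a minimal client-side sketch of what "more trustworthy endpoints" could look like from an experiment's perspective: the same file is written to DataPipe's public `/api/data/` endpoint and, in parallel, to a second endpoint, so an outage at any single backend cannot silently drop a session. `BACKUP_URL` and `saveWithBackup` are hypothetical names used only for illustration; this is not an existing DataPipe feature.

```ts
// Sketch: redundant data writes from the participant's browser.
// Assumes DataPipe's POST /api/data/ endpoint; BACKUP_URL is hypothetical.
const DATAPIPE_URL = "https://pipe.jspsych.org/api/data/";
const BACKUP_URL = "https://backup.example.org/api/data/"; // placeholder secondary endpoint

async function saveWithBackup(
  experimentID: string,
  filename: string,
  data: string
): Promise<void> {
  const body = JSON.stringify({ experimentID, filename, data });
  const post = (url: string) =>
    fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body,
    });

  // Promise.allSettled lets one endpoint fail without blocking the other;
  // only error out if neither write succeeded.
  const results = await Promise.allSettled([post(DATAPIPE_URL), post(BACKUP_URL)]);
  const ok = results.some((r) => r.status === "fulfilled" && r.value.ok);
  if (!ok) {
    throw new Error("Both data endpoints rejected the write");
  }
}
```

Server-side redundancy inside DataPipe itself (mirroring each write to a second storage provider) would of course be the more robust version of the same idea; the sketch above is just the workaround available today.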
OSF messages
What happened?
On Friday, May 23, there was a disruption to OSF's database. The cloud service provider for OSF, Google Cloud Platform (GCP), triggered an update at 5:24:16 AM GMT-4 that conflicted with the versions and settings of OSF services and corrupted the database. COS staff were notified of the service irregularities at 5:34 AM GMT-4, and OSF services were immediately taken offline to assess and address the disruption. We identified the cause and resolved it by restoring a database backup. The most recent backup was created on Thursday, May 22, 2025, at 10:03:24 PM GMT-4, meaning that OSF content modified and actions taken within the 7.5-hour window between May 22, 2025, 10:03:24 PM GMT-4 and May 23, 2025, 5:24:16 AM GMT-4 were unrecoverable. We cannot recover the lost content; however, no data was exposed or compromised during this disruption. All services, including the database backup, were restored at 10:03 AM GMT-4, approximately 4.5 hours after the disruption occurred.
Are my records affected?
If you posted files or made modifications during the 7.5-hour window, then those files and modifications were lost. We conducted a manual review of notification and log systems and identified all affected users. We have contacted those users by email to inform them of the disruption and offer assistance. If you believe that you were affected and did not receive an email, please contact us via [email protected].
Why was the downtime so long?
We began assessing and correcting the issue immediately upon observing the disruption and kept services offline until we had validated that the database was fully operational and restored to the most recent backup for continued use.