docs: update outdated eventual consistency wording for cloud object stores (#12424)

mj006648 · web-flow · commit 173cc03d6e6b · 2026-05-20T18:13:03.000+02:00
Refreshes the Architecture page so it no longer describes modern cloud object stores as eventually consistent. AWS S3 has provided strong read-after-write consistency since December 2020, and GCS / Azure Blob have been strongly consistent since launch. The technical motivation for Nessie is preserved: cloud object stores still lack atomic multi-object swap / rename, which is the real reason a Hive-metastore-like component (or Nessie) is needed.

Closes #5349

Signed-off-by: mj006648 <uckdekf@gmail.com>
diff --git a/site/docs/develop/index.md b/site/docs/develop/index.md
@@ -1,7 +1,7 @@
 # Architecture
 
 Nessie builds on the recent ecosystem developments around table formats. The rise of
-very large metadata and eventually consistent cloud data lakes (S3 specifically) drove
+very large metadata and cloud data lakes (S3 specifically) drove
 the need for an updated model around metadata management. Where consistent directory
 listings in HDFS used to be sufficient, there were many features lacking. This includes
 snapshotting, consistency and fast planning. Apache Iceberg was created to help alleviate
@@ -21,8 +21,10 @@ require a pointer to the active metadata set to function. This pointer allows th
 current schema, files and partitions in the dataset. Iceberg currently relies on the Hive metastore or hdfs to perform
 this role. The requirements for this root pointer store is it must hold (at least) information about the location of the
 current up-to-date metadata file, and it must be able to update this location atomically. In Hive this is accomplished by
-locks and in hdfs by using atomic file swap operations. These operations don’t exist in eventually consistent cloud
-object stores, necessitating a Hive metastore for cloud data lakes. The Nessie system is designed to store the
+locks and in hdfs by using atomic file swap operations. While modern cloud object stores
+(S3 since December 2020, GCS and Azure Blob since launch) provide strong read-after-write
+consistency, they still lack atomic multi-object swap operations, necessitating a Hive
+metastore for cloud data lakes. The Nessie system is designed to store the
 root metadata pointer and perform atomic updates to this pointer, obviating the need for a Hive metastore. Removing the
 need for a Hive metastore simplifies deployment and broadens the reach of tools that can work with Iceberg tables.
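
The requirement stated in the changed paragraph (hold the location of the current metadata file and update it atomically) can be illustrated with a compare-and-swap on a single pointer. The following is only a minimal sketch under stated assumptions: `RootPointerStore`, `compare_and_swap`, the table name, and the `s3://` paths are all hypothetical and are not taken from the Nessie or Iceberg APIs, and an in-process lock stands in for whatever atomicity guarantee a real root pointer store provides.

```python
import threading
from typing import Dict, Optional


class RootPointerStore:
    """Hypothetical in-memory stand-in for a root pointer store (not Nessie's API)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pointers: Dict[str, str] = {}

    def current(self, table: str) -> Optional[str]:
        """Return the location of the current metadata file, or None if unset."""
        with self._lock:
            return self._pointers.get(table)

    def compare_and_swap(self, table: str, expected: Optional[str], new: str) -> bool:
        """Move the pointer to `new` only if it still equals `expected`.

        The check and the write happen under one lock, so two writers that
        started from the same base cannot both succeed.
        """
        with self._lock:
            if self._pointers.get(table) != expected:
                return False  # another writer committed first
            self._pointers[table] = new
            return True


store = RootPointerStore()

# First commit: the table has no pointer yet, so the expected value is None.
created = store.compare_and_swap(
    "db.events", None, "s3://bucket/events/metadata/v1.json"
)

# Two writers read the same base pointer and then try to commit.
base = store.current("db.events")
writer_a = store.compare_and_swap(
    "db.events", base, "s3://bucket/events/metadata/v2-a.json"
)
writer_b = store.compare_and_swap(
    "db.events", base, "s3://bucket/events/metadata/v2-b.json"
)

print(created, writer_a, writer_b, store.current("db.events"))
```

In this sketch the second writer's commit is rejected because the pointer no longer matches the base it read, so it would have to re-read the pointer and retry. Strong read-after-write consistency alone does not give that check-then-write step as a single operation, which is the gap the paragraph describes.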