* Adds documentation for R2 Data Catalog
* Added managing catalogs documentation and R2 Data Catalog as a product.
* Add changelog entry
* PCX review
* Fix PR comments/typos.
* Added PySpark example configuration.
* Update src/content/docs/r2/data-catalog/config-examples/spark-scala.mdx
* Added more context for data catalog auth
* Add access policy example for r2 data catalog API tokens
---------
Co-authored-by: Jun Lee <[email protected]>
title: R2 Data Catalog is a managed Apache Iceberg data catalog built directly into R2 buckets
description: A managed Apache Iceberg data catalog built directly into R2 buckets
products:
  - r2
date: 2025-04-10T13:00:00Z
hidden: true
---

Today, we're launching [R2 Data Catalog](/r2/data-catalog/) in open beta, a managed Apache Iceberg catalog built directly into your [Cloudflare R2](/r2/) bucket.

If you're not already familiar with it, [Apache Iceberg](https://iceberg.apache.org/) is an open table format designed to handle large-scale analytics datasets stored in object storage, offering ACID transactions and schema evolution. R2 Data Catalog exposes a standard Iceberg REST catalog interface, so you can connect engines like [Spark](/r2/data-catalog/config-examples/spark-scala/), [Snowflake](/r2/data-catalog/config-examples/snowflake/), and [PyIceberg](/r2/data-catalog/config-examples/pyiceberg/) to start querying your tables using the tools you already know.

To enable a data catalog on your R2 bucket, find **R2 Data Catalog** in your bucket's settings in the dashboard, or run:

```bash
npx wrangler r2 bucket catalog enable my-bucket
```

And that's it. You'll get a catalog URI and warehouse you can plug into your favorite Iceberg engines.

Visit our [getting started guide](/r2/data-catalog/get-started/) for step-by-step instructions on enabling R2 Data Catalog, creating tables, and running your first queries.
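
As a rough illustration of how the catalog URI and warehouse mentioned above might be handed to an engine, the sketch below collects them, together with an API token, into the property names that PyIceberg's REST catalog accepts (`uri`, `warehouse`, `token`). The placeholder values and the helper name `rest_catalog_properties` are assumptions made for this example; they are not part of the changelog entry.

```python
# Hypothetical placeholders: substitute the catalog URI and warehouse returned
# when the catalog is enabled, plus an R2 API token. None of these literals
# come from the changelog itself.
CATALOG_URI = "<CATALOG_URI>"
WAREHOUSE = "<WAREHOUSE>"
TOKEN = "<TOKEN>"


def rest_catalog_properties(uri: str, warehouse: str, token: str) -> dict[str, str]:
    """Collect the three connection values and reject any that are blank."""
    values = {"uri": uri, "warehouse": warehouse, "token": token}
    missing = [key for key, value in values.items() if not value.strip()]
    if missing:
        raise ValueError(f"missing catalog properties: {', '.join(missing)}")
    return values


properties = rest_catalog_properties(CATALOG_URI, WAREHOUSE, TOKEN)

# With PyIceberg installed, the properties could be passed straight through:
#   from pyiceberg.catalog.rest import RestCatalog
#   catalog = RestCatalog(name="my_catalog", **properties)
```

Keeping the values in one dictionary means the same three inputs can be reused for other engines, which typically ask for the same URI, warehouse, and token under engine-specific key names.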
-| Admin Read & Write | Allows the ability to create, list and delete buckets, and edit bucket configurations in addition to list, write, and read object access. |
-| Admin Read only | Allows the ability to list buckets and view bucket configuration in addition to list and read object access. |
-| Object Read & Write | Allows the ability to read, write, and list objects in specific buckets. |
-| Object Read only | Allows the ability to read and list objects in specific buckets. |
+| Admin Read & Write | Allows the ability to create, list, and delete buckets, edit bucket configuration, read, write, and list objects, and read and write to data catalog tables and associated metadata. |
+| Admin Read only | Allows the ability to list buckets and view bucket configuration, read and list objects, and read from the data catalog tables and associated metadata. |
+| Object Read & Write | Allows the ability to read, write, and list objects in specific buckets. |
+| Object Read only | Allows the ability to read and list objects in specific buckets. |
+
+:::note
+
+Currently **Admin Read & Write** or **Admin Read only** permission is required to use [R2 Data Catalog](/r2/data-catalog/).
+
+:::
 
 ## Create API tokens via API
 
@@ -90,7 +96,7 @@ All buckets in an account are represented as:
 
 #### Permission groups
 
-Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#permission-groups) should be applied. There are four relevant permission groups for R2.
+Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#permission-groups) should be applied.
 
 <table>
 <tbody>
@@ -101,7 +107,7 @@ Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#perm
 Resource
 </th>
 <th colspan="5" rowspan="1">
-Permission
+Description
 </th>
 <tr>
 <td colspan="5" rowspan="1">
@@ -111,7 +117,8 @@ Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#perm
 Account
 </td>
 <td colspan="5" rowspan="1">
-Admin Read & Write
+Can create, delete, and list buckets, edit bucket configuration, and
+read, write, and list objects.
 </td>
 </tr>
 <tr>
@@ -122,7 +129,8 @@ Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#perm
 Account
 </td>
 <td colspan="5" rowspan="1">
-Admin Read only
+Can list buckets and view bucket configuration, and read and list
+objects.
 </td>
 </tr>
 <tr>
@@ -133,7 +141,7 @@ Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#perm
 Bucket
 </td>
 <td colspan="5" rowspan="1">
-Object Read & Write
+Can read, write, and list objects in buckets.
 </td>
 </tr>
 <tr>
@@ -144,7 +152,31 @@ Determine what [permission groups](/fundamentals/api/how-to/create-via-api/#perm
 Bucket
 </td>
 <td colspan="5" rowspan="1">
-Object Read only
+Can read and list objects in buckets.
+</td>
+</tr>
+<tr>
+<td colspan="5" rowspan="1">
+<code>Workers R2 Data Catalog Write</code>
+</td>
+<td colspan="5" rowspan="1">
+Account
+</td>
+<td colspan="5" rowspan="1">
+Can read from and write to data catalogs. This permission allows
+access to the Iceberg REST catalog interface.
+</td>
+</tr>
+<tr>
+<td colspan="5" rowspan="1">
+<code>Workers R2 Data Catalog Read</code>
+</td>
+<td colspan="5" rowspan="1">
+Account
+</td>
+<td colspan="5" rowspan="1">
+Can read from data catalogs. This permission allows read-only
Below is an example of using [PyIceberg](https://py.iceberg.apache.org/) to connect to R2 Data Catalog.

## Prerequisites

- Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up/workers-and-pages).
- [Create an R2 bucket](/r2/buckets/create-buckets/) and [enable the data catalog](/r2/data-catalog/manage-catalogs/#enable-r2-data-catalog-on-a-bucket).
- [Create an R2 API token](/r2/api/tokens/) with both [R2 and data catalog permissions](/r2/api/tokens/#permissions).
- Install the [PyIceberg](https://py.iceberg.apache.org/#installation) and [PyArrow](https://arrow.apache.org/docs/python/install.html) libraries.

## Example usage

```py
import pyarrow as pa
from pyiceberg.catalog.rest import RestCatalog
from pyiceberg.exceptions import NamespaceAlreadyExistsError
```
Below is an example of using [Snowflake](https://docs.snowflake.com/en/user-guide/tables-iceberg-configure-catalog-integration-rest) to connect and query data from R2 Data Catalog (read-only).

## Prerequisites

- Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up/workers-and-pages).
- [Create an R2 bucket](/r2/buckets/create-buckets/) and [enable the data catalog](/r2/data-catalog/manage-catalogs/#enable-r2-data-catalog-on-a-bucket).
- [Create an R2 API token](/r2/api/tokens/) with both [R2 and data catalog permissions](/r2/api/tokens/#permissions).
- A [Snowflake](https://www.snowflake.com/) account with the necessary privileges to create external volumes and catalog integrations.

## Example usage

In your Snowflake [SQL worksheet](https://docs.snowflake.com/en/user-guide/ui-snowsight-worksheets-gs) or [notebook](https://docs.snowflake.com/en/user-guide/ui-snowsight/notebooks), run the following commands:

```sql
-- Create a database (if you don't already have one) to organize your external data
CREATE DATABASE IF NOT EXISTS r2_example_db;

-- Create an external volume pointing to your R2 bucket
```
Below is an example of using [PySpark](https://spark.apache.org/docs/latest/api/python/index.html) to connect to R2 Data Catalog.

## Prerequisites

- Sign up for a [Cloudflare account](https://dash.cloudflare.com/sign-up/workers-and-pages).
- [Create an R2 bucket](/r2/buckets/create-buckets/) and [enable the data catalog](/r2/data-catalog/manage-catalogs/#enable-r2-data-catalog-on-a-bucket).
- [Create an R2 API token](/r2/api/tokens/) with both [R2 and data catalog permissions](/r2/api/tokens/#permissions).
- Install the [PySpark](https://spark.apache.org/docs/latest/api/python/getting_started/install.html) library.
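
The prerequisites above yield a catalog URI, a warehouse name, and an API token. As a hedged sketch of how those values could map onto Spark settings, the snippet below builds the configuration keys that Apache Iceberg's Spark integration uses for a REST catalog. The helper name `spark_iceberg_conf`, the catalog name `my_catalog`, and the placeholder values are assumptions made for illustration. The Iceberg Spark runtime package (typically supplied through `spark.jars.packages`) is also required but is left out here because its coordinates depend on your Spark and Scala versions.

```python
def spark_iceberg_conf(catalog_name: str, uri: str, warehouse: str, token: str) -> dict[str, str]:
    """Build Spark settings that register an Iceberg REST catalog under catalog_name."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Enables Iceberg's SQL extensions in the Spark session.
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the catalog and selects the REST catalog implementation.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        # Connection values from the prerequisites (placeholders below).
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        f"{prefix}.token": token,
    }


# Hypothetical placeholder values; replace with your own.
conf = spark_iceberg_conf("my_catalog", "<CATALOG_URI>", "<WAREHOUSE>", "<TOKEN>")

# Print the settings with the token masked so it does not leak into logs.
for key, value in sorted(conf.items()):
    print(key, "=", "***" if key.endswith(".token") else value)

# With PySpark installed, the settings could be applied like this:
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder.appName("R2DataCatalogExample")
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
```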