
Commit adcc62d

Some more docs
1 parent 1d2c866 commit adcc62d

1 file changed, with 65 additions and 19 deletions
docs/functions.md

Lines changed: 65 additions & 19 deletions
@@ -29,7 +29,8 @@ Note: `ALTER EXTENSION pg_duckdb WITH SCHEMA schema` is not currently supported.
 | Name | Description |
 | :--- | :---------- |
 | [`duckdb.install_extension`](#install_extension) | Installs a DuckDB extension |
-| [`duckdb.raw_query`](#raw_query) | Runs a query directly against DuckDB (meant for debugging)|
+| [`duckdb.query`](#query) | Runs a SELECT query directly against DuckDB |
+| [`duckdb.raw_query`](#raw_query) | Runs any query directly against DuckDB (meant for debugging)|
 | [`duckdb.recycle_ddb`](#recycle_ddb) | Forces a reset of the DuckDB instance in the current connection (meant for debugging) |
 
 ## Motherduck Functions
@@ -40,14 +41,16 @@ Note: `ALTER EXTENSION pg_duckdb WITH SCHEMA schema` is not currently supported.
 
 ## Detailed Descriptions
 
-#### <a name="read_parquet"></a>`read_parquet(path TEXT or TEXT[], /* optional parameters */) -> SETOF record`
+#### <a name="read_parquet"></a>`read_parquet(path TEXT or TEXT[], /* optional parameters */) -> SETOF duckdb.row`
 
 Reads a parquet file, either from a remote location (via httpfs) or a local file.
 
-Returns a record set (`SETOF record`). Functions that return record sets need to have their columns and types specified using `AS`. You must specify at least one column and any columns used in your query. For example:
+This returns DuckDB rows. You can expand them using `*`, or you can select specific columns using the `r['mycol']` syntax. To select specific columns, give the function call a short alias, such as `r`. For example:
 
 ```sql
-SELECT COUNT(i) FROM read_parquet('file.parquet') AS (int i);
+SELECT * FROM read_parquet('file.parquet');
+SELECT r['id'], r['name'] FROM read_parquet('file.parquet') r WHERE r['age'] > 21;
+SELECT COUNT(*) FROM read_parquet('file.parquet');
 ```
 
 Further information:
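
As a further illustration of the `r['mycol']` syntax introduced in this hunk, the sketch below filters on one column and groups by another. The file name and the `name` and `age` columns are hypothetical, carried over from the examples above. The query also assumes that subscripted columns are accepted in `GROUP BY`, which this diff does not state explicitly.

```sql
-- Hypothetical file and column names; GROUP BY on r['name'] is an assumption.
SELECT r['name'] AS name, COUNT(*) AS people
FROM read_parquet('file.parquet') r
WHERE r['age'] > 21
GROUP BY r['name'];
```

The alias `r` is what makes the subscript syntax usable: without it there is no name to subscript.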
@@ -65,14 +68,16 @@ Further information:
 
 Optional parameters mirror [DuckDB's read_parquet function](https://duckdb.org/docs/data/parquet/overview.html#parameters). To specify optional parameters, use `parameter := 'value'`.
 
-#### <a name="read_csv"></a>`read_csv(path TEXT or TEXT[], /* optional parameters */) -> SETOF record`
+#### <a name="read_csv"></a>`read_csv(path TEXT or TEXT[], /* optional parameters */) -> SETOF duckdb.row`
 
 Reads a CSV file, either from a remote location (via httpfs) or a local file.
 
-Returns a record set (`SETOF record`). Functions that return record sets need to have their columns and types specified using `AS`. You must specify at least one column and any columns used in your query. For example:
+This returns DuckDB rows. You can expand them using `*`, or you can select specific columns using the `r['mycol']` syntax. To select specific columns, give the function call a short alias, such as `r`. For example:
 
 ```sql
-SELECT COUNT(i) FROM read_csv('file.csv') AS (int i);
+SELECT * FROM read_csv('file.csv');
+SELECT r['id'], r['name'] FROM read_csv('file.csv') r WHERE r['age'] > 21;
+SELECT COUNT(*) FROM read_csv('file.csv');
 ```
 
 Further information:
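
The `parameter := 'value'` convention noted at the top of this hunk, combined with the `TEXT[]` form of `path`, could look like the sketch below. The file names are hypothetical, and `union_by_name` is taken from DuckDB's own `read_parquet` parameters; this diff does not confirm that pg_duckdb exposes it.

```sql
-- Hypothetical file names; union_by_name is a DuckDB read_parquet parameter
-- and is assumed here to be passed through by pg_duckdb.
SELECT COUNT(*)
FROM read_parquet(
    ARRAY['part1.parquet', 'part2.parquet'],
    union_by_name := true
);
```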
@@ -95,14 +100,16 @@ Compatibility notes:
 * `columns` is not currently supported.
 * `nullstr` must be an array (`TEXT[]`).
 
-#### <a name="read_json"></a>`read_json(path TEXT or TEXT[], /* optional parameters */) -> SETOF record`
+#### <a name="read_json"></a>`read_json(path TEXT or TEXT[], /* optional parameters */) -> SETOF duckdb.row`
 
 Reads a JSON file, either from a remote location (via httpfs) or a local file.
 
-Returns a record set (`SETOF record`). Functions that return record sets need to have their columns and types specified using `AS`. You must specify at least one column and any columns used in your query. For example:
+This returns DuckDB rows. You can expand them using `*`, or you can select specific columns using the `r['mycol']` syntax. To select specific columns, give the function call a short alias, such as `r`. For example:
 
 ```sql
-SELECT COUNT(i) FROM read_json('file.json') AS (int i);
+SELECT * FROM read_json('file.json');
+SELECT r['id'], r['name'] FROM read_json('file.json') r WHERE r['age'] > 21;
+SELECT COUNT(*) FROM read_json('file.json');
 ```
 
 Further information:
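
The `read_csv` compatibility note at the top of this hunk says `nullstr` must be a `TEXT[]`. A minimal sketch of what that implies, with a hypothetical file name and hypothetical null markers:

```sql
-- nullstr is passed as an array, even when only one marker is needed.
SELECT * FROM read_csv('file.csv', nullstr := ARRAY['NA', 'NULL']);
SELECT * FROM read_csv('file.csv', nullstr := ARRAY['NA']);
```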
@@ -123,7 +130,7 @@ Compatibility notes:
 
 * `columns` is not currently supported.
 
-#### <a name="iceberg_scan"></a>`iceberg_scan(path TEXT, /* optional parameters */) -> SETOF record`
+#### <a name="iceberg_scan"></a>`iceberg_scan(path TEXT, /* optional parameters */) -> SETOF duckdb.row`
 
 Reads an Iceberg table, either from a remote location (via httpfs) or a local directory.
 
@@ -133,10 +140,12 @@ To use `iceberg_scan`, you must enable the `iceberg` extension:
 SELECT duckdb.install_extension('iceberg');
 ```
 
-Returns a record set (`SETOF record`). Functions that return record sets need to have their columns and types specified using `AS`. You must specify at least one column and any columns used in your query. For example:
+This returns DuckDB rows. You can expand them using `*`, or you can select specific columns using the `r['mycol']` syntax. To select specific columns, give the function call a short alias, such as `r`. For example:
 
 ```sql
-SELECT COUNT(i) FROM iceberg_scan('data/iceberg/table') AS (int i);
+SELECT * FROM iceberg_scan('data/iceberg/table');
+SELECT r['id'], r['name'] FROM iceberg_scan('data/iceberg/table') r WHERE r['age'] > 21;
+SELECT COUNT(*) FROM iceberg_scan('data/iceberg/table');
 ```
 
 Further information:
@@ -209,22 +218,25 @@ Optional parameters mirror DuckDB's `iceberg_metadata` function based on the Duc
 
 TODO
 
-#### <a name="delta_scan"></a>`delta_scan(path TEXT) -> SETOF record`
+#### <a name="delta_scan"></a>`delta_scan(path TEXT) -> SETOF duckdb.row`
 
 Reads a delta dataset, either from a remote (via httpfs) or a local location.
 
-Returns a record set (`SETOF record`). Functions that return record sets need to have their columns and types specified using `AS`. You must specify at least one column and any columns used in your query. For example:
-
 To use `delta_scan`, you must enable the `delta` extension:
 
 ```sql
 SELECT duckdb.install_extension('delta');
 ```
 
+This returns DuckDB rows. You can expand them using `*`, or you can select specific columns using the `r['mycol']` syntax. To select specific columns, give the function call a short alias, such as `r`. For example:
+
 ```sql
-SELECT COUNT(i) FROM delta_scan('/path/to/delta/dataset') AS (int i);
+SELECT * FROM delta_scan('/path/to/delta/dataset');
+SELECT r['id'], r['name'] FROM delta_scan('/path/to/delta/dataset') r WHERE r['age'] > 21;
+SELECT COUNT(*) FROM delta_scan('/path/to/delta/dataset');
 ```
 
+
 Further information:
 
 * [DuckDB Delta extension documentation](https://duckdb.org/docs/extensions/delta)
@@ -248,7 +260,6 @@ Note that cache management is not automated. Cached data must be deleted manuall
 | path | text | The path to a remote httpfs location to cache. |
 | type | text | File type, either `parquet` or `csv` |
 
-
 #### <a name="cache_info"></a>`duckdb.cache_info() -> (remote_path text, cache_key text, cache_file_size BIGINT, cache_file_timestamp TIMESTAMPTZ)`
 
 Inspects which remote files are currently cached in DuckDB. The returned data is as follows:
@@ -280,6 +291,34 @@ WHERE remote_path = '...';
 
 #### <a name="install_extension"></a>`duckdb.install_extension(extension_name TEXT) -> bool`
 
+Installs a DuckDB extension and configures it to be loaded automatically in
+every session that uses pg_duckdb.
+
+```sql
+SELECT duckdb.install_extension('iceberg');
+```
+
+##### Security
+
+Since this function can be used to download and install any of the official
+extensions, it can only be executed by a superuser by default. To allow
+execution by some other admin user, such as `my_admin`, you can grant such a
+user the following permissions:
+
+```sql
+GRANT ALL ON FUNCTION duckdb.install_extension(TEXT) TO my_admin;
+GRANT ALL ON TABLE duckdb.extensions TO my_admin;
+GRANT ALL ON SEQUENCE duckdb.extensions_table_seq TO my_admin;
+```
+
+##### Required Arguments
+
+| Name | Type | Description |
+| :--- | :--- | :---------- |
+| extension_name | text | The name of the extension to install |
+
+#### <a name="query"></a>`duckdb.query(query TEXT) -> SETOF duckdb.row`
+
 TODO
 
 #### <a name="raw_query"></a>`duckdb.raw_query(extension_name TEXT) -> void`
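
The `duckdb.query` entry added in this hunk still reads TODO. Going only by its signature (`query TEXT`, returning `SETOF duckdb.row`) and the summary-table description ("Runs a SELECT query directly against DuckDB"), a call might look like the sketch below. The inner query and the `answer` column are made up for illustration. The assumption that the result can be aliased and subscripted like the `read_*` functions is inferred from the shared return type, not stated in the diff.

```sql
-- Expand all columns of the DuckDB result.
SELECT * FROM duckdb.query('SELECT 42 AS answer');

-- Alias the call and pick a column by name
-- (assumed to work the same way as for read_parquet).
SELECT r['answer'] FROM duckdb.query('SELECT 42 AS answer') r;
```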
@@ -288,7 +327,14 @@ TODO
 
 #### <a name="recycle_ddb"></a>`duckdb.recycle_ddb() -> void`
 
-TODO
+pg_duckdb keeps the DuckDB instance open in between transactions. This is done
+to save session-level state, such as manually executed `SET` commands. If you
+want to clear this session-level state for some reason, you can close the
+currently open DuckDB instance using:
+
+```sql
+CALL duckdb.recycle_ddb();
+```
 
 #### <a name="force_motherduck_sync"></a>`duckdb.force_motherduck_sync(drop_with_cascade BOOLEAN DEFAULT false)`
 