Commit 93a7338

Add PolarsDataFrameIterator documentation to polars.rst
Document the PolarsDataFrameIterator class and its as_polars() method for consistency with pandas.rst documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent e8873dd commit 93a7338

1 file changed, +45 -0 lines changed

docs/polars.rst

Lines changed: 45 additions & 0 deletions
@@ -334,6 +334,51 @@ The chunked iteration also works with the unload option:
         # Process Parquet data in chunks
         process_chunk(chunk)
 
+When the chunksize option is used, the object returned by the ``as_polars`` method is a ``PolarsDataFrameIterator`` object.
+This object provides the same chunked iteration interface and can be used in the same way:
+
+.. code:: python
+
+    from pyathena import connect
+    from pyathena.polars.cursor import PolarsCursor
+
+    cursor = connect(s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
+                     region_name="us-west-2",
+                     cursor_class=PolarsCursor).cursor(chunksize=50_000)
+    df_iter = cursor.execute("SELECT * FROM many_rows").as_polars()
+    for df in df_iter:
+        print(df.describe())
+        print(df.head())
+
+The ``PolarsDataFrameIterator`` also has an ``as_polars()`` method that collects all chunks into a single DataFrame:
+
+.. code:: python
+
+    from pyathena import connect
+    from pyathena.polars.cursor import PolarsCursor
+
+    cursor = connect(s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
+                     region_name="us-west-2",
+                     cursor_class=PolarsCursor).cursor(chunksize=50_000)
+    df_iter = cursor.execute("SELECT * FROM many_rows").as_polars()
+    df = df_iter.as_polars()  # Collect all chunks into a single DataFrame
+
+This is equivalent to using `polars.concat`_:
+
+.. code:: python
+
+    import polars as pl
+    from pyathena import connect
+    from pyathena.polars.cursor import PolarsCursor
+
+    cursor = connect(s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
+                     region_name="us-west-2",
+                     cursor_class=PolarsCursor).cursor(chunksize=50_000)
+    df_iter = cursor.execute("SELECT * FROM many_rows").as_polars()
+    df = pl.concat(list(df_iter))
+
+.. _`polars.concat`: https://docs.pola.rs/api/python/stable/reference/api/polars.concat.html
+
 .. _async-polars-cursor:

AsyncPolarsCursor
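
The pattern the added docs describe (iterate chunk by chunk, or collect everything at once) can be sketched without Athena or Polars. The following is a minimal illustration only: `ChunkIterator` and its `collect` method are hypothetical names, plain lists of dict rows stand in for DataFrames, and nothing here is PyAthena's actual implementation.

```python
from typing import Dict, Iterator, List

Row = Dict[str, int]


class ChunkIterator:
    """Hypothetical stand-in for a chunked result iterator (not PyAthena code)."""

    def __init__(self, rows: List[Row], chunksize: int) -> None:
        if chunksize <= 0:
            raise ValueError("chunksize must be positive")
        self._rows = rows
        self._chunksize = chunksize
        self._offset = 0

    def __iter__(self) -> Iterator[List[Row]]:
        return self

    def __next__(self) -> List[Row]:
        # Stop once every row has been handed out.
        if self._offset >= len(self._rows):
            raise StopIteration
        chunk = self._rows[self._offset:self._offset + self._chunksize]
        self._offset += self._chunksize
        return chunk

    def collect(self) -> List[Row]:
        """Drain the remaining chunks into a single list of rows."""
        combined: List[Row] = []
        for chunk in self:
            combined.extend(chunk)
        return combined


rows = [{"id": i} for i in range(7)]

# Chunked iteration: each step yields at most `chunksize` rows.
chunk_sizes = [len(chunk) for chunk in ChunkIterator(rows, chunksize=3)]

# Collecting in one call gives the same rows as concatenating the chunks by hand.
collected = ChunkIterator(rows, chunksize=3).collect()
concatenated = [row for chunk in ChunkIterator(rows, chunksize=3) for row in chunk]
```

In this sketch the iterator is single-pass, so a fresh instance is created for each use; whether `PolarsDataFrameIterator` can be re-iterated after collection is not stated in the diff.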
