BigQuery DataFrames provides a Pythonic DataFrame and machine learning (ML) API powered by the BigQuery engine.
- bigframes.pandas provides a pandas-compatible API for analytics.
- bigframes.ml provides a scikit-learn-like API for ML.
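As a rough illustration of the pandas-compatible API, the sketch below aggregates a table with familiar DataFrame calls. It assumes a configured Google Cloud project and credentials, and it assumes the public table bigquery-public-data.ml_datasets.penguins with columns species and body_mass_g. The function name is hypothetical, and the import is placed inside the function only so that defining it requires no cloud setup.

```python
def average_body_mass_by_species():
    """Return the mean body mass per penguin species as a local pandas Series."""
    # Imported lazily: defining this function needs neither bigframes nor GCP access.
    import bigframes.pandas as bpd

    # Assumption: a public sample table; replace with your own table ID.
    df = bpd.read_gbq("bigquery-public-data.ml_datasets.penguins")

    # pandas-style operations; the computation is executed by BigQuery.
    means = df.groupby("species")["body_mass_g"].mean()

    # Download the small aggregated result into local memory.
    return means.to_pandas()
```

Calling average_body_mass_by_species() runs the query in BigQuery and downloads only the aggregated result.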
BigQuery DataFrames is an open-source package. You can run pip install --upgrade bigframes to install the latest version.
Version 2.0 introduces breaking changes for improved security and performance. Key default behaviors have changed:
- Large Results (>10GB): The default value for allow_large_results has changed to False. Methods like to_pandas() will now fail if the query result's compressed data size exceeds 10GB, unless large results are explicitly permitted.
- Remote Function Security: The library no longer automatically lets the Compute Engine default service account become the identity of the Cloud Run functions. To keep that behavior, pass cloud_function_service_account="default". Network ingress now defaults to "internal-only".
- @remote_function Argument Passing: Arguments other than input_types, output_type, and dataset to remote_function must now be passed using keyword syntax, as positional arguments are no longer supported.
- Endpoint Connections: Automatic fallback to locational endpoints in certain regions is removed.
- LLM Updates (Gemini Integration): Integrations now default to the gemini-2.0-flash-001 model. PaLM2 support has been removed; please migrate any existing PaLM2 usage to Gemini. Note: The current default model will be removed in Version 3.0.
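The following is a minimal sketch of opting in to the previous behaviors listed above. It is not the official migration recipe. The parameter names allow_large_results (on to_pandas) and cloud_function_ingress_settings are assumptions about the 2.x API, and the helper names are hypothetical. Imports sit inside the functions so the snippet can be loaded without cloud access.

```python
# Every remote_function option is given by keyword, as 2.0 requires.
REMOTE_FUNCTION_KWARGS = {
    "input_types": [float],
    "output_type": float,
    # Explicit opt-in to the Compute Engine default service account.
    "cloud_function_service_account": "default",
    # Assumed parameter name; "internal-only" is the 2.0 default, shown for clarity.
    "cloud_function_ingress_settings": "internal-only",
}


def fetch_large_result(query):
    """Download a query result even if it exceeds the 10GB limit."""
    import bigframes.pandas as bpd

    df = bpd.read_gbq(query)
    # Assumption: to_pandas accepts allow_large_results as a per-call override.
    return df.to_pandas(allow_large_results=True)


def deploy_doubler():
    """Deploy a remote function that doubles a float (creates cloud resources)."""
    import bigframes.pandas as bpd

    @bpd.remote_function(**REMOTE_FUNCTION_KWARGS)
    def double(x):
        return x * 2.0

    return double
```

Note that deploy_doubler() creates a Cloud Run function in your project when called, so it should be run only where that is intended.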
Important: If you are not ready to adapt to these changes, please pin your dependency to a version less than 2.0 (e.g., bigframes==1.42.0) to avoid disruption.
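Besides an exact pin, a range specifier such as bigframes<2.0 in a requirements file should also keep installs below 2.0. To confirm at runtime which side of the breaking changes an environment is on, a small check along these lines may help. The helper names are hypothetical, and the parsing assumes a plain MAJOR.MINOR.PATCH version string.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string):
    """Return the leading integer of a version string like '1.42.0'."""
    return int(version_string.split(".")[0])


def has_breaking_changes(version_string):
    """True if the version is 2.0 or later, where the new defaults apply."""
    return major_version(version_string) >= 2


def installed_bigframes_version():
    """Return the installed bigframes version string, or None if not installed."""
    try:
        return version("bigframes")
    except PackageNotFoundError:
        return None
```

For example, has_breaking_changes("1.42.0") is False, while has_breaking_changes("2.0.0") is True.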
To learn about these changes and how to migrate to version 2.0, see the updated introduction guide.
- BigQuery DataFrames source code (GitHub)
- BigQuery DataFrames sample notebooks
- BigQuery DataFrames API reference
- BigQuery DataFrames supported pandas APIs
Read Introduction to BigQuery DataFrames and try the BigQuery DataFrames quickstart to get up and running in just a few minutes.
BigQuery DataFrames is distributed with the Apache-2.0 license.
It also contains code derived from the following third-party packages:
For details, see the third_party directory.
For further help or to provide feedback, you can email us at [email protected].