BigQuery DataFrames (BigFrames)#

BigQuery DataFrames (also known as BigFrames) provides a Pythonic DataFrame and machine learning (ML) API powered by the BigQuery engine. It provides modules for many use cases, including:

  • bigframes.pandas is a pandas API for analytics. Many workloads can be migrated from pandas to bigframes by just changing a few imports.

  • bigframes.ml is a scikit-learn-like API for ML.

  • bigframes.bigquery.ai is a collection of powerful AI methods, powered by Gemini.
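As a rough illustration of the "changing a few imports" idea, the sketch below picks a DataFrame module at import time so the rest of a script can keep using the same pandas-style calls. The helper name load_dataframe_api is hypothetical and not part of bigframes, and the sketch assumes your code only uses pandas features that BigQuery DataFrames also supports.

```python
import importlib


def load_dataframe_api(prefer_bigframes=True):
    """Return bigframes.pandas or pandas, whichever is preferred and installed.

    Hypothetical helper for illustration only; it is not part of bigframes.
    """
    names = ["bigframes.pandas", "pandas"]
    if not prefer_bigframes:
        names.reverse()
    for name in names:
        try:
            # Equivalent to `import bigframes.pandas as pd` or `import pandas as pd`.
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("Neither bigframes nor pandas is installed.")
```

In most projects a plain import bigframes.pandas as bpd is simpler. A switch like this is mainly useful while a codebase is being migrated and both backends still need to run.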

BigQuery DataFrames is an open-source package.

Getting started with BigQuery DataFrames#

The easiest way to get started is to try the BigFrames quickstart in a notebook in BigQuery Studio.

To use BigFrames in your local development environment:

  1. Run pip install --upgrade bigframes to install the latest version.

  2. Set up Application Default Credentials for your local development environment.

  3. Create a GCP project with the BigQuery API enabled.

  4. Use the bigframes package to query data.

import bigframes.pandas as bpd

# Replace the placeholder string with your own Google Cloud project ID.
bpd.options.bigquery.project = "your_gcp_project_id"  # Optional in BQ Studio.
bpd.options.bigquery.ordering_mode = "partial"  # Recommended for performance.
df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013")
print(
    df.groupby("name")
    .agg({"number": "sum"})
    .sort_values("number", ascending=False)
    .head(10)
    .to_pandas()
)
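Step 2 above relies on Application Default Credentials (ADC) being discoverable on your machine. The sketch below is one way to check for a local credentials file before running a query. The helper name find_adc_file is hypothetical. It only looks at the GOOGLE_APPLICATION_CREDENTIALS environment variable and the default file written by gcloud auth application-default login, so it will not detect credentials supplied by a metadata server or a custom gcloud configuration directory.

```python
import os
from pathlib import Path


def find_adc_file(env=None, home=None):
    """Return the path of a local ADC credentials file, or None if none is found.

    Hypothetical helper for illustration; env and home can be overridden for testing.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. A file named explicitly by GOOGLE_APPLICATION_CREDENTIALS is checked first.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)

    # 2. Otherwise, look for the file that
    #    `gcloud auth application-default login` writes by default.
    if os.name == "nt":
        appdata = env.get("APPDATA")
        base = Path(appdata) if appdata else home / "AppData" / "Roaming"
    else:
        base = home / ".config"
    well_known = base / "gcloud" / "application_default_credentials.json"
    return well_known if well_known.is_file() else None


if __name__ == "__main__":
    found = find_adc_file()
    if found:
        print(f"Found ADC file: {found}")
    else:
        print("No local ADC file found. Try: gcloud auth application-default login")
```

A None result does not prove that authentication will fail, only that no local credentials file was found in the two locations checked.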

Documentation#

To learn more about BigQuery DataFrames, visit these pages

License#

BigQuery DataFrames is distributed with the Apache-2.0 license.

It also contains code derived from the following third-party packages:

For details, see the third_party directory.

Contact Us#

For further help or to provide feedback, email us at bigframes-feedback@google.com.

API reference#

Changelog#

For a list of all BigQuery DataFrames releases: