Welcome to BigQuery DataFrames
BigQuery DataFrames (bigframes) provides a Pythonic interface for data analysis that scales to petabytes. It gives you the best of both worlds: the familiar API of pandas and scikit-learn, powered by the distributed computing engine of BigQuery.
BigQuery DataFrames consists of three main components:
bigframes.pandas: A pandas-compatible API for data exploration and transformation.
bigframes.ml: A scikit-learn-like interface for BigQuery ML, including integration with Gemini.
bigframes.bigquery: Specialized functions for managing BigQuery resources and deploying custom logic.
Why BigQuery DataFrames?
BigFrames allows you to process data where it lives. Instead of downloading massive datasets to your local machine, BigFrames translates your Python code into SQL and executes it across the BigQuery fleet.
Scalability: Work with datasets that exceed local memory limits without complex refactoring.
Collaboration & Extensibility: Bridge the gap between Python and SQL. Deploy custom Python functions to BigQuery, making your logic accessible to SQL-based teammates and data analysts.
Production-Ready Pipelines: Move seamlessly from interactive notebooks to production. BigFrames simplifies data engineering by integrating with tools like dbt and Airflow, offering a simpler operational model than Spark.
Security & Governance: Keep your data within the BigQuery perimeter. Benefit from enterprise-grade security, auditing, and data governance throughout your entire Python workflow.
Familiarity: Use read_gbq, merge, groupby, and pivot_table just like you do in pandas.
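The idea of translating a chain of DataFrame-style calls into SQL, instead of running each step on the local machine, can be illustrated with a toy sketch. The `LazyTable` class and its `to_sql` method below are hypothetical and are not how bigframes is implemented. They only show how method calls can accumulate into a single query string that a warehouse such as BigQuery would then execute.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical toy: records operations and renders them as one SQL string."""

    table: str
    group_col: Optional[str] = None
    agg_spec: Optional[Tuple[str, str]] = None   # (column, function)
    order: Optional[Tuple[str, bool]] = None     # (column, ascending)
    limit: Optional[int] = None

    # Each method returns a new object; nothing is executed or downloaded here.
    def groupby(self, col: str) -> "LazyTable":
        return replace(self, group_col=col)

    def agg(self, col: str, func: str) -> "LazyTable":
        return replace(self, agg_spec=(col, func))

    def sort_values(self, col: str, ascending: bool = True) -> "LazyTable":
        return replace(self, order=(col, ascending))

    def head(self, n: int) -> "LazyTable":
        return replace(self, limit=n)

    def to_sql(self) -> str:
        """Render the recorded operations as a single SQL statement."""
        grouped = self.group_col is not None and self.agg_spec is not None
        if grouped:
            col, func = self.agg_spec
            select = f"{self.group_col}, {func.upper()}({col}) AS {col}"
        else:
            select = "*"
        parts = [f"SELECT {select}", f"FROM `{self.table}`"]
        if grouped:
            parts.append(f"GROUP BY {self.group_col}")
        if self.order is not None:
            col, ascending = self.order
            parts.append(f"ORDER BY {col} {'ASC' if ascending else 'DESC'}")
        if self.limit is not None:
            parts.append(f"LIMIT {self.limit}")
        return " ".join(parts)


query = (
    LazyTable("bigquery-public-data.usa_names.usa_1910_2013")
    .groupby("name")
    .agg("number", "sum")
    .sort_values("number", ascending=False)
    .head(10)
)
sql = query.to_sql()
print(sql)
```

Because every step only records intent, the data never has to leave the warehouse until a result is explicitly requested. The SQL that bigframes actually generates will differ from this toy output.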
Quickstart
Install the library via pip:
pip install --upgrade bigframes
Load and aggregate a public dataset in just a few lines:
import bigframes.pandas as bpd
# Load data from BigQuery
df = bpd.read_gbq("bigquery-public-data.usa_names.usa_1910_2013")
# Perform familiar pandas operations at scale
top_names = (
    df.groupby("name")
    .agg({"number": "sum"})
    .sort_values("number", ascending=False)
    .head(10)
)
print(top_names.to_pandas())
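To see what the aggregation chain above computes without a BigQuery project, the same method calls can be tried on a small in-memory pandas DataFrame. This is a local sketch only: the sample rows are made up for illustration and are not taken from the public dataset, and `head(2)` is used because the sample has just three distinct names.

```python
import pandas as pd

# Made-up rows that mimic the "name" and "number" columns used in the quickstart.
sample = pd.DataFrame(
    {
        "name": ["Mary", "John", "Mary", "Anna", "John", "Mary"],
        "number": [10, 7, 5, 3, 2, 1],
    }
)

# Same chain as the quickstart, run locally by pandas.
top_names_local = (
    sample.groupby("name")
    .agg({"number": "sum"})
    .sort_values("number", ascending=False)
    .head(2)
)
print(top_names_local)
```

Here Mary (16) and John (9) come out on top. In the quickstart the equivalent work is carried out by BigQuery, and only the ten-row result is brought into local memory by `to_pandas()`.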
User Guide
API reference
- API Reference
- bigframes._config
- bigframes.bigquery
- bigframes.bigquery.approx_top_count
- bigframes.bigquery.array_agg
- bigframes.bigquery.array_length
- bigframes.bigquery.array_to_string
- bigframes.bigquery.create_external_table
- bigframes.bigquery.create_vector_index
- bigframes.bigquery.json_extract
- bigframes.bigquery.json_extract_array
- bigframes.bigquery.json_extract_string_array
- bigframes.bigquery.json_keys
- bigframes.bigquery.json_query
- bigframes.bigquery.json_query_array
- bigframes.bigquery.json_set
- bigframes.bigquery.json_value
- bigframes.bigquery.json_value_array
- bigframes.bigquery.load_data
- bigframes.bigquery.parse_json
- bigframes.bigquery.sql_scalar
- bigframes.bigquery.st_area
- bigframes.bigquery.st_buffer
- bigframes.bigquery.st_centroid
- bigframes.bigquery.st_convexhull
- bigframes.bigquery.st_difference
- bigframes.bigquery.st_distance
- bigframes.bigquery.st_intersection
- bigframes.bigquery.st_isclosed
- bigframes.bigquery.st_length
- bigframes.bigquery.st_regionstats
- bigframes.bigquery.st_simplify
- bigframes.bigquery.struct
- bigframes.bigquery.to_json
- bigframes.bigquery.to_json_string
- bigframes.bigquery.unix_micros
- bigframes.bigquery.unix_millis
- bigframes.bigquery.unix_seconds
- bigframes.bigquery.vector_search
- bigframes.bigquery.ai
- bigframes.bigquery.ai.classify
- bigframes.bigquery.ai.forecast
- bigframes.bigquery.ai.generate
- bigframes.bigquery.ai.generate_bool
- bigframes.bigquery.ai.generate_double
- bigframes.bigquery.ai.generate_embedding
- bigframes.bigquery.ai.generate_int
- bigframes.bigquery.ai.generate_table
- bigframes.bigquery.ai.generate_text
- bigframes.bigquery.ai.if_
- bigframes.bigquery.ai.score
- bigframes.bigquery.ml
- bigframes.bigquery.obj
- bigframes.enums
- bigframes.exceptions
- bigframes.exceptions.format_message
- bigframes.exceptions.AmbiguousWindowWarning
- bigframes.exceptions.ApiDeprecationWarning
- bigframes.exceptions.BadIndexerKeyWarning
- bigframes.exceptions.CleanupFailedWarning
- bigframes.exceptions.DefaultIndexWarning
- bigframes.exceptions.DefaultLocationWarning
- bigframes.exceptions.FunctionAxisOnePreviewWarning
- bigframes.exceptions.FunctionConflictTypeHintWarning
- bigframes.exceptions.FunctionPackageVersionWarning
- bigframes.exceptions.JSONDtypeWarning
- bigframes.exceptions.MaximumResultRowsExceeded
- bigframes.exceptions.NullIndexError
- bigframes.exceptions.NullIndexPreviewWarning
- bigframes.exceptions.ObsoleteVersionWarning
- bigframes.exceptions.OperationAbortedError
- bigframes.exceptions.OrderRequiredError
- bigframes.exceptions.OrderingModePartialPreviewWarning
- bigframes.exceptions.PreviewWarning
- bigframes.exceptions.QueryComplexityError
- bigframes.exceptions.TimeTravelCacheWarning
- bigframes.exceptions.TimeTravelDisabledWarning
- bigframes.exceptions.UnknownDataTypeWarning
- bigframes.exceptions.UnknownLocationWarning
- bigframes.geopandas
- bigframes.pandas
- bigframes.pandas.clean_up_by_session_id
- bigframes.pandas.close_session
- bigframes.pandas.col
- bigframes.pandas.concat
- bigframes.pandas.crosstab
- bigframes.pandas.cut
- bigframes.pandas.deploy_remote_function
- bigframes.pandas.deploy_udf
- bigframes.pandas.from_glob_path
- bigframes.pandas.get_default_session_id
- bigframes.pandas.get_dummies
- bigframes.pandas.get_global_session
- bigframes.pandas.merge
- bigframes.pandas.qcut
- bigframes.pandas.read_arrow
- bigframes.pandas.read_csv
- bigframes.pandas.read_gbq
- bigframes.pandas.read_gbq_function
- bigframes.pandas.read_gbq_model
- bigframes.pandas.read_gbq_object_table
- bigframes.pandas.read_gbq_query
- bigframes.pandas.read_gbq_table
- bigframes.pandas.read_json
- bigframes.pandas.read_pandas
- bigframes.pandas.read_parquet
- bigframes.pandas.read_pickle
- bigframes.pandas.remote_function
- bigframes.pandas.reset_session
- bigframes.pandas.to_datetime
- bigframes.pandas.to_timedelta
- bigframes.pandas.udf
- bigframes.pandas.DataFrame
- bigframes.pandas.DatetimeIndex
- bigframes.pandas.Index
- bigframes.pandas.MultiIndex
- bigframes.pandas.NamedAgg
- bigframes.pandas.Series
- bigframes.pandas.api.typing
- bigframes.streaming
- ML APIs
- bigframes.ml
- bigframes.ml.cluster
- bigframes.ml.compose
- bigframes.ml.decomposition
- bigframes.ml.ensemble
- bigframes.ml.forecasting
- bigframes.ml.imported
- bigframes.ml.impute
- bigframes.ml.linear_model
- bigframes.ml.llm
- bigframes.ml.metrics
- bigframes.ml.model_selection
- bigframes.ml.pipeline
- bigframes.ml.preprocessing
- bigframes.ml.remote
- Supported pandas APIs
Changelog
For a list of all BigQuery DataFrames releases, see the changelog:
- Changelog
- 2.36.0 (2026-02-17)
- 2.35.0 (2026-02-07)
- 2.34.0 (2026-02-02)
- 2.33.0 (2026-01-22)
- 2.32.0 (2026-01-05)
- 2.31.0 (2025-12-10)
- 2.30.0 (2025-12-03)
- 2.29.0 (2025-11-10)
- 2.28.0 (2025-11-03)
- 2.27.0 (2025-10-24)
- 2.26.0 (2025-10-17)
- 2.25.0 (2025-10-13)
- 2.24.0 (2025-10-07)
- 2.23.0 (2025-09-29)
- 2.22.0 (2025-09-25)
- 2.21.0 (2025-09-17)
- 2.20.0 (2025-09-16)
- 2.19.0 (2025-09-09)
- 2.18.0 (2025-09-03)
- 2.17.0 (2025-08-22)
- 2.16.0 (2025-08-20)
- 2.15.0 (2025-08-11)
- 2.14.0 (2025-08-05)
- 2.13.0 (2025-07-25)
- 2.12.0 (2025-07-23)
- 2.11.0 (2025-07-15)
- 2.10.0 (2025-07-08)
- 2.9.0 (2025-06-30)
- 2.8.0 (2025-06-23)
- 2.7.0 (2025-06-16)
- 2.6.0 (2025-06-09)
- 2.5.0 (2025-05-30)
- 2.4.0 (2025-05-12)
- 2.3.0 (2025-05-06)
- 2.2.0 (2025-04-30)
- 2.1.0 (2025-04-22)
- 2.0.0 (2025-04-17)
- 1.42.0 (2025-03-27)
- 1.41.0 (2025-03-19)
- 1.40.0 (2025-03-11)
- 1.39.0 (2025-03-05)
- 1.38.0 (2025-02-24)
- 1.37.0 (2025-02-19)
- 1.36.0 (2025-02-11)
- 1.35.0 (2025-02-04)
- 1.34.0 (2025-01-27)
- 1.33.0 (2025-01-22)
- 1.32.0 (2025-01-13)
- 1.31.0 (2025-01-05)
- 1.30.0 (2024-12-30)
- 1.29.0 (2024-12-12)
- 1.28.0 (2024-12-11)
- 1.27.0 (2024-11-16)
- 1.26.0 (2024-11-12)
- 1.25.0 (2024-10-29)
- 1.24.0 (2024-10-24)
- 1.23.0 (2024-10-23)
- 1.22.0 (2024-10-09)
- 1.21.0 (2024-10-02)
- 1.20.0 (2024-09-25)
- 1.19.0 (2024-09-24)
- 1.18.0 (2024-09-18)
- 1.17.0 (2024-09-11)
- 1.16.0 (2024-09-04)
- 1.15.0 (2024-08-20)
- 1.14.0 (2024-08-14)
- 1.13.0 (2024-08-05)
- 1.12.0 (2024-07-31)
- 1.11.1 (2024-07-08)
- 1.11.0 (2024-07-01)
- 1.10.0 (2024-06-21)
- 1.9.0 (2024-06-10)
- 1.8.0 (2024-05-31)
- 1.7.0 (2024-05-20)
- 1.6.0 (2024-05-13)
- 1.5.0 (2024-05-07)
- 1.4.0 (2024-04-29)
- 1.3.0 (2024-04-22)
- 1.2.0 (2024-04-15)
- 1.1.0 (2024-04-04)
- 1.0.0 (2024-03-25)
- 0.26.0 (2024-03-20)
- 0.25.0 (2024-03-14)
- 0.24.0 (2024-03-12)
- 0.23.0 (2024-03-05)
- 0.22.0 (2024-02-27)
- 0.21.0 (2024-02-13)
- 0.20.1 (2024-02-06)
- 0.20.0 (2024-01-30)
- 0.19.2 (2024-01-22)
- 0.19.1 (2024-01-17)
- 0.19.0 (2024-01-09)
- 0.18.0 (2024-01-02)
- 0.17.0 (2023-12-14)
- 0.16.0 (2023-12-12)
- 0.15.0 (2023-11-29)
- 0.14.1 (2023-11-16)
- 0.14.0 (2023-11-14)
- 0.13.0 (2023-11-07)
- 0.12.0 (2023-11-01)
- 0.11.0 (2023-10-26)
- 0.10.0 (2023-10-19)
- 0.9.0 (2023-10-18)
- 0.8.0 (2023-10-12)
- 0.7.0 (2023-10-11)
- 0.6.0 (2023-10-04)
- 0.5.0 (2023-09-28)
- 0.4.0 (2023-09-16)
- 0.3.2 (2023-09-06)
- 0.3.1 (2023-09-05)
- 0.3.0 (2023-09-02)
- 0.2.0 (2023-08-17)
- 0.1.1 (2023-08-14)
- 0.1.0 (2023-08-11)
- 0.0.0 (2023-02-22)