# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Analyzing movie posters with BigQuery DataFrames AI functions#
BigQuery DataFrames provides a Pythonic way to use AI functions directly on your DataFrames. In this notebook, you will use these functions to analyze classic
movie posters. The posters are images stored in a public Google Cloud Storage bucket: gs://cloud-samples-data/vertex-ai/dataset-management/datasets/classic-movie-posters
Set up#
Before you begin, you need to:
- Set up your permissions for generative AI functions with these instructions
- Set up your Cloud Resource connection by following these instructions
Once you have the permissions set up, import the bigframes.pandas package, and
set your cloud project ID.
import bigframes.pandas as bpd
MY_PROJECT_ID = "bigframes-dev" # @param {type:"string"}
bpd.options.bigquery.project = MY_PROJECT_ID
Load data#
First, load the data from the GCS bucket into a BigQuery DataFrames DataFrame with the from_glob_path method:
# Replace with your own connection name.
MY_CONNECTION = 'bigframes-default-connection' # @param {type:"string"}
movies = bpd.from_glob_path(
    "gs://cloud-samples-data/vertex-ai/dataset-management/datasets/classic-movie-posters/*",
    connection=MY_CONNECTION,
    name='poster')
movies.head(1)
/usr/local/lib/python3.12/dist-packages/bigframes/core/global_session.py:113: DefaultLocationWarning: No explicit location is set, so using location US for the session.
_global_session = bigframes.session.connect(
/usr/local/lib/python3.12/dist-packages/bigframes/dtypes.py:1010: JSONDtypeWarning: JSON columns will be represented as pandas.ArrowDtype(pyarrow.json_())
instead of using `db_dtypes` in the future when available in pandas
(https://github.com/pandas-dev/pandas/issues/60958) and pyarrow.
warnings.warn(msg, bigframes.exceptions.JSONDtypeWarning)
/usr/local/lib/python3.12/dist-packages/bigframes/core/logging/log_adapter.py:229: ApiDeprecationWarning: The blob accessor is deprecated and will be removed in a future release. Use bigframes.bigquery.obj functions instead.
return prop(*args, **kwargs)
| | poster |
|---|---|
| 0 | ![]() |
1 rows × 1 columns
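The trailing `*` in the path passed to from_glob_path above acts as a wildcard over object names. The sketch below only illustrates that idea locally with Python's `fnmatch` module; it does not call BigQuery or Cloud Storage, the object names are hypothetical, and the service's real wildcard rules may differ from `fnmatch` in edge cases.

```python
from fnmatch import fnmatchcase

# The glob pattern used in the notebook.
PATTERN = (
    "gs://cloud-samples-data/vertex-ai/dataset-management/"
    "datasets/classic-movie-posters/*"
)
PREFIX = PATTERN[:-1]  # everything before the wildcard

# Hypothetical object URIs, invented for illustration only.
candidates = [
    PREFIX + "poster_a.jpg",
    PREFIX + "poster_b.png",
    "gs://cloud-samples-data/some-other-folder/file.jpg",
]

# Keep only the URIs that the wildcard pattern would select.
matched = [uri for uri in candidates if fnmatchcase(uri, PATTERN)]
print(len(matched), "of", len(candidates), "URIs match the pattern")
```

Each matched object becomes one row in the resulting DataFrame, which is why the pattern ends in `/*` rather than naming a single file.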
Extract titles from posters#
import bigframes.bigquery as bbq
movies['title'] = bbq.ai.generate(
    ("What is the movie title for this poster? Name only", movies['poster']),
    endpoint='gemini-2.5-pro'
).struct.field("result")
movies.head(1)
/usr/local/lib/python3.12/dist-packages/bigframes/dtypes.py:1010: JSONDtypeWarning: JSON columns will be represented as pandas.ArrowDtype(pyarrow.json_())
instead of using `db_dtypes` in the future when available in pandas
(https://github.com/pandas-dev/pandas/issues/60958) and pyarrow.
warnings.warn(msg, bigframes.exceptions.JSONDtypeWarning)
/usr/local/lib/python3.12/dist-packages/bigframes/core/logging/log_adapter.py:229: ApiDeprecationWarning: The blob accessor is deprecated and will be removed in a future release. Use bigframes.bigquery.obj functions instead.
return prop(*args, **kwargs)
| | poster | title |
|---|---|---|
| 0 | ![]() | Der Student von Prag |
1 rows × 2 columns
Notice that ai.generate() returns a struct, which holds not only the LLM response but also the status. If you do not provide a field name for your answer, "result" is the default name. You can access the LLM response content with the struct accessor (e.g. my_response.struct.field("result")).
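To make the shape of that struct concrete, here is a minimal local sketch that models each response as a plain Python dict with a "result" and a "status" field. It does not call BigQuery. The sample rows and the error text are invented, and treating an empty status as success is an assumption to verify against the AI function documentation.

```python
# Local stand-in for the per-row struct values returned by ai.generate().
# The rows and the error text are invented for illustration only.
responses = [
    {"result": "Der Student von Prag", "status": ""},
    {"result": None, "status": "hypothetical error message"},
    {"result": "Shoulder Arms", "status": ""},
]


def successful_results(rows):
    """Return the "result" of every row whose status is empty.

    Assumption: an empty status string means the call succeeded.
    """
    return [row["result"] for row in rows if not row["status"]]


titles = successful_results(responses)
print(titles)
```

On a real response Series, the same struct accessor shown above would presumably expose the status with `.struct.field("status")`, so rows that failed could be inspected before the results are used.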
Get movie release year#
In the example below, you will use ai.generate_int() to find the release year for each movie poster:
movies['year'] = bbq.ai.generate_int(
    ("What is the release year for this movie?", movies['title']),
    endpoint='gemini-2.5-pro'
).struct.field("result")
movies.head(1)
/usr/local/lib/python3.12/dist-packages/bigframes/dtypes.py:1010: JSONDtypeWarning: JSON columns will be represented as pandas.ArrowDtype(pyarrow.json_())
instead of using `db_dtypes` in the future when available in pandas
(https://github.com/pandas-dev/pandas/issues/60958) and pyarrow.
warnings.warn(msg, bigframes.exceptions.JSONDtypeWarning)
/usr/local/lib/python3.12/dist-packages/bigframes/core/logging/log_adapter.py:229: ApiDeprecationWarning: The blob accessor is deprecated and will be removed in a future release. Use bigframes.bigquery.obj functions instead.
return prop(*args, **kwargs)
| | poster | title | year |
|---|---|---|---|
| 0 | ![]() | Der Student von Prag | 1913 |
1 rows × 3 columns
movies.dtypes
/usr/local/lib/python3.12/dist-packages/bigframes/dtypes.py:1010: JSONDtypeWarning: JSON columns will be represented as pandas.ArrowDtype(pyarrow.json_())
instead of using `db_dtypes` in the future when available in pandas
(https://github.com/pandas-dev/pandas/issues/60958) and pyarrow.
warnings.warn(msg, bigframes.exceptions.JSONDtypeWarning)
| | 0 |
|---|---|
| poster | struct<uri: string, version: string, authorize... |
| title | string[pyarrow] |
| year | Int64 |
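Because the year column holds model-generated integers, a quick sanity check can be useful before relying on the values. The sketch below is a local, plain-Python illustration and not part of the BigQuery DataFrames API: the helper name, the lower bound of 1888, and the deliberately bad sample row are all assumptions made for the example.

```python
from datetime import date

# Assumed lower bound for a film release year; adjust to your own data.
EARLIEST_FILM_YEAR = 1888


def is_plausible_year(year, today=None):
    """Return True when year lies between the assumed bound and the current year."""
    if year is None:  # a nullable Int64 column may contain missing values
        return False
    latest = (today or date.today()).year
    return EARLIEST_FILM_YEAR <= year <= latest


# Two values from the notebook output plus one invented bad row.
years = {
    "Der Student von Prag": 1913,
    "Shoulder Arms": 1918,
    "hypothetical bad row": 19130,
}

# Collect the titles whose year fails the check.
flagged = [title for title, year in years.items() if not is_plausible_year(year)]
print(flagged)
```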
Filter movies by production country#
In the next example, you will use ai.if_() to find the movies that were produced in the USA.
us_movies = movies[bbq.ai.if_(
    ("The movie ", movies['title'], " was made in US")
)]
us_movies.head(1)
/usr/local/lib/python3.12/dist-packages/bigframes/dtypes.py:1010: JSONDtypeWarning: JSON columns will be represented as pandas.ArrowDtype(pyarrow.json_())
instead of using `db_dtypes` in the future when available in pandas
(https://github.com/pandas-dev/pandas/issues/60958) and pyarrow.
warnings.warn(msg, bigframes.exceptions.JSONDtypeWarning)
/usr/local/lib/python3.12/dist-packages/bigframes/core/logging/log_adapter.py:229: ApiDeprecationWarning: The blob accessor is deprecated and will be removed in a future release. Use bigframes.bigquery.obj functions instead.
return prop(*args, **kwargs)
| | poster | title | year |
|---|---|---|---|
| 8 | ![]() | Shoulder Arms | 1918 |
1 rows × 3 columns
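The prompt passed to ai.if_() above is a tuple that mixes string literals with a column, and the result is used as a boolean mask on the DataFrame. The sketch below imitates both steps locally in plain Python, as a conceptual illustration only. The `build_prompts` helper is hypothetical, the mask values are hard-coded rather than produced by a model, and the way the service actually combines the prompt parts may differ.

```python
def build_prompts(parts, n_rows):
    """Expand a prompt tuple into one prompt string per row.

    String parts are treated as literals repeated on every row;
    list parts are treated as columns indexed by row.
    """
    prompts = []
    for i in range(n_rows):
        pieces = [p if isinstance(p, str) else str(p[i]) for p in parts]
        prompts.append("".join(pieces))
    return prompts


titles = ["Der Student von Prag", "Shoulder Arms"]
prompts = build_prompts(("The movie ", titles, " was made in US"), len(titles))
print(prompts)

# Stand-in for the model's true/false answers (hard-coded for illustration).
mask = [False, True]

# Boolean-mask filtering: keep only rows where the answer is True.
us_titles = [title for title, keep in zip(titles, mask) if keep]
print(us_titles)
```

Keeping the literal fragments and the column as separate tuple elements means the same statement is evaluated once per row, with only the title changing.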