BigQuery DataFrames (BigFrames) AI Forecast#

This notebook is adapted from https://github.com/googleapis/python-bigquery-dataframes/blob/main/notebooks/generative_ai/bq_dataframes_ai_forecast.ipynb to work in the Kaggle runtime. It introduces forecasting with GenAI Foundation Model with BigFrames AI.

Install the bigframes package and upgrade other packages that are already included in Kaggle but have versions incompatible with bigframes.

%pip install --upgrade bigframes google-cloud-automl google-cloud-translate google-ai-generativelanguage tensorflow 

Important: restart the kernel by going to “Run -> Restart & clear cell outputs” before continuing.

Configure bigframes to use your GCP project. First, go to “Add-ons -> Google Cloud SDK” and click the “Attach” button. Then,

from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
user_credential = user_secrets.get_gcloud_credential()
user_secrets.set_tensorflow_credential(user_credential)
PROJECT = "swast-scratch" # replace with your project


import bigframes.pandas as bpd
bpd.options.bigquery.project = PROJECT
bpd.options.bigquery.ordering_mode = "partial" # Optional: partial ordering mode can accelerate executions and save costs

1. Create a BigFrames DataFrames from BigQuery public data.#

df = bpd.read_gbq("bigquery-public-data.san_francisco_bikeshare.bikeshare_trips")
df
/usr/local/lib/python3.11/dist-packages/bigframes/core/log_adapter.py:175: TimeTravelCacheWarning: Reading cached table from 2025-08-18 19:19:20.590271+00:00 to avoid
incompatibilies with previous reads of this table. To read the latest
version, set `use_cache=False` or close the current session with
Session.close() or bigframes.pandas.close_session().
  return method(*args, **kwargs)
trip_id duration_sec start_date start_station_name start_station_id end_date end_station_name end_station_id bike_number zip_code ... c_subscription_type start_station_latitude start_station_longitude end_station_latitude end_station_longitude member_birth_year member_gender bike_share_for_all_trip start_station_geom end_station_geom
0 201802092135083596 788 2018-02-09 21:35:08+00:00 10th Ave at E 15th St 222 2018-02-09 21:48:17+00:00 10th Ave at E 15th St 222 3596 <NA> ... <NA> 37.792714 -122.24878 37.792714 -122.24878 1984 Male Yes POINT (-122.24878 37.79271) POINT (-122.24878 37.79271)
1 201708152357422491 965 2017-08-15 23:57:42+00:00 10th St at Fallon St 201 2017-08-16 00:13:48+00:00 10th Ave at E 15th St 222 2491 <NA> ... <NA> 37.797673 -122.262997 37.792714 -122.24878 <NA> <NA> <NA> POINT (-122.26300 37.79767) POINT (-122.24878 37.79271)
2 201802281657253632 560 2018-02-28 16:57:25+00:00 10th St at Fallon St 201 2018-02-28 17:06:46+00:00 10th Ave at E 15th St 222 3632 <NA> ... <NA> 37.797673 -122.262997 37.792714 -122.24878 1984 Male Yes POINT (-122.26300 37.79767) POINT (-122.24878 37.79271)
3 201711170046091337 497 2017-11-17 00:46:09+00:00 10th St at Fallon St 201 2017-11-17 00:54:26+00:00 10th Ave at E 15th St 222 1337 <NA> ... <NA> 37.797673 -122.262997 37.792714 -122.24878 <NA> <NA> <NA> POINT (-122.26300 37.79767) POINT (-122.24878 37.79271)
4 201802201913231257 596 2018-02-20 19:13:23+00:00 10th St at Fallon St 201 2018-02-20 19:23:19+00:00 10th Ave at E 15th St 222 1257 <NA> ... <NA> 37.797673 -122.262997 37.792714 -122.24878 1984 Male Yes POINT (-122.26300 37.79767) POINT (-122.24878 37.79271)
5 201708242325001279 1341 2017-08-24 23:25:00+00:00 10th St at Fallon St 201 2017-08-24 23:47:22+00:00 10th Ave at E 15th St 222 1279 <NA> ... <NA> 37.797673 -122.262997 37.792714 -122.24878 1969 Male <NA> POINT (-122.26300 37.79767) POINT (-122.24878 37.79271)
6 201801161800473291 489 2018-01-16 18:00:47+00:00 10th St at Fallon St 201 2018-01-16 18:08:56+00:00 10th Ave at E 15th St 222 3291 <NA> ... <NA> 37.797673 -122.262997 37.792714 -122.24878 1984 Male Yes POINT (-122.26300 37.79767) POINT (-122.24878 37.79271)
7 20180408155601183 1105 2018-04-08 15:56:01+00:00 13th St at Franklin St 338 2018-04-08 16:14:26+00:00 10th Ave at E 15th St 222 183 <NA> ... <NA> 37.803189 -122.270579 37.792714 -122.24878 1987 Female No POINT (-122.27058 37.80319) POINT (-122.24878 37.79271)
8 201803141857032204 619 2018-03-14 18:57:03+00:00 13th St at Franklin St 338 2018-03-14 19:07:23+00:00 10th Ave at E 15th St 222 2204 <NA> ... <NA> 37.803189 -122.270579 37.792714 -122.24878 1982 Other No POINT (-122.27058 37.80319) POINT (-122.24878 37.79271)
9 201708192053311490 743 2017-08-19 20:53:31+00:00 2nd Ave at E 18th St 200 2017-08-19 21:05:54+00:00 10th Ave at E 15th St 222 1490 <NA> ... <NA> 37.800214 -122.25381 37.792714 -122.24878 <NA> <NA> <NA> POINT (-122.25381 37.80021) POINT (-122.24878 37.79271)
10 201711181823281960 353 2017-11-18 18:23:28+00:00 2nd Ave at E 18th St 200 2017-11-18 18:29:22+00:00 10th Ave at E 15th St 222 1960 <NA> ... <NA> 37.800214 -122.25381 37.792714 -122.24878 1988 Male <NA> POINT (-122.25381 37.80021) POINT (-122.24878 37.79271)
11 20170810204454839 1256 2017-08-10 20:44:54+00:00 2nd Ave at E 18th St 200 2017-08-10 21:05:50+00:00 10th Ave at E 15th St 222 839 <NA> ... <NA> 37.800214 -122.25381 37.792714 -122.24878 <NA> <NA> <NA> POINT (-122.25381 37.80021) POINT (-122.24878 37.79271)
12 201801171656553504 500 2018-01-17 16:56:55+00:00 El Embarcadero at Grand Ave 197 2018-01-17 17:05:16+00:00 10th Ave at E 15th St 222 3504 <NA> ... <NA> 37.808848 -122.24968 37.792714 -122.24878 1987 Male No POINT (-122.24968 37.80885) POINT (-122.24878 37.79271)
13 201801111613101305 858 2018-01-11 16:13:10+00:00 Frank H Ogawa Plaza 7 2018-01-11 16:27:28+00:00 10th Ave at E 15th St 222 1305 <NA> ... <NA> 37.804562 -122.271738 37.792714 -122.24878 1984 Male Yes POINT (-122.27174 37.80456) POINT (-122.24878 37.79271)
14 201802241826551215 1235 2018-02-24 18:26:55+00:00 Frank H Ogawa Plaza 7 2018-02-24 18:47:31+00:00 10th Ave at E 15th St 222 1215 <NA> ... <NA> 37.804562 -122.271738 37.792714 -122.24878 1969 Male No POINT (-122.27174 37.80456) POINT (-122.24878 37.79271)
15 201803091621483450 857 2018-03-09 16:21:48+00:00 Frank H Ogawa Plaza 7 2018-03-09 16:36:06+00:00 10th Ave at E 15th St 222 3450 <NA> ... <NA> 37.804562 -122.271738 37.792714 -122.24878 1984 Male Yes POINT (-122.27174 37.80456) POINT (-122.24878 37.79271)
16 201801021932232717 914 2018-01-02 19:32:23+00:00 Frank H Ogawa Plaza 7 2018-01-02 19:47:38+00:00 10th Ave at E 15th St 222 2717 <NA> ... <NA> 37.804562 -122.271738 37.792714 -122.24878 1984 Male Yes POINT (-122.27174 37.80456) POINT (-122.24878 37.79271)
17 201803161910283751 564 2018-03-16 19:10:28+00:00 Frank H Ogawa Plaza 7 2018-03-16 19:19:52+00:00 10th Ave at E 15th St 222 3751 <NA> ... <NA> 37.804562 -122.271738 37.792714 -122.24878 1987 Male No POINT (-122.27174 37.80456) POINT (-122.24878 37.79271)
18 20171212152403227 854 2017-12-12 15:24:03+00:00 Frank H Ogawa Plaza 7 2017-12-12 15:38:17+00:00 10th Ave at E 15th St 222 227 <NA> ... <NA> 37.804562 -122.271738 37.792714 -122.24878 1984 Male <NA> POINT (-122.27174 37.80456) POINT (-122.24878 37.79271)
19 201803131437033724 917 2018-03-13 14:37:03+00:00 Grand Ave at Webster St 181 2018-03-13 14:52:20+00:00 10th Ave at E 15th St 222 3724 <NA> ... <NA> 37.811377 -122.265192 37.792714 -122.24878 1989 Male No POINT (-122.26519 37.81138) POINT (-122.24878 37.79271)
20 201712061755593426 519 2017-12-06 17:55:59+00:00 Lake Merritt BART Station 163 2017-12-06 18:04:39+00:00 10th Ave at E 15th St 222 3426 <NA> ... <NA> 37.79732 -122.26532 37.792714 -122.24878 1986 Male <NA> POINT (-122.26532 37.79732) POINT (-122.24878 37.79271)
21 20180404210034451 366 2018-04-04 21:00:34+00:00 Lake Merritt BART Station 163 2018-04-04 21:06:41+00:00 10th Ave at E 15th St 222 451 <NA> ... <NA> 37.79732 -122.26532 37.792714 -122.24878 1987 Male No POINT (-122.26532 37.79732) POINT (-122.24878 37.79271)
22 201801231907161787 626 2018-01-23 19:07:16+00:00 Lake Merritt BART Station 163 2018-01-23 19:17:43+00:00 10th Ave at E 15th St 222 1787 <NA> ... <NA> 37.79732 -122.26532 37.792714 -122.24878 1987 Male No POINT (-122.26532 37.79732) POINT (-122.24878 37.79271)
23 201708271057061157 973 2017-08-27 10:57:06+00:00 Lake Merritt BART Station 163 2017-08-27 11:13:19+00:00 10th Ave at E 15th St 222 1157 <NA> ... <NA> 37.79732 -122.26532 37.792714 -122.24878 <NA> <NA> <NA> POINT (-122.26532 37.79732) POINT (-122.24878 37.79271)
24 201709071348372074 11434 2017-09-07 13:48:37+00:00 Lake Merritt BART Station 163 2017-09-07 16:59:12+00:00 10th Ave at E 15th St 222 2074 <NA> ... <NA> 37.79732 -122.26532 37.792714 -122.24878 <NA> <NA> <NA> POINT (-122.26532 37.79732) POINT (-122.24878 37.79271)

25 rows × 21 columns

[1947417 rows x 21 columns in total]

2. Preprocess Data#

Only take the start_date after 2018 and the “Subscriber” category as input. start_date are truncated to each hour.

df = df[df["start_date"] >= "2018-01-01"]
df = df[df["subscriber_type"] == "Subscriber"]
df["trip_hour"] = df["start_date"].dt.floor("h")
df = df[["trip_hour", "trip_id"]]

Group and count each hour’s num of trips.

df_grouped = df.groupby("trip_hour").count()
df_grouped = df_grouped.reset_index().rename(columns={"trip_id": "num_trips"})
df_grouped
Query job e3df71d2-9248-491a-8e5f-4bb5bfedb686 is DONE. 58.7 MB processed. Open Job
trip_hour num_trips
0 2018-01-01 00:00:00+00:00 20
1 2018-01-01 01:00:00+00:00 25
2 2018-01-01 02:00:00+00:00 13
3 2018-01-01 03:00:00+00:00 11
4 2018-01-01 05:00:00+00:00 4
5 2018-01-01 06:00:00+00:00 8
6 2018-01-01 07:00:00+00:00 8
7 2018-01-01 08:00:00+00:00 20
8 2018-01-01 09:00:00+00:00 30
9 2018-01-01 10:00:00+00:00 41
10 2018-01-01 11:00:00+00:00 45
11 2018-01-01 12:00:00+00:00 54
12 2018-01-01 13:00:00+00:00 57
13 2018-01-01 14:00:00+00:00 68
14 2018-01-01 15:00:00+00:00 86
15 2018-01-01 16:00:00+00:00 72
16 2018-01-01 17:00:00+00:00 72
17 2018-01-01 18:00:00+00:00 47
18 2018-01-01 19:00:00+00:00 32
19 2018-01-01 20:00:00+00:00 34
20 2018-01-01 21:00:00+00:00 27
21 2018-01-01 22:00:00+00:00 15
22 2018-01-01 23:00:00+00:00 6
23 2018-01-02 00:00:00+00:00 2
24 2018-01-02 01:00:00+00:00 1

25 rows × 2 columns

[2842 rows x 2 columns in total]

3. Make forecastings for next 1 week with DataFrames.ai.forecast API#

# Using all the data except the last week (2842-168) for training. And predict the last week (168).
result = df_grouped.head(2842-168).ai.forecast(timestamp_column="trip_hour", data_column="num_trips", horizon=168) 
result
Query job 3f1225a8-b80b-4dfa-a7cf-94b93e7c18c2 is DONE. 68.2 kB processed. Open Job
forecast_timestamp forecast_value confidence_level prediction_interval_lower_bound prediction_interval_upper_bound ai_forecast_status
0 2018-04-24 12:00:00+00:00 144.577728 0.95 120.01921 169.136247
1 2018-04-25 00:00:00+00:00 54.215515 0.95 46.8394 61.591631
2 2018-04-26 05:00:00+00:00 8.140533 0.95 -14.613272 30.894339
3 2018-04-26 14:00:00+00:00 198.744949 0.95 174.982268 222.50763
4 2018-04-27 02:00:00+00:00 9.91806 0.95 -26.749948 46.586069
5 2018-04-29 03:00:00+00:00 32.063339 0.95 -35.730978 99.857656
6 2018-04-27 04:00:00+00:00 25.757111 0.95 8.178037 43.336184
7 2018-04-30 06:00:00+00:00 89.808456 0.95 15.214961 164.401952
8 2018-04-30 02:00:00+00:00 -10.584175 0.95 -60.772024 39.603674
9 2018-04-30 05:00:00+00:00 18.118111 0.95 -40.902133 77.138355
10 2018-04-24 07:00:00+00:00 359.036957 0.95 250.880334 467.193579
11 2018-04-25 10:00:00+00:00 227.272049 0.95 170.918819 283.625279
12 2018-04-27 15:00:00+00:00 208.631363 0.95 188.977435 228.285291
13 2018-04-25 13:00:00+00:00 159.799911 0.95 150.066363 169.53346
14 2018-04-26 12:00:00+00:00 190.226944 0.95 177.898865 202.555023
15 2018-04-24 04:00:00+00:00 11.162338 0.95 -18.581041 40.905717
16 2018-04-24 14:00:00+00:00 136.70816 0.95 134.165413 139.250907
17 2018-04-28 21:00:00+00:00 65.308899 0.95 63.000915 67.616883
18 2018-04-29 20:00:00+00:00 71.788849 0.95 -2.49023 146.067928
19 2018-04-30 15:00:00+00:00 142.560944 0.95 41.495553 243.626334
20 2018-04-26 18:00:00+00:00 533.783813 0.95 412.068752 655.498875
21 2018-04-28 03:00:00+00:00 25.379761 0.95 22.565752 28.193769
22 2018-04-30 12:00:00+00:00 158.313385 0.95 79.466457 237.160313
23 2018-04-25 07:00:00+00:00 358.756592 0.95 276.305603 441.207581
24 2018-04-27 22:00:00+00:00 103.589096 0.95 94.45235 112.725842

25 rows × 6 columns

[168 rows x 6 columns in total]

4. Process the raw result and draw a line plot along with the training data#

result = result.sort_values("forecast_timestamp")
result = result[["forecast_timestamp", "forecast_value"]]
result = result.rename(columns={"forecast_timestamp": "trip_hour", "forecast_value": "num_trips_forecast"})
df_all = bpd.concat([df_grouped, result])
df_all = df_all.tail(672) # 4 weeks

Plot a line chart and compare with the actual result.

df_all = df_all.set_index("trip_hour")
df_all.plot.line(figsize=(16, 8))
<Axes: xlabel='trip_hour'>
../../_images/e3400c236ba2f5e1ad9fd8011612960e62a682636afa0311ca4023b622548ba5.png