{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Copyright 2025 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "YOrUAvz6DMw-" }, "source": [ "# BigFrames Multimodal DataFrame\n", "\n", "
\n",
" \n",
" Run in Colab\n",
" \n",
" | \n",
" \n",
" \n",
" \n",
" View on GitHub\n",
" \n",
" | \n",
" \n",
" \n",
" | \n",
"
|   | image |
|---|---|
| 0 | ![]() |
| 1 | ![]() |
| 2 | ![]() |
| 3 | ![]() |
| 4 | ![]() |

5 rows × 1 columns

|   | image | author | content_type | size | updated |
|---|---|---|---|---|---|
| 0 | ![]() | alice | image/png | 1591240 | 2025-03-20 17:45:04+00:00 |
| 1 | ![]() | bob | image/png | 1182951 | 2025-03-20 17:45:02+00:00 |
| 2 | ![]() | bob | image/png | 1520884 | 2025-03-20 17:44:55+00:00 |
| 3 | ![]() | alice | image/png | 1235401 | 2025-03-20 17:45:19+00:00 |
| 4 | ![]() | bob | image/png | 1591923 | 2025-03-20 17:44:47+00:00 |

5 rows × 5 columns

|   | image | blurred |
|---|---|---|
| 0 | ![]() | ![]() |
| 1 | ![]() | ![]() |
| 2 | ![]() | ![]() |
| 3 | ![]() | ![]() |
| 4 | ![]() | ![]() |

5 rows × 2 columns

|   | ml_generate_text_llm_result | image |
|---|---|---|
| 0 | The item is a container of K9 Guard Dog Paw Balm. | ![]() |
| 1 | The item is K9 Guard Dog Hot Spot Spray. | ![]() |
| 2 | The image contains three bags of food, likely for small animals like rabbits or guinea pigs. They are labeled "Timoth Hay Lend Variety Plend", "Herbal Greeıs Mix Variety Blend", and "Berry & Blossom Treat Blend", all under the brand "Fluffy Buns." The bags are yellow, green, and purple, respectively. Each bag has a pile of its contents beneath it. | ![]() |
| 3 | The item is a cat tree.\n | ![]() |
| 4 | The item is a bag of bird seed. Specifically, it's labeled "Chirpy Seed", "Deluxe Bird Food".\n | ![]() |

5 rows × 2 columns

|   | ml_generate_text_llm_result | image |
|---|---|---|
| 0 | The item is a container of Dog Paw Balm. | ![]() |
| 1 | The picture contains many colors, including white, black, green, and a bright blue. The product label predominantly features a bright blue hue. The background is a solid gray. | ![]() |
| 2 | Here are the product names from the image:\n\n* **Timoth Hay Lend Variety Plend** is the product in the yellow bag.\n* **Herbal Greeıs Mix Variety Blend** is the product in the green bag.\n* **Berry & Blossom Treat Blend** is the product in the purple bag. | ![]() |
| 3 | Yes, it is for pets. It appears to be a cat tree or scratching post.\n | ![]() |
| 4 | The image shows that the weight of the product is 15 oz/ 257g. | ![]() |

5 rows × 2 columns

|   | ml_generate_embedding_result | ml_generate_embedding_status | ml_generate_embedding_start_sec | ml_generate_embedding_end_sec | content |
|---|---|---|---|---|---|
| 0 | [ 0.00638822 0.01666385 0.00451817 ... -0.02... |  | <NA> | <NA> | {"access_urls":{"expiry_time":"2026-02-21T01:4... |
| 1 | [ 0.00973976 0.02148137 0.0024429 ... 0.00... |  | <NA> | <NA> | {"access_urls":{"expiry_time":"2026-02-21T01:4... |
| 2 | [ 0.01195884 0.02139394 0.05968047 ... -0.01... |  | <NA> | <NA> | {"access_urls":{"expiry_time":"2026-02-21T01:4... |
| 3 | [-0.02621161 0.02797648 0.04416926 ... -0.01... |  | <NA> | <NA> | {"access_urls":{"expiry_time":"2026-02-21T01:4... |
| 4 | [ 0.05918628 0.0125137 0.01907336 ... 0.01... |  | <NA> | <NA> | {"access_urls":{"expiry_time":"2026-02-21T01:4... |

5 rows × 5 columns

|   | extracted_text | chunked |
|---|---|---|
| 0 | CritterCuisine Pro 5000 - Automatic Pet Feeder... | ["CritterCuisine Pro 5000 - Automatic Pet Feed... |

1 rows × 2 columns
\n", "0 CritterCuisine Pro 5000 - Automatic Pet Feeder...\n",
"0 on a level, stable surface to prevent tipping....\n",
"0 included)\\nto maintain the schedule during pow...\n",
"0 digits for Meal 1 will flash.\\n\u0000. Use the UP/D...\n",
"0 paperclip) for 5\\nseconds. This will reset all...\n",
"0 unit with a damp cloth. Do not immerse the bas...\n",
"0 continues,\\ncontact customer support.\\nE2: Foo..."
],
"text/plain": [
"0 CritterCuisine Pro 5000 - Automatic Pet Feeder...\n",
"0 on a level, stable surface to prevent tipping....\n",
"0 included)\\nto maintain the schedule during pow...\n",
"0 digits for Meal 1 will flash.\\n\u0000. Use the UP/D...\n",
"0 paperclip) for 5\\nseconds. This will reset all...\n",
"0 unit with a damp cloth. Do not immerse the bas...\n",
"0 continues,\\ncontact customer support.\\nE2: Foo...\n",
"Name: chunked, dtype: string"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Explode the chunks to see each chunk as a separate row\n",
"chunked = df_pdf[\"chunked\"].explode()\n",
"chunked"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6. Audio transcribe"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"audio_gcs_path = \"gs://bigframes_blob_test/audio/*\"\n",
"df = bpd.from_glob_path(audio_gcs_path, name=\"audio\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/usr/local/google/home/shuowei/src/python-bigquery-dataframes/bigframes/dtypes.py:990: JSONDtypeWarning: JSON columns will be represented as pandas.ArrowDtype(pyarrow.json_())\n",
"instead of using `db_dtypes` in the future when available in pandas\n",
"(https://github.com/pandas-dev/pandas/issues/60958) and pyarrow.\n",
" warnings.warn(msg, bigframes.exceptions.JSONDtypeWarning)\n"
]
},
{
"data": {
"text/html": [
"0 Now, as all books, not primarily intended as p..." ], "text/plain": [ "0 Now, as all books, not primarily intended as p...\n", "Name: transcribed_content, dtype: string" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The audio_transcribe function is a convenience wrapper around bigframes.bigquery.ai.generate.\n", "# Here's how to perform the same operation directly:\n", "\n", "audio_series = df[\"audio\"]\n", "prompt_text = (\n", " \"**Task:** Transcribe the provided audio. **Instructions:** - Your response \"\n", " \"must contain only the verbatim transcription of the audio. - Do not include \"\n", " \"any introductory text, summaries, or conversational filler in your response. \"\n", " \"The output should begin directly with the first word of the audio.\"\n", ")\n", "\n", "# Convert the audio series to the runtime representation required by the model.\n", "# This involves fetching metadata and getting a signed access URL.\n", "audio_metadata = bbq.obj.fetch_metadata(audio_series)\n", "audio_runtime = bbq.obj.get_access_url(audio_metadata, mode=\"R\")\n", "\n", "transcribed_results = bbq.ai.generate(\n", " prompt=(prompt_text, audio_runtime),\n", " endpoint=\"gemini-2.0-flash-001\",\n", " model_params={\"generationConfig\": {\"temperature\": 0.0}},\n", ")\n", "\n", "transcribed_series = transcribed_results.struct.field(\"result\").rename(\"transcribed_content\")\n", "transcribed_series" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
0 {'status': '', 'content': 'Now, as all books, ..."
],
"text/plain": [
"0 {'status': '', 'content': 'Now, as all books, ...\n",
"Name: transcription_results, dtype: struct0 {\"ExifOffset\":47,\"Make\":\"MyCamera\"}"
],
"text/plain": [
"0 {\"ExifOffset\":47,\"Make\":\"MyCamera\"}\n",
"Name: blob_col, dtype: extension