{ "cells": [ { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import bigframes.pandas\n", "import pandas as pd\n", "from bigframes.ml.llm import GeminiTextGenerator" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Define the model" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_176683/987800245.py:1: ApiDeprecationWarning: gemini-1.5-X are going to be deprecated. Use gemini-2.0-X (https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm.GeminiTextGenerator) instead. \n", " model = GeminiTextGenerator(model_name=\"gemini-2.0-flash-001\")\n", "/usr/local/google/home/shuowei/src/python-bigquery-dataframes/bigframes/ml/llm.py:486: DefaultLocationWarning: No explicit location is set, so using location US for the session.\n", " self.session = session or global_session.get_global_session()\n" ] }, { "data": { "text/html": [ "Query job 6fa5121a-6da4-4c75-92ec-936799da4513 is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 74460ae9-3e89-49e7-93ad-bafbb6197a86 is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "model = GeminiTextGenerator(model_name=\"gemini-2.0-flash-001\")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Create Sample Data\n", "\n", "Read as a BigQuery DataFrames." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "df = pd.DataFrame(\n", " {\n", " \"prompt\": [\"What is BigQuery?\", \"What is BQML?\", \"What is BigQuery DataFrame?\"],\n", " })\n", "bf_df = bigframes.pandas.read_pandas(df)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Make Predictions" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "Query job 562ca203-3b53-4409-9a23-0a80d3840fcc is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/google/home/shuowei/src/python-bigquery-dataframes/bigframes/core/array_value.py:114: PreviewWarning: JSON column interpretation as a custom PyArrow extention in\n", "`db_dtypes` is a preview feature and subject to change.\n", " warnings.warn(msg, bfe.PreviewWarning)\n" ] }, { "data": { "text/html": [ "Query job 5a6ceff2-53b5-4a4a-83ff-31bffab1b8b8 is DONE. 14.0 kB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ml_generate_text_llm_resultml_generate_text_rai_resultml_generate_text_statusprompt
0BigQuery is a serverless, highly scalable, and...<NA>What is BigQuery?
1BQML stands for **BigQuery Machine Learning**....<NA>What is BQML?
2BigQuery DataFrames is a Python client library...<NA>What is BigQuery DataFrame?
\n", "
" ], "text/plain": [ " ml_generate_text_llm_result \\\n", "0 BigQuery is a serverless, highly scalable, and... \n", "1 BQML stands for **BigQuery Machine Learning**.... \n", "2 BigQuery DataFrames is a Python client library... \n", "\n", " ml_generate_text_rai_result ml_generate_text_status \\\n", "0 \n", "1 \n", "2 \n", "\n", " prompt \n", "0 What is BigQuery? \n", "1 What is BQML? \n", "2 What is BigQuery DataFrame? " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pred = model.predict(bf_df).to_pandas()\n", "pred" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Fetch Predictions" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\"## BigQuery: A serverless data warehouse for large-scale data analysis\\n\\nBigQuery is a serverless, highly-scalable data warehouse designed for analyzing large datasets. It's a cloud-based service offered by Google Cloud Platform (GCP), allowing users to store, manage, and analyze massive amounts of data without managing infrastructure. \\n\\nHere are some key features of BigQuery:\\n\\n**Serverless:** You don't need to worry about provisioning, managing, or scaling servers. BigQuery handles all of this automatically, letting you focus on analyzing your data.\\n\\n**Highly-scalable:** BigQuery can handle datasets of any size, from gigabytes to petabytes. It can also scale up and down automatically to meet your processing needs.\\n\\n**Cost-effective:** You only pay for the resources you use, and there are no upfront costs. Additionally, BigQuery offers several pricing models to fit your needs, including on-demand, flat-rate, and flexible slots.\\n\\n**Easy to use:** BigQuery uses SQL, a standard query language, making it easy to analyze your data. No need to learn a new programming language.\\n\\n**Integrated with GCP:** BigQuery integrates seamlessly with other GCP services, such as Google Cloud Storage, Dataflow, and Kubernetes. This allows you to build powerful data pipelines and workflows.\\n\\n**Secure:** BigQuery uses industry-standard security practices to protect your data.\\n\\nHere are some use cases for BigQuery:\\n\\n* **Data warehousing and analytics:** Store and analyze large datasets for business intelligence and reporting.\\n* **Machine learning:** Train and deploy machine learning models on your data.\\n* **Data integration:** Combine data from multiple sources for analysis.\\n* **Real-time analytics:** Analyze data in real-time for insights and decision-making.\\n\\n**Here are some additional resources that you may find helpful:**\\n\\n* **BigQuery website:** https://cloud.google.com/bigquery\\n* **BigQuery documentation:** https://cloud.google.com/bigquery/docs\\n* **BigQuery tutorial:** https://cloud.google.com/bigquery/docs/tutorials\\n* **BigQuery pricing:** https://cloud.google.com/bigquery/pricing\\n\\nI hope this gives you a good overview of BigQuery. Please let me know if you have any other questions.\"" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pred.iloc[0, 0]" ] } ], "metadata": { "kernelspec": { "display_name": "venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.15" } }, "nbformat": 4, "nbformat_minor": 4 }