{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "id": "ur8xi4C7S06n" }, "outputs": [], "source": [ "# Copyright 2023 Google LLC\n", "#\n", "# Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "id": "JAPoU8Sm5E6e" }, "source": [ "# BigQuery DataFrames ML: Drug Name Generation\n", "\n", "\n", " \n", " \n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": { "id": "tvgnzT1CKxrO" }, "source": [ "## Overview\n", "\n", "The goal of this notebook is to demonstrate an enterprise generative AI use case. A marketing user can provide information about a new pharmaceutical drug and its generic name, and receive ideas on marketing-oriented brand names for that drug.\n", "\n", "Learn more about [BigQuery DataFrames](https://cloud.google.com/bigquery/docs/dataframes-quickstart)." ] }, { "cell_type": "markdown", "metadata": { "id": "d975e698c9a4" }, "source": [ "### Objective\n", "\n", "In this tutorial, you learn about Generative AI concepts such as prompting and few-shot learning, as well as how to use BigFrames ML for performing these tasks simply using an intuitive dataframe API.\n", "\n", "The steps performed include:\n", "\n", "1. Ask the user for the generic name and usage for the drug.\n", "1. Use `bigframes` to query the FDA dataset of over 100,000 drugs, filtered on the brand name, generic name, and indications & usage columns.\n", "1. Filter this dataset to find prototypical brand names that can be used as examples in prompt tuning.\n", "1. Create a prompt with the user input, general instructions, examples and counter-examples for the desired brand name.\n", "1. Use the `bigframes.ml.llm.GeminiTextGenerator` to generate choices of brand names." ] }, { "cell_type": "markdown", "metadata": { "id": "08d289fa873f" }, "source": [ "### Dataset\n", "\n", "This notebook uses the [FDA dataset](https://cloud.google.com/blog/topics/healthcare-life-sciences/fda-mystudies-comes-to-google-cloud) available at [`bigquery-public-data.fda_drug`](https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1sbigquery-public-data!2sfda_drug)." 
] }, { "cell_type": "markdown", "metadata": { "id": "aed92deeb4a0" }, "source": [ "### Costs\n", "\n", "This tutorial uses billable components of Google Cloud:\n", "\n", "* BigQuery (compute)\n", "* BigQuery ML\n", "\n", "Learn about [BigQuery compute pricing](https://cloud.google.com/bigquery/pricing#analysis_pricing_models),\n", "and [BigQuery ML pricing](https://cloud.google.com/bigquery/pricing#bqml),\n", "and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)\n", "to generate a cost estimate based on your projected usage." ] }, { "cell_type": "markdown", "metadata": { "id": "i7EUnXsZhAGF" }, "source": [ "## Installation\n", "\n", "Install the following packages required to execute this notebook." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "id": "2b4ef9b72d43" }, "outputs": [], "source": [ "# !pip install -U --quiet bigframes" ] }, { "cell_type": "markdown", "metadata": { "id": "58707a750154" }, "source": [ "### Colab only: Uncomment the following cell to restart the kernel." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "id": "f200f10a1da3" }, "outputs": [], "source": [ "# # Automatically restart kernel after installs so that your environment can access the new packages\n", "# import IPython\n", "\n", "# app = IPython.Application.instance()\n", "# app.kernel.do_shutdown(True)" ] }, { "cell_type": "markdown", "metadata": { "id": "960505627ddf" }, "source": [ "### Import libraries" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "id": "PyQmSRbKA8r-" }, "outputs": [], "source": [ "import bigframes.pandas as bpd\n", "from bigframes.ml.llm import GeminiTextGenerator\n", "from IPython.display import Markdown" ] }, { "cell_type": "markdown", "metadata": { "id": "sBCra4QMA2wR" }, "source": [ "### Authenticate your Google Cloud account\n", "\n", "Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below." 
] }, { "cell_type": "markdown", "metadata": { "id": "74ccc9e52986" }, "source": [ "**1. Vertex AI Workbench**\n", "* Do nothing as you are already authenticated." ] }, { "cell_type": "markdown", "metadata": { "id": "de775a3773ba" }, "source": [ "**2. Local JupyterLab instance, uncomment and run:**" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "254614fa0c46" }, "outputs": [], "source": [ "# ! gcloud auth login" ] }, { "cell_type": "markdown", "metadata": { "id": "ef21552ccea8" }, "source": [ "**3. Colab, uncomment and run:**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "id": "603adbbf0532" }, "outputs": [], "source": [ "# from google.colab import auth\n", "\n", "# auth.authenticate_user()" ] }, { "cell_type": "markdown", "metadata": { "id": "BF1j6f9HApxa" }, "source": [ "## Before you begin\n", "\n", "### Set up your Google Cloud project\n", "\n", "**The following steps are required, regardless of your notebook environment.**\n", "\n", "1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.\n", "\n", "2. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).\n", "\n", "3. [Enable the BigQuery API](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com).\n", "\n", "4. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk)." 
] }, { "cell_type": "markdown", "metadata": { "id": "WReHDGG5g0XY" }, "source": [ "#### Set your project ID\n", "\n", "**If you don't know your project ID**, try the following:\n", "* Run `gcloud config list`.\n", "* Run `gcloud projects list`.\n", "* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "oM1iC_MfAts1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1;31mERROR:\u001b[0m (gcloud.config.set) argument VALUE: Must be specified.\n", "Usage: gcloud config set SECTION/PROPERTY VALUE [optional flags]\n", " optional flags may be --help | --installation\n", "\n", "For detailed information on this command and its flags, run:\n", " gcloud config set --help\n" ] } ], "source": [ "# Please fill in these values.\n", "PROJECT_ID = \"\" # @param {type:\"string\"}\n", "\n", "# Set the project id\n", "! gcloud config set project {PROJECT_ID}" ] }, { "cell_type": "markdown", "metadata": { "id": "evsJaAj5te0X" }, "source": [ "#### BigFrames configuration\n", "\n", "Next, we will specify a [BigQuery connection](https://cloud.google.com/bigquery/docs/working-with-connections). If you already have a connection, you can simply provide the name and skip the following creation steps.\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "id": "G1vVsPiMsL2X" }, "outputs": [], "source": [ "# Please fill in these values.\n", "LOCATION = \"us\" # @param {type:\"string\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "WGS_TzhWlPBN" }, "source": [ "We will now try to use the provided connection, and if it doesn't exist, create a new one. We will also print the service account used." ] }, { "cell_type": "markdown", "metadata": { "id": "init_aip:mbsdk,all" }, "source": [ "### Initialize BigFrames client\n", "\n", "Here, we set the project configuration based on the provided parameters." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "id": "OCccLirpkSRz" }, "outputs": [], "source": [ "# Note: The project option is not required in all environments.\n", "# On BigQuery Studio, the project ID is automatically detected.\n", "bpd.options.bigquery.project = PROJECT_ID\n", "\n", "# Note: The location option is not required.\n", "# It defaults to the location of the first table or query\n", "# passed to read_gbq(). For APIs where a location can't be\n", "# auto-detected, the location defaults to the \"US\" location.\n", "bpd.options.bigquery.location = LOCATION" ] }, { "cell_type": "markdown", "metadata": { "id": "m8UCEtX9uLn6" }, "source": [ "## Generate a name\n", "\n", "Let's start with entering a generic name and description of the drug." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "id": "oxphj2gnuKou" }, "outputs": [], "source": [ "GENERIC_NAME = \"Entropofloxacin\" # @param {type:\"string\"}\n", "USAGE = \"Entropofloxacin is a fluoroquinolone antibiotic that is used to treat a variety of bacterial infections, including: pneumonia, streptococcus infections, salmonella infections, escherichia coli infections, and pseudomonas aeruginosa infections It is taken by mouth or by injection. The dosage and frequency of administration will vary depending on the type of infection being treated. It should be taken for the full course of treatment, even if symptoms improve after a few days. Stopping the medication early may increase the risk of the infection coming back.\" # @param {type:\"string\"}\n", "NUM_NAMES = 10 # @param {type:\"integer\"}\n", "TEMPERATURE = 0.5 # @param {type: \"number\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "1q-vlbalzu1Q" }, "source": [ "We can now create a prompt string, and populate it with the name and description." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "id": "0knz5ZWMzed-" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Provide 10 unique and modern brand names in Markdown bullet point format. Do not provide any additional explanation.\n", "\n", "Be creative with the brand names. Don't use English words directly; use variants or invented words.\n", "\n", "The generic name is: Entropofloxacin\n", "\n", "The indications and usage are: Entropofloxacin is a fluoroquinolone antibiotic that is used to treat a variety of bacterial infections, including: pneumonia, streptococcus infections, salmonella infections, escherichia coli infections, and pseudomonas aeruginosa infections It is taken by mouth or by injection. The dosage and frequency of administration will vary depending on the type of infection being treated. It should be taken for the full course of treatment, even if symptoms improve after a few days. Stopping the medication early may increase the risk of the infection coming back..\n" ] } ], "source": [ "zero_shot_prompt = f\"\"\"Provide {NUM_NAMES} unique and modern brand names in Markdown bullet point format. Do not provide any additional explanation.\n", "\n", "Be creative with the brand names. Don't use English words directly; use variants or invented words.\n", "\n", "The generic name is: {GENERIC_NAME}\n", "\n", "The indications and usage are: {USAGE}.\"\"\"\n", "\n", "print(zero_shot_prompt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, let's create a helper function to predict with our model. It will take a string input, and add it to a temporary BigFrames DataFrame. It will also return the string extracted from the response DataFrame." 
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def predict(prompt: str, temperature: float = TEMPERATURE) -> str:\n", " # Create dataframe\n", " input = bpd.DataFrame(\n", " {\n", " \"prompt\": [prompt],\n", " }\n", " )\n", "\n", " # Return response\n", " return model.predict(input, temperature=temperature).ml_generate_text_llm_result.iloc[0]" ] }, { "cell_type": "markdown", "metadata": { "id": "b1ZapNZsJW2p" }, "source": [ "We can now initialize the model, and get a response to our prompt!" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "id": "UW2fQ2k5Hsic" }, "outputs": [ { "data": { "text/html": [ "Query job 25b47284-2b28-4cd9-ac9a-90379f818c84 is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 0efa6f42-6569-4274-ac21-667c7eecefc7 is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job c5e98170-7d58-4aa2-a3a3-6680cd9a54c0 is DONE. 8 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 5fd9d5bf-c731-4b21-b7c9-9b6244ffb412 is DONE. 2 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 36f7e8ec-ee42-4f94-8e38-bdf18b371517 is DONE. 118 Bytes processed. 
Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "- Etherealox\n", "- Zenithrox\n", "- Aureox\n", "- Lucentrox\n", "- Aethrox\n", "- Luminex\n", "- Elysirox\n", "- Quasarox\n", "- Novaflux\n", "- Arcanox" ], "text/plain": [ "" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Define the model\n", "model = GeminiTextGenerator(model_name=\"gemini-2.0-flash-001\")\n", "\n", "# Invoke LLM with prompt\n", "response = predict(zero_shot_prompt, temperature = TEMPERATURE)\n", "\n", "# Print results as Markdown\n", "Markdown(response)" ] }, { "cell_type": "markdown", "metadata": { "id": "o3yIhHV2jsUT" }, "source": [ "We're off to a great start! Let's see if we can refine our response." ] }, { "cell_type": "markdown", "metadata": { "id": "mBroUzWS8xOL" }, "source": [ "## Few-shot learning\n", "\n", "Let's try using [few-shot learning](https://paperswithcode.com/task/few-shot-learning). We will provide a few examples of what we're looking for along with our prompt.\n", "\n", "Our prompt will consist of 3 parts:\n", "* General instructions (e.g. generate $n$ brand names)\n", "* Multiple examples\n", "* Information about the drug we'd like to generate a name for\n", "\n", "Let's walk through how to construct this prompt.\n", "\n", "Our first step will be to define how many examples we want to provide in the prompt." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "id": "MXdI78SOElyt" }, "outputs": [], "source": [ "# Specify number of examples to include\n", "\n", "NUM_EXAMPLES = 3 # @param {type:\"integer\"}" ] }, { "cell_type": "markdown", "metadata": { "id": "U8w4puVM_892" }, "source": [ "Next, let's define a prefix that will set the overall context." 
] }, { "cell_type": "code", "execution_count": 24, "metadata": { "id": "aQ2iscnhF2cx" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Provide 10 unique and modern brand names in Markdown bullet point format, related to the drug at the bottom of this prompt.\n", "\n", "Be creative with the brand names. Don't use English words directly; use variants or invented words.\n", "\n", "First, we will provide 3 examples to help with your thought process.\n", "\n", "Then, we will provide the generic name and usage for the drug we'd like you to generate brand names for.\n", "\n" ] } ], "source": [ "prefix_prompt = f\"\"\"Provide {NUM_NAMES} unique and modern brand names in Markdown bullet point format, related to the drug at the bottom of this prompt.\n", "\n", "Be creative with the brand names. Don't use English words directly; use variants or invented words.\n", "\n", "First, we will provide {NUM_EXAMPLES} examples to help with your thought process.\n", "\n", "Then, we will provide the generic name and usage for the drug we'd like you to generate brand names for.\n", "\"\"\"\n", "\n", "print(prefix_prompt)" ] }, { "cell_type": "markdown", "metadata": { "id": "VI0Spv-axN7d" }, "source": [ "Our next step will be to include examples into the prompt.\n", "\n", "We will start out by retrieving the raw data for the examples, by querying the BigQuery public dataset." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "id": "IoO_Bp8wA07N" }, "outputs": [ { "data": { "text/html": [ "Query job 542b0ce1-9d56-456f-bcd3-d24a6f0c825a is DONE. 84.4 MB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 2405ba41-b263-46d3-a0e5-3b5e7ecef6ab is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job b24663ec-8d81-4295-84df-ffb65a6a0f1b is DONE. 3.1 kB processed. 
Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>openfda_generic_name</th><th>openfda_brand_name</th><th>indications_and_usage</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>0</th><td>BENZALKONIUM CHLORIDE</td><td>meijer kids</td><td>Use - hand washing to decrease bacteria on skin</td></tr>\n", "    <tr><th>3</th><td>OCTINOXATE, TITANIUM DIOXIDE</td><td>CD DIORSKIN STAR Studio Makeup Spectacular Bri...</td><td>Uses Helps prevent sunburn. If used as directe...</td></tr>\n", "    <tr><th>4</th><td>TRIAMCINOLONE ACETONIDE</td><td>Triamcinolone Acetonide</td><td>INDICATIONS AND USAGE Triamcinolone Acetonide ...</td></tr>\n", "    <tr><th>5</th><td>BACITRACIN ZINC, NEOMYCIN SULFATE, POLYMYXIN B...</td><td>Triple Antibiotic</td><td>First aid to help prevent infection in minor c...</td></tr>\n", "    <tr><th>6</th><td>RISPERIDONE</td><td>Risperidone</td><td>1. INDICATIONS AND USAGE Risperidone is an aty...</td></tr>\n", "  </tbody>\n", "</table>\n", "<p>5 rows × 3 columns</p>\n", "</div>
[5 rows x 3 columns in total]" ], "text/plain": [ " openfda_generic_name \\\n", "0 BENZALKONIUM CHLORIDE \n", "3 OCTINOXATE, TITANIUM DIOXIDE \n", "4 TRIAMCINOLONE ACETONIDE \n", "5 BACITRACIN ZINC, NEOMYCIN SULFATE, POLYMYXIN B... \n", "6 RISPERIDONE \n", "\n", " openfda_brand_name \\\n", "0 meijer kids \n", "3 CD DIORSKIN STAR Studio Makeup Spectacular Bri... \n", "4 Triamcinolone Acetonide \n", "5 Triple Antibiotic \n", "6 Risperidone \n", "\n", " indications_and_usage \n", "0 Use - hand washing to decrease bacteria on skin \n", "3 Uses Helps prevent sunburn. If used as directe... \n", "4 INDICATIONS AND USAGE Triamcinolone Acetonide ... \n", "5 First aid to help prevent infection in minor c... \n", "6 1. INDICATIONS AND USAGE Risperidone is an aty... \n", "\n", "[5 rows x 3 columns]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Query 3 columns of interest from drug label dataset\n", "df = bpd.read_gbq(\"bigquery-public-data.fda_drug.drug_label\",\n", " columns=[\"openfda_generic_name\", \"openfda_brand_name\", \"indications_and_usage\"])\n", "\n", "# Exclude any rows with missing data\n", "df = df.dropna()\n", "\n", "# Drop duplicate rows\n", "df = df.drop_duplicates()\n", "\n", "# Print values\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": { "id": "W5kOtbNGBTI2" }, "source": [ "Let's now filter the results to remove atypical names." 
] }, { "cell_type": "code", "execution_count": 26, "metadata": { "id": "95WDe2eCCeLx" }, "outputs": [], "source": [ "# Remove names with spaces\n", "df = df[df[\"openfda_brand_name\"].str.find(\" \") == -1]\n", "\n", "# Remove names with 5 or fewer characters\n", "df = df[df[\"openfda_brand_name\"].str.len() > 5]\n", "\n", "# Remove names where the generic and brand name match (case-insensitive)\n", "df = df[df[\"openfda_generic_name\"].str.lower() != df[\"openfda_brand_name\"].str.lower()]" ] }, { "cell_type": "markdown", "metadata": { "id": "FZD89ep4EyYc" }, "source": [ "Let's take `NUM_EXAMPLES` samples to include in the prompt." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "id": "2ohZYg7QEyJV" }, "outputs": [ { "data": { "text/html": [ "Query job 293c90e0-7fdf-4769-9d8e-f222f35d368e is DONE. 84.4 MB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>openfda_generic_name</th><th>openfda_brand_name</th><th>indications_and_usage</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>81748</th><td>AMPICILLIN SODIUM</td><td>Ampicillin</td><td>INDICATIONS AND USAGE Ampicillin for Injection...</td></tr>\n", "    <tr><th>730</th><td>AZTREONAM</td><td>Cayston</td><td>1 INDICATIONS AND USAGE CAYSTON® is indicated ...</td></tr>\n", "    <tr><th>71763</th><td>TERAZOSIN HYDROCHLORIDE</td><td>Terazosin</td><td>INDICATIONS AND USAGE Terazosin capsules are i...</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " openfda_generic_name openfda_brand_name \\\n", "81748 AMPICILLIN SODIUM Ampicillin \n", "730 AZTREONAM Cayston \n", "71763 TERAZOSIN HYDROCHLORIDE Terazosin \n", "\n", " indications_and_usage \n", "81748 INDICATIONS AND USAGE Ampicillin for Injection... \n", "730 1 INDICATIONS AND USAGE CAYSTON® is indicated ... \n", "71763 INDICATIONS AND USAGE Terazosin capsules are i... " ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Take a sample and convert to a Pandas dataframe for local usage.\n", "df_examples = df.sample(NUM_EXAMPLES, random_state=3).to_pandas()\n", "\n", "df_examples" ] }, { "cell_type": "markdown", "metadata": { "id": "J-Qa1_SCImXy" }, "source": [ "Let's now convert the data to a JSON structure, to enable embedding into a prompt. For consistency, we'll capitalize each example brand name." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "id": "PcJdSaw0EGcW" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[{'brand_name': 'Ampicillin', 'generic_name': 'AMPICILLIN SODIUM', 'usage': 'INDICATIONS AND USAGE Ampicillin for Injection, USP is indicated in the treatment of infections caused by susceptible strains of the designated organisms in the following conditions: Respiratory Tract Infections caused by Streptococcus pneumoniae. Staphylococcus aureus (penicillinase and nonpenicillinase-producing), H. influenzae, and Group A beta-hemolytic streptococci. Bacterial Meningitis caused by E. coli, Group B streptococci, and other Gram-negative bacteria (Listeria monocytogenes, N. meningitidis). The addition of an aminoglycoside with ampicillin may increase its effectiveness against Gram-negative bacteria. Septicemia and Endocarditis caused by susceptible Gram-positive organisms including Streptococcus spp., penicillin G-susceptible staphylococci, and enterococci. Gram-negative sepsis caused by E. coli, Proteus mirabilis and Salmonella spp. 
responds to ampicillin. Endocarditis due to enterococcal strains usually respond to intravenous therapy. The addition of an aminoglycoside may enhance the effectiveness of ampicillin when treating streptococcal endocarditis. Urinary Tract Infections caused by sensitive strains of E. coli and Proteus mirabilis. Gastrointestinal Infections caused by Salmonella typhi (typhoid fever), other Salmonella spp., and Shigella spp. (dysentery) usually respond to oral or intravenous therapy. Bacteriology studies to determine the causative organisms and their susceptibility to ampicillin should be performed. Therapy may be instituted prior to obtaining results of susceptibility testing. It is advisable to reserve the parenteral form of this drug for moderately severe and severe infections and for patients who are unable to take the oral forms. A change to oral ampicillin may be made as soon as appropriate. To reduce the development of drug-resistant bacteria and maintain the effectiveness of Ampicillin for Injection, USP and other antibacterial drugs, Ampicillin for Injection, USP should be used only to treat or prevent infections that are proven or strongly suspected to be caused by susceptible bacteria. When culture and susceptibility information are available, they should be considered in selecting or modifying antibacterial therapy. In the absence of such data, local epidemiology and susceptibility patterns may contribute to the empiric selection of therapy. Indicated surgical procedures should be performed.'}, {'brand_name': 'Cayston', 'generic_name': 'AZTREONAM', 'usage': '1 INDICATIONS AND USAGE CAYSTON® is indicated to improve respiratory symptoms in cystic fibrosis (CF) patients with Pseudomonas aeruginosa. Safety and effectiveness have not been established in pediatric patients below the age of 7 years, patients with FEV1 <25% or >75% predicted, or patients colonized with Burkholderia cepacia [see Clinical Studies (14) ]. 
To reduce the development of drug-resistant bacteria and maintain the effectiveness of CAYSTON and other antibacterial drugs, CAYSTON should be used only to treat patients with CF known to have Pseudomonas aeruginosa in the lungs. CAYSTON is a monobactam antibacterial indicated to improve respiratory symptoms in cystic fibrosis (CF) patients with Pseudomonas aeruginosa. Safety and effectiveness have not been established in pediatric patients below the age of 7 years, patients with FEV1 <25% or >75% predicted, or patients colonized with Burkholderia cepacia. (1)'}, {'brand_name': 'Terazosin', 'generic_name': 'TERAZOSIN HYDROCHLORIDE', 'usage': 'INDICATIONS AND USAGE Terazosin capsules are indicated for the treatment of symptomatic benign prostatic hyperplasia (BPH). There is a rapid response, with approximately 70% of patients experiencing an increase in urinary flow and improvement in symptoms of BPH when treated with terazosin capsules. The long-term effects of terazosin capsules on the incidence of surgery, acute urinary obstruction or other complications of BPH are yet to be determined. Terazosin capsules are also indicated for the treatment of hypertension. Terazosin capsules can be used alone or in combination with other antihypertensive agents such as diuretics or beta-adrenergic blocking agents.'}]\n" ] } ], "source": [ "examples = [\n", " {\n", " \"brand_name\": brand_name.capitalize(),\n", " \"generic_name\": generic_name,\n", " \"usage\": usage,\n", " }\n", " for brand_name, generic_name, usage in zip(\n", " df_examples[\"openfda_brand_name\"],\n", " df_examples[\"openfda_generic_name\"],\n", " df_examples[\"indications_and_usage\"],\n", " )\n", "]\n", "\n", "print(examples)" ] }, { "cell_type": "markdown", "metadata": { "id": "oU4mb1Dwgq64" }, "source": [ "We'll format each example with the same prompt template, and view the combined result." 
] }, { "cell_type": "code", "execution_count": 29, "metadata": { "id": "kzAVsF6wJ93S" }, "outputs": [ { "data": { "text/plain": [ "'Generic name: AMPICILLIN SODIUM\\nUsage: INDICATIONS AND USAGE Ampicillin for Injection, USP is indicated in the treatment of infections caused by susceptible strains of the designated organisms in the following conditions: Respiratory Tract Infections caused by Streptococcus pneumoniae. Staphylococcus aureus (penicillinase and nonpenicillinase-producing), H. influenzae, and Group A beta-hemolytic streptococci. Bacterial Meningitis caused by E. coli, Group B streptococci, and other Gram-negative bacteria (Listeria monocytogenes, N. meningitidis). The addition of an aminoglycoside with ampicillin may increase its effectiveness against Gram-negative bacteria. Septicemia and Endocarditis caused by susceptible Gram-positive organisms including Streptococcus spp., penicillin G-susceptible staphylococci, and enterococci. Gram-negative sepsis caused by E. coli, Proteus mirabilis and Salmonella spp. responds to ampicillin. Endocarditis due to enterococcal strains usually respond to intravenous therapy. The addition of an aminoglycoside may enhance the effectiveness of ampicillin when treating streptococcal endocarditis. Urinary Tract Infections caused by sensitive strains of E. coli and Proteus mirabilis. Gastrointestinal Infections caused by Salmonella typhi (typhoid fever), other Salmonella spp., and Shigella spp. (dysentery) usually respond to oral or intravenous therapy. Bacteriology studies to determine the causative organisms and their susceptibility to ampicillin should be performed. Therapy may be instituted prior to obtaining results of susceptibility testing. It is advisable to reserve the parenteral form of this drug for moderately severe and severe infections and for patients who are unable to take the oral forms. A change to oral ampicillin may be made as soon as appropriate. 
To reduce the development of drug-resistant bacteria and maintain the effectiveness of Ampicillin for Injection, USP and other antibacterial drugs, Ampicillin for Injection, USP should be used only to treat or prevent infections that are proven or strongly suspected to be caused by susceptible bacteria. When culture and susceptibility information are available, they should be considered in selecting or modifying antibacterial therapy. In the absence of such data, local epidemiology and susceptibility patterns may contribute to the empiric selection of therapy. Indicated surgical procedures should be performed.\\nBrand name: Ampicillin\\n\\nGeneric name: AZTREONAM\\nUsage: 1 INDICATIONS AND USAGE CAYSTON® is indicated to improve respiratory symptoms in cystic fibrosis (CF) patients with Pseudomonas aeruginosa. Safety and effectiveness have not been established in pediatric patients below the age of 7 years, patients with FEV1 <25% or >75% predicted, or patients colonized with Burkholderia cepacia [see Clinical Studies (14) ]. To reduce the development of drug-resistant bacteria and maintain the effectiveness of CAYSTON and other antibacterial drugs, CAYSTON should be used only to treat patients with CF known to have Pseudomonas aeruginosa in the lungs. CAYSTON is a monobactam antibacterial indicated to improve respiratory symptoms in cystic fibrosis (CF) patients with Pseudomonas aeruginosa. Safety and effectiveness have not been established in pediatric patients below the age of 7 years, patients with FEV1 <25% or >75% predicted, or patients colonized with Burkholderia cepacia. (1)\\nBrand name: Cayston\\n\\nGeneric name: TERAZOSIN HYDROCHLORIDE\\nUsage: INDICATIONS AND USAGE Terazosin capsules are indicated for the treatment of symptomatic benign prostatic hyperplasia (BPH). There is a rapid response, with approximately 70% of patients experiencing an increase in urinary flow and improvement in symptoms of BPH when treated with terazosin capsules. 
The long-term effects of terazosin capsules on the incidence of surgery, acute urinary obstruction or other complications of BPH are yet to be determined. Terazosin capsules are also indicated for the treatment of hypertension. Terazosin capsules can be used alone or in combination with other antihypertensive agents such as diuretics or beta-adrenergic blocking agents.\\nBrand name: Terazosin\\n\\n'" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "example_prompt = \"\"\n", "for example in examples:\n", " example_prompt += f\"Generic name: {example['generic_name']}\\nUsage: {example['usage']}\\nBrand name: {example['brand_name']}\\n\\n\"\n", "\n", "example_prompt" ] }, { "cell_type": "markdown", "metadata": { "id": "kbV2X1CXAyLV" }, "source": [ "Finally, we can create a suffix to our prompt. This will contain the generic name of the drug, its usage, ending with a request for brand names." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "id": "OYp6W_XfHTlo" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generic name: Entropofloxacin\n", "Usage: Entropofloxacin is a fluoroquinolone antibiotic that is used to treat a variety of bacterial infections, including: pneumonia, streptococcus infections, salmonella infections, escherichia coli infections, and pseudomonas aeruginosa infections It is taken by mouth or by injection. The dosage and frequency of administration will vary depending on the type of infection being treated. It should be taken for the full course of treatment, even if symptoms improve after a few days. 
Stopping the medication early may increase the risk of the infection coming back.\n", "Brand names:\n" ] } ], "source": [ "suffix_prompt = f\"\"\"Generic name: {GENERIC_NAME}\n", "Usage: {USAGE}\n", "Brand names:\"\"\"\n", "\n", "print(suffix_prompt)" ] }, { "cell_type": "markdown", "metadata": { "id": "RiaisW1nihJP" }, "source": [ "Let's pull it all together into a few-shot prompt." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "id": "99xdU7l8C1h8" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Provide 10 unique and modern brand names in Markdown bullet point format, related to the drug at the bottom of this prompt.\n", "\n", "Be creative with the brand names. Don't use English words directly; use variants or invented words.\n", "\n", "First, we will provide 3 examples to help with your thought process.\n", "\n", "Then, we will provide the generic name and usage for the drug we'd like you to generate brand names for.\n", "Generic name: AMPICILLIN SODIUM\n", "Usage: INDICATIONS AND USAGE Ampicillin for Injection, USP is indicated in the treatment of infections caused by susceptible strains of the designated organisms in the following conditions: Respiratory Tract Infections caused by Streptococcus pneumoniae. Staphylococcus aureus (penicillinase and nonpenicillinase-producing), H. influenzae, and Group A beta-hemolytic streptococci. Bacterial Meningitis caused by E. coli, Group B streptococci, and other Gram-negative bacteria (Listeria monocytogenes, N. meningitidis). The addition of an aminoglycoside with ampicillin may increase its effectiveness against Gram-negative bacteria. Septicemia and Endocarditis caused by susceptible Gram-positive organisms including Streptococcus spp., penicillin G-susceptible staphylococci, and enterococci. Gram-negative sepsis caused by E. coli, Proteus mirabilis and Salmonella spp. responds to ampicillin. Endocarditis due to enterococcal strains usually respond to intravenous therapy. 
The addition of an aminoglycoside may enhance the effectiveness of ampicillin when treating streptococcal endocarditis. Urinary Tract Infections caused by sensitive strains of E. coli and Proteus mirabilis. Gastrointestinal Infections caused by Salmonella typhi (typhoid fever), other Salmonella spp., and Shigella spp. (dysentery) usually respond to oral or intravenous therapy. Bacteriology studies to determine the causative organisms and their susceptibility to ampicillin should be performed. Therapy may be instituted prior to obtaining results of susceptibility testing. It is advisable to reserve the parenteral form of this drug for moderately severe and severe infections and for patients who are unable to take the oral forms. A change to oral ampicillin may be made as soon as appropriate. To reduce the development of drug-resistant bacteria and maintain the effectiveness of Ampicillin for Injection, USP and other antibacterial drugs, Ampicillin for Injection, USP should be used only to treat or prevent infections that are proven or strongly suspected to be caused by susceptible bacteria. When culture and susceptibility information are available, they should be considered in selecting or modifying antibacterial therapy. In the absence of such data, local epidemiology and susceptibility patterns may contribute to the empiric selection of therapy. Indicated surgical procedures should be performed.\n", "Brand name: Ampicillin\n", "\n", "Generic name: AZTREONAM\n", "Usage: 1 INDICATIONS AND USAGE CAYSTON® is indicated to improve respiratory symptoms in cystic fibrosis (CF) patients with Pseudomonas aeruginosa. Safety and effectiveness have not been established in pediatric patients below the age of 7 years, patients with FEV1 <25% or >75% predicted, or patients colonized with Burkholderia cepacia [see Clinical Studies (14) ]. 
To reduce the development of drug-resistant bacteria and maintain the effectiveness of CAYSTON and other antibacterial drugs, CAYSTON should be used only to treat patients with CF known to have Pseudomonas aeruginosa in the lungs. CAYSTON is a monobactam antibacterial indicated to improve respiratory symptoms in cystic fibrosis (CF) patients with Pseudomonas aeruginosa. Safety and effectiveness have not been established in pediatric patients below the age of 7 years, patients with FEV1 <25% or >75% predicted, or patients colonized with Burkholderia cepacia. (1)\n", "Brand name: Cayston\n", "\n", "Generic name: TERAZOSIN HYDROCHLORIDE\n", "Usage: INDICATIONS AND USAGE Terazosin capsules are indicated for the treatment of symptomatic benign prostatic hyperplasia (BPH). There is a rapid response, with approximately 70% of patients experiencing an increase in urinary flow and improvement in symptoms of BPH when treated with terazosin capsules. The long-term effects of terazosin capsules on the incidence of surgery, acute urinary obstruction or other complications of BPH are yet to be determined. Terazosin capsules are also indicated for the treatment of hypertension. Terazosin capsules can be used alone or in combination with other antihypertensive agents such as diuretics or beta-adrenergic blocking agents.\n", "Brand name: Terazosin\n", "\n", "Generic name: Entropofloxacin\n", "Usage: Entropofloxacin is a fluoroquinolone antibiotic that is used to treat a variety of bacterial infections, including: pneumonia, streptococcus infections, salmonella infections, escherichia coli infections, and pseudomonas aeruginosa infections It is taken by mouth or by injection. The dosage and frequency of administration will vary depending on the type of infection being treated. It should be taken for the full course of treatment, even if symptoms improve after a few days. 
Stopping the medication early may increase the risk of the infection coming back.\n", "Brand names:\n" ] } ], "source": [ "# Define the prompt\n", "few_shot_prompt = prefix_prompt + example_prompt + suffix_prompt\n", "\n", "# Print the prompt\n", "print(few_shot_prompt)" ] }, { "cell_type": "markdown", "metadata": { "id": "nbUWdHtfitWn" }, "source": [ "Now, let's pass our prompt to the LLM, and get a response!" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "id": "d4ODRJdvLhlQ" }, "outputs": [ { "data": { "text/html": [ "Query job 5c6c3b79-812c-4a6e-876e-ca1ff6230a6e is DONE. 0 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 168d5859-5edb-4702-8192-838ac2c7bc17 is DONE. 8 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 72f07348-4bcd-4042-84ca-396e7651ad03 is DONE. 2 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 70863a3b-8c63-423c-84cd-2804139daf5f is DONE. 679 Bytes processed. 
Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "- **Aerion:** (Derived from \"aer\" meaning air)\n", "- **Aquazone:** (Combining \"aqua\" for water and \"zone\" for area)\n", "- **Biosphere:** (Inspired by the concept of a self-contained ecosystem)\n", "- **Celestial:** (Evoking the vastness and healing power of the universe)\n", "- **Ethereal:** (Conveying a sense of lightness and transcendence)\n", "- **Luminary:** (From \"lumen\" meaning light, symbolizing hope and healing)\n", "- **Quasar:** (Inspired by the powerful and distant cosmic objects)\n", "- **Sanctuary:** (Creating a sense of safety and refuge)\n", "- **Zenith:** (Reaching the highest point or peak)\n", "- **Zephyr:** (Named after the gentle west wind, representing a calming and soothing effect)" ], "text/plain": [ "" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "response = predict(few_shot_prompt)\n", "\n", "Markdown(response)" ] }, { "cell_type": "markdown", "metadata": { "id": "pFakjrTElOBs" }, "source": [ "# Bulk generation\n", "\n", "Let's take these experiments to the next level by generating many names in bulk. We'll see how to leverage BigFrames at scale!\n", "\n", "We can start by finding drugs that are missing brand names, which we identify here as rows whose brand name is identical to the generic name. Approximately 4,000 drugs meet this criterion. We'll limit the result to 100 rows in this notebook." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "id": "8eAutS41mx6U" }, "outputs": [ { "data": { "text/html": [ "Query job b73f92bb-0e58-4fe4-adfb-b948fc5f4647 is DONE. 84.4 MB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 392dae36-aacb-4753-b28c-dad8291cb153 is DONE. 0 Bytes processed. 
Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 7c6ff6ee-db64-4629-a417-846dcecac127 is DONE. 6.3 kB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
[5 rows x 3 columns in total]" ], "text/plain": [ " openfda_generic_name \\\n", "89 MEPHITIS MEPHITICA \n", "105 ONDANSETRON \n", "124 CLOFARABINE \n", "273 ACETAMINOPHEN AND DIPHENHYDRAMINE HYDROCHLORIDE \n", "284 OFLOXACIN \n", "\n", " openfda_brand_name \\\n", "89 MEPHITIS MEPHITICA \n", "105 ONDANSETRON \n", "124 CLOFARABINE \n", "273 ACETAMINOPHEN AND DIPHENHYDRAMINE HYDROCHLORIDE \n", "284 OFLOXACIN \n", "\n", " indications_and_usage \n", "89 INDICATIONS Condition listed above or as direc... \n", "105 1 INDICATIONS AND USAGE Ondansetron Injection,... \n", "124 1 INDICATIONS AND USAGE Clofarabine injection ... \n", "273 Uses Temporary relief of occasional headaches ... \n", "284 INDICATIONS AND USAGE To reduce the developmen... \n", "\n", "[5 rows x 3 columns]" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Query 3 columns of interest from drug label dataset\n", "df_missing = bpd.read_gbq(\"bigquery-public-data.fda_drug.drug_label\",\n", " columns=[\"openfda_generic_name\", \"openfda_brand_name\", \"indications_and_usage\"])\n", "\n", "# Exclude any rows with missing data\n", "df_missing = df_missing.dropna()\n", "\n", "# Keep only rows in which openfda_brand_name equals openfda_generic_name\n", "df_missing = df_missing[df_missing[\"openfda_generic_name\"] == df_missing[\"openfda_brand_name\"]]\n", "\n", "# Limit the number of rows for demonstration purposes\n", "df_missing = df_missing.head(100)\n", "\n", "# Print values\n", "df_missing.head()" ] }, { "cell_type": "markdown", "metadata": { "id": "Fm6L8S7eVnCI" }, "source": [ "We will create a column `prompt` with a customized prompt for each row." ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "id": "19TvGN1PVmVX" }, "outputs": [], "source": [ "df_missing[\"prompt\"] = (\n", " \"Provide a unique and modern brand name related to this pharmaceutical drug. \"\n", " + \"Don't use English words directly; use variants or invented words. 
The generic name is: \"\n", " + df_missing[\"openfda_generic_name\"]\n", " + \". The indications and usage are: \"\n", " + df_missing[\"indications_and_usage\"]\n", " + \".\"\n", ")" ] }, { "cell_type": "markdown", "metadata": { "id": "njxwBvCKgMPE" }, "source": [ "We'll create a new helper function, `batch_predict()`, and query the LLM. The job may take a couple of minutes to execute." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "id": "tiSHa5B4aFhw" }, "outputs": [ { "data": { "text/html": [ "Query job d216bea6-9b9c-4918-9194-40de2745beca is DONE. 84.4 MB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 37d88636-b1fb-44da-9504-44144af9624d is DONE. 800 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 0b35db83-5bac-47b4-8a2c-b46a816c0e3e is DONE. 200 Bytes processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Send a Series of prompts to the model and return the generated text as a Series\n", "def batch_predict(\n", "    prompts: bpd.Series, temperature: float = TEMPERATURE\n", ") -> bpd.Series:\n", "    return model.predict(prompts, temperature=temperature).ml_generate_text_llm_result\n", "\n", "\n", "response = batch_predict(df_missing[\"prompt\"])" ] }, { "cell_type": "markdown", "metadata": { "id": "K5a2nHdLgZEj" }, "source": [ "Let's check the results for one of our responses!" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "id": "TnizdeqBdbZj" }, "outputs": [ { "data": { "text/html": [ "Query job 4397b5f3-5058-409c-a361-c9fa715e46ee is DONE. 84.4 MB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 147ea301-e249-49fb-8280-d61948d5df7f is DONE. 84.4 MB processed. 
Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "Query job 067a2a73-0f36-42a6-973e-074ab8be631a is DONE. 56.7 kB processed. Open Job" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Generic name: MEPHITIS MEPHITICA\n", "Usage: INDICATIONS Condition listed above or as directed by the physician\n", "Response: **Ephemeral** (Latin root: \"ephemerus,\" meaning \"lasting for a day\")\n", "\n", "**Aetheria** (Greek root: \"aither,\" meaning \"upper air, sky\")\n", "\n", "**Zenithar** (Combination of \"zenith\" and \"pharma\")\n", "\n", "**Celestian** (Latin root: \"celestial,\" meaning \"heavenly\")\n", "\n", "**Astralux** (Combination of \"astral\" and \"lux,\" meaning \"light\")\n" ] } ], "source": [ "# Pick a sample\n", "k = 0\n", "\n", "# Gather the prompt and response details\n", "prompt_generic = df_missing[\"openfda_generic_name\"].iloc[k]\n", "prompt_usage = df_missing[\"indications_and_usage\"].iloc[k]\n", "response_str = response.iloc[k]\n", "\n", "# Print details\n", "print(f\"Generic name: {prompt_generic}\")\n", "print(f\"Usage: {prompt_usage}\")\n", "print(f\"Response: {response_str}\")" ] }, { "cell_type": "markdown", "metadata": { "id": "W4MviwyMI-Qh" }, "source": [ "Congratulations! You have learned how to use generative AI to jumpstart the creative process.\n", "\n", "You've also seen how BigFrames can manage each step of the process, including gathering data, manipulating it, and querying the LLM." 
] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" } }, "nbformat": 4, "nbformat_minor": 0 }