bigframes.ml.llm.TextEmbeddingGenerator

class bigframes.ml.llm.TextEmbeddingGenerator(*, model_name: Literal['text-embedding-005', 'text-embedding-004', 'text-multilingual-embedding-002'] | None = None, session: Session | None = None, connection_name: str | None = None)

Text embedding generator LLM model.

Note

text-embedding-004 is going to be deprecated. Use text-embedding-005 (https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm.TextEmbeddingGenerator) instead.

Parameters:

model_name (str, default "text-embedding-004") – The model for text embedding. Possible values are “text-embedding-005”, “text-embedding-004”, or “text-multilingual-embedding-002”. The text-embedding models return embeddings for text inputs. The text-multilingual-embedding models return embeddings for text inputs and support over 100 languages. If no value is provided, “text-embedding-004” is used by default and a warning is issued.
session (bigframes.Session or None) – BigQuery session used to create the model. If None, the global default session is used.
connection_name (str or None) – Connection used to reach the remote service, given as a str of the format <PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>. If None, the default connection in the session context is used.
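
As a minimal sketch of how these parameters fit together, the snippet below assembles a connection_name in the documented format and passes it to the constructor. The helper names build_connection_name and make_embedder are hypothetical and not part of bigframes; the bigframes import is deferred so that the string helper can run without the library installed.

```python
def build_connection_name(project: str, location: str, connection_id: str) -> str:
    """Join the parts into <PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>.

    Hypothetical helper for illustration; not part of bigframes.
    """
    parts = (project, location, connection_id)
    if not all(parts):
        raise ValueError("project, location and connection_id must all be non-empty")
    return ".".join(parts)


def make_embedder(connection_name=None, session=None):
    """Create a generator with an explicitly chosen model (hypothetical wrapper)."""
    # Deferred import: needs bigframes installed and BigQuery access at call time.
    from bigframes.ml.llm import TextEmbeddingGenerator

    # Naming the model explicitly avoids relying on the default model,
    # which the parameter description says triggers a warning.
    return TextEmbeddingGenerator(
        model_name="text-embedding-005",
        session=session,
        connection_name=connection_name,
    )
```

For example, build_connection_name("my-project", "us", "my-connection") yields "my-project.us.my-connection" (sample values only), which can then be passed to make_embedder.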

predict(X: bigframes.dataframe.DataFrame | bigframes.series.Series | pandas.core.frame.DataFrame | pandas.core.series.Series, *, max_retries: int = 0) → bigframes.dataframe.DataFrame

Predict the result from input DataFrame.

Parameters:

X (bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series) – Input DataFrame or Series. It can contain one or more columns; if the DataFrame has multiple columns, it must include a “content” column, which is used for prediction.
max_retries (int, default 0) – Maximum number of retries if prediction fails for any rows. Each try must make progress (i.e., successfully predict at least one row) for retrying to continue. Each retry appends the newly succeeded rows. When the maximum number of retries is reached, the remaining rows (those without successful predictions) are appended to the end of the result.

Returns:

DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values.

Return type:

bigframes.dataframe.DataFrame
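
The retry behaviour described for max_retries can be illustrated in plain Python. This is only a sketch of the semantics as worded above, not the library's implementation: predict_with_retries and the predict_batch callable are hypothetical stand-ins, and rows are modelled as strings rather than DataFrame rows.

```python
from typing import Callable, List, Sequence, Tuple

# A batch predictor returns (succeeded_rows, failed_rows) for the rows it is given.
BatchPredictor = Callable[[List[str]], Tuple[List[str], List[str]]]


def predict_with_retries(
    rows: Sequence[str], predict_batch: BatchPredictor, max_retries: int = 0
) -> List[Tuple[str, bool]]:
    """Return (row, succeeded) pairs, with unresolved rows at the end."""
    results: List[Tuple[str, bool]] = []
    pending = list(rows)
    # One initial try plus up to max_retries retries.
    for _ in range(max_retries + 1):
        if not pending:
            break
        succeeded, failed = predict_batch(pending)
        # Each try appends the rows that newly succeeded.
        results.extend((row, True) for row in succeeded)
        pending = failed
        if not succeeded:
            # No progress on this try, so stop retrying.
            break
    # Rows that never succeeded go at the end of the result.
    results.extend((row, False) for row in pending)
    return results
```

In this sketch the output order can differ from the input order, because rows are appended as they succeed and unresolved rows come last. With the real model, the corresponding call is predict(X, max_retries=...) on a TextEmbeddingGenerator instance.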

to_gbq(model_name: str, replace: bool = False) → TextEmbeddingGenerator

Save the model to BigQuery.

Parameters:

model_name (str) – The name of the model.
replace (bool, default False) – Whether to replace the model if it already exists.

Returns:

Saved model.

Return type:

TextEmbeddingGenerator
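
A small sketch of calling to_gbq follows. The wrapper save_embedder is hypothetical, and qualifying the model name with a dataset ("<dataset_id>.<model_id>") is an assumption about the expected name format rather than something stated above.

```python
def save_embedder(model, dataset_id: str, model_id: str, replace: bool = False):
    """Save `model` to BigQuery and return the saved model (hypothetical wrapper)."""
    # Assumption: the model name is qualified by its dataset.
    model_name = f"{dataset_id}.{model_id}"
    # With replace=False (the default), an existing model is not replaced.
    return model.to_gbq(model_name, replace=replace)
```

The wrapper only forwards to to_gbq, so `model` can be any object exposing that method, such as a TextEmbeddingGenerator instance.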
