bigframes.ml.llm.TextEmbeddingGenerator
- class bigframes.ml.llm.TextEmbeddingGenerator(*, model_name: Literal['text-embedding-005', 'text-embedding-004', 'text-multilingual-embedding-002'] | None = None, session: Session | None = None, connection_name: str | None = None)
Text embedding generator LLM model.
Note
text-embedding-004 is going to be deprecated. Use text-embedding-005 (https://cloud.google.com/python/docs/reference/bigframes/latest/bigframes.ml.llm.TextEmbeddingGenerator) instead.
- Parameters:
model_name (str, default "text-embedding-004") – The model used for text embedding. Possible values are “text-embedding-005”, “text-embedding-004”, or “text-multilingual-embedding-002”. The text-embedding models return embeddings for text inputs. The text-multilingual-embedding models return embeddings for text inputs and support over 100 languages. If no value is provided, “text-embedding-004” is used by default and a warning is issued.
session (bigframes.Session or None) – BigQuery session used to create the model. If None, the global default session is used.
connection_name (str or None) – Connection used to reach the remote service, given as a str of the format <PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>. If None, the default connection of the session context is used.
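The connection_name format described above can be checked before the generator is constructed. The following is a minimal sketch and not part of bigframes: `ConnectionParts` and `parse_connection_name` are hypothetical helpers, the example values are made up, and splitting from the right is an assumption made so that a project identifier containing dots still parses.

```python
from typing import NamedTuple


class ConnectionParts(NamedTuple):
    """Hypothetical container for the three parts of a connection name."""

    project: str
    location: str
    connection_id: str


def parse_connection_name(connection_name: str) -> ConnectionParts:
    """Split '<PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>' into parts.

    Assumption: the last two dot-separated fields are the location and the
    connection ID, so everything before them is treated as the project.
    """
    parts = connection_name.rsplit(".", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected <PROJECT_NUMBER/PROJECT_ID>.<LOCATION>.<CONNECTION_ID>, "
            f"got {connection_name!r}"
        )
    return ConnectionParts(*parts)


# Example values are made up for illustration.
parts = parse_connection_name("my-project.us.my-connection")
print(parts.location)  # us
```

A string that parses cleanly can then be passed as connection_name. Passing None instead falls back to the default connection of the session context.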
- predict(X: DataFrame | Series | DataFrame | Series, *, max_retries: int = 0) → DataFrame
Predict the result from input DataFrame.
- Parameters:
X (bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series) – Input DataFrame or Series, which can contain one or more columns. If the DataFrame has multiple columns, it must contain a “content” column for prediction.
max_retries (int, default 0) – Maximum number of retries if prediction fails for any rows. Each try must make progress (i.e. successfully predict at least one row) for retrying to continue. Each retry appends the newly succeeded rows. When the maximum number of retries is reached, the remaining rows (those without successful predictions) are appended to the end of the result.
- Returns:
DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values.
- Return type:
bigframes.dataframe.DataFrame
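The max_retries behavior described for predict affects the order of rows in the result. The following is a small pure-Python simulation of those rules, assuming nothing about how bigframes implements them: `simulate_retries`, `try_row`, and the `first_success` table are hypothetical names invented for this illustration, and no BigQuery call is made.

```python
from typing import Callable, Dict, List, Tuple


def simulate_retries(
    rows: List[str],
    try_row: Callable[[str, int], bool],
    max_retries: int = 0,
) -> Tuple[List[str], List[str]]:
    """Return (result_order, failed_rows) under the documented retry rules."""
    succeeded: List[str] = []
    pending = list(rows)
    # One initial try plus up to max_retries retries.
    for attempt in range(max_retries + 1):
        if not pending:
            break
        ok: List[str] = []
        still_failing: List[str] = []
        for row in pending:
            (ok if try_row(row, attempt) else still_failing).append(row)
        if not ok:
            break  # no progress on this try, so retrying stops
        succeeded.extend(ok)  # newly succeeded rows are appended
        pending = still_failing
    # Rows that never succeeded go at the end of the result.
    return succeeded + pending, pending


# Made-up outcome table: the attempt on which each row first succeeds.
first_success: Dict[str, int] = {"a": 1, "b": 0}  # "c" never succeeds


def try_row(row: str, attempt: int) -> bool:
    return row in first_success and attempt >= first_success[row]


order, failed = simulate_retries(["a", "b", "c"], try_row, max_retries=1)
print(order)   # ['b', 'a', 'c']
print(failed)  # ['c']
```

In this simulation row "b" succeeds first and "a" only on the retry, so the result order differs from the input order. Callers that rely on row order may want to join the result back to the input on a key column.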
- to_gbq(model_name: str, replace: bool = False) → TextEmbeddingGenerator
Save the model to BigQuery.
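- Parameters:
model_name (str) – Name of the model to save in BigQuery.
replace (bool, default False) – Whether to replace the model if it already exists.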
- Returns:
Saved model.
- Return type:
TextEmbeddingGenerator