bigframes.ml.pipeline.Pipeline

class bigframes.ml.pipeline.Pipeline(steps: List[Tuple[str, BaseEstimator]])

Pipeline of transforms with a final estimator.

Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be transforms. That is, they must implement fit and transform methods. The final estimator only needs to implement fit.

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. This simplifies code and allows for deploying an estimator and preprocessing together, e.g. with Pipeline.to_gbq(…).
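As a rough illustration of the constructor, the sketch below assembles a two-step pipeline. The step names ("scale", "linreg") are arbitrary labels chosen for this example, and it assumes StandardScaler and LinearRegression are available from bigframes.ml.preprocessing and bigframes.ml.linear_model in the installed version. The imports sit inside a helper function (build_pipeline, a hypothetical name) so that nothing is loaded until it is called.

```python
def build_pipeline():
    """Assemble a scaler followed by a linear regression (illustrative only)."""
    # Imported lazily so this snippet can be loaded even where bigframes
    # is not installed; the imports run only when the function is called.
    from bigframes.ml.linear_model import LinearRegression
    from bigframes.ml.pipeline import Pipeline
    from bigframes.ml.preprocessing import StandardScaler

    # Each step is a (name, estimator) tuple. Every step except the last
    # must be a transform; the last step is the final estimator.
    return Pipeline([
        ("scale", StandardScaler()),
        ("linreg", LinearRegression()),
    ])
```

Calling build_pipeline() returns an unfitted Pipeline; fitting it still requires data and a configured BigQuery session.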

fit(X: DataFrame | Series, y: DataFrame | Series | None = None) → Pipeline

Fit the model.

Fit each transformer in order, passing the transformed output of one step as the input to the next. Finally, fit the final estimator on the fully transformed data.

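Parameters:
  • X (DataFrame | Series) – Training data.

  • y (DataFrame | Series | None, default None) – Training targets, if applicable.

Returns: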

Pipeline with fitted steps.

Return type:

Pipeline
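The fitting order described above can be sketched in plain Python. This is a toy model of the fit/transform contract, not the bigframes implementation: AddOne, Double, MeanEstimator and fit_steps are hypothetical names invented for this example, and plain lists stand in for DataFrames.

```python
class AddOne:
    """Toy transform: learns nothing, adds 1 to every value."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [x + 1 for x in X]


class Double:
    """Toy transform: learns nothing, doubles every value."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [x * 2 for x in X]


class MeanEstimator:
    """Toy final estimator: only implements fit, storing the mean of X."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self


def fit_steps(steps, X, y=None):
    """Fit each transform in order, then fit the final estimator."""
    *transforms, (_, final) = steps
    for _, transform in transforms:
        # Fit the transform, then feed its output to the next step.
        X = transform.fit(X, y).transform(X)
    # The final estimator sees only the fully transformed data.
    final.fit(X, y)
    return steps


steps = [("add_one", AddOne()), ("double", Double()), ("mean", MeanEstimator())]
fit_steps(steps, [1, 2, 3])
print(steps[-1][1].mean_)  # [1, 2, 3] -> [2, 3, 4] -> [4, 6, 8], mean 6.0
```

The final estimator never sees the raw input, which is why preprocessing and model can be treated as a single unit.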

to_gbq(model_name: str, replace: bool = False) → Pipeline

Save the pipeline to BigQuery.

Parameters:
  • model_name (str) – The name under which to save the model (pipeline).

  • replace (bool, default False) – Whether to replace the model (pipeline) if it already exists.

Returns:

The saved model (pipeline).

Return type:

Pipeline
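A possible end-to-end use of fit and to_gbq is sketched below. It assumes a configured BigQuery session, read access to the public table bigquery-public-data.ml_datasets.penguins with the column names shown, and a writable dataset of your own. The helper name fit_and_save and any model name passed to it (for example "my_dataset.penguin_pipeline") are hypothetical.

```python
def fit_and_save(model_name, replace=False):
    """Fit a small pipeline on public data and save it to BigQuery.

    Illustrative only: requires bigframes and valid Google Cloud credentials.
    """
    # Imported lazily so that merely defining this function needs no setup.
    import bigframes.pandas as bpd
    from bigframes.ml.linear_model import LinearRegression
    from bigframes.ml.pipeline import Pipeline
    from bigframes.ml.preprocessing import StandardScaler

    # Assumed public table and column names; rows with nulls are dropped.
    df = bpd.read_gbq("bigquery-public-data.ml_datasets.penguins").dropna()
    X = df[["culmen_length_mm", "flipper_length_mm"]]
    y = df[["body_mass_g"]]

    pipeline = Pipeline([
        ("scale", StandardScaler()),
        ("linreg", LinearRegression()),
    ])
    pipeline.fit(X, y)

    # model_name is caller-supplied, e.g. "my_dataset.penguin_pipeline".
    return pipeline.to_gbq(model_name, replace=replace)
```

Leaving replace at its default of False is the more cautious choice, since it avoids overwriting an existing model of the same name.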