bigframes.ml.ensemble.XGBRegressor#

class bigframes.ml.ensemble.XGBRegressor(n_estimators: int = 1, *, booster: Literal['gbtree', 'dart'] = 'gbtree', dart_normalized_type: Literal['tree', 'forest'] = 'tree', tree_method: Literal['auto', 'exact', 'approx', 'hist'] = 'auto', min_tree_child_weight: int = 1, colsample_bytree: float = 1.0, colsample_bylevel: float = 1.0, colsample_bynode: float = 1.0, gamma: float = 0.0, max_depth: int = 6, subsample: float = 1.0, reg_alpha: float = 0.0, reg_lambda: float = 1.0, learning_rate: float = 0.3, max_iterations: int = 20, tol: float = 0.01, enable_global_explain: bool = False, xgboost_version: Literal['0.9', '1.1'] = '0.9')[source]#

XGBoost regression model.

Parameters:
  • n_estimators (Optional[int]) – Number of parallel trees constructed during each iteration. Defaults to 1.

  • booster (Optional[str]) – Specify which booster to use: “gbtree” or “dart”. Defaults to “gbtree”.

  • dart_normalized_type (Optional[str]) – Type of normalization algorithm for the DART booster. Possible values: “tree”, “forest”. Defaults to “tree”.

  • tree_method (Optional[str]) – Specify which tree method to use. Possible values: “auto”, “exact”, “approx”, “hist”. Defaults to “auto”, in which case XGBoost chooses the most conservative option available.

  • min_tree_child_weight (Optional[int]) – Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.

  • colsample_bytree (Optional[float]) – Subsample ratio of columns when constructing each tree. Defaults to 1.0.

  • colsample_bylevel (Optional[float]) – Subsample ratio of columns for each level. Defaults to 1.0.

  • colsample_bynode (Optional[float]) – Subsample ratio of columns for each split. Defaults to 1.0.

  • gamma (Optional[float]) – (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Defaults to 0.0.

  • max_depth (Optional[int]) – Maximum tree depth for base learners. Defaults to 6.

  • subsample (Optional[float]) – Subsample ratio of the training instances. Defaults to 1.0.

  • reg_alpha (Optional[float]) – L1 regularization term on weights (xgb’s alpha). Defaults to 0.0.

  • reg_lambda (Optional[float]) – L2 regularization term on weights (xgb’s lambda). Defaults to 1.0.

  • learning_rate (Optional[float]) – Boosting learning rate (xgb’s “eta”). Defaults to 0.3.

  • max_iterations (Optional[int]) – Maximum number of boosting rounds. Defaults to 20.

  • tol (Optional[float]) – Minimum relative loss improvement necessary to continue training. Defaults to 0.01.

  • enable_global_explain (Optional[bool]) – Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Defaults to False.

  • xgboost_version (Optional[str]) – Specifies the XGBoost version for model training. Possible values: “0.9”, “1.1”. Defaults to “0.9”.
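As a usage sketch, the snippet below constructs and fits a model with a few non-default hyperparameters. The project, table, and column names are hypothetical, and running the body requires an authenticated BigQuery session, so the imports are kept inside the function to let the sketch load without one:

```python
def train_xgb_regressor(table: str = "my-project.my_dataset.housing"):
    """Hypothetical training flow; table and column names are illustrative."""
    # Imports are local so this sketch can be defined without an
    # authenticated BigQuery session.
    import bigframes.pandas as bpd
    from bigframes.ml.ensemble import XGBRegressor

    df = bpd.read_gbq(table)
    X = df[["sqft", "bedrooms", "age"]]
    y = df[["price"]]

    # Non-default hyperparameters from the signature above.
    model = XGBRegressor(
        booster="gbtree",
        max_depth=8,        # deeper base learners
        learning_rate=0.1,  # slower, more conservative boosting
        max_iterations=50,  # more boosting rounds
        subsample=0.8,      # row subsampling per iteration
    )
    model.fit(X, y)
    return model
```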

predict(X: DataFrame | Series) → DataFrame[source]#

Predict using the XGB model.

Parameters:

X (bigframes.dataframe.DataFrame or bigframes.series.Series) – Series or DataFrame of shape (n_samples, n_features). Samples.

Returns:

DataFrame of shape (n_samples, n_input_columns + n_prediction_columns). Returns predicted values.

Return type:

bigframes.dataframe.DataFrame
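A brief, hypothetical continuation of the earlier sketch (assumes a fitted model; the feature column names are illustrative):

```python
def predict_prices(model, new_data):
    """Score fresh rows with a fitted XGBRegressor (names illustrative)."""
    # The result keeps the input columns and appends prediction columns,
    # matching the documented
    # (n_samples, n_input_columns + n_prediction_columns) output shape.
    return model.predict(new_data[["sqft", "bedrooms", "age"]])
```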

score(X: DataFrame | Series, y: DataFrame | Series)[source]#

Calculate evaluation metrics of the model.

Note

Output matches that of the BigQuery ML.EVALUATE function. See: https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-evaluate#regression_models for the outputs relevant to this model type.

Parameters:
  • X (bigframes.dataframe.DataFrame or bigframes.series.Series) – Series or DataFrame of shape (n_samples, n_features). Test samples.

  • y (bigframes.dataframe.DataFrame or bigframes.series.Series) – Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). True values for X.

Returns:

A DataFrame of the evaluation result.

Return type:

bigframes.dataframe.DataFrame
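A minimal evaluation sketch, assuming held-out X_test and y_test frames prepared as in the training sketch:

```python
def evaluate(model, X_test, y_test):
    """Hypothetical hold-out evaluation of a fitted XGBRegressor."""
    # For regression models, ML.EVALUATE reports metrics such as
    # mean_absolute_error, mean_squared_error, and r2_score
    # (see the ML.EVALUATE docs linked above).
    return model.score(X_test, y_test)
```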

to_gbq(model_name: str, replace: bool = False) → XGBRegressor[source]#

Save the model to BigQuery.

Parameters:
  • model_name (str) – The name of the model.

  • replace (bool, default False) – Whether to replace the model if it already exists. Defaults to False.

Returns:

Saved model.

Return type:

XGBRegressor
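To persist a trained model, a one-line sketch (the fully qualified model name is illustrative):

```python
def save_model(model, name: str = "my-project.my_dataset.xgb_housing_model"):
    """Persist a fitted XGBRegressor to BigQuery (model name illustrative)."""
    # replace=True overwrites any existing model at this path.
    return model.to_gbq(name, replace=True)
```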