bigframes.ml.ensemble.RandomForestRegressor

class bigframes.ml.ensemble.RandomForestRegressor(n_estimators: int = 100, *, tree_method: Literal['auto', 'exact', 'approx', 'hist'] = 'auto', min_tree_child_weight: int = 1, colsample_bytree: float = 1.0, colsample_bylevel: float = 1.0, colsample_bynode: float = 0.8, gamma: float = 0.0, max_depth: int = 15, subsample: float = 0.8, reg_alpha: float = 0.0, reg_lambda: float = 1.0, tol: float = 0.01, enable_global_explain: bool = False, xgboost_version: Literal['0.9', '1.1'] = '0.9')

A random forest regressor.

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and averages their predictions to improve predictive accuracy and control over-fitting.
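The averaging idea described above can be illustrated with a toy, pure-Python sketch. It is not how BigQuery ML or XGBoost builds trees: `fit_stump`, `fit_forest`, and `predict` are hypothetical helpers written for this illustration, each "tree" is only a depth-1 split, and the data is made up.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree by minimising squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lm, rm = mean(left), mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:
        # No valid split (all x identical): predict the sample mean.
        m = mean(ys)
        return lambda x: m
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def fit_forest(xs, ys, n_estimators=25, subsample=0.8, seed=0):
    """Fit each stump on its own random sub-sample of the rows."""
    rng = random.Random(seed)
    k = max(2, int(subsample * len(xs)))
    trees = []
    for _ in range(n_estimators):
        idx = rng.sample(range(len(xs)), k)
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees


def predict(trees, x):
    """The forest prediction is the average of the per-tree predictions."""
    return mean(tree(x) for tree in trees)


# Made-up step-shaped data: y jumps from 0 to 10 at x = 10.
xs = [float(i) for i in range(20)]
ys = [0.0 if x < 10 else 10.0 for x in xs]
forest = fit_forest(xs, ys)
print(predict(forest, 2.0), predict(forest, 17.0))
```

Each stump sees a different sub-sample, so individual split points differ slightly; averaging smooths out those differences.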

Parameters:
  • n_estimators (Optional[int]) – Number of parallel trees constructed during each iteration. Defaults to 100. Minimum value is 2.

  • tree_method (Optional[str]) – Specifies which tree method to use. Defaults to “auto”, in which case XGBoost chooses the most conservative option available. Possible values: “auto”, “exact”, “approx”, “hist”.

  • min_tree_child_weight (Optional[float]) – Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.

  • colsample_bytree (Optional[float]) – Subsample ratio of columns when constructing each tree. Defaults to 1.0. The value should be between 0 and 1.

  • colsample_bylevel (Optional[float]) – Subsample ratio of columns for each level. Defaults to 1.0. The value should be between 0 and 1.

  • colsample_bynode (Optional[float]) – Subsample ratio of columns for each split. Defaults to 0.8. The value should be between 0 and 1.

  • gamma (Optional[float]) – Minimum loss reduction required to make a further partition on a leaf node of the tree (XGBoost’s min_split_loss). Defaults to 0.0.

  • max_depth (Optional[int]) – Maximum tree depth for base learners. Defaults to 15. The value should be greater than 0.

  • subsample (Optional[float]) – Subsample ratio of the training instances. Defaults to 0.8. The value should be greater than 0 and less than 1.

  • reg_alpha (Optional[float]) – L1 regularization term on weights (XGBoost’s alpha). Defaults to 0.0.

  • reg_lambda (Optional[float]) – L2 regularization term on weights (XGBoost’s lambda). Defaults to 1.0.

  • tol (Optional[float]) – Minimum relative loss improvement necessary to continue training. Defaults to 0.01.

  • enable_global_explain (Optional[bool]) – Whether to compute global explanations with Explainable AI to evaluate global feature importance for the model. Defaults to False.

  • xgboost_version (Optional[str]) – Specifies the XGBoost version used for model training. Defaults to “0.9”. Possible values: “0.9”, “1.1”.
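As a rough sketch, the documented ranges and allowed values can be checked before constructing the estimator. `validate_rf_params` and `make_regressor` are hypothetical helpers, not part of bigframes; the defaults are copied from the class signature, and treating the column-sampling bounds as inclusive is an assumption (the default of 1.0 suggests it).

```python
ALLOWED_TREE_METHODS = {"auto", "exact", "approx", "hist"}
ALLOWED_XGBOOST_VERSIONS = {"0.9", "1.1"}

# Defaults copied from the class signature.
DEFAULTS = {
    "n_estimators": 100,
    "tree_method": "auto",
    "colsample_bytree": 1.0,
    "colsample_bylevel": 1.0,
    "colsample_bynode": 0.8,
    "max_depth": 15,
    "subsample": 0.8,
    "xgboost_version": "0.9",
}


def validate_rf_params(params):
    """Return a list of problems; an empty list means the values look valid."""
    p = {**DEFAULTS, **params}
    problems = []
    if p["n_estimators"] < 2:
        problems.append("n_estimators must be at least 2")
    if p["tree_method"] not in ALLOWED_TREE_METHODS:
        problems.append("tree_method must be one of auto, exact, approx, hist")
    for name in ("colsample_bytree", "colsample_bylevel", "colsample_bynode"):
        # Assumption: bounds are inclusive, since the default is 1.0.
        if not 0.0 <= p[name] <= 1.0:
            problems.append(f"{name} must be between 0 and 1")
    if p["max_depth"] <= 0:
        problems.append("max_depth must be greater than 0")
    if not 0.0 < p["subsample"] < 1.0:
        problems.append("subsample must be greater than 0 and less than 1")
    if p["xgboost_version"] not in ALLOWED_XGBOOST_VERSIONS:
        problems.append("xgboost_version must be '0.9' or '1.1'")
    return problems


def make_regressor(**params):
    """Validate first, then build the estimator with keyword arguments."""
    problems = validate_rf_params(params)
    if problems:
        raise ValueError("; ".join(problems))
    # Imported lazily so the validator works without bigframes installed.
    from bigframes.ml.ensemble import RandomForestRegressor

    return RandomForestRegressor(**params)


print(validate_rf_params({"n_estimators": 1, "subsample": 1.0}))
```

Passing everything by keyword matches the signature, where every parameter after `n_estimators` is keyword-only.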

Methods

__init__([n_estimators, tree_method, ...])

fit(X, y[, X_eval, y_eval])

Build a forest of trees from the training set (X, y).

get_params([deep])

Get parameters for this estimator.

predict(X)

Predict regression target for X.

register([vertex_ai_model_id])

Register the model to Vertex AI.

score(X, y)

Calculate evaluation metrics of the model.

to_gbq(model_name[, replace])

Save the model to BigQuery.
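The methods listed above might be chained as in the following sketch. `run_workflow` is a hypothetical helper; it relies only on the method names and arguments shown in this summary (`fit(X, y)`, `score(X, y)`, `predict(X)`, `to_gbq(model_name[, replace])`), and passing `replace` as a keyword is an assumption.

```python
def run_workflow(model, X, y, model_name):
    """Train, evaluate, predict, and save a model using the listed methods.

    `model` is expected to be a RandomForestRegressor instance; any object
    that exposes the same four methods works for this sketch.
    """
    model.fit(X, y)                # build the forest from the training set
    metrics = model.score(X, y)    # evaluation metrics of the model
    predictions = model.predict(X) # regression targets for X
    # Assumption: `replace` is accepted as a keyword argument.
    model.to_gbq(model_name, replace=True)
    return metrics, predictions
```

With a live BigQuery session, `model` would typically be `RandomForestRegressor(n_estimators=50)`, `X` and `y` would be BigQuery DataFrames objects holding the feature columns and the label column, and `model_name` a dataset-qualified model name; those inputs are placeholders here, not values from this page.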