bigframes.ml.model_selection.cross_validate

bigframes.ml.model_selection.cross_validate(estimator, X: DataFrame | Series | DataFrame | Series, y: DataFrame | Series | DataFrame | Series | None = None, *, cv: int | KFold | None = None) → dict[str, list]

Evaluate metric(s) by cross-validation and also record fit/score times.
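
As a rough illustration of what cross-validation with timing involves, the following pure-Python sketch splits the rows into folds, fits on the training rows, scores on the held-out rows, and records how long each step took. It is not the bigframes implementation: `k_fold_indices`, `MeanRegressor`, and `toy_cross_validate` are hypothetical helpers, and the contiguous, unshuffled folds are an assumption made to keep the output deterministic.

```python
import time


def k_fold_indices(n_rows, n_splits):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    fold_sizes = [
        n_rows // n_splits + (1 if i < n_rows % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


class MeanRegressor:
    """Toy estimator (hypothetical): always predicts the mean training label."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def score(self, X, y):
        # Mean squared error of the constant prediction on the given rows.
        return sum((label - self.mean_) ** 2 for label in y) / len(y)


def toy_cross_validate(estimator, X, y, cv=3):
    """Return a dict with the same three keys as described in Returns."""
    result = {"test_score": [], "fit_time": [], "score_time": []}
    for train_idx, test_idx in k_fold_indices(len(X), cv):
        t0 = time.perf_counter()
        estimator.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        t1 = time.perf_counter()
        score = estimator.score([X[i] for i in test_idx], [y[i] for i in test_idx])
        t2 = time.perf_counter()
        result["test_score"].append(score)
        result["fit_time"].append(t1 - t0)
        result["score_time"].append(t2 - t1)
    return result


X = [[1, 2], [3, 4], [5, 6]]
y = [1, 2, 3]
toy_scores = toy_cross_validate(MeanRegressor(), X, y, cv=3)
print(toy_scores["test_score"])
```

Each list in the result has one entry per split, so `cv=3` produces three scores and three pairs of timings.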

Examples:

>>> import bigframes.pandas as bpd
>>> from bigframes.ml.model_selection import cross_validate, KFold
>>> from bigframes.ml.linear_model import LinearRegression
>>> X = bpd.DataFrame({"feat0": [1, 3, 5], "feat1": [2, 4, 6]})
>>> y = bpd.DataFrame({"label": [1, 2, 3]})
>>> model = LinearRegression()
>>> scores = cross_validate(model, X, y, cv=3)
>>> for score in scores["test_score"]:
...   print(score["mean_squared_error"][0])
...
5.218167286047954e-19
2.726229944928669e-18
1.6197635612324266e-17
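
Parameters:

estimator

The model to fit and score on each cv split.

X (DataFrame | Series)

The feature data.

y (DataFrame | Series | None, default None)

The label data.

cv (int | KFold | None, default None)

The cross-validation splitting strategy, given as a number of splits or a KFold instance.

Returns: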

A dict of lists holding the scores and timings, with one entry per cv split. The keys of this dict are:

test_score

The test scores for each cv split.

fit_time

The time for fitting the estimator on the train set for each cv split.

score_time

The time for scoring the estimator on the test set for each cv split.

Return type:

Dict[str, List]
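
A minimal sketch of how the returned dict might be summarized across splits. The `scores` dict below is a hand-written stand-in with made-up values, not real output: each `test_score` entry is modelled as a plain dict so that `score["mean_squared_error"][0]` indexes the same way as in the example above, and the timings are assumed to be plain numbers. `summarize` is a hypothetical helper, not part of bigframes.

```python
from statistics import mean

# Hypothetical stand-in for the result of cross_validate(model, X, y, cv=3).
# All values are invented for illustration.
scores = {
    "test_score": [
        {"mean_squared_error": [0.10]},
        {"mean_squared_error": [0.20]},
        {"mean_squared_error": [0.30]},
    ],
    "fit_time": [1.5, 2.0, 2.5],
    "score_time": [0.5, 0.5, 0.5],
}


def summarize(scores, metric="mean_squared_error"):
    """Average one metric and both timings across all cv splits."""
    per_split = [float(split[metric][0]) for split in scores["test_score"]]
    return {
        "n_splits": len(per_split),
        f"mean_{metric}": mean(per_split),
        "mean_fit_time": mean(scores["fit_time"]),
        "mean_score_time": mean(scores["score_time"]),
    }


summary = summarize(scores)
print(summary)
```

With a real result, the same indexing pattern should apply as long as each `test_score` entry exposes the metric by name, as the example above suggests.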