bigframes.ml.metrics.roc_auc_score#

bigframes.ml.metrics.roc_auc_score(y_true: DataFrame | Series, y_score: DataFrame | Series) float[source]#

Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> y_true = bpd.DataFrame([0, 0, 1, 1, 0, 1, 0, 1, 1, 1])
>>> y_score = bpd.DataFrame([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45])
>>> roc_auc_score = bigframes.ml.metrics.roc_auc_score(y_true, y_score)
>>> roc_auc_score
np.float64(0.625)

The inputs can also be Series:

>>> df = bpd.DataFrame(
...     {"y_true": [0, 0, 1, 1, 0, 1, 0, 1, 1, 1],
...      "y_score": [0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45],}
... )
>>> roc_auc_score = bigframes.ml.metrics.roc_auc_score(df["y_true"], df["y_score"])
>>> roc_auc_score
np.float64(0.625)
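
By definition, this score is the area under the ROC curve: sweep a decision threshold from high to low, record the (false positive rate, true positive rate) points, and integrate. As a cross-check of the value above, here is a minimal pure-Python sketch of that definition (the helper name is hypothetical, not part of bigframes):

```python
# Sketch of the definition: build the ROC curve by sweeping thresholds from
# highest to lowest score, then integrate TPR over FPR with the trapezoidal rule.
def roc_auc_trapezoid(y_true, y_score):
    pairs = sorted(zip(y_score, y_true), reverse=True)
    P = sum(y_true)            # number of positives
    N = len(y_true) - P        # number of negatives
    tp = fp = 0
    points = [(0.0, 0.0)]      # (FPR, TPR), starting at the origin
    prev_score = None
    for score, label in pairs:
        if score != prev_score:
            # Record a curve point each time the threshold actually moves.
            points.append((fp / N, tp / P))
            prev_score = score
        if label == 1:
            tp += 1
        else:
            fp += 1
    points.append((1.0, 1.0))  # final point: every example predicted positive
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45]
print(roc_auc_trapezoid(y_true, y_score))  # 0.625, matching the examples above
```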

Parameters:
  • y_true (Series or DataFrame of shape (n_samples,)) – True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).

  • y_score (Series or DataFrame of shape (n_samples,)) – Target scores. In the binary case, this corresponds to an array of shape (n_samples,). Both probability estimates and non-thresholded decision values can be provided. The probability estimates correspond to the probability of the class with the greater label, i.e. estimator.classes_[1], and thus estimator.predict_proba(X)[:, 1]. The decision values correspond to the output of estimator.decision_function(X).
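
Probability estimates and raw decision values are interchangeable here because ROC AUC is rank-based: it equals the fraction of (positive, negative) pairs where the positive example receives the higher score (ties counting half), so any strictly increasing transformation of the scores leaves it unchanged. A quick pure-Python illustration (the helper is hypothetical, not part of bigframes):

```python
import math

def rank_auc(y_true, y_score):
    # Fraction of (positive, negative) pairs ranked correctly (ties = 0.5).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3, 0.6, 0.45]
# The logit is strictly increasing on (0, 1), so margins preserve the ordering.
margins = [math.log(p / (1 - p)) for p in probs]

print(rank_auc(y_true, probs))    # 0.625
print(rank_auc(y_true, margins))  # 0.625 — same ranking, same AUC
```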

Returns:

Area Under the Curve score.

Return type:

float