bigframes.ml.metrics.confusion_matrix

bigframes.ml.metrics.confusion_matrix(y_true: DataFrame | Series, y_pred: DataFrame | Series) -> DataFrame

Compute a confusion matrix to evaluate the accuracy of a classification.

By definition a confusion matrix \(C\) is such that \(C_{i, j}\) is equal to the number of observations known to be in group \(i\) and predicted to be in group \(j\).

Thus, in binary classification, \(C_{0,0}\) is the count of true negatives, \(C_{1,0}\) the count of false negatives, \(C_{1,1}\) the count of true positives, and \(C_{0,1}\) the count of false positives.
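The definition above can be sketched in plain Python, without BigQuery DataFrames. This is an illustration of the counting rule only, not the library's implementation: `confusion_counts` is a hypothetical helper, and ordering the labels by sorting them is an assumption inferred from the examples on this page.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Return (labels, matrix), where matrix[i][j] is the number of samples
    whose true label is labels[i] and whose predicted label is labels[j].

    Hypothetical helper for illustration; not part of bigframes.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Assumption: classes are ordered by sorting the union of observed labels.
    labels = sorted(set(y_true) | set(y_pred))
    # Count each (true label, predicted label) pair once per sample.
    pair_counts = Counter(zip(y_true, y_pred))
    matrix = [[pair_counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


# Binary case: label 0 is the negative class, label 1 the positive class.
y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 1, 1, 1]
labels, C = confusion_counts(y_true, y_pred)

# Row 0 holds the samples that are truly negative, row 1 the truly positive.
tn, fp = C[0]
fn, tp = C[1]
print(tn, fp, fn, tp)
```

Rows index the true label and columns the predicted label, so swapping the two arguments transposes the matrix.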

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.ml.metrics
>>> y_true = bpd.DataFrame([2, 0, 2, 2, 0, 1])
>>> y_pred = bpd.DataFrame([0, 0, 2, 2, 0, 2])
>>> confusion_matrix = bigframes.ml.metrics.confusion_matrix(y_true, y_pred)
>>> confusion_matrix
   0  1  2
0  2  0  0
1  0  0  1
2  1  0  2
>>> y_true = bpd.DataFrame(["cat", "ant", "cat", "cat", "ant", "bird"])
>>> y_pred = bpd.DataFrame(["ant", "ant", "cat", "cat", "ant", "cat"])
>>> confusion_matrix = bigframes.ml.metrics.confusion_matrix(y_true, y_pred)
>>> confusion_matrix
    ant  bird  cat
ant     2     0    0
bird    0     0    1
cat     1     0    2
Parameters:
  • y_true (Series or DataFrame of shape (n_samples,)) – Ground truth (correct) target values.

  • y_pred (Series or DataFrame of shape (n_samples,)) – Estimated targets as returned by a classifier.

Returns:

Confusion matrix whose entry in the i-th row and j-th column is the number of samples whose true label is the i-th class and whose predicted label is the j-th class.

Return type:

DataFrame of shape (n_classes, n_classes)