bigframes.ml.impute.SimpleImputer
- class bigframes.ml.impute.SimpleImputer(strategy: Literal['mean', 'median', 'most_frequent'] = 'mean')
Univariate imputer for completing missing values with simple strategies.
Replace missing values using a descriptive statistic (e.g. mean, median, or most frequent) along each column.
Examples:
>>> import bigframes.pandas as bpd
>>> from bigframes.ml.impute import SimpleImputer
>>> X_train = bpd.DataFrame({"feat0": [7.0, 4.0, 10.0], "feat1": [2.0, None, 5.0], "feat2": [3.0, 6.0, 9.0]})
>>> imp_mean = SimpleImputer().fit(X_train)
>>> X_test = bpd.DataFrame({"feat0": [None, 4.0, 10.0], "feat1": [2.0, None, None], "feat2": [3.0, 6.0, 9.0]})
>>> imp_mean.transform(X_test)
   imputer_feat0  imputer_feat1  imputer_feat2
0            7.0            2.0            3.0
1            4.0            3.5            6.0
2           10.0            3.5            9.0

[3 rows x 3 columns]
- Parameters:
strategy ({'mean', 'median', 'most_frequent'}, default='mean') – The imputation strategy. 'mean': replace missing values using the mean of each column. 'median': replace missing values using the median of each column. 'most_frequent': replace missing values using the most frequent value of each column.
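To make the three strategies concrete, the sketch below computes the fill value for a single column in plain Python. It is an illustration only, not the bigframes implementation: `impute_column` is a hypothetical helper, and the use of `statistics.mode` (which returns the first value encountered when there is a tie) is an assumption, since tie handling for 'most_frequent' is not specified here.

```python
import statistics


def impute_column(train_values, test_values, strategy="mean"):
    """Fill None entries in test_values with a statistic of train_values.

    Hypothetical helper for illustration; not part of bigframes.
    """
    # Missing entries are ignored when the statistic is computed.
    observed = [v for v in train_values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "most_frequent":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return [fill if v is None else v for v in test_values]


# The "feat1" column from the example above.
train_feat1 = [2.0, None, 5.0]
test_feat1 = [2.0, None, None]
print(impute_column(train_feat1, test_feat1, strategy="mean"))
# [2.0, 3.5, 3.5]
```

The fill value comes from the training column only, which is why both missing test entries receive 3.5, the mean of 2.0 and 5.0.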
Methods
__init__([strategy])
fit(X[, y]) – Fit the imputer on X.
fit_transform(X[, y]) – Fit to data, then transform it.
get_params([deep]) – Get parameters for this estimator.
to_gbq(model_name[, replace]) – Save the transformer as a BigQuery model.
transform(X) – Impute all missing values in X.
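The way these methods fit together can be sketched with a small stand-in class. `TinyImputer` is hypothetical and is not bigframes code: it works on plain dicts of lists rather than DataFrames, keeps the input column names instead of adding an `imputer_` prefix, and omits `to_gbq`. It only illustrates the contract suggested by the method summaries: `fit` learns one fill value per column, `transform` reuses those values, and `fit_transform` chains the two.

```python
import statistics


class TinyImputer:
    """Hypothetical stand-in for illustration; not the bigframes class."""

    def __init__(self, strategy="mean"):
        self.strategy = strategy

    def get_params(self, deep=True):
        # Constructor arguments, keyed by name.
        return {"strategy": self.strategy}

    def fit(self, X, y=None):
        stat = {
            "mean": statistics.mean,
            "median": statistics.median,
            "most_frequent": statistics.mode,
        }[self.strategy]
        # Learn one fill value per column from its non-missing entries.
        self.fill_ = {
            name: stat([v for v in col if v is not None])
            for name, col in X.items()
        }
        return self

    def transform(self, X):
        # Reuse the values learned in fit; a new dict is returned.
        return {
            name: [self.fill_[name] if v is None else v for v in col]
            for name, col in X.items()
        }

    def fit_transform(self, X, y=None):
        return self.fit(X).transform(X)


X_train = {"feat0": [7.0, 4.0, 10.0], "feat1": [2.0, None, 5.0]}
X_test = {"feat0": [None, 4.0, 10.0], "feat1": [2.0, None, None]}

imp = TinyImputer().fit(X_train)
print(imp.transform(X_test))
# {'feat0': [7.0, 4.0, 10.0], 'feat1': [2.0, 3.5, 3.5]}
```

Separating `fit` from `transform` lets statistics learned on training data be applied unchanged to later data, as in the `X_train` / `X_test` example earlier on this page.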