bigframes.ml.preprocessing.StandardScaler
- class bigframes.ml.preprocessing.StandardScaler
Standardize features by removing the mean and scaling to unit variance.
The standard score of a sample x is calculated as z = (x - u) / s, where u is the mean of the training samples (or zero if with_mean=False) and s is the standard deviation of the training samples (or one if with_std=False).
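The formula can be checked by hand with a minimal pure-Python sketch. The helper name `standard_scores` and its `ddof` parameter are illustrative and not part of bigframes. The text here does not say whether s is the population or the sample standard deviation, so the sketch leaves that as a parameter and defaults to the population form as an assumption.

```python
import math

def standard_scores(values, with_mean=True, with_std=True, ddof=0):
    """Hypothetical helper: z = (x - u) / s for a single feature.

    ddof=0 gives the population standard deviation, ddof=1 the sample one.
    Which of the two the library uses is not stated in this page.
    Assumes the feature is not constant (s would be zero).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - ddof)
    # u is the training mean, or 0 when centering is disabled
    u = mean if with_mean else 0.0
    # s is the training standard deviation, or 1 when scaling is disabled
    s = math.sqrt(variance) if with_std else 1.0
    return [(x - u) / s for x in values]

scores = standard_scores([0, 0, 1, 1])
print(scores)
```

Under the population assumption, the mean of `[0, 0, 1, 1]` is 0.5 and the standard deviation is 0.5, so the scores are -1 and 1. With `ddof=1` the same inputs scale to roughly -0.866 and 0.866 instead, which is why the choice is worth checking against real output.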
Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. The mean and standard deviation are then stored and applied to later data using transform().
Standardization of a dataset is a common requirement for many machine learning estimators: they might behave badly if the individual features do not more or less look like standard normally distributed data (e.g. Gaussian with 0 mean and unit variance).
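The split between computing statistics on the training set and reusing them on later data can be sketched as follows. `TinyStandardScaler` is an illustrative stand-in written for this page, not the bigframes implementation; it works on plain dicts of lists and assumes the population standard deviation.

```python
import math

class TinyStandardScaler:
    """Illustrative stand-in, not the bigframes implementation."""

    def fit(self, columns):
        # columns: dict mapping feature name -> list of training values
        self.mean_ = {}
        self.std_ = {}
        for name, values in columns.items():
            n = len(values)
            mean = sum(values) / n
            # Population standard deviation (an assumption of this sketch)
            std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
            # Each feature keeps its own statistics
            self.mean_[name] = mean
            self.std_[name] = std
        return self

    def transform(self, columns):
        # Reuse the stored training statistics; nothing is recomputed here
        return {
            name: [(x - self.mean_[name]) / self.std_[name] for x in values]
            for name, values in columns.items()
        }

scaler = TinyStandardScaler().fit({"a": [0, 0, 1, 1], "b": [0, 2, 4, 6]})
later = scaler.transform({"a": [2], "b": [3]})
print(later)
```

Here the later value 2 in column `a` is scaled with the training mean 0.5 and standard deviation 0.5, giving 3.0, while the value 3 in column `b` equals that column's training mean and maps to 0.0.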
Examples:
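    from bigframes.ml.preprocessing import StandardScaler
    import bigframes.pandas as bpd

    scaler = StandardScaler()
    data = bpd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 0, 1, 1]})

    # Compute the per-column mean and standard deviation
    scaler.fit(data)

    # Scale the training data itself
    print(scaler.transform(data))

    # Scale new data using the statistics stored by fit()
    print(scaler.transform(bpd.DataFrame({"a": [2], "b": [2]})))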
Methods
__init__()
fit(X[, y]): Compute the mean and std to be used for later scaling.
fit_transform(X[, y]): Fit to data, then transform it.
get_params([deep]): Get parameters for this estimator.
to_gbq(model_name[, replace]): Save the transformer as a BigQuery model.
transform(X): Perform standardization by centering and scaling.