bigframes.pandas.DataFrame.describe
- DataFrame.describe(include: None | Literal['all'] = None) -> DataFrame
Generate descriptive statistics.
Descriptive statistics include those that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
- Parameters:
include ("all" or None, optional) – If "all", all columns of the input are included in the output. If None, the result includes all numeric columns.
Note
Percentile values are approximate.
Note
For numeric data, the result’s index will include count, mean, std, min, and max, as well as the lower, 50, and upper percentiles. By default the lower percentile is 25 and the upper percentile is 75. The 50 percentile is the same as the median.
Examples:
>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame({"A": [3, 1, 2], "B": [0, 2, 8], "C": ["cat", "cat", "dog"]})
>>> df
   A  B    C
0  3  0  cat
1  1  2  cat
2  2  8  dog

[3 rows x 3 columns]
>>> df.describe()
         A         B
count  3.0       3.0
mean   2.0  3.333333
std    1.0  4.163332
min    1.0       0.0
25%    1.0       0.0
50%    2.0       2.0
75%    3.0       8.0
max    3.0       8.0

[8 rows x 2 columns]
Using describe with include="all":
>>> df.describe(include="all")
            A         B     C
count     3.0       3.0     3
nunique  <NA>      <NA>     2
mean      2.0  3.333333  <NA>
std       1.0  4.163332  <NA>
min       1.0       0.0  <NA>
25%       1.0       0.0  <NA>
50%       2.0       2.0  <NA>
75%       3.0       8.0  <NA>
max       3.0       8.0  <NA>

[9 rows x 3 columns]
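The figures for column B in the examples can be cross-checked locally. The sketch below uses only Python's standard statistics module, not bigframes, and the variable names are illustrative. It assumes std is the sample standard deviation (n - 1 denominator); this is inferred from the example output rather than stated on this page. It also computes exact, linearly interpolated quartiles for comparison, since the 25% and 75% values reported by describe are approximate.

```python
import statistics

# Column B from the example DataFrame.
b = [0, 2, 8]

summary = {
    "count": len(b),
    "mean": statistics.mean(b),
    # Assumption: sample standard deviation (n - 1 denominator),
    # inferred from the example output shown above.
    "std": statistics.stdev(b),
    "min": min(b),
    "50%": statistics.median(b),
    "max": max(b),
}

# Exact quartiles using linear interpolation between data points.
# These are for comparison only; describe() reports approximate percentiles.
exact_q1, exact_median, exact_q3 = statistics.quantiles(b, n=4, method="inclusive")

for name, value in summary.items():
    print(f"{name:>5}  {value:.6f}")
print(f"exact 25%: {exact_q1}, exact 75%: {exact_q3}")
```

The mean, std, min, median, and max agree with the describe output. The exact quartiles (1.0 and 5.0) differ from the reported 25% and 75% rows (0.0 and 8.0), which is consistent with the note that percentile values are approximate.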
- Returns:
Summary statistics of the Series or DataFrame provided.
- Return type:
DataFrame
- Raises:
ValueError – If an unsupported include type is provided.
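Callers that take the include value from user input may prefer to validate it before calling describe. The following is a minimal sketch, not part of bigframes: describe_checked and ALLOWED_INCLUDE are hypothetical names, and the accepted values are taken from the signature above (None or "all").

```python
from typing import Any, Optional

# Values accepted by the documented signature: None | Literal['all'].
ALLOWED_INCLUDE = (None, "all")


def describe_checked(df: Any, include: Optional[str] = None) -> Any:
    """Hypothetical wrapper: validate include, then delegate to df.describe."""
    if include not in ALLOWED_INCLUDE:
        raise ValueError(
            f"Unsupported include value {include!r}; expected None or 'all'."
        )
    return df.describe(include=include)
```

With a DataFrame df, describe_checked(df, include="all") would behave like df.describe(include="all"), while a value such as ["number"] is rejected with ValueError before any work is submitted.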