bigframes.pandas.api.typing.DataFrameGroupBy.value_counts

DataFrameGroupBy.value_counts(subset: Sequence[Hashable] | None = None, normalize: bool = False, sort: bool = True, ascending: bool = False, dropna: bool = True) → DataFrame | Series

Return a Series or DataFrame containing counts of unique rows within each group.

Examples:

>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame({
...     'gender': ['male', 'male', 'female', 'male', 'female', 'male'],
...     'education': ['low', 'medium', 'high', 'low', 'high', 'low'],
...     'country': ['US', 'FR', 'US', 'FR', 'FR', 'FR']
... })
>>> df
   gender education country
0    male       low      US
1    male    medium      FR
2  female      high      US
3    male       low      FR
4  female      high      FR
5    male       low      FR

[6 rows x 3 columns]
>>> df.groupby('gender').value_counts()
gender  education  country
female  high       FR         1
                   US         1
male    low        FR         2
                   US         1
        medium     FR         1
Name: count, dtype: Int64
>>> df.groupby('gender').value_counts(ascending=True)
gender  education  country
female  high       FR         1
                   US         1
male    low        US         1
        medium     FR         1
        low        FR         2
Name: count, dtype: Int64
>>> df.groupby('gender').value_counts(normalize=True)
gender  education  country
female  high       FR          0.5
                   US          0.5
male    low        FR          0.5
                   US         0.25
        medium     FR         0.25
Name: proportion, dtype: Float64
>>> df.groupby('gender', as_index=False).value_counts()
   gender education country  count
0  female      high      FR      1
1  female      high      US      1
2    male       low      FR      2
3    male       low      US      1
4    male    medium      FR      1

[5 rows x 4 columns]
>>> df.groupby('gender', as_index=False).value_counts(normalize=True)
   gender education country  proportion
0  female      high      FR         0.5
1  female      high      US         0.5
2    male       low      FR         0.5
3    male       low      US        0.25
4    male    medium      FR        0.25

[5 rows x 4 columns]

Parameters:
  • subset (list-like, optional) – Columns to use when counting unique combinations.

  • normalize (bool, default False) – Return proportions rather than frequencies.

  • sort (bool, default True) – Sort by frequencies.

  • ascending (bool, default False) – Sort in ascending order.

  • dropna (bool, default True) – Don’t include counts of rows that contain NA values.

Returns:

A Series if the groupby was created with as_index=True, otherwise a DataFrame.

Return type:

Series or DataFrame