bigframes.geopandas.GeoSeries.groupby
- GeoSeries.groupby(by: Hashable | Series | Sequence[Hashable | Series] = None, axis=0, level: int | str | Sequence[int] | Sequence[str] | None = None, as_index: bool = True, *, dropna: bool = True) -> SeriesGroupBy
Group Series using a mapper or by a Series of columns.
A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
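The split, apply, and combine steps can be sketched in plain Python. This is only a conceptual illustration, not how bigframes evaluates a groupby; group_mean is a hypothetical helper name introduced here, and the data mirrors the "Max Speed" values used in the examples.

```python
from collections import defaultdict
from statistics import mean


def group_mean(values, keys):
    """Hypothetical helper: mean of `values` per distinct entry in `keys`."""
    # Split: collect the values that share a key.
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    # Apply and combine: reduce each group, then gather the results by key.
    return {key: mean(group) for key, group in sorted(groups.items())}


speeds = [380.0, 370.0, 24.0, 26.0]
animals = ["Falcon", "Falcon", "Parrot", "Parrot"]
result = group_mean(speeds, animals)
print(result)
```

Each distinct key ends up with exactly one aggregated value, which is the same shape of result that an aggregation such as mean() on a grouped Series produces.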
Examples:
You can group by a named index level.
>>> import bigframes.pandas as bpd
>>> s = bpd.Series([380, 370., 24., 26.],
...                index=["Falcon", "Falcon", "Parrot", "Parrot"],
...                name="Max Speed")
>>> s.index.name = "Animal"
>>> s
Animal
Falcon    380.0
Falcon    370.0
Parrot     24.0
Parrot     26.0
Name: Max Speed, dtype: Float64
>>> s.groupby("Animal").mean()
Animal
Falcon    375.0
Parrot     25.0
Name: Max Speed, dtype: Float64
You can also group by more than one index level.
>>> import pandas as pd
>>> s = bpd.Series([380, 370., 24., 26.],
...                index=pd.MultiIndex.from_tuples(
...                    [("Falcon", "Clear"),
...                     ("Falcon", "Cloudy"),
...                     ("Parrot", "Clear"),
...                     ("Parrot", "Clear")],
...                    names=["Animal", "Sky"]),
...                name="Max Speed")
>>> s
Animal  Sky
Falcon  Clear     380.0
        Cloudy    370.0
Parrot  Clear      24.0
        Clear      26.0
Name: Max Speed, dtype: Float64
>>> s.groupby("Animal").mean()
Animal
Falcon    375.0
Parrot     25.0
Name: Max Speed, dtype: Float64
>>> s.groupby("Sky").mean()
Sky
Clear     143.333333
Cloudy         370.0
Name: Max Speed, dtype: Float64
>>> s.groupby(["Animal", "Sky"]).mean()
Animal  Sky
Falcon  Clear     380.0
        Cloudy    370.0
Parrot  Clear      25.0
Name: Max Speed, dtype: Float64
You can also group by the values in another Series, provided its index matches the index of the original Series.
>>> df = bpd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
...                     'Max Speed': [380., 370., 24., 26.],
...                     'Age': [10., 20., 4., 6.]})
>>> df
   Animal  Max Speed   Age
0  Falcon      380.0  10.0
1  Falcon      370.0  20.0
2  Parrot       24.0   4.0
3  Parrot       26.0   6.0

[4 rows x 3 columns]
>>> df['Max Speed'].groupby(df['Animal']).mean()
Animal
Falcon    375.0
Parrot     25.0
Name: Max Speed, dtype: Float64
>>> df['Age'].groupby(df['Animal']).max()
Animal
Falcon    20.0
Parrot     6.0
Name: Age, dtype: Float64
- Parameters:
by (mapping, function, label, pd.Grouper or list of such, default None) – Used to determine the groups for the groupby. If by is a function, it’s called on each value of the object’s index. If a dict or Series is passed, the Series or dict VALUES will be used to determine the groups (the Series’ values are first aligned; see the .align() method). If a list or ndarray of length equal to the selected axis is passed (see the groupby user guide), the values are used as-is to determine the groups. A label or list of labels may be passed to group by the columns in self. Notice that a tuple is interpreted as a (single) key.
axis ({0 or 'index', 1 or 'columns'}, default 0) – Split along rows (0) or columns (1). For Series this parameter is unused and defaults to 0.
level (int, level name, or sequence of such, default None) – If the axis is a MultiIndex (hierarchical), group by a particular level or levels. Do not specify both by and level.
as_index (bool, default True) – Return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output. This argument has no effect on filtrations (see the “filtrations in the user guide” https://pandas.pydata.org/docs/dev/user_guide/groupby.html#filtration), such as head(), tail(), nth() and in transformations (see the “transformations in the user guide” https://pandas.pydata.org/docs/dev/user_guide/groupby.html#transformation).
dropna (bool, default True) – If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.
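The effect of dropna can be sketched in plain Python. This is a conceptual illustration of the semantics described above, not bigframes code: group_sum is a hypothetical helper introduced here, and None is assumed to stand in for an NA group key.

```python
from collections import defaultdict


def group_sum(values, keys, dropna=True):
    """Hypothetical helper: sum `values` per key; None plays the role of NA."""
    groups = defaultdict(float)
    for key, value in zip(keys, values):
        if key is None and dropna:
            # dropna=True: rows whose group key is NA are discarded.
            continue
        # dropna=False: NA is kept and treated as a group key of its own.
        groups[key] += value
    return dict(groups)


values = [1.0, 2.0, 3.0, 4.0]
keys = ["a", None, "a", None]

dropped = group_sum(values, keys, dropna=True)
kept = group_sum(values, keys, dropna=False)
print(dropped)
print(kept)
```

With dropna=True the two rows keyed by NA contribute to no group; with dropna=False they form one additional group.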
- Returns:
Returns a groupby object that contains information about the groups.
- Return type:
bigframes.core.groupby.SeriesGroupBy