bigframes.bigquery.array_agg

bigframes.bigquery.array_agg(obj: groupby.SeriesGroupBy | groupby.DataFrameGroupBy) → series.Series | dataframe.DataFrame

Group data and collect the values of each selected column into one array per group. NULL values are omitted, because BigQuery does not allow NULL elements in arrays and would otherwise raise an error.

Examples:

>>> import bigframes.pandas as bpd
>>> import bigframes.bigquery as bbq
>>> import numpy as np

For a SeriesGroupBy object:

>>> lst = ['a', 'a', 'b', 'b', 'a']
>>> s = bpd.Series([1, 2, 3, 4, np.nan], index=lst)
>>> bbq.array_agg(s.groupby(level=0))
a    [1. 2.]
b    [3. 4.]
dtype: list<item: double>[pyarrow]

For a DataFrameGroupBy object:

>>> l = [[1, 2, 3], [1, None, 4], [2, 1, 3], [1, 2, 2]]
>>> df = bpd.DataFrame(l, columns=["a", "b", "c"])
>>> bbq.array_agg(df.groupby(by=["b"]))
         a      c
b
1.0    [2]    [3]
2.0  [1 1]  [3 2]

[2 rows x 2 columns]
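The grouping and NULL-omitting behavior shown in the examples above can be illustrated without a BigQuery session. The following is a minimal plain-Python sketch of the semantics only, not the library's implementation; `is_missing` and `array_agg_sketch` are hypothetical names introduced here, and returning an empty list for a group whose values are all missing is a choice of this sketch rather than documented behavior.

```python
import math
from collections import defaultdict


def is_missing(value):
    """Return True for None or a float NaN (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def array_agg_sketch(keys, values):
    """Collect the non-missing values of each group into a list.

    Illustrative stand-in for the grouping semantics only; it does not
    call bigframes or BigQuery.
    """
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        # Rows with a missing group key are skipped, as in the
        # DataFrameGroupBy example where the row with b=None is absent.
        if is_missing(key):
            continue
        bucket = groups[key]  # registers the group even if the value is missing
        if not is_missing(value):
            bucket.append(value)
    return dict(groups)


labels = ['a', 'a', 'b', 'b', 'a']
data = [1.0, 2.0, 3.0, 4.0, float('nan')]
result = array_agg_sketch(labels, data)
print(result)  # {'a': [1.0, 2.0], 'b': [3.0, 4.0]}
```

The result mirrors the SeriesGroupBy example: the NaN in group `a` is dropped, leaving `[1.0, 2.0]`.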

Parameters:

obj (groupby.SeriesGroupBy | groupby.DataFrameGroupBy) – The GroupBy object to apply the function to.

Returns:

A Series or DataFrame containing the aggregated array columns, indexed by the original group columns.

Return type:

bigframes.series.Series | bigframes.dataframe.DataFrame