bigframes.pandas.Series.to_pandas_batches

Series.to_pandas_batches(page_size: int | None = None, max_results: int | None = None, *, allow_large_results: bool | None = None) → Iterable[Series][source]

Stream Series results to an iterable of pandas Series.

page_size and max_results determine the size and number of batches. See https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.job.QueryJob#google_cloud_bigquery_job_QueryJob_result for details.

Examples:

>>> import bigframes.pandas as bpd
>>> s = bpd.Series([4, 3, 2, 2, 3])

Iterate through the results in batches, limiting the total rows yielded across all batches via max_results:

>>> for s_batch in s.to_pandas_batches(max_results=3):
...     print(s_batch)
0    4
1    3
2    2
dtype: Int64

Alternatively, control the approximate size of each batch with page_size and fetch batches manually with next():

>>> it = s.to_pandas_batches(page_size=2)
>>> next(it)
0    4
1    3
dtype: Int64
>>> next(it)
2    2
3    2
dtype: Int64
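
The batches combine to form the original Series (see Returns below), so they can be reassembled client-side. A minimal sketch, assuming pandas is importable as pd:

>>> import pandas as pd
>>> pd.concat(s.to_pandas_batches(page_size=2))
0    4
1    3
2    2
3    2
4    3
dtype: Int64
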
Parameters:
  • page_size (int, default None) – The maximum number of rows in each batch. Non-positive values are ignored.

  • max_results (int, default None) – The maximum total number of rows across all batches.

  • allow_large_results (bool, default None) – If not None, overrides the global setting to allow or disallow query results larger than the default size limit of 10 GB (see the example below).
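
For example, to override the global setting and permit large results for a single call (a minimal sketch reusing the toy Series above; real workloads would pair this with genuinely large query results):

>>> it = s.to_pandas_batches(page_size=2, allow_large_results=True)
>>> next(it)
0    4
1    3
dtype: Int64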

Returns:

An iterable of smaller pandas Series which combine to form the original Series. Results are streamed from BigQuery; see https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.table.RowIterator#google_cloud_bigquery_table_RowIterator_to_arrow_iterable for details.

Return type:

Iterable[pandas.Series]