bigframes.pandas.DataFrame.to_pandas

DataFrame.to_pandas(max_download_size: int | None = None, sampling_method: str | None = None, random_state: int | None = None, *, ordered: bool = True, dry_run: Literal[False] = False, allow_large_results: bool | None = None) → DataFrame
DataFrame.to_pandas(max_download_size: int | None = None, sampling_method: str | None = None, random_state: int | None = None, *, ordered: bool = True, dry_run: Literal[True], allow_large_results: bool | None = None) → Series

Convert this DataFrame to a pandas DataFrame.

Examples:

>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame({'col': [4, 2, 2]})

Download the data from BigQuery and convert it into an in-memory pandas DataFrame.

>>> df.to_pandas()
   col
0    4
1    2
2    2

Estimate job statistics without processing or downloading data by using dry_run=True.

>>> df.to_pandas(dry_run=True)
columnCount                                                            1
columnDtypes                                              {'col': Int64}
indexLevel                                                             1
indexDtypes                                                      [Int64]
projectId                                                  bigframes-dev
location                                                              US
jobType                                                            QUERY
destinationTable       {'projectId': 'bigframes-dev', 'datasetId': '_...
useLegacySql                                                       False
referencedTables                                                    None
totalBytesProcessed                                                    0
cacheHit                                                           False
statementType                                                     SELECT
creationTime                            2025-04-02 20:17:12.038000+00:00
dtype: object
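The dry-run result is an ordinary pandas Series indexed by statistic name, so individual fields can be read by label before deciding to run the real query. A minimal local sketch, using a hand-built Series with assumed values that mirror the fields above (not produced by an actual BigQuery dry run):

```python
import pandas as pd

# Hypothetical stand-in for the Series returned by to_pandas(dry_run=True);
# the values here are assumed for illustration only.
stats = pd.Series(
    {
        "columnCount": 1,
        "columnDtypes": {"col": "Int64"},
        "totalBytesProcessed": 0,
        "cacheHit": False,
    }
)

# Read individual statistics by label.
print(stats["totalBytesProcessed"])
print(stats["cacheHit"])
```

Checking `totalBytesProcessed` this way is a cheap guard against accidentally running an expensive query.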

Parameters:
  • max_download_size (int, default None) –

    Deprecated since version 2.0.0: max_download_size parameter is deprecated. Please use to_pandas_batches() method instead.

    Download size threshold in MB. If max_download_size is exceeded when downloading data, the data will be downsampled if bigframes.options.sampling.enable_downsampling is True, otherwise, an error will be raised. If set to a value other than None, this will supersede the global config.

  • sampling_method (str, default None) –

    Deprecated since version 2.0.0: sampling_method parameter is deprecated. Please use sample() method instead.

    Downsampling algorithm to choose from. “head” returns a portion of the data from the beginning; it is fast and requires minimal computation. “uniform” returns uniform random samples of the data. If set to a value other than None, this will supersede the global config.

  • random_state (int, default None) –

    Deprecated since version 2.0.0: random_state parameter is deprecated. Please use sample() method instead.

    The seed for the uniform downsampling algorithm. If provided, the uniform method may take longer to execute and require more computation. If set to a value other than None, this will supersede the global config.

  • ordered (bool, default True) – Determines whether the resulting pandas DataFrame will be ordered. In some cases, an unordered result may execute faster.

  • dry_run (bool, default False) – If true, this method will not process or download data; instead, it returns a pandas Series containing dry-run statistics.

  • allow_large_results (bool, default None) – If not None, overrides the global setting to allow or disallow large query results over the default size limit of 10 GB.
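Since sampling_method and random_state are deprecated in favor of sample(), the two downsampling strategies can be reproduced explicitly. A sketch of the equivalents in plain pandas (bigframes DataFrames expose the same head() and sample() methods), using a small assumed frame:

```python
import pandas as pd

df = pd.DataFrame({"col": range(10)})  # small assumed frame for illustration

# "head" strategy: take rows from the beginning -- fast, minimal computation.
head_sample = df.head(3)

# "uniform" strategy: uniform random rows; random_state seeds the sampling
# for reproducibility (seeding may add computation, as noted above).
uniform_sample = df.sample(n=3, random_state=42)

print(head_sample["col"].tolist())  # [0, 1, 2]
print(len(uniform_sample))          # 3
```

On a bigframes DataFrame, prefer calling these before to_pandas() so the downsampling happens in BigQuery rather than after download.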

Returns:

A pandas DataFrame with all rows and columns of this DataFrame if data_sampling_threshold_mb is not exceeded; otherwise, a pandas DataFrame with downsampled rows and all columns of this DataFrame. If dry_run is set, a pandas Series containing dry-run statistics will be returned.

Return type:

pandas.DataFrame | pandas.Series