bigframes.pandas.read_parquet#

bigframes.pandas.read_parquet(path: str | IO[bytes], *, engine: str = 'auto', write_engine: Literal['default', 'bigquery_inline', 'bigquery_load', 'bigquery_streaming', 'bigquery_write', '_deferred'] = 'default') → DataFrame[source]#

Load a Parquet object from the file path (local or Cloud Storage), returning a DataFrame.

Note

This method does not guarantee the same row ordering as the file. To recover the original order, write a serialized index column, set it as the index, and sort by it in the resulting DataFrame.
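The recovery step above can be sketched with plain pandas (an illustration of the pattern, not bigframes itself; the `row_id` column is a hypothetical serialized index):

```python
import pandas as pd

# Rows may come back in any order when ordering is not guaranteed.
df = pd.DataFrame({"row_id": [2, 0, 1], "state": ["NY", "CA", "TX"]})

# Set the serialized index column as the index, then sort by it
# to restore the original row order.
ordered = df.set_index("row_id").sort_index()
print(ordered["state"].tolist())  # → ['CA', 'TX', 'NY']
```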

Note

For engines other than "bigquery", data is inlined in the query SQL if it is small enough (roughly 5 MB or less in memory); larger data is loaded to a BigQuery table instead.
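A rough way to anticipate which path applies is to estimate the in-memory footprint before reading (a heuristic sketch with plain pandas, assuming the ~5 MB figure from the note; the library's exact cutoff and measurement may differ):

```python
import pandas as pd

# Approximate inlining threshold stated in the note above (assumption).
INLINE_THRESHOLD_BYTES = 5 * 1024 * 1024

# Stand-in for data that would be read from a Parquet file.
df = pd.DataFrame({"x": range(1000), "y": ["label"] * 1000})

# deep=True accounts for the actual memory of object (string) columns.
size_bytes = int(df.memory_usage(deep=True).sum())

would_inline = size_bytes <= INLINE_THRESHOLD_BYTES
print(size_bytes, would_inline)
```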

Examples:

>>> import bigframes.pandas as bpd
>>> gcs_path = "gs://cloud-samples-data/bigquery/us-states/us-states.parquet"
>>> df = bpd.read_parquet(path=gcs_path, engine="bigquery")

Parameters:
  • path (str) – Local or Cloud Storage path to Parquet file.

  • engine (str) – One of 'auto', 'pyarrow', 'fastparquet', or 'bigquery'. Parquet library to parse the file. If set to 'bigquery', order is not preserved. Defaults to 'auto'.

  • write_engine (str) – Mechanism used to write the data to BigQuery; one of 'default', 'bigquery_inline', 'bigquery_load', 'bigquery_streaming', or 'bigquery_write'. Defaults to 'default'.

Returns:

A BigQuery DataFrames DataFrame.

Return type:

bigframes.pandas.DataFrame