bigframes.pandas.DataFrame.to_parquet#
- DataFrame.to_parquet(path=None, *, compression: Literal['snappy', 'gzip'] | None = 'snappy', index: bool = True, allow_large_results: bool | None = None) bytes | None[source]#
Write a DataFrame to the binary Parquet format.
This function writes the DataFrame as a Parquet file to Cloud Storage.
Examples:
>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
>>> gcs_bucket = "gs://bigframes-dev-testing/sample_parquet*.parquet"
>>> df.to_parquet(path=gcs_bucket)
- Parameters:
  - path (str, path object, file-like object, or None, default None) – String, path object (implementing os.PathLike[str]), or file-like object implementing a binary write() function. If None, the result is returned as bytes. If a string or path, it will be used as the root directory path when writing a partitioned dataset. Destination URI(s) of Cloud Storage file(s) to store the extracted DataFrame should be formatted gs://<bucket_name>/<object_name_or_glob>. If the data size is more than 1 GB, you must use a wildcard to export the data into multiple files; the sizes of the files will vary.
  - compression (str, default 'snappy') – Name of the compression to use. Use None for no compression. Supported options: 'gzip', 'snappy'.
  - index (bool, default True) – If True, include the DataFrame's index(es) in the file output. If False, they will not be written to the file.
  - allow_large_results (bool, default None) – If not None, overrides the global setting to allow or disallow large query results over the default size limit of 10 GB. This parameter has no effect when results are saved to Google Cloud Storage (GCS).
- Returns:
bytes if no path argument is provided, else None.
- Return type:
None or bytes
- Raises:
  ValueError – If an invalid value is provided for compression, i.e. one that is not None, 'snappy', or 'gzip'.