bigframes.pandas.DataFrame.astype

DataFrame.astype(dtype: Literal['boolean', 'Float64', 'Int64', 'int64[pyarrow]', 'string', 'string[pyarrow]', 'timestamp[us, tz=UTC][pyarrow]', 'timestamp[us][pyarrow]', 'date32[day][pyarrow]', 'time64[us][pyarrow]', 'decimal128(38, 9)[pyarrow]', 'decimal256(76, 38)[pyarrow]', 'binary[pyarrow]', 'duration[us][pyarrow]'] | BooleanDtype | Float64Dtype | Int64Dtype | StringDtype | ArrowDtype | GeometryDtype | type | dict[str, Literal['boolean', 'Float64', 'Int64', 'int64[pyarrow]', 'string', 'string[pyarrow]', 'timestamp[us, tz=UTC][pyarrow]', 'timestamp[us][pyarrow]', 'date32[day][pyarrow]', 'time64[us][pyarrow]', 'decimal128(38, 9)[pyarrow]', 'decimal256(76, 38)[pyarrow]', 'binary[pyarrow]', 'duration[us][pyarrow]'] | BooleanDtype | Float64Dtype | Int64Dtype | StringDtype | ArrowDtype | GeometryDtype], *, errors: Literal['raise', 'null'] = 'raise') -> DataFrame

Cast a pandas object to the specified dtype.

Examples:

Create a DataFrame:

>>> import bigframes.pandas as bpd
>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = bpd.DataFrame(data=d)
>>> df.dtypes
col1    Int64
col2    Int64
dtype: object

Cast all columns to Float64:

>>> df.astype('Float64').dtypes
col1    Float64
col2    Float64
dtype: object

Create a series of type Int64:

>>> ser = bpd.Series([2023010000246789, 1624123244123101, 1054834234120101], dtype='Int64')
>>> ser
0    2023010000246789
1    1624123244123101
2    1054834234120101
dtype: Int64

Convert to Float64 type:

>>> ser.astype('Float64')
0    2023010000246789.0
1    1624123244123101.0
2    1054834234120101.0
dtype: Float64

Convert to pd.ArrowDtype(pa.timestamp("us", tz="UTC")) type:

>>> ser.astype("timestamp[us, tz=UTC][pyarrow]")
0    2034-02-08 11:13:20.246789+00:00
1    2021-06-19 17:20:44.123101+00:00
2    2003-06-05 17:30:34.120101+00:00
dtype: timestamp[us, tz=UTC][pyarrow]

Note that this is equivalent to using to_datetime with unit='us' and utc=True:

>>> bpd.to_datetime(ser, unit='us', utc=True)
0    2034-02-08 11:13:20.246789+00:00
1    2021-06-19 17:20:44.123101+00:00
2    2003-06-05 17:30:34.120101+00:00
dtype: timestamp[us, tz=UTC][pyarrow]

Convert pd.ArrowDtype(pa.timestamp("us", tz="UTC")) type to Int64 type:

>>> timestamp_ser = ser.astype("timestamp[us, tz=UTC][pyarrow]")
>>> timestamp_ser.astype('Int64')
0    2023010000246789
1    1624123244123101
2    1054834234120101
dtype: Int64
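
The conversions above treat each Int64 value as a count of microseconds since the Unix epoch (1970-01-01 00:00:00 UTC). The following is a minimal standard-library sketch of that arithmetic, not the BigQuery DataFrames implementation; the helper names micros_to_utc and utc_to_micros are hypothetical and exist only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Reference point for the integer representation: the Unix epoch in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_utc(micros: int) -> datetime:
    """Hypothetical helper: interpret an integer as microseconds since the epoch."""
    return EPOCH + timedelta(microseconds=micros)


def utc_to_micros(moment: datetime) -> int:
    """Hypothetical helper: inverse of micros_to_utc, using exact integer division."""
    return (moment - EPOCH) // timedelta(microseconds=1)


# First value of the example Series above.
first = micros_to_utc(2023010000246789)
print(first)  # 2034-02-08 11:13:20.246789+00:00

# Casting back recovers the original integer.
round_trip = utc_to_micros(first)
print(round_trip)  # 2023010000246789
```

The printed timestamp matches the first row of the timestamp[us, tz=UTC][pyarrow] output, and the round trip matches the first row of the Int64 output.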

Parameters:
  • dtype (str, data type or pandas.ExtensionDtype) – The dtype to cast to. Dtype strings supported by BigQuery DataFrames include 'boolean', 'Float64', 'Int64', 'int64[pyarrow]', 'string', 'string[pyarrow]', 'timestamp[us, tz=UTC][pyarrow]', 'timestamp[us][pyarrow]', 'date32[day][pyarrow]', 'time64[us][pyarrow]'. Supported pandas.ExtensionDtype values include pandas.BooleanDtype(), pandas.Float64Dtype(), pandas.Int64Dtype(), pandas.StringDtype(storage="pyarrow"), pd.ArrowDtype(pa.date32()), pd.ArrowDtype(pa.time64("us")), pd.ArrowDtype(pa.timestamp("us")), pd.ArrowDtype(pa.timestamp("us", tz="UTC")). A dict mapping column names to any of these dtypes is also accepted.

  • errors ({'raise', 'null'}, default 'raise') – Controls whether exceptions are raised on data that is invalid for the provided dtype. If 'raise', an exception is raised if any value fails the cast. If 'null', any value that fails the cast is replaced with a null value.
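
The difference between the two errors modes can be illustrated with a small pure-Python sketch. This is an assumption-laden analogy, not how BigQuery DataFrames performs casts: cast_values is a hypothetical helper, and treating TypeError and ValueError as "cast failures" is an assumption made only for this illustration.

```python
def cast_values(values, caster, errors="raise"):
    """Hypothetical helper mimicking the documented errors= semantics.

    'raise' propagates the first cast failure; 'null' substitutes None for
    every value that fails to cast.
    """
    if errors not in ("raise", "null"):
        raise ValueError("errors must be 'raise' or 'null'")
    result = []
    for value in values:
        try:
            result.append(caster(value))
        except (TypeError, ValueError):
            # Assumption: only these exception types count as cast failures.
            if errors == "raise":
                raise
            result.append(None)
    return result


raw = ["1", "2", "abc"]

# errors='null': the uncastable value becomes a null (None here).
nulled = cast_values(raw, int, errors="null")
print(nulled)  # [1, 2, None]

# errors='raise' (the default): the failure surfaces as an exception.
try:
    cast_values(raw, int)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

With the real API the choice is made by the keyword argument alone, for example df.astype('Int64', errors='null').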

Returns:

A BigQuery DataFrame.

Return type:

bigframes.pandas.DataFrame