bigframes.pandas.DataFrame
- class bigframes.pandas.DataFrame(data=None, index: vendored_pandas_typing.Axes | None = None, columns: vendored_pandas_typing.Axes | None = None, dtype: Optional[bigframes.dtypes.DtypeString | bigframes.dtypes.Dtype] = None, copy: Optional[bool] = None, *, session: Optional[bigframes.session.Session] = None)
Two-dimensional, size-mutable, potentially heterogeneous tabular data.
The data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects, and it is the primary pandas data structure.
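The label alignment described above can be sketched as follows. This is a minimal illustration that uses plain pandas as a stand-in, on the assumption that bigframes.pandas mirrors the pandas behavior; running the same operations through bigframes would additionally need a BigQuery session, which is not shown. The frames `left` and `right` and their values are hypothetical sample data.

```python
import pandas as pd

# Two small frames whose row and column labels only partly overlap
# (hypothetical sample data).
left = pd.DataFrame({"a": [1, 2], "b": [10, 20]}, index=["x", "y"])
right = pd.DataFrame({"b": [100, 200], "c": [5, 6]}, index=["y", "z"])

# Arithmetic aligns on both axes: the result carries the union of the
# labels, and only cells present in both operands get a value.
total = left + right

# Dict-like access: indexing by a column label yields a Series.
column_b = total["b"]

print(total)
```

Only the cell at row `y`, column `b` exists in both inputs, so it is the only cell with a value in `total`; every other cell is NA.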
Attributes
Returns the accessor for AI operators.
BigQuery REST API Client the DataFrame uses for operations.
BigQuery job metadata for the most recent query.
Compiles this DataFrame's expression tree to SQL.
Methods
__init__([data, index, columns, dtype, ...])
abs()
add(other[, axis])Get addition of DataFrame and other, element-wise (binary operator +).
add_prefix(prefix[, axis])
add_suffix(suffix[, axis])
agg(func)Aggregate using one or more operations over columns.
aggregate(func)Aggregate using one or more operations over columns.
align(other[, join, axis])Align two objects on their axes with the specified join method.
all([axis, bool_only])Return whether all elements are True, potentially over an axis.
any(*[, axis, bool_only])Return whether any element is True, potentially over an axis.
apply(func, *[, axis, args])Apply a function along an axis of the DataFrame.
applymap(func[, na_action])Apply a function to a DataFrame elementwise.
area([x, y, stacked])Draw a stacked area plot.
assign(**kwargs)Assign new columns to a DataFrame.
astype(dtype, *[, errors])
bar([x, y])Draw a vertical bar plot.
bfill(*[, limit])
cache()Materializes the DataFrame to a temporary table.
combine(other, func[, fill_value, ...])Perform column-wise combine with another DataFrame.
combine_first(other)Update null elements with value in the same location in other.
copy()
corr([method, min_periods, numeric_only])Compute pairwise correlation of columns, excluding NA/null values.
corrwith(other, *[, numeric_only])Compute pairwise correlation.
count(*[, numeric_only])Count non-NA cells for each column.
cov(*[, numeric_only])Compute pairwise covariance of columns, excluding NA/null values.
cummax()Return cumulative maximum over columns.
cummin()Return cumulative minimum over columns.
cumprod()Return cumulative product over columns.
cumsum()Return cumulative sum over columns.
describe([include])Generate descriptive statistics.
diff([periods])First discrete difference of element.
div(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
divide(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
dot(other)Compute the matrix multiplication between the DataFrame and other.
drop()Drop specified labels from columns.
drop_duplicates([subset, keep])Return DataFrame with duplicate rows removed.
droplevel(level[, axis])Return DataFrame with requested index / column level(s) removed.
dropna(*[, axis, how, thresh, subset, ...])Remove missing values.
duplicated([subset, keep])Return boolean Series denoting duplicate rows.
eq(other[, axis])Get equal to of DataFrame and other, element-wise (binary operator eq).
equals(other)
eval(expr)Evaluate a string describing operations on DataFrame columns.
expanding([min_periods])
explode(column, *[, ignore_index])Transform each element of an array to a row, replicating index values.
ffill(*[, limit])
fillna([value])Fill NA (NULL in BigQuery) values using the specified method.
filter([items, like, regex, axis])
floordiv(other[, axis])Get integer division of DataFrame and other, element-wise (binary operator //).
from_dict(data[, orient, dtype, columns])
from_records(data[, index, exclude, ...])
ge(other[, axis])Get 'greater than or equal to' of DataFrame and other, element-wise (binary operator >=).
get(key[, default])
groupby([by, level, as_index, dropna])Group DataFrame by columns.
gt(other[, axis])Get 'greater than' of DataFrame and other, element-wise (binary operator >).
head([n])
hist([by, bins])Draw one histogram of the DataFrame’s columns.
idxmax()Return index of first occurrence of maximum over columns.
idxmin()Return index of first occurrence of minimum over columns.
info([verbose, buf, max_cols, memory_usage, ...])Print a concise summary of a DataFrame.
insert(loc, column, value[, allow_duplicates])Insert column into DataFrame at specified location.
interpolate([method])Fill NA (NULL in BigQuery) values using an interpolation method.
isin(values)Whether each element in the DataFrame is contained in values.
isna()
isnull()
items()Iterate over (column name, Series) pairs.
iterrows()Iterate over DataFrame rows as (index, Series) pairs.
itertuples([index, name])Iterate over DataFrame rows as namedtuples.
join(other[, on, how, lsuffix, rsuffix])Join columns of another DataFrame.
keys()Get the 'info axis'.
kurt(*[, numeric_only])Return unbiased kurtosis over columns.
kurtosis(*[, numeric_only])Return unbiased kurtosis over columns.
le(other[, axis])Get 'less than or equal to' of DataFrame and other, element-wise (binary operator <=).
line([x, y])Plot Series or DataFrame as lines.
lt(other[, axis])Get 'less than' of DataFrame and other, element-wise (binary operator <).
map(func[, na_action])Apply a function to a DataFrame elementwise.
mask(cond[, other])Replace values where the condition is True.
max([axis, numeric_only])Return the maximum of the values over the requested axis.
mean([axis, numeric_only])Return the mean of the values over the requested axis.
median(*[, numeric_only, exact])Return the median of the values over columns.
melt([id_vars, value_vars, var_name, value_name])Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
memory_usage([index])Return the memory usage of each column in bytes.
merge(right[, how, on, left_on, right_on, ...])Merge DataFrame objects with a database-style join.
min([axis, numeric_only])Return the minimum of the values over the requested axis.
mod(other[, axis])Get modulo of DataFrame and other, element-wise (binary operator %).
mul(other[, axis])Get multiplication of DataFrame and other, element-wise (binary operator *).
multiply(other[, axis])Get multiplication of DataFrame and other, element-wise (binary operator *).
ne(other[, axis])Get not equal to of DataFrame and other, element-wise (binary operator ne).
nlargest(n, columns[, keep])Return the first n rows ordered by columns in descending order.
notna()
notnull()
nsmallest(n, columns[, keep])Return the first n rows ordered by columns in ascending order.
nunique()Count number of distinct elements in each column.
pct_change([periods])
peek([n, force, allow_large_results])Preview n arbitrary rows from the DataFrame.
pipe(func, *args, **kwargs)
pivot(*, columns[, index, values])Return reshaped DataFrame organized by given index / column values.
pivot_table([values, index, columns, ...])Create a spreadsheet-style pivot table as a DataFrame.
pow(other[, axis])Get exponential power of DataFrame and other, element-wise (binary operator **).
prod([axis, numeric_only])Return the product of the values over the requested axis.
product([axis, numeric_only])Return the product of the values over the requested axis.
quantile([q, numeric_only])Return values at the given quantile over requested axis.
query(expr)Query the columns of a DataFrame with a boolean expression.
radd(other[, axis])Get addition of DataFrame and other, element-wise (binary operator +).
rank([axis, method, numeric_only, ...])
rdiv(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
reindex([labels, index, columns, axis, validate])Conform DataFrame to new index with optional filling logic.
reindex_like(other, *[, validate])Return an object with matching indices as other object.
rename()Rename columns.
Set the name of the axis for the index.
reorder_levels(order[, axis])Rearrange index levels using input order.
replace(to_replace[, value, regex])Replace values given in to_replace with value.
resample(rule, *[, closed, label, on, ...])Resample time-series data.
Reset the index.
rfloordiv(other[, axis])Get integer division of DataFrame and other, element-wise (binary operator //).
rmod(other[, axis])Get modulo of DataFrame and other, element-wise (binary operator %).
rmul(other[, axis])Get multiplication of DataFrame and other, element-wise (binary operator *).
rolling(window[, min_periods, on, closed])
round([decimals])Round a DataFrame to a variable number of decimal places.
rpow(other[, axis])Get exponential power of DataFrame and other, element-wise (binary operator rpow).
rsub(other[, axis])Get subtraction of DataFrame and other, element-wise (binary operator -).
rtruediv(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
sample([n, frac, random_state, sort])
scatter([x, y, s, c])Create a scatter plot with varying marker point size and color.
select_dtypes([include, exclude])Return a subset of the DataFrame's columns based on the column dtypes.
set_index(keys[, append, drop])Set the DataFrame index using existing columns.
shift([periods])
skew(*[, numeric_only])Return unbiased skew over columns.
Sort object by labels (along an axis).
Sort by the values along row axis.
stack([level])Stack the prescribed level(s) from columns to index.
std([axis, numeric_only])Return sample standard deviation over columns.
sub(other[, axis])Get subtraction of DataFrame and other, element-wise (binary operator -).
subtract(other[, axis])Get subtraction of DataFrame and other, element-wise (binary operator -).
sum([axis, numeric_only])Return the sum of the values over the requested axis.
swaplevel([i, j, axis])Swap levels i and j in a MultiIndex.
tail([n])
take(indices[, axis])
to_arrow(*[, ordered, allow_large_results])Write DataFrame to an Arrow table / record batch.
to_csv([path_or_buf, sep, header, index, ...])
to_dict([orient, into, allow_large_results])Convert the DataFrame to a dictionary.
to_excel(excel_writer[, sheet_name, ...])Write DataFrame to an Excel sheet.
to_gbq([destination_table, if_exists, ...])Write a DataFrame to a BigQuery table.
to_html([buf, columns, col_space, header, ...])Render a DataFrame as an HTML table.
to_json([path_or_buf, orient, lines, index, ...])
to_latex([buf, columns, header, index, ...])Render object to a LaTeX tabular, longtable, or nested table.
to_markdown([buf, mode, index, ...])Print DataFrame in Markdown-friendly format.
to_numpy([dtype, copy, na_value, ...])Convert the DataFrame to a NumPy array.
to_orc([path, allow_large_results])Write a DataFrame to the ORC format.
Write DataFrame to pandas DataFrame.
to_pandas_batches([page_size, max_results, ...])Stream DataFrame results to an iterable of pandas DataFrame.
to_parquet([path, compression, index, ...])Write a DataFrame to the binary Parquet format.
to_pickle(path, *[, allow_large_results])Pickle (serialize) object to file.
to_records([index, column_dtypes, ...])Convert DataFrame to a NumPy record array.
to_string([buf, columns, col_space, header, ...])Render a DataFrame to a console-friendly tabular output.
Transpose index and columns.
truediv(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
unstack([level])Pivot a level of the (necessarily hierarchical) index labels.
update(other[, join, overwrite, filter_func])Modify in place using non-NA values from another DataFrame.
value_counts([subset, normalize, sort, ...])Return a Series containing counts of unique rows in the DataFrame.
var([axis, numeric_only])Return unbiased variance over requested axis.
where(cond[, other])Replace values where the condition is False.
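The `where` and `mask` methods listed above are complements of each other: `where` keeps values where the condition holds and replaces the rest, while `mask` replaces values where the condition holds and keeps the rest. A minimal sketch follows, again using plain pandas as a stand-in on the assumption that bigframes.pandas follows the same semantics; the frame `df` and the replacement value `0` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with one negative value.
df = pd.DataFrame({"v": [1, -2, 3]})
positive = df > 0

# where: keep values where the condition is True, replace the others.
kept = df.where(positive, 0)

# mask: replace values where the condition is True, keep the others.
hidden = df.mask(positive, 0)

print(kept["v"].tolist())
print(hidden["v"].tolist())
```

Because the two calls share the same condition and replacement value, every original value survives in exactly one of the two results.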