bigframes.pandas.DataFrame
- class bigframes.pandas.DataFrame(data=None, index: vendored_pandas_typing.Axes | None = None, columns: vendored_pandas_typing.Axes | None = None, dtype: Optional[bigframes.dtypes.DtypeString | bigframes.dtypes.Dtype] = None, copy: Optional[bool] = None, *, session: Optional[bigframes.session.Session] = None)
Two-dimensional, size-mutable, potentially heterogeneous tabular data.
Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
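The label alignment and dict-like behaviour described above can be sketched as follows. This is a minimal illustration that uses plain pandas as a local stand-in, on the assumption that `bigframes.pandas` mirrors these semantics; it does not open a BigQuery session, and the frame names and values are made up for the example.

```python
import pandas as pd

# Two frames whose row and column labels only partly overlap.
left = pd.DataFrame({"a": [1, 2], "b": [10, 20]}, index=["x", "y"])
right = pd.DataFrame({"b": [100, 200], "c": [1000, 2000]}, index=["y", "z"])

# Addition aligns on both axes: the result covers the union of the row
# labels and the union of the column labels, and any cell that is missing
# from either operand becomes NA.
total = left + right

# Dict-like access: selecting a column label yields a Series.
column_b = total["b"]

print(total)
print(type(column_b).__name__)
```

Only the cell at row `y`, column `b` exists in both operands, so it is the only non-NA value in the result.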
Attributes
T: The transpose of the DataFrame.
ai: Returns the accessor for AI operators.
at: Access a single value for a row/column label pair.
axes: Return a list representing the axes of the DataFrame.
bqclient: BigQuery REST API Client the DataFrame uses for operations.
columns: The column labels of the DataFrame.
dtypes: Return the dtypes in the DataFrame.
empty: Indicates whether Series/DataFrame is empty.
iat: Access a single value for a row/column pair by integer position.
iloc: Purely integer-location based indexing for selection by position.
index: The index (row labels) of the DataFrame.
loc: Access a group of rows and columns by label(s) or a boolean array.
ndim: Return an int representing the number of axes / array dimensions.
plot: Make plots of DataFrames.
query_job: BigQuery job metadata for the most recent query.
shape: Return a tuple representing the dimensionality of the DataFrame.
size: Return an int representing the number of elements in this object.
sql: Compiles this DataFrame's expression tree to SQL.
values: Return the values of DataFrame in the form of a NumPy array.
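How the dimensional attributes above relate to one another (the axis count, the dimensionality tuple, and the element count) can be sketched as follows. The attribute names `ndim`, `shape`, `size`, `axes`, `at`, and `iat` are the pandas names and are assumed to carry over to `bigframes.pandas`. The example runs against plain pandas so that it works without a BigQuery session, and the data is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1.5, 2.5, 3.5]})

# shape is (rows, columns); size is the product of the two.
n_rows, n_cols = df.shape
element_count = df.size

# A DataFrame always has two axes: the row index and the column labels.
row_labels, column_labels = df.axes

# Label-based and position-based scalar access reach the same cell here,
# because the default row labels are 0, 1, 2.
by_label = df.at[0, "score"]
by_position = df.iat[0, 1]

print(n_rows, n_cols, element_count, df.ndim)
```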
Methods
__init__([data, index, columns, dtype, ...])
abs()Return a Series/DataFrame with absolute numeric value of each element.
add(other[, axis])Get addition of DataFrame and other, element-wise (binary operator +).
add_prefix(prefix[, axis])Prefix labels with string prefix.
add_suffix(suffix[, axis])Suffix labels with string suffix.
agg(func)Aggregate using one or more operations over columns.
aggregate(func)Aggregate using one or more operations over columns.
align(other[, join, axis])Align two objects on their axes with the specified join method.
all([axis, bool_only])Return whether all elements are True, potentially over an axis.
any(*[, axis, bool_only])Return whether any element is True, potentially over an axis.
apply(func, *[, axis, args])Apply a function along an axis of the DataFrame.
applymap(func[, na_action])Apply a function to a Dataframe elementwise.
area([x, y, stacked])Draw a stacked area plot.
assign(**kwargs)Assign new columns to a DataFrame.
astype(dtype, *[, errors])Cast a pandas object to a specified dtype.
bar([x, y])Draw a vertical bar plot.
bfill(*[, limit])Fill NA/NaN values by using the next valid observation to fill the gap.
cache()Materializes the DataFrame to a temporary table.
combine(other, func[, fill_value, ...])Perform column-wise combine with another DataFrame.
combine_first(other)Update null elements with value in the same location in other.
copy()Make a copy of this object's indices and data.
corr([method, min_periods, numeric_only])Compute pairwise correlation of columns, excluding NA/null values.
corrwith(other, *[, numeric_only])Compute pairwise correlation.
count(*[, numeric_only])Count non-NA cells for each column.
cov(*[, numeric_only])Compute pairwise covariance of columns, excluding NA/null values.
cummax()Return cumulative maximum over columns.
cummin()Return cumulative minimum over columns.
cumprod()Return cumulative product over columns.
cumsum()Return cumulative sum over columns.
describe([include])Generate descriptive statistics.
diff([periods])First discrete difference of element.
div(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
divide(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
dot(other)Compute the matrix multiplication between the DataFrame and other.
drop()Drop specified labels from columns.
drop_duplicates([subset, keep])Return DataFrame with duplicate rows removed.
droplevel(level[, axis])Return DataFrame with requested index / column level(s) removed.
dropna(*[, axis, how, thresh, subset, ...])Remove missing values.
duplicated([subset, keep])Return boolean Series denoting duplicate rows.
eq(other[, axis])Get equal to of DataFrame and other, element-wise (binary operator eq).
equals(other)Test whether two objects contain the same elements.
eval(expr)Evaluate a string describing operations on DataFrame columns.
expanding([min_periods])Provide expanding window calculations.
explode(column, *[, ignore_index])Transform each element of an array to a row, replicating index values.
ffill(*[, limit])Fill NA/NaN values by propagating the last valid observation to next valid.
fillna([value])Fill NA (NULL in BigQuery) values using the specified method.
filter([items, like, regex, axis])Subset the dataframe rows or columns according to the specified index labels.
floordiv(other[, axis])Get integer division of DataFrame and other, element-wise (binary operator //).
from_dict(data[, orient, dtype, columns])Construct DataFrame from dict of array-like or dicts.
from_records(data[, index, exclude, ...])Convert structured or record ndarray to DataFrame.
ge(other[, axis])Get 'greater than or equal to' of DataFrame and other, element-wise (binary operator >=).
get(key[, default])Get item from object for given key (ex: DataFrame column).
groupby([by, level, as_index, dropna])Group DataFrame by columns.
gt(other[, axis])Get 'greater than' of DataFrame and other, element-wise (binary operator >).
head([n])Return the first n rows.
hist([by, bins])Draw one histogram of the DataFrame’s columns.
idxmax()Return index of first occurrence of maximum over columns.
idxmin()Return index of first occurrence of minimum over columns.
info([verbose, buf, max_cols, memory_usage, ...])Print a concise summary of a DataFrame.
insert(loc, column, value[, allow_duplicates])Insert column into DataFrame at specified location.
interpolate([method])Fill NA (NULL in BigQuery) values using an interpolation method.
isin(values)Whether each element in the DataFrame is contained in values.
isna()Detect missing (NULL) values.
isnull()Detect missing (NULL) values.
items()Iterate over (column name, Series) pairs.
iterrows()Iterate over DataFrame rows as (index, Series) pairs.
itertuples([index, name])Iterate over DataFrame rows as namedtuples.
join(other[, on, how, lsuffix, rsuffix])Join columns of another DataFrame.
keys()Get the 'info axis'.
kurt(*[, numeric_only])Return unbiased kurtosis over columns.
kurtosis(*[, numeric_only])Return unbiased kurtosis over columns.
le(other[, axis])Get 'less than or equal to' of dataframe and other, element-wise (binary operator <=).
line([x, y])Plot Series or DataFrame as lines.
lt(other[, axis])Get 'less than' of DataFrame and other, element-wise (binary operator <).
map(func[, na_action])Apply a function to a Dataframe elementwise.
mask(cond[, other])Replace values where the condition is True.
max([axis, numeric_only])Return the maximum of the values over the requested axis.
mean([axis, numeric_only])Return the mean of the values over the requested axis.
median(*[, numeric_only, exact])Return the median of the values over columns.
melt([id_vars, value_vars, var_name, value_name])Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
memory_usage([index])Return the memory usage of each column in bytes.
merge(right[, how, on, left_on, right_on, ...])Merge DataFrame objects with a database-style join.
min([axis, numeric_only])Return the minimum of the values over the requested axis.
mod(other[, axis])Get modulo of DataFrame and other, element-wise (binary operator %).
mul(other[, axis])Get multiplication of DataFrame and other, element-wise (binary operator *).
multiply(other[, axis])Get multiplication of DataFrame and other, element-wise (binary operator *).
ne(other[, axis])Get not equal to of DataFrame and other, element-wise (binary operator ne).
nlargest(n, columns[, keep])Return the first n rows ordered by columns in descending order.
notna()Detect existing (non-missing) values.
notnull()Detect existing (non-missing) values.
nsmallest(n, columns[, keep])Return the first n rows ordered by columns in ascending order.
nunique()Count number of distinct elements in each column.
pct_change([periods])Fractional change between the current and a prior element.
peek([n, force, allow_large_results])Preview n arbitrary rows from the dataframe.
pipe(func, *args, **kwargs)Apply chainable functions that expect Series or DataFrames.
pivot(*, columns[, index, values])Return reshaped DataFrame organized by given index / column values.
pivot_table([values, index, columns, ...])Create a spreadsheet-style pivot table as a DataFrame.
pow(other[, axis])Get Exponential power of dataframe and other, element-wise (binary operator **).
prod([axis, numeric_only])Return the product of the values over the requested axis.
product([axis, numeric_only])Return the product of the values over the requested axis.
quantile([q, numeric_only])Return values at the given quantile over requested axis.
query(expr)Query the columns of a DataFrame with a boolean expression.
radd(other[, axis])Get addition of DataFrame and other, element-wise (binary operator +).
rank([axis, method, numeric_only, ...])Compute numerical data ranks (1 through n) along axis.
rdiv(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
reindex([labels, index, columns, axis, validate])Conform DataFrame to new index with optional filling logic.
reindex_like(other, *[, validate])Return an object with matching indices as other object.
rename()Rename columns.
rename_axis: Set the name of the axis for the index.
reorder_levels(order[, axis])Rearrange index levels using input order.
replace(to_replace[, value, regex])Replace values given in to_replace with value.
resample(rule, *[, closed, label, on, ...])Resample time-series data.
reset_index: Reset the index.
rfloordiv(other[, axis])Get integer division of DataFrame and other, element-wise (binary operator //).
rmod(other[, axis])Get modulo of DataFrame and other, element-wise (binary operator %).
rmul(other[, axis])Get multiplication of DataFrame and other, element-wise (binary operator *).
rolling(window[, min_periods, on, closed])Provide rolling window calculations.
round([decimals])Round a DataFrame to a variable number of decimal places.
rpow(other[, axis])Get Exponential power of dataframe and other, element-wise (binary operator rpow).
rsub(other[, axis])Get subtraction of DataFrame and other, element-wise (binary operator -).
rtruediv(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
sample([n, frac, random_state, sort])Return a random sample of items from an axis of object.
scatter([x, y, s, c])Create a scatter plot with varying marker point size and color.
select_dtypes([include, exclude])Return a subset of the DataFrame's columns based on the column dtypes.
set_index(keys[, append, drop])Set the DataFrame index using existing columns.
shift([periods])Shift index by desired number of periods.
skew(*[, numeric_only])Return unbiased skew over columns.
sort_index: Sort object by labels (along an axis).
sort_values: Sort by the values along row axis.
stack([level])Stack the prescribed level(s) from columns to index.
std([axis, numeric_only])Return sample standard deviation over columns.
sub(other[, axis])Get subtraction of DataFrame and other, element-wise (binary operator -).
subtract(other[, axis])Get subtraction of DataFrame and other, element-wise (binary operator -).
sum([axis, numeric_only])Return the sum of the values over the requested axis.
swaplevel([i, j, axis])Swap levels i and j in a MultiIndex.
tail([n])Return the last n rows.
take(indices[, axis])Return the elements in the given positional indices along an axis.
to_arrow(*[, ordered, allow_large_results])Write DataFrame to an Arrow table / record batch.
to_csv([path_or_buf, sep, header, index, ...])Write object to a comma-separated values (csv) file on Cloud Storage.
to_dict([orient, into, allow_large_results])Convert the DataFrame to a dictionary.
to_excel(excel_writer[, sheet_name, ...])Write DataFrame to an Excel sheet.
to_gbq([destination_table, if_exists, ...])Write a DataFrame to a BigQuery table.
to_html([buf, columns, col_space, header, ...])Render a DataFrame as an HTML table.
to_json([path_or_buf, orient, lines, index, ...])Convert the object to a JSON string, written to Cloud Storage.
to_latex([buf, columns, header, index, ...])Render object to a LaTeX tabular, longtable, or nested table.
to_markdown([buf, mode, index, ...])Print DataFrame in Markdown-friendly format.
to_numpy([dtype, copy, na_value, ...])Convert the DataFrame to a NumPy array.
to_orc([path, allow_large_results])Write a DataFrame to the ORC format.
to_pandas: Write DataFrame to pandas DataFrame.
to_pandas_batches([page_size, max_results, ...])Stream DataFrame results to an iterable of pandas DataFrame.
to_parquet([path, compression, index, ...])Write a DataFrame to the binary Parquet format.
to_pickle(path, *[, allow_large_results])Pickle (serialize) object to file.
to_records([index, column_dtypes, ...])Convert DataFrame to a NumPy record array.
to_string([buf, columns, col_space, header, ...])Render a DataFrame to a console-friendly tabular output.
transpose: Transpose index and columns.
truediv(other[, axis])Get floating division of DataFrame and other, element-wise (binary operator /).
unstack([level])Pivot a level of the (necessarily hierarchical) index labels.
update(other[, join, overwrite, filter_func])Modify in place using non-NA values from another DataFrame.
value_counts([subset, normalize, sort, ...])Return a Series containing counts of unique rows in the DataFrame.
var([axis, numeric_only])Return unbiased variance over requested axis.
where(cond[, other])Replace values where the condition is False.
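Because `where` and `mask` are easy to confuse, here is a minimal sketch of how the two complement each other. It uses plain pandas semantics as a local stand-in, on the assumption that `bigframes.pandas` follows the same behaviour; the column name and values are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"v": [-2, -1, 0, 1, 2]})
cond = df > 0  # boolean DataFrame with the same shape as df

# where keeps values where cond is True and replaces the rest with `other`.
kept_positive = df.where(cond, 0)

# mask is the inverse: it replaces values where cond is True.
hidden_positive = df.mask(cond, 0)

print(kept_positive["v"].tolist())    # [0, 0, 0, 1, 2]
print(hidden_positive["v"].tolist())  # [-2, -1, 0, 0, 0]
```

Under these semantics, `df.mask(cond, other)` gives the same result as `df.where(~cond, other)`.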