bigframes.pandas.DataFrame.drop_duplicates
- DataFrame.drop_duplicates(subset: Hashable | Sequence[Hashable] = None, *, keep: str = 'first') → DataFrame
Return DataFrame with duplicate rows removed.
Restricting the comparison to certain columns is optional. Indexes, including time indexes, are ignored.
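The behavior described above can be sketched with a small example. This is a minimal illustration that uses plain pandas as a stand-in so it runs without a BigQuery session; it assumes that bigframes.pandas follows the same semantics for the subset and keep parameters. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 are identical in every column.
df = pd.DataFrame(
    {
        "brand": ["Yum Yum", "Yum Yum", "Indomie", "Indomie", "Indomie"],
        "style": ["cup", "cup", "cup", "pack", "pack"],
        "rating": [4.0, 4.0, 3.5, 15.0, 5.0],
    }
)

# Default: compare all columns and keep the first occurrence of each duplicate.
all_cols = df.drop_duplicates()

# Compare only the "brand" column and keep the last occurrence per brand.
by_brand_last = df.drop_duplicates(subset=["brand"], keep="last")

# keep=False drops every row that has a duplicate in the chosen columns.
none_kept = df.drop_duplicates(subset=["brand", "style"], keep=False)

print(all_cols)
print(by_brand_last)
print(none_kept)
```

Index labels play no part in deciding which rows are duplicates; only the values in the selected columns are compared.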
- Parameters:
subset (column label or sequence of labels, optional) – Only consider certain columns for identifying duplicates; by default, use all of the columns.
keep ({‘first’, ‘last’, False}, default ‘first’) – Determines which duplicates (if any) to keep.
‘first’ : Drop duplicates except for the first occurrence.
‘last’ : Drop duplicates except for the last occurrence.
False : Drop all duplicates.
- Returns:
DataFrame with duplicates removed
- Return type: