bigframes.pandas.Series.mask
- Series.mask(cond, other=None) -> Series
Replace values where the condition is True.
Examples:
>>> import bigframes.pandas as bpd
>>> s = bpd.Series([10, 11, 12, 13, 14])
>>> s
0    10
1    11
2    12
3    13
4    14
dtype: Int64
You can mask the values in the Series based on a condition. Values that match the condition are masked. The condition can be provided in the form of a Series.
>>> s.mask(s % 2 == 0)
0    <NA>
1      11
2    <NA>
3      13
4    <NA>
dtype: Int64
You can specify a custom mask value.
>>> s.mask(s % 2 == 0, -1)
0    -1
1    11
2    -1
3    13
4    -1
dtype: Int64
>>> s.mask(s % 2 == 0, 100*s)
0    1000
1      11
2    1200
3      13
4    1400
dtype: Int64
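As a rough illustration of what the replacement means, the sketch below checks that masking on a condition gives the same result as `where` on the negated condition. It uses plain pandas rather than bigframes, an assumption made only so the snippet runs without a BigQuery session; the variable names are illustrative.

```python
import pandas as pd

# Plain pandas stands in for bigframes here (an assumption made so the
# snippet runs locally without a BigQuery session).
s = pd.Series([10, 11, 12, 13, 14], dtype="Int64")
is_even = s % 2 == 0

# mask replaces values where the condition is True ...
masked = s.mask(is_even, -1)
# ... while where keeps values where its condition is True, so negating
# the condition should give the same result.
via_where = s.where(~is_even, -1)

print(masked.tolist())           # [-1, 11, -1, 13, -1]
print(masked.equals(via_where))  # True
```

Thinking of `mask` as the inverse of `where` can help when translating existing pandas code.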
You can also use a remote function to evaluate the mask condition. This is useful in situations such as the following, where the condition depends on complicated business logic that cannot be expressed in the form of a Series.
>>> @bpd.remote_function(reuse=False, cloud_function_service_account="default")
... def should_mask(name: str) -> bool:
...     hash = 0
...     for char_ in name:
...         hash += ord(char_)
...     return hash % 2 == 0
>>> s = bpd.Series(["Alice", "Bob", "Caroline"])
>>> s
0       Alice
1         Bob
2    Caroline
dtype: string
>>> s.mask(should_mask)
0        <NA>
1         Bob
2    Caroline
dtype: string
>>> s.mask(should_mask, "REDACTED")
0    REDACTED
1         Bob
2    Caroline
dtype: string
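Because a remote function is deployed before it runs, it can be worth checking the masking logic locally first. The sketch below is one possible way to do that, assuming plain pandas as a local stand-in: the function body mirrors `should_mask` above without the decorator, and the accumulator is renamed to `total` to avoid shadowing the built-in `hash`.

```python
import pandas as pd

def should_mask(name: str) -> bool:
    # Same logic as the remote function above: sum the code points of the
    # name and mask it when the sum is even.
    total = 0
    for char_ in name:
        total += ord(char_)
    return total % 2 == 0

names = pd.Series(["Alice", "Bob", "Caroline"])

# Evaluate the condition element by element, locally, then mask with it.
cond = names.map(should_mask).astype(bool)
redacted = names.mask(cond, "REDACTED")

print(redacted.tolist())  # ['REDACTED', 'Bob', 'Caroline']
```

Only "Alice" has an even code-point sum (478), which matches the output shown in the remote function example.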
Simple vectorized lambdas or Python functions, meaning those that use only operations supported on a Series, can be passed directly.
>>> nums = bpd.Series([1, 2, 3, 4], name="nums")
>>> nums
0    1
1    2
2    3
3    4
Name: nums, dtype: Int64
>>> nums.mask(lambda x: (x+1) % 2 == 1)
0       1
1    <NA>
2       3
3    <NA>
Name: nums, dtype: Int64
>>> def is_odd(num):
...     return num % 2 == 1
>>> nums.mask(is_odd)
0    <NA>
1       2
2    <NA>
3       4
Name: nums, dtype: Int64
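The parameter descriptions below also allow a callable for other and an array-like for cond. The following sketch shows both forms in plain pandas, which is used here as an assumption so the code runs locally; whether bigframes accepts every one of these forms is not verified here.

```python
import pandas as pd

nums = pd.Series([1, 2, 3, 4], name="nums")

# Both arguments as callables: each one receives the whole Series and
# returns a Series (boolean for cond, replacement values for other).
scaled = nums.mask(lambda x: x > 2, lambda x: x * 10)

# An array-like condition: a plain list of booleans, one per row.
flagged = nums.mask([True, False, False, True], 0)

print(scaled.tolist())   # [1, 2, 30, 40]
print(flagged.tolist())  # [0, 2, 3, 0]
```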
- Parameters:
cond (bool Series/DataFrame, array-like, or callable) – Where cond is False, keep the original value. Where True, replace with the corresponding value from other. If cond is callable, it is computed on the Series/DataFrame and should return a boolean Series/DataFrame or array. The callable must not change the input Series/DataFrame (though pandas doesn’t check it).
other (scalar, Series/DataFrame, or callable) – Entries where cond is True are replaced with the corresponding value from other. If other is callable, it is computed on the Series/DataFrame and should return a scalar or Series/DataFrame. The callable must not change the input Series/DataFrame (though pandas doesn’t check it). If not specified, entries are filled with the corresponding NULL value (np.nan for numpy dtypes, pd.NA for extension dtypes).
- Returns:
Series after the replacement.
- Return type: