bigframes.pandas.api.typing.StringMethods.extract
StringMethods.extract(pat: str, flags: int = 0) -> DataFrame
Extract capture groups in the regex pat as columns in a DataFrame.
For each subject string in the Series, extract groups from the first match of regular expression pat.
Examples:
>>> import bigframes.pandas as bpd
A pattern with two groups will return a DataFrame with two columns. Non-matches will be <NA>.
>>> s = bpd.Series(['a1', 'b2', 'c3'])
>>> s.str.extract(r'([ab])(\d)')
      0     1
0     a     1
1     b     2
2  <NA>  <NA>

[3 rows x 2 columns]
Named groups will become column names in the result.
>>> s.str.extract(r'(?P<letter>[ab])(?P<digit>\d)')
  letter digit
0      a     1
1      b     2
2   <NA>  <NA>

[3 rows x 2 columns]
A pattern with one group will return a DataFrame with one column.
>>> s.str.extract(r'[ab](\d)')
      0
0     1
1     2
2  <NA>

[3 rows x 1 columns]
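The examples above contain at most one match per string. As a rough illustration of the "first match" rule, the sketch below uses Python's standard `re` module rather than BigFrames itself. The helper name `extract_first` is hypothetical, and the regular expression dialect BigFrames runs against may differ from Python's, so treat this as an analogy for the semantics and not as the library's implementation.

```python
import re

def extract_first(pat, subject, flags=0):
    """Hypothetical helper: groups of the first match of pat in subject.

    Returns one None per capture group when there is no match,
    standing in for the <NA> values shown in the examples.
    """
    compiled = re.compile(pat, flags)
    match = compiled.search(subject)  # search() stops at the first match
    if match is None:
        return (None,) * compiled.groups
    return match.groups()

# 'a1b2' contains two possible matches; only the first ('a', '1') is used.
rows = [extract_first(r'([ab])(\d)', s) for s in ['a1b2', 'b2', 'c3']]
print(rows)
```

Each tuple corresponds to one row of the result, and each position in the tuple to one column.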
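- Parameters:
  - pat (str): Regular expression pattern with capture groups.
  - flags (int, default 0): Regular expression flags; 0 means no flags.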
- Returns:
A DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used.
- Return type:
DataFrame
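The column-naming rule described under Returns can be sketched with Python's standard `re` module. The function `column_labels` is hypothetical and is not part of BigFrames. It numbers unnamed groups from 0, matching the examples on this page. How a pattern that mixes named and unnamed groups is labelled is an assumption here, modelled on pandas.

```python
import re

def column_labels(pat):
    """Hypothetical helper: expected column labels for a pattern.

    Named groups contribute their name; unnamed groups contribute
    their 0-based position among all capture groups (an assumption
    for patterns that mix named and unnamed groups).
    """
    compiled = re.compile(pat)
    # groupindex maps group name -> 1-based group number; invert it.
    names_by_number = {num: name for name, num in compiled.groupindex.items()}
    return [names_by_number.get(i, i - 1) for i in range(1, compiled.groups + 1)]

print(column_labels(r'([ab])(\d)'))                      # unnamed groups
print(column_labels(r'(?P<letter>[ab])(?P<digit>\d)'))   # named groups
print(column_labels(r'[ab](\d)'))                        # single group
```

For the three patterns used in the examples, this yields the labels `[0, 1]`, `['letter', 'digit']`, and `[0]`, which agree with the column headers shown there.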