bigframes.pandas.DataFrame.query
- DataFrame.query(expr: str) → DataFrame
Query the columns of a DataFrame with a boolean expression.
Examples:
>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame({'A': range(1, 6),
...                     'B': range(10, 0, -2),
...                     'C C': range(10, 5, -1)})
>>> df
   A   B  C C
0  1  10   10
1  2   8    9
2  3   6    8
3  4   4    7
4  5   2    6

[5 rows x 3 columns]
>>> df.query('A > B')
   A  B  C C
4  5  2    6

[1 rows x 3 columns]
The previous expression is equivalent to
>>> df[df.A > df.B]
   A  B  C C
4  5  2    6

[1 rows x 3 columns]
For columns with spaces in their name, you can use backtick quoting.
>>> df.query('B == `C C`')
   A   B  C C
0  1  10   10

[1 rows x 3 columns]
The previous expression is equivalent to
>>> df[df.B == df['C C']]
   A   B  C C
0  1  10   10

[1 rows x 3 columns]
- Parameters:
expr (str) –
The query string to evaluate.
You can refer to variables in the environment by prefixing them with an ‘@’ character, like @a + b.
You can refer to column names that are not valid Python variable names by surrounding them in backticks. Thus, column names containing spaces or punctuation (besides underscores) or starting with digits must be surrounded by backticks. (For example, a column named “Area (cm^2)” would be referenced as `Area (cm^2)`.) Column names which are Python keywords (like “list”, “for”, “import”, etc.) cannot be used.
For example, if one of your columns is called a a and you want to sum it with b, your query should be `a a` + b.
- Returns:
DataFrame result after the query operation, otherwise None.
- Return type:
None or bigframes.pandas.DataFrame
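The quoting rules described under Parameters can be sketched as follows. To stay runnable without a BigQuery session, this sketch uses plain pandas, on the assumption that bigframes accepts the same query syntax. The column values and the threshold variable are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: one column name contains a space, one does not.
df = pd.DataFrame({
    "a a": [1, 2, 3],
    "b": [10, 20, 30],
})

# Hypothetical local variable, referenced from the query with '@'.
threshold = 25

# Backticks quote the column name that is not a valid Python identifier.
summed = df.query("`a a` + b > 20")

# '@threshold' is resolved from the enclosing Python scope.
large = df.query("b > @threshold")

print(summed)
print(large)
```

With bigframes, the same query strings would be passed to DataFrame.query on a bpd.DataFrame, assuming variable lookup with ‘@’ behaves as described above.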