DataFrame.query() - ( Pandas DataFrame Basics )
Syntax
df.query('column == value')
Example
import pandas as pd
data = {'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eric'],
        'age': [25, 20, 18, 32, 41],
        'city': ['New York', 'Paris', 'Tokyo', 'London', 'Dubai']}
df = pd.DataFrame(data)
filtered_df = df.query('age > 20')
print(filtered_df)
Output
    name  age      city
0  Alice   25  New York
3  David   32    London
4   Eric   41     Dubai
Explanation
A pandas DataFrame provides the query() method for filtering rows by a condition. The method takes the condition as a string expression and returns a new DataFrame containing only the rows that satisfy it.
In the example above, the DataFrame df holds each person's name, age, and city. We call query() with 'age > 20' to keep only the rows where the age column is greater than 20. Bob (age 20) and Charlie (age 18) are dropped, and the remaining rows keep their original index labels (0, 3, 4).
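As a minimal sketch of how the condition string can go beyond a single comparison, the snippet below reuses the example data. The variable min_age and its value are hypothetical, chosen only for illustration. Combining conditions with and/or and referring to a local Python variable with the @ prefix are both part of the query() expression syntax.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eric'],
    'age': [25, 20, 18, 32, 41],
    'city': ['New York', 'Paris', 'Tokyo', 'London', 'Dubai'],
})

# Combine conditions with `and` / `or`; string literals are quoted
# inside the expression.
older_not_london = df.query('age > 20 and city != "London"')

# Refer to a local Python variable by prefixing its name with @.
min_age = 30  # hypothetical threshold, for illustration only
at_least_min = df.query('age >= @min_age')

print(older_not_london)
print(at_least_min)
```

Keeping the threshold in a variable avoids building the expression string by hand each time the value changes.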
Use
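The query() method is useful for filtering DataFrames by a condition. Its string expression is often more readable than the equivalent boolean mask passed to the loc indexer, especially when several conditions are combined. It can also be faster on large DataFrames when the optional numexpr library is installed; on small DataFrames it is usually no faster, and can be slower, because the expression string has to be parsed first.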
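To make the comparison with loc concrete, here is a small sketch that applies the same filter both ways on the example data and checks that the results match. The names via_query and via_loc are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eric'],
    'age': [25, 20, 18, 32, 41],
    'city': ['New York', 'Paris', 'Tokyo', 'London', 'Dubai'],
})

# Filter with a string expression.
via_query = df.query('age > 20')

# Same filter with a boolean mask passed to the loc indexer.
via_loc = df.loc[df['age'] > 20]

# Both approaches select the same rows with the same index labels.
same_result = via_query.equals(via_loc)
print(same_result)
```

Since both forms return the same rows, the choice between them comes down to readability and, for large DataFrames, measured speed.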
Important Points
- A pandas DataFrame provides the query() method for filtering rows by a condition.
- The method takes the condition as a string expression and returns a new DataFrame containing only the rows that satisfy it.
- It is often more readable than a boolean mask passed to the loc indexer, and it can be faster on large DataFrames; on small ones the speed gain usually disappears.
Summary
In conclusion, the query() method of a pandas DataFrame is useful for filtering rows by a condition. It is often more readable than a boolean mask passed to the loc indexer, and on large datasets it can also be faster, so it is worth measuring both approaches when query time is critical.