query() method: Query/Filter Columns
- November 24, 2018
- Key Terms: query, python, pandas
In pandas, the query() method filters the rows of a DataFrame using a boolean expression written over its column names. I'll walk through several simple examples.
Import Modules¶
In [2]:
import pandas as pd
import seaborn as sns
Get Flights Data¶
Let's get the flights dataset included in the seaborn library and assign it to the DataFrame df_flights.
In [3]:
df_flights = sns.load_dataset('flights')
Preview the first few rows of df_flights.
Each row represents one month of flight history. The passengers column holds the total number of passengers that flew that month.
In [4]:
df_flights.head()
Out[4]:
This dataset spans 1949 to 1960.
Practice Filtering Rows by Column Values¶
Query for rows in which year is equal to 1949¶
In [5]:
df_flights.query('year==1949')
Out[5]:
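To make clear what query() does with the expression string, here is a minimal sketch comparing it with ordinary boolean indexing. The df_sample DataFrame is a small hand-built stand-in for df_flights (its values are illustrative, not taken from the real dataset), so the snippet runs without downloading anything.

```python
import pandas as pd

# Hypothetical stand-in for df_flights; the numbers are made up for illustration
df_sample = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["January", "February", "January", "February"],
    "passengers": [100, 110, 120, 130],
})

# query() evaluates the string against the DataFrame's column names
by_query = df_sample.query("year == 1949")

# The same rows selected with an explicit boolean mask
by_mask = df_sample[df_sample["year"] == 1949]

print(by_query.equals(by_mask))  # True
```

Both forms return the same rows; query() mainly saves repeating the DataFrame name inside the condition.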
Query for rows in which month is equal to January¶
Notice that 'January' is wrapped in single quotes inside the double-quoted expression because it's a string value.
In [6]:
df_flights.query("month=='January'")
Out[6]:
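When the value to match lives in a Python variable, nested quotes can be avoided by referencing the variable with an @ prefix inside the expression. The sketch below assumes a small hand-built df_sample in place of df_flights, and target_month is a hypothetical variable name.

```python
import pandas as pd

# Hypothetical stand-in for df_flights; the numbers are made up for illustration
df_sample = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["January", "February", "January", "February"],
    "passengers": [100, 110, 120, 130],
})

# A local variable holding the value to match
target_month = "January"

# @target_month tells query() to look up the Python variable, not a column
january_rows = df_sample.query("month == @target_month")

print(january_rows["year"].tolist())  # [1949, 1950]
```

Depending on the seaborn version, the month column of the flights dataset may hold abbreviated names such as 'Jan' rather than 'January', so it can be worth checking df_flights['month'].unique() before writing the comparison.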
Query for rows in which year is equal to 1949 and month is equal to January¶
In [7]:
df_flights.query("year==1949 and month=='January'")
Out[7]:
Query for rows in which month is January or February¶
In [8]:
df_flights.query("month==['January', 'February']")
Out[8]:
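Comparing a column to a list with == inside query() behaves like a membership test. As a rough illustration, the sketch below checks that it matches the in operator and the isin() method, using a hand-built df_sample (illustrative values) in place of df_flights.

```python
import pandas as pd

# Hypothetical stand-in for df_flights; the numbers are made up for illustration
df_sample = pd.DataFrame({
    "year": [1949, 1949, 1949],
    "month": ["January", "February", "March"],
    "passengers": [100, 110, 120],
})

# == against a list is treated as "is the value one of these?"
with_eq = df_sample.query("month == ['January', 'February']")

# The same test spelled with the in operator
with_in = df_sample.query("month in ['January', 'February']")

# The same test outside query(), using isin()
with_isin = df_sample[df_sample["month"].isin(["January", "February"])]

print(with_eq.equals(with_in), with_eq.equals(with_isin))  # True True
```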
Query for rows in which month equals January and year is less than 1955¶
In [9]:
df_flights.query("month=='January' and year<1955")
Out[9]:
Query for rows in which month equals January and year is greater than 1955¶
In [10]:
df_flights.query("month=='January' and year>1955")
Out[10]:
Query for rows in which month equals January and the year is not 1955¶
In [11]:
df_flights.query("month=='January' and year!=1955")
Out[11]:
Query for rows in which month equals January or year equals 1955¶
In [12]:
df_flights.query("month=='January' or year==1955")
Out[12]:
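The difference between and and or is easiest to see on a tiny table. This sketch uses a hand-built df_sample (illustrative values, not the real flights data) and also shows the boolean-mask equivalent of the or query, where each condition needs its own parentheses.

```python
import pandas as pd

# Hypothetical stand-in for df_flights; the numbers are made up for illustration
df_sample = pd.DataFrame({
    "year": [1949, 1949, 1955, 1955],
    "month": ["January", "February", "January", "February"],
    "passengers": [100, 110, 120, 130],
})

# and keeps rows where both conditions hold
both = df_sample.query("month == 'January' and year == 1955")

# or keeps rows where at least one condition holds
either = df_sample.query("month == 'January' or year == 1955")

# Boolean-mask equivalent of the or query
either_mask = df_sample[
    (df_sample["month"] == "January") | (df_sample["year"] == 1955)
]

print(len(both), len(either))  # 1 3
```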