Python | Beginner

Pandas DataFrame Filtering Techniques

Filter DataFrames using boolean masks, query syntax, isin, between, and string matching methods.

python
import pandas as pd

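# Parse "date" as datetime so the date filters below compare dates, not strings
df = pd.read_csv("sales.csv", parse_dates=["date"])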

# Boolean mask filtering
high_sales = df[df["amount"] > 1000]

# Multiple conditions: parenthesize each one and combine with & (and) or | (or)
filtered = df[(df["amount"] > 500) & (df["region"] == "US")]

# Query syntax (cleaner for complex filters)
result = df.query("amount > 500 and region == 'US' and status != 'cancelled'")

# Filter with isin
active_regions = df[df["region"].isin(["US", "EU", "UK"])]

# Between range (both endpoints are inclusive by default)
df_range = df[df["date"].between("2024-01-01", "2024-12-31")]

# String matching
tech_products = df[df["product"].str.contains("tech|software", case=False, na=False)]

# Filter null / not null
with_email = df[df["email"].notna()]
missing_data = df[df["phone"].isna()]

# Negate a filter
non_cancelled = df[~df["status"].isin(["cancelled", "refunded"])]

# Filter rows and select columns in one step with loc
# (named recent_sales so the query result above is not overwritten)
recent_sales = df.loc[
    (df["amount"] > 100) & (df["date"] >= "2024-06-01"),
    ["product", "amount", "date"],
]

print(f"Filtered: {len(recent_sales)} rows from {len(df)} total")
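
The snippet above depends on a sales.csv file that is not shown. As a minimal sketch, the same filters can be tried on a small in-memory DataFrame. The sample rows and the min_amount threshold are made up for illustration. The example compares a boolean mask with the equivalent query() call, using @ to reference a local variable, and shows how na=False handles a missing product name.

```python
import pandas as pd

# Hypothetical sample data standing in for sales.csv
df = pd.DataFrame({
    "product": ["Tech Hub", "Desk", "Software Suite", None],
    "amount": [1200, 300, 800, 50],
    "region": ["US", "EU", "US", "UK"],
    "status": ["paid", "cancelled", "paid", "refunded"],
})

min_amount = 500  # hypothetical threshold

# Boolean mask: each comparison is parenthesized because & binds tighter than > and ==
by_mask = df[(df["amount"] > min_amount) & (df["region"] == "US")]

# query(): the same filter as a string; @ refers to a local Python variable
by_query = df.query("amount > @min_amount and region == 'US'")

# Both approaches select the same rows
assert by_mask.equals(by_query)

# na=False treats the missing product name as a non-match
tech = df[df["product"].str.contains("tech|software", case=False, na=False)]
print(tech["product"].tolist())
```

Which form to use is mostly a readability choice: masks compose well when built up in code, while query() tends to read better for long conditions.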

Use Cases

  • Extracting subsets of data for reporting
  • Filtering records based on business rules
  • Data quality checks and validation
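
For the data quality use case, one possible approach is to give each rule its own named mask and then combine the masks. This is a sketch under assumptions: the sample records and the valid_regions list are hypothetical, and real validation rules will differ.

```python
import pandas as pd

# Hypothetical records; column names mirror the snippet above
records = pd.DataFrame({
    "amount": [250.0, -10.0, 90.0, 400.0],
    "email": ["a@example.com", None, "c@example.com", None],
    "region": ["US", "EU", "XX", "UK"],
})

valid_regions = ["US", "EU", "UK"]  # assumed business rule

# One named mask per rule keeps each check readable and reusable
bad_amount = records["amount"] <= 0
missing_email = records["email"].isna()
unknown_region = ~records["region"].isin(valid_regions)

# Rows that break at least one rule
any_problem = bad_amount | missing_email | unknown_region
invalid = records[any_problem]

# Rows that pass every rule
clean = records[~any_problem]

print(f"{len(invalid)} invalid, {len(clean)} clean")
```

Keeping the masks separate also makes it easy to report how many rows fail each rule, for example with bad_amount.sum().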
