Pandas GroupBy Aggregation Examples
GroupBy operations with multiple aggregations, named aggregations, and transform for DataFrame analysis.
import pandas as pd
df = pd.read_csv("orders.csv")
# Basic groupby with single aggregation
revenue_by_region = df.groupby("region")["amount"].sum()
# Multiple aggregations
stats = df.groupby("region")["amount"].agg(["sum", "mean", "count", "std"])
# Named aggregations (clean output)
summary = df.groupby("region").agg(
total_revenue=pd.NamedAgg(column="amount", aggfunc="sum"),
avg_order=pd.NamedAgg(column="amount", aggfunc="mean"),
order_count=pd.NamedAgg(column="id", aggfunc="count"),
unique_customers=pd.NamedAgg(column="customer_id", aggfunc="nunique"),
).reset_index()
# Group by multiple columns
monthly = df.groupby(["region", pd.Grouper(key="date", freq="ME")])["amount"].sum()
# Transform: add group-level calculation as a new column
df["pct_of_region"] = df.groupby("region")["amount"].transform(
lambda x: x / x.sum() * 100
)
# Filter groups: keep only regions with total > 10000
big_regions = df.groupby("region").filter(lambda g: g["amount"].sum() > 10000)
# Rank within groups
df["rank_in_region"] = df.groupby("region")["amount"].rank(
ascending=False, method="dense"
)
# Pivot table
pivot = pd.pivot_table(
df, values="amount", index="region", columns="status",
aggfunc="sum", fill_value=0, margins=True
)
print(summary)
print(pivot)Use Cases
- Sales reporting by region and time period
- Computing KPIs per customer segment
- Building summary dashboards from raw data
Tags
Related Snippets
Similar patterns you can reuse in the same workflow.
String Aggregation with GROUP BY
Concatenate grouped values into comma-separated strings using STRING_AGG with ordering and filtering.
Pandas DataFrame Transformations
Common pandas DataFrame transformations including column operations, type casting, and string methods.
Pandas DataFrame Filtering Techniques
Filter DataFrames using boolean masks, query syntax, isin, between, and string matching methods.
JSON Aggregation and Querying
Aggregate related rows into JSON arrays and query JSONB columns with PostgreSQL native JSON operators.