Python, intermediate
Pandas GroupBy Aggregation Examples
GroupBy operations with multiple aggregations, named aggregations, and transform for DataFrame analysis.
import pandas as pd
# Parse "date" as datetime so the monthly pd.Grouper below can bucket it
df = pd.read_csv("orders.csv", parse_dates=["date"])
# Basic groupby with single aggregation
revenue_by_region = df.groupby("region")["amount"].sum()
# Multiple aggregations
stats = df.groupby("region")["amount"].agg(["sum", "mean", "count", "std"])
# Named aggregations (clean output)
summary = df.groupby("region").agg(
    total_revenue=pd.NamedAgg(column="amount", aggfunc="sum"),
    avg_order=pd.NamedAgg(column="amount", aggfunc="mean"),
    order_count=pd.NamedAgg(column="id", aggfunc="count"),
    unique_customers=pd.NamedAgg(column="customer_id", aggfunc="nunique"),
).reset_index()
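# Group by multiple keys: region plus a month-end bucket of "date"
# ("ME" needs pandas 2.2+; older versions use "M"; "date" must be a datetime column)
monthly = df.groupby(["region", pd.Grouper(key="date", freq="ME")])["amount"].sum()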
# Transform: add group-level calculation as a new column
df["pct_of_region"] = df.groupby("region")["amount"].transform(
    lambda x: x / x.sum() * 100
)
# Filter groups: keep only regions with total > 10000
big_regions = df.groupby("region").filter(lambda g: g["amount"].sum() > 10000)
# Rank within groups
df["rank_in_region"] = df.groupby("region")["amount"].rank(
    ascending=False, method="dense"
)
# Pivot table
pivot = pd.pivot_table(
    df, values="amount", index="region", columns="status",
    aggfunc="sum", fill_value=0, margins=True
)
print(summary)
print(pivot)
Use Cases
- Sales reporting by region and time period
- Computing KPIs per customer segment
- Building summary dashboards from raw data
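The snippet above depends on an orders.csv file that is not shown. To try the same patterns without it, here is a minimal sketch on a small in-memory DataFrame. The sample rows and the 300 threshold are made up for illustration, and the column names are assumed from the snippet. It also uses the tuple shorthand for named aggregations, which pandas accepts in place of pd.NamedAgg, and a built-in "sum" transform as a possible alternative to the lambda.

```python
import pandas as pd

# Hypothetical sample rows standing in for orders.csv (not from the original data)
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "customer_id": ["a", "b", "a", "c", "c"],
    "region": ["east", "east", "west", "west", "west"],
    "amount": [100.0, 300.0, 50.0, 100.0, 50.0],
})

# Named aggregations using (column, aggfunc) tuples instead of pd.NamedAgg
summary = df.groupby("region").agg(
    total_revenue=("amount", "sum"),
    avg_order=("amount", "mean"),
    order_count=("id", "count"),
    unique_customers=("customer_id", "nunique"),
).reset_index()

# Share of regional revenue per order: transform("sum") broadcasts each
# group's total back to the original rows, so no lambda is needed
region_total = df.groupby("region")["amount"].transform("sum")
df["pct_of_region"] = df["amount"] / region_total * 100

# Keep only rows from regions whose total exceeds an illustrative threshold
big_regions = df.groupby("region").filter(lambda g: g["amount"].sum() > 300)

print(summary)
print(df[["region", "amount", "pct_of_region"]])
print(big_regions)
```

Passing a function name such as "sum" to transform lets pandas use its built-in grouped implementation, which is usually faster than a Python lambda on large frames. The lambda form remains useful when the per-group calculation has no built-in equivalent.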