python · intermediate

Pandas GroupBy Transform Patterns

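Use groupby().transform() to compute a group-level statistic and broadcast it back to every row of that group. The result has the same length and index as the original DataFrame, so it can be assigned directly as a new column.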

python
import pandas as pd
import numpy as np

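df = pd.DataFrame({
    'dept':   ['eng', 'eng', 'hr', 'hr', 'eng'],
    'salary': [90, 85, 60, 65, 95],
    'bonus':  [10, 8, 5, 6, 12],
})

# Group once and reuse the grouped salary column for every transform.
dept_salary = df.groupby('dept')['salary']

df['dept_avg']    = dept_salary.transform('mean')
df['dept_total']  = dept_salary.transform('sum')
# Rank 1 is the highest salary in the department; ties share the average rank.
df['salary_rank'] = dept_salary.transform('rank', ascending=False)
# Within-department z-score; 'std' is the sample standard deviation (ddof=1).
df['normalised']  = (df['salary'] - df['dept_avg']) / dept_salary.transform('std')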

print(df)
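
To make the "broadcast back to row level" behaviour concrete, the following minimal sketch contrasts transform() with agg() on the same grouping. The variable names per_dept and per_row are illustrative only, and the data is a trimmed copy of the example frame.

```python
import pandas as pd

df = pd.DataFrame({
    'dept':   ['eng', 'eng', 'hr', 'hr', 'eng'],
    'salary': [90, 85, 60, 65, 95],
})

# agg collapses each group to a single value, indexed by the group key.
per_dept = df.groupby('dept')['salary'].agg('mean')

# transform returns one value per original row, indexed like df.
per_row = df.groupby('dept')['salary'].transform('mean')

print(len(per_dept), len(per_row))      # number of groups vs number of rows
print(per_row.index.equals(df.index))   # aligned with df, so safe to assign as a column
```

Because per_row shares df's index, it can be assigned as a column without a merge, whereas per_dept would need to be joined back on 'dept'.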

Use Cases

  • feature engineering
  • normalisation within groups
  • ranking
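
As a rough illustration of the feature-engineering and normalisation use cases, the sketch below derives a few per-row features from group statistics. The column names salary_share, gap_to_top and salary_minmax are hypothetical, chosen for this example, and min-max scaling is shown as one possible alternative to the z-score.

```python
import pandas as pd

df = pd.DataFrame({
    'dept':   ['eng', 'eng', 'hr', 'hr', 'eng'],
    'salary': [90, 85, 60, 65, 95],
})

by_dept = df.groupby('dept')['salary']

# Feature: each row's share of its department's total salary.
df['salary_share'] = df['salary'] / by_dept.transform('sum')

# Feature: gap between each salary and the highest salary in the department.
df['gap_to_top'] = by_dept.transform('max') - df['salary']

# A custom callable receives each group's Series; here it applies
# min-max scaling within the group (assumes max != min in every group).
df['salary_minmax'] = by_dept.transform(lambda s: (s - s.min()) / (s.max() - s.min()))

print(df)
```

Built-in names such as 'sum' and 'max' are generally faster than a Python lambda, so a custom callable is best kept for logic that has no built-in equivalent.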
