Pandas IntervalIndex for Binning

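Use pd.cut to bin a continuous variable into ordered, labelled categories. The bin edges define half-open intervals; without labels, pd.cut returns those intervals as categories backed by an IntervalIndex.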

python
import pandas as pd
import numpy as np

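# 200 random integer scores in 0..99 (randint's upper bound is exclusive)
scores = np.random.randint(0, 100, 200)

# Edges give the intervals [0, 60], (60, 70], (70, 80], (80, 90], (90, 100]
bins   = [0, 60, 70, 80, 90, 100]
labels = ['F', 'D', 'C', 'B', 'A']

df = pd.DataFrame({'score': scores})
# include_lowest=True closes the first bin on the left; without it a score of 0 becomes NaN
df['grade'] = pd.cut(df['score'], bins=bins, labels=labels,
                     right=True, include_lowest=True)

# Counts and mean score per grade, in category order (F to A)
print(df['grade'].value_counts().sort_index())
# observed=False keeps grades with no rows and makes the categorical grouping explicit
print(df.groupby('grade', observed=False)['score'].mean())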
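
The snippet above passes plain bin edges to pd.cut. A minimal sketch of the IntervalIndex route named in the title follows. It assumes left-closed grade bands (a score of 60 earns a D), and the sample scores and the upper break of 101 (so that 100 lands in the top band) are illustrative choices, not part of the original snippet.

```python
import pandas as pd

# Hypothetical sample scores covering both extremes and two band boundaries
scores = pd.Series([0, 59, 60, 75, 90, 100], name="score")

# Left-closed intervals: [0, 60), [60, 70), [70, 80), [80, 90), [90, 101)
intervals = pd.IntervalIndex.from_breaks([0, 60, 70, 80, 90, 101], closed="left")

# With an IntervalIndex as bins, pd.cut ignores `labels` and `right`;
# the resulting categories are the intervals themselves.
binned = pd.cut(scores, bins=intervals)

# Attach letter grades afterwards by renaming the interval categories in order
grades = binned.cat.rename_categories(["F", "D", "C", "B", "A"])

print(pd.DataFrame({"score": scores, "interval": binned, "grade": grades}))

# The IntervalIndex also supports direct lookups: position of the band containing 85
print(intervals.get_loc(85))
```

Building the IntervalIndex explicitly makes the closed side of every band visible in one place, at the cost of assigning labels in a separate step.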

Use Cases

  • grading systems
  • risk tiering
  • feature binning
