Python · Beginner

Pandas Categorical Encoding for ML

One-hot encode, label encode, and ordinal encode categorical columns using pandas and scikit-learn.

python
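import pandas as pd
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Toy frame: a nominal column (color), an ordered column (size), and a target
df = pd.DataFrame({
    'color': ['red', 'blue', 'green', 'red'],
    'size': ['S', 'L', 'M', 'XL'],
    'target': [1, 0, 1, 0],
})

# One-hot encode: one indicator column per color (nominal, no inherent order)
df_ohe = pd.get_dummies(df, columns=['color'], prefix='color')

# Label encode: integer codes in sorted class order (blue=0, green=1, red=2).
# LabelEncoder is designed for target labels; these codes imply no real order.
le = LabelEncoder()
df['color_label'] = le.fit_transform(df['color'])

# Ordinal encode: pass the category order explicitly so that S < M < L < XL
oe = OrdinalEncoder(categories=[['S', 'M', 'L', 'XL']])
# fit_transform returns an (n, 1) array; flatten it before assigning to one column
df['size_ord'] = oe.fit_transform(df[['size']]).ravel()

print(df_ohe)
print(df)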
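
The same integer encodings can also be sketched with pandas alone, which may be convenient for quick exploration without scikit-learn. This is a minimal sketch, assuming the same toy color and size values as above; the names `df_cat`, `size_code`, `color_code`, and `unseen` are illustrative and not part of the original snippet.

```python
import pandas as pd

df_cat = pd.DataFrame({
    'color': ['red', 'blue', 'green', 'red'],
    'size': ['S', 'L', 'M', 'XL'],
})

# Ordered categorical: the codes follow the order listed in `categories`
size_cat = pd.Categorical(df_cat['size'], categories=['S', 'M', 'L', 'XL'], ordered=True)
df_cat['size_code'] = size_cat.codes  # S=0, M=1, L=2, XL=3

# Unordered categorical: codes follow sorted category order (blue=0, green=1, red=2)
df_cat['color_code'] = df_cat['color'].astype('category').cat.codes

# Values missing from `categories` get code -1 rather than raising an error
unseen = pd.Categorical(['S', 'XXL'], categories=['S', 'M', 'L', 'XL'], ordered=True)

print(df_cat)
print(list(unseen.codes))
```

One caveat with this approach: the mapping lives in the categorical dtype rather than in a fitted encoder object, so the same `categories` list has to be reused explicitly when encoding new data.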

Use Cases

  • ML preprocessing
  • feature engineering
  • categorical handling
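
For the ML preprocessing use case, encoders are often applied per column inside a single transformer so that the mapping fitted on training data is reused on new data. The following is a minimal sketch, assuming scikit-learn's `ColumnTransformer` and `OneHotEncoder` (neither appears in the snippet above) and hypothetical `train` and `new_rows` frames, the latter containing a color that was not seen during fitting.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

train = pd.DataFrame({
    'color': ['red', 'blue', 'green', 'red'],
    'size': ['S', 'L', 'M', 'XL'],
})

preprocess = ColumnTransformer(
    transformers=[
        # handle_unknown='ignore' encodes unseen colors as an all-zero row
        ('color_ohe', OneHotEncoder(handle_unknown='ignore'), ['color']),
        # Explicit category order so that S < M < L < XL
        ('size_ord', OrdinalEncoder(categories=[['S', 'M', 'L', 'XL']]), ['size']),
    ],
    sparse_threshold=0.0,  # always return a dense NumPy array
)

X_train = preprocess.fit_transform(train)

# Hypothetical new data: 'purple' was never seen during fit
new_rows = pd.DataFrame({'color': ['purple', 'blue'], 'size': ['M', 'XL']})
X_new = preprocess.transform(new_rows)

print(X_train.shape)
print(X_new)
```

The output columns are the one-hot color indicators (blue, green, red) followed by the ordinal size code, so the unseen color shows up as zeros in the first three columns.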
