python | advanced

Efficient One-Hot Pivot with Sparse

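Build a binary user-item matrix from transaction logs and store it with a sparse dtype, for recommendation or other ML use cases. Note that pivot_table produces a dense table first; the memory saving comes from the conversion to SparseDtype afterwards, so it pays off mainly when most user-item pairs are empty.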

python
import pandas as pd

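transactions = pd.DataFrame({
    'user_id': [1, 1, 2, 3, 2, 1],
    'item': ['A', 'B', 'B', 'C', 'A', 'C'],
})

# Pivot to a binary user-item matrix: one row per user, one column per item
matrix = (
    transactions.assign(val=1)  # every logged transaction counts as 1
    # aggfunc='max' collapses repeat purchases to 1; fill_value=0 marks unseen pairs
    .pivot_table(index='user_id', columns='item', values='val', fill_value=0, aggfunc='max')
    # Store only the non-zero entries
    .astype(pd.SparseDtype('int8', fill_value=0))
)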

print(matrix)
print(f'Density: {matrix.sparse.density:.1%}')
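
Whether the sparse dtype actually saves memory depends on the density. The following is a minimal sketch, not part of the original snippet, that compares the footprint of the sparse frame against its dense equivalent. The names dense, sparse_bytes, dense_bytes and stored are illustrative. On a toy table this small and this dense, the sparse version may well be the larger of the two, because each stored value also carries an index position.

```python
import pandas as pd

transactions = pd.DataFrame({
    'user_id': [1, 1, 2, 3, 2, 1],
    'item': ['A', 'B', 'B', 'C', 'A', 'C'],
})

matrix = (
    transactions.assign(val=1)
    .pivot_table(index='user_id', columns='item', values='val', fill_value=0, aggfunc='max')
    .astype(pd.SparseDtype('int8', fill_value=0))
)

# Dense copy of the same data, for comparison only
dense = matrix.sparse.to_dense()

# Bytes used by the column data (index excluded so both sides are comparable)
sparse_bytes = int(matrix.memory_usage(index=False).sum())
dense_bytes = int(dense.memory_usage(index=False).sum())

# Number of non-zero cells, i.e. distinct (user, item) pairs in the log
stored = int((dense != 0).to_numpy().sum())

print(f'Non-zero cells: {stored} of {dense.size}')
print(f'Sparse bytes: {sparse_bytes}, dense bytes: {dense_bytes}')
```

If the sparse figure is not clearly smaller on your real data, a plain int8 frame is probably the simpler choice.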

Use Cases

  • recommendation systems
  • feature matrices
  • user-item modeling
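
As one illustration of the recommendation use case, the binary matrix can be turned into item-item co-occurrence counts. This is a rough sketch under the assumption that the matrix is small enough to densify; co_occurrence is an illustrative name, not something from the original snippet. For large matrices, handing the data to SciPy via matrix.sparse.to_coo() would likely be a better fit.

```python
import pandas as pd

transactions = pd.DataFrame({
    'user_id': [1, 1, 2, 3, 2, 1],
    'item': ['A', 'B', 'B', 'C', 'A', 'C'],
})

matrix = (
    transactions.assign(val=1)
    .pivot_table(index='user_id', columns='item', values='val', fill_value=0, aggfunc='max')
    .astype(pd.SparseDtype('int8', fill_value=0))
)

# Densify and widen the dtype so the counts cannot overflow int8
dense = matrix.sparse.to_dense().astype('int32')

# Entry (i, j) counts the users who interacted with both item i and item j;
# the diagonal holds the number of users per item
co_occurrence = dense.T @ dense

print(co_occurrence)
```

Items with a high off-diagonal count are candidates for "users who bought X also bought Y" style suggestions.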
