python · intermediate
Pareto / Cumulative Share Analysis
Rank products or customers by value and compute each one's cumulative share of the total, the basis of a Pareto (80/20) analysis.
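import pandas as pd
import numpy as np

# Synthetic heavy-tailed revenue for 50 products, sorted largest first (seeded for reproducibility)
rng = np.random.default_rng(42)
df = pd.DataFrame({
    'product': [f'P{i:03d}' for i in range(50)],
    'revenue': rng.pareto(1.5, 50) * 1000,
}).sort_values('revenue', ascending=False).reset_index(drop=True)
df['cum_revenue'] = df['revenue'].cumsum()
df['cum_share'] = df['cum_revenue'] / df['revenue'].sum()
df['rank'] = df.index + 1

# Count the product that crosses the 80% line too; `cum_share <= 0.80` alone stops short of 80%
n_top = int((df['cum_share'] < 0.80).sum()) + 1
top_80 = df.head(n_top)
print(f'Top {len(top_80)} products drive 80% of revenue')
print(df.head(10)[['product', 'revenue', 'cum_share']])
Use Cases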
- product analytics
- customer segmentation
- 80/20 analysis
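For the customer segmentation use case, the same cumulative share can be turned into segment labels. The sketch below is one possible approach, not part of the original snippet: the `abc_classify` helper, the 80% and 95% cutoffs, and the customer figures are all illustrative assumptions.

```python
import numpy as np
import pandas as pd

def abc_classify(values: pd.Series, a_cut: float = 0.80, b_cut: float = 0.95) -> pd.DataFrame:
    """Label each item A, B, or C from its position on the cumulative-share curve.

    The cutoffs are assumed defaults; adjust them to your own segmentation rules.
    """
    ordered = values.sort_values(ascending=False)
    cum_share = ordered.cumsum() / ordered.sum()
    # Share accumulated *before* each item, so the item that crosses a cutoff
    # is still counted in the higher segment.
    share_before = cum_share.shift(fill_value=0.0)
    labels = np.select(
        [share_before < a_cut, share_before < b_cut],
        ['A', 'B'],
        default='C',
    )
    return pd.DataFrame({'value': ordered, 'cum_share': cum_share, 'segment': labels})

# Hypothetical revenue per customer
revenue = pd.Series({'C1': 500.0, 'C2': 300.0, 'C3': 100.0, 'C4': 60.0, 'C5': 40.0})
result = abc_classify(revenue)
print(result)
```

With these made-up numbers, C1 and C2 together reach 80% of revenue and land in segment A, C3 and C4 fall in B, and C5 in C.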
Related Snippets
Similar patterns you can reuse in the same workflow.
sql · intermediate
SQL Running Totals and Cumulative Metrics
Calculate running totals, cumulative counts, and percent-of-total using window functions and partitions.
Best for: building cumulative revenue dashboards
#sql #window-functions
python · beginner
Pandas Cross-Tabulation (crosstab)
Compute frequency and proportion cross-tabulations between two categorical columns.
Best for: categorical analysis
#pandas #crosstab
python · beginner
Pandas Rank with Tie-Breaking Methods
Apply different ranking strategies (min, dense, average) and handle ties in pandas.
Best for: leaderboards
#pandas #ranking
python · beginner
Pandas nlargest / nsmallest
Efficiently retrieve the N largest or smallest rows without sorting the full DataFrame.
Best for: top-N queries
#pandas #top-n