python · intermediate
Pandas Vectorised Operations vs Apply
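Compare row-wise apply with vectorised pandas operations for performance-critical column transformations. Vectorised expressions operate on whole columns in compiled code, whereas apply with axis=1 calls a Python function once per row.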
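import pandas as pd
import numpy as np

# One million rows of synthetic data: price in [0, 100), qty in 1..9
df = pd.DataFrame({
    'price': np.random.rand(1_000_000) * 100,
    'qty': np.random.randint(1, 10, 1_000_000),
})

# Fast: vectorised arithmetic works on whole columns at once
df['revenue'] = df['price'] * df['qty']
# Vectorised conditional: label every row without a Python-level loop
df['tier'] = np.where(df['revenue'] > 500, 'high', 'low')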
print(df.head())
Use Cases
- feature engineering
- column transformations
- ETL pre-processing
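The snippet above shows only the vectorised path. As a rough sketch of the comparison the title refers to, the block below computes the same tier label twice, once with a row-wise apply and once with column arithmetic plus np.where, and times both. The helper names (make_orders, tier_with_apply, tier_vectorised, time_call) and the 20,000-row sample size are assumptions made for illustration and are not part of the original snippet. Timings depend on the machine, so none are quoted here.

```python
import time

import numpy as np
import pandas as pd


def make_orders(n_rows: int, seed: int = 0) -> pd.DataFrame:
    """Build synthetic order data (hypothetical helper for this sketch)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "price": rng.random(n_rows) * 100,   # floats in [0, 100)
        "qty": rng.integers(1, 10, n_rows),  # integers 1..9
    })


def tier_with_apply(df: pd.DataFrame) -> pd.Series:
    """Slow path: a Python function is called once per row."""
    return df.apply(
        lambda row: "high" if row["price"] * row["qty"] > 500 else "low",
        axis=1,
    )


def tier_vectorised(df: pd.DataFrame) -> pd.Series:
    """Fast path: whole-column arithmetic followed by np.where."""
    revenue = df["price"] * df["qty"]
    return pd.Series(np.where(revenue > 500, "high", "low"), index=df.index)


def time_call(func, df: pd.DataFrame):
    """Return the function's result and its wall-clock duration in seconds."""
    start = time.perf_counter()
    result = func(df)
    return result, time.perf_counter() - start


# A smaller frame than the snippet's one million rows keeps the apply run short.
df = make_orders(20_000)
slow, slow_seconds = time_call(tier_with_apply, df)
fast, fast_seconds = time_call(tier_vectorised, df)

print(f"apply:      {slow_seconds:.4f} s")
print(f"vectorised: {fast_seconds:.4f} s")
print("same labels:", bool((slow == fast).all()))
```

Both functions should produce identical labels, so the choice between them is about speed rather than results. Row-wise apply remains a reasonable fallback when the per-row logic cannot be expressed with column operations.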
Related Snippets
Similar patterns you can reuse in the same workflow.
python · intermediate
Read Large CSV in Chunks with Pandas
Process CSV files larger than RAM by reading in chunks: a memory-efficient ETL pattern for data pipelines.
Best for: Processing multi-GB CSV files without running out of memory
#pandas #csv
python · intermediate
Polars DataFrame Operations
High-performance DataFrame operations using Polars: filtering, groupby, joins, and lazy evaluation.
Best for: data transformation
#polars #dataframe
python · beginner
SQLite + Pandas Local Data Pipeline
Run a lightweight local ETL with SQLite and pandas: load CSV, transform, persist to SQLite.
Best for: local analytics
#sqlite #pandas
python · intermediate
Multiprocessing Pool for ETL
Parallelise CPU-bound ETL transformations across multiple CPU cores using multiprocessing.Pool.
Best for: parallel file processing
#multiprocessing #parallel