python · intermediate
Pandas Apply with Chunked Progress
Apply a function to a large DataFrame in chunked batches to avoid memory spikes and track progress.
import pandas as pd
import numpy as np


def process_batch(df_chunk: pd.DataFrame) -> pd.DataFrame:
    # Work on a copy so the caller's data is left untouched.
    df_chunk = df_chunk.copy()
    df_chunk['result'] = df_chunk['value'] ** 2 + df_chunk['value']
    return df_chunk


df = pd.DataFrame({'value': np.random.rand(100_000)})
chunk_size = 10_000

# Split into consecutive row slices; the last one may be shorter.
chunks = [df.iloc[i:i + chunk_size] for i in range(0, len(df), chunk_size)]
# Process each chunk, then stitch the results back together.
processed = pd.concat([process_batch(c) for c in chunks], ignore_index=True)

print(f'Processed {len(processed):,} rows')
print(processed.describe())

Use Cases
- memory-safe transforms
- chunked processing
- large DataFrame ops
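
The snippet on this page builds every chunk up front and does not report progress while it runs. One possible way to add progress tracking is sketched below. The helper names iter_chunks, apply_in_chunks, and report, and the on_progress callback, are illustrative assumptions and not part of pandas; only DataFrame.iloc and pd.concat are real pandas calls.

```python
from typing import Callable, Iterator, Optional

import numpy as np
import pandas as pd


def iter_chunks(df: pd.DataFrame, chunk_size: int) -> Iterator[pd.DataFrame]:
    """Yield consecutive row slices of at most chunk_size rows (hypothetical helper)."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]


def apply_in_chunks(
    df: pd.DataFrame,
    func: Callable[[pd.DataFrame], pd.DataFrame],
    chunk_size: int,
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> pd.DataFrame:
    """Apply func chunk by chunk, calling on_progress(done, total) after each chunk."""
    if chunk_size <= 0:
        raise ValueError('chunk_size must be a positive integer')
    total = len(df)
    done = 0
    results = []
    for chunk in iter_chunks(df, chunk_size):
        results.append(func(chunk))
        done += len(chunk)
        if on_progress is not None:
            on_progress(done, total)
    if not results:
        # pd.concat raises on an empty list, so handle an empty input directly.
        return func(df.iloc[0:0])
    return pd.concat(results, ignore_index=True)


def process_batch(df_chunk: pd.DataFrame) -> pd.DataFrame:
    # Same transform as the snippet on this page.
    df_chunk = df_chunk.copy()
    df_chunk['result'] = df_chunk['value'] ** 2 + df_chunk['value']
    return df_chunk


def report(done: int, total: int) -> None:
    # Assumed progress format; swap in logging or a progress bar as needed.
    print(f'{done:,} / {total:,} rows ({done / total:.0%})')


df = pd.DataFrame({'value': np.random.rand(100_000)})
processed = apply_in_chunks(df, process_batch, chunk_size=10_000, on_progress=report)
```

Slicing through a generator means only one input slice is materialized at a time, but the processed chunks are still collected in a list before pd.concat, so peak memory includes the full output. If that is too much, one option is to write each processed chunk to disk inside the loop instead of collecting it.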
Related Snippets
Similar patterns you can reuse in the same workflow.
python · intermediate
Stream Large SQL Query in Chunks
Read millions of rows from SQL in memory-safe chunks using pandas read_sql with chunksize.
Best for: large table extraction
#pandas #sql
python · beginner
Pandas DataFrame Transformations
Common pandas DataFrame transformations including column operations, type casting, and string methods.
Best for: Cleaning raw data files for analysis
#pandas #dataframe
python · beginner
Pandas DataFrame Filtering Techniques
Filter DataFrames using boolean masks, query syntax, isin, between, and string matching methods.
Best for: Extracting subsets of data for reporting
#pandas #filtering
python · intermediate
Pandas GroupBy Aggregation Examples
GroupBy operations with multiple aggregations, named aggregations, and transform for DataFrame analysis.
Best for: Sales reporting by region and time period
#pandas #groupby