python | intermediate
Stream Large SQL Query in Chunks
Read millions of rows from SQL in memory-safe chunks using pandas read_sql with chunksize.
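import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('postgresql://user:pass@localhost/db')

results = []
# stream_results=True requests a server-side cursor; without it the driver
# may buffer the full result set before pandas starts chunking.
with engine.connect().execution_options(stream_results=True) as conn:
    for chunk in pd.read_sql(
        "SELECT event_id, user_id, created_at FROM events "
        "WHERE created_at >= '2024-01-01'",
        con=conn,
        chunksize=50_000,
        parse_dates=['created_at'],
    ):
        # Derive the hour and keep only the columns needed downstream.
        chunk['hour'] = chunk['created_at'].dt.hour
        results.append(chunk[['event_id', 'user_id', 'hour']])

# The reduced chunks must still fit in memory together once concatenated.
df = pd.concat(results, ignore_index=True)
print(df.shape)

Use Cases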
- large table extraction
- memory-safe ETL
- incremental processing
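
For the incremental-processing case, it may help to reduce each chunk to a small summary instead of collecting every chunk for a final concat. The sketch below is one possible approach, not the snippet's own method. It assumes an in-memory SQLite table standing in for events (so it runs without a PostgreSQL server), and hourly_counts is a hypothetical helper name introduced for illustration.

```python
import sqlite3
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the events table, built in memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id INTEGER, user_id INTEGER, created_at TEXT)"
)
rows = [(i, i % 3, f"2024-01-01 {i % 24:02d}:00:00") for i in range(100)]
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
conn.commit()


def hourly_counts(con, chunksize):
    """Count events per hour, keeping only one chunk in memory at a time."""
    totals = Counter()
    for chunk in pd.read_sql(
        "SELECT created_at FROM events",
        con=con,
        chunksize=chunksize,
        parse_dates=["created_at"],
    ):
        # Reduce the chunk to per-hour counts, then fold them into the totals.
        totals.update(chunk["created_at"].dt.hour.value_counts().to_dict())
    return totals


counts = hourly_counts(conn, chunksize=25)
print(sorted(counts.items()))
```

Because only the running totals outlive each loop iteration, peak memory depends on chunksize rather than on the size of the table.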