DuckDB In-Memory Analytics
Use DuckDB to run fast analytical SQL directly on pandas DataFrames or Parquet files, in-process and without a database server.
import duckdb
import pandas as pd
import numpy as np
# Build 365 days of synthetic daily revenue (values are random, so results vary per run)
df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=365, freq='D'),
    'revenue': np.random.randint(1000, 5000, 365),
    'region': np.random.choice(['North', 'South', 'East'], 365),
})
# DuckDB resolves the table name `df` to the pandas DataFrame defined above
result = duckdb.query("""
    SELECT
        region,
        date_trunc('month', date) AS month,
        SUM(revenue) AS total_revenue,
        AVG(revenue) AS avg_revenue
    FROM df
    GROUP BY 1, 2
    ORDER BY 1, 2
""").df()
print(result.head(12))
Use Cases
- serverless analytics
- Parquet querying
- in-process SQL
Related Snippets
Similar patterns you can reuse in the same workflow.
SQL Window Functions for Analytics
Advanced SQL window functions for running totals, rankings, moving averages, and gap analysis.
Best for: Building analytics dashboards with running totals
SQL Window Functions for Analytics
Use window functions for running totals, rankings, moving averages, and gap detection in analytics.
Best for: Building cumulative revenue dashboards
SQL Running Totals and Cumulative Metrics
Calculate running totals, cumulative counts, and percent-of-total using window functions and partitions.
Best for: Building cumulative revenue dashboards
DuckDB — Query Parquet Files with Python
Use DuckDB to query Parquet files and CSVs directly from Python without loading into memory first.
Best for: Ad-hoc analytics on Parquet files without Spark