python · advanced
Ibis Portable DataFrame SQL
Write backend-agnostic analytics queries with Ibis that compile to DuckDB, BigQuery, or Spark.
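import ibis

# Show results eagerly when expressions are displayed in a REPL or notebook
ibis.options.interactive = True

# In-memory DuckDB backend; connect to a different backend to run elsewhere
con = ibis.duckdb.connect(':memory:')
con.create_table(
    'events',
    ibis.memtable({
        'user_id': [1, 1, 2, 2, 3],
        'action': ['click', 'view', 'click', 'buy', 'view'],
        'value': [1.0, 2.0, 1.5, 50.0, 2.0],
    }),
)
t = con.table('events')

# Per-user action count and total value, largest total first
result = (
    t.group_by('user_id')
    .aggregate(
        actions=t['action'].count(),
        total_value=t['value'].sum(),
    )
    .order_by(ibis.desc('total_value'))  # order_by takes no ascending= keyword
)
print(result.execute())
Use Cases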
- portable analytics
- backend-agnostic queries
- lakehouse exploration
Related Snippets
Similar patterns you can reuse in the same workflow.
python · beginner
DuckDB In-Memory Analytics
Run fast analytical SQL on pandas DataFrames or Parquet files without a server using DuckDB.
Best for: serverless analytics
#duckdb #analytics
python · advanced
Spark SQL Query Example
PySpark DataFrame operations with SQL queries, window functions, and aggregations for big data.
Best for: Processing large-scale datasets with Spark
#spark #pyspark
sql · intermediate
SQL Incremental Load Pattern
Incremental data load using watermark tracking to process only new and updated records efficiently.
Best for: Efficient warehouse loading without full reloads
#sql #incremental-load
sql · intermediate
SQL Data Deduplication Techniques
Remove duplicate records using ROW_NUMBER, DISTINCT ON, and self-join deduplication strategies.
Best for: Cleaning duplicate records in production databases
#sql #deduplication