Python, intermediate

Pandas Merge and Join Examples

Combine DataFrames using merge, join, and concat with different join types and key handling.

python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 30, 40],
    "amount": [100, 200, 150, 300],
})

customers = pd.DataFrame({
    "customer_id": [10, 20, 50],
    "name": ["Alice", "Bob", "Charlie"],
    "region": ["US", "EU", "APAC"],
})

# Inner join (default) — only matching rows
inner = pd.merge(orders, customers, on="customer_id")

# Left join — keep all orders
left = pd.merge(orders, customers, on="customer_id", how="left")

# Outer join (keep all rows from both frames; unmatched cells become NaN)
outer = pd.merge(orders, customers, on="customer_id", how="outer")

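# Merge on different column names (the customers key is renamed to "id" here for illustration)
customers_alt = customers.rename(columns={"customer_id": "id"})
renamed = pd.merge(orders, customers_alt, left_on="customer_id", right_on="id")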

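# Multi-key merge (df1 and df2 are hypothetical sample frames that share both key columns)
df1 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "product_id": [1, 2], "units": [5, 3]})
df2 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "product_id": [1, 2], "price": [9.99, 4.50]})
result = pd.merge(df1, df2, on=["date", "product_id"], how="inner")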

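# Concat DataFrames vertically (the monthly frames are hypothetical and share the same columns)
df_jan = pd.DataFrame({"month": ["jan"], "sales": [100]})
df_feb = pd.DataFrame({"month": ["feb"], "sales": [120]})
df_mar = pd.DataFrame({"month": ["mar"], "sales": [90]})
stacked = pd.concat([df_jan, df_feb, df_mar], ignore_index=True)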

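# Concat horizontally (rows are aligned on the index; both frames here are hypothetical)
df_features = pd.DataFrame({"x1": [0.1, 0.2], "x2": [1, 0]})
df_labels = pd.DataFrame({"label": ["a", "b"]})
side_by_side = pd.concat([df_features, df_labels], axis=1)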

# Merge with indicator=True to add a _merge column showing which side each row came from
diag = pd.merge(orders, customers, on="customer_id", how="outer", indicator=True)
print(diag["_merge"].value_counts())
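
The description mentions join, which the snippet above does not demonstrate. The following is a minimal sketch of one way to use DataFrame.join, assuming the same sample orders and customers data. The name customers_by_id is introduced here only for illustration. join matches against the other frame's index and defaults to a left join.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 30, 40],
    "amount": [100, 200, 150, 300],
})

customers = pd.DataFrame({
    "customer_id": [10, 20, 50],
    "name": ["Alice", "Bob", "Charlie"],
    "region": ["US", "EU", "APAC"],
})

# join() matches the caller's `on` column against the other frame's index,
# so the lookup table is indexed by its key first (hypothetical helper name).
customers_by_id = customers.set_index("customer_id")

# how="left" is the default for join(): every order is kept, and orders
# without a matching customer get NaN in name and region.
joined = orders.join(customers_by_id, on="customer_id")
print(joined)
```

For a lookup table that is already indexed by its key, join can be a little more concise than merge; the equivalent merge call would pass right_index=True.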

Use Cases

  • Combining data from multiple sources
  • Enriching transactional data with reference data
  • Diagnosing data completeness across tables
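
As a sketch of the completeness use case, the _merge column from an outer merge can be filtered to list the rows that failed to match on either side. This assumes the same sample data as the snippet. The names orphan_orders and idle_customers are illustrative, and the validate argument is an optional extra check that is not part of the original snippet.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 30, 40],
    "amount": [100, 200, 150, 300],
})

customers = pd.DataFrame({
    "customer_id": [10, 20, 50],
    "name": ["Alice", "Bob", "Charlie"],
    "region": ["US", "EU", "APAC"],
})

# validate="many_to_one" makes merge raise pandas.errors.MergeError
# if customer_id is duplicated in the right-hand frame.
diag = pd.merge(
    orders,
    customers,
    on="customer_id",
    how="outer",
    indicator=True,
    validate="many_to_one",
)

# Orders whose customer_id has no match in customers
orphan_orders = diag.loc[diag["_merge"] == "left_only", ["order_id", "customer_id"]]

# Customers that have no orders at all
idle_customers = diag.loc[diag["_merge"] == "right_only", ["customer_id", "name"]]

print(orphan_orders)
print(idle_customers)
```

Note that order_id is converted to float in the outer merge result, because the unmatched customer row has no order and pandas fills that cell with NaN.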
