python | intermediate
Pandas Merge and Join Examples
Combine DataFrames using merge, join, and concat with different join types and key handling.
import pandas as pd
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 30, 40],
    "amount": [100, 200, 150, 300],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 50],
    "name": ["Alice", "Bob", "Charlie"],
    "region": ["US", "EU", "APAC"],
})
# Inner join (default) — only matching rows
inner = pd.merge(orders, customers, on="customer_id")
# Left join — keep all orders
left = pd.merge(orders, customers, on="customer_id", how="left")
# Outer join — keep everything
outer = pd.merge(orders, customers, on="customer_id", how="outer")
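# Merge on different column names
# (customers_alt is an illustrative copy whose key column is named "cust_id")
customers_alt = customers.rename(columns={"customer_id": "cust_id"})
renamed = pd.merge(orders, customers_alt, left_on="customer_id", right_on="cust_id")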
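# Multi-key merge (df1 and df2 are small illustrative frames sharing both key columns)
df1 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "product_id": [1, 2], "units": [5, 3]})
df2 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "product_id": [1, 2], "price": [9.5, 4.0]})
result = pd.merge(df1, df2, on=["date", "product_id"], how="inner")
# Concat DataFrames vertically (illustrative monthly frames with identical columns)
df_jan = pd.DataFrame({"month": ["jan"], "sales": [100]})
df_feb = pd.DataFrame({"month": ["feb"], "sales": [120]})
df_mar = pd.DataFrame({"month": ["mar"], "sales": [90]})
stacked = pd.concat([df_jan, df_feb, df_mar], ignore_index=True)
# Concat horizontally (rows are aligned on the index, not on position)
df_features = pd.DataFrame({"x1": [0.1, 0.2], "x2": [1.0, 2.0]})
df_labels = pd.DataFrame({"label": [0, 1]})
side_by_side = pd.concat([df_features, df_labels], axis=1)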
# Merge with indicator to see join result
diag = pd.merge(orders, customers, on="customer_id", how="outer", indicator=True)
print(diag["_merge"].value_counts())
Use Cases
- Combining data from multiple sources
- Enriching transactional data with reference data
- Diagnosing data completeness across tables
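The description mentions join, which the snippet above does not exercise. Here is a minimal sketch, assuming the same orders and customers sample frames (redefined here so the block runs on its own). DataFrame.join matches the caller's column against the other frame's index, so the lookup table is indexed by customer_id first. The name customers_by_id is illustrative.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 30, 40],
    "amount": [100, 200, 150, 300],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 50],
    "name": ["Alice", "Bob", "Charlie"],
    "region": ["US", "EU", "APAC"],
})

# join() compares the `on` column of the caller with the INDEX of the other
# frame, so index the lookup table by its key before joining.
customers_by_id = customers.set_index("customer_id")

# DataFrame.join defaults to a left join: every order row is kept, and
# orders without a matching customer get NaN in name and region.
joined = orders.join(customers_by_id, on="customer_id")

print(joined)
```

This should select the same rows as pd.merge(orders, customers, on="customer_id", how="left"). join is mostly a convenience when the right-hand frame is already indexed by its key.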
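For the completeness use case, the _merge column produced by indicator=True can be filtered to list the keys that exist on only one side. This is a sketch under the same sample data as the snippet above. The names orphan_orders and unused_customers are illustrative, not part of pandas.

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 30, 40],
    "amount": [100, 200, 150, 300],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 50],
    "name": ["Alice", "Bob", "Charlie"],
    "region": ["US", "EU", "APAC"],
})

# An outer merge keeps every key from both sides; indicator=True adds a
# categorical "_merge" column with values left_only, right_only, or both.
diag = pd.merge(orders, customers, on="customer_id", how="outer", indicator=True)

# Orders whose customer_id has no matching row in customers.
orphan_orders = diag.loc[diag["_merge"] == "left_only", "customer_id"].tolist()

# Customers that no order refers to.
unused_customers = diag.loc[diag["_merge"] == "right_only", "customer_id"].tolist()

print("orders without a customer:", orphan_orders)
print("customers without an order:", unused_customers)
```

Listing the unmatched keys, rather than only counting them with value_counts(), makes it easier to trace each gap back to its source table.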
Related Snippets
Similar patterns you can reuse in the same workflow.
python | beginner
Pandas DataFrame Transformations
Common pandas DataFrame transformations including column operations, type casting, and string methods.
Best for: Cleaning raw data files for analysis
#pandas #dataframe
python | beginner
Pandas DataFrame Filtering Techniques
Filter DataFrames using boolean masks, query syntax, isin, between, and string matching methods.
Best for: Extracting subsets of data for reporting
#pandas #filtering
python | intermediate
Pandas merge_asof for Time-Based Joins
Perform an as-of join to match events to the most recent reference record within a time window.
Best for: Tick data joins
#pandas #merge-asof
python | intermediate
Pandas Merge with Validation
Use the validate parameter of merge() to catch unexpected many-to-many or missing key issues in joins.
Best for: Data integrity
#pandas #merge