python · intermediate
Pandera @check_input and @check_output
Decorate pipeline functions with Pandera schema validators to enforce input and output contracts.
import pandas as pd
import pandera as pa
from pandera import check_input, check_output

input_schema = pa.DataFrameSchema({
    'price': pa.Column(float, pa.Check.gt(0)),
    'qty': pa.Column(int, pa.Check.gt(0)),
})
output_schema = pa.DataFrameSchema({
    'price': pa.Column(float),
    'qty': pa.Column(int),
    'revenue': pa.Column(float, pa.Check.ge(0)),
})

@check_input(input_schema)    # validates df before the function body runs
@check_output(output_schema)  # validates the DataFrame the function returns
def add_revenue(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(revenue=df['price'] * df['qty'])

df = pd.DataFrame({'price': [10.0, 20.0], 'qty': [3, 5]})
print(add_revenue(df))

Use Cases
- contract testing
- data pipeline QA
- typed ETL functions