python · intermediate

Pandera @check_input and @check_output

Decorate pipeline functions with Pandera's check_input and check_output so that the DataFrame passed in and the DataFrame returned are each validated against a schema on every call.

python
import pandas as pd
import pandera as pa
from pandera import check_input, check_output

# Input contract: both columns must be strictly positive.
input_schema = pa.DataFrameSchema({
    'price': pa.Column(float, pa.Check.gt(0)),
    'qty': pa.Column(int, pa.Check.gt(0)),
})

# Output contract: the original columns plus a non-negative 'revenue'.
output_schema = pa.DataFrameSchema({
    'price': pa.Column(float),
    'qty': pa.Column(int),
    'revenue': pa.Column(float, pa.Check.ge(0)),
})

@check_input(input_schema)    # validates the first positional argument before the call
@check_output(output_schema)  # validates the return value after the call
def add_revenue(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(revenue=df['price'] * df['qty'])

df = pd.DataFrame({'price': [10.0, 20.0], 'qty': [3, 5]})
print(add_revenue(df))

Use Cases

  • contract testing
  • data pipeline QA
  • typed ETL functions
