python · beginner
Generate Synthetic Data with Faker
Create realistic test datasets for development and testing using the Faker library.
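from faker import Faker
import pandas as pd
import random

# Seed both Faker and the random module so names, emails, and scores repeat across runs.
faker = Faker()
Faker.seed(42)
random.seed(42)

rows = [
    {
        'id': i + 1,
        'name': faker.name(),
        'email': faker.email(),
        # Faker addresses span multiple lines; flatten them to keep one CSV row per user.
        'address': faker.address().replace('\n', ', '),
        'company': faker.company(),
        # The window is relative to the current time, so timestamps shift between runs.
        'created_at': faker.date_time_between('-2y', 'now').isoformat(),
        'score': round(random.uniform(0, 100), 2),
    }
    for i in range(1000)
]

df = pd.DataFrame(rows)
df.to_csv('synthetic_users.csv', index=False)
print(df.dtypes)

Use Cases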
- test data generation
- demo datasets
- ML training fixtures
Related Snippets
Similar patterns you can reuse in the same workflow.
python · intermediate
Pytest Fixtures and Parametrize
Reusable pytest fixtures with scope control, parametrize for data-driven tests, and temporary resources.
Best for: Unit test setup
#pytest #testing
python · intermediate
Data Quality Testing with Expectations
Define and run data quality expectations for automated validation in data pipelines.
Best for: Automated data quality gates in pipelines
#data-quality #testing
sql · intermediate
SQL Data Quality Checks and Assertions
Reusable SQL queries for data quality: null checks, uniqueness, referential integrity, and freshness.
Best for: Automated data quality gates in ETL pipelines
#sql #data-quality
sql · beginner
dbt Source Freshness and Testing
Configure dbt source freshness checks and schema tests to validate upstream data pipelines.
Best for: Ensuring upstream data sources are fresh
#dbt #testing