python · beginner

Generate Synthetic Data with Faker

Create realistic test datasets for development and testing using the Faker library.

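from faker import Faker
import pandas as pd
import random

# Named 'fake' so the instance is not confused with the faker package.
fake = Faker()
# Seed both generators so names, emails, and scores repeat across runs.
Faker.seed(42)   # class-level seed shared by all Faker instances
random.seed(42)  # seeds the stdlib generator used for 'score'

rows = [
    {
        'id':         i + 1,
        'name':       fake.name(),
        'email':      fake.email(),
        # Faker addresses span multiple lines; flatten them for CSV.
        'address':    fake.address().replace('\n', ', '),
        'company':    fake.company(),
        # The range is relative to the current time, so these values
        # can differ between runs even with a fixed seed.
        'created_at': fake.date_time_between('-2y', 'now').isoformat(),
        'score':      round(random.uniform(0, 100), 2),
    }
    for i in range(1000)
]
df = pd.DataFrame(rows)
df.to_csv('synthetic_users.csv', index=False)
# created_at is stored as an ISO 8601 string, not a datetime.
print(df.dtypes)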

Use Cases

test data generation
demo datasets
ML training fixtures
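
For any of these uses, a quick sanity check on the generated rows can catch surprises before the data reaches a test or demo. The sketch below is one possible check, not a full validation framework. It assumes the column names from the snippet above; validate_rows and the sample rows are hypothetical and use placeholder values rather than real Faker output.

```python
EXPECTED_COLUMNS = {
    'id', 'name', 'email', 'address', 'company', 'created_at', 'score',
}


def validate_rows(rows):
    """Return a list of problems found in rows (empty if none were found)."""
    problems = []
    # ids are expected to be unique across the dataset.
    ids = [row.get('id') for row in rows]
    if len(set(ids)) != len(ids):
        problems.append('duplicate ids')
    for row in rows:
        if set(row) != EXPECTED_COLUMNS:
            problems.append(f"row {row.get('id')}: unexpected columns")
        elif not 0 <= row['score'] <= 100:
            # The snippet draws scores uniformly from 0 to 100.
            problems.append(f"row {row['id']}: score out of range")
    return problems


# Hand-written placeholder rows for illustration only.
good_row = {
    'id': 1, 'name': 'A', 'email': 'a@example.com', 'address': 'X',
    'company': 'Y', 'created_at': '2024-01-01T00:00:00', 'score': 50.0,
}
bad_row = dict(good_row, score=150.0)

print(validate_rows([good_row]))
print(validate_rows([bad_row]))
```

With the snippet's own data, the same check could be run as validate_rows(rows) before writing the CSV.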

Tags

#faker #testing #synthetic-data #fixtures
