Python · Beginner

Read NDJSON / JSON Lines Files

Read newline-delimited JSON (NDJSON, also known as JSON Lines) log files into a pandas DataFrame, either in one call or in chunks when the file is too large to load at once.

python
import pandas as pd

# Read NDJSON (one JSON object per line)
df = pd.read_json('events.ndjson', lines=True)

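# Streaming (for large files): with chunksize, read_json returns an iterator
# of DataFrames instead of loading the whole file at once.
# The reader works as a context manager in pandas 1.2 and later.
results = []
with pd.read_json('large_events.ndjson', lines=True, chunksize=50_000) as reader:
    for chunk in reader:
        # Keep only the purchase events from this chunk
        results.append(chunk[chunk['action'] == 'purchase'])

# Combine the filtered chunks into a single DataFrame
filtered = pd.concat(results, ignore_index=True)
print(f'Purchase events: {len(filtered):,}')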
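
The snippet above expects files on disk. To try the chunked pattern without them, here is a minimal sketch that reads a small in-memory sample instead. The sample records and the `user` field are made up for illustration, and the tiny `chunksize=2` is only there to force several chunks. It also keeps a running count rather than collecting rows, which may be enough when only a total is needed.

```python
import io

import pandas as pd

# Hypothetical sample data: one JSON object per line.
sample = "\n".join([
    '{"user": "a", "action": "view"}',
    '{"user": "a", "action": "purchase"}',
    '{"user": "b", "action": "purchase"}',
    '{"user": "c", "action": "view"}',
    '{"user": "c", "action": "purchase"}',
])

purchase_count = 0
# chunksize=2 yields DataFrames of at most two rows each.
with pd.read_json(io.StringIO(sample), lines=True, chunksize=2) as reader:
    for chunk in reader:
        # Add this chunk's purchases to the running total.
        purchase_count += int((chunk["action"] == "purchase").sum())

print(f"Purchase events: {purchase_count:,}")
```

Counting per chunk means no filtered rows are held in memory, at the cost of not having the rows available afterwards.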

Use Cases

  • log file ingestion
  • event stream loading
  • NDJSON ETL
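
For log file ingestion, real files sometimes contain blank or truncated lines, and pandas will generally fail on a line that is not valid JSON. One possible workaround, sketched here with only the standard library, is to pre-filter the lines before building a DataFrame. The helper name `iter_ndjson` and the sample lines are hypothetical, and silently dropping bad lines is an assumption that may not suit every pipeline.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per valid JSON object line; skip blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank line
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: dropping unparseable lines is acceptable
        if isinstance(record, dict):
            yield record


# Hypothetical input; an open file object can be passed the same way.
raw = [
    '{"action": "purchase", "amount": 10}\n',
    '\n',
    '{"action": "view"\n',  # truncated line, skipped
    '{"action": "view"}\n',
]
records = list(iter_ndjson(raw))
print(len(records))
```

The resulting list of dicts can then be passed to `pd.DataFrame(records)`.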
