
python · beginner

Local LLM with Ollama Python Client

Run local open-source models with Ollama from Python: make a synchronous chat call, stream a response chunk by chunk, and generate embeddings locally.


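# Assumes the Ollama server is running locally and the models used below
# have already been pulled (e.g. `ollama pull llama3.2`).
import ollama
from ollama import Client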

# Synchronous call through an explicit client bound to the local server
client = Client(host='http://localhost:11434')
response = client.chat(model='llama3.2', messages=[{'role':'user','content':'What is machine learning?'}])
print(response['message']['content'])

# Streaming: with stream=True the call yields partial-response chunks as they arrive
print('\nStreaming response:')
for chunk in ollama.chat(model='llama3.2', messages=[{'role':'user','content':'Write a haiku about Python.'}], stream=True):
    print(chunk['message']['content'], end='', flush=True)
print()

# Generate embeddings locally
embeddings = ollama.embeddings(model='nomic-embed-text', prompt='Python is great for data science.')
print(f'Embedding dim: {len(embeddings["embedding"])}')
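The snippet above prints streamed chunks as they arrive and only reports the embedding dimension. In practice you often also want the full streamed text as one string, and a way to compare two embeddings. The following is a minimal sketch of both, not part of the Ollama client: `collect_stream` and `cosine_similarity` are hypothetical helper names, and the chunk shape `{'message': {'content': ...}}` is assumed from the streaming loop above. The helpers use only the standard library, so they can be tried without a running server.

```python
import math
from typing import Iterable, Mapping, Sequence


def collect_stream(chunks: Iterable[Mapping]) -> str:
    """Join the text pieces of a streamed chat response into one string.

    Assumes each chunk looks like {'message': {'content': '...'}},
    as in the streaming loop above.
    """
    parts = []
    for chunk in chunks:
        parts.append(chunk['message']['content'])
    return ''.join(parts)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Define similarity with a zero vector as 0.0 to avoid dividing by zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == '__main__':
    # Stand-in chunks so the sketch runs without an Ollama server.
    fake_chunks = [
        {'message': {'content': 'Hello'}},
        {'message': {'content': ' world'}},
    ]
    print(collect_stream(fake_chunks))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

With a live server, the same helpers could be fed real data, for example `collect_stream(ollama.chat(model='llama3.2', messages=[...], stream=True))`, or `cosine_similarity` applied to the `'embedding'` lists returned by two `ollama.embeddings(...)` calls.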

Use Cases

local AI
private inference
offline LLM apps

Tags

#ollama #local-llm #streaming #python
