python · beginner
Local LLM with Ollama Python Client
Run local open-source models with Ollama and stream responses using the Python API.
from ollama import Client
import ollama
# Synchronous call
client = Client(host='http://localhost:11434')
response = client.chat(model='llama3.2', messages=[{'role':'user','content':'What is machine learning?'}])
print(response['message']['content'])
# Streaming
print('\nStreaming response:')
for chunk in ollama.chat(model='llama3.2', messages=[{'role':'user','content':'Write a haiku about Python.'}], stream=True):
    print(chunk['message']['content'], end='', flush=True)
print()
# Generate embeddings locally
embeddings = ollama.embeddings(model='nomic-embed-text', prompt='Python is great for data science.')
print(f'Embedding dim: {len(embeddings["embedding"])}')
Use Cases
- local AI
- private inference
- offline LLM apps
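The streaming loop in the snippet prints each chunk and then discards it. When the full reply is also needed afterwards (for logging, or to append to a chat history), one option is to accumulate the pieces while printing them. The sketch below is a minimal illustration, not part of the Ollama client: `collect_stream` and `stream_reply` are hypothetical helper names, and the code assumes each chunk exposes its text at `chunk['message']['content']`, as in the snippet above.

```python
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any], echo: bool = True) -> str:
    """Optionally print each streamed piece as it arrives; return the full reply."""
    parts = []
    for chunk in chunks:
        # Assumption: same chunk layout as in the streaming loop of the snippet.
        piece = chunk['message']['content']
        if echo:
            print(piece, end='', flush=True)
        parts.append(piece)
    if echo:
        print()  # finish the line once the stream ends
    return ''.join(parts)


def stream_reply(prompt: str, model: str = 'llama3.2') -> str:
    """Hypothetical wrapper; needs the ollama package and a running local server."""
    import ollama  # imported lazily so collect_stream can be used without it

    stream = ollama.chat(
        model=model,
        messages=[{'role': 'user', 'content': prompt}],
        stream=True,
    )
    return collect_stream(stream)
```

Calling `stream_reply('Write a haiku about Python.')` would print the reply as it streams and return the same text as one string, provided the model has already been pulled locally.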