python · intermediate
Cache Embeddings in Redis
Cache the results of embedding API calls in Redis, keyed by a hash of the model name and input text, so repeated inputs skip the API call and its cost.
import hashlib
import json

import redis
from openai import OpenAI

oai = OpenAI()  # reads OPENAI_API_KEY from the environment
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
TTL = 86_400  # cache lifetime in seconds (1 day)


def get_embedding(text: str, model: str = 'text-embedding-3-small') -> list[float]:
    # Key on model + text so a different model never reuses another model's vector.
    key = 'emb:' + hashlib.sha256(f'{model}:{text}'.encode()).hexdigest()[:16]
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: skip the API call
    resp = oai.embeddings.create(input=text, model=model)
    emb = resp.data[0].embedding
    r.setex(key, TTL, json.dumps(emb))  # store as JSON with an expiry
    return emb


v = get_embedding('Hello, semantic world!')
print(f'Embedding dim: {len(v)}')

Use Cases
- cost reduction
- embedding caching
- semantic search optimization
Related Snippets
Similar patterns you can reuse in the same workflow.
typescript · beginner
Generate Text Embeddings with OpenAI
Create vector embeddings for semantic search and similarity matching using text-embedding-3-small.
Best for: semantic search
#openai #embeddings
typescript · advanced
RAG Pipeline (Retrieve + Augment + Generate)
Minimal RAG implementation: embed a query, retrieve top-k chunks, inject into prompt.
Best for: document Q&A
#rag #embeddings
typescript · advanced
Semantic Caching Layer for LLM Calls
Cache LLM responses by semantic similarity of prompts to reduce API costs and improve latency.
Best for: reducing LLM API costs for repeated queries
#caching #embeddings
typescript · intermediate
Batch Embeddings Processing
Generate embeddings for large document sets in batches with rate limiting and progress tracking.
Best for: indexing large document collections for search
#embeddings #batch-processing