Python · Intermediate
Semantic Similarity Search with Embeddings
Compute and compare text embeddings for semantic search and matching.
import numpy as np
from openai import OpenAI
client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
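def get_embeddings(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts; returns one row per input text."""
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return np.array([e.embedding for e in response.data])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    # Cast so the return value matches the annotated type (NumPy returns np.float64).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))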
# Build document index
documents = [
    "Python is a programming language",
    "Machine learning uses statistical models",
    "React is a JavaScript UI library",
    "Neural networks are inspired by the brain",
    "CSS is used for styling web pages"
]
doc_embeddings = get_embeddings(documents)
# Search
query = "deep learning frameworks"
query_embedding = get_embeddings([query])[0]
# Rank by similarity
scores = [
    cosine_similarity(query_embedding, doc_emb)
    for doc_emb in doc_embeddings
]
results = sorted(zip(scores, documents), reverse=True)
for score, doc in results[:3]:
    print(f" {score:.3f}: {doc}")

Use Cases
- Semantic search
- Document matching
- Recommendation systems
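
For larger document sets, the per-document loop in the snippet can be replaced by a single matrix product. The following is a minimal sketch, not part of the original snippet: the helper name `top_k` and the toy 3-dimensional vectors are hypothetical stand-ins for real embeddings, chosen so the example runs without an API call.

```python
import numpy as np


def top_k(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 3) -> list[tuple[int, float]]:
    """Return (row index, cosine similarity) pairs for the k most similar rows."""
    # Normalize once so that a plain dot product equals cosine similarity.
    doc_unit = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    query_unit = query_vec / np.linalg.norm(query_vec)
    scores = doc_unit @ query_unit
    # argsort is ascending, so reverse it to put the best matches first.
    order = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in order]


# Toy vectors standing in for real embeddings (illustrative only).
toy_docs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
])
toy_query = np.array([1.0, 0.1, 0.0])

for idx, score in top_k(toy_query, toy_docs, k=2):
    print(f"doc {idx}: {score:.3f}")
```

With real data, `doc_matrix` would be the array returned by `get_embeddings(documents)` and `query_vec` a single row from it. The same function could serve document matching or recommendations by passing an item's own embedding as the query.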