pythonbeginner
OpenAI GPT-4 Vision Image Analysis
Analyse images from URLs or base64 with GPT-4 Vision for structured visual understanding.
pythonPress ⌘/Ctrl + Shift + C to copy
from openai import OpenAI
import base64
from pathlib import Path
client = OpenAI()
# Analyse from URL
response_url = client.chat.completions.create(
model='gpt-4o-mini',
messages=[{'role':'user','content':[{'type':'text','text':'What is in this chart? Summarise key metrics.'},{'type':'image_url','image_url':{'url':'https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Image_created_with_a_mobile_phone.png/640px-Image_created_with_a_mobile_phone.png','detail':'high'}}]}],
max_tokens=300,
)
print(response_url.choices[0].message.content)
# Analyse local file
img_b64 = base64.b64encode(Path('chart.png').read_bytes()).decode()
response_local = client.chat.completions.create(
model='gpt-4o-mini',
messages=[{'role':'user','content':[{'type':'text','text':'Describe this image.'},{'type':'image_url','image_url':{'url':f'data:image/png;base64,{img_b64}'}}]}],
max_tokens=200,
)
print(response_local.choices[0].message.content)Use Cases
- visual Q&A
- chart analysis
- product image understanding
Tags
Related Snippets
Similar patterns you can reuse in the same workflow.
pythonintermediate
Analyze Images with GPT Vision API
Send images to GPT-4o for description, analysis, and visual Q&A.
Best for: Image analysis
#ai#vision
typescriptintermediate
OpenAI Vision API Image Analysis
Analyze images using GPT-4o vision capabilities with base64 and URL inputs.
Best for: image captioning
#ai#vision
pythonadvanced
Multimodal RAG with Images and Text
Build a multimodal RAG pipeline that retrieves and answers questions about image+text documents.
Best for: visual document Q&A
#multimodal#rag
typescriptintermediate
OpenAI Chat Completion with Streaming
Stream GPT responses token-by-token using the OpenAI SDK with async iteration.
Best for: chatbot UI
#openai#streaming