# Whisper Audio Transcription

Transcribe audio files to text using the OpenAI Whisper API, with automatic language detection and segment-level timestamps.
```typescript
import OpenAI from 'openai';
import * as fs from 'fs';

// Reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();

export async function transcribe(filePath: string) {
  const file = fs.createReadStream(filePath);

  const transcription = await openai.audio.transcriptions.create({
    file,
    model: 'whisper-1',
    // verbose_json is required to receive language, duration, and segments.
    response_format: 'verbose_json',
    timestamp_granularities: ['segment'],
  });

  return {
    text: transcription.text,
    language: transcription.language, // language detected by Whisper
    duration: transcription.duration, // audio length in seconds
    segments: transcription.segments?.map((s) => ({
      start: s.start, // segment start, in seconds
      end: s.end,
      text: s.text,
    })),
  };
}

// Usage:
// const result = await transcribe('./podcast-episode.mp3');
// console.log(result.text);
// result.segments?.forEach(s => console.log(`[${s.start}s] ${s.text}`));
```

## Use Cases
- Podcast transcription
- Meeting notes
- Voice command processing
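For the subtitle-style use cases above, the segment timestamps can be turned directly into SRT subtitle text. Below is a minimal sketch; the `Segment` interface mirrors the shape returned by `transcribe()`, and the `segmentsToSrt` and `toTimestamp` names are illustrative helpers, not part of the OpenAI SDK.

```typescript
// Shape of one entry in the `segments` array returned by transcribe().
interface Segment {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function toTimestamp(seconds: number): string {
  const ms = Math.round(seconds * 1000);
  const pad = (n: number, w = 2) => String(n).padStart(w, '0');
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms % 1000, 3)}`;
}

// Render segments as SRT: a numbered cue, a time range, then the text.
export function segmentsToSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toTimestamp(seg.start)} --> ${toTimestamp(seg.end)}\n${seg.text.trim()}\n`,
    )
    .join('\n');
}
```

Writing the result to a `.srt` file next to the source audio gives subtitles most video players can load directly.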
## Related Snippets

Similar patterns you can reuse in the same workflow:

- **OpenAI Text-to-Speech**: generate natural speech audio from text using the OpenAI TTS API, with multiple voice options and formats.
- **OpenAI Chat Completion with Streaming**: stream GPT responses token-by-token using the OpenAI SDK with async iteration.
- **Generate Text Embeddings with OpenAI**: create vector embeddings for semantic search and similarity matching using text-embedding-3-small.
- **RAG Pipeline (Retrieve + Augment + Generate)**: a minimal RAG implementation that embeds a query, retrieves the top-k chunks, and injects them into the prompt.