Overview
Embeddings are numerical vector representations of text, images, or other data, arranged so that items with similar meaning have nearby vectors. Use them for semantic search, similarity matching, and building AI applications.
Concepts
- Embeddings: Fixed-size vectors representing meaning
- Vector Databases: Specialized databases for vector search
- Similarity Search: Find semantically similar items
- Semantic Search: Search by meaning rather than keywords
- Recommendations: Suggest relevant items based on similarity
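To make "similarity" concrete: a common way to compare two embeddings is cosine similarity, and a similarity search can be thought of as ranking stored vectors by that score. The following is a minimal in-memory sketch, not how a vector database is implemented internally. The names `cosineSimilarity`, `topKSimilar`, and `Item`, and the toy 3-dimensional vectors, are made up for illustration.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Result lies between -1 and 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // Convention chosen for this sketch: a zero vector is similar to nothing
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical record shape for this example
interface Item {
  id: string
  vector: number[]
}

// Brute-force search: score every item, sort descending, keep the first k
function topKSimilar(query: number[], items: Item[], k: number): { id: string; score: number }[] {
  return items
    .map(item => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
}

// Toy vectors, not real embeddings
const items: Item[] = [
  { id: 'cat', vector: [1, 0, 0] },
  { id: 'kitten', vector: [0.9, 0.1, 0] },
  { id: 'car', vector: [0, 0, 1] }
]
const nearest = topKSimilar([1, 0, 0], items, 2)
console.log(nearest.map(r => r.id)) // [ 'cat', 'kitten' ]
```

Scanning every item like this is fine for small collections; vector databases exist to make the same kind of query fast over much larger ones.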
Generate Embeddings
npm install openai
import OpenAI from 'openai'
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
})
async function getEmbedding(text: string) {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  })
  return response.data[0].embedding
}
// Usage
const embedding = await getEmbedding('Hello world')
console.log(embedding.length) // 1536 dimensions
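The embeddings endpoint also accepts an array of strings as `input` and returns one embedding per string, so many texts can be embedded in fewer requests. The sketch below shows one possible batching helper. The names `chunk`, `embedInBatches`, `EmbedBatchFn`, and `fakeEmbedBatch`, and the default batch size of 100, are assumptions for illustration. The embedder is passed in as a function so the example runs without an API key.

```typescript
// Hypothetical signature: takes a batch of texts, returns one vector per text
type EmbedBatchFn = (texts: string[]) => Promise<number[][]>

// Split a list into consecutive slices of at most `size` elements
function chunk<T>(list: T[], size: number): T[][] {
  if (size <= 0) throw new Error('size must be positive')
  const out: T[][] = []
  for (let i = 0; i < list.length; i += size) {
    out.push(list.slice(i, i + size))
  }
  return out
}

// Embed texts batch by batch, preserving input order
async function embedInBatches(
  texts: string[],
  embedBatch: EmbedBatchFn,
  batchSize: number = 100 // illustrative default, not an API limit
): Promise<number[][]> {
  const vectors: number[][] = []
  for (const batch of chunk(texts, batchSize)) {
    const batchVectors = await embedBatch(batch)
    if (batchVectors.length !== batch.length) {
      throw new Error('Embedder returned a different number of vectors than inputs')
    }
    vectors.push(...batchVectors)
  }
  return vectors
}

// Stand-in embedder for demonstration only: one number per text (its length)
const fakeEmbedBatch: EmbedBatchFn = async texts => texts.map(t => [t.length])

// With the OpenAI client from this section, an embedder could look like:
// const embedBatch: EmbedBatchFn = async batch =>
//   (await openai.embeddings.create({ model: 'text-embedding-3-small', input: batch }))
//     .data.map(d => d.embedding)
```

Calling `embedInBatches(['a', 'bb', 'ccc'], fakeEmbedBatch, 2)` makes two calls to the embedder (two texts, then one) and resolves to three vectors in the original order.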
Vector Database with Pinecone
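npm install @pinecone-database/pinecone
import { Pinecone } from '@pinecone-database/pinecone'
const pc = new Pinecone({
  // The client expects a string, so PINECONE_API_KEY must be set in the environment
  apiKey: process.env.PINECONE_API_KEY!
})
// Assumes an index named 'documents' already exists with dimension 1536
const index = pc.index('documents')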
// Store embedding
await index.upsert([{
  id: 'doc-1',
  values: embedding,
  metadata: { text: 'Hello world', source: 'greeting' }
}])
// Search similar items
const results = await index.query({
  vector: embedding,
  topK: 5,
  includeMetadata: true
})
console.log(results.matches) // Top 5 similar documents
Semantic Search Implementation
async function semanticSearch(query: string, topK: number = 5) {
  // Embed the query with the same model used for the stored documents
  const queryEmbedding = await getEmbedding(query)
  const results = await index.query({
    vector: queryEmbedding,
    topK,
    includeMetadata: true
  })
  // Keep only the fields the caller needs
  return results.matches.map(match => ({
    text: match.metadata?.text,
    score: match.score,
    source: match.metadata?.source
  }))
}
// Named searchResults to avoid redeclaring the earlier top-level `results`
const searchResults = await semanticSearch('What is machine learning?')
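A `topK` query returns up to K matches even when none of them is a close match, so applications often drop results below a minimum score. The sketch below shows one way to do that. `SearchHit`, `filterByScore`, the sample hits, and the 0.75 threshold are all hypothetical; a suitable threshold depends on the embedding model and the index's distance metric, and the code assumes a metric such as cosine where a higher score means more similar.

```typescript
// Hypothetical shape mirroring the objects returned by semanticSearch
interface SearchHit {
  text?: string
  score?: number
  source?: string
}

// Keep only hits that have a score at or above the threshold
function filterByScore(hits: SearchHit[], minScore: number): SearchHit[] {
  return hits.filter(hit => hit.score !== undefined && hit.score >= minScore)
}

// Made-up sample data, not real query output
const sampleHits: SearchHit[] = [
  { text: 'example A', score: 0.91, source: 'sample' },
  { text: 'example B', score: 0.42, source: 'sample' },
  { text: 'example C', source: 'sample' } // no score: dropped
]
const confidentHits = filterByScore(sampleHits, 0.75)
console.log(confidentHits.map(h => h.text)) // [ 'example A' ]
```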
Embeddings enable semantic understanding without training custom AI models.