Cohere Embed V4: Multilingual Embeddings for Global RAG Systems
Cohere released Embed V4 with support for 100+ languages, improved retrieval accuracy, and reduced dimensionality for faster vector search.

TL;DR
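- Embed V4 supports 100+ languages with a unified embedding space.
- 1024 dimensions (vs 1536 for OpenAI text-embedding-3-small) means 33% fewer values per vector to store and compare.
- Improved accuracy: roughly 12% relative gain on MTEB vs V3 (average score 62.3 to 69.8; retrieval 52.1 to 58.2).
- Pricing: $0.10/million tokens, between OpenAI text-embedding-3-small ($0.02) and text-embedding-3-large ($0.13).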
Cohere launched Embed V4 in November 2024, extending multilingual coverage beyond V3's 100 languages while improving retrieval accuracy and reducing computational overhead. For companies building RAG systems that serve global users, V4 allows a single model to be deployed across markets instead of a separate embedding model per language.
Key improvements
Multilingual coverage
V3: 100 languages (good but gaps in regional languages)
V4: 100+ languages including:
- Major: English, Chinese, Spanish, Arabic, French, German, Japanese
- Regional: Swahili, Bengali, Vietnamese, Thai, Turkish
- Low-resource: Hausa, Zulu, Pashto
Unified embedding space: All languages map to the same 1024-dimensional space, which enables cross-lingual search (for example, query in English and retrieve German documents).
Retrieval accuracy
MTEB (Massive Text Embedding Benchmark) scores:
| Model | Avg score | Retrieval | Classification |
|---|---|---|---|
| Cohere Embed V4 | 69.8% | 58.2% | 78.4% |
| Cohere Embed V3 | 62.3% | 52.1% | 74.2% |
| OpenAI text-embedding-3-small | 62.3% | 49.2% | 70.9% |
| OpenAI text-embedding-3-large | 64.6% | 54.9% | 75.4% |
V4 leads on retrieval tasks, the category most relevant to RAG: 58.2% vs 54.9% for OpenAI text-embedding-3-large and 52.1% for V3.
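The relative gains implied by the table can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the V3 and V4 scores listed above; the function and variable names are illustrative.

```python
# MTEB scores for V3 and V4, copied from the table above (percent).
scores = {
    "Cohere Embed V4": {"avg": 69.8, "retrieval": 58.2, "classification": 78.4},
    "Cohere Embed V3": {"avg": 62.3, "retrieval": 52.1, "classification": 74.2},
}


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Relative gain of V4 over V3 for each column, rounded to one decimal place.
gains = {
    metric: round(
        relative_gain(scores["Cohere Embed V4"][metric],
                      scores["Cohere Embed V3"][metric]),
        1,
    )
    for metric in ("avg", "retrieval", "classification")
}
print(gains)
```

On these numbers the average score rises by about 12% in relative terms (7.5 points), which is presumably where the headline figure comes from. The absolute and relative gains differ, so it helps to state which one is meant.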
Dimensionality reduction
1024 dimensions vs 1536 (OpenAI text-embedding-3-small) and 3072 (OpenAI text-embedding-3-large)
Benefits (relative to 1536 dimensions):
- About 33% fewer operations per vector similarity calculation
- About 33% less vector storage
- Maintains 95%+ of retrieval quality
Trade-off: Slightly less precision for classification tasks (acceptable for most RAG systems).
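The storage and compute figures follow directly from the dimension counts. The sketch below assumes float32 vectors and a hypothetical corpus of 10 million chunks, and it ignores index overhead, which varies by vector database.

```python
BYTES_PER_FLOAT32 = 4


def index_size_gb(num_vectors: int, dims: int) -> float:
    """Raw storage for float32 vectors in GB, ignoring index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9


def reduction_pct(small: int, large: int) -> float:
    """Percentage by which `small` dimensions undercut `large` dimensions."""
    return (1 - small / large) * 100


num_vectors = 10_000_000  # hypothetical corpus size
sizes = {dims: index_size_gb(num_vectors, dims) for dims in (1024, 1536, 3072)}

for dims, gb in sizes.items():
    print(f"{dims} dims: {gb:.2f} GB")
print(f"1024 vs 1536: {reduction_pct(1024, 1536):.1f}% fewer dimensions")
print(f"1024 vs 3072: {reduction_pct(1024, 3072):.1f}% fewer dimensions")
```

A brute-force dot product scales linearly with dimension, so the same percentages apply to per-comparison compute. Approximate nearest-neighbour indexes add their own costs, so end-to-end latency gains may be smaller.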
Implementation
```python
import cohere
from sklearn.metrics.pairwise import cosine_similarity

co = cohere.Client(api_key="...")

# Embed documents (any language)
docs = [
    "AI is transforming healthcare",  # English
    "Die KI verändert das Gesundheitswesen",  # German
    "الذكاء الاصطناعي يحول الرعاية الصحية",  # Arabic
]
doc_embeds = co.embed(
    texts=docs,
    model="embed-v4.0",
    input_type="search_document",
).embeddings

# Embed query (different language OK)
query = "How is AI used in medicine?"
query_embed = co.embed(
    texts=[query],
    model="embed-v4.0",
    input_type="search_query",
).embeddings[0]

# Search across all languages
similarities = cosine_similarity([query_embed], doc_embeds)
# similarities has shape (1, 3): one score per document. All three should
# score high despite the language differences.
```
Use cases
1. Cross-lingual customer support
Index support docs in multiple languages, enable search in user's preferred language.
2. Multilingual knowledge bases
Companies with global teams can search unified knowledge base regardless of document language.
3. International e-commerce
Product search works across localized descriptions (search in English, find products described in Chinese/Spanish).
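All three use cases reduce to the same operation: rank stored document vectors by similarity to a query vector, regardless of the language each document was written in. The sketch below shows that ranking step in plain Python. The 4-dimensional vectors are made-up stand-ins for real embeddings, and `top_k` is an illustrative helper, not part of any SDK. In production a vector database would perform this step.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: list[list[float]], k: int = 2) -> list[tuple[int, float]]:
    """Return (row, similarity) pairs for the k vectors closest to `query`."""
    scored = [(row, cosine(query, vec)) for row, vec in enumerate(index)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy catalog: two descriptions of the same product in different languages,
# plus one unrelated item. Vectors are hypothetical, not real embeddings.
catalog = ["running shoes (en)", "zapatillas para correr (es)", "coffee grinder (en)"]
index = [
    [0.9, 0.1, 0.0, 0.1],
    [0.8, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.3],
]
query = [1.0, 0.1, 0.0, 0.0]  # stand-in for an embedded English query

results = top_k(query, index, k=2)
for row, score in results:
    print(f"{catalog[row]}: {score:.3f}")
```

With these toy vectors the English and Spanish descriptions rank first and second, while the unrelated item falls out of the top two.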
Pricing
| Model | Price ($/M tokens) | Dimensions | Languages |
|---|---|---|---|
| Cohere Embed V4 | $0.10 | 1024 | 100+ |
| Cohere Embed V3 | $0.10 | 1024 | 100 |
| OpenAI text-embedding-3-small | $0.02 | 1536 | ~40 |
| OpenAI text-embedding-3-large | $0.13 | 3072 | ~40 |
Value proposition: Broader multilingual support than OpenAI at a price below text-embedding-3-large, though five times that of text-embedding-3-small.
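To turn the per-token prices into a budget figure, multiply total tokens by the price per million. The sketch below uses the prices from the table above. The corpus size and average document length are hypothetical placeholders.

```python
# USD per million tokens, taken from the pricing table above.
PRICE_PER_M_TOKENS = {
    "Cohere Embed V4": 0.10,
    "OpenAI text-embedding-3-small": 0.02,
    "OpenAI text-embedding-3-large": 0.13,
}


def embedding_cost(num_docs: int, avg_tokens_per_doc: int, price_per_m: float) -> float:
    """One-off cost in USD of embedding a corpus once."""
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_m


num_docs, avg_tokens = 500_000, 400  # hypothetical corpus: 200M tokens in total
costs = {
    model: round(embedding_cost(num_docs, avg_tokens, price), 2)
    for model, price in PRICE_PER_M_TOKENS.items()
}
for model, usd in costs.items():
    print(f"{model}: ${usd:.2f}")
```

At this scale the absolute difference between models is a few tens of dollars, so retrieval quality and language coverage are likely to matter more than embedding price for most teams.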
Migration from V3
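Breaking changes: None at the API level. V4 is a drop-in replacement, but existing V3 vectors must be re-embedded because the two embedding spaces are incompatible.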
Recommended approach:
- Re-embed knowledge base with V4
- Run parallel testing (V3 vs V4 retrieval accuracy)
- Cutover once validated
Timeline: 2-3 days for most applications
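For the parallel-testing step, one common way to compare two indexes is recall@k over a small set of labelled queries. The sketch below is one possible harness, not a prescribed method. The query IDs, document IDs, and ranked results are hypothetical; in practice the rankings would come from querying the V3 and V4 indexes.

```python
def recall_at_k(
    retrieved: dict[str, list[str]],
    relevant: dict[str, set[str]],
    k: int,
) -> float:
    """Mean fraction of relevant documents found in the top-k results per query."""
    per_query = []
    for query_id, relevant_ids in relevant.items():
        top = set(retrieved.get(query_id, [])[:k])
        per_query.append(len(top & relevant_ids) / len(relevant_ids))
    return sum(per_query) / len(per_query)


# Hypothetical ground truth: which documents should be returned for each query.
relevant = {"q1": {"d1", "d4"}, "q2": {"d2"}}

# Hypothetical ranked results from each index for the same queries.
v3_results = {"q1": ["d1", "d3", "d4"], "q2": ["d5", "d2", "d6"]}
v4_results = {"q1": ["d1", "d4", "d3"], "q2": ["d2", "d5", "d6"]}

v3_recall = recall_at_k(v3_results, relevant, k=2)
v4_recall = recall_at_k(v4_results, relevant, k=2)
print(f"V3 recall@2: {v3_recall:.2f}, V4 recall@2: {v4_recall:.2f}")
```

A cutover rule could then be as simple as requiring the V4 recall to match or exceed V3 on a query set that covers every language you serve.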
Test Cohere Embed V4 in the playground with multilingual queries.
FAQs
Can I mix V3 and V4 embeddings?
No. The embedding spaces are incompatible, so you must either migrate fully or maintain separate indexes.
Does it work with pgvector/Pinecone?
Yes. V4 produces standard dense vectors, which are compatible with all major vector databases.
How does cross-lingual retrieval work?
The embeddings are trained on parallel corpora, so semantically similar text in different languages maps to nearby vectors.
Is there a self-hosted option?
No, API-only currently.
Summary
Cohere Embed V4 expands multilingual support to 100+ languages with improved retrieval accuracy and reduced dimensionality. Best for global RAG systems requiring cross-lingual search. OpenAI remains cheaper for English-only use cases.
External references:
- Cohere Embed V4 Announcement – launch post
- MTEB Leaderboard – benchmark results