Cohere Embed V4: Multilingual Embeddings for Global RAG Systems

Cohere released Embed V4 with support for 100+ languages, improved retrieval accuracy, and reduced dimensionality for faster vector search.

Max Beech · Founder · Nov 5, 2025 · 6 min read

TL;DR

Embed V4 supports 100+ languages with unified embedding space.
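1024 dimensions (vs 1536 for OpenAI text-embedding-3-small): 33% fewer values per vector, so roughly a third less similarity computation and storage.
Improved retrieval: about 12% relative gain in average MTEB score over V3 (69.8 vs 62.3).
Pricing: $0.10/million tokens (vs $0.02 for OpenAI text-embedding-3-small and $0.13 for text-embedding-3-large).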

Cohere launched Embed V4 in November 2024, extending multilingual coverage beyond the 100 languages supported by V3 while improving retrieval accuracy and reducing computational overhead. For companies building RAG systems that serve global users, V4 allows a single model to be deployed across markets instead of maintaining language-specific embedding models.

Key improvements

Multilingual coverage

V3: 100 languages (good but gaps in regional languages)

V4: 100+ languages including:

Major: English, Chinese, Spanish, Arabic, French, German, Japanese
Regional: Swahili, Bengali, Vietnamese, Thai, Turkish
Low-resource: Hausa, Zulu, Pashto

Unified embedding space: All languages map to the same 1024-dimensional space, which enables cross-lingual search (for example, query in English and retrieve German documents).

Retrieval accuracy

MTEB (Massive Text Embedding Benchmark) scores:

Model	Avg score	Retrieval	Classification
Cohere Embed V4	69.8%	58.2%	78.4%
Cohere Embed V3	62.3%	52.1%	74.2%
OpenAI text-embedding-3-small	62.3%	49.2%	70.9%
OpenAI text-embedding-3-large	64.6%	54.9%	75.4%

V4 leads on retrieval tasks (RAG use case).
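The "+12%" headline figure can be reproduced from the table as a relative gain rather than a difference in points. The short sketch below only redoes that arithmetic on the scores listed above; the helper name `relative_gain` is illustrative and not part of any library.

```python
# MTEB scores copied from the table above: (average, retrieval, classification).
scores = {
    "Cohere Embed V4": (69.8, 58.2, 78.4),
    "Cohere Embed V3": (62.3, 52.1, 74.2),
}


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


v4 = scores["Cohere Embed V4"]
v3 = scores["Cohere Embed V3"]
columns = ("average", "retrieval", "classification")

# Round to one decimal place for display.
gains = {
    name: round(relative_gain(new, old), 1)
    for name, new, old in zip(columns, v4, v3)
}
print(gains)
# {'average': 12.0, 'retrieval': 11.7, 'classification': 5.7}
```

In absolute terms the average moves by 7.5 points and retrieval by 6.1 points, so it is worth stating which of the two a quoted percentage refers to.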

Dimensionality reduction

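1024 dimensions vs 1536 (OpenAI text-embedding-3-small) and 3072 (OpenAI text-embedding-3-large)

Benefits:

About 33% fewer operations per similarity calculation than 1536-dimensional vectors (67% fewer than 3072)
About 33% less vector storage than 1536 dimensions (67% less than 3072)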
Maintains 95%+ of retrieval quality

Trade-off: Slightly less precision for classification tasks (acceptable for most RAG systems).
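To make the storage claim concrete, here is a rough back-of-the-envelope sketch. It assumes vectors are stored as 32-bit floats and ignores index overhead; the corpus size of 10 million vectors is a hypothetical figure chosen only for illustration.

```python
BYTES_PER_FLOAT32 = 4  # assumption: embeddings stored as float32


def index_size_gb(num_vectors: int, dims: int) -> float:
    """Raw vector storage in GB (10**9 bytes), ignoring index overhead."""
    return num_vectors * dims * BYTES_PER_FLOAT32 / 1e9


def reduction_pct(small: int, large: int) -> float:
    """Percentage saved by using `small` dimensions instead of `large`."""
    return round((1 - small / large) * 100, 1)


num_vectors = 10_000_000  # hypothetical corpus size
for dims in (1024, 1536, 3072):
    print(f"{dims} dims: {index_size_gb(num_vectors, dims):.2f} GB")
# 1024 dims: 40.96 GB
# 1536 dims: 61.44 GB
# 3072 dims: 122.88 GB

print(reduction_pct(1024, 1536))  # 33.3
print(reduction_pct(1024, 3072))  # 66.7
```

The same ratios apply to the multiply-add count of a dot product, which is where the "33% faster" estimate comes from; real query latency also depends on the index type and hardware.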

Implementation

import cohere
from sklearn.metrics.pairwise import cosine_similarity  # used in the search step below

co = cohere.Client(api_key="...")

# Embed documents (any language)
docs = [
    "AI is transforming healthcare",  # English
    "Die KI verändert das Gesundheitswesen",  # German
    "الذكاء الاصطناعي يحول الرعاية الصحية"  # Arabic
]

doc_embeds = co.embed(
    texts=docs,
    model="embed-v4",
    input_type="search_document"
).embeddings

# Embed query (different language OK)
query = "How is AI used in medicine?"
query_embed = co.embed(
    texts=[query],
    model="embed-v4",
    input_type="search_query"
).embeddings[0]

# Score every document against the query, regardless of language
similarities = cosine_similarity([query_embed], doc_embeds)  # shape (1, 3): one score per document
# All three documents cover the same topic, so each is expected to score high
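To see the ranking step in isolation, without an API key or any third-party package, the sketch below implements cosine similarity in plain Python and ranks a few toy vectors. The 3-dimensional vectors are made-up stand-ins for real 1024-dimensional embeddings, and the function names `cosine` and `rank` are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy vectors (hypothetical): the first two point in nearly the same direction
# as the query, the third does not.
toy_docs = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]
toy_query = [1.0, 0.0, 0.0]

ranking = rank(toy_query, toy_docs)
for doc_index, score in ranking:
    print(doc_index, round(score, 3))
```

In a cross-lingual setup the document vectors would come from texts in different languages; the ranking logic itself does not change.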

Use cases

1. Cross-lingual customer support

Index support docs in multiple languages and let users search in their preferred language.

2. Multilingual knowledge bases

Companies with global teams can search a unified knowledge base regardless of document language.

3. International e-commerce

Product search works across localized descriptions (search in English, find products described in Chinese/Spanish).

Pricing

Model	Price ($/M tokens)	Dimensions	Languages
Cohere Embed V4	$0.10	1024	100+
Cohere Embed V3	$0.10	1024	100
OpenAI text-embedding-3-small	$0.02	1536	~40
OpenAI text-embedding-3-large	$0.13	3072	~40

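Value proposition: Broader multilingual support than OpenAI (100+ languages vs ~40) at a lower price than text-embedding-3-large ($0.10 vs $0.13 per million tokens), though at five times the price of text-embedding-3-small.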
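As a rough illustration of what these prices mean for a one-off indexing job, the sketch below multiplies the per-million-token prices from the table by a corpus size. The 500 million token corpus is a hypothetical figure, and the calculation ignores query-time embedding and re-indexing costs.

```python
# Prices in USD per million tokens, copied from the table above.
PRICE_PER_MILLION_TOKENS = {
    "Cohere Embed V4": 0.10,
    "OpenAI text-embedding-3-small": 0.02,
    "OpenAI text-embedding-3-large": 0.13,
}


def embedding_cost(total_tokens: int, price_per_million: float) -> float:
    """Cost in USD to embed `total_tokens` tokens once."""
    return total_tokens / 1_000_000 * price_per_million


corpus_tokens = 500_000_000  # hypothetical corpus size
for model, price in PRICE_PER_MILLION_TOKENS.items():
    print(f"{model}: ${embedding_cost(corpus_tokens, price):.2f}")
# Cohere Embed V4: $50.00
# OpenAI text-embedding-3-small: $10.00
# OpenAI text-embedding-3-large: $65.00
```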

Migration from V3

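Breaking changes: None at the API level. V4 is a drop-in replacement in code, but V3 and V4 vectors are not interchangeable, so existing documents must be re-embedded.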

Recommended approach:

Re-embed knowledge base with V4
Run parallel testing (V3 vs V4 retrieval accuracy)
Cutover once validated

Timeline: 2-3 days for most applications
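The parallel-testing step can be as simple as comparing recall@k on a small set of labeled queries. The sketch below is one possible harness, not Cohere's recommended procedure; the query IDs, document IDs, and result lists are invented placeholders for what the V3 and V4 indexes would actually return.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant doc IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    hits = set(retrieved[:k]) & set(relevant)
    return len(hits) / len(relevant)


def mean_recall(results, labels, k):
    """Average recall@k over all labeled queries."""
    total = sum(recall_at_k(results[q], labels[q], k) for q in labels)
    return total / len(labels)


# Hypothetical ground truth: query ID -> relevant document IDs.
labels = {"q1": ["d1", "d2"], "q2": ["d3"]}

# Hypothetical ranked results from each index for the same queries.
v3_results = {"q1": ["d1", "d9", "d2"], "q2": ["d7", "d3", "d8"]}
v4_results = {"q1": ["d1", "d2", "d9"], "q2": ["d3", "d7", "d8"]}

k = 2
print("V3 recall@2:", mean_recall(v3_results, labels, k))  # 0.75
print("V4 recall@2:", mean_recall(v4_results, labels, k))  # 1.0
```

A cutover decision would normally rest on a few hundred real queries per language rather than a toy set like this one.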

Test Cohere Embed V4 in the playground with multilingual queries.

FAQs

Can I mix V3 and V4 embeddings?

No. The two models produce incompatible embedding spaces, so similarity scores between a V3 vector and a V4 vector are meaningless. Either migrate fully or maintain separate indexes.
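Because the pricing table lists both V3 and V4 at 1024 dimensions, a dimension check alone would not catch accidental mixing. One way to guard against it is to tag each index with the model that produced its vectors. The sketch below is a toy in-memory illustration; the class `VersionedIndex` and the labels `"embed-v3"` and `"embed-v4"` are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedIndex:
    """Toy in-memory index that only accepts vectors from one embedding model."""
    model: str
    dims: int
    vectors: dict = field(default_factory=dict)

    def add(self, doc_id: str, vector: list, model: str) -> None:
        # Reject vectors produced by a different model, even if the size matches.
        if model != self.model:
            raise ValueError(f"index holds {self.model} vectors, got {model}")
        if len(vector) != self.dims:
            raise ValueError(f"expected {self.dims} dimensions, got {len(vector)}")
        self.vectors[doc_id] = vector


# Separate indexes during migration (model labels are hypothetical tags).
v3_index = VersionedIndex(model="embed-v3", dims=1024)
v4_index = VersionedIndex(model="embed-v4", dims=1024)

v4_index.add("doc-1", [0.0] * 1024, model="embed-v4")  # accepted

try:
    v4_index.add("doc-2", [0.0] * 1024, model="embed-v3")  # same size, wrong model
except ValueError as err:
    print(err)  # index holds embed-v4 vectors, got embed-v3
```

In a real vector database the same idea is usually expressed as a separate collection or namespace per model, or as a metadata field that every query filters on.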

Does it work with pgvector/Pinecone?

Yes, standard dense vectors compatible with all major vector databases.
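For pgvector, the setup amounts to a `vector(1024)` column and an `ORDER BY` on a distance operator. The sketch below only builds the SQL and the vector literal as strings and does not connect to a database. The table name `docs`, its columns, and the `%s` placeholder style (as used by psycopg) are assumptions for illustration; `<=>` is pgvector's cosine distance operator.

```python
def to_pgvector_literal(vector):
    """Format a list of numbers as pgvector's text input, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vector) + "]"


# Hypothetical schema: one row per document, 1024-dimensional embedding column.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS docs (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(1024)
);
"""

# Nearest neighbours by cosine distance; the parameter is a vector literal.
SEARCH_SQL = """
SELECT id, content
FROM docs
ORDER BY embedding <=> %s::vector
LIMIT 5;
"""

literal = to_pgvector_literal([0.1, 0.2, 0.3])
print(literal)  # [0.1,0.2,0.3]
```

With a database driver, the query embedding would be passed as the single parameter of `SEARCH_SQL` after formatting it with `to_pgvector_literal`.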

How does cross-lingual retrieval work?

The embeddings are trained on parallel corpora, so semantically similar text in different languages maps to nearby vectors.

Is there a self-hosted option?

No. It is currently available only through the API.

Summary

Cohere Embed V4 expands multilingual support to 100+ languages with improved retrieval accuracy and reduced dimensionality. Best for global RAG systems requiring cross-lingual search. OpenAI remains cheaper for English-only use cases.

External references:

Cohere Embed V4 Announcement – launch post
MTEB Leaderboard – benchmark results

Crosslinks:

See also /blog/pinecone-vs-weaviate-vs-qdrant-vector-databases
