Mistral Embed
Mistral Embed is Mistral AI's general-purpose text embedding model. It produces 1024-dimensional vectors for semantic search and retrieval tasks and scores 55.26 on the MTEB benchmark.
import { embed } from 'ai';

const result = await embed({
  model: 'mistral/mistral-embed',
  value: 'Sunny day at the beach',
});

Providers
Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
About Mistral Embed
Mistral Embed launched alongside La Plateforme as Mistral AI's retrieval-focused embedding endpoint. Mistral Embed produces 1024-dimensional vector representations and scores 55.26 on the Massive Text Embedding Benchmark (MTEB), a standard evaluation suite for embedding model quality.
The embedding space preserves semantic similarity for nearest-neighbor retrieval. Documents with similar meaning cluster closely, while semantically distinct texts land farther apart in the vector space.
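As a rough illustration of nearest-neighbor retrieval, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors are made-up stand-ins for real 1024-dimensional embeddings, and `cosine` and `rankBySimilarity` are helpers written here for illustration only.

```typescript
// Toy vectors stand in for real 1024-dimensional embeddings (assumption).
type Doc = { id: string; embedding: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return documents sorted from most to least similar to the query.
function rankBySimilarity(query: number[], docs: Doc[]): Array<Doc & { score: number }> {
  return docs
    .map((doc) => ({ ...doc, score: cosine(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score);
}

const docs: Doc[] = [
  { id: 'invoice', embedding: [0.0, 0.2, 0.9] },
  { id: 'beach', embedding: [0.9, 0.1, 0.0] },
];
const ranked = rankBySimilarity([1, 0, 0], docs);
console.log(ranked.map((d) => d.id)); // most similar document first
```

In a real pipeline the query and document vectors would both come from the same embedding model, since similarity scores are only meaningful within one embedding space.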
Mistral Embed integrates into retrieval-augmented generation (RAG) architectures where a Mistral AI generation model handles question answering and Mistral Embed indexes the knowledge base. Using the same provider ecosystem for both embedding and generation simplifies the stack and keeps provider management consolidated through AI Gateway.
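The division of labor described above might look like the following sketch. It assumes the chunk and question embeddings were already produced by Mistral Embed (for example with `embed` from the `ai` package); the vectors here are invented, unit-length toy values, and `retrieve` and `buildPrompt` are hypothetical helpers. The resulting prompt would then be passed to a generation model, a step omitted here.

```typescript
// A knowledge-base chunk with its precomputed embedding.
interface IndexedChunk {
  text: string;
  embedding: number[]; // invented toy values; real vectors have 1024 dimensions
}

// Dot product; equals cosine similarity only because the toy vectors
// below are unit length (an assumption of this sketch).
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Pick the k chunks closest to the question embedding.
function retrieve(questionEmbedding: number[], index: IndexedChunk[], k: number): IndexedChunk[] {
  return [...index]
    .sort(
      (x, y) =>
        dotProduct(questionEmbedding, y.embedding) -
        dotProduct(questionEmbedding, x.embedding),
    )
    .slice(0, k);
}

// Assemble a grounded prompt for the generation model.
function buildPrompt(question: string, chunks: IndexedChunk[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.text}`).join('\n');
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

const knowledgeBase: IndexedChunk[] = [
  { text: 'Refunds are processed within 5 business days.', embedding: [0.8, 0.6, 0.0] },
  { text: 'Our office is closed on public holidays.', embedding: [0.0, 0.6, 0.8] },
  { text: 'Refund requests require an order number.', embedding: [0.6, 0.8, 0.0] },
];
const questionEmbedding = [1, 0, 0]; // stand-in for the embedded question
const topChunks = retrieve(questionEmbedding, knowledgeBase, 2);
const ragPrompt = buildPrompt('How long do refunds take?', topChunks);
console.log(ragPrompt);
```

Keeping retrieval as a pure function over precomputed vectors makes it easy to swap the in-memory array for a vector database later without touching the prompt-building step.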
What To Consider When Choosing a Provider
- Corpus type: If your corpus is primarily source code rather than natural language, consider Codestral Embed, which was trained specifically on code and outperforms general embedding models on code retrieval benchmarks.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
When to Use Mistral Embed
Best For
- Semantic search: Retrieval over natural-language document collections where lexical search falls short
- RAG pipelines: Pair embedding with Mistral AI generation models
- Document similarity and clustering: Grouping and deduplicating content for organization or analytics
- Recommendation systems: Recommender architectures based on textual content similarity
- Multilingual retrieval: Covering European languages supported by the Mistral AI ecosystem
Consider Alternatives When
- Source code corpus: Use Codestral Embed, which is specialized for code
- Variable-dimension embeddings: You need to reduce the output dimension to cut vector storage costs
- Domain-specific retrieval: Highly specialized text may benefit from fine-tuned embeddings
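To make the storage trade-off concrete, here is a back-of-the-envelope estimate. It assumes each dimension is stored as a 4-byte float32 and ignores index overhead; `estimateStorageBytes` is a hypothetical helper, and the corpus size and the 256-dimension comparison point are illustrative values, not features of Mistral Embed.

```typescript
const BYTES_PER_FLOAT32 = 4; // assumption: vectors are stored as float32

// Raw vector storage, excluding metadata and index overhead.
function estimateStorageBytes(vectorCount: number, dimensions: number): number {
  return vectorCount * dimensions * BYTES_PER_FLOAT32;
}

const MISTRAL_EMBED_DIMENSIONS = 1024; // from the model description
const corpusSize = 1_000_000; // illustrative corpus size

const fullBytes = estimateStorageBytes(corpusSize, MISTRAL_EMBED_DIMENSIONS);
const reducedBytes = estimateStorageBytes(corpusSize, 256); // hypothetical smaller embedding

console.log(`1024 dims: ${(fullBytes / 1e9).toFixed(2)} GB`); // 1024 dims: 4.10 GB
console.log(`256 dims: ${(reducedBytes / 1e9).toFixed(2)} GB`); // 256 dims: 1.02 GB
```

Under these assumptions, one million 1024-dimensional vectors take roughly 4.1 GB of raw storage, which is the cost a variable-dimension model would let you trade against retrieval quality.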
Conclusion
Mistral Embed is a general-purpose retrieval foundation for Mistral-based stacks. Its 1024-dimensional representations and 55.26 MTEB score make it a reasonable fit for teams building semantic search and RAG systems that want to keep their provider footprint within the Mistral AI ecosystem.