voyage-3-large Embedding Model

Advanced text embedding model delivering top-tier general-purpose and multilingual retrieval performance with a 32K-token context length.

Product Description

Overview
Text embedding models convert written content into numerical vectors and form the foundation of semantic search, retrieval systems, and retrieval-augmented generation (RAG), directly influencing retrieval performance. voyage-3-large is a cutting-edge, general-purpose, multilingual embedding model that ranks #1 across eight evaluated domains spanning 100 datasets, including law, finance, and code. It surpasses OpenAI-v3-large and Cohere-v3-English by average margins of 9.74% and 20.71%, respectively. Thanks to Matryoshka Representation Learning and quantization-aware training, the model supports lower-dimensional embeddings and int8 or binary quantization, significantly reducing vector database costs with little loss in retrieval quality. On an ml.g6.xlarge instance, it delivers 90 ms latency for single queries (≤100 tokens) and 12.6M tokens/hour throughput at $0.22 per 1M tokens.
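As a minimal sketch of the retrieval workflow described above, the example below embeds a small corpus and a query with voyage-3-large and ranks documents by similarity. It assumes the voyageai Python client (pip install voyageai) with a VOYAGE_API_KEY set in the environment; the document texts and query are illustrative only.

```python
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

documents = [
    "The statute of limitations for breach of contract varies by state.",
    "Gross margin is revenue minus cost of goods sold, divided by revenue.",
]

# Embed the corpus and the query separately; input_type tells the
# model whether a text is stored content or a search query.
doc_emb = vo.embed(documents, model="voyage-3-large",
                   input_type="document").embeddings
query_emb = vo.embed(["How long do I have to sue over a broken contract?"],
                     model="voyage-3-large",
                     input_type="query").embeddings[0]

# Voyage embeddings are normalized, so the dot product equals the
# cosine similarity; return the highest-scoring document.
scores = [sum(q * d for q, d in zip(query_emb, dv)) for dv in doc_emb]
print(documents[max(range(len(documents)), key=scores.__getitem__)])
```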

Highlights

  • Achieves top performance across 100 datasets in eight domains, outperforming OpenAI-v3-large by 9.74% and Cohere-v3-English by 20.71% on average.

  • Supports 2048-, 1024-, 512-, and 256-dimension embeddings with multiple quantization options: float, int8, uint8, binary, and ubinary (see the sketch after this list).

  • Offers 32K token context length, 90 ms latency (≤100 tokens), and 12.6M tokens/hour throughput at $0.22 per 1M tokens on ml.g6.xlarge.
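To illustrate the storage savings from reduced dimensions and binary quantization, the following sketch requests 512-dimensional, bit-packed binary embeddings and ranks documents by Hamming distance. The output_dimension and output_dtype parameter names are assumptions based on the options listed above; consult the Voyage AI API reference for the exact interface.

```python
import numpy as np
import voyageai

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment

docs = ["First illustrative document.", "Second illustrative document."]

# With output_dtype="ubinary", each uint8 packs 8 binary dimensions,
# so a 512-d vector occupies 64 bytes instead of 2 KB of float32s.
doc_emb = np.array(
    vo.embed(docs, model="voyage-3-large", input_type="document",
             output_dimension=512, output_dtype="ubinary").embeddings,
    dtype=np.uint8,
)
query_emb = np.array(
    vo.embed(["illustrative query"], model="voyage-3-large",
             input_type="query",
             output_dimension=512, output_dtype="ubinary").embeddings[0],
    dtype=np.uint8,
)

# Hamming distance: XOR the packed bytes, then count differing bits.
xor = np.bitwise_xor(doc_emb, query_emb)
distances = np.unpackbits(xor, axis=1).sum(axis=1)
print("closest document:", docs[int(distances.argmin())])
```

Packing 512 binary dimensions into 64 bytes per vector, rather than 2 KB of float32 values, is the mechanism behind the vector-database cost reduction mentioned in the overview.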
