voyage-multimodal-3.5 Embedding Model
Rich multimodal embedding model that can vectorize interleaved text, content-rich images, and video. 32K context length.
Product Description
Overview
Multimodal embedding models are neural networks that transform multiple modalities, such as text, images, and video, into numerical vectors. They are a crucial building block for semantic search/retrieval systems and retrieval-augmented generation (RAG), and they largely determine retrieval quality.
voyage-multimodal-3.5 is a state-of-the-art multimodal embedding model capable of vectorizing not only text, images, and video individually, but also content that interleaves all three modalities. It delivers excellent performance for mixed-modality searches involving text and visual content such as PDF screenshots, figures, tables, videos, and more. Enabled by Matryoshka learning and quantization-aware training, voyage-multimodal-3.5 supports embeddings in 2048, 1024, 512, and 256 dimensions, with multiple quantization options.
Learn more about voyage-multimodal-3.5 here: https://blog.voyageai.com/2026/01/15/voyage-multimodal-3-5
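As a minimal sketch of how the model might be called, the example below uses the official voyageai Python client's multimodal_embed method, which exists for earlier voyage-multimodal models; the model name is taken from this page, and the exact parameters accepted for this release (e.g., video inputs) are assumptions that may differ from the shipped API.

```python
# pip install voyageai pillow
# A sketch only: assumes the existing voyageai client's multimodal_embed
# API also serves voyage-multimodal-3.5.
import voyageai
from PIL import Image

vo = voyageai.Client()  # reads the VOYAGE_API_KEY environment variable

# Each input is an interleaved sequence of text and images.
inputs = [
    ["A figure comparing retrieval quality across models:", Image.open("figure.png")],
    ["Page 3 of the quarterly report:", Image.open("report_p3.png")],
]

result = vo.multimodal_embed(
    inputs=inputs,
    model="voyage-multimodal-3.5",  # model name as given on this page
    input_type="document",          # use "query" when embedding search queries
)
print(len(result.embeddings), len(result.embeddings[0]))  # 2 vectors
```

Embedding documents and queries with the same model, distinguished only by input_type, is the usual pattern for retrieval with Voyage embeddings.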
Highlights
State-of-the-art multimodal embedding model that vectorizes text, images, and video individually, as well as content interleaving all three modalities, with excellent performance on mixed-modality searches over text and visual content such as PDF screenshots, figures, tables, and videos.
Supports embeddings of 2048, 1024, 512, and 256 dimensions and offers multiple quantization options, including float (32-bit floating point), int8 (8-bit signed integer), uint8 (8-bit unsigned integer), binary (bit-packed int8), and ubinary (bit-packed uint8); see the sketch after this list.
32K token context length.
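For illustration, here is a small numpy sketch of what the bit-packed binary formats above imply for client code. The random stand-in vectors and the Hamming-distance convention are assumptions for illustration; only the bit-packing itself (8 single-bit values per int8) is described on this page.

```python
# Working with bit-packed binary embeddings, as a client-side sketch.
import numpy as np

dim = 1024  # one of the supported Matryoshka dimensions

# A 1024-dim binary embedding arrives as 1024 / 8 = 128 bit-packed values.
# Random stand-ins here; real values would come from the embeddings API.
a = np.random.randint(-128, 128, size=dim // 8, dtype=np.int8)
b = np.random.randint(-128, 128, size=dim // 8, dtype=np.int8)

# Recover the 1024 single-bit values (view as uint8 before unpacking).
bits = np.unpackbits(a.view(np.uint8))
assert bits.shape == (dim,)

# Binary embeddings are typically compared by Hamming distance, which can
# be computed directly on the packed bytes via XOR + popcount.
hamming = np.unpackbits((a ^ b).view(np.uint8)).sum()
print(hamming)
```

Binary and ubinary vectors cut storage by 32x relative to float and make distance computation a matter of XOR and popcount, which is why bit-packed outputs are offered alongside the float and 8-bit integer options.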