Preview / closed beta. Interested? Email info@indexify.dev.

Developer Guide

Embedding models & rerank providers

A complete, hands-on guide for developers building agents: start from scratch, get a working integration shipped, then level it up with advanced retrieval, structured document access, and the operational details you’ll want in production.

Embedding models & rerank providers

Embedding models supported by Indexify

Each knowledge base names an embedding model and optional config under settings.index.embeddingConfig. The public API validates model ids and forwards supported options to the upstream provider. Index and query embeddings must stay compatible—switching models generally requires reprocessing. Exact schema: API reference; KB design: Knowledge Base Design.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • How embeddings are configured
    • Provider credentials (API keys, cloud roles) are attached to your account through the product’s account or project settings—not embedded in KB JSON you share in repos.
    • If you omit an override, the platform default documented in the spec applies (commonly openai:text-embedding-3-small—confirm in the API reference).
    • Vectors are tied to model and dimensionality; plan a re-ingest or re-index when you change embedding family.
    • Benchmark openai:text-embedding-3-small against voyage:voyage-3-lite (and others below) on your queries before locking in cost and latency.

OpenAI (openai:text-embedding-3-small, openai:text-embedding-3-large)

Widely used text embeddings with strong general quality; 3-small balances cost and latency, 3-large suits harder retrieval when budget allows.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
  • When to use
    • General English/product docs and mixed corpora; easy if you already use OpenAI.
    • text-embedding-3-large when you need max fidelity and accept higher cost.

Cohere (cohere:embed-v3)

Multilingual embedding model suited to enterprise RAG; pairs naturally with Cohere rerank for a single-vendor retrieval stack.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
    • Vendor: https://docs.cohere.com/docs/embeddings
    • Indexify config (optional): input_type or taskType (search_document | search_query | classification | clustering; input_type also allows image), truncate (NONE | START | END).
  • When to use
    • Multilingual and enterprise RAG; pairs with Cohere rerank on one stack.

Google (google:text-embedding-004)

Gemini / Google AI embedding API with task-type hints and optional output dimensionality for efficiency.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
  • When to use
    • Google AI–standard stacks; code corpora with CODE_RETRIEVAL_QUERY; smaller vectors via outputDimensionality.

AWS Bedrock (Titan Text Embeddings)

Amazon Titan Embeddings via Bedrock—fits AWS-native compliance and billing boundaries with no extra tunable embedding fields in Indexify.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

Voyage AI (voyage:voyage-3, voyage:voyage-3-lite, voyage:voyage-3.5-lite)

Purpose-built retrieval embeddings with optional dimension and dtype control; strong choice when retrieval quality per dollar is the goal.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
  • When to use
    • Strong quality/$; voyage-3.5-lite / voyage-3-lite for cost/latency; Matryoshka sizing with output_dimension.

Nomic (text embeddings)

Open-weight-style embedding workflow via Nomic’s API; supports Matryoshka dimensionalities for flexible index size.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
  • When to use
    • Research stacks; smaller vector sizes with compatible prefixes; cost-sensitive indexes.

Rerank providers supported by Indexify

Optional second-stage reranking scores each query–chunk pair so you can retrieve broadly then trim. The rerank provider is configured on the KB, not per request. On POST …/search, enable settings.rerank and tune retrieveTopK / rerank.topK—full field list on that route in API reference. Practical flow: Advanced Retrieval. MCP: indexify_search.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Reranking basics
    • Increase retrieveTopK, then enable reranking and set rerank.topK so the model sees fewer but sharper passages.
    • Using the same vendor for embed + rerank can simplify billing and support; still measure on your corpus.
    • Latency, token limits, and pricing follow the upstream provider—consult their docs for capacity planning.

Cohere rerank (provider: "cohere")

Production-grade rerank models with several quality/speed tradeoffs; common default in Indexify samples.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & models
  • When to use
    • General RAG; rerank-multilingual-v3.0 for multilingual; rerank-english-v3.0 for English-heavy; rerank-v3.5 as a balanced default; newer fast and pro model ids when your account supports them and latency budgets are tight.

Voyage AI rerank (provider: "voyage")

Rerank 2.x family tuned for retrieval; optional truncation flag matches Voyage API behavior.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
    • Vendor: https://docs.voyageai.com/docs/reranker
    • Indexify config.model (optional): rerank-2.5, rerank-2.5-lite, rerank-2, rerank-2-lite, rerank-1, rerank-lite-1.
    • Indexify config.truncation (optional boolean) forwarded to Voyage.
  • When to use
    • With Voyage embeddings; long passages where truncation policy matters; dense technical text.

Jina AI rerank (provider: "jina")

Multilingual and ColBERT-style options via Jina’s reranker lineup; supports truncation in Indexify config.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Docs & config
    • Vendor: https://jina.ai/reranker
    • Indexify config.model (optional): jina-reranker-v3, jina-reranker-m0, jina-reranker-v2-base-multilingual, jina-colbert-v2.
    • Indexify config.truncation (optional boolean).
  • When to use
    • Multilingual help; ColBERT-style routing with jina-colbert-v2; jina-reranker-m0 when cost matters and quality fits.