Each knowledge base names an embedding model and optional config under settings.index.embeddingConfig. The public API validates model ids and forwards supported options to the upstream provider. Index and query embeddings must stay compatible—switching models generally requires reprocessing. Exact schema: API reference; KB design: Knowledge Base Design.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- How embeddings are configured
- Provider credentials (API keys, cloud roles) are attached to your account through the product’s account or project settings—not embedded in KB JSON you share in repos.
- If you omit an override, the platform default documented in the spec applies (commonly
openai:text-embedding-3-small—confirm in the API reference). - Vectors are tied to model and dimensionality; plan a re-ingest or re-index when you change embedding family.
- Benchmark
openai:text-embedding-3-smallagainstvoyage:voyage-3-lite(and others below) on your queries before locking in cost and latency.
Widely used text embeddings with strong general quality; 3-small balances cost and latency, 3-large suits harder retrieval when budget allows.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://platform.openai.com/docs/guides/embeddings
- Indexify
config(optional):dimensions≤ 3072; omit for model default.
- When to use
- General English/product docs and mixed corpora; easy if you already use OpenAI.
text-embedding-3-largewhen you need max fidelity and accept higher cost.
Multilingual embedding model suited to enterprise RAG; pairs naturally with Cohere rerank for a single-vendor retrieval stack.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://docs.cohere.com/docs/embeddings
- Indexify
config(optional):input_typeortaskType(search_document|search_query|classification|clustering;input_typealso allowsimage),truncate(NONE|START|END).
- When to use
- Multilingual and enterprise RAG; pairs with Cohere rerank on one stack.
Gemini / Google AI embedding API with task-type hints and optional output dimensionality for efficiency.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://ai.google.dev/gemini-api/docs/embeddings
- Indexify
config(optional):task_type(e.g.RETRIEVAL_QUERY,RETRIEVAL_DOCUMENT,CODE_RETRIEVAL_QUERY, …),outputDimensionality(1–768).
- When to use
- Google AI–standard stacks; code corpora with
CODE_RETRIEVAL_QUERY; smaller vectors viaoutputDimensionality.
- Google AI–standard stacks; code corpora with
Amazon Titan Embeddings via Bedrock—fits AWS-native compliance and billing boundaries with no extra tunable embedding fields in Indexify.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html
- Indexify
configmust be{}(or omitted). Use the Bedrock Titan embedding discriminator from the API reference.
- When to use
- AWS-only compliance/billing; IAM access; no extra SaaS outside your org.
Purpose-built retrieval embeddings with optional dimension and dtype control; strong choice when retrieval quality per dollar is the goal.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://docs.voyageai.com/docs/embeddings
- Indexify
config(optional):truncation,output_dimension,output_dtype. Voyageinput_typeis set automatically (document for indexing, query for search).
- When to use
- Strong quality/$;
voyage-3.5-lite/voyage-3-litefor cost/latency; Matryoshka sizing withoutput_dimension.
- Strong quality/$;
Open-weight-style embedding workflow via Nomic’s API; supports Matryoshka dimensionalities for flexible index size.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://docs.nomic.ai/
- Indexify
config(optional):task_type/taskType,dimensionality(64–768).
- When to use
- Research stacks; smaller vector sizes with compatible prefixes; cost-sensitive indexes.
Optional second-stage reranking scores each query–chunk pair so you can retrieve broadly then trim. The rerank provider is configured on the KB, not per request. On POST …/search, enable settings.rerank and tune retrieveTopK / rerank.topK—full field list on that route in API reference. Practical flow: Advanced Retrieval. MCP: indexify_search.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Reranking basics
- Increase
retrieveTopK, then enable reranking and setrerank.topKso the model sees fewer but sharper passages. - Using the same vendor for embed + rerank can simplify billing and support; still measure on your corpus.
- Latency, token limits, and pricing follow the upstream provider—consult their docs for capacity planning.
- Increase
Production-grade rerank models with several quality/speed tradeoffs; common default in Indexify samples.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & models
- Vendor: https://docs.cohere.com/docs/rerank
- Indexify
config.model(optional):rerank-v4.0-pro,rerank-v4.0-fast,rerank-v3.5,rerank-english-v3.0,rerank-multilingual-v3.0.
- When to use
- General RAG;
rerank-multilingual-v3.0for multilingual;rerank-english-v3.0for English-heavy;rerank-v3.5as a balanced default; newer fast and pro model ids when your account supports them and latency budgets are tight.
- General RAG;
Rerank 2.x family tuned for retrieval; optional truncation flag matches Voyage API behavior.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://docs.voyageai.com/docs/reranker
- Indexify
config.model(optional):rerank-2.5,rerank-2.5-lite,rerank-2,rerank-2-lite,rerank-1,rerank-lite-1. - Indexify
config.truncation(optional boolean) forwarded to Voyage.
- When to use
- With Voyage embeddings; long passages where truncation policy matters; dense technical text.
Multilingual and ColBERT-style options via Jina’s reranker lineup; supports truncation in Indexify config.
Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).
- Docs & config
- Vendor: https://jina.ai/reranker
- Indexify
config.model(optional):jina-reranker-v3,jina-reranker-m0,jina-reranker-v2-base-multilingual,jina-colbert-v2. - Indexify
config.truncation(optional boolean).
- When to use
- Multilingual help; ColBERT-style routing with
jina-colbert-v2;jina-reranker-m0when cost matters and quality fits.
- Multilingual help; ColBERT-style routing with