indexify.dev
Developer-first ingestion infrastructure for AI agents
Parse, chunk, embed, index, and retrieve knowledge through one API.
Indexify turns documents into agent-ready knowledge with production-grade ingestion and retrieval primitives, so you can ship RAG pipelines without rebuilding the ingestion layer from scratch.
ingest → parse → chunk → index → search
$ curl -s -X POST "https://api.indexify.dev/projects/{PROJECT_ID}/kbs/{KB_ID}/search?format=markdown" \
    -H "Authorization: Bearer {ACCESS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{
      "query": "When a customer health score drops to at-risk, what is the outreach and escalation playbook?",
      "settings": {
        "searchMode": "hybrid",
        "includeSourceMetadata": true
      }
    }'
The problem
RAG breaks before the prompt does
Most teams can prototype document search in a weekend. Production is where ingestion becomes infrastructure: parsing edge cases, chunk quality, retries, re-indexing, metadata, and retrieval debugging.
On paper
- Upload a few files
- Parse text
- Split chunks
- Embed vectors
- Ask questions
In production
- PDFs, tables, scans, and layouts break extraction
- Bad chunks bury the right answer
- Queues, retries, job state, and webhook glue
- Re-embedding + re-indexing when docs or models change
- Metadata, citations, and retrieval debugging
Indexify gives you that ingestion and retrieval layer as an API, so your team can focus on the agent experience instead of rebuilding RAG plumbing.
You ingest documents. Indexify gives you agent-ready knowledge.
You take one action, ingesting documents, and Indexify returns three concrete outcomes that directly improve agent quality and shipping speed. This is ideal when you're building solo or on a small team and don't want to own full RAG infrastructure yet.
Grounded retrieval
Semantic and hybrid POST …/search, optional reranking, and per-hit source metadata when you enable it on the request (for citations and debugging).
Agent-ready structure
Parsed elements, relationships, and section trees so agents can navigate docs instead of guessing.
Production reliability
Jobs, webhooks, and retry-safe workflows that keep ingestion and updates stable in real deployments.
Net effect for developers: less time on ingestion and retrieval plumbing, more time building agent behavior, tooling, and product workflows.
How it works
One HTTP API from upload to agent retrieval—KB settings own the middle. No custom ETL for parsers, embeddings, and storage.
Ingestion
- upload: Send docs/URLs via ingestion API
Pipeline processing
- parse: Extract structured content reliably
- chunk: Generate retrieval-friendly chunks
- index: Async processing + indexing
Consuming the data
- search: Semantic + hybrid retrieval via API
- agent: Grounded context for agents/tools via MCP
Order: upload, parse, chunk, index, search, then agent retrieval.
What you get
Semantic search matches by meaning (great for paraphrases and “symptom-style” questions). Hybrid search blends semantic similarity with keyword matching so exact terms still win when they should.
This is ideal for agents working with real systems: log lines, error codes, endpoint paths, class names, ticket IDs, and product SKUs often need literal matches, while the surrounding explanation benefits from semantic recall.
Outcome: fewer missed matches on exact terms and less backtracking in multi-step agent runs.
Scoring is chosen at retrieval time: each search request selects semantic (meaning), keyword (lexical), or hybrid (both) scoring. These are per-request options, not extra pipeline stages that run before retrieval.
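To make the per-request choice concrete, here is a minimal Python sketch that builds the JSON body for a search call. The field names `query`, `settings.searchMode`, and `settings.includeSourceMetadata` come from the curl example on this page, and `"hybrid"` is the value shown there. The mode strings `"semantic"` and `"keyword"` are assumptions, so check the OpenAPI spec for the accepted values.

```python
import json

# "hybrid" appears in the page's curl example. "semantic" and "keyword" are
# assumed spellings for the other two modes, not confirmed values.
SEARCH_MODES = ("semantic", "keyword", "hybrid")


def build_search_body(query: str, mode: str = "hybrid", include_sources: bool = True) -> str:
    """Return the JSON body for one search request with the chosen scoring mode."""
    if mode not in SEARCH_MODES:
        raise ValueError(f"unknown search mode: {mode!r}")
    body = {
        "query": query,
        "settings": {
            "searchMode": mode,
            "includeSourceMetadata": include_sources,
        },
    }
    return json.dumps(body)


# Literal identifiers such as error codes are the case hybrid scoring is meant for.
exact_term_body = build_search_body("ERR_CONN_RESET in checkout-service", mode="hybrid")
# Paraphrased, symptom-style questions lean on semantic scoring.
paraphrase_body = build_search_body("why do payments time out at night", mode="semantic")
```

Because the mode travels with each request, the same knowledge base can serve both kinds of question without re-indexing.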
Developer-first API
Ingest and retrieve knowledge through one API
Upload documents, monitor ingestion, and query retrieval results using simple APIs designed for production AI workflows. The eleven starred curls match Quickstart and API Runner.
Create a developer account (human identity).
curl -s -X POST "https://api.indexify.dev/auth/signup" \
    -H "Content-Type: application/json" \
    -d '{
      "email": "developer@example.com",
      "password": "SecurePassw0rd!"
    }'
Run these flows interactively in the API runner (Quickstart endpoints) or browse OpenAPI.
Built for production workflows
Operational tooling for real-world AI systems
Indexify handles the ingestion lifecycle behind production RAG systems: async jobs, retries, processing state, webhooks, and retrieval-ready indexing workflows.
Net effect: fewer bespoke queues and cron glue, more predictable ingestion and easier retrieval debugging.
Build vs maintain
What starts as a simple RAG pipeline becomes infrastructure
Most teams can prototype ingestion quickly. Maintaining reliable parsing, retrieval quality, re-indexing workflows, and operational visibility over time is the harder problem.
- Parse text
- Generate embeddings
- Store vectors
- Run semantic search
- PDFs, scans, tables, and layouts break extraction
- Chunking quality becomes retrieval quality
- Async workflows need retries, state, and monitoring
- Model + document changes force re-indexing workflows
- Metadata, citations, and grounding become mandatory
- Pipeline debugging becomes operational work
Indexify gives teams the ingestion and retrieval infrastructure layer so they can focus on building AI products instead of maintaining RAG plumbing.
Retrieval quality starts at ingestion
Better retrieval comes from better document understanding
Most RAG pipelines flatten documents into text chunks. Indexify preserves structure, metadata, and relationships so agents retrieve more relevant and grounded context.
Structured extraction
Preserve sections, hierarchy, tables, and metadata during ingestion—PDFs and layouts are not plain text.
Retrieval-friendly chunking
Generate chunks tuned for semantic search instead of arbitrary splits that bury the right passage.
Grounded results
Return context with metadata and citations so agents can justify answers and you can debug retrieval.
Async re-indexing
Reprocess knowledge bases when documents change or you rotate embedding models—without rewriting glue code.
Production workflows
Jobs, retries, webhooks, and observable processing state so ingestion stays reliable under load.
Tables, figures & relationships
Extract typed tables, figures, and cross-references during parse so agents can query structure—not only flattened text.
Strong ingestion reduces hallucination-prone retrieval misses—it's the input stage agents inherit downstream.
Go beyond text chunks
Built for agent developers: preserve structure end-to-end, run citation-ready search, and expose sections, elements, and relationships for navigation UIs, compliance flows, and RAG tools.
Typed elements
Headings, paragraphs, tables, figures, captions, and sections are stored as first-class entities.
Structured tables
Tables keep row/column/cell shape for retrieval and downstream processing.
Figure-aware retrieval
Figures are linked to captions, grounding metadata, and related elements.
Relationship graph
Typed relationships (containment, reading order, captions, cross-references, …) are available through document relationship endpoints.
Grounded search
Search supports chunk, table-row, and figure-caption retrieval with lineage and grounding metadata.
Processing visibility
Poll GET …/jobs/{jobId} for status and optional stages; list outputs with GET …/jobs/{jobId}/artifacts.
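Polling a job can be sketched as a small loop. The example below is an illustration under stated assumptions: the status strings (`queued`, `processing`, `completed`, `failed`) and the `status` field name are hypothetical, and `fetch_job` stands in for whatever authenticated GET you issue against the job endpoint, whose full path is not reproduced here.

```python
import time
from typing import Callable, Dict

# Assumption: these terminal status names are illustrative, not documented values.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_job: Callable[[], Dict],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_job until the job reports a terminal status or attempts run out."""
    for attempt in range(max_attempts):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Skip the pause after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_attempts} polls")


# Stand-in for an authenticated GET on the job endpoint (hypothetical payloads).
_responses = iter([{"status": "queued"}, {"status": "processing"}, {"status": "completed"}])
final = wait_for_job(lambda: next(_responses), sleep=lambda _s: None)
```

Injecting `fetch_job` and `sleep` keeps the loop testable without network access. Where webhooks are available, they avoid polling altogether.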
Indexify in your agent stack
REST: call Indexify from your backend
Send a project access token on every request: Authorization: Bearer {ACCESS_TOKEN}. Paths include {projectId}, {kbId}, and document ids as shown below.
The groups below are the REST endpoints that correspond to the hosted MCP tools. The Indexify REST API covers much more—ingestion, jobs, webhooks, project credentials, KB administration, and other operations. See the API reference for the full surface.
- GET /projects/{projectId}/kbs: list knowledge bases in the project
- GET /projects/{projectId}/kbs/{kbId}/documents: paginated document inventory for a KB
- GET /projects/{projectId}/kbs/{kbId}/documents/{documentId}: metadata for one document
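A backend can wrap these read endpoints in a few URL helpers. This is a minimal sketch: the paths are the three listed above and the host is the one used in this page's curl examples, while the helper names are made up for illustration. Pagination parameter names are not given on this page, so the sketch passes through whatever the API reference defines.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.indexify.dev"  # host taken from the page's curl examples


def auth_headers(access_token: str) -> dict:
    """Bearer header sent on every request."""
    return {"Authorization": f"Bearer {access_token}"}


def kbs_url(project_id: str) -> str:
    """List knowledge bases in a project."""
    return f"{BASE_URL}/projects/{quote(project_id, safe='')}/kbs"


def documents_url(project_id: str, kb_id: str, **page_params: object) -> str:
    """Document inventory for a KB; page_params are passed through untouched."""
    url = f"{kbs_url(project_id)}/{quote(kb_id, safe='')}/documents"
    return f"{url}?{urlencode(page_params)}" if page_params else url


def document_url(project_id: str, kb_id: str, document_id: str) -> str:
    """Metadata for one document."""
    return f"{documents_url(project_id, kb_id)}/{quote(document_id, safe='')}"
```

Percent-encoding each path segment guards against ids that contain spaces or slashes.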
Ingestion contract
Indexify runs this lifecycle
Upload documents and let Indexify handle parsing, chunking, indexing, and job-state transitions.
POST /documents and subscribe to webhook events to automate downstream workflows.
Retrieval contract
Indexify retrieves; your app decides and generates
Your app sends a query and receives ranked, citation-ready results. Your agent chooses how to use that context with any model or orchestration layer.
POST /search returns retrieval context; your LLM stack remains fully under your control.
Agent loop contract
MCP tools implement the same REST contracts
The MCP tools cover composite search (optional API diagnostics, neighbor-chunk context), section and element reads, and drill-down retrieval, all without changing what the HTTP API guarantees.
View in Guide — step-by-step tools and patterns for this loop.
Agent-ready by design
Indexify is a managed SaaS platform for developers building agents. “Agent-ready” means your agents can discover capabilities, call retrieval tools safely, and follow predictable contracts without custom glue.
View in Guide
MCP tools
Connect your IDE or agent runtime to Indexify’s hosted MCP server for read-oriented retrieval (search, chunks, elements, sections).
See the agent loop
OpenAPI contract
A strict OpenAPI spec powers the docs and API runner. Use it for codegen, validation, and keeping integrations predictable.
Download OpenAPI
Agent guide (downloadable)
A concise markdown file you can paste into a coding agent context, explaining what’s MCP vs HTTP and the golden paths.
Download agent guide
A data model that fits the real world
Projects keep corpora isolated. Each project contains one or more KBs (with their own parsing and retrieval settings), and search always targets a single KB.
Tip: one KB per surface (help center, runbooks, policies) keeps config and relevance aligned.
Chunk search and structured reads use the same KB scope—predictable ranking and citations.
Three made-up-but-plausible setups—think of them as sketches, not a checklist. They only exist to show how you might carve a project and split KBs so search stays on-topic; your names and boundaries can be whatever fits you.
Say you are building a tutor: you might keep “what we teach” away from “how we stay safe.” Course stuff in one pile, moderation and policy in another, so searches do not blur the two.
For a support assistant, a common split is customer-facing help vs internal runbooks vs release notes. Same kind of UX; you just point the search at the KB that matches who is asking.
Compliance is a mess of overlapping docs. One approach: a KB per policy area (HR, security, legal regs) so you are less likely to cite the wrong playbook when someone asks a narrow question.
What you can build after ingestion
Grounded answers from your KBs: each search call targets one KB (`kbId` in the path); aggregate in your app when you need multiple corpora. Use search settings for citation metadata.
View in guide
An agent retrieves facts from your KB, then calls APIs and tools to execute actions with context instead of guesswork.
View in guide
Let users navigate long documents by structure: sections, headings, tables, and related elements, not plain text only.
View in guide
Drive backend workflows when webhooks fire: your service calls search, parsed content, and structure APIs. Optional MCP tooling (see Developer Guide) can expose the same operations to IDE agents; it is not part of the OpenAPI document.
View in guide
Scoped to one knowledge base per search request; add per-hit source metadata on search when you need traceable citations in your UI.
View in guide
Connect Cursor or another MCP host to your KB: the agent searches internal docs, reads chunks and sections, and cites sources while you code, with no custom REST glue in the IDE.
View in guide
Supported formats
Upload formats match the document upload contract: parsing follows your KB's settings.parse (OCR, transcription, and output formats are configuration-dependent).
Rich documents
PDF, DOCX, PPTX
Markup
HTML, Markdown, AsciiDoc, WebVTT
Tabular
XLSX, CSV
Images
PNG, JPEG, TIFF, BMP, WEBP (OCR when configured)
Audio
MP3, WAV (transcription when configured)
Limits (files per request, max size) are defined on the upload operation in the API reference.
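A client can reject obviously unsupported files before uploading. The following sketch maps the formats listed above to common file extensions. The extension spellings (for example `.adoc` for AsciiDoc or `.vtt` for WebVTT) are assumptions based on common usage, and the upload contract in the API reference remains the authority.

```python
from pathlib import Path
from typing import Optional

# Families mirror the list above; the extensions themselves are assumed.
SUPPORTED_EXTENSIONS = {
    "rich": {".pdf", ".docx", ".pptx"},
    "markup": {".html", ".htm", ".md", ".adoc", ".vtt"},
    "tabular": {".xlsx", ".csv"},
    "image": {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".webp"},
    "audio": {".mp3", ".wav"},
}


def format_family(filename: str) -> Optional[str]:
    """Return the format family for a filename, or None if it is not listed."""
    ext = Path(filename).suffix.lower()
    for family, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return family
    return None
```

For example, `format_family("Report.PDF")` falls in the rich-document family, while a `.txt` file returns `None` and can be skipped or converted first.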
Supported Embedding Models
Choose from leading embedding providers. Configure your knowledge base with the model that fits your latency, cost, and quality needs.
Supported Rerank Engines
Configure the rerank provider and model on the knowledge base. On each POST …/search call, set settings.rerank.isEnabled to "yes" to run that reranker (see OpenAPI SearchRequest / SearchRequestSettings).
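As a small illustration, the helper below adds the rerank flag to an existing search body without mutating it. The `settings.rerank.isEnabled` path and the string value `"yes"` come from the text above. The `"no"` value used for the disabled case is an assumption.

```python
def with_rerank(body: dict, enabled: bool = True) -> dict:
    """Return a copy of a search body with the per-request rerank flag set."""
    settings = dict(body.get("settings", {}))
    # "yes" is the value named on this page; "no" is an assumed counterpart.
    settings["rerank"] = {"isEnabled": "yes" if enabled else "no"}
    return {**body, "settings": settings}


base = {"query": "refund policy for annual plans", "settings": {"searchMode": "hybrid"}}
reranked = with_rerank(base)
```

The reranker itself is still configured on the knowledge base. The request only switches it on for that call.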
Why Indexify vs DIY, libraries, or managed alternatives
Most builders can ship an ingest prototype. Fewer founders and small teams want to own parser drift, re-index cycles, retrieval tuning, and operational glue in production. Indexify provides that layer as a stable API.
DIY vs Indexify
- Time to first result. DIY: Days / weeks. Indexify: Minutes with API-first setup.
- Retry & failures. DIY: Custom code. Indexify: Jobs + webhooks + retry-safe patterns.
- Re-processing. DIY: Painful. Indexify: First-class reprocess workflows.
Libraries (LlamaIndex/LangChain) vs Indexify
- Type. Libraries: Framework / SDK. Indexify: Managed platform.
- Infra required. Libraries: Yes. Indexify: No.
- Ingestion lifecycle. Libraries: DIY. Indexify: Managed parse/chunk/index lifecycle.
LlamaCloud (Managed) vs Indexify
- Type. LlamaCloud: Managed LlamaIndex backend. Indexify: Managed ingestion platform.
- Ingestion lifecycle. LlamaCloud: LlamaIndex-native. Indexify: API-driven lifecycle + retrieval controls.
- Stack lock-in. LlamaCloud: LlamaIndex ecosystem. Indexify: Vendor-neutral REST integration.
Sign up, wire auth, then KB → upload → search.
Project + credentials → POST /oauth/token (client_credentials) → KB → upload → jobs/webhooks → POST …/search. Same path for a weekend prototype or your first production agent product—use the API runner to exercise the sequence.
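The token step in that sequence can be sketched in Python with the standard library. This is an illustration under several assumptions: the endpoint is taken to live on the same host as the other examples, the body is form-encoded in the style of the OAuth 2.0 client credentials grant, and the credential values are placeholders. The server may instead expect HTTP Basic auth or JSON, so confirm against the OpenAPI spec. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.indexify.dev"  # assumed to match the other examples


def build_token_request(client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client_credentials token request."""
    # Assumption: form-encoded body with credentials in the payload.
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return Request(
        f"{BASE_URL}/oauth/token",
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder credentials for illustration only.
req = build_token_request("my-client-id", "my-client-secret")
```

Passing `req` to `urllib.request.urlopen` would send it. The access token in the response would then go into the `Authorization: Bearer` header on later calls.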