We're in preview / closed beta. Interested? Email info@indexify.dev.

indexify.dev

Developer-first ingestion infrastructure for AI agents

Parse, chunk, embed, index, and retrieve knowledge through one API.

Indexify turns documents into agent-ready knowledge with production-grade ingestion and retrieval primitives, so you can ship RAG pipelines without rebuilding the ingestion layer from scratch.

ingest → parse → chunk → index → search

$ curl -s -X POST "https://api.indexify.dev/projects/{PROJECT_ID}/kbs/{KB_ID}/search?format=markdown" \
  -H "Authorization: Bearer {ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
  "query": "When a customer's health score drops to at-risk, what's the outreach and escalation playbook?",
  "settings": {
    "searchMode": "hybrid",
    "includeSourceMetadata": true
  }
}'

./start_indexing ./view_docs

The problem

RAG breaks before the prompt does

Most teams can prototype document search in a weekend. Production is where ingestion becomes infrastructure: parsing edge cases, chunk quality, retries, re-indexing, metadata, and retrieval debugging.

On paper

Upload a few files
Parse text
Split chunks
Embed vectors
Ask questions

In production

PDFs, tables, scans, and layouts break extraction
Bad chunks bury the right answer
Queues, retries, job state, and webhook glue
Re-embedding + re-indexing when docs or models change
Metadata, citations, and retrieval debugging

Indexify gives you that ingestion and retrieval layer as an API, so your team can focus on the agent experience instead of rebuilding RAG plumbing.

You ingest documents. Indexify gives you agent-ready knowledge.

One input action from your side, three concrete outcomes from Indexify that directly improve agent quality and shipping speed—ideal when you're building solo or on a small team and don't want to own full RAG infra yet.

You upload documents

Indexify returns grounded context for agents

Grounded retrieval

Semantic and hybrid search through POST …/search, with optional reranking and, when you enable it on the request, per-hit source metadata for citations and debugging.

Agent-ready structure

Parsed elements, relationships, and section trees so agents can navigate docs instead of guessing.

Production reliability

Jobs, webhooks, and retry-safe workflows that keep ingestion and updates stable in real deployments.

Net effect for developers: less time on ingestion and retrieval plumbing, more time building agent behavior, tooling, and product workflows.

How it works

One HTTP API from upload to agent retrieval—KB settings own the middle. No custom ETL for parsers, embeddings, and storage.

Ingestion

Step 1

upload

Send docs/URLs via ingestion API

Pipeline processing

Step 2

parse

Extract structured content reliably

Step 3

chunk

Generate retrieval-friendly chunks

Step 4

index

Async processing + indexing

Consuming the data

Step 5

search

Semantic + hybrid retrieval via API

Step 6

agent

Grounded context for agents/tools via MCP

What you get

Semantic search matches by meaning (great for paraphrases and “symptom-style” questions). Hybrid search blends semantic similarity with keyword matching so exact terms still win when they should.

This is ideal for agents working with real systems: log lines, error codes, endpoint paths, class names, ticket IDs, and product SKUs often need literal matches, while the surrounding explanation benefits from semantic recall.

Outcome: fewer missed matches on exact terms and less backtracking in multi-step agent runs.
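
To make the blending idea concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to combine a semantic ranking with a keyword ranking. It is a generic illustration only: this page does not say how Indexify scores hybrid results, and the function name, the chunk IDs, and the constant k=60 are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one ranking.

    Each ID earns 1 / (k + rank) per list it appears in (rank starts at 1),
    so items ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings: the semantic retriever favors a paraphrase, while the
# keyword retriever favors the chunk that contains the literal error code.
semantic = ["chunk-paraphrase", "chunk-error-code", "chunk-overview"]
keyword = ["chunk-error-code", "chunk-changelog"]
fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

The chunk that both retrievers agree on ends up first, which is the behavior the paragraph above describes: exact terms still win when they should, without discarding semantic matches.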

Query → Retrieve → Rerank (optional) → Answer

Retrieve is where you choose how hits are scored: semantic (meaning), keyword (lexical), or hybrid (both)—per search request, not extra stages before retrieval.
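
As a rough sketch of choosing the scoring mode per request, the helper below builds (but does not send) the same search call shown in the curl example at the top of the page. The base URL, path, headers, and body fields come from that example; the values "semantic" and "keyword" for searchMode are assumptions based on the modes named above, and the IDs and token are placeholders.

```python
import json
from urllib.request import Request

API_BASE = "https://api.indexify.dev"
# Assumption: "semantic" and "keyword" are accepted alongside "hybrid".
SEARCH_MODES = {"semantic", "keyword", "hybrid"}


def build_search_request(project_id, kb_id, access_token, query, search_mode="hybrid"):
    """Build (but do not send) a POST .../search request for one KB."""
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"unknown search mode: {search_mode!r}")
    body = {
        "query": query,
        "settings": {"searchMode": search_mode, "includeSourceMetadata": True},
    }
    return Request(
        url=f"{API_BASE}/projects/{project_id}/kbs/{kb_id}/search?format=markdown",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder IDs and token; replace with real values before sending.
request = build_search_request("proj_123", "kb_456", "token", "error E1042 on upload")
print(request.get_method(), request.full_url)
```

To actually run the search, pass the request to urllib.request.urlopen once real credentials are in place.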

View Retrieval in Guide

Developer-first API

Ingest and retrieve knowledge through one API

Upload documents, monitor ingestion, and query retrieval results using simple APIs designed for production AI workflows. The eleven starred curls match Quickstart and API Runner.

Create a developer account (human identity).

curl -s -X POST "https://api.indexify.dev/auth/signup" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "developer@example.com",
    "password": "SecurePassw0rd!"
  }'

Run these flows interactively in the API runner (Quickstart endpoints) or browse OpenAPI.

Built for production workflows

Operational tooling for real-world AI systems

Indexify handles the ingestion lifecycle behind production RAG systems: async jobs, retries, processing state, webhooks, and retrieval-ready indexing workflows.

Async processing

Ingest documents through asynchronous pipelines designed for large workloads and incremental updates.

Jobs and state

Track parsing, chunking, embedding, and indexing status across ingestion runs—without building your own orchestrator.

Retries + re-indexing

Retry failed work and reprocess when documents or embedding models evolve, while keeping workflows predictable.

Webhooks and events

Trigger downstream agents and workflows when ingestion completes with job lifecycle events and delivery retries.

Operational visibility

Inspect processing stages and outputs so teams can debug retrieval quality from source → chunk → index → search.

Ingestion job lifecycle

GET …/jobs/{jobId}

pending: queued for processing

parsing: extract text + structure

chunking: split w/ metadata

indexing: embed + store for search

completed: retrieval-ready

Events (webhook)

job.started, job.completed, job.failed

Net effect: fewer bespoke queues and cron glue, more predictable ingestion and easier retrieval debugging.
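
A minimal polling sketch for the job lifecycle above might look like the following. The stage names are taken from the lifecycle list; the "failed" status is an assumption inferred from the job.failed event, and fetch_status is a hypothetical callable you would implement around GET …/jobs/{jobId}.

```python
import time

# Stages listed in the lifecycle above, in order.
STAGES = ["pending", "parsing", "chunking", "indexing", "completed"]
# Assumption: a failed job reports "failed", mirroring the job.failed event.
TERMINAL = {"completed", "failed"}


def wait_for_job(fetch_status, interval_s=2.0, max_polls=30, sleep=time.sleep):
    """Poll fetch_status() until the job reaches a terminal status.

    fetch_status is any callable that returns the current status string.
    Returns the final status and the list of distinct stages observed.
    """
    seen = []
    for _ in range(max_polls):
        status = fetch_status()
        if not seen or seen[-1] != status:
            seen.append(status)  # record each stage transition once
        if status in TERMINAL:
            return status, seen
        sleep(interval_s)
    raise TimeoutError(f"job not finished after {max_polls} polls")


# Simulated job so the sketch runs without network access.
fake_statuses = iter(["pending", "parsing", "parsing", "chunking", "indexing", "completed"])
final, history = wait_for_job(lambda: next(fake_statuses), sleep=lambda _: None)
print(final, history)
```

Polling like this is a fallback; where webhooks are available, reacting to job.completed avoids the loop entirely.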

Build vs maintain

What starts as a simple RAG pipeline becomes infrastructure

Most teams can prototype ingestion quickly. Maintaining reliable parsing, retrieval quality, re-indexing workflows, and operational visibility over time is the harder problem.

Weekend prototype

Parse text
Generate embeddings
Store vectors
Run semantic search

Production reality

PDFs, scans, tables, and layouts break extraction
Chunking quality becomes retrieval quality
Async workflows need retries, state, and monitoring
Model + document changes force re-indexing workflows
Metadata, citations, and grounding become mandatory
Pipeline debugging becomes operational work

Indexify gives teams the ingestion and retrieval infrastructure layer so they can focus on building AI products instead of maintaining RAG plumbing.

Retrieval quality starts at ingestion

Better retrieval comes from better document understanding

Most RAG pipelines flatten documents into text chunks. Indexify preserves structure, metadata, and relationships so agents retrieve more relevant and grounded context.

Structured extraction

Preserve sections, hierarchy, tables, and metadata during ingestion—PDFs and layouts are not plain text.

Retrieval-friendly chunking

Generate chunks tuned for semantic search instead of arbitrary splits that bury the right passage.

Grounded results

Return context with metadata and citations so agents can justify answers and you can debug retrieval.

Async re-indexing

Reprocess knowledge bases when documents change or you rotate embedding models—without rewriting glue code.

Production workflows

Jobs, retries, webhooks, and observable processing state so ingestion stays reliable under load.

Tables, figures & relationships

Extract typed tables, figures, and cross-references during parse so agents can query structure—not only flattened text.

Strong ingestion reduces hallucination-prone retrieval misses—it's the input stage agents inherit downstream.

Go beyond text chunks

Built for agent developers: preserve structure end-to-end, run citation-ready search, and expose sections, elements, and relationships for navigation UIs, compliance flows, and RAG tools.

View structured document access in Guide

Typed elements

Headings, paragraphs, tables, figures, captions, and sections are stored as first-class entities.

Structured tables

Tables keep row/column/cell shape for retrieval and downstream processing.

Figure-aware retrieval

Figures are linked to captions, grounding metadata, and related elements.

Relationship graph

Typed relationships (containment, reading order, captions, cross-references, …) are available through document relationship endpoints.

Grounded search

Search supports chunk, table-row, and figure-caption retrieval with lineage and grounding metadata.

Processing visibility

Poll GET …/jobs/{jobId} for status and optional stages; list outputs with GET …/jobs/{jobId}/artifacts.

Indexify in your agent stack

Vanilla RAG

“retrieve once, answer once”

Via Indexify Search API

Agentic RAG

“think, retrieve, decide, repeat until done”

Via MCP tools that wrap multiple HTTP calls (search, elements, sections, …)

REST API

REST: call Indexify from your backend

Send a project access token on every request as Authorization: Bearer {ACCESS_TOKEN}. Paths include {projectId}, {kbId}, and document ids as shown below.

The groups below are the REST endpoints that correspond to the hosted MCP tools. The Indexify REST API covers much more—ingestion, jobs, webhooks, project credentials, KB administration, and other operations. See the API reference for the full surface.

GET /projects/{projectId}/kbs — list knowledge bases in the project
GET /projects/{projectId}/kbs/{kbId}/documents — paginated document inventory for a KB
GET /projects/{projectId}/kbs/{kbId}/documents/{documentId} — metadata for one document
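
The three read endpoints above could be wrapped in small request builders like these. This is a sketch under assumptions: the base URL is taken from the curl example at the top of the page, the helper names are made up for illustration, and the IDs and token are placeholders. Nothing is sent over the network.

```python
from urllib.request import Request

API_BASE = "https://api.indexify.dev"


def get_request(path, access_token):
    """Build an authenticated GET request for a path under the API base URL."""
    return Request(
        url=f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def list_kbs(project_id, token):
    # List knowledge bases in the project.
    return get_request(f"/projects/{project_id}/kbs", token)


def list_documents(project_id, kb_id, token):
    # Document inventory for one KB (pagination parameters are not shown here).
    return get_request(f"/projects/{project_id}/kbs/{kb_id}/documents", token)


def get_document(project_id, kb_id, document_id, token):
    # Metadata for a single document.
    return get_request(f"/projects/{project_id}/kbs/{kb_id}/documents/{document_id}", token)


req = get_document("proj_123", "kb_456", "doc_789", "token")
print(req.get_method(), req.full_url)
```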

Ingestion contract

Indexify runs this lifecycle

Upload documents and let Indexify handle parsing, chunking, indexing, and job-state transitions.

Document → Ingest → Parse → Chunk → Embed → Store

POST /documents and subscribe to webhook events to automate downstream workflows.
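
One way to automate downstream work from webhook events is a small dispatcher like the sketch below. The event names come from this page; the payload shape ({"event": ..., "jobId": ...}) is purely hypothetical, since the webhook body is not described here, so check the API reference before relying on any field name.

```python
import json

# Event names from the job lifecycle; the payload fields below are hypothetical.
KNOWN_EVENTS = {"job.started", "job.completed", "job.failed"}


def handle_webhook(raw_body, on_completed, on_failed):
    """Route one webhook delivery to the matching callback.

    Assumes a JSON body shaped like {"event": "...", "jobId": "..."}.
    Returns True when the event name was recognized, False otherwise.
    """
    payload = json.loads(raw_body)
    event = payload.get("event")
    if event not in KNOWN_EVENTS:
        return False
    if event == "job.completed":
        on_completed(payload.get("jobId"))
    elif event == "job.failed":
        on_failed(payload.get("jobId"))
    return True  # job.started is acknowledged but needs no action here


ready, failed = [], []
handle_webhook('{"event": "job.completed", "jobId": "job_1"}', ready.append, failed.append)
handle_webhook('{"event": "job.failed", "jobId": "job_2"}', ready.append, failed.append)
print(ready, failed)
```

Because deliveries may be retried, callbacks like on_completed are best written so that handling the same job twice is harmless.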

Retrieval contract

Indexify retrieves; your app decides and generates

Your app sends a query and receives ranked, citation-ready results. Your agent chooses how to use that context with any model or orchestration layer.

User Query → Embed → Retrieve → Context → Your LLM → Response

POST /search returns retrieval context; your LLM stack remains fully under your control.

Agent loop contract

MCP tools implement the same REST contracts

Composite search (optional API diagnostics, neighbor-chunk context), section and element reads, and drill-down retrieval—without changing what the HTTP API guarantees.

Think → Retrieve → Evaluate → Refine → Retrieve → Answer

View in Guide — step-by-step tools and patterns for this loop.

Agent-ready by design

Indexify is a managed SaaS platform for developers building agents. “Agent-ready” means your agents can discover capabilities, call retrieval tools safely, and follow predictable contracts without custom glue.

View in Guide

MCP tools

Connect your IDE or agent runtime to Indexify’s hosted MCP server for read-oriented retrieval (search, chunks, elements, sections).

See the agent loop

OpenAPI contract

A strict OpenAPI spec powers the docs and API runner. Use it for codegen, validation, and keeping integrations predictable.

Download OpenAPI

Agent guide (downloadable)

A concise markdown file you can paste into a coding agent context, explaining what’s MCP vs HTTP and the golden paths.

Download agent guide

Looking for a crawler-friendly entrypoint? Start at /.well-known/llms.txt.

A data model that fits the real world

Projects keep corpora isolated. Each project contains one or more KBs (with their own parsing and retrieval settings), and search always targets a single KB.

View in Guide

Hierarchy at a glance

Developer

1∶*

Project

1∶*

Knowledge base

1∶*

Document

0∶1

Webhook

Tip: one KB per surface (help center, runbooks, policies) keeps config and relevance aligned.

Developer

Owns projects

Project

Credentials & scopes

Knowledge base

Indexes & search config

Document

One KB only

Search: one KB

Chunk search and structured reads use the same KB scope—predictable ranking and citations.

POST /projects/{projectId}/kbs/{kbId}/search

GET /projects/{projectId}/kbs/{kbId}/documents/{documentId}/sections
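
The hierarchy can be mirrored on the client side with a few plain data classes. This is only an illustrative sketch: the class and method names are invented for this example, and it models just the rules stated above (a project holds many KBs, a document lives in exactly one KB, and a search path always names a single KB).

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    document_id: str
    kb_id: str  # a document belongs to exactly one KB


@dataclass
class KnowledgeBase:
    kb_id: str
    documents: list = field(default_factory=list)

    def add(self, document_id):
        doc = Document(document_id, self.kb_id)
        self.documents.append(doc)
        return doc


@dataclass
class Project:
    project_id: str
    kbs: dict = field(default_factory=dict)

    def kb(self, kb_id):
        # Return the existing KB or register a new one under this project.
        return self.kbs.setdefault(kb_id, KnowledgeBase(kb_id))

    def search_path(self, kb_id):
        # Search always targets a single KB within the project.
        return f"/projects/{self.project_id}/kbs/{kb_id}/search"


project = Project("proj_123")
help_center = project.kb("help-center-kb")
doc = help_center.add("doc_1")
print(project.search_path(doc.kb_id))
```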

Example project + KB layouts

Three made-up-but-plausible setups—think of them as sketches, not a checklist. They only exist to show how you might carve a project and split KBs so search stays on-topic; your names and boundaries can be whatever fits you.

View in Guide

Education tutor

Say you are building a tutor: you might keep “what we teach” away from “how we stay safe.” Course stuff in one pile, moderation and policy in another, so searches do not blur the two.

`curriculum-kb`: standards, syllabus, pacing, rubrics

`course-materials-kb`: slides, worksheets, textbook PDFs, labs

`safety-policy-kb`: age rules, refusals, escalation, citations

SaaS support

For a support assistant, a common split is customer-facing help vs internal runbooks vs release notes. Same kind of UX; you just point the search at the KB that matches who is asking.

`help-center-kb`: FAQs, setup, troubleshooting

`runbooks-kb`: SOPs, admin steps, scripts, escalation paths

`release-notes-kb`: changelogs, migrations, known issues

Compliance assistant

Compliance is a mess of overlapping docs. One approach: a KB per policy area (HR, security, legal regs) so you are less likely to cite the wrong playbook when someone asks a narrow question.

`hr-policy-kb`: handbook, PTO, onboarding

`security-policy-kb`: access, incident response, data handling

`legal-regs-kb`: GDPR/SOC2, retention, audit mappings

What you can build after ingestion

Ask-anything knowledge chatbot

Grounded answers from your KBs: each search call targets one KB (`kbId` in the path); aggregate in your app when you need multiple corpora. Use search settings for citation metadata.

View in guide
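
Since each search call targets one KB, aggregating across corpora happens in your app. The sketch below merges per-KB hit lists by score. The hit shape (a dict with a numeric "score") is a hypothetical stand-in, because the search response format is not shown on this page.

```python
def merge_hits(hits_by_kb, top_n=5):
    """Merge per-KB search hits into one list, tagging each hit with its KB.

    hits_by_kb maps a KB id to a list of hits; each hit is assumed to be a
    dict with a numeric "score" (hypothetical field name).
    """
    merged = [
        {**hit, "kbId": kb_id}
        for kb_id, hits in hits_by_kb.items()
        for hit in hits
    ]
    # Highest score first across all KBs.
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged[:top_n]


# Made-up results from two separate search calls.
results = merge_hits(
    {
        "help-center-kb": [{"text": "Reset your password", "score": 0.82}],
        "runbooks-kb": [
            {"text": "Admin password reset SOP", "score": 0.91},
            {"text": "Escalation path", "score": 0.40},
        ],
    },
    top_n=2,
)
print([(hit["kbId"], hit["score"]) for hit in results])
```

Scores from KBs with different embedding or rerank settings may not be directly comparable, so treat a raw score merge as a starting point rather than a final ranking.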

KB-aware tool-calling agent

An agent retrieves facts from your KB, then calls APIs and tools to execute actions with context instead of guesswork.

View in guide

Document navigation copilot

Let users navigate long documents by structure: sections, headings, tables, and related elements, not plain text only.

View in guide

Workflow automation via the HTTP API

Drive backend workflows when webhooks fire: your service calls search, parsed content, and structure APIs. Optional MCP tooling (see Developer Guide) can expose the same operations to IDE agents—it is not part of the OpenAPI document.

View in guide

Compliance and policy Q&A

Scoped to one knowledge base per search request; add per-hit source metadata on search when you need traceable citations in your UI.

View in guide

IDE knowledge agent (MCP)

Connect Cursor or another MCP host to your KB: the agent searches internal docs, reads chunks and sections, and cites sources while you code—no custom REST glue in the IDE.

View in guide

Supported formats

Upload formats match the document upload contract: parsing follows your KB's settings.parse (OCR, transcription, and output formats are configuration-dependent).

View in Guide

Rich documents

PDF, DOCX, PPTX

Markup

HTML, Markdown, AsciiDoc, WebVTT

Tabular

XLSX, CSV

Images

PNG, JPEG, TIFF, BMP, WEBP (OCR when configured)

Audio

MP3, WAV (transcription when configured)

Limits (files per request, max size) are defined on the upload operation in the API reference.
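
A client-side pre-check against the format list above can catch unsupported files before upload. In this sketch the format families follow the list, but the file extensions are an assumed mapping from the format names, and the function name is made up; the upload contract in the API reference remains the authority.

```python
from pathlib import Path

# Extensions inferred from the format names above (assumed mapping).
SUPPORTED = {
    "rich": {".pdf", ".docx", ".pptx"},
    "markup": {".html", ".htm", ".md", ".adoc", ".vtt"},
    "tabular": {".xlsx", ".csv"},
    "image": {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".webp"},
    "audio": {".mp3", ".wav"},
}


def format_family(filename):
    """Return the format family for a filename, or None if it is not listed."""
    suffix = Path(filename).suffix.lower()
    for family, extensions in SUPPORTED.items():
        if suffix in extensions:
            return family
    return None


for name in ["Q3-report.PDF", "call-recording.wav", "notes.txt"]:
    print(name, "->", format_family(name))
```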

Supported Embedding Models

Choose from leading embedding providers. Configure your knowledge base with the model that fits your latency, cost, and quality needs.

OpenAI

openai:text-embedding-3-small, openai:text-embedding-3-large

View in guide View official docs

Cohere

cohere:embed-v3

View in guide View official docs

Google

google:text-embedding-004

View in guide View official docs

Voyage AI

voyage:voyage-3, voyage:voyage-3-lite, voyage:voyage-3.5-lite

View in guide View official docs

Nomic

nomic:nomic-embed-text-v1.5

View in guide View official docs

AWS (Bedrock)

aws:titan-embed-text-v1

View in guide View official docs

Supported Rerank Engines

Configure the rerank provider and model on the knowledge base. On each POST …/search call, set settings.rerank.isEnabled to "yes" to run that reranker (see OpenAPI SearchRequest / SearchRequestSettings).

Cohere

rerank-v4.0-pro, rerank-v4.0-fast, rerank-v3.5, rerank-english-v3.0, rerank-multilingual-v3.0

View in guide View official docs

Voyage AI

rerank-2.5, rerank-2.5-lite, rerank-2, rerank-2-lite, rerank-1, rerank-lite-1

View in guide View official docs

Jina

jina-reranker-v3, jina-reranker-m0, jina-reranker-v2-base-multilingual, jina-colbert-v2

View in guide View official docs
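
As a small sketch of the per-request rerank switch described at the top of this section, the helper below builds a search body with settings.rerank.isEnabled set to "yes". The other settings fields are copied from the curl example at the top of the page; the function name and query are illustrative, and the rerank provider and model are assumed to be configured on the KB already.

```python
import json


def search_body(query, rerank=False):
    """Build a search body; rerank=True sets settings.rerank.isEnabled to "yes"."""
    settings = {"searchMode": "hybrid", "includeSourceMetadata": True}
    if rerank:
        # This page gives "yes" as the enabling value (a string, not a boolean).
        settings["rerank"] = {"isEnabled": "yes"}
    return {"query": query, "settings": settings}


body = search_body("refund policy for annual plans", rerank=True)
print(json.dumps(body, indent=2))
```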

Why Indexify vs DIY, libraries, or managed alternatives

Most builders can ship an ingest prototype. Fewer founders and small teams want to own parser drift, re-index cycles, retrieval tuning, and operational glue in production. Indexify provides that layer as a stable API.

DIY vs Indexify

| Aspect | DIY | Indexify |
| --- | --- | --- |
| Time to first result | Days / weeks | Minutes with API-first setup |
| Retry & failures | Custom code | Jobs + webhooks + retry-safe patterns |
| Re-processing | Painful | First-class reprocess workflows |

Libraries (LlamaIndex/LangChain) vs Indexify

| Aspect | Libraries | Indexify |
| --- | --- | --- |
| Type | Framework / SDK | Managed platform |
| Infra required | Yes | No |
| Ingestion lifecycle | DIY | Managed parse/chunk/index lifecycle |

LlamaCloud (Managed) vs Indexify

| Aspect | LlamaCloud | Indexify |
| --- | --- | --- |
| Type | Managed LlamaIndex backend | Managed ingestion platform |
| Ingestion lifecycle | LlamaIndex-native | API-driven lifecycle + retrieval controls |
| Stack lock-in | LlamaIndex ecosystem | Vendor-neutral REST integration |

Sign up, wire auth, then KB → upload → search.

Project + credentials → POST /oauth/token (client_credentials) → KB → upload → jobs/webhooks → POST …/search. Same path for a weekend prototype or your first production agent product—use the API runner to exercise the sequence.
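
The token step in that sequence could be sketched as follows. The path and grant type come from the line above; sending the credentials as a form-encoded body follows the standard OAuth 2.0 client credentials grant and is an assumption about this API, as are the base URL (taken from the curl example at the top of the page) and the placeholder credentials. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.indexify.dev"


def build_token_request(client_id, client_secret):
    """Build (but do not send) a client_credentials token request.

    Assumption: credentials go in a form-encoded body, as in the standard
    OAuth 2.0 client credentials grant; confirm against the API reference.
    """
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return Request(
        url=f"{API_BASE}/oauth/token",
        data=form.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder credentials; use your project credentials in practice.
token_request = build_token_request("my-client-id", "my-client-secret")
print(token_request.get_method(), token_request.full_url)
```

The access token returned by this call is what goes into the Authorization: Bearer header on the KB, upload, jobs, and search requests that follow.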

./start_building ./open_api_runner

What you get

Semantic and Hybrid Search: Meaning when it matters; keywords when they are critical.

Reranking and source citations: Better precision with verifiable provenance.

Multimodal document processing: Tables, figures, and layout-aware structure, not plain text only.

Top embedding models: Choose the model that fits your domain and budget.

Top rerank engines: Improve relevance when the query is tricky.

REST: call Indexify from your backend

Knowledge bases & documents

Search & parsed content

Elements & artifacts

Relationships

Sections, tables & figures
