Preview / closed beta. Interested? Email info@indexify.dev.

indexify.dev

Developer-first ingestion infrastructure for AI agents

Parse, chunk, embed, index, and retrieve knowledge through one API.

Indexify turns documents into agent-ready knowledge with production-grade ingestion and retrieval primitives, so you can ship RAG pipelines without rebuilding the ingestion layer from scratch.

ingest → parse → chunk → index → search

# Body is read from stdin (-d @-) via a quoted heredoc, so the apostrophes in the query need no shell escaping.
$ curl -s -X POST "https://api.indexify.dev/projects/{PROJECT_ID}/kbs/{KB_ID}/search?format=markdown" \
  -H "Authorization: Bearer {ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d @- <<'JSON'
{
  "query": "When a customer's health score drops to at-risk, what's the outreach and escalation playbook?",
  "settings": {
    "searchMode": "hybrid",
    "includeSourceMetadata": true
  }
}
JSON
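
If you would rather not hand-quote JSON in a shell, the same request can be sketched in Python. This is a minimal sketch using only the standard library: the helper name `build_search_request` and the sample IDs are hypothetical, while the URL, headers, and body fields are copied from the curl call above. The request is built but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.indexify.dev"  # host taken from the curl example above


def build_search_request(project_id: str, kb_id: str, token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) the hybrid search request from the curl example."""
    url = f"{API_BASE}/projects/{project_id}/kbs/{kb_id}/search?format=markdown"
    body = {
        "query": query,
        "settings": {"searchMode": "hybrid", "includeSourceMetadata": True},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),  # json.dumps escapes quotes for us
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder IDs and token; substitute your own values.
req = build_search_request(
    "my-project",
    "my-kb",
    "ACCESS_TOKEN",
    "When a customer's health score drops to at-risk, what's the outreach and escalation playbook?",
)
```

Sending it is one more call, `urllib.request.urlopen(req)`, which needs a valid token and network access.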

The problem

RAG breaks before the prompt does

Most teams can prototype document search in a weekend. Production is where ingestion becomes infrastructure: parsing edge cases, chunk quality, retries, re-indexing, metadata, and retrieval debugging.

On paper

  • Upload a few files
  • Parse text
  • Split chunks
  • Embed vectors
  • Ask questions

In production

  • PDFs, tables, scans, and layouts break extraction
  • Bad chunks bury the right answer
  • Queues, retries, job state, and webhook glue
  • Re-embedding + re-indexing when docs or models change
  • Metadata, citations, and retrieval debugging

Indexify gives you that ingestion and retrieval layer as an API, so your team can focus on the agent experience instead of rebuilding RAG plumbing.

You ingest documents. Indexify gives you agent-ready knowledge.

One input action from your side yields three concrete outcomes from Indexify that directly improve agent quality and shipping speed. This is ideal when you're building solo or on a small team and don't want to own full RAG infra yet.

You upload documents
Indexify returns grounded context for agents

Grounded retrieval

Semantic and hybrid search via POST …/search, with optional reranking and per-hit source metadata when you enable it on the request (for citations and debugging).

Agent-ready structure

Parsed elements, relationships, and section trees so agents can navigate docs instead of guessing.

Production reliability

Jobs, webhooks, and retry-safe workflows that keep ingestion and updates stable in real deployments.

Net effect for developers: less time on ingestion and retrieval plumbing, more time building agent behavior, tooling, and product workflows.

How it works

One HTTP API covers everything from upload to agent retrieval; KB settings control the parsing, chunking, and indexing in between. No custom ETL for parsers, embeddings, or storage.

Ingestion

  • Step 1, upload: Send docs/URLs via ingestion API

Pipeline processing

  • Step 2, parse: Extract structured content reliably
  • Step 3, chunk: Generate retrieval-friendly chunks
  • Step 4, index: Async processing + indexing

Consuming the data

  • Step 5, search: Semantic + hybrid retrieval via API
  • Step 6, agent: Grounded context for agents/tools via MCP

Order: upload, parse, chunk, index, search, then agent retrieval.

What you get

Semantic search matches by meaning (great for paraphrases and “symptom-style” questions). Hybrid search blends semantic similarity with keyword matching so exact terms still win when they should.

This is ideal for agents working with real systems: log lines, error codes, endpoint paths, class names, ticket IDs, and product SKUs often need literal matches, while the surrounding explanation benefits from semantic recall.

Outcome: fewer missed matches on exact terms and less backtracking in multi-step agent runs.

Query → Retrieve → Rerank (optional) → Answer

Retrieve is where you choose how hits are scored: semantic (meaning), keyword (lexical), or hybrid (both)—per search request, not extra stages before retrieval.
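
One way to use this per-request choice is a small client-side heuristic that picks a mode from the query text. The sketch below is an illustration, not Indexify behavior: the function names and the regular expression are hypothetical, and the mode string `"semantic"` is assumed (only `"hybrid"` appears in the example request on this page).

```python
import re

# Tokens that usually need literal matches: paths, ticket IDs, error names, calls.
# The pattern is an illustrative guess, not an Indexify rule.
LITERAL_TOKEN = re.compile(
    r"((?<!\w)/[\w\-/{}]+)"    # endpoint or file paths such as /v1/users
    r"|(\b[A-Z]{2,}-\d+\b)"    # ticket IDs or SKUs such as OPS-1432
    r"|(\b[A-Z_]{3,}\d*\b)"    # constants and error names such as ECONNRESET
    r"|(\b\w+\.\w+\()"         # method calls such as client.connect(
)


def pick_search_mode(query: str) -> str:
    """Return "hybrid" when the query has literal-looking tokens, else "semantic"."""
    return "hybrid" if LITERAL_TOKEN.search(query) else "semantic"


def search_settings(query: str) -> dict:
    """Build the per-request settings object with the chosen mode."""
    return {"searchMode": pick_search_mode(query), "includeSourceMetadata": True}
```

A symptom-style question ("why does login feel slow after the update?") stays semantic, while a query that mentions `ECONNRESET` or `/v1/users` is sent as hybrid so the exact term can still match.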

Developer-first API

Ingest and retrieve knowledge through one API

Upload documents, monitor ingestion, and query retrieval results using simple APIs designed for production AI workflows. The eleven starred curls match Quickstart and API Runner.

Create a developer account (human identity).

curl -s -X POST "https://api.indexify.dev/auth/signup" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "developer@example.com",
    "password": "SecurePassw0rd!"
  }'

Run these flows interactively in the API runner (Quickstart endpoints) or browse OpenAPI.

Built for production workflows

Operational tooling for real-world AI systems

Indexify handles the ingestion lifecycle behind production RAG systems: async jobs, retries, processing state, webhooks, and retrieval-ready indexing workflows.

Async processing
Ingest documents through asynchronous pipelines designed for large workloads and incremental updates.
Jobs and state
Track parsing, chunking, embedding, and indexing status across ingestion runs—without building your own orchestrator.
Retries + re-indexing
Retry failed work and reprocess when documents or embedding models evolve, while keeping workflows predictable.
Webhooks and events
Trigger downstream agents and workflows when ingestion completes with job lifecycle events and delivery retries.
Operational visibility
Inspect processing stages and outputs so teams can debug retrieval quality from source → chunk → index → search.

Ingestion job lifecycle
GET …/jobs/{jobId}
  • pending: queued for processing
  • parsing: extract text + structure
  • chunking: split w/ metadata
  • indexing: embed + store for search
  • completed: retrieval-ready
Events (webhook)
  • job.started
  • job.completed
  • job.failed
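
If you poll instead of using webhooks, the lifecycle above maps onto a short loop. This is a sketch under stated assumptions: `fetch_status` is a hypothetical caller-supplied function that performs the GET on the job endpoint and returns the status string, and `"failed"` as a terminal status is inferred from the `job.failed` event rather than confirmed by this page.

```python
import time
from typing import Callable

# Stage names copied from the lifecycle list above, in order.
STAGES = ["pending", "parsing", "chunking", "indexing", "completed"]
# "failed" is an assumed terminal status, inferred from the job.failed event.
TERMINAL = {"completed", "failed"}


def wait_for_job(fetch_status: Callable[[], str], interval_s: float = 2.0, max_polls: int = 150) -> str:
    """Poll until the job reaches a terminal status or the poll budget runs out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL:
            return status
        if status not in STAGES:
            raise ValueError(f"unexpected job status: {status!r}")
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within the poll budget")
```

Injecting `fetch_status` keeps the loop testable without network access and leaves the HTTP client and response parsing up to you.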

Net effect: fewer bespoke queues and cron glue, more predictable ingestion and easier retrieval debugging.

Build vs maintain

What starts as a simple RAG pipeline becomes infrastructure

Most teams can prototype ingestion quickly. Maintaining reliable parsing, retrieval quality, re-indexing workflows, and operational visibility over time is the harder problem.

Weekend prototype
  • Parse text
  • Generate embeddings
  • Store vectors
  • Run semantic search
Production reality
  • PDFs, scans, tables, and layouts break extraction
  • Chunking quality becomes retrieval quality
  • Async workflows need retries, state, and monitoring
  • Model + document changes force re-indexing workflows
  • Metadata, citations, and grounding become mandatory
  • Pipeline debugging becomes operational work

Indexify gives teams the ingestion and retrieval infrastructure layer so they can focus on building AI products instead of maintaining RAG plumbing.

Retrieval quality starts at ingestion

Better retrieval comes from better document understanding

Most RAG pipelines flatten documents into text chunks. Indexify preserves structure, metadata, and relationships so agents retrieve more relevant and grounded context.

Structured extraction

Preserve sections, hierarchy, tables, and metadata during ingestion—PDFs and layouts are not plain text.

Retrieval-friendly chunking

Generate chunks tuned for semantic search instead of arbitrary splits that bury the right passage.

Grounded results

Return context with metadata and citations so agents can justify answers and you can debug retrieval.

Async re-indexing

Reprocess knowledge bases when documents change or you rotate embedding models—without rewriting glue code.

Production workflows

Jobs, retries, webhooks, and observable processing state so ingestion stays reliable under load.

Tables, figures & relationships

Extract typed tables, figures, and cross-references during parse so agents can query structure—not only flattened text.

Strong ingestion reduces hallucination-prone retrieval misses—it's the input stage agents inherit downstream.

Go beyond text chunks

Built for agent developers: preserve structure end-to-end, run citation-ready search, and expose sections, elements, and relationships for navigation UIs, compliance flows, and RAG tools.

View structured document access in Guide

Typed elements

Headings, paragraphs, tables, figures, captions, and sections are stored as first-class entities.

Structured tables

Tables keep row/column/cell shape for retrieval and downstream processing.

Figure-aware retrieval

Figures are linked to captions, grounding metadata, and related elements.

Relationship graph

Typed relationships (containment, reading order, captions, cross-references, …) are available through document relationship endpoints.

Grounded search

Search supports chunk, table-row, and figure-caption retrieval with lineage and grounding metadata.

Processing visibility

Poll GET …/jobs/{jobId} for status and optional stages; list outputs with GET …/jobs/{jobId}/artifacts.

Indexify in your agent stack

Vanilla RAG
“retrieve once, answer once”
Via Indexify Search API
Agentic RAG
“think, retrieve, decide, repeat until done”
Via MCP tools that wrap multiple HTTP calls (search, elements, sections, …)
REST API

REST: call Indexify from your backend

Send a project access token on every request: Authorization: Bearer {ACCESS_TOKEN}. Paths include {projectId}, {kbId}, and document ids as shown below.

The groups below are the REST endpoints that correspond to the hosted MCP tools. The Indexify REST API covers much more—ingestion, jobs, webhooks, project credentials, KB administration, and other operations. See the API reference for the full surface.

[Diagram: Indexify architecture. Nodes: Document, Parse, Chunk, Index, Knowledge Base, User, Agent, Search, LLM. Edge labels: ingest, parsed, chunks, embeddings, question, relevant context.]

Ingestion contract

Indexify runs this lifecycle

Upload documents and let Indexify handle parsing, chunking, indexing, and job-state transitions.

Document → Ingest → Parse → Chunk → Embed → Store

POST /documents and subscribe to webhook events to automate downstream workflows.
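
On the receiving side, a webhook consumer can be as small as a dispatch table keyed by event name. The sketch below uses the three event names listed on this page. The payload fields `event` and `jobId` and the handler return values are hypothetical, so check the API reference for the real delivery shape.

```python
import json
from typing import Callable, Dict


def on_started(payload: dict) -> str:
    return f"tracking:{payload.get('jobId')}"


def on_completed(payload: dict) -> str:
    # A real handler might kick off a search-backed agent run here.
    return f"search-ready:{payload.get('jobId')}"


def on_failed(payload: dict) -> str:
    return f"alert:{payload.get('jobId')}"


HANDLERS: Dict[str, Callable[[dict], str]] = {
    "job.started": on_started,
    "job.completed": on_completed,
    "job.failed": on_failed,
}


def dispatch(raw_body: bytes) -> str:
    """Route one webhook delivery to its handler; unknown events are ignored."""
    payload = json.loads(raw_body)
    handler = HANDLERS.get(payload.get("event"))
    return handler(payload) if handler else "ignored"
```

Because deliveries may be retried, handlers are best written so that processing the same event twice is harmless.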

Retrieval contract

Indexify retrieves; your app decides and generates

Your app sends a query and receives ranked, citation-ready results. Your agent chooses how to use that context with any model or orchestration layer.

User Query → Embed → Retrieve → Context → Your LLM → Response

POST /search returns retrieval context; your LLM stack remains fully under your control.
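
What "your app decides and generates" can look like in practice: number the returned hits, attach their sources, and hand the result to whichever model you use. This is a sketch, not the Indexify response schema. The hit fields `text` and `source` are hypothetical stand-ins for whatever the search response actually returns.

```python
from typing import Iterable, Mapping


def format_context(hits: Iterable[Mapping], max_chars: int = 4000) -> str:
    """Number each hit and attach its source so the model can cite [n]."""
    parts, used = [], 0
    for n, hit in enumerate(hits, start=1):
        block = f"[{n}] {hit['text']}\n(source: {hit.get('source', 'unknown')})"
        if used + len(block) > max_chars:
            break  # stop before exceeding the context budget
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


def build_prompt(question: str, hits: Iterable[Mapping]) -> str:
    """Assemble a grounded prompt for any LLM stack."""
    return (
        "Answer using only the numbered context. Cite sources as [n].\n\n"
        f"Context:\n{format_context(hits)}\n\nQuestion: {question}"
    )
```

The character budget is a crude stand-in for token counting; swap in your model's tokenizer if you need tighter control.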

Agent loop contract

MCP tools implement the same REST contracts

Composite search (optional API diagnostics, neighbor-chunk context), section and element reads, and drill-down retrieval—without changing what the HTTP API guarantees.

Think → Retrieve → Evaluate → Refine → Retrieve → Answer

View in Guide — step-by-step tools and patterns for this loop.
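
The loop itself is easy to sketch independently of any agent framework. In the example below every callable is a hypothetical placeholder you would supply: `search` wraps a retrieval call, `is_sufficient` is your evaluation step (often an LLM judgment), and `refine` produces the next query.

```python
from typing import Callable, List


def agentic_retrieve(
    question: str,
    search: Callable[[str], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    refine: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> List[str]:
    """Retrieve, evaluate, and refine until the context suffices or rounds run out."""
    query, context = question, []
    for _ in range(max_rounds):
        for hit in search(query):            # Retrieve
            if hit not in context:           # skip passages we already hold
                context.append(hit)
        if is_sufficient(question, context):  # Evaluate
            break
        query = refine(question, context)    # Refine, then retrieve again
    return context
```

Capping rounds with `max_rounds` bounds cost and latency when the evaluator never reports success.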

Agent-ready by design

Indexify is a managed SaaS platform for developers building agents. “Agent-ready” means your agents can discover capabilities, call retrieval tools safely, and follow predictable contracts without custom glue.

View in Guide

MCP tools

Connect your IDE or agent runtime to Indexify’s hosted MCP server for read-oriented retrieval (search, chunks, elements, sections).

See the agent loop

OpenAPI contract

A strict OpenAPI spec powers the docs and API runner. Use it for codegen, validation, and keeping integrations predictable.

Download OpenAPI

Agent guide (downloadable)

A concise markdown file you can paste into a coding agent context, explaining what’s MCP vs HTTP and the golden paths.

Download agent guide
Looking for a crawler-friendly entrypoint? Start at /.well-known/llms.txt.

A data model that fits the real world

Projects keep corpora isolated. Each project contains one or more KBs (with their own parsing and retrieval settings), and search always targets a single KB.

View in Guide
Hierarchy at a glance

Tip: one KB per surface (help center, runbooks, policies) keeps config and relevance aligned.

  • Developer: owns projects
  • Project: credentials & scopes
  • Knowledge base: indexes & search config
  • Document: one KB only
Search: one KB

Chunk search and structured reads use the same KB scope—predictable ranking and citations.

POST /projects/{projectId}/kbs/{kbId}/search
GET /projects/{projectId}/kbs/{kbId}/documents/{documentId}/sections
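
Both paths share the same project and KB scope, which makes them easy to build from one place. A minimal sketch, assuming the `https://api.indexify.dev` host from the search example earlier on this page; the helper names are hypothetical.

```python
from urllib.parse import quote

API_BASE = "https://api.indexify.dev"  # assumed host, taken from the search example


def _seg(value: str) -> str:
    """Percent-encode one path segment, including any slashes inside it."""
    return quote(value, safe="")


def search_url(project_id: str, kb_id: str) -> str:
    return f"{API_BASE}/projects/{_seg(project_id)}/kbs/{_seg(kb_id)}/search"


def sections_url(project_id: str, kb_id: str, document_id: str) -> str:
    return (
        f"{API_BASE}/projects/{_seg(project_id)}/kbs/{_seg(kb_id)}"
        f"/documents/{_seg(document_id)}/sections"
    )
```

Encoding each segment separately keeps an ID that happens to contain a slash or space from changing the route.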
Example project + KB layouts

Three made-up-but-plausible setups—think of them as sketches, not a checklist. They only exist to show how you might carve a project and split KBs so search stays on-topic; your names and boundaries can be whatever fits you.

View in Guide
Education tutor

Say you are building a tutor: you might keep “what we teach” away from “how we stay safe.” Course stuff in one pile, moderation and policy in another, so searches do not blur the two.

`curriculum-kb`: standards, syllabus, pacing, rubrics
`course-materials-kb`: slides, worksheets, textbook PDFs, labs
`safety-policy-kb`: age rules, refusals, escalation, citations
SaaS support

For a support assistant, a common split is customer-facing help vs internal runbooks vs release notes. Same kind of UX; you just point the search at the KB that matches who is asking.

`help-center-kb`: FAQs, setup, troubleshooting
`runbooks-kb`: SOPs, admin steps, scripts, escalation paths
`release-notes-kb`: changelogs, migrations, known issues
Compliance assistant

Compliance is a mess of overlapping docs. One approach: a KB per policy area (HR, security, legal regs) so you are less likely to cite the wrong playbook when someone asks a narrow question.

`hr-policy-kb`: handbook, PTO, onboarding
`security-policy-kb`: access, incident response, data handling
`legal-regs-kb`: GDPR/SOC2, retention, audit mappings

What you can build after ingestion

Ask-anything knowledge chatbot

Grounded answers from your KBs: each search call targets one KB (`kbId` in the path); aggregate in your app when you need multiple corpora. Use search settings for citation metadata.

View in guide
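
Aggregating across KBs in your app can be done by rank rather than by raw score, since scores from KBs with different settings may not be comparable. The sketch below uses reciprocal rank fusion, a general ranking technique and not an Indexify feature. The input shape (a ranked list of hit IDs per KB) is hypothetical. Because each document belongs to one KB, the lists are disjoint and the fusion reduces to rank-based interleaving.

```python
from collections import defaultdict
from typing import Dict, List


def fuse_rankings(per_kb_hits: Dict[str, List[str]], k: int = 60, top_n: int = 5) -> List[str]:
    """Merge ranked hit IDs from several KBs with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for kb_id, hit_ids in per_kb_hits.items():
        for rank, hit_id in enumerate(hit_ids, start=1):
            # Namespace by KB so the caller can trace each hit back to its corpus.
            scores[f"{kb_id}:{hit_id}"] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda key: (-scores[key], key))[:top_n]
```

For example, top hits from `help-center-kb` and `runbooks-kb` are interleaved before either KB's second-ranked hit appears.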
KB-aware tool-calling agent

An agent retrieves facts from your KB, then calls APIs and tools to execute actions with context instead of guesswork.

View in guide
Document navigation copilot

Let users navigate long documents by structure: sections, headings, tables, and related elements, not plain text only.

View in guide
Workflow automation via the HTTP API

Drive backend workflows when webhooks fire: your service calls search, parsed content, and structure APIs. Optional MCP tooling (see Developer Guide) can expose the same operations to IDE agents—it is not part of the OpenAPI document.

View in guide
Compliance and policy Q&A

Scoped to one knowledge base per search request; add per-hit source metadata on search when you need traceable citations in your UI.

View in guide
IDE knowledge agent (MCP)

Connect Cursor or another MCP host to your KB: the agent searches internal docs, reads chunks and sections, and cites sources while you code—no custom REST glue in the IDE.

View in guide

Supported formats

Upload formats match the document upload contract: parsing follows your KB's settings.parse (OCR, transcription, and output formats are configuration-dependent).

View in Guide

Rich documents

PDF, DOCX, PPTX

Markup

HTML, Markdown, AsciiDoc, WebVTT

Tabular

XLSX, CSV

Images

PNG, JPEG, TIFF, BMP, WEBP (OCR when configured)

Audio

MP3, WAV (transcription when configured)

Limits (files per request, max size) are defined on the upload operation in the API reference.
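
A client-side pre-check can reject unsupported files before they reach the upload endpoint. The categories below mirror the list above; the file extensions chosen for each format (for example `.adoc` for AsciiDoc and `.vtt` for WebVTT) are assumptions, and the server-side upload contract remains the source of truth.

```python
from pathlib import PurePath
from typing import Optional

# Categories follow the list above; the extension mapping is an assumption.
FORMATS = {
    "rich document": {".pdf", ".docx", ".pptx"},
    "markup": {".html", ".md", ".adoc", ".vtt"},
    "tabular": {".xlsx", ".csv"},
    "image": {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".webp"},
    "audio": {".mp3", ".wav"},
}


def classify(filename: str) -> Optional[str]:
    """Return the format category for a filename, or None if it is not listed."""
    ext = PurePath(filename).suffix.lower()
    for category, extensions in FORMATS.items():
        if ext in extensions:
            return category
    return None
```

Knowing the category up front also tells you whether OCR or transcription settings on the KB will matter for that file.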

Supported Embedding Models

Choose from leading embedding providers. Configure your knowledge base with the model that fits your latency, cost, and quality needs.

OpenAI

openai:text-embedding-3-small, openai:text-embedding-3-large

Cohere

cohere:embed-v3

Google

google:text-embedding-004

Voyage AI

voyage:voyage-3, voyage:voyage-3-lite, voyage:voyage-3.5-lite

Nomic

nomic:nomic-embed-text-v1.5

AWS (Bedrock)

aws:titan-embed-text-v1
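
The identifiers above follow a `provider:model` pattern, so a KB configuration can be validated before it is sent. A minimal sketch: the table is copied from the list above, and the helper name `parse_model_id` is hypothetical.

```python
from typing import Tuple

# Copied from the supported embedding models listed above.
SUPPORTED = {
    "openai": {"text-embedding-3-small", "text-embedding-3-large"},
    "cohere": {"embed-v3"},
    "google": {"text-embedding-004"},
    "voyage": {"voyage-3", "voyage-3-lite", "voyage-3.5-lite"},
    "nomic": {"nomic-embed-text-v1.5"},
    "aws": {"titan-embed-text-v1"},
}


def parse_model_id(model_id: str) -> Tuple[str, str]:
    """Split "provider:model" and check it against the supported list."""
    provider, sep, model = model_id.partition(":")
    if not sep or model not in SUPPORTED.get(provider, set()):
        raise ValueError(f"unsupported embedding model: {model_id!r}")
    return provider, model
```

Failing fast here gives a clearer error than a rejected KB configuration request.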

Supported Rerank Engines

Configure the rerank provider and model on the knowledge base. On each POST …/search call, set settings.rerank.isEnabled to "yes" to run that reranker (see OpenAPI SearchRequest / SearchRequestSettings).

Cohere

rerank-v4.0-pro, rerank-v4.0-fast, rerank-v3.5, rerank-english-v3.0, rerank-multilingual-v3.0

Voyage AI

rerank-2.5, rerank-2.5-lite, rerank-2, rerank-2-lite, rerank-1, rerank-lite-1

Jina

jina-reranker-v3, jina-reranker-m0, jina-reranker-v2-base-multilingual, jina-colbert-v2
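
Since the provider and model live on the knowledge base, the per-request part is only the enable switch. The sketch below builds a search body with `settings.rerank.isEnabled` set to `"yes"` as described above. The helper name is hypothetical, and treating an omitted `rerank` object as "disabled" is an assumption.

```python
def search_body(query: str, rerank: bool = False, mode: str = "hybrid") -> dict:
    """Build a search request body, optionally enabling the KB's reranker."""
    settings: dict = {"searchMode": mode}
    if rerank:
        # This page specifies the string "yes" here, not a JSON boolean.
        settings["rerank"] = {"isEnabled": "yes"}
    return {"query": query, "settings": settings}
```

Keeping the switch per request lets you rerank only the queries where the extra latency pays off.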

Why Indexify vs DIY, libraries, or managed alternatives

Most builders can ship an ingest prototype. Fewer founders and small teams want to own parser drift, re-index cycles, retrieval tuning, and operational glue in production. Indexify provides that layer as a stable API.

DIY vs Indexify

| Aspect | DIY | Indexify |
|---|---|---|
| Time to first result | Days / weeks | Minutes with API-first setup |
| Retry & failures | Custom code | Jobs + webhooks + retry-safe patterns |
| Re-processing | Painful | First-class reprocess workflows |

Libraries (LlamaIndex/LangChain) vs Indexify

| Aspect | Libraries | Indexify |
|---|---|---|
| Type | Framework / SDK | Managed platform |
| Infra required | Yes | No |
| Ingestion lifecycle | DIY | Managed parse/chunk/index lifecycle |

LlamaCloud (Managed) vs Indexify

| Aspect | LlamaCloud | Indexify |
|---|---|---|
| Type | Managed LlamaIndex backend | Managed ingestion platform |
| Ingestion lifecycle | LlamaIndex-native | API-driven lifecycle + retrieval controls |
| Stack lock-in | LlamaIndex ecosystem | Vendor-neutral REST integration |

Sign up, wire auth, then KB → upload → search.

Project + credentials → POST /oauth/token (client_credentials) → KB → upload → jobs/webhooks → POST …/search. Same path for a weekend prototype or your first production agent product—use the API runner to exercise the sequence.
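
The token step in that sequence can be sketched as follows. Only the `POST /oauth/token` path and the `client_credentials` grant come from this page. The `https://api.indexify.dev` host, the form-encoded body, and the `client_id` / `client_secret` field names follow common OAuth 2.0 usage and are assumptions here, so confirm them against the API reference. The request is built but not sent.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.indexify.dev"  # assumed host, taken from the search example


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client_credentials token request."""
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,          # assumed field name
            "client_secret": client_secret,  # assumed field name
        }
    ).encode("ascii")
    return urllib.request.Request(
        f"{API_BASE}/oauth/token",
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder credentials; substitute your project's values.
token_req = build_token_request("my-client-id", "my-client-secret")
```

The access token from the response is what goes into the `Authorization: Bearer` header on later calls such as search.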