Preview / closed beta. Interested? Email info@indexify.dev.

Developer Guide

Setup & Security

A complete, hands-on guide for developers building agents: start from scratch, get a working integration shipped, then level it up with advanced retrieval, structured document access, and the operational details you’ll want in production.

Authentication and Security

Indexify exposes two authorization paths on the public API: a developer bearer token for account and project administration, and project access tokens (OAuth2 client credentials) for runtime work on knowledge bases, documents, jobs, search, and webhooks. Design least-privilege credentials per service, rotate secrets deliberately, and verify webhook signatures before any side effect. Deep dives: scopes, webhook verification.

If you’re reading this as a reference, feel free to jump around—each section is written to stand on its own, and the left-hand search is the fastest way to find a specific endpoint or concept.

Human auth vs machine auth

Use the developer JWT only for account and project operations; use project access tokens for runtime workflows.

Tip: the snippets are meant to be copy/paste friendly—start with the happy path, then intentionally poke at scopes, missing fields, and bad inputs so you know exactly what your app will see in production.

  • Which token when
    • Developer JWT — Control & Ingestion Plane actions (developer console / account UI and related APIs) by authenticated humans.
    • Project access token — Server automation: KBs, ingestion, jobs, search, webhooks.
    • Never send project credentials to browsers or untrusted clients.
    • Prefer backend token exchange so frontends never hold machine credentials.

Developer JWT (human identity) -> Control & Ingestion Plane actions

List projects (optional ?limit=20&sort=createdAt&order=desc; use nextCursor as cursor for the next page)

curl -s -G "https://api.indexify.dev/projects" \
  --data-urlencode "limit=20" \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJkZXYifQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
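
The cursor pagination mentioned above can be walked with a small loop. This is a minimal Python sketch under stated assumptions, not Indexify client code: only the `limit` and `cursor` query parameters and the `nextCursor` response field come from this page. The `items` field name and the `fetch_page` callable are hypothetical; you would back `fetch_page` with an authenticated GET to /projects that returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical page fetcher: takes query params, returns the decoded JSON body.
FetchPage = Callable[[Dict[str, str]], Dict[str, Any]]


def iter_projects(fetch_page: FetchPage, limit: int = 20) -> Iterator[Dict[str, Any]]:
    """Yield projects page by page, following nextCursor until it is absent."""
    cursor: Optional[str] = None
    while True:
        params = {"limit": str(limit)}
        if cursor:
            params["cursor"] = cursor  # nextCursor from the previous page
        page = fetch_page(params)
        # ASSUMPTION: the list lives under "items"; adjust to the real response shape.
        yield from page.get("items", [])
        cursor = page.get("nextCursor")
        if not cursor:
            break
```

Passing the fetcher in as a parameter keeps the loop independent of any HTTP library and easy to exercise with a fake.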

Project access token (machine identity) -> runtime data plane actions

OAuth2 client_credentials grant (credentials in form body only; HTTP Basic not supported)

curl -s -X POST "https://api.indexify.dev/oauth/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  --data-urlencode "grant_type=client_credentials" \
  --data-urlencode "client_id=idx_sample_client_id_01" \
  --data-urlencode "client_secret=idx_sample_client_secret_01"
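
For services that request tokens from code rather than curl, the same exchange can be sketched with the Python standard library. The helper names are hypothetical, and the `access_token` response field is the standard OAuth2 name, assumed here rather than confirmed by this page. Credentials go in the form body and no Authorization header is set, matching the note that HTTP Basic is not supported.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.indexify.dev/oauth/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the client_credentials POST with credentials in the form body."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Exchange credentials for a token (performs a network call)."""
    req = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(req, timeout=10) as resp:
        payload = json.load(resp)
    # ASSUMPTION: standard OAuth2 field name; not confirmed by this page.
    return payload["access_token"]
```

In a real service, load the client ID and secret from a secret manager or environment variables instead of hard-coding them.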

Use the short-lived access token for KB, document, search, and webhook endpoints

curl -s -X POST "https://api.indexify.dev/projects/550e8400-e29b-41d4-a716-446655440001/kbs/660e8400-e29b-41d4-a716-446655440002/search" \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJtYWNoaW5lIn0.dBjftJeZ4CVP-mB92K27uhbUJU1p1b_wW1gFWFOEjXk" \
  -H "Content-Type: application/json" \
  -d '{"query":"incident response policy","settings":{"includeSourceMetadata":true}}'
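
Because the access token is short-lived, a backend usually reuses it until shortly before expiry instead of requesting a new one per call. The sketch below is one possible approach, not an Indexify feature: `TokenCache` and the `fetch` callable are hypothetical, and it assumes the token response reports a lifetime in seconds (the standard OAuth2 `expires_in` field), which this page does not confirm.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical fetcher: performs the client_credentials exchange and returns
# (access_token, lifetime_in_seconds).
FetchToken = Callable[[], Tuple[str, float]]


class TokenCache:
    """Reuse a short-lived token and refresh it shortly before it expires."""

    def __init__(self, fetch: FetchToken, skew_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when nothing is cached or the token is inside the skew window.
        if self._token is None or now >= self._expires_at - self._skew:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

The skew window avoids sending a token that expires while the request is in flight. This version is not thread-safe; wrap `get` in a lock if several threads share one cache.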

Search with optional diagnostics (retrieval stats on the response; no query echo)

curl -s -X POST "https://api.indexify.dev/projects/550e8400-e29b-41d4-a716-446655440001/kbs/660e8400-e29b-41d4-a716-446655440002/search?format=markdown" \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJtYWNoaW5lIn0.dBjftJeZ4CVP-mB92K27uhbUJU1p1b_wW1gFWFOEjXk" \
  -H "Content-Type: application/json" \
  -d '{"query":"incident response policy","settings":{"diagnostics":true,"includeSourceMetadata":true}}'

Scope model and least privilege

Grant each service component (ingest, search, webhooks) only the scopes it requires.

Use the bullets as a checklist: read once, then keep this open while you implement so you don’t miss the small but important details (like token type, scopes, and strict query rules).

  • Least privilege
    • Map each service to minimum API capabilities; issue dedicated credentials.
    • Split read-only retrieval from write-capable ingestion.
    • Review scopes at release time; keep an auditable inventory.
    • Separate credentials per environment so dev leaks cannot hit production.

Available scopes and what each one allows

Use this as the canonical scope reference when designing service credentials for Indexify APIs.

  • Scope reference
    • kb:read — List/get KBs and settings.
    • kb:write — Create/update KBs; write-level KB ops (e.g. bulk reprocess).
    • kb:delete — Delete KB resources.
    • docs:read — Document metadata and original file content download (GET …/content).
    • docs:ingest — Upload/reprocess; ingestion writes.
    • docs:delete — Delete documents.
    • docs:parsed:read — Parsed body, chunks, sections, elements, tables, figures, relationships, and artifacts.
    • jobs:read — List jobs, get job detail (including optional per-stage breakdown), list job artifacts.
    • jobs:cancel — Cancel running jobs.
    • jobs:retry — Retry failed jobs.
    • search:run — KB search.
    • webhooks:read / webhooks:write / webhooks:delete — Webhook config CRUD.
    • webhooks:test — Test webhook deliveries.
    • admin:* — Wildcard; use only for tightly controlled automation.
    • Default credentials are broad; prefer purpose-specific credentials per service in production.
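
One way to keep the scope inventory auditable is to declare each service's required scopes in code and compare them with what a credential actually holds. The sketch below uses scope names from the reference above, but the service names and helper functions are hypothetical. How granted scopes are reported for a credential is not described on this page, so the functions take plain collections. Treating `admin:*` as covering every scope is also an assumption based on its description as a wildcard.

```python
from typing import Iterable, Set

WILDCARD = "admin:*"

# Hypothetical per-service inventory using scope names from the reference above.
SERVICE_SCOPES = {
    "search-api": {"search:run", "kb:read"},
    "ingest-worker": {"docs:ingest", "jobs:read"},
    "webhook-admin": {"webhooks:read", "webhooks:write", "webhooks:test"},
}


def missing_scopes(required: Iterable[str], granted: Iterable[str]) -> Set[str]:
    """Scopes the service needs but the credential lacks."""
    granted_set = set(granted)
    if WILDCARD in granted_set:  # ASSUMPTION: the wildcard covers every scope
        return set()
    return set(required) - granted_set


def excess_scopes(required: Iterable[str], granted: Iterable[str]) -> Set[str]:
    """Scopes the credential holds beyond what the service needs."""
    return set(granted) - set(required)
```

Running `excess_scopes` at release time flags credentials that have drifted beyond least privilege, including any that still carry `admin:*`.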

Webhook signature verification

Validate X-Indexify-Signature before processing webhook deliveries: HMAC-SHA256 over the raw body bytes, hex-encoded in lowercase, compared with a constant-time check.

  • Verification
    • Read the raw POST body exactly as received—before JSON parsing—so the signed bytes match what Indexify hashed.
    • Compute HMAC-SHA256 with your webhook secret as the key and those raw bytes as the message; compare the lowercase hex digest to X-Indexify-Signature using a constant-time comparison.
    • Reject stale or replayed deliveries when your contract exposes timestamps or nonces.
  • OpenSSL one-liner
    • With body bytes in body.raw and secret in WEBHOOK_SECRET, the printed hex should match the header:
openssl dgst -sha256 -hmac "$WEBHOOK_SECRET" < body.raw | awk '{print $NF}'
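
The verification steps above can be sketched in Python with the standard `hmac` and `hashlib` modules. This is an illustrative handler helper, not official Indexify code. It assumes the header carries the bare lowercase hex digest with no prefix and that the secret is used as its UTF-8 bytes. The function names are hypothetical.

```python
import hashlib
import hmac


def expected_signature(secret: str, raw_body: bytes) -> str:
    """Lowercase hex HMAC-SHA256 of the raw body, keyed by the webhook secret."""
    # ASSUMPTION: the secret string is used as UTF-8 bytes for the HMAC key.
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Compare the computed digest with X-Indexify-Signature in constant time."""
    if not header_value:
        return False
    expected = expected_signature(secret, raw_body).encode("ascii")
    # Compare as bytes so a non-ASCII header value fails cleanly instead of raising.
    received = header_value.strip().encode("utf-8")
    return hmac.compare_digest(expected, received)
```

Call `verify_signature` with the request body exactly as received, before any JSON parsing, and return an error response without side effects when it is false.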