Skip to content

API Reference

Base path: /api/v1

Conventions

Standard Envelope

Most endpoints return:

{
  "data": {},
  "meta": {}
}
  • meta is optional (pagination, recalculation counts, etc.).

Errors return:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable message",
    "details": []
  }
}

details is optional (usually validation details).

Endpoints That Do Not Use the Envelope

  • GET /api/v1/jobs/:jobId/progress (SSE stream)
  • GET /api/v1/export/duplicates.csv (raw CSV download)
  • GET /api/v1/export/config.json (raw JSON download)
  • GET /api/v1/paperless/documents/:paperlessId/preview (proxied binary stream)
  • GET /api/v1/paperless/documents/:paperlessId/thumb (proxied binary stream)
  • GET /api/v1/metrics (Prometheus text exposition format)

Error Codes

Code Status Meaning
BAD_REQUEST 400 Invalid path/query input
VALIDATION_FAILED 400 Invalid request body or params
UNAUTHORIZED 401 Auth failed/missing
NOT_FOUND 404 Resource not found
CONFLICT 409 State conflict
JOB_ALREADY_RUNNING 409 Same-type job already running/pending
BAD_GATEWAY 502 Upstream Paperless request failed
NOT_READY 503 Readiness checks failed
INTERNAL_ERROR 500 Unexpected server error

Pagination

Common pagination params:

Param Default Range
limit 50 1..100
offset 0 >=0

GET /api/v1/jobs uses limit range 1..200.

Health

GET /api/v1/health

Returns process liveness.

{
  "data": {
    "status": "ok",
    "timestamp": "2026-02-15T10:00:00.000Z"
  }
}

GET /api/v1/ready

Returns readiness checks (database + Paperless reachability).

{
  "data": {
    "status": "ready",
    "checks": {
      "database": { "status": "ok" },
      "paperless": { "status": "ok" }
    }
  }
}

If any check fails, returns HTTP 503 with NOT_READY.

GET /api/v1/metrics

Returns application metrics in Prometheus exposition format. Requires OTEL_PROMETHEUS_ENABLED=true.

Returns 404 with a text message when the feature is not enabled.

Content-Type: text/plain; version=0.0.4; charset=utf-8

See Configuration > Prometheus Scrape Endpoint for setup details.

Dashboard

GET /api/v1/dashboard

Returns summary cards and top correspondents.

Sync and Analysis

POST /api/v1/sync

Starts sync job.

Optional JSON body:

{ "force": true }

Response (202):

{ "data": { "jobId": "..." } }

GET /api/v1/sync/status

Returns last sync metadata plus active sync job status.

POST /api/v1/analysis

Starts analysis job.

Optional JSON body:

{ "force": true }

Response (202):

{ "data": { "jobId": "..." } }

GET /api/v1/analysis/status

Returns last analysis metadata plus active analysis job status.

Jobs

GET /api/v1/jobs

List jobs.

Query params:

  • type: sync | analysis | batch_operation
  • status: pending | running | completed | failed | cancelled
  • limit: 1..200

GET /api/v1/jobs/:jobId

Get one job.

Job object fields:

{
  "id": "...",
  "type": "sync",
  "status": "running",
  "progress": 0.42,
  "progressMessage": "...",
  "startedAt": "...",
  "completedAt": null,
  "errorMessage": null,
  "resultJson": null,
  "createdAt": "..."
}

progress is 0..1.

GET /api/v1/jobs/:jobId/progress

SSE stream.

Events:

  • progress
  • complete

Event data shape:

{
  "progress": 0.42,
  "message": "...",
  "status": "running"
}

POST /api/v1/jobs/:jobId/cancel

Cancels a pending/running job.

{ "data": { "jobId": "...", "status": "cancelled" } }

Configuration

GET /api/v1/config

Returns app config key/value map from DB.

PUT /api/v1/config

Accepts either:

{ "key": "some.key", "value": "some-value" }

or:

{ "settings": { "some.key": "value", "another.key": "value" } }

Returns full config map.

POST /api/v1/config/test-connection

Tests Paperless connectivity.

  • If body includes url, validates explicit body config.
  • Otherwise uses current runtime env config.

Body shape when explicit:

{
  "url": "http://paperless:8000",
  "token": "..."
}

or

{
  "url": "http://paperless:8000",
  "username": "admin",
  "password": "..."
}

Success:

{
  "data": {
    "connected": true,
    "version": "2.x",
    "documentCount": 1234
  }
}

GET /api/v1/config/dedup

Returns effective dedup config.

PUT /api/v1/config/dedup

Updates dedup config (partial).

  • Requires Content-Type: application/json
  • confidenceWeightJaccard + confidenceWeightFuzzy must sum to 100
  • discriminativePenaltyStrength is independent (0..100)

If weight or penalty keys change, meta.recalculatedGroups is included. If any LSH/threshold parameters change in a way that makes the current analysis stale, meta.analysisStale is included.

Legacy field name confidenceWeightDiscriminative is accepted and automatically converted to discriminativePenaltyStrength.

{
  "data": {
    "numPermutations": 256,
    "numBands": 32,
    "ngramSize": 3,
    "minWords": 20,
    "similarityThreshold": 0.75,
    "confidenceWeightJaccard": 60,
    "confidenceWeightFuzzy": 40,
    "discriminativePenaltyStrength": 70,
    "fuzzySampleSize": 10000,
    "autoAnalyze": true
  },
  "meta": {
    "recalculatedGroups": 42
  }
}

Documents

GET /api/v1/documents

Lists documents.

Query params:

  • limit, offset
  • correspondent
  • documentType
  • tag
  • processingStatus (pending | completed)
  • search (title match)
  • noAiResult (true to filter documents without an AI result)

GET /api/v1/documents/:id

Returns one document, plus content and group memberships.

GET /api/v1/documents/:id/content

Returns the text content for a single document.

{
  "data": {
    "fullText": "...",
    "wordCount": 450
  }
}

Returns 404 NOT_FOUND if the document content is not available.

GET /api/v1/documents/stats

Returns aggregate document analytics.

Duplicates

GET /api/v1/duplicates

Lists duplicate groups.

Query params:

  • limit, offset
  • minConfidence, maxConfidence (0..1)
  • status (comma-separated: pending,false_positive,ignored,deleted)
  • sortBy (confidence | created_at | member_count)
  • sortOrder (asc | desc)

GET /api/v1/duplicates/:id

Returns group details.

Optional query:

  • light=true to omit full member text content.

DELETE /api/v1/duplicates/:id

Deletes group record and memberships (does not delete documents in Paperless).

GET /api/v1/duplicates/:id/content

Returns text content for two members in a group.

Required query params:

  • docA
  • docB

PUT /api/v1/duplicates/:id/status

Request:

{ "status": "false_positive" }

Valid statuses: pending, false_positive, ignored, deleted.

PUT /api/v1/duplicates/:id/primary

Request:

{ "documentId": "..." }

GET /api/v1/duplicates/stats

Returns totals by status, confidence buckets, and top correspondents.

GET /api/v1/duplicates/graph

Returns graph nodes/edges for visualization.

Query params:

  • minConfidence, maxConfidence
  • status (comma-separated)
  • maxGroups (1..500, default 100)

AI Processing

All AI endpoints return 400 BAD_REQUEST when AI_ENABLED is false.

GET /api/v1/ai/config

Returns the current AI configuration object.

{
  "data": {
    "provider": "openai",
    "model": "gpt-5.4-mini",
    "maxContentLength": 8000,
    "batchSize": 10,
    "rateDelayMs": 500,
    "autoProcess": false,
    "includeCorrespondents": false,
    "includeDocumentTypes": false,
    "includeTags": false,
    "reasoningEffort": "low",
    "maxRetries": 3
  }
}

PUT /api/v1/ai/config

Partial update of AI config. Validated against the config schema.

{
  "provider": "openai",
  "model": "gpt-5.4-mini",
  "includeCorrespondents": true
}

Returns the full updated config.

GET /api/v1/ai/models

Returns available models for a provider.

Query params:

  • provider: openai (required)
{
  "data": [
    { "id": "gpt-5.4", "name": "GPT-5.4" },
    { "id": "gpt-5.4-mini", "name": "GPT-5.4 Mini" },
    { "id": "gpt-5.4-nano", "name": "GPT-5.4 Nano" }
  ]
}

POST /api/v1/ai/process

Starts an AI processing batch job.

Optional JSON body:

{
  "reprocess": false,
  "documentIds": ["doc-id-1", "doc-id-2"]
}
  • reprocess: if true, re-processes all documents (not just new ones)
  • documentIds: process only specific documents

Response (202):

{ "data": { "jobId": "..." } }

Returns 409 JOB_ALREADY_RUNNING if a processing job is already active.

GET /api/v1/ai/results

Lists AI processing results.

Query params:

  • status: pending, applied, rejected, partial
  • search: title substring match
  • failed: true to show only failed results
  • minConfidence, maxConfidence: filter by confidence range (0..1)
  • provider: filter by AI provider name
  • model: filter by AI model name
  • limit, offset
{
  "data": [
    {
      "id": "...",
      "documentTitle": "Invoice 2024-001",
      "suggestedCorrespondent": "Amazon",
      "suggestedDocumentType": "Invoice",
      "suggestedTags": ["shopping", "2024"],
      "confidence": { "correspondent": 0.95, "documentType": 0.90, "tags": 0.80 },
      "currentCorrespondent": null,
      "appliedStatus": "pending"
    }
  ],
  "meta": { "total": 42, "limit": 20, "offset": 0 }
}

GET /api/v1/ai/results/:id

Returns full details for a single result, including token counts and processing time.

POST /api/v1/ai/results/:id/apply

Applies AI suggestions to the document in Paperless-NGX.

Optional JSON body:

{
  "fields": ["correspondent", "tags"],
  "allowClearing": false,
  "createMissingEntities": true
}
  • fields: if omitted, all three fields (correspondent, documentType, tags) are applied. Partial field lists result in partial status.
  • allowClearing: when true, allows setting fields to null. Default false.
  • createMissingEntities: when true, missing correspondents, document types, and tags are created automatically in Paperless-NGX. Default true.
{ "data": { "applied": true } }

POST /api/v1/ai/results/:id/reject

Marks a result as rejected.

Optional JSON body:

{ "reason": "Incorrect suggestion" }
  • reason: optional string explaining why the result was rejected.
{ "data": { "rejected": true } }

POST /api/v1/ai/results/batch-apply

Applies multiple results as a background job.

{
  "resultIds": ["id-1", "id-2"],
  "fields": ["correspondent", "documentType", "tags"],
  "allowClearing": false,
  "createMissingEntities": true
}

Alternatively, use a scope object instead of resultIds:

{
  "scope": { "type": "current_filter", "filters": { "status": "pending" } },
  "fields": ["correspondent", "documentType", "tags"]
}

Response (202):

{ "data": { "jobId": "..." } }

Returns 409 JOB_ALREADY_RUNNING if an apply job is already active.

POST /api/v1/ai/results/batch-reject

Rejects multiple results.

{ "resultIds": ["id-1", "id-2"] }

Response:

{ "data": { "rejected": 2 } }

POST /api/v1/ai/results/reject-all

Rejects all pending AI results in one call. No request body required.

{ "data": { "rejected": 15 } }

Returns 0 if there are no pending results.

POST /api/v1/ai/results/apply-all

Applies all pending AI results as a background job.

Optional JSON body:

{
  "scope": { "type": "all_pending" },
  "fields": ["correspondent", "documentType", "tags"],
  "allowClearing": false,
  "createMissingEntities": true
}
  • scope: defaults to { "type": "all_pending" }. Can also use { "type": "current_filter", "filters": {...} }.
  • fields: defaults to all three fields if omitted.

Response (202):

{ "data": { "jobId": "..." } }

Returns 409 JOB_ALREADY_RUNNING if an apply job is already active.

POST /api/v1/ai/results/preflight

Pre-validates what would happen if results were applied, without making changes.

Request:

{
  "scope": { "type": "all_pending" },
  "fields": ["correspondent", "documentType", "tags"],
  "allowClearing": false,
  "createMissingEntities": true
}
  • scope is required.
  • fields: defaults to all three.
  • createMissingEntities: default true.

Returns a summary of entities that would be created, documents that would be modified, and any conflicts.

GET /api/v1/ai/results/groups

Groups AI results by a specified field.

Required query param:

  • groupBy: suggestedCorrespondent, suggestedDocumentType, failureType, or confidenceBand

Optional filter query params:

  • status, search, failed (true), minConfidence, maxConfidence, provider, model, changedOnly (true)

Returns grouped result counts and summaries.

POST /api/v1/ai/results/:id/revert

Reverts a previously applied AI result, restoring the original field values in Paperless-NGX using the pre-apply snapshot.

No request body required.

{ "data": { "reverted": true } }

Error conditions:

  • 404 NOT_FOUND if the result does not exist
  • 400 BAD_REQUEST if the result was not applied or has no pre-apply snapshot

POST /api/v1/ai/results/:id/feedback

Records user feedback on an AI result.

Request:

{
  "action": "rejected",
  "rejectedFields": ["correspondent"],
  "corrections": { "correspondent": "Acme Corp" },
  "reason": "Wrong company identified"
}
  • action (required): rejected, corrected, or partial_applied
  • rejectedFields: array of field names that were incorrect
  • corrections: object mapping field names to corrected values
  • reason: optional free-text explanation

Response (201):

{ "data": { "recorded": true } }

GET /api/v1/ai/feedback/summary

Returns an aggregate summary of all recorded AI feedback.

GET /api/v1/ai/costs

Returns AI cost statistics.

Optional query param:

  • days: integer, limits stats to the last N days

GET /api/v1/ai/costs/estimate

Estimates the cost of processing a batch of documents.

Required query param:

  • documentCount: positive integer

Returns 404 NOT_FOUND if no pricing data is available for the configured model.

GET /api/v1/ai/stats

Returns aggregate AI processing statistics.

{
  "data": {
    "totalProcessed": 150,
    "pendingReview": 42,
    "applied": 95,
    "rejected": 13,
    "failed": 3,
    "totalPromptTokens": 450000,
    "totalCompletionTokens": 25000
  }
}

Document Q&A

All endpoints return 404 when RAG_ENABLED=false.

GET /api/v1/rag/config

Returns current RAG configuration.

{
  "data": {
    "embeddingModel": "text-embedding-3-small",
    "embeddingDimensions": 1536,
    "chunkSize": 400,
    "chunkOverlap": 40,
    "topK": 10,
    "answerProvider": "openai",
    "answerModel": "gpt-5.4-mini",
    "systemPrompt": "You are a helpful document assistant...",
    "maxContextTokens": 4000,
    "autoIndex": false
  }
}

PUT /api/v1/rag/config

Updates RAG configuration. Accepts a partial object — only included fields are updated.

{
  "embeddingModel": "text-embedding-3-large",
  "topK": 15,
  "autoIndex": true
}

POST /api/v1/rag/ask

Streams a Q&A response. Send a question and optionally a conversation ID for multi-turn.

Request:

{
  "question": "What was my electricity bill last quarter?",
  "conversationId": "abc123"
}

Response: Server-Sent Events stream (Vercel AI SDK data stream format). Custom headers:

Header Description
X-Conversation-Id ID of the conversation (created if not provided)
X-Sources JSON array of source citations

Each source citation:

{
  "documentId": "doc_abc",
  "title": "British Gas Energy Bill Q1 2024",
  "chunkContent": "Period: 1 Jan - 31 Mar 2024. Total: £142.50...",
  "score": 0.034
}

POST /api/v1/rag/index

Starts a background indexing job. Returns 202 with a job ID.

Request (optional):

{
  "rebuild": true
}
Field Type Default Description
rebuild boolean false When true, deletes all existing chunks and re-indexes everything. Required after changing embedding model or dimensions.

Response:

{
  "data": { "jobId": "job_xyz" }
}

Track progress via GET /api/v1/jobs/:jobId/progress (SSE).

GET /api/v1/rag/stats

Returns index and conversation statistics.

{
  "data": {
    "totalChunks": 12500,
    "indexedDocuments": 2400,
    "unindexedDocuments": 100,
    "embeddingModel": "text-embedding-3-small",
    "lastIndexedAt": "2026-03-23T10:30:00.000Z",
    "totalConversations": 15,
    "totalMessages": 87
  }
}

GET /api/v1/rag/conversations

Lists conversations, ordered by most recent first.

Query parameters: limit (default 20), offset (default 0).

{
  "data": [
    {
      "id": "conv_abc",
      "title": "What was my electricity bill last quarter?",
      "createdAt": "2026-03-23T10:30:00.000Z",
      "updatedAt": "2026-03-23T10:32:00.000Z",
      "messageCount": 4
    }
  ],
  "meta": { "total": 15 }
}

GET /api/v1/rag/conversations/:id

Returns a single conversation with all messages.

{
  "data": {
    "id": "conv_abc",
    "title": "What was my electricity bill last quarter?",
    "createdAt": "2026-03-23T10:30:00.000Z",
    "updatedAt": "2026-03-23T10:32:00.000Z",
    "messages": [
      {
        "id": "msg_1",
        "role": "user",
        "content": "What was my electricity bill last quarter?",
        "sourcesJson": "[{\"documentId\":\"doc_abc\",\"title\":\"...\"}]",
        "tokenUsage": null,
        "createdAt": "2026-03-23T10:30:00.000Z"
      },
      {
        "id": "msg_2",
        "role": "assistant",
        "content": "Based on your British Gas energy bill...",
        "sourcesJson": null,
        "tokenUsage": 450,
        "createdAt": "2026-03-23T10:30:05.000Z"
      }
    ]
  }
}

DELETE /api/v1/rag/conversations/:id

Deletes a conversation and all its messages. Returns 404 if not found.

{
  "data": { "deleted": true }
}

Batch Operations

POST /api/v1/batch/status

Updates status for many groups.

{
  "groupIds": ["...", "..."],
  "status": "ignored"
}

groupIds must contain 1..1000 ids.

Response:

{ "data": { "updated": 2 } }

POST /api/v1/batch/delete-non-primary

Starts destructive batch delete in Paperless (background job).

{
  "groupIds": ["...", "..."],
  "confirm": true
}

Rules:

  • confirm must be true
  • all groups must currently be pending

Response (202):

{ "data": { "jobId": "..." } }

POST /api/v1/batch/purge-deleted

Purges all duplicate groups with deleted status from the database. No request body required.

Returns the count of purged groups.

Export / Import

GET /api/v1/export/duplicates.csv

Downloads CSV. Supports same filter params as GET /api/v1/duplicates.

GET /api/v1/export/config.json

Downloads config backup JSON.

POST /api/v1/import/config

Imports config backup JSON.

Response:

{
  "data": {
    "appConfigKeys": 5,
    "dedupConfigUpdated": true
  }
}

Paperless Proxy Endpoints

GET /api/v1/paperless/documents/:paperlessId/preview

Proxies Paperless preview stream.

GET /api/v1/paperless/documents/:paperlessId/thumb

Proxies Paperless thumbnail stream.

GET /api/v1/paperless/trash

Returns recycle bin count.

{ "data": { "count": 12 } }

POST /api/v1/paperless/trash

Request:

{ "action": "empty" }

Response:

{ "data": { "emptied": true } }