API Reference¶
Base path: /api/v1
Conventions¶
Standard Envelope¶
Most endpoints return:
metais optional (pagination, recalculation counts, etc.).
Errors return:
details is optional (usually validation details).
Endpoints That Do Not Use the Envelope¶
GET /api/v1/jobs/:jobId/progress(SSE stream)GET /api/v1/export/duplicates.csv(raw CSV download)GET /api/v1/export/config.json(raw JSON download)GET /api/v1/paperless/documents/:paperlessId/preview(proxied binary stream)GET /api/v1/paperless/documents/:paperlessId/thumb(proxied binary stream)GET /api/v1/metrics(Prometheus text exposition format)
Error Codes¶
| Code | Status | Meaning |
|---|---|---|
BAD_REQUEST |
400 | Invalid path/query input |
VALIDATION_FAILED |
400 | Invalid request body or params |
UNAUTHORIZED |
401 | Auth failed/missing |
NOT_FOUND |
404 | Resource not found |
CONFLICT |
409 | State conflict |
JOB_ALREADY_RUNNING |
409 | Same-type job already running/pending |
BAD_GATEWAY |
502 | Upstream Paperless request failed |
NOT_READY |
503 | Readiness checks failed |
INTERNAL_ERROR |
500 | Unexpected server error |
Pagination¶
Common pagination params:
| Param | Default | Range |
|---|---|---|
limit |
50 |
1..100 |
offset |
0 |
>=0 |
GET /api/v1/jobs uses limit range 1..200.
Health¶
GET /api/v1/health¶
Returns process liveness.
GET /api/v1/ready¶
Returns readiness checks (database + Paperless reachability).
{
"data": {
"status": "ready",
"checks": {
"database": { "status": "ok" },
"paperless": { "status": "ok" }
}
}
}
If any check fails, returns HTTP 503 with NOT_READY.
GET /api/v1/metrics¶
Returns application metrics in Prometheus exposition format. Requires OTEL_PROMETHEUS_ENABLED=true.
Returns 404 with a text message when the feature is not enabled.
Content-Type: text/plain; version=0.0.4; charset=utf-8
See Configuration > Prometheus Scrape Endpoint for setup details.
Dashboard¶
GET /api/v1/dashboard¶
Returns summary cards and top correspondents.
Sync and Analysis¶
POST /api/v1/sync¶
Starts sync job.
Optional JSON body:
Response (202):
GET /api/v1/sync/status¶
Returns last sync metadata plus active sync job status.
POST /api/v1/analysis¶
Starts analysis job.
Optional JSON body:
Response (202):
GET /api/v1/analysis/status¶
Returns last analysis metadata plus active analysis job status.
Jobs¶
GET /api/v1/jobs¶
List jobs.
Query params:
type:sync | analysis | batch_operationstatus:pending | running | completed | failed | cancelledlimit:1..200
GET /api/v1/jobs/:jobId¶
Get one job.
Job object fields:
{
"id": "...",
"type": "sync",
"status": "running",
"progress": 0.42,
"progressMessage": "...",
"startedAt": "...",
"completedAt": null,
"errorMessage": null,
"resultJson": null,
"createdAt": "..."
}
progress is 0..1.
GET /api/v1/jobs/:jobId/progress¶
SSE stream.
Events:
progresscomplete
Event data shape:
POST /api/v1/jobs/:jobId/cancel¶
Cancels a pending/running job.
Configuration¶
GET /api/v1/config¶
Returns app config key/value map from DB.
PUT /api/v1/config¶
Accepts either:
or:
Returns full config map.
POST /api/v1/config/test-connection¶
Tests Paperless connectivity.
- If body includes
url, validates explicit body config. - Otherwise uses current runtime env config.
Body shape when explicit:
or
Success:
GET /api/v1/config/dedup¶
Returns effective dedup config.
PUT /api/v1/config/dedup¶
Updates dedup config (partial).
- Requires
Content-Type: application/json confidenceWeightJaccard+confidenceWeightFuzzymust sum to 100discriminativePenaltyStrengthis independent (0..100)
If weight or penalty keys change, meta.recalculatedGroups is included. If any LSH/threshold parameters change in a way that makes the current analysis stale, meta.analysisStale is included.
Legacy field name confidenceWeightDiscriminative is accepted and automatically converted to discriminativePenaltyStrength.
{
"data": {
"numPermutations": 256,
"numBands": 32,
"ngramSize": 3,
"minWords": 20,
"similarityThreshold": 0.75,
"confidenceWeightJaccard": 60,
"confidenceWeightFuzzy": 40,
"discriminativePenaltyStrength": 70,
"fuzzySampleSize": 10000,
"autoAnalyze": true
},
"meta": {
"recalculatedGroups": 42
}
}
Documents¶
GET /api/v1/documents¶
Lists documents.
Query params:
limit,offsetcorrespondentdocumentTypetagprocessingStatus(pending | completed)search(title match)noAiResult(trueto filter documents without an AI result)
GET /api/v1/documents/:id¶
Returns one document, plus content and group memberships.
GET /api/v1/documents/:id/content¶
Returns the text content for a single document.
Returns 404 NOT_FOUND if the document content is not available.
GET /api/v1/documents/stats¶
Returns aggregate document analytics.
Duplicates¶
GET /api/v1/duplicates¶
Lists duplicate groups.
Query params:
limit,offsetminConfidence,maxConfidence(0..1)status(comma-separated:pending,false_positive,ignored,deleted)sortBy(confidence | created_at | member_count)sortOrder(asc | desc)
GET /api/v1/duplicates/:id¶
Returns group details.
Optional query:
light=trueto omit full member text content.
DELETE /api/v1/duplicates/:id¶
Deletes group record and memberships (does not delete documents in Paperless).
GET /api/v1/duplicates/:id/content¶
Returns text content for two members in a group.
Required query params:
docAdocB
PUT /api/v1/duplicates/:id/status¶
Request:
Valid statuses: pending, false_positive, ignored, deleted.
PUT /api/v1/duplicates/:id/primary¶
Request:
GET /api/v1/duplicates/stats¶
Returns totals by status, confidence buckets, and top correspondents.
GET /api/v1/duplicates/graph¶
Returns graph nodes/edges for visualization.
Query params:
minConfidence,maxConfidencestatus(comma-separated)maxGroups(1..500, default100)
AI Processing¶
All AI endpoints return 400 BAD_REQUEST when AI_ENABLED is false.
GET /api/v1/ai/config¶
Returns the current AI configuration object.
{
"data": {
"provider": "openai",
"model": "gpt-5.4-mini",
"maxContentLength": 8000,
"batchSize": 10,
"rateDelayMs": 500,
"autoProcess": false,
"includeCorrespondents": false,
"includeDocumentTypes": false,
"includeTags": false,
"reasoningEffort": "low",
"maxRetries": 3
}
}
PUT /api/v1/ai/config¶
Partial update of AI config. Validated against the config schema.
Returns the full updated config.
GET /api/v1/ai/models¶
Returns available models for a provider.
Query params:
provider:openai(required)
{
"data": [
{ "id": "gpt-5.4", "name": "GPT-5.4" },
{ "id": "gpt-5.4-mini", "name": "GPT-5.4 Mini" },
{ "id": "gpt-5.4-nano", "name": "GPT-5.4 Nano" }
]
}
POST /api/v1/ai/process¶
Starts an AI processing batch job.
Optional JSON body:
reprocess: iftrue, re-processes all documents (not just new ones)documentIds: process only specific documents
Response (202):
Returns 409 JOB_ALREADY_RUNNING if a processing job is already active.
GET /api/v1/ai/results¶
Lists AI processing results.
Query params:
status:pending,applied,rejected,partialsearch: title substring matchfailed:trueto show only failed resultsminConfidence,maxConfidence: filter by confidence range (0..1)provider: filter by AI provider namemodel: filter by AI model namelimit,offset
{
"data": [
{
"id": "...",
"documentTitle": "Invoice 2024-001",
"suggestedCorrespondent": "Amazon",
"suggestedDocumentType": "Invoice",
"suggestedTags": ["shopping", "2024"],
"confidence": { "correspondent": 0.95, "documentType": 0.90, "tags": 0.80 },
"currentCorrespondent": null,
"appliedStatus": "pending"
}
],
"meta": { "total": 42, "limit": 20, "offset": 0 }
}
GET /api/v1/ai/results/:id¶
Returns full details for a single result, including token counts and processing time.
POST /api/v1/ai/results/:id/apply¶
Applies AI suggestions to the document in Paperless-NGX.
Optional JSON body:
fields: if omitted, all three fields (correspondent,documentType,tags) are applied. Partial field lists result inpartialstatus.allowClearing: whentrue, allows setting fields tonull. Defaultfalse.createMissingEntities: whentrue, missing correspondents, document types, and tags are created automatically in Paperless-NGX. Defaulttrue.
POST /api/v1/ai/results/:id/reject¶
Marks a result as rejected.
Optional JSON body:
reason: optional string explaining why the result was rejected.
POST /api/v1/ai/results/batch-apply¶
Applies multiple results as a background job.
{
"resultIds": ["id-1", "id-2"],
"fields": ["correspondent", "documentType", "tags"],
"allowClearing": false,
"createMissingEntities": true
}
Alternatively, use a scope object instead of resultIds:
{
"scope": { "type": "current_filter", "filters": { "status": "pending" } },
"fields": ["correspondent", "documentType", "tags"]
}
Response (202):
Returns 409 JOB_ALREADY_RUNNING if an apply job is already active.
POST /api/v1/ai/results/batch-reject¶
Rejects multiple results.
Response:
POST /api/v1/ai/results/reject-all¶
Rejects all pending AI results in one call. No request body required.
Returns 0 if there are no pending results.
POST /api/v1/ai/results/apply-all¶
Applies all pending AI results as a background job.
Optional JSON body:
{
"scope": { "type": "all_pending" },
"fields": ["correspondent", "documentType", "tags"],
"allowClearing": false,
"createMissingEntities": true
}
scope: defaults to{ "type": "all_pending" }. Can also use{ "type": "current_filter", "filters": {...} }.fields: defaults to all three fields if omitted.
Response (202):
Returns 409 JOB_ALREADY_RUNNING if an apply job is already active.
POST /api/v1/ai/results/preflight¶
Pre-validates what would happen if results were applied, without making changes.
Request:
{
"scope": { "type": "all_pending" },
"fields": ["correspondent", "documentType", "tags"],
"allowClearing": false,
"createMissingEntities": true
}
scopeis required.fields: defaults to all three.createMissingEntities: defaulttrue.
Returns a summary of entities that would be created, documents that would be modified, and any conflicts.
GET /api/v1/ai/results/groups¶
Groups AI results by a specified field.
Required query param:
groupBy:suggestedCorrespondent,suggestedDocumentType,failureType, orconfidenceBand
Optional filter query params:
status,search,failed(true),minConfidence,maxConfidence,provider,model,changedOnly(true)
Returns grouped result counts and summaries.
POST /api/v1/ai/results/:id/revert¶
Reverts a previously applied AI result, restoring the original field values in Paperless-NGX using the pre-apply snapshot.
No request body required.
Error conditions:
404 NOT_FOUNDif the result does not exist400 BAD_REQUESTif the result was not applied or has no pre-apply snapshot
POST /api/v1/ai/results/:id/feedback¶
Records user feedback on an AI result.
Request:
{
"action": "rejected",
"rejectedFields": ["correspondent"],
"corrections": { "correspondent": "Acme Corp" },
"reason": "Wrong company identified"
}
action(required):rejected,corrected, orpartial_appliedrejectedFields: array of field names that were incorrectcorrections: object mapping field names to corrected valuesreason: optional free-text explanation
Response (201):
GET /api/v1/ai/feedback/summary¶
Returns an aggregate summary of all recorded AI feedback.
GET /api/v1/ai/costs¶
Returns AI cost statistics.
Optional query param:
days: integer, limits stats to the last N days
GET /api/v1/ai/costs/estimate¶
Estimates the cost of processing a batch of documents.
Required query param:
documentCount: positive integer
Returns 404 NOT_FOUND if no pricing data is available for the configured model.
GET /api/v1/ai/stats¶
Returns aggregate AI processing statistics.
{
"data": {
"totalProcessed": 150,
"pendingReview": 42,
"applied": 95,
"rejected": 13,
"failed": 3,
"totalPromptTokens": 450000,
"totalCompletionTokens": 25000
}
}
Document Q&A¶
All endpoints return 404 when RAG_ENABLED=false.
GET /api/v1/rag/config¶
Returns current RAG configuration.
{
"data": {
"embeddingModel": "text-embedding-3-small",
"embeddingDimensions": 1536,
"chunkSize": 400,
"chunkOverlap": 40,
"topK": 10,
"answerProvider": "openai",
"answerModel": "gpt-5.4-mini",
"systemPrompt": "You are a helpful document assistant...",
"maxContextTokens": 4000,
"autoIndex": false
}
}
PUT /api/v1/rag/config¶
Updates RAG configuration. Accepts a partial object — only included fields are updated.
POST /api/v1/rag/ask¶
Streams a Q&A response. Send a question and optionally a conversation ID for multi-turn.
Request:
Response: Server-Sent Events stream (Vercel AI SDK data stream format). Custom headers:
| Header | Description |
|---|---|
X-Conversation-Id |
ID of the conversation (created if not provided) |
X-Sources |
JSON array of source citations |
Each source citation:
{
"documentId": "doc_abc",
"title": "British Gas Energy Bill Q1 2024",
"chunkContent": "Period: 1 Jan - 31 Mar 2024. Total: £142.50...",
"score": 0.034
}
POST /api/v1/rag/index¶
Starts a background indexing job. Returns 202 with a job ID.
Request (optional):
| Field | Type | Default | Description |
|---|---|---|---|
rebuild |
boolean | false |
When true, deletes all existing chunks and re-indexes everything. Required after changing embedding model or dimensions. |
Response:
Track progress via GET /api/v1/jobs/:jobId/progress (SSE).
GET /api/v1/rag/stats¶
Returns index and conversation statistics.
{
"data": {
"totalChunks": 12500,
"indexedDocuments": 2400,
"unindexedDocuments": 100,
"embeddingModel": "text-embedding-3-small",
"lastIndexedAt": "2026-03-23T10:30:00.000Z",
"totalConversations": 15,
"totalMessages": 87
}
}
GET /api/v1/rag/conversations¶
Lists conversations, ordered by most recent first.
Query parameters: limit (default 20), offset (default 0).
{
"data": [
{
"id": "conv_abc",
"title": "What was my electricity bill last quarter?",
"createdAt": "2026-03-23T10:30:00.000Z",
"updatedAt": "2026-03-23T10:32:00.000Z",
"messageCount": 4
}
],
"meta": { "total": 15 }
}
GET /api/v1/rag/conversations/:id¶
Returns a single conversation with all messages.
{
"data": {
"id": "conv_abc",
"title": "What was my electricity bill last quarter?",
"createdAt": "2026-03-23T10:30:00.000Z",
"updatedAt": "2026-03-23T10:32:00.000Z",
"messages": [
{
"id": "msg_1",
"role": "user",
"content": "What was my electricity bill last quarter?",
"sourcesJson": "[{\"documentId\":\"doc_abc\",\"title\":\"...\"}]",
"tokenUsage": null,
"createdAt": "2026-03-23T10:30:00.000Z"
},
{
"id": "msg_2",
"role": "assistant",
"content": "Based on your British Gas energy bill...",
"sourcesJson": null,
"tokenUsage": 450,
"createdAt": "2026-03-23T10:30:05.000Z"
}
]
}
}
DELETE /api/v1/rag/conversations/:id¶
Deletes a conversation and all its messages. Returns 404 if not found.
Batch Operations¶
POST /api/v1/batch/status¶
Updates status for many groups.
groupIds must contain 1..1000 ids.
Response:
POST /api/v1/batch/delete-non-primary¶
Starts destructive batch delete in Paperless (background job).
Rules:
confirmmust betrue- all groups must currently be
pending
Response (202):
POST /api/v1/batch/purge-deleted¶
Purges all duplicate groups with deleted status from the database. No request body required.
Returns the count of purged groups.
Export / Import¶
GET /api/v1/export/duplicates.csv¶
Downloads CSV. Supports same filter params as GET /api/v1/duplicates.
GET /api/v1/export/config.json¶
Downloads config backup JSON.
POST /api/v1/import/config¶
Imports config backup JSON.
Response:
Paperless Proxy Endpoints¶
GET /api/v1/paperless/documents/:paperlessId/preview¶
Proxies Paperless preview stream.
GET /api/v1/paperless/documents/:paperlessId/thumb¶
Proxies Paperless thumbnail stream.
GET /api/v1/paperless/trash¶
Returns recycle bin count.
POST /api/v1/paperless/trash¶
Request:
Response: