AI Processing¶
Paperless NGX Dedupe can use large language models to classify your documents and suggest metadata -- correspondent, document type, and tags -- with per-field confidence scores. Results are stored for review before anything is changed in Paperless-NGX.
Setup¶
AI processing is disabled by default. Enable it with two environment variables:
| Variable | Required | Default | Notes |
|---|---|---|---|
AI_ENABLED |
No | false |
Master switch for all AI features |
AI_OPENAI_API_KEY |
When AI enabled | - | OpenAI API key |
The API key is required when AI_ENABLED=true.
Configuration¶
After enabling, configure processing behavior in Settings > AI Processing or via PUT /api/v1/ai/config. All settings are stored in the database and take effect immediately.
| Setting | Default | Range | Description |
|---|---|---|---|
provider |
openai |
openai |
LLM provider to use |
model |
gpt-5.4-mini |
see below | Model identifier |
promptTemplate |
built-in | string | Prompt template with placeholders |
maxContentLength |
8000 |
500--100,000 | Max characters of document text sent to the model |
batchSize |
10 |
1--100 | Maximum concurrent AI requests |
rateDelayMs |
0 |
0--60,000 | Delay (ms) between launching requests. 0 = auto-pacing |
autoProcess |
false |
boolean | Auto-start processing after sync |
processedTagName |
ai-processed |
string | Tag name added when suggestions are applied |
addProcessedTag |
false |
boolean | Whether to add the processed tag on apply |
includeCorrespondents |
false |
boolean | Send existing correspondents as reference data |
includeDocumentTypes |
false |
boolean | Send existing document types as reference data |
includeTags |
false |
boolean | Send existing tags as reference data |
flexProcessing |
true |
boolean | Use OpenAI Flex Processing for ~50% lower costs |
reasoningEffort |
low |
none, low, medium, high |
Reasoning effort level |
maxRetries |
3 |
0--10 | Retry count on transient API failures |
confidenceThresholdGlobal |
0 |
0--1 | Minimum confidence for all fields (floor) |
confidenceThresholdCorrespondent |
0 |
0--1 | Per-field override for correspondent |
confidenceThresholdDocumentType |
0 |
0--1 | Per-field override for document type |
confidenceThresholdTags |
0 |
0--1 | Per-field override for tags |
neverAutoCreateEntities |
false |
boolean | Block suggestions that would create new correspondents, document types, or tags |
neverOverwriteNonEmpty |
false |
boolean | Block suggestions that would overwrite an existing non-empty value |
tagsOnlyAutoApply |
false |
boolean | When auto-applying, only touch tags -- leave correspondent and document type for manual review |
autoApplyEnabled |
false |
boolean | Enable automatic application of high-confidence results after processing |
autoApplyRequireAllAboveThreshold |
true |
boolean | All fields must meet their confidence threshold |
autoApplyRequireNoNewEntities |
true |
boolean | Block auto-apply if new entities would be created |
autoApplyRequireNoClearing |
true |
boolean | Block auto-apply if existing values would be cleared |
autoApplyRequireOcrText |
true |
boolean | Block auto-apply for documents with no OCR text |
Available Models¶
| Model ID | Name |
|---|---|
gpt-5.4 |
GPT-5.4 |
gpt-5.4-mini |
GPT-5.4 Mini |
gpt-5.4-nano |
GPT-5.4 Nano |
How Processing Works¶
flowchart LR
A[Start Batch] --> B[Fetch Reference Data]
B --> C[For Each Document]
C --> D[Build Prompt]
D --> E[Call LLM]
E --> F[Store Result]
F --> G{More Docs?}
G -->|Yes| H[Rate Delay]
H --> C
G -->|No| I[Complete]
style A fill:#e8eaf6,stroke:#3f51b5
style B fill:#e8eaf6,stroke:#3f51b5
style C fill:#e8eaf6,stroke:#3f51b5
style D fill:#e8eaf6,stroke:#3f51b5
style E fill:#e8eaf6,stroke:#3f51b5
style F fill:#e8eaf6,stroke:#3f51b5
style G fill:#e8eaf6,stroke:#3f51b5
style H fill:#e8eaf6,stroke:#3f51b5
style I fill:#e8eaf6,stroke:#3f51b5
Processing runs as a background job with real-time progress via SSE:
-
Fetch reference data -- If the
includeCorrespondents,includeDocumentTypes, orincludeTagstoggles are enabled, the current lists are fetched from Paperless-NGX and included in the prompt so the model can match existing names. -
For each document -- The document's text is truncated to
maxContentLength(preserving the beginning and end), combined with the prompt template, and sent to the configured provider. -
Store result -- The model's structured response (suggested correspondent, document type, up to 5 tags, and per-field confidence scores) is stored in the database. If processing fails for a document, the error is stored instead.
-
Rate delay -- A configurable pause between API calls prevents rate-limit errors.
Documents without text content are skipped automatically. When re-processing, existing results are overwritten.
Reviewing Results¶
Open the AI Processing page to review suggestions. Each result shows:
- The document title
- Current vs. suggested correspondent, document type, and tags
- Per-field confidence scores (color-coded: green >= 80%, yellow >= 50%, red < 50%)
- An evidence snippet from the document
Status Lifecycle¶
Results move through these statuses:
| Status | Meaning |
|---|---|
pending_review |
Awaiting human review |
applied |
All suggested fields applied to Paperless-NGX |
partial |
Some fields applied (e.g., only correspondent and tags) |
rejected |
Dismissed by user |
reverted |
Previously applied result restored to its pre-apply state |
failed |
AI extraction failed (see error message for details) |
Applying Suggestions¶
When you apply a result:
- Each suggested name is resolved to its Paperless-NGX ID (case-insensitive match)
- If a correspondent, document type, or tag does not exist, it is created automatically in Paperless-NGX
- The document is updated via the Paperless-NGX API
- If
addProcessedTagis enabled, the configured tag is also added
You can apply all fields at once, or select specific fields for partial application. Batch apply and batch reject are supported for bulk review.
By default, applying a result will not clear existing Paperless-NGX metadata when the AI has no suggestion for a field. Pass allowClearing: true in the API to explicitly allow clearing. New entities (correspondents, document types, tags) are created automatically unless createMissingEntities: false is specified.
Reverting Applied Results¶
Applied or partially applied results can be reverted to restore the document's pre-apply state in Paperless-NGX. Revert uses the audit snapshot recorded at apply time to restore the original correspondent, document type, and tags. Results applied before audit tracking was introduced cannot be reverted.
Revert via the UI or POST /api/v1/ai/results/:id/revert.
Processing Scopes¶
When starting a batch, you choose which documents to process:
| Scope | Description |
|---|---|
new_only |
Only documents without an existing AI result (default) |
failed_only |
Re-process only documents whose previous run failed |
selected_document_ids |
Process a specific set of document IDs |
current_filter |
Process documents matching the current filter criteria |
full_reprocess |
Re-process every document, overwriting existing results |
Apply Scopes¶
When applying results in bulk, you choose which results to target:
| Scope | Description |
|---|---|
selected_result_ids |
Apply specific result IDs |
all_pending |
Apply all results in pending_review status |
current_filter |
Apply results matching the current filter criteria |
Preflight Validation¶
Before applying results in bulk, you can run a preflight check (POST /api/v1/ai/preflight) to preview the impact. The preflight report shows:
- How many fields would change per category (correspondent, document type, tags)
- Which new entities would be created
- How many results have low confidence
- How many results are no-ops (already matching)
- How many results would destructively clear existing values
- A confidence distribution breakdown (high/medium/low)
- Gate evaluation results showing how many results would auto-apply vs. be blocked, with reasons
Confidence Gates and Auto-Apply¶
Confidence Gates¶
Confidence gates control which AI results are eligible for automatic application. Each suggestion's per-field confidence score is checked against configurable thresholds:
- Global threshold (
confidenceThresholdGlobal) -- floor applied to all fields - Per-field thresholds (
confidenceThresholdCorrespondent,confidenceThresholdDocumentType,confidenceThresholdTags) -- per-field overrides; the effective threshold is the higher of global and per-field
Additional gates refine eligibility:
neverAutoCreateEntities-- blocks any suggestion that would create a new correspondent, document type, or tagneverOverwriteNonEmpty-- blocks any suggestion that would overwrite an existing non-empty value with a different onetagsOnlyAutoApply-- restricts auto-apply to tags only; correspondent and document type are left for manual review
Auto-Apply¶
When autoApplyEnabled is turned on, results that pass all configured gates are automatically applied to Paperless-NGX after processing completes. Auto-apply is maximally conservative by default -- all four safety requirements are enabled:
| Requirement | Default | Description |
|---|---|---|
autoApplyRequireAllAboveThreshold |
true |
Every field must meet its confidence threshold |
autoApplyRequireNoNewEntities |
true |
No new correspondents, types, or tags may be created |
autoApplyRequireNoClearing |
true |
Existing values may not be cleared |
autoApplyRequireOcrText |
true |
The document must have OCR text |
Auto-apply never passes allowClearing: true, so existing metadata is always preserved. Results that fail gate evaluation remain in pending_review for manual review.
Cost Tracking¶
Per-result cost estimates are computed automatically using pricing data fetched from the LiteLLM public pricing index, which is refreshed every 24 hours. The cost statistics API (GET /api/v1/ai/costs) provides:
- Total cost across all AI processing
- Cost broken down by provider and model
- Cost over time (daily aggregation)
- Token usage by provider and model
Use POST /api/v1/ai/costs/estimate to estimate the cost of a batch before running it. The estimate uses historical average token counts when available, falling back to conservative defaults.
Feedback¶
User actions on AI results are recorded as feedback for auditing and analysis:
- Rejected -- result dismissed by user, with an optional reason
- Partial applied -- some fields applied, some excluded
- Corrected -- user edited the AI's suggestion before applying
The feedback summary (GET /api/v1/ai/feedback) shows aggregate statistics including the most frequently rejected fields and common correction patterns. This data can help tune your prompt template and confidence thresholds.
Provider Implementation Details¶
OpenAI extraction uses the Responses API (responses.parse()) with Zod-based structured output. This ensures the model returns a valid JSON object matching the extraction schema. The reasoningEffort setting controls the effort parameter when supported.
Key details:
- Uses
developerrole for the system prompt - Structured output via
zodTextFormat - Handles refusal detection and incomplete response detection
- Reports cached token counts when available
- When
flexProcessingis enabled, requests are submitted to OpenAI's Flex Processing tier for ~50% lower costs (with relaxed latency SLA)
Prompt Customization¶
The built-in prompt works well for general document classification. For specialized libraries you can edit the prompt template in Settings.
The template supports these placeholders:
| Placeholder | Replaced With |
|---|---|
{{existing_correspondents}} |
Comma-separated list of existing correspondent names (when includeCorrespondents is enabled) |
{{existing_document_types}} |
Comma-separated list of existing document type names (when includeDocumentTypes is enabled) |
{{existing_tags}} |
Comma-separated list of existing tag names (when includeTags is enabled) |
The document title and text content are included automatically in the user prompt (not the system prompt template). They do not need placeholders.
The prompt uses markdown sections for structure.
Enable reference data for better matching
Turning on includeCorrespondents, includeDocumentTypes, and includeTags helps the model reuse your existing names rather than inventing new ones. This is especially useful for established libraries.
Tips¶
Best practices
- Start small -- Process a handful of documents first to verify the prompt produces good results before running a full batch.
- Tune
maxContentLength-- Lower values reduce cost; higher values give the model more context. 8,000 characters is a good default for most documents. - Use
rateDelayMs-- Provider rate limits vary by plan. Increase the delay if you hit 429 errors. reasoningEffort-- Controls the OpenAI reasoning effort parameter. Higher effort may improve accuracy at the cost of latency and tokens.- Review before applying -- AI suggestions are not always correct. The confidence scores help prioritize review, but always verify before applying to Paperless-NGX.
See Also¶
- Configuration -- environment variables and runtime settings
- API Reference -- AI REST API endpoints
- How It Works -- the deduplication pipeline