Compare commits

...

25 Commits

Author SHA1 Message Date
shamoon
9f3946d938 Docs 2025-04-24 14:52:30 -07:00
shamoon
a3eed49638 Use password and select config fields 2025-04-24 14:14:25 -07:00
shamoon
e14f508327 Use a frontend config 2025-04-23 19:24:32 -07:00
shamoon
3cbea6cc51 Pass AI enabled to frontend 2025-04-23 12:59:35 -07:00
shamoon
5cd54f833a Basic handling of non-AI response 2025-04-23 09:25:54 -07:00
shamoon
a59fe0cb3c Cleaner auto-remove 2025-04-22 08:08:30 -07:00
shamoon
3fc9992f4d Automatically remove suggestions after add 2025-04-22 08:08:30 -07:00
shamoon
778f6c8162 Test views, caching 2025-04-22 08:08:29 -07:00
shamoon
ddd2428d9c Invalidate llm suggestion cache on doc save 2025-04-22 08:08:29 -07:00
shamoon
c872e50739 Fix 2025-04-22 08:08:29 -07:00
shamoon
b938d0aeba Backend tests 2025-04-22 08:08:28 -07:00
shamoon
dd78c5d496 Correct object retrieval 2025-04-22 08:08:28 -07:00
shamoon
131ae28794 Refactor 2025-04-22 08:08:28 -07:00
shamoon
16d7d95517 Move module 2025-04-22 08:08:27 -07:00
shamoon
53c0b6e1c7 Hook up the add buttons 2025-04-22 08:08:27 -07:00
shamoon
168ad3f9a2 Refine the suggestions dropdown ui a bit 2025-04-22 08:08:27 -07:00
shamoon
f9ffc97970 Suggestions dropdown 2025-04-22 08:08:26 -07:00
shamoon
89671eb303 Messing with a suggest button 2025-04-22 08:08:26 -07:00
shamoon
7f37832ea0 Rename config 2025-04-22 08:08:25 -07:00
shamoon
d31b4ced63 Title suggestion ui 2025-04-22 08:08:25 -07:00
shamoon
e3f0a37f91 Just start the frontend
[ci skip]
2025-04-22 08:08:25 -07:00
shamoon
8c0a61dbc6 wow llama3 is bad 2025-04-22 08:08:24 -07:00
shamoon
f3c7c95c69 Changeup logging 2025-04-22 08:08:24 -07:00
shamoon
7e3ec32580 Some logging, error handling 2025-04-22 08:08:24 -07:00
shamoon
84da2ce145 Basic start 2025-04-22 08:08:23 -07:00
39 changed files with 1487 additions and 112 deletions

View File

@@ -1700,3 +1700,48 @@ password. All of these options come from their similarly-named [Django settings]
#### [`PAPERLESS_EMAIL_USE_SSL=<bool>`](#PAPERLESS_EMAIL_USE_SSL) {#PAPERLESS_EMAIL_USE_SSL}
: Defaults to false.
## AI {#ai}
#### [`PAPERLESS_AI_ENABLED=<bool>`](#PAPERLESS_AI_ENABLED) {#PAPERLESS_AI_ENABLED}
: Enables the AI features in Paperless, including the AI-based suggestions. This must be set to true to use any of the AI features.
Defaults to false.
#### [`PAPERLESS_LLM_BACKEND=<str>`](#PAPERLESS_LLM_BACKEND) {#PAPERLESS_LLM_BACKEND}
: The AI backend to use, either "openai" or "ollama". With "ollama" the AI features run
locally on your machine; with "openai" they use the OpenAI API. This setting is required
in order to use the AI features.
Defaults to None.
!!! note
The OpenAI API is a paid service. You will need to set up an OpenAI account, you will be
charged for usage incurred by Paperless-ngx features, and your document data will (of
course) be shared with OpenAI. Paperless-ngx does not endorse the use of the OpenAI API
in any way.
Refer to the OpenAI terms of service, and use at your own risk.
#### [`PAPERLESS_LLM_MODEL=<str>`](#PAPERLESS_LLM_MODEL) {#PAPERLESS_LLM_MODEL}
: The model to use for the AI backend, e.g. "gpt-3.5-turbo", "gpt-4" or any other model
supported by the selected backend. This setting is required to use the AI features.
Defaults to None.
#### [`PAPERLESS_LLM_API_KEY=<str>`](#PAPERLESS_LLM_API_KEY) {#PAPERLESS_LLM_API_KEY}
: The API key to use for the AI backend. This is required for the OpenAI backend only.
Defaults to None.
#### [`PAPERLESS_LLM_URL=<str>`](#PAPERLESS_LLM_URL) {#PAPERLESS_LLM_URL}
: The URL of the AI backend. This is mainly relevant for the Ollama backend; if unset, the Ollama backend defaults to `http://localhost:11434` and the OpenAI backend defaults to `https://api.openai.com`.
Defaults to None.
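To illustrate how these options fit together, here is a small hypothetical sketch of the precedence and validation they imply. It is not the actual Paperless-ngx startup code; the environment variable names mirror the options documented above, everything else is illustrative:

import os

def resolve_llm_backend() -> str | None:
    # Illustrative only: mirrors the documented options, not the real startup code.
    if os.getenv("PAPERLESS_AI_ENABLED", "no").lower() not in ("yes", "true", "1"):
        return None  # AI features disabled; the classic suggestions keep working
    backend = os.getenv("PAPERLESS_LLM_BACKEND")  # "openai" or "ollama"
    model = os.getenv("PAPERLESS_LLM_MODEL")      # e.g. "gpt-3.5-turbo" or a local model
    if backend not in ("openai", "ollama") or not model:
        raise RuntimeError("PAPERLESS_LLM_BACKEND and PAPERLESS_LLM_MODEL must be set")
    if backend == "openai" and not os.getenv("PAPERLESS_LLM_API_KEY"):
        raise RuntimeError("PAPERLESS_LLM_API_KEY is required for the OpenAI backend")
    if backend == "ollama":
        # PAPERLESS_LLM_URL is optional; a local Ollama instance is assumed otherwise
        os.environ.setdefault("PAPERLESS_LLM_URL", "http://localhost:11434")
    return backend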

View File

@@ -25,11 +25,12 @@ physical documents into a searchable online archive so you can keep, well, _less
## Features
- **Organize and index** your scanned documents with tags, correspondents, types, and more.
- _Your_ data is stored locally on _your_ server and is never transmitted or shared in any way.
- _Your_ data is stored locally on _your_ server and is never transmitted or shared in any way, unless you explicitly choose to do so.
- Performs **OCR** on your documents, adding searchable and selectable text, even to documents scanned with only images.
- Utilizes the open-source Tesseract engine to recognize more than 100 languages.
- Documents are saved as PDF/A format which is designed for long term storage, alongside the unaltered originals.
- Uses machine-learning to automatically add tags, correspondents and document types to your documents.
- **New**: Paperless-ngx can now leverage AI (Large Language Models, or LLMs) for document suggestions. This is an optional feature that is disabled by default.
- Supports PDF documents, images, plain text files, Office documents (Word, Excel, Powerpoint, and LibreOffice equivalents)[^1] and more.
- Paperless stores your documents plain on disk. Filenames and folders are managed by paperless and their format can be configured freely with different configurations assigned to different documents.
- **Beautiful, modern web application** that features:

View File

@@ -260,6 +260,14 @@ Once setup, navigating to the email settings page in Paperless-ngx will allow yo
You can also submit a document using the REST API, see [POSTing documents](api.md#file-uploads)
for details.
## Document Suggestions
Paperless-ngx can suggest tags, correspondents, document types and storage paths for documents based on each document's content. This is done using a machine-learning model trained on the documents already in your database. The suggestions are shown on the document detail page, where they can be accepted or rejected.
### AI-Enhanced Suggestions
If enabled, Paperless-ngx can use a large language model (LLM) to suggest document titles, dates, tags, correspondents and document types. This feature will always be "opt-in" and does not disable the existing suggestion system. Currently, both remote (via the OpenAI API) and local (via Ollama) models are supported, see [configuration](configuration.md#ai) for details.
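For scripting or troubleshooting, the same suggestions can be fetched through the REST API. A hedged sketch using Python and httpx follows; the suggestions endpoint is the one extended by this feature, while the host, API token and document id are placeholders:

import httpx

PAPERLESS_URL = "http://localhost:8000"  # placeholder
API_TOKEN = "your-api-token"             # placeholder
DOC_ID = 123                             # placeholder document id

response = httpx.get(
    f"{PAPERLESS_URL}/api/documents/{DOC_ID}/suggestions/",
    headers={"Authorization": f"Token {API_TOKEN}"},
    timeout=30.0,
)
response.raise_for_status()
suggestions = response.json()
# With AI enabled, the payload additionally contains "suggested_tags",
# "suggested_correspondents" and "suggested_document_types" for names
# that do not exist in the database yet.
print(suggestions.get("title"), suggestions.get("suggested_tags", []))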
## Sharing documents from Paperless-ngx
Paperless-ngx supports sharing documents with other users by assigning them [permissions](#object-permissions)

View File

@@ -35,6 +35,7 @@
@case (ConfigOptionType.String) { <pngx-input-text [formControlName]="option.key" [error]="errors[option.key]"></pngx-input-text> }
@case (ConfigOptionType.JSON) { <pngx-input-text [formControlName]="option.key" [error]="errors[option.key]"></pngx-input-text> }
@case (ConfigOptionType.File) { <pngx-input-file [formControlName]="option.key" (upload)="uploadFile($event, option.key)" [error]="errors[option.key]"></pngx-input-file> }
@case (ConfigOptionType.Password) { <pngx-input-password [formControlName]="option.key" [error]="errors[option.key]"></pngx-input-password> }
}
</div>
</div>

View File

@@ -29,6 +29,7 @@ import { SettingsService } from 'src/app/services/settings.service'
import { ToastService } from 'src/app/services/toast.service'
import { FileComponent } from '../../common/input/file/file.component'
import { NumberComponent } from '../../common/input/number/number.component'
import { PasswordComponent } from '../../common/input/password/password.component'
import { SelectComponent } from '../../common/input/select/select.component'
import { SwitchComponent } from '../../common/input/switch/switch.component'
import { TextComponent } from '../../common/input/text/text.component'
@@ -46,6 +47,7 @@ import { LoadingComponentWithPermissions } from '../../loading-component/loading
TextComponent,
NumberComponent,
FileComponent,
PasswordComponent,
AsyncPipe,
NgbNavModule,
FormsModule,

View File

@@ -1,7 +1,7 @@
<div ngbDropdown #fieldDropdown="ngbDropdown" (openChange)="onOpenClose($event)" [popperOptions]="popperOptions" placement="bottom-end">
<button class="btn btn-sm btn-outline-primary" id="customFieldsDropdown" [disabled]="disabled" ngbDropdownToggle>
<div ngbDropdown #fieldDropdown="ngbDropdown" (openChange)="onOpenClose($event)" [popperOptions]="popperOptions">
<button type="button" class="btn btn-sm btn-outline-primary" id="customFieldsDropdown" [disabled]="disabled" ngbDropdownToggle>
<i-bs name="ui-radios"></i-bs>
<div class="d-none d-sm-inline">&nbsp;<ng-container i18n>Custom Fields</ng-container></div>
<div class="d-none d-lg-inline">&nbsp;<ng-container i18n>Custom Fields</ng-container></div>
</button>
<div ngbDropdownMenu aria-labelledby="customFieldsDropdown" class="shadow custom-fields-dropdown">
<div class="list-group list-group-flush" (keydown)="listKeyDown($event)">

View File

@@ -1,17 +1,24 @@
<div class="mb-3">
<label class="form-label" [for]="inputId">{{title}}</label>
<div class="input-group" [class.is-invalid]="error">
<input #inputField [type]="showReveal && textVisible ? 'text' : 'password'" class="form-control" [class.is-invalid]="error" [id]="inputId" [(ngModel)]="value" (focus)="onFocus()" (focusout)="onFocusOut()" (change)="onChange(value)" [disabled]="disabled" [autocomplete]="autocomplete">
@if (showReveal) {
<button type="button" class="btn btn-outline-secondary" (click)="toggleVisibility()" i18n-title title="Show password" [disabled]="disabled || disableRevealToggle">
<i-bs name="eye"></i-bs>
</button>
<div class="mb-3" [class.pb-3]="error">
<div class="row">
<div class="d-flex align-items-center position-relative hidden-button-container" [class.col-md-3]="horizontal">
@if (title) {
<label class="form-label" [class.mb-md-0]="horizontal" [for]="inputId">{{title}}</label>
}
</div>
<div class="position-relative" [class.col-md-9]="horizontal">
<div class="input-group" [class.is-invalid]="error">
<input #inputField [type]="showReveal && textVisible ? 'text' : 'password'" class="form-control" [class.is-invalid]="error" [id]="inputId" [(ngModel)]="value" (focus)="onFocus()" (focusout)="onFocusOut()" (change)="onChange(value)" [disabled]="disabled" [autocomplete]="autocomplete">
@if (showReveal) {
<button type="button" class="btn btn-outline-secondary" (click)="toggleVisibility()" i18n-title title="Show password" [disabled]="disabled || disableRevealToggle">
<i-bs name="eye"></i-bs>
</button>
}
</div>
<div class="invalid-feedback">
{{error}}
</div>
@if (hint) {
<small class="form-text text-muted" [innerHTML]="hint | safeHtml"></small>
}
</div>
<div class="invalid-feedback">
{{error}}
</div>
@if (hint) {
<small class="form-text text-muted" [innerHTML]="hint | safeHtml"></small>
}
</div>

View File

@@ -15,6 +15,12 @@
@if (hint) {
<small class="form-text text-muted" [innerHTML]="hint | safeHtml"></small>
}
@if (getSuggestion()?.length > 0) {
<small>
<span i18n>Suggestion:</span>&nbsp;
<a (click)="applySuggestion(s)" [routerLink]="[]">{{getSuggestion()}}</a>&nbsp;
</small>
}
<div class="invalid-feedback position-absolute top-100">
{{error}}
</div>

View File

@@ -26,10 +26,20 @@ describe('TextComponent', () => {
it('should support use of input field', () => {
expect(component.value).toBeUndefined()
// TODO: why doesn't this work?
// input.value = 'foo'
// input.dispatchEvent(new Event('change'))
// fixture.detectChanges()
// expect(component.value).toEqual('foo')
input.value = 'foo'
input.dispatchEvent(new Event('input'))
fixture.detectChanges()
expect(component.value).toBe('foo')
})
it('should support suggestion', () => {
component.value = 'foo'
component.suggestion = 'foo'
expect(component.getSuggestion()).toBe('')
component.value = 'bar'
expect(component.getSuggestion()).toBe('foo')
component.applySuggestion()
fixture.detectChanges()
expect(component.value).toBe('foo')
})
})

View File

@@ -4,6 +4,7 @@ import {
NG_VALUE_ACCESSOR,
ReactiveFormsModule,
} from '@angular/forms'
import { RouterLink } from '@angular/router'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { SafeHtmlPipe } from 'src/app/pipes/safehtml.pipe'
import { AbstractInputComponent } from '../abstract-input'
@@ -24,6 +25,7 @@ import { AbstractInputComponent } from '../abstract-input'
ReactiveFormsModule,
SafeHtmlPipe,
NgxBootstrapIconsModule,
RouterLink,
],
})
export class TextComponent extends AbstractInputComponent<string> {
@@ -33,7 +35,19 @@ export class TextComponent extends AbstractInputComponent<string> {
@Input()
placeholder: string = ''
@Input()
suggestion: string = ''
constructor() {
super()
}
getSuggestion() {
return this.value !== this.suggestion ? this.suggestion : ''
}
applySuggestion() {
this.value = this.suggestion
this.onChange(this.value)
}
}

View File

@@ -0,0 +1,49 @@
<div class="btn-group">
<button type="button" class="btn btn-sm btn-outline-primary" (click)="clickSuggest()" [disabled]="loading || (suggestions && !aiEnabled)">
@if (loading) {
<div class="spinner-border spinner-border-sm" role="status"></div>
} @else {
<i-bs width="1.2em" height="1.2em" name="stars"></i-bs>
}
<span class="d-none d-lg-inline ps-1" i18n>Suggest</span>
@if (totalSuggestions > 0) {
<span class="badge bg-primary ms-2">{{ totalSuggestions }}</span>
}
</button>
@if (aiEnabled) {
<div class="btn-group" ngbDropdown #dropdown="ngbDropdown" [popperOptions]="popperOptions">
<button type="button" class="btn btn-sm btn-outline-primary" ngbDropdownToggle [disabled]="loading || !suggestions" aria-expanded="false" aria-controls="suggestionsDropdown" aria-label="Suggestions dropdown">
<span class="visually-hidden" i18n>Show suggestions</span>
</button>
<div ngbDropdownMenu aria-labelledby="suggestionsDropdown" class="shadow suggestions-dropdown">
<div class="list-group list-group-flush small pb-0">
@if (!suggestions?.suggested_tags && !suggestions?.suggested_document_types && !suggestions?.suggested_correspondents) {
<div class="list-group-item text-muted fst-italic">
<small class="text-muted small fst-italic" i18n>No novel suggestions</small>
</div>
}
@if (suggestions?.suggested_tags?.length > 0) {
<small class="list-group-item text-uppercase text-muted small">Tags</small>
@for (tag of suggestions.suggested_tags; track tag) {
<button type="button" class="list-group-item list-group-item-action bg-light" (click)="addTag.emit(tag)" i18n>{{ tag }}</button>
}
}
@if (suggestions?.suggested_document_types?.length > 0) {
<div class="list-group-item text-uppercase text-muted small">Document Types</div>
@for (type of suggestions.suggested_document_types; track type) {
<button type="button" class="list-group-item list-group-item-action bg-light" (click)="addDocumentType.emit(type)" i18n>{{ type }}</button>
}
}
@if (suggestions?.suggested_correspondents?.length > 0) {
<div class="list-group-item text-uppercase text-muted small">Correspondents</div>
@for (correspondent of suggestions.suggested_correspondents; track correspondent) {
<button type="button" class="list-group-item list-group-item-action bg-light" (click)="addCorrespondent.emit(correspondent)" i18n>{{ correspondent }}</button>
}
}
</div>
</div>
</div>
}
</div>

View File

@@ -0,0 +1,3 @@
.suggestions-dropdown {
min-width: 250px;
}

View File

@@ -0,0 +1,51 @@
import { ComponentFixture, TestBed } from '@angular/core/testing'
import { NgbDropdownModule } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule, allIcons } from 'ngx-bootstrap-icons'
import { SuggestionsDropdownComponent } from './suggestions-dropdown.component'
describe('SuggestionsDropdownComponent', () => {
let component: SuggestionsDropdownComponent
let fixture: ComponentFixture<SuggestionsDropdownComponent>
beforeEach(() => {
TestBed.configureTestingModule({
imports: [
NgbDropdownModule,
NgxBootstrapIconsModule.pick(allIcons),
SuggestionsDropdownComponent,
],
providers: [],
})
fixture = TestBed.createComponent(SuggestionsDropdownComponent)
component = fixture.componentInstance
fixture.detectChanges()
})
it('should calculate totalSuggestions', () => {
component.suggestions = {
suggested_correspondents: ['John Doe'],
suggested_tags: ['Tag1', 'Tag2'],
suggested_document_types: ['Type1'],
}
expect(component.totalSuggestions).toBe(4)
})
it('should emit getSuggestions when clickSuggest is called and suggestions are null', () => {
jest.spyOn(component.getSuggestions, 'emit')
component.suggestions = null
component.clickSuggest()
expect(component.getSuggestions.emit).toHaveBeenCalled()
})
it('should toggle dropdown when clickSuggest is called and suggestions are not null', () => {
component.aiEnabled = true
fixture.detectChanges()
component.suggestions = {
suggested_correspondents: [],
suggested_tags: [],
suggested_document_types: [],
}
component.clickSuggest()
expect(component.dropdown.open).toBeTruthy()
})
})

View File

@@ -0,0 +1,64 @@
import {
Component,
EventEmitter,
Input,
Output,
ViewChild,
} from '@angular/core'
import { NgbDropdown, NgbDropdownModule } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { DocumentSuggestions } from 'src/app/data/document-suggestions'
import { pngxPopperOptions } from 'src/app/utils/popper-options'
@Component({
selector: 'pngx-suggestions-dropdown',
imports: [NgbDropdownModule, NgxBootstrapIconsModule],
templateUrl: './suggestions-dropdown.component.html',
styleUrl: './suggestions-dropdown.component.scss',
})
export class SuggestionsDropdownComponent {
public popperOptions = pngxPopperOptions
@ViewChild('dropdown') dropdown: NgbDropdown
@Input()
suggestions: DocumentSuggestions = null
@Input()
aiEnabled: boolean = false
@Input()
loading: boolean = false
@Input()
disabled: boolean = false
@Output()
getSuggestions: EventEmitter<SuggestionsDropdownComponent> =
new EventEmitter()
@Output()
addTag: EventEmitter<string> = new EventEmitter()
@Output()
addDocumentType: EventEmitter<string> = new EventEmitter()
@Output()
addCorrespondent: EventEmitter<string> = new EventEmitter()
public clickSuggest(): void {
if (!this.suggestions) {
this.getSuggestions.emit(this)
} else {
this.dropdown?.toggle()
}
}
get totalSuggestions(): number {
return (
this.suggestions?.suggested_correspondents?.length +
this.suggestions?.suggested_tags?.length +
this.suggestions?.suggested_document_types?.length || 0
)
}
}

View File

@@ -72,16 +72,6 @@
</div>
</div>
<pngx-custom-fields-dropdown
*pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.CustomField }"
[documentId]="documentId"
[disabled]="!userCanEdit"
[existingFields]="document?.custom_fields"
(created)="refreshCustomFields()"
(added)="addField($event)">
</pngx-custom-fields-dropdown>
<div class="ms-auto" ngbDropdown>
<button class="btn btn-sm btn-outline-primary" id="sendDropdown" ngbDropdownToggle>
<i-bs name="send"></i-bs>
@@ -102,7 +92,7 @@
</pngx-page-header>
<div class="row">
<div class="col-md-6 col-xl-4 mb-4">
<div class="col-md-6 col-xl-5 mb-4">
<form [formGroup]='documentForm' (ngSubmit)="save()">
@@ -119,6 +109,32 @@
</button>
</div>
<ng-container *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.Document }">
<div class="btn-group pb-3 ms-auto">
<pngx-suggestions-dropdown *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.Document }"
[disabled]="!userCanEdit || suggestionsLoading"
[loading]="suggestionsLoading"
[suggestions]="suggestions"
[aiEnabled]="aiEnabled"
(getSuggestions)="getSuggestions()"
(addTag)="createTag($event)"
(addDocumentType)="createDocumentType($event)"
(addCorrespondent)="createCorrespondent($event)">
</pngx-suggestions-dropdown>
</div>
<div class="btn-group pb-3 ms-2">
<pngx-custom-fields-dropdown
*pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.CustomField }"
[documentId]="documentId"
[disabled]="!userCanEdit"
[existingFields]="document?.custom_fields"
(created)="refreshCustomFields()"
(added)="addField($event)">
</pngx-custom-fields-dropdown>
</div>
</ng-container>
<ng-container *ngTemplateOutlet="saveButtons"></ng-container>
</div>
@@ -127,7 +143,7 @@
<a ngbNavLink i18n>Details</a>
<ng-template ngbNavContent>
<div>
<pngx-input-text #inputTitle i18n-title title="Title" formControlName="title" [horizontal]="true" (keyup)="titleKeyUp($event)" [error]="error?.title"></pngx-input-text>
<pngx-input-text #inputTitle i18n-title title="Title" formControlName="title" [horizontal]="true" [suggestion]="suggestions?.title" (keyup)="titleKeyUp($event)" [error]="error?.title"></pngx-input-text>
<pngx-input-number i18n-title title="Archive serial number" [error]="error?.archive_serial_number" [horizontal]="true" formControlName='archive_serial_number'></pngx-input-number>
<pngx-input-date i18n-title title="Date created" formControlName="created_date" [suggestions]="suggestions?.dates" [showFilter]="true" [horizontal]="true" (filterDocuments)="filterDocuments($event)"
[error]="error?.created_date"></pngx-input-date>
@@ -137,7 +153,7 @@
(createNew)="createDocumentType($event)" [hideAddButton]="createDisabled(DataType.DocumentType)" [suggestions]="suggestions?.document_types" *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.DocumentType }"></pngx-input-select>
<pngx-input-select [items]="storagePaths" i18n-title title="Storage path" formControlName="storage_path" [allowNull]="true" [showFilter]="true" [horizontal]="true" (filterDocuments)="filterDocuments($event, DataType.StoragePath)"
(createNew)="createStoragePath($event)" [hideAddButton]="createDisabled(DataType.StoragePath)" [suggestions]="suggestions?.storage_paths" i18n-placeholder placeholder="Default" *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.StoragePath }"></pngx-input-select>
<pngx-input-tags formControlName="tags" [suggestions]="suggestions?.tags" [showFilter]="true" [horizontal]="true" (filterDocuments)="filterDocuments($event, DataType.Tag)" [hideAddButton]="createDisabled(DataType.Tag)" *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.Tag }"></pngx-input-tags>
<pngx-input-tags #tagsInput formControlName="tags" [suggestions]="suggestions?.tags" [showFilter]="true" [horizontal]="true" (filterDocuments)="filterDocuments($event, DataType.Tag)" [hideAddButton]="createDisabled(DataType.Tag)" *pngxIfPermissions="{ action: PermissionAction.View, type: PermissionType.Tag }"></pngx-input-tags>
@for (fieldInstance of document?.custom_fields; track fieldInstance.field; let i = $index) {
<div [formGroup]="customFieldFormFields.controls[i]">
@switch (getCustomFieldFromInstance(fieldInstance)?.data_type) {
@@ -351,14 +367,14 @@
</form>
</div>
<div class="col-md-6 col-xl-8 mb-3 d-none d-md-block position-relative" #pdfPreview>
<div class="col-md-6 col-xl-7 mb-3 d-none d-md-block position-relative" #pdfPreview>
<ng-container *ngTemplateOutlet="previewContent"></ng-container>
</div>
</div>
<ng-template #saveButtons>
<div class="btn-group pb-3 ms-auto">
<div class="btn-group pb-3 ms-4">
<ng-container *pngxIfPermissions="{ action: PermissionAction.Change, type: PermissionType.Document }">
<button type="submit" class="order-3 btn btn-sm btn-primary" i18n [disabled]="!userCanEdit || networkActive || (isDirty$ | async) !== true">Save</button>
@if (hasNext()) {

View File

@@ -156,6 +156,16 @@ describe('DocumentDetailComponent', () => {
{
provide: TagService,
useValue: {
getCachedMany: (ids: number[]) =>
of(
ids.map((id) => ({
id,
name: `Tag${id}`,
is_inbox_tag: true,
color: '#ff0000',
text_color: '#000000',
}))
),
listAll: () =>
of({
count: 3,
@@ -382,8 +392,32 @@ describe('DocumentDetailComponent', () => {
currentUserCan = true
})
it('should support creating document type', () => {
it('should support creating tag, remove from suggestions', () => {
initNormally()
component.suggestions = {
suggested_tags: ['Tag1', 'NewTag12'],
}
let openModal: NgbModalRef
modalService.activeInstances.subscribe((modal) => (openModal = modal[0]))
const modalSpy = jest.spyOn(modalService, 'open')
component.createTag('NewTag12')
expect(modalSpy).toHaveBeenCalled()
openModal.componentInstance.succeeded.next({
id: 12,
name: 'NewTag12',
is_inbox_tag: true,
color: '#ff0000',
text_color: '#000000',
})
expect(component.documentForm.get('tags').value).toContain(12)
expect(component.suggestions.suggested_tags).not.toContain('NewTag12')
})
it('should support creating document type, remove from suggestions', () => {
initNormally()
component.suggestions = {
suggested_document_types: ['DocumentType1', 'NewDocType2'],
}
let openModal: NgbModalRef
modalService.activeInstances.subscribe((modal) => (openModal = modal[0]))
const modalSpy = jest.spyOn(modalService, 'open')
@@ -391,10 +425,16 @@ describe('DocumentDetailComponent', () => {
expect(modalSpy).toHaveBeenCalled()
openModal.componentInstance.succeeded.next({ id: 12, name: 'NewDocType12' })
expect(component.documentForm.get('document_type').value).toEqual(12)
expect(component.suggestions.suggested_document_types).not.toContain(
'NewDocType2'
)
})
it('should support creating correspondent', () => {
it('should support creating correspondent, remove from suggestions', () => {
initNormally()
component.suggestions = {
suggested_correspondents: ['Correspondent1', 'NewCorrrespondent12'],
}
let openModal: NgbModalRef
modalService.activeInstances.subscribe((modal) => (openModal = modal[0]))
const modalSpy = jest.spyOn(modalService, 'open')
@@ -405,6 +445,9 @@ describe('DocumentDetailComponent', () => {
name: 'NewCorrrespondent12',
})
expect(component.documentForm.get('correspondent').value).toEqual(12)
expect(component.suggestions.suggested_correspondents).not.toContain(
'NewCorrrespondent12'
)
})
it('should support creating storage path', () => {
@@ -983,7 +1026,7 @@ describe('DocumentDetailComponent', () => {
expect(component.document.custom_fields).toHaveLength(initialLength - 1)
expect(component.customFieldFormFields).toHaveLength(initialLength - 1)
expect(
fixture.debugElement.query(By.css('form')).nativeElement.textContent
fixture.debugElement.query(By.css('form ul')).nativeElement.textContent
).not.toContain('Field 1')
const updateSpy = jest.spyOn(documentService, 'update')
component.save(true)
@@ -1020,10 +1063,22 @@ describe('DocumentDetailComponent', () => {
it('should get suggestions', () => {
const suggestionsSpy = jest.spyOn(documentService, 'getSuggestions')
suggestionsSpy.mockReturnValue(of({ tags: [42, 43] }))
suggestionsSpy.mockReturnValue(
of({
tags: [42, 43],
suggested_tags: [],
suggested_document_types: [],
suggested_correspondents: [],
})
)
initNormally()
expect(suggestionsSpy).toHaveBeenCalled()
expect(component.suggestions).toEqual({ tags: [42, 43] })
expect(component.suggestions).toEqual({
tags: [42, 43],
suggested_tags: [],
suggested_document_types: [],
suggested_correspondents: [],
})
})
it('should show error if needed for get suggestions', () => {

View File

@@ -74,6 +74,7 @@ import { CustomFieldsService } from 'src/app/services/rest/custom-fields.service
import { DocumentTypeService } from 'src/app/services/rest/document-type.service'
import { DocumentService } from 'src/app/services/rest/document.service'
import { StoragePathService } from 'src/app/services/rest/storage-path.service'
import { TagService } from 'src/app/services/rest/tag.service'
import { UserService } from 'src/app/services/rest/user.service'
import { SettingsService } from 'src/app/services/settings.service'
import { ToastService } from 'src/app/services/toast.service'
@@ -89,6 +90,7 @@ import { CorrespondentEditDialogComponent } from '../common/edit-dialog/correspo
import { DocumentTypeEditDialogComponent } from '../common/edit-dialog/document-type-edit-dialog/document-type-edit-dialog.component'
import { EditDialogMode } from '../common/edit-dialog/edit-dialog.component'
import { StoragePathEditDialogComponent } from '../common/edit-dialog/storage-path-edit-dialog/storage-path-edit-dialog.component'
import { TagEditDialogComponent } from '../common/edit-dialog/tag-edit-dialog/tag-edit-dialog.component'
import { EmailDocumentDialogComponent } from '../common/email-document-dialog/email-document-dialog.component'
import { CheckComponent } from '../common/input/check/check.component'
import { DateComponent } from '../common/input/date/date.component'
@@ -102,6 +104,7 @@ import { TextComponent } from '../common/input/text/text.component'
import { UrlComponent } from '../common/input/url/url.component'
import { PageHeaderComponent } from '../common/page-header/page-header.component'
import { ShareLinksDialogComponent } from '../common/share-links-dialog/share-links-dialog.component'
import { SuggestionsDropdownComponent } from '../common/suggestions-dropdown/suggestions-dropdown.component'
import { DocumentHistoryComponent } from '../document-history/document-history.component'
import { DocumentNotesComponent } from '../document-notes/document-notes.component'
import { ComponentWithPermissions } from '../with-permissions/with-permissions.component'
@@ -158,6 +161,7 @@ export enum ZoomSetting {
NumberComponent,
MonetaryComponent,
UrlComponent,
SuggestionsDropdownComponent,
CustomDatePipe,
FileSizePipe,
IfPermissionsDirective,
@@ -179,6 +183,8 @@ export class DocumentDetailComponent
@ViewChild('inputTitle')
titleInput: TextComponent
@ViewChild('tagsInput') tagsInput: TagsComponent
expandOriginalMetadata = false
expandArchivedMetadata = false
@@ -190,6 +196,7 @@ export class DocumentDetailComponent
document: Document
metadata: DocumentMetadata
suggestions: DocumentSuggestions
suggestionsLoading: boolean = false
users: User[]
title: string
@@ -262,6 +269,7 @@ export class DocumentDetailComponent
constructor(
private documentsService: DocumentService,
private route: ActivatedRoute,
private tagService: TagService,
private correspondentService: CorrespondentService,
private documentTypeService: DocumentTypeService,
private router: Router,
@@ -291,6 +299,10 @@ export class DocumentDetailComponent
return this.settings.get(SETTINGS_KEYS.USE_NATIVE_PDF_VIEWER)
}
get aiEnabled(): boolean {
return this.settings.get(SETTINGS_KEYS.AI_ENABLED)
}
get archiveContentRenderType(): ContentRenderType {
return this.document?.archived_file_name
? this.getRenderType('application/pdf')
@@ -645,25 +657,12 @@ export class DocumentDetailComponent
PermissionType.Document
)
) {
this.documentsService
.getSuggestions(doc.id)
.pipe(
first(),
takeUntil(this.unsubscribeNotifier),
takeUntil(this.docChangeNotifier)
)
.subscribe({
next: (result) => {
this.suggestions = result
},
error: (error) => {
this.suggestions = null
this.toastService.showError(
$localize`Error retrieving suggestions.`,
error
)
},
})
this.tagService.getCachedMany(doc.tags).subscribe((tags) => {
// only show suggestions if document has inbox tags
if (tags.some((tag) => tag.is_inbox_tag)) {
this.getSuggestions()
}
})
}
this.title = this.documentTitlePipe.transform(doc.title)
const docFormValues = Object.assign({}, doc)
@@ -680,6 +679,56 @@ export class DocumentDetailComponent
return this.documentForm.get('custom_fields') as FormArray
}
getSuggestions() {
this.suggestionsLoading = true
this.documentsService
.getSuggestions(this.documentId)
.pipe(
first(),
takeUntil(this.unsubscribeNotifier),
takeUntil(this.docChangeNotifier)
)
.subscribe({
next: (result) => {
this.suggestions = result
this.suggestionsLoading = false
},
error: (error) => {
this.suggestions = null
this.suggestionsLoading = false
this.toastService.showError(
$localize`Error retrieving suggestions.`,
error
)
},
})
}
createTag(newName: string) {
var modal = this.modalService.open(TagEditDialogComponent, {
backdrop: 'static',
})
modal.componentInstance.dialogMode = EditDialogMode.CREATE
if (newName) modal.componentInstance.object = { name: newName }
modal.componentInstance.succeeded
.pipe(
switchMap((newTag) => {
return this.tagService
.listAll()
.pipe(map((tags) => ({ newTag, tags })))
})
)
.pipe(takeUntil(this.unsubscribeNotifier))
.subscribe(({ newTag, tags }) => {
this.tagsInput.tags = tags.results
this.tagsInput.addTag(newTag.id)
if (this.suggestions) {
this.suggestions.suggested_tags =
this.suggestions.suggested_tags.filter((tag) => tag !== newName)
}
})
}
createDocumentType(newName: string) {
var modal = this.modalService.open(DocumentTypeEditDialogComponent, {
backdrop: 'static',
@@ -698,6 +747,12 @@ export class DocumentDetailComponent
.subscribe(({ newDocumentType, documentTypes }) => {
this.documentTypes = documentTypes.results
this.documentForm.get('document_type').setValue(newDocumentType.id)
if (this.suggestions) {
this.suggestions.suggested_document_types =
this.suggestions.suggested_document_types.filter(
(dt) => dt !== newName
)
}
})
}
@@ -721,6 +776,12 @@ export class DocumentDetailComponent
.subscribe(({ newCorrespondent, correspondents }) => {
this.correspondents = correspondents.results
this.documentForm.get('correspondent').setValue(newCorrespondent.id)
if (this.suggestions) {
this.suggestions.suggested_correspondents =
this.suggestions.suggested_correspondents.filter(
(c) => c !== newName
)
}
})
}

View File

@@ -1,11 +1,17 @@
export interface DocumentSuggestions {
title?: string
tags?: number[]
suggested_tags?: string[]
correspondents?: number[]
suggested_correspondents?: string[]
document_types?: number[]
suggested_document_types?: string[]
storage_paths?: number[]
suggested_storage_paths?: string[]
dates?: string[] // ISO-formatted date string e.g. 2022-11-03
}

View File

@@ -44,11 +44,18 @@ export enum ConfigOptionType {
Boolean = 'boolean',
JSON = 'json',
File = 'file',
Password = 'password',
}
export const ConfigCategory = {
General: $localize`General Settings`,
OCR: $localize`OCR Settings`,
AI: $localize`AI Settings`,
}
export const LLMBackendConfig = {
OPENAI: 'openai',
OLLAMA: 'ollama',
}
export interface ConfigOption {
@@ -180,6 +187,42 @@ export const PaperlessConfigOptions: ConfigOption[] = [
config_key: 'PAPERLESS_APP_TITLE',
category: ConfigCategory.General,
},
{
key: 'ai_enabled',
title: $localize`AI Enabled`,
type: ConfigOptionType.Boolean,
config_key: 'PAPERLESS_AI_ENABLED',
category: ConfigCategory.AI,
},
{
key: 'llm_backend',
title: $localize`LLM Backend`,
type: ConfigOptionType.Select,
choices: mapToItems(LLMBackendConfig),
config_key: 'PAPERLESS_LLM_BACKEND',
category: ConfigCategory.AI,
},
{
key: 'llm_model',
title: $localize`LLM Model`,
type: ConfigOptionType.String,
config_key: 'PAPERLESS_LLM_MODEL',
category: ConfigCategory.AI,
},
{
key: 'llm_api_key',
title: $localize`LLM API Key`,
type: ConfigOptionType.Password,
config_key: 'PAPERLESS_LLM_API_KEY',
category: ConfigCategory.AI,
},
{
key: 'llm_url',
title: $localize`LLM URL`,
type: ConfigOptionType.String,
config_key: 'PAPERLESS_LLM_URL',
category: ConfigCategory.AI,
},
]
export interface PaperlessConfig extends ObjectWithId {
@@ -198,4 +241,9 @@ export interface PaperlessConfig extends ObjectWithId {
user_args: object
app_logo: string
app_title: string
ai_enabled: boolean
llm_backend: string
llm_model: string
llm_api_key: string
llm_url: string
}

View File

@@ -73,6 +73,7 @@ export const SETTINGS_KEYS = {
GMAIL_OAUTH_URL: 'gmail_oauth_url',
OUTLOOK_OAUTH_URL: 'outlook_oauth_url',
EMAIL_ENABLED: 'email_enabled',
AI_ENABLED: 'ai_enabled',
}
export const SETTINGS: UiSetting[] = [
@@ -276,4 +277,9 @@ export const SETTINGS: UiSetting[] = [
type: 'string',
default: 'page-width', // ZoomSetting from 'document-detail.component'
},
{
key: SETTINGS_KEYS.AI_ENABLED,
type: 'boolean',
default: false,
},
]

View File

@@ -118,6 +118,7 @@ import {
sliders2Vertical,
sortAlphaDown,
sortAlphaUpAlt,
stars,
tag,
tagFill,
tags,
@@ -324,6 +325,7 @@ const icons = {
sliders2Vertical,
sortAlphaDown,
sortAlphaUpAlt,
stars,
tagFill,
tag,
tags,

View File

@@ -115,6 +115,56 @@ def refresh_suggestions_cache(
cache.touch(doc_key, timeout)
def get_llm_suggestion_cache(
document_id: int,
backend: str,
) -> SuggestionCacheData | None:
doc_key = get_suggestion_cache_key(document_id)
data: SuggestionCacheData = cache.get(doc_key)
if data and data.classifier_hash == backend:
return data
return None
def set_llm_suggestions_cache(
document_id: int,
suggestions: dict,
*,
backend: str,
timeout: int = CACHE_50_MINUTES,
) -> None:
"""
Cache LLM-generated suggestions using a backend-specific identifier (e.g. 'openai:gpt-4').
"""
from documents.caching import SuggestionCacheData
doc_key = get_suggestion_cache_key(document_id)
cache.set(
doc_key,
SuggestionCacheData(
classifier_version=1000, # Unique marker for LLM-based suggestion
classifier_hash=backend,
suggestions=suggestions,
),
timeout,
)
def invalidate_llm_suggestions_cache(
document_id: int,
) -> None:
"""
Invalidate the LLM suggestions cache for a specific document.
"""
doc_key = get_suggestion_cache_key(document_id)
data: SuggestionCacheData = cache.get(doc_key)
if data:
cache.delete(doc_key)
def get_metadata_cache_key(document_id: int) -> str:
"""
Returns the basic key for a document's metadata
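Taken together, the helpers above form a small cache API for LLM suggestions. A usage sketch follows (the document id and backend string are placeholders; in the real code the backend comes from `AIConfig.llm_backend`):

from documents.caching import (
    get_llm_suggestion_cache,
    invalidate_llm_suggestions_cache,
    set_llm_suggestions_cache,
)

doc_id = 42           # placeholder document pk
backend = "openai"    # placeholder; normally AIConfig().llm_backend

# store suggestions under the backend identifier
set_llm_suggestions_cache(doc_id, {"title": "Invoice 2023-01"}, backend=backend)

# a later request with the same backend gets a cache hit
cached = get_llm_suggestion_cache(doc_id, backend=backend)
if cached is not None:
    print(cached.suggestions)  # {'title': 'Invoice 2023-01'}

# saving the document fires the post_save handler, which calls:
invalidate_llm_suggestions_cache(doc_id)
assert get_llm_suggestion_cache(doc_id, backend=backend) is None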

View File

@@ -25,6 +25,7 @@ from guardian.shortcuts import remove_perm
from documents import matching
from documents.caching import clear_document_caches
from documents.caching import invalidate_llm_suggestions_cache
from documents.file_handling import create_source_path_directory
from documents.file_handling import delete_empty_directories
from documents.file_handling import generate_unique_filename
@@ -524,6 +525,15 @@ def update_filename_and_move_files(
)
@receiver(models.signals.post_save, sender=Document)
def update_llm_suggestions_cache(sender, instance, **kwargs):
"""
Invalidate the LLM suggestions cache when a document is saved.
"""
# Invalidate the cache for the document
invalidate_llm_suggestions_cache(instance.pk)
# should be disabled in /src/documents/management/commands/document_importer.py handle
@receiver(models.signals.post_save, sender=CustomField)
def check_paths_and_prune_custom_fields(sender, instance: CustomField, **kwargs):

View File

@@ -31,29 +31,32 @@ class TestApiAppConfig(DirectoriesMixin, APITestCase):
response = self.client.get(self.ENDPOINT, format="json")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(
json.dumps(response.data[0]),
json.dumps(
{
"id": 1,
"user_args": None,
"output_type": None,
"pages": None,
"language": None,
"mode": None,
"skip_archive_file": None,
"image_dpi": None,
"unpaper_clean": None,
"deskew": None,
"rotate_pages": None,
"rotate_pages_threshold": None,
"max_image_pixels": None,
"color_conversion_strategy": None,
"app_title": None,
"app_logo": None,
},
),
self.maxDiff = None
self.assertDictEqual(
response.data[0],
{
"id": 1,
"user_args": None,
"output_type": None,
"pages": None,
"language": None,
"mode": None,
"skip_archive_file": None,
"image_dpi": None,
"unpaper_clean": None,
"deskew": None,
"rotate_pages": None,
"rotate_pages_threshold": None,
"max_image_pixels": None,
"color_conversion_strategy": None,
"app_title": None,
"app_logo": None,
"ai_enabled": False,
"llm_backend": None,
"llm_model": None,
"llm_api_key": None,
"llm_url": None,
},
)
def test_api_get_ui_settings_with_config(self):

View File

@@ -47,6 +47,7 @@ class TestApiUiSettings(DirectoriesMixin, APITestCase):
"backend_setting": "default",
},
"email_enabled": False,
"ai_enabled": False,
},
)

View File

@@ -1,6 +1,8 @@
import tempfile
from datetime import timedelta
from pathlib import Path
from unittest.mock import MagicMock
from unittest.mock import patch
from django.conf import settings
from django.contrib.auth.models import Permission
@@ -10,8 +12,15 @@ from django.test import override_settings
from django.utils import timezone
from rest_framework import status
from documents.caching import get_llm_suggestion_cache
from documents.caching import set_llm_suggestions_cache
from documents.models import Correspondent
from documents.models import Document
from documents.models import DocumentType
from documents.models import ShareLink
from documents.models import StoragePath
from documents.models import Tag
from documents.signals.handlers import update_llm_suggestions_cache
from documents.tests.utils import DirectoriesMixin
from paperless.models import ApplicationConfiguration
@@ -154,3 +163,104 @@ class TestViews(DirectoriesMixin, TestCase):
response.render()
self.assertEqual(response.request["PATH_INFO"], "/accounts/login/")
self.assertContains(response, b"Share link has expired")
class TestAISuggestions(DirectoriesMixin, TestCase):
def setUp(self):
self.user = User.objects.create_superuser(username="testuser")
self.document = Document.objects.create(
title="Test Document",
filename="test.pdf",
mime_type="application/pdf",
)
self.tag1 = Tag.objects.create(name="tag1")
self.correspondent1 = Correspondent.objects.create(name="correspondent1")
self.document_type1 = DocumentType.objects.create(name="type1")
self.path1 = StoragePath.objects.create(name="path1")
super().setUp()
@patch("documents.views.get_llm_suggestion_cache")
@patch("documents.views.refresh_suggestions_cache")
@override_settings(
AI_ENABLED=True,
LLM_BACKEND="mock_backend",
)
def test_suggestions_with_cached_llm(self, mock_refresh_cache, mock_get_cache):
mock_get_cache.return_value = MagicMock(suggestions={"tags": ["tag1", "tag2"]})
self.client.force_login(user=self.user)
response = self.client.get(f"/api/documents/{self.document.pk}/suggestions/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(response.json(), {"tags": ["tag1", "tag2"]})
mock_refresh_cache.assert_called_once_with(self.document.pk)
@patch("documents.views.get_ai_document_classification")
@override_settings(
AI_ENABLED=True,
LLM_BACKEND="mock_backend",
)
def test_suggestions_with_ai_enabled(
self,
mock_get_ai_classification,
):
mock_get_ai_classification.return_value = {
"title": "AI Title",
"tags": ["tag1", "tag2"],
"correspondents": ["correspondent1"],
"document_types": ["type1"],
"storage_paths": ["path1"],
"dates": ["2023-01-01"],
}
self.client.force_login(user=self.user)
response = self.client.get(f"/api/documents/{self.document.pk}/suggestions/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
self.assertEqual(
response.json(),
{
"title": "AI Title",
"tags": [self.tag1.pk],
"suggested_tags": ["tag2"],
"correspondents": [self.correspondent1.pk],
"suggested_correspondents": [],
"document_types": [self.document_type1.pk],
"suggested_document_types": [],
"storage_paths": [self.path1.pk],
"suggested_storage_paths": [],
"dates": ["2023-01-01"],
},
)
def test_invalidate_suggestions_cache(self):
self.client.force_login(user=self.user)
suggestions = {
"title": "AI Title",
"tags": ["tag1", "tag2"],
"correspondents": ["correspondent1"],
"document_types": ["type1"],
"storage_paths": ["path1"],
"dates": ["2023-01-01"],
}
set_llm_suggestions_cache(
self.document.pk,
suggestions,
backend="mock_backend",
)
self.assertEqual(
get_llm_suggestion_cache(
self.document.pk,
backend="mock_backend",
).suggestions,
suggestions,
)
# post_save signal triggered
update_llm_suggestions_cache(
sender=None,
instance=self.document,
)
self.assertIsNone(
get_llm_suggestion_cache(
self.document.pk,
backend="mock_backend",
),
)

View File

@@ -80,10 +80,12 @@ from documents import index
from documents.bulk_download import ArchiveOnlyStrategy
from documents.bulk_download import OriginalAndArchiveStrategy
from documents.bulk_download import OriginalsOnlyStrategy
from documents.caching import get_llm_suggestion_cache
from documents.caching import get_metadata_cache
from documents.caching import get_suggestion_cache
from documents.caching import refresh_metadata_cache
from documents.caching import refresh_suggestions_cache
from documents.caching import set_llm_suggestions_cache
from documents.caching import set_metadata_cache
from documents.caching import set_suggestions_cache
from documents.classifier import load_classifier
@@ -168,7 +170,14 @@ from documents.tasks import sanity_check
from documents.tasks import train_classifier
from documents.templating.filepath import validate_filepath_template_and_render
from paperless import version
from paperless.ai.ai_classifier import get_ai_document_classification
from paperless.ai.matching import extract_unmatched_names
from paperless.ai.matching import match_correspondents_by_name
from paperless.ai.matching import match_document_types_by_name
from paperless.ai.matching import match_storage_paths_by_name
from paperless.ai.matching import match_tags_by_name
from paperless.celery import app as celery_app
from paperless.config import AIConfig
from paperless.config import GeneralConfig
from paperless.db import GnuPG
from paperless.serialisers import GroupSerializer
@@ -730,37 +739,103 @@ class DocumentViewSet(
):
return HttpResponseForbidden("Insufficient permissions")
document_suggestions = get_suggestion_cache(doc.pk)
ai_config = AIConfig()
if document_suggestions is not None:
refresh_suggestions_cache(doc.pk)
return Response(document_suggestions.suggestions)
classifier = load_classifier()
dates = []
if settings.NUMBER_OF_SUGGESTED_DATES > 0:
gen = parse_date_generator(doc.filename, doc.content)
dates = sorted(
{i for i in itertools.islice(gen, settings.NUMBER_OF_SUGGESTED_DATES)},
if ai_config.ai_enabled:
cached_llm_suggestions = get_llm_suggestion_cache(
doc.pk,
backend=ai_config.llm_backend,
)
resp_data = {
"correspondents": [
c.id for c in match_correspondents(doc, classifier, request.user)
],
"tags": [t.id for t in match_tags(doc, classifier, request.user)],
"document_types": [
dt.id for dt in match_document_types(doc, classifier, request.user)
],
"storage_paths": [
dt.id for dt in match_storage_paths(doc, classifier, request.user)
],
"dates": [date.strftime("%Y-%m-%d") for date in dates if date is not None],
}
if cached_llm_suggestions:
refresh_suggestions_cache(doc.pk)
return Response(cached_llm_suggestions.suggestions)
# Cache the suggestions and the classifier hash for later
set_suggestions_cache(doc.pk, resp_data, classifier)
llm_suggestions = get_ai_document_classification(doc)
matched_tags = match_tags_by_name(
llm_suggestions.get("tags", []),
request.user,
)
matched_correspondents = match_correspondents_by_name(
llm_suggestions.get("correspondents", []),
request.user,
)
matched_types = match_document_types_by_name(
llm_suggestions.get("document_types", []),
request.user,
)
matched_paths = match_storage_paths_by_name(
llm_suggestions.get("storage_paths", []),
request.user,
)
resp_data = {
"title": llm_suggestions.get("title"),
"tags": [t.id for t in matched_tags],
"suggested_tags": extract_unmatched_names(
llm_suggestions.get("tags", []),
matched_tags,
),
"correspondents": [c.id for c in matched_correspondents],
"suggested_correspondents": extract_unmatched_names(
llm_suggestions.get("correspondents", []),
matched_correspondents,
),
"document_types": [d.id for d in matched_types],
"suggested_document_types": extract_unmatched_names(
llm_suggestions.get("document_types", []),
matched_types,
),
"storage_paths": [s.id for s in matched_paths],
"suggested_storage_paths": extract_unmatched_names(
llm_suggestions.get("storage_paths", []),
matched_paths,
),
"dates": llm_suggestions.get("dates", []),
}
set_llm_suggestions_cache(doc.pk, resp_data, backend=ai_config.llm_backend)
else:
document_suggestions = get_suggestion_cache(doc.pk)
if document_suggestions is not None:
refresh_suggestions_cache(doc.pk)
return Response(document_suggestions.suggestions)
classifier = load_classifier()
dates = []
if settings.NUMBER_OF_SUGGESTED_DATES > 0:
gen = parse_date_generator(doc.filename, doc.content)
dates = sorted(
{
i
for i in itertools.islice(
gen,
settings.NUMBER_OF_SUGGESTED_DATES,
)
},
)
resp_data = {
"correspondents": [
c.id for c in match_correspondents(doc, classifier, request.user)
],
"tags": [t.id for t in match_tags(doc, classifier, request.user)],
"document_types": [
dt.id for dt in match_document_types(doc, classifier, request.user)
],
"storage_paths": [
dt.id for dt in match_storage_paths(doc, classifier, request.user)
],
"dates": [
date.strftime("%Y-%m-%d") for date in dates if date is not None
],
}
# Cache the suggestions and the classifier hash for later
set_suggestions_cache(doc.pk, resp_data, classifier)
return Response(resp_data)
@@ -2149,6 +2224,10 @@ class UiSettingsView(GenericAPIView):
ui_settings["email_enabled"] = settings.EMAIL_ENABLED
ai_config = AIConfig()
ui_settings["ai_enabled"] = ai_config.ai_enabled
user_resp = {
"id": user.id,
"username": user.username,

View File

View File

@@ -0,0 +1,86 @@
import json
import logging
from documents.models import Document
from paperless.ai.client import AIClient
logger = logging.getLogger("paperless.ai.ai_classifier")
def get_ai_document_classification(document: Document) -> dict:
"""
Returns classification suggestions for a given document using an LLM.
Output schema matches the API's expected DocumentClassificationSuggestions format.
"""
filename = document.filename or ""
content = document.content or ""
prompt = f"""
You are an assistant that extracts structured information from documents.
Only respond with the JSON object as described below.
Never ask for further information, additional content or ask questions. Never include any other text.
Suggested tags and document types must be strictly based on the content of the document.
Do not change the field names or the JSON structure, only provide the values. Use double quotes and proper JSON syntax.
The JSON object must contain the following fields:
- title: A short, descriptive title
- tags: A list of simple tags like ["insurance", "medical", "receipts"]
- correspondents: A list of names or organizations mentioned in the document
- document_types: The type/category of the document (e.g. "invoice", "medical record")
- storage_paths: Suggested folder paths (e.g. "Medical/Insurance")
- dates: List up to 3 relevant dates in YYYY-MM-DD format
The format of the JSON object is as follows:
{{
"title": "xxxxx",
"tags": ["xxxx", "xxxx"],
"correspondents": ["xxxx", "xxxx"],
"document_types": ["xxxx", "xxxx"],
"storage_paths": ["xxxx", "xxxx"],
"dates": ["YYYY-MM-DD", "YYYY-MM-DD", "YYYY-MM-DD"],
}}
---
FILENAME:
{filename}
CONTENT:
{content[:8000]}
"""  # document content is trimmed to a safe size for the prompt
try:
client = AIClient()
result = client.run_llm_query(prompt)
suggestions = parse_ai_classification_response(result)
return suggestions or {}
except Exception:
logger.exception("Error during LLM classification: %s", exc_info=True)
return {}
def parse_ai_classification_response(text: str) -> dict:
"""
Parses LLM output and ensures it conforms to expected schema.
"""
try:
raw = json.loads(text)
return {
"title": raw.get("title"),
"tags": raw.get("tags", []),
"correspondents": [raw["correspondents"]]
if isinstance(raw.get("correspondents"), str)
else raw.get("correspondents", []),
"document_types": [raw["document_types"]]
if isinstance(raw.get("document_types"), str)
else raw.get("document_types", []),
"storage_paths": raw.get("storage_paths", []),
"dates": [d for d in raw.get("dates", []) if d],
}
except json.JSONDecodeError:
# fallback: try to extract JSON manually?
logger.exception(
"Failed to parse LLM classification response: %s",
text,
exc_info=True,
)
return {}

View File

@@ -0,0 +1,70 @@
import logging
import httpx
from paperless.config import AIConfig
logger = logging.getLogger("paperless.ai.client")
class AIClient:
"""
A client for interacting with an LLM backend.
"""
def __init__(self):
self.settings = AIConfig()
def run_llm_query(self, prompt: str) -> str:
logger.debug(
"Running LLM query against %s with model %s",
self.settings.llm_backend,
self.settings.llm_model,
)
match self.settings.llm_backend:
case "openai":
result = self._run_openai_query(prompt)
case "ollama":
result = self._run_ollama_query(prompt)
case _:
raise ValueError(
f"Unsupported LLM backend: {self.settings.llm_backend}",
)
logger.debug("LLM query result: %s", result)
return result
def _run_ollama_query(self, prompt: str) -> str:
url = self.settings.llm_url or "http://localhost:11434"
with httpx.Client(timeout=30.0) as client:
response = client.post(
f"{url}/api/chat",
json={
"model": self.settings.llm_model,
"messages": [{"role": "user", "content": prompt}],
"stream": False,
},
)
response.raise_for_status()
return response.json()["message"]["content"]
def _run_openai_query(self, prompt: str) -> str:
if not self.settings.llm_api_key:
raise RuntimeError("PAPERLESS_LLM_API_KEY is not set")
url = self.settings.llm_url or "https://api.openai.com"
with httpx.Client(timeout=30.0) as client:
response = client.post(
f"{url}/v1/chat/completions",
headers={
"Authorization": f"Bearer {self.settings.llm_api_key}",
"Content-Type": "application/json",
},
json={
"model": self.settings.llm_model,
"messages": [{"role": "user", "content": prompt}],
"temperature": 0.3,
},
)
response.raise_for_status()
return response.json()["choices"][0]["message"]["content"]

View File

@@ -0,0 +1,100 @@
import difflib
import logging
import re
from django.contrib.auth.models import User
from documents.models import Correspondent
from documents.models import DocumentType
from documents.models import StoragePath
from documents.models import Tag
from documents.permissions import get_objects_for_user_owner_aware
MATCH_THRESHOLD = 0.8
logger = logging.getLogger("paperless.ai.matching")
def match_tags_by_name(names: list[str], user: User) -> list[Tag]:
queryset = get_objects_for_user_owner_aware(
user,
["view_tag"],
Tag,
)
return _match_names_to_queryset(names, queryset, "name")
def match_correspondents_by_name(names: list[str], user: User) -> list[Correspondent]:
queryset = get_objects_for_user_owner_aware(
user,
["view_correspondent"],
Correspondent,
)
return _match_names_to_queryset(names, queryset, "name")
def match_document_types_by_name(names: list[str], user: User) -> list[DocumentType]:
queryset = get_objects_for_user_owner_aware(
user,
["view_documenttype"],
DocumentType,
)
return _match_names_to_queryset(names, queryset, "name")
def match_storage_paths_by_name(names: list[str], user: User) -> list[StoragePath]:
queryset = get_objects_for_user_owner_aware(
user,
["view_storagepath"],
StoragePath,
)
return _match_names_to_queryset(names, queryset, "name")
def _normalize(s: str) -> str:
s = s.lower()
s = re.sub(r"[^\w\s]", "", s) # remove punctuation
s = s.strip()
return s
def _match_names_to_queryset(names: list[str], queryset, attr: str):
results = []
objects = list(queryset)
object_names = [_normalize(getattr(obj, attr)) for obj in objects]
for name in names:
if not name:
continue
target = _normalize(name)
# First try exact match
if target in object_names:
index = object_names.index(target)
results.append(objects[index])
# Remove the matched name from the list to avoid fuzzy matching later
object_names.remove(target)
continue
# Fuzzy match fallback
matches = difflib.get_close_matches(
target,
object_names,
n=1,
cutoff=MATCH_THRESHOLD,
)
if matches:
index = object_names.index(matches[0])
results.append(objects[index])
else:
pass
return results
def extract_unmatched_names(
names: list[str],
matched_objects: list,
attr="name",
) -> list[str]:
matched_names = {getattr(obj, attr).lower() for obj in matched_objects}
return [name for name in names if name.lower() not in matched_names]
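To illustrate the matching behaviour above, exact matches win first and difflib then catches near misses at the 0.8 cutoff. A small self-contained example with invented tag names:

import difflib

existing = ["insurance", "medical", "receipts"]     # normalized names already in the database
llm_names = ["Insurance", "reciepts", "groceries"]  # names returned by the LLM

for name in llm_names:
    target = name.lower()
    if target in existing:
        print(f"{name}: exact match")
    elif (close := difflib.get_close_matches(target, existing, n=1, cutoff=0.8)):
        print(f"{name}: fuzzy match -> {close[0]}")  # "reciepts" -> "receipts"
    else:
        print(f"{name}: unmatched, surfaced as a new suggestion")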

View File

@@ -114,3 +114,25 @@ class GeneralConfig(BaseConfig):
self.app_title = app_config.app_title or None
self.app_logo = app_config.app_logo.url if app_config.app_logo else None
@dataclasses.dataclass
class AIConfig(BaseConfig):
"""
AI related settings that require global scope
"""
ai_enabled: bool = dataclasses.field(init=False)
llm_backend: str = dataclasses.field(init=False)
llm_model: str = dataclasses.field(init=False)
llm_api_key: str = dataclasses.field(init=False)
llm_url: str = dataclasses.field(init=False)
def __post_init__(self) -> None:
app_config = self._get_config_instance()
self.ai_enabled = app_config.ai_enabled or settings.AI_ENABLED
self.llm_backend = app_config.llm_backend or settings.LLM_BACKEND
self.llm_model = app_config.llm_model or settings.LLM_MODEL
self.llm_api_key = app_config.llm_api_key or settings.LLM_API_KEY
self.llm_url = app_config.llm_url or settings.LLM_URL

View File

@@ -0,0 +1,63 @@
# Generated by Django 5.1.7 on 2025-04-24 02:09
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("paperless", "0003_alter_applicationconfiguration_max_image_pixels"),
]
operations = [
migrations.AddField(
model_name="applicationconfiguration",
name="ai_enabled",
field=models.BooleanField(
default=False,
null=True,
verbose_name="Enables AI features",
),
),
migrations.AddField(
model_name="applicationconfiguration",
name="llm_api_key",
field=models.CharField(
blank=True,
max_length=128,
null=True,
verbose_name="Sets the LLM API key",
),
),
migrations.AddField(
model_name="applicationconfiguration",
name="llm_backend",
field=models.CharField(
blank=True,
choices=[("openai", "OpenAI"), ("ollama", "Ollama")],
max_length=32,
null=True,
verbose_name="Sets the LLM backend",
),
),
migrations.AddField(
model_name="applicationconfiguration",
name="llm_model",
field=models.CharField(
blank=True,
max_length=32,
null=True,
verbose_name="Sets the LLM model",
),
),
migrations.AddField(
model_name="applicationconfiguration",
name="llm_url",
field=models.CharField(
blank=True,
max_length=128,
null=True,
verbose_name="Sets the LLM URL, optional",
),
),
]

View File

@@ -74,6 +74,15 @@ class ColorConvertChoices(models.TextChoices):
    CMYK = ("CMYK", _("CMYK"))


class LLMBackend(models.TextChoices):
    """
    Matches to --llm-backend
    """

    OPENAI = ("openai", _("OpenAI"))
    OLLAMA = ("ollama", _("Ollama"))


class ApplicationConfiguration(AbstractSingletonModel):
"""
Settings which are common across more than 1 parser
@@ -184,6 +193,45 @@ class ApplicationConfiguration(AbstractSingletonModel):
upload_to="logo/",
    )

    """
    AI related settings
    """

    ai_enabled = models.BooleanField(
verbose_name=_("Enables AI features"),
null=True,
default=False,
)
llm_backend = models.CharField(
verbose_name=_("Sets the LLM backend"),
null=True,
blank=True,
max_length=32,
choices=LLMBackend.choices,
)
llm_model = models.CharField(
verbose_name=_("Sets the LLM model"),
null=True,
blank=True,
max_length=32,
)
llm_api_key = models.CharField(
verbose_name=_("Sets the LLM API key"),
null=True,
blank=True,
max_length=128,
)
llm_url = models.CharField(
verbose_name=_("Sets the LLM URL, optional"),
null=True,
blank=True,
max_length=128,
    )

    class Meta:
verbose_name = _("paperless application settings")

View File

@@ -185,6 +185,10 @@ class ProfileSerializer(serializers.ModelSerializer):
class ApplicationConfigurationSerializer(serializers.ModelSerializer):
    user_args = serializers.JSONField(binary=True, allow_null=True)

    llm_api_key = ObfuscatedPasswordField(
        required=False,
        allow_null=True,
    )

    def run_validation(self, data):
# Empty strings treated as None to avoid unexpected behavior

View File

@@ -1267,3 +1267,12 @@ OUTLOOK_OAUTH_ENABLED = bool(
and OUTLOOK_OAUTH_CLIENT_ID
and OUTLOOK_OAUTH_CLIENT_SECRET,
)

################################################################################
# AI Settings                                                                  #
################################################################################
AI_ENABLED = __get_boolean("PAPERLESS_AI_ENABLED", "NO")
LLM_BACKEND = os.getenv("PAPERLESS_LLM_BACKEND", "openai") # or "ollama"
LLM_MODEL = os.getenv("PAPERLESS_LLM_MODEL")
LLM_API_KEY = os.getenv("PAPERLESS_LLM_API_KEY")
LLM_URL = os.getenv("PAPERLESS_LLM_URL")
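
For reference, a minimal sketch of inspecting the resolved values from a Django shell (illustrative only; the setting names match the block above):

```python
# Illustrative only: inspect the env-derived AI settings once Django is loaded.
from django.conf import settings

print(settings.AI_ENABLED)   # False unless the enable flag is set to a truthy value
print(settings.LLM_BACKEND)  # "openai" unless PAPERLESS_LLM_BACKEND overrides it
print(settings.LLM_MODEL, settings.LLM_API_KEY, settings.LLM_URL)  # None when unset
```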

View File

@@ -0,0 +1,100 @@
import json
from unittest.mock import patch
import pytest
from documents.models import Document
from paperless.ai.ai_classifier import get_ai_document_classification
from paperless.ai.ai_classifier import parse_ai_classification_response
@pytest.fixture
def mock_document():
return Document(filename="test.pdf", content="This is a test document content.")
@pytest.mark.django_db
@patch("paperless.ai.client.AIClient.run_llm_query")
def test_get_ai_document_classification_success(mock_run_llm_query, mock_document):
mock_response = json.dumps(
{
"title": "Test Title",
"tags": ["test", "document"],
"correspondents": ["John Doe"],
"document_types": ["report"],
"storage_paths": ["Reports"],
"dates": ["2023-01-01"],
},
)
mock_run_llm_query.return_value = mock_response
result = get_ai_document_classification(mock_document)
assert result["title"] == "Test Title"
assert result["tags"] == ["test", "document"]
assert result["correspondents"] == ["John Doe"]
assert result["document_types"] == ["report"]
assert result["storage_paths"] == ["Reports"]
assert result["dates"] == ["2023-01-01"]
@pytest.mark.django_db
@patch("paperless.ai.client.AIClient.run_llm_query")
def test_get_ai_document_classification_failure(mock_run_llm_query, mock_document):
mock_run_llm_query.side_effect = Exception("LLM query failed")
result = get_ai_document_classification(mock_document)
assert result == {}
def test_parse_llm_classification_response_valid():
mock_response = json.dumps(
{
"title": "Test Title",
"tags": ["test", "document"],
"correspondents": ["John Doe"],
"document_types": ["report"],
"storage_paths": ["Reports"],
"dates": ["2023-01-01"],
},
)
result = parse_ai_classification_response(mock_response)
assert result["title"] == "Test Title"
assert result["tags"] == ["test", "document"]
assert result["correspondents"] == ["John Doe"]
assert result["document_types"] == ["report"]
assert result["storage_paths"] == ["Reports"]
assert result["dates"] == ["2023-01-01"]
def test_parse_llm_classification_response_invalid_json():
mock_response = "Invalid JSON"
result = parse_ai_classification_response(mock_response)
assert result == {}
def test_parse_llm_classification_response_partial_data():
mock_response = json.dumps(
{
"title": "Partial Data",
"tags": ["partial"],
"correspondents": "Jane Doe",
"document_types": "note",
"storage_paths": [],
"dates": [],
},
)
result = parse_ai_classification_response(mock_response)
assert result["title"] == "Partial Data"
assert result["tags"] == ["partial"]
assert result["correspondents"] == ["Jane Doe"]
assert result["document_types"] == ["note"]
assert result["storage_paths"] == []
assert result["dates"] == []

View File

@@ -0,0 +1,95 @@
import json
from unittest.mock import patch
import pytest
from django.conf import settings
from paperless.ai.client import AIClient
@pytest.fixture
def mock_settings():
settings.LLM_BACKEND = "openai"
settings.LLM_MODEL = "gpt-3.5-turbo"
settings.LLM_API_KEY = "test-api-key"
yield settings
@pytest.mark.django_db
@patch("paperless.ai.client.AIClient._run_openai_query")
@patch("paperless.ai.client.AIClient._run_ollama_query")
def test_run_llm_query_openai(mock_ollama_query, mock_openai_query, mock_settings):
mock_settings.LLM_BACKEND = "openai"
mock_openai_query.return_value = "OpenAI response"
client = AIClient()
result = client.run_llm_query("Test prompt")
assert result == "OpenAI response"
mock_openai_query.assert_called_once_with("Test prompt")
mock_ollama_query.assert_not_called()
@pytest.mark.django_db
@patch("paperless.ai.client.AIClient._run_openai_query")
@patch("paperless.ai.client.AIClient._run_ollama_query")
def test_run_llm_query_ollama(mock_ollama_query, mock_openai_query, mock_settings):
mock_settings.LLM_BACKEND = "ollama"
mock_ollama_query.return_value = "Ollama response"
client = AIClient()
result = client.run_llm_query("Test prompt")
assert result == "Ollama response"
mock_ollama_query.assert_called_once_with("Test prompt")
mock_openai_query.assert_not_called()
@pytest.mark.django_db
def test_run_llm_query_unsupported_backend(mock_settings):
mock_settings.LLM_BACKEND = "unsupported"
client = AIClient()
with pytest.raises(ValueError, match="Unsupported LLM backend: unsupported"):
client.run_llm_query("Test prompt")
@pytest.mark.django_db
def test_run_openai_query(httpx_mock, mock_settings):
mock_settings.LLM_BACKEND = "openai"
httpx_mock.add_response(
url="https://api.openai.com/v1/chat/completions",
json={
"choices": [{"message": {"content": "OpenAI response"}}],
},
)
client = AIClient()
result = client.run_llm_query("Test prompt")
assert result == "OpenAI response"
request = httpx_mock.get_request()
assert request.method == "POST"
assert request.headers["Authorization"] == f"Bearer {mock_settings.LLM_API_KEY}"
assert request.headers["Content-Type"] == "application/json"
assert json.loads(request.content) == {
"model": mock_settings.LLM_MODEL,
"messages": [{"role": "user", "content": "Test prompt"}],
"temperature": 0.3,
}
@pytest.mark.django_db
def test_run_ollama_query(httpx_mock, mock_settings):
mock_settings.LLM_BACKEND = "ollama"
httpx_mock.add_response(
url="http://localhost:11434/api/chat",
json={"message": {"content": "Ollama response"}},
)
client = AIClient()
result = client.run_llm_query("Test prompt")
assert result == "Ollama response"
request = httpx_mock.get_request()
assert request.method == "POST"
assert json.loads(request.content) == {
"model": mock_settings.LLM_MODEL,
"messages": [{"role": "user", "content": "Test prompt"}],
"stream": False,
}
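
These tests also pin down the `AIClient` contract: `run_llm_query` dispatches on `settings.LLM_BACKEND`, raises `ValueError` for unknown backends, and POSTs the payloads shown above to the OpenAI and Ollama chat endpoints. A condensed sketch of a client satisfying them follows; the shipped implementation may differ in details such as timeouts and how the Ollama URL is configured.

```python
# Sketch of the behavior these tests exercise; not the shipped client code.
import httpx
from django.conf import settings


class AIClient:
    def run_llm_query(self, prompt: str) -> str:
        if settings.LLM_BACKEND == "openai":
            return self._run_openai_query(prompt)
        if settings.LLM_BACKEND == "ollama":
            return self._run_ollama_query(prompt)
        raise ValueError(f"Unsupported LLM backend: {settings.LLM_BACKEND}")

    def _run_openai_query(self, prompt: str) -> str:
        response = httpx.post(
            "https://api.openai.com/v1/chat/completions",
            headers={
                "Authorization": f"Bearer {settings.LLM_API_KEY}",
                "Content-Type": "application/json",
            },
            json={
                "model": settings.LLM_MODEL,
                "messages": [{"role": "user", "content": prompt}],
                "temperature": 0.3,
            },
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]

    def _run_ollama_query(self, prompt: str) -> str:
        # Assumed fallback host; the tests use http://localhost:11434 directly.
        base_url = getattr(settings, "LLM_URL", None) or "http://localhost:11434"
        response = httpx.post(
            f"{base_url}/api/chat",
            json={
                "model": settings.LLM_MODEL,
                "messages": [{"role": "user", "content": prompt}],
                "stream": False,
            },
        )
        response.raise_for_status()
        return response.json()["message"]["content"]
```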

View File

@@ -0,0 +1,70 @@
from unittest.mock import patch
from django.test import TestCase
from documents.models import Correspondent
from documents.models import DocumentType
from documents.models import StoragePath
from documents.models import Tag
from paperless.ai.matching import extract_unmatched_names
from paperless.ai.matching import match_correspondents_by_name
from paperless.ai.matching import match_document_types_by_name
from paperless.ai.matching import match_storage_paths_by_name
from paperless.ai.matching import match_tags_by_name
class TestAIMatching(TestCase):
def setUp(self):
# Create test data for Tag
self.tag1 = Tag.objects.create(name="Test Tag 1")
self.tag2 = Tag.objects.create(name="Test Tag 2")
# Create test data for Correspondent
self.correspondent1 = Correspondent.objects.create(name="Test Correspondent 1")
self.correspondent2 = Correspondent.objects.create(name="Test Correspondent 2")
# Create test data for DocumentType
self.document_type1 = DocumentType.objects.create(name="Test Document Type 1")
self.document_type2 = DocumentType.objects.create(name="Test Document Type 2")
# Create test data for StoragePath
self.storage_path1 = StoragePath.objects.create(name="Test Storage Path 1")
self.storage_path2 = StoragePath.objects.create(name="Test Storage Path 2")
@patch("paperless.ai.matching.get_objects_for_user_owner_aware")
def test_match_tags_by_name(self, mock_get_objects):
mock_get_objects.return_value = Tag.objects.all()
names = ["Test Tag 1", "Nonexistent Tag"]
result = match_tags_by_name(names, user=None)
self.assertEqual(len(result), 1)
self.assertEqual(result[0].name, "Test Tag 1")
@patch("paperless.ai.matching.get_objects_for_user_owner_aware")
def test_match_correspondents_by_name(self, mock_get_objects):
mock_get_objects.return_value = Correspondent.objects.all()
names = ["Test Correspondent 1", "Nonexistent Correspondent"]
result = match_correspondents_by_name(names, user=None)
self.assertEqual(len(result), 1)
self.assertEqual(result[0].name, "Test Correspondent 1")
@patch("paperless.ai.matching.get_objects_for_user_owner_aware")
def test_match_document_types_by_name(self, mock_get_objects):
mock_get_objects.return_value = DocumentType.objects.all()
names = ["Test Document Type 1", "Nonexistent Document Type"]
result = match_document_types_by_name(names, user=None)
self.assertEqual(len(result), 1)
self.assertEqual(result[0].name, "Test Document Type 1")
@patch("paperless.ai.matching.get_objects_for_user_owner_aware")
def test_match_storage_paths_by_name(self, mock_get_objects):
mock_get_objects.return_value = StoragePath.objects.all()
names = ["Test Storage Path 1", "Nonexistent Storage Path"]
result = match_storage_paths_by_name(names, user=None)
self.assertEqual(len(result), 1)
self.assertEqual(result[0].name, "Test Storage Path 1")
def test_extract_unmatched_names(self):
llm_names = ["Test Tag 1", "Nonexistent Tag"]
matched_objects = [self.tag1]
unmatched_names = extract_unmatched_names(llm_names, matched_objects)
self.assertEqual(unmatched_names, ["Nonexistent Tag"])