Compare commits

..

19 Commits

Author SHA1 Message Date
shamoon
914c007103 Update docs 2025-10-08 08:50:48 -07:00
shamoon
f3e749511e Refactor workflow trigger conditions to filters 2025-10-08 08:42:21 -07:00
shamoon
e715a78b63 Update test_workflows.py 2025-10-07 14:07:01 -07:00
shamoon
1b8033209a 100% frontend coverage 2025-10-07 14:04:20 -07:00
shamoon
3828d07ec6 More backend coverage 2025-10-07 13:48:06 -07:00
shamoon
9c4d09c91c More frontend coverage 2025-10-07 13:45:04 -07:00
shamoon
ea6fdc78e6 Merge branch 'dev' into feature-wf-conditions 2025-10-07 13:35:57 -07:00
shamoon
979ccf4c51 Update workflow-edit-dialog.component.ts 2025-10-07 13:35:16 -07:00
shamoon
1c75c4d94b Good sonar 2025-10-07 13:31:17 -07:00
shamoon
3ac5efd86a Ok big refactor, also some subscription handling 2025-10-07 12:59:33 -07:00
shamoon
9dcb74fda0 Some more frontend coverage 2025-10-07 12:28:37 -07:00
shamoon
e759ca58c3 Add initial query atom 2025-10-07 12:16:11 -07:00
shamoon
88fcc5f339 Support CF queries! 2025-10-07 11:53:47 -07:00
shamoon
3d9cf696a7 Frontend coverage 2025-10-07 10:53:52 -07:00
shamoon
4cf9d7d567 Fix pattern required 2025-10-07 10:47:03 -07:00
shamoon
b323c180be Lots of cleanup, looking good. Simplify 2025-10-07 10:29:42 -07:00
shamoon
0fe5ca9b60 Update workflow-edit-dialog.component.html 2025-10-07 09:59:35 -07:00
shamoon
4965480958 Add negation for other things, universal query builder 2025-10-07 09:42:00 -07:00
shamoon
1fed785c7d Initial crack 2025-10-07 09:20:37 -07:00
30 changed files with 2237 additions and 526 deletions

View File

@@ -1805,23 +1805,3 @@ password. All of these options come from their similarly-named [Django settings]
#### [`PAPERLESS_EMAIL_USE_SSL=<bool>`](#PAPERLESS_EMAIL_USE_SSL) {#PAPERLESS_EMAIL_USE_SSL}
: Defaults to false.
## Remote OCR
#### [`PAPERLESS_REMOTE_OCR_ENGINE=<str>`](#PAPERLESS_REMOTE_OCR_ENGINE) {#PAPERLESS_REMOTE_OCR_ENGINE}
: The remote OCR engine to use. Currently only Azure AI is supported as "azureai".
Defaults to None, which disables remote OCR.
#### [`PAPERLESS_REMOTE_OCR_API_KEY=<str>`](#PAPERLESS_REMOTE_OCR_API_KEY) {#PAPERLESS_REMOTE_OCR_API_KEY}
: The API key to use for the remote OCR engine.
Defaults to None.
#### [`PAPERLESS_REMOTE_OCR_ENDPOINT=<str>`](#PAPERLESS_REMOTE_OCR_ENDPOINT) {#PAPERLESS_REMOTE_OCR_ENDPOINT}
: The endpoint to use for the remote OCR engine. This is required for Azure AI.
Defaults to None.

View File

@@ -25,10 +25,9 @@ physical documents into a searchable online archive so you can keep, well, _less
## Features
- **Organize and index** your scanned documents with tags, correspondents, types, and more.
- _Your_ data is stored locally on _your_ server and is never transmitted or shared in any way, unless you explicitly choose to do so.
- _Your_ data is stored locally on _your_ server and is never transmitted or shared in any way.
- Performs **OCR** on your documents, adding searchable and selectable text, even to documents scanned with only images.
- Utilizes the open-source Tesseract engine to recognize more than 100 languages.
- _New!_ Supports remote OCR with Azure AI (opt-in).
- Utilizes the open-source Tesseract engine to recognize more than 100 languages.
- Documents are saved as PDF/A format which is designed for long term storage, alongside the unaltered originals.
- Uses machine-learning to automatically add tags, correspondents and document types to your documents.
- Supports PDF documents, images, plain text files, Office documents (Word, Excel, PowerPoint, and LibreOffice equivalents)[^1] and more.

View File

@@ -462,15 +462,24 @@ flowchart TD
Workflows allow you to filter by:
- Source, e.g. documents uploaded via consume folder, API (& the web UI) and mail fetch
- File name, including wildcards e.g. \*.pdf will apply to all pdfs
- File name, including wildcards e.g. \*.pdf will apply to all pdfs.
- File path, including wildcards. Note that enabling `PAPERLESS_CONSUMER_RECURSIVE` would allow, for
example, automatically assigning documents to different owners based on the upload directory.
- Mail rule. Choosing this option will force 'mail fetch' to be the workflow source.
- Content matching (`Added`, `Updated` and `Scheduled` triggers only). Filter document content using the matching settings.
- Tags (`Added`, `Updated` and `Scheduled` triggers only). Filter for documents with any of the specified tags
- Document type (`Added`, `Updated` and `Scheduled` triggers only). Filter documents with this doc type
- Correspondent (`Added`, `Updated` and `Scheduled` triggers only). Filter documents with this correspondent
- Storage path (`Added`, `Updated` and `Scheduled` triggers only). Filter documents with this storage path
There are also 'advanced' filters available for `Added`, `Updated` and `Scheduled` triggers:
- Any Tags: Filter for documents with any of the specified tags.
- All Tags: Filter for documents with all of the specified tags.
- No Tags: Filter for documents with none of the specified tags.
- Document type: Filter documents with this document type.
- Not Document types: Filter documents without any of these document types.
- Correspondent: Filter documents with this correspondent.
- Not Correspondents: Filter documents without any of these correspondents.
- Storage path: Filter documents with this storage path.
- Not Storage paths: Filter documents without any of these storage paths.
- Custom field query: Filter documents with a custom field query (the same as used for the document list filters).
### Workflow Actions
@@ -882,21 +891,6 @@ how regularly you intend to scan documents and use paperless.
performed the task associated with the document, move it to the
inbox.
## Remote OCR
!!! important
This feature is disabled by default and will always remain strictly "opt-in".
Paperless-ngx supports performing OCR on documents using remote services. At the moment, this is limited to
[Microsoft's Azure "Document Intelligence" service](https://azure.microsoft.com/en-us/products/ai-services/ai-document-intelligence).
This is of course a paid service (with a free tier) which requires an Azure account and subscription. Azure AI is not affiliated with
Paperless-ngx in any way. When enabled, Paperless-ngx will automatically send appropriate documents to Azure for OCR processing, bypassing
the local OCR engine. See the [configuration](configuration.md#PAPERLESS_REMOTE_OCR_ENGINE) options for more details.
Additionally, when using a commercial service with this feature, consider both potential costs as well as any associated file size
or page limitations (e.g. with a free tier).
## Architecture
Paperless-ngx consists of the following components:

View File

@@ -15,7 +15,6 @@ classifiers = [
# This will allow testing to not install a webserver, mysql, etc
dependencies = [
"azure-ai-documentintelligence>=1.0.2",
"babel>=2.17",
"bleach~=6.2.0",
"celery[redis]~=5.5.1",
@@ -233,7 +232,6 @@ testpaths = [
"src/paperless_tesseract/tests/",
"src/paperless_tika/tests",
"src/paperless_text/tests/",
"src/paperless_remote/tests/",
]
addopts = [
"--pythonwarnings=all",

View File

@@ -1,28 +1,36 @@
<div class="btn-group w-100" role="group" ngbDropdown #dropdown="ngbDropdown" (openChange)="onOpenChange($event)" [popperOptions]="popperOptions">
<button class="btn btn-sm btn-outline-primary" id="dropdown_toggle" ngbDropdownToggle [disabled]="disabled">
<i-bs name="{{icon}}"></i-bs>
<div class="d-none d-sm-inline">&nbsp;{{title}}</div>
@if (isActive) {
<pngx-clearable-badge [selected]="isActive" (cleared)="reset()"></pngx-clearable-badge>
}
</button>
<div class="px-3 shadow" ngbDropdownMenu attr.aria-labelledby="dropdown_{{name}}">
<div class="list-group list-group-flush">
@for (element of selectionModel.queries; track element.id; let i = $index) {
<div class="list-group-item px-0 d-flex flex-nowrap">
@switch (element.type) {
@case (CustomFieldQueryComponentType.Atom) {
<ng-container *ngTemplateOutlet="queryAtom; context: { atom: element }"></ng-container>
}
@case (CustomFieldQueryComponentType.Expression) {
<ng-container *ngTemplateOutlet="queryExpression; context: { expression: element }"></ng-container>
}
}
</div>
@if (useDropdown) {
<div class="btn-group w-100" role="group" ngbDropdown #dropdown="ngbDropdown" (openChange)="onOpenChange($event)" [popperOptions]="popperOptions">
<button class="btn btn-sm btn-outline-primary" id="dropdown_toggle" ngbDropdownToggle [disabled]="disabled">
<i-bs name="{{icon}}"></i-bs>
<div class="d-none d-sm-inline">&nbsp;{{title}}</div>
@if (isActive) {
<pngx-clearable-badge [selected]="isActive" (cleared)="reset()"></pngx-clearable-badge>
}
</button>
<div class="px-3 shadow" ngbDropdownMenu attr.aria-labelledby="dropdown_{{name}}">
<ng-container *ngTemplateOutlet="list; context: { queries: selectionModel.queries }"></ng-container>
</div>
</div>
</div>
} @else {
<ng-container *ngTemplateOutlet="list; context: { queries: selectionModel.queries }"></ng-container>
}
<ng-template #list let-queries="queries">
<div class="list-group list-group-flush">
@for (element of queries; track element.id; let i = $index) {
<div class="list-group-item px-0 d-flex flex-nowrap">
@switch (element.type) {
@case (CustomFieldQueryComponentType.Atom) {
<ng-container *ngTemplateOutlet="queryAtom; context: { atom: element }"></ng-container>
}
@case (CustomFieldQueryComponentType.Expression) {
<ng-container *ngTemplateOutlet="queryExpression; context: { expression: element }"></ng-container>
}
}
</div>
}
</div>
</ng-template>
<ng-template #comparisonValueTemplate let-atom="atom">
@if (getCustomFieldByID(atom.field)?.data_type === CustomFieldDataType.Date) {

View File

@@ -120,6 +120,12 @@ export class CustomFieldQueriesModel {
})
}
addInitialAtom() {
this.addAtom(
new CustomFieldQueryAtom([null, CustomFieldQueryOperator.Exists, 'true'])
)
}
private findElement(
queryElement: CustomFieldQueryElement,
elements: any[]
@@ -206,6 +212,9 @@ export class CustomFieldsQueryDropdownComponent extends LoadingComponentWithPerm
@Input()
applyOnClose = false
@Input()
useDropdown: boolean = true
get name(): string {
return this.title ? this.title.replace(/\s/g, '_').toLowerCase() : null
}
@@ -258,13 +267,7 @@ export class CustomFieldsQueryDropdownComponent extends LoadingComponentWithPerm
public onOpenChange(open: boolean) {
if (open) {
if (this.selectionModel.queries.length === 0) {
this.selectionModel.addAtom(
new CustomFieldQueryAtom([
null,
CustomFieldQueryOperator.Exists,
'true',
])
)
this.selectionModel.addInitialAtom()
}
if (
this.selectionModel.queries.length === 1 &&

View File

@@ -156,31 +156,97 @@
<p class="small" i18n>Trigger for documents that match <em>all</em> filters specified below.</p>
<div class="row">
<div class="col">
<pngx-input-text i18n-title title="Filter filename" formControlName="filter_filename" i18n-hint hint="Apply to documents that match this filename. Wildcards such as *.pdf or *invoice* are allowed. Case insensitive." [error]="error?.filter_filename"></pngx-input-text>
<pngx-input-text i18n-title title="Filter filename" formControlName="filter_filename" horizontal="true" i18n-hint hint="Apply to documents that match this filename. Wildcards such as *.pdf or *invoice* are allowed. Case insensitive." [error]="error?.filter_filename"></pngx-input-text>
@if (formGroup.get('type').value === WorkflowTriggerType.Consumption) {
<pngx-input-select i18n-title title="Filter sources" [items]="sourceOptions" [multiple]="true" formControlName="sources" [error]="error?.sources"></pngx-input-select>
<pngx-input-text i18n-title title="Filter path" formControlName="filter_path" i18n-hint hint="Apply to documents that match this path. Wildcards specified as * are allowed. Case-normalized.</a>" [error]="error?.filter_path"></pngx-input-text>
<pngx-input-select i18n-title title="Filter mail rule" [items]="mailRules" [allowNull]="true" formControlName="filter_mailrule" i18n-hint hint="Apply to documents consumed via this mail rule." [error]="error?.filter_mailrule"></pngx-input-select>
<pngx-input-select i18n-title title="Filter sources" [items]="sourceOptions" horizontal="true" [multiple]="true" formControlName="sources" [error]="error?.sources"></pngx-input-select>
<pngx-input-text i18n-title title="Filter path" formControlName="filter_path" horizontal="true" i18n-hint hint="Apply to documents that match this path. Wildcards specified as * are allowed. Case-normalized.</a>" [error]="error?.filter_path"></pngx-input-text>
<pngx-input-select i18n-title title="Filter mail rule" [items]="mailRules" horizontal="true" [allowNull]="true" formControlName="filter_mailrule" i18n-hint hint="Apply to documents consumed via this mail rule." [error]="error?.filter_mailrule"></pngx-input-select>
}
@if (formGroup.get('type').value === WorkflowTriggerType.DocumentAdded || formGroup.get('type').value === WorkflowTriggerType.DocumentUpdated || formGroup.get('type').value === WorkflowTriggerType.Scheduled) {
<pngx-input-select i18n-title title="Content matching algorithm" [items]="getMatchingAlgorithms()" formControlName="matching_algorithm"></pngx-input-select>
@if (patternRequired) {
<pngx-input-text i18n-title title="Content matching pattern" formControlName="match" [error]="error?.match"></pngx-input-text>
<pngx-input-select i18n-title title="Content matching algorithm" horizontal="true" [items]="getMatchingAlgorithms()" formControlName="matching_algorithm"></pngx-input-select>
@if (matchingPatternRequired(formGroup)) {
<pngx-input-text i18n-title title="Content matching pattern" horizontal="true" formControlName="match" [error]="error?.match"></pngx-input-text>
}
@if (patternRequired) {
<pngx-input-check i18n-title title="Case insensitive" formControlName="is_insensitive"></pngx-input-check>
@if (matchingPatternRequired(formGroup)) {
<pngx-input-check i18n-title title="Case insensitive" horizontal="true" formControlName="is_insensitive"></pngx-input-check>
}
}
</div>
@if (formGroup.get('type').value === WorkflowTriggerType.DocumentAdded || formGroup.get('type').value === WorkflowTriggerType.DocumentUpdated || formGroup.get('type').value === WorkflowTriggerType.Scheduled) {
<div class="col-md-6">
<pngx-input-tags [allowCreate]="false" i18n-title title="Has any of tags" formControlName="filter_has_tags"></pngx-input-tags>
<pngx-input-select i18n-title title="Has correspondent" [items]="correspondents" [allowNull]="true" formControlName="filter_has_correspondent"></pngx-input-select>
<pngx-input-select i18n-title title="Has document type" [items]="documentTypes" [allowNull]="true" formControlName="filter_has_document_type"></pngx-input-select>
<pngx-input-select i18n-title title="Has storage path" [items]="storagePaths" [allowNull]="true" formControlName="filter_has_storage_path"></pngx-input-select>
</div>
}
</div>
@if (formGroup.get('type').value === WorkflowTriggerType.DocumentAdded || formGroup.get('type').value === WorkflowTriggerType.DocumentUpdated || formGroup.get('type').value === WorkflowTriggerType.Scheduled) {
<div class="row mt-3">
<div class="col">
<div class="trigger-filters mb-3">
<div class="d-flex align-items-center">
<label class="form-label mb-0" i18n>Advanced Filters</label>
<button
type="button"
class="btn btn-sm btn-outline-primary ms-auto"
(click)="addFilter(formGroup)"
[disabled]="!canAddFilter(formGroup)"
>
<i-bs name="plus-circle"></i-bs>&nbsp;<span i18n>Add filter</span>
</button>
</div>
<ul class="mt-2 list-group filters" formArrayName="filters">
@if (getFiltersFormArray(formGroup).length === 0) {
<p class="text-muted small" i18n>No advanced workflow filters defined.</p>
}
@for (filter of getFiltersFormArray(formGroup).controls; track filter; let filterIndex = $index) {
<li [formGroupName]="filterIndex" class="list-group-item">
<div class="d-flex align-items-center gap-2">
<div class="w-25">
<pngx-input-select
i18n-title
[items]="getFilterTypeOptions(formGroup, filterIndex)"
formControlName="type"
[allowNull]="false"
></pngx-input-select>
</div>
<div class="flex-grow-1">
@if (isTagsFilter(filter.get('type').value)) {
<pngx-input-tags
[allowCreate]="false"
[title]="null"
formControlName="values"
></pngx-input-tags>
} @else if (
isCustomFieldQueryFilter(filter.get('type').value)
) {
<pngx-custom-fields-query-dropdown
[selectionModel]="getCustomFieldQueryModel(filter)"
(selectionModelChange)="onCustomFieldQuerySelectionChange(filter, $event)"
[useDropdown]="false"
></pngx-custom-fields-query-dropdown>
@if (!isCustomFieldQueryValid(filter)) {
<div class="text-danger small" i18n>
Complete the custom field query configuration.
</div>
}
} @else {
<pngx-input-select
[items]="getFilterSelectItems(filter.get('type').value)"
[allowNull]="true"
[multiple]="isSelectMultiple(filter.get('type').value)"
formControlName="values"
></pngx-input-select>
}
</div>
<button
type="button"
class="btn btn-link text-danger p-0"
(click)="removeFilter(formGroup, filterIndex)"
>
<i-bs name="trash"></i-bs><span class="ms-1" i18n>Delete</span>
</button>
</div>
</li>
}
</ul>
</div>
</div>
</div>
}
</div>
</ng-template>

View File

@@ -7,3 +7,7 @@
.accordion-button {
font-size: 1rem;
}
:host ::ng-deep .filters .paperless-input-select.mb-3 {
margin-bottom: 0 !important;
}

View File

@@ -11,8 +11,14 @@ import {
import { NgbActiveModal, NgbModule } from '@ng-bootstrap/ng-bootstrap'
import { NgSelectModule } from '@ng-select/ng-select'
import { of } from 'rxjs'
import { CustomFieldQueriesModel } from 'src/app/components/common/custom-fields-query-dropdown/custom-fields-query-dropdown.component'
import { CustomFieldDataType } from 'src/app/data/custom-field'
import { MATCHING_ALGORITHMS, MATCH_AUTO } from 'src/app/data/matching-model'
import { CustomFieldQueryLogicalOperator } from 'src/app/data/custom-field-query'
import {
MATCHING_ALGORITHMS,
MATCH_AUTO,
MATCH_NONE,
} from 'src/app/data/matching-model'
import { Workflow } from 'src/app/data/workflow'
import {
WorkflowAction,
@@ -31,6 +37,7 @@ import { DocumentTypeService } from 'src/app/services/rest/document-type.service
import { MailRuleService } from 'src/app/services/rest/mail-rule.service'
import { StoragePathService } from 'src/app/services/rest/storage-path.service'
import { SettingsService } from 'src/app/services/settings.service'
import { CustomFieldQueryExpression } from 'src/app/utils/custom-field-query-element'
import { ConfirmButtonComponent } from '../../confirm-button/confirm-button.component'
import { NumberComponent } from '../../input/number/number.component'
import { PermissionsGroupComponent } from '../../input/permissions/permissions-group/permissions-group.component'
@@ -43,6 +50,7 @@ import { EditDialogMode } from '../edit-dialog.component'
import {
DOCUMENT_SOURCE_OPTIONS,
SCHEDULE_DATE_FIELD_OPTIONS,
TriggerFilterType,
WORKFLOW_ACTION_OPTIONS,
WORKFLOW_TYPE_OPTIONS,
WorkflowEditDialogComponent,
@@ -375,6 +383,562 @@ describe('WorkflowEditDialogComponent', () => {
expect(component.objectForm.get('actions').value[0].webhook).toBeNull()
})
it('should require matching pattern when algorithm is not none', () => {
const triggerGroup = new FormGroup({
matching_algorithm: new FormControl(MATCH_AUTO),
match: new FormControl(''),
})
expect(component.matchingPatternRequired(triggerGroup)).toBe(true)
triggerGroup.get('matching_algorithm').setValue(MATCHING_ALGORITHMS[0].id)
expect(component.matchingPatternRequired(triggerGroup)).toBe(true)
triggerGroup.get('matching_algorithm').setValue(MATCH_NONE)
expect(component.matchingPatternRequired(triggerGroup)).toBe(false)
})
it('should map filter builder values into trigger filters on save', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0)
component.addFilter(triggerGroup as FormGroup)
component.addFilter(triggerGroup as FormGroup)
component.addFilter(triggerGroup as FormGroup)
const filters = component.getFiltersFormArray(triggerGroup as FormGroup)
expect(filters.length).toBe(3)
filters.at(0).get('values').setValue([1])
filters.at(1).get('values').setValue([2, 3])
filters.at(2).get('values').setValue([4])
const addFilterOfType = (type: TriggerFilterType) => {
const newFilter = component.addFilter(triggerGroup as FormGroup)
newFilter.get('type').setValue(type)
return newFilter
}
const correspondentIs = addFilterOfType(TriggerFilterType.CorrespondentIs)
correspondentIs.get('values').setValue(1)
const correspondentNot = addFilterOfType(TriggerFilterType.CorrespondentNot)
correspondentNot.get('values').setValue([1])
const documentTypeIs = addFilterOfType(TriggerFilterType.DocumentTypeIs)
documentTypeIs.get('values').setValue(1)
const documentTypeNot = addFilterOfType(TriggerFilterType.DocumentTypeNot)
documentTypeNot.get('values').setValue([1])
const storagePathIs = addFilterOfType(TriggerFilterType.StoragePathIs)
storagePathIs.get('values').setValue(1)
const storagePathNot = addFilterOfType(TriggerFilterType.StoragePathNot)
storagePathNot.get('values').setValue([1])
const customFieldFilter = addFilterOfType(
TriggerFilterType.CustomFieldQuery
)
const customFieldQuery = JSON.stringify(['AND', [[1, 'exact', 'test']]])
customFieldFilter.get('values').setValue(customFieldQuery)
const formValues = component['getFormValues']()
expect(formValues.triggers[0].filter_has_tags).toEqual([1])
expect(formValues.triggers[0].filter_has_all_tags).toEqual([2, 3])
expect(formValues.triggers[0].filter_has_not_tags).toEqual([4])
expect(formValues.triggers[0].filter_has_correspondent).toEqual(1)
expect(formValues.triggers[0].filter_has_not_correspondents).toEqual([1])
expect(formValues.triggers[0].filter_has_document_type).toEqual(1)
expect(formValues.triggers[0].filter_has_not_document_types).toEqual([1])
expect(formValues.triggers[0].filter_has_storage_path).toEqual(1)
expect(formValues.triggers[0].filter_has_not_storage_paths).toEqual([1])
expect(formValues.triggers[0].filter_custom_field_query).toEqual(
customFieldQuery
)
expect(formValues.triggers[0].filters).toBeUndefined()
})
it('should ignore empty and null filter values when mapping filters', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
const tagsFilter = component.addFilter(triggerGroup)
tagsFilter.get('type').setValue(TriggerFilterType.TagsAny)
tagsFilter.get('values').setValue([])
const correspondentFilter = component.addFilter(triggerGroup)
correspondentFilter.get('type').setValue(TriggerFilterType.CorrespondentIs)
correspondentFilter.get('values').setValue(null)
const formValues = component['getFormValues']()
expect(formValues.triggers[0].filter_has_tags).toEqual([])
expect(formValues.triggers[0].filter_has_correspondent).toBeNull()
})
it('should derive single select filters from array values', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
const addFilterOfType = (type: TriggerFilterType, value: any) => {
const filter = component.addFilter(triggerGroup)
filter.get('type').setValue(type)
filter.get('values').setValue(value)
}
addFilterOfType(TriggerFilterType.CorrespondentIs, [5])
addFilterOfType(TriggerFilterType.DocumentTypeIs, [6])
addFilterOfType(TriggerFilterType.StoragePathIs, [7])
const formValues = component['getFormValues']()
expect(formValues.triggers[0].filter_has_correspondent).toEqual(5)
expect(formValues.triggers[0].filter_has_document_type).toEqual(6)
expect(formValues.triggers[0].filter_has_storage_path).toEqual(7)
})
it('should convert multi-value filter values when aggregating filters', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
const setFilter = (type: TriggerFilterType, value: number): void => {
const filter = component.addFilter(triggerGroup) as FormGroup
filter.get('type').setValue(type)
filter.get('values').setValue(value)
}
setFilter(TriggerFilterType.TagsAll, 11)
setFilter(TriggerFilterType.TagsNone, 12)
setFilter(TriggerFilterType.CorrespondentNot, 13)
setFilter(TriggerFilterType.DocumentTypeNot, 14)
setFilter(TriggerFilterType.StoragePathNot, 15)
const formValues = component['getFormValues']()
expect(formValues.triggers[0].filter_has_all_tags).toEqual([11])
expect(formValues.triggers[0].filter_has_not_tags).toEqual([12])
expect(formValues.triggers[0].filter_has_not_correspondents).toEqual([13])
expect(formValues.triggers[0].filter_has_not_document_types).toEqual([14])
expect(formValues.triggers[0].filter_has_not_storage_paths).toEqual([15])
})
it('should reuse filter type options and update disabled state', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
component.addFilter(triggerGroup)
const optionsFirst = component.getFilterTypeOptions(triggerGroup, 0)
const optionsSecond = component.getFilterTypeOptions(triggerGroup, 0)
expect(optionsFirst).toBe(optionsSecond)
// to force disabled flag
component.addFilter(triggerGroup)
const filterArray = component.getFiltersFormArray(triggerGroup)
const firstFilter = filterArray.at(0)
firstFilter.get('type').setValue(TriggerFilterType.CorrespondentIs)
component.addFilter(triggerGroup)
const updatedFilters = component.getFiltersFormArray(triggerGroup)
const secondFilter = updatedFilters.at(1)
const options = component.getFilterTypeOptions(triggerGroup, 1)
const correspondentIsOption = options.find(
(option) => option.id === TriggerFilterType.CorrespondentIs
)
expect(correspondentIsOption.disabled).toBe(true)
firstFilter.get('type').setValue(TriggerFilterType.DocumentTypeNot)
secondFilter.get('type').setValue(TriggerFilterType.TagsAll)
const postChangeOptions = component.getFilterTypeOptions(triggerGroup, 1)
const correspondentOptionAfter = postChangeOptions.find(
(option) => option.id === TriggerFilterType.CorrespondentIs
)
expect(correspondentOptionAfter.disabled).toBe(false)
})
it('should keep multi-entry filter options enabled and allow duplicates', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
component.filterDefinitions = [
{
id: TriggerFilterType.TagsAny,
name: 'Any tags',
inputType: 'tags',
allowMultipleEntries: true,
allowMultipleValues: true,
} as any,
{
id: TriggerFilterType.CorrespondentIs,
name: 'Correspondent is',
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: false,
selectItems: 'correspondents',
} as any,
]
const firstFilter = component.addFilter(triggerGroup)
firstFilter.get('type').setValue(TriggerFilterType.TagsAny)
const secondFilter = component.addFilter(triggerGroup)
expect(secondFilter).not.toBeNull()
const options = component.getFilterTypeOptions(triggerGroup, 1)
const multiEntryOption = options.find(
(option) => option.id === TriggerFilterType.TagsAny
)
expect(multiEntryOption.disabled).toBe(false)
expect(component.canAddFilter(triggerGroup)).toBe(true)
})
it('should return null when no filter definitions remain available', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
component.filterDefinitions = [
{
id: TriggerFilterType.TagsAny,
name: 'Any tags',
inputType: 'tags',
allowMultipleEntries: false,
allowMultipleValues: true,
} as any,
{
id: TriggerFilterType.CorrespondentIs,
name: 'Correspondent is',
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: false,
selectItems: 'correspondents',
} as any,
]
const firstFilter = component.addFilter(triggerGroup)
firstFilter.get('type').setValue(TriggerFilterType.TagsAny)
const secondFilter = component.addFilter(triggerGroup)
secondFilter.get('type').setValue(TriggerFilterType.CorrespondentIs)
expect(component.canAddFilter(triggerGroup)).toBe(false)
expect(component.addFilter(triggerGroup)).toBeNull()
})
it('should skip filter definitions without handlers when building form array', () => {
const originalDefinitions = component.filterDefinitions
component.filterDefinitions = [
{
id: 999,
name: 'Unsupported',
inputType: 'text',
allowMultipleEntries: false,
allowMultipleValues: false,
} as any,
]
const trigger = {
filter_has_tags: [],
filter_has_all_tags: [],
filter_has_not_tags: [],
filter_has_not_correspondents: [],
filter_has_not_document_types: [],
filter_has_not_storage_paths: [],
filter_has_correspondent: null,
filter_has_document_type: null,
filter_has_storage_path: null,
filter_custom_field_query: null,
} as any
const filters = component['buildFiltersFormArray'](trigger)
expect(filters.length).toBe(0)
component.filterDefinitions = originalDefinitions
})
it('should return null when adding filter for unknown trigger form group', () => {
expect(component.addFilter(new FormGroup({}) as any)).toBeNull()
})
it('should ignore remove filter calls for unknown trigger form group', () => {
expect(() =>
component.removeFilter(new FormGroup({}) as any, 0)
).not.toThrow()
})
it('should teardown custom field query model when removing a custom field filter', () => {
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
component.addFilter(triggerGroup)
const filters = component.getFiltersFormArray(triggerGroup)
const filterGroup = filters.at(0) as FormGroup
filterGroup.get('type').setValue(TriggerFilterType.CustomFieldQuery)
const model = component.getCustomFieldQueryModel(filterGroup)
expect(model).toBeDefined()
expect(
component['getStoredCustomFieldQueryModel'](filterGroup as any)
).toBe(model)
component.removeFilter(triggerGroup, 0)
expect(
component['getStoredCustomFieldQueryModel'](filterGroup as any)
).toBeNull()
})
it('should return readable filter names', () => {
expect(component.getFilterName(TriggerFilterType.TagsAny)).toBe(
'Has any of these tags'
)
expect(component.getFilterName(999 as any)).toBe('')
})
it('should build filter form array from existing trigger filters', () => {
const trigger = workflow.triggers[0]
trigger.filter_has_tags = [1]
trigger.filter_has_all_tags = [2, 3]
trigger.filter_has_not_tags = [4]
trigger.filter_has_correspondent = 5 as any
trigger.filter_has_not_correspondents = [6] as any
trigger.filter_has_document_type = 7 as any
trigger.filter_has_not_document_types = [8] as any
trigger.filter_has_storage_path = 9 as any
trigger.filter_has_not_storage_paths = [10] as any
trigger.filter_custom_field_query = JSON.stringify([
'AND',
[[1, 'exact', 'value']],
]) as any
component.object = workflow
component.ngOnInit()
const triggerGroup = component.triggerFields.at(0) as FormGroup
const filters = component.getFiltersFormArray(triggerGroup)
expect(filters.length).toBe(10)
const customFieldFilter = filters.at(9) as FormGroup
expect(customFieldFilter.get('type').value).toBe(
TriggerFilterType.CustomFieldQuery
)
const model = component.getCustomFieldQueryModel(customFieldFilter)
expect(model.isValid()).toBe(true)
})
it('should expose select metadata helpers', () => {
expect(component.isSelectMultiple(TriggerFilterType.CorrespondentNot)).toBe(
true
)
expect(component.isSelectMultiple(TriggerFilterType.CorrespondentIs)).toBe(
false
)
component.correspondents = [{ id: 1, name: 'C1' } as any]
component.documentTypes = [{ id: 2, name: 'DT' } as any]
component.storagePaths = [{ id: 3, name: 'SP' } as any]
expect(
component.getFilterSelectItems(TriggerFilterType.CorrespondentIs)
).toEqual(component.correspondents)
expect(
component.getFilterSelectItems(TriggerFilterType.DocumentTypeIs)
).toEqual(component.documentTypes)
expect(
component.getFilterSelectItems(TriggerFilterType.StoragePathIs)
).toEqual(component.storagePaths)
expect(component.getFilterSelectItems(TriggerFilterType.TagsAll)).toEqual(
[]
)
expect(
component.isCustomFieldQueryFilter(TriggerFilterType.CustomFieldQuery)
).toBe(true)
})
it('should return empty select items when definition is missing', () => {
const originalDefinitions = component.filterDefinitions
component.filterDefinitions = []
expect(
component.getFilterSelectItems(TriggerFilterType.CorrespondentIs)
).toEqual([])
component.filterDefinitions = originalDefinitions
})
it('should return empty select items when definition has unknown source', () => {
const originalDefinitions = component.filterDefinitions
component.filterDefinitions = [
{
id: TriggerFilterType.CorrespondentIs,
name: 'Correspondent is',
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: false,
selectItems: 'unknown',
} as any,
]
expect(
component.getFilterSelectItems(TriggerFilterType.CorrespondentIs)
).toEqual([])
component.filterDefinitions = originalDefinitions
})
it('should handle custom field query selection change and validation states', () => {
const formGroup = new FormGroup({
values: new FormControl(null),
})
const model = new CustomFieldQueriesModel()
const changeSpy = jest.spyOn(
component as any,
'onCustomFieldQueryModelChanged'
)
component.onCustomFieldQuerySelectionChange(formGroup, model)
expect(changeSpy).toHaveBeenCalledWith(formGroup, model)
expect(component.isCustomFieldQueryValid(formGroup)).toBe(true)
component['setCustomFieldQueryModel'](formGroup as any, model as any)
const validSpy = jest.spyOn(model, 'isValid').mockReturnValue(false)
const emptySpy = jest.spyOn(model, 'isEmpty').mockReturnValue(false)
expect(component.isCustomFieldQueryValid(formGroup)).toBe(false)
expect(validSpy).toHaveBeenCalled()
validSpy.mockReturnValue(true)
emptySpy.mockReturnValue(true)
expect(component.isCustomFieldQueryValid(formGroup)).toBe(true)
emptySpy.mockReturnValue(false)
expect(component.isCustomFieldQueryValid(formGroup)).toBe(true)
component['clearCustomFieldQueryModel'](formGroup as any)
})
it('should recover from invalid custom field query json and update control on changes', () => {
const filterGroup = new FormGroup({
values: new FormControl('not-json'),
})
component['ensureCustomFieldQueryModel'](filterGroup, 'not-json')
const model = component['getStoredCustomFieldQueryModel'](
filterGroup as any
)
expect(model).toBeDefined()
expect(model.queries.length).toBeGreaterThan(0)
const valuesControl = filterGroup.get('values')
expect(valuesControl.value).toBeNull()
const expression = new CustomFieldQueryExpression([
CustomFieldQueryLogicalOperator.And,
[[1, 'exact', 'value']],
])
model.queries = [expression]
jest.spyOn(model, 'isValid').mockReturnValue(true)
jest.spyOn(model, 'isEmpty').mockReturnValue(false)
model.changed.next(model)
expect(valuesControl.value).toEqual(JSON.stringify(expression.serialize()))
component['clearCustomFieldQueryModel'](filterGroup as any)
})
it('should handle custom field query model change edge cases', () => {
const groupWithoutControl = new FormGroup({})
const dummyModel = {
isValid: jest.fn().mockReturnValue(true),
isEmpty: jest.fn().mockReturnValue(false),
}
expect(() =>
component['onCustomFieldQueryModelChanged'](
groupWithoutControl as any,
dummyModel as any
)
).not.toThrow()
const groupWithControl = new FormGroup({
values: new FormControl('initial'),
})
const emptyModel = {
isValid: jest.fn().mockReturnValue(true),
isEmpty: jest.fn().mockReturnValue(true),
}
component['onCustomFieldQueryModelChanged'](
groupWithControl as any,
emptyModel as any
)
expect(groupWithControl.get('values').value).toBeNull()
})
it('should normalize filter values for single and multi selects', () => {
expect(
component['normalizeFilterValue'](TriggerFilterType.TagsAny)
).toEqual([])
expect(
component['normalizeFilterValue'](TriggerFilterType.TagsAny, 5)
).toEqual([5])
expect(
component['normalizeFilterValue'](TriggerFilterType.TagsAny, [5, 6])
).toEqual([5, 6])
expect(
component['normalizeFilterValue'](TriggerFilterType.CorrespondentIs, [7])
).toEqual(7)
expect(
component['normalizeFilterValue'](TriggerFilterType.CorrespondentIs, 8)
).toEqual(8)
const customFieldJson = JSON.stringify(['AND', [[1, 'exact', 'test']]])
expect(
component['normalizeFilterValue'](
TriggerFilterType.CustomFieldQuery,
customFieldJson
)
).toEqual(customFieldJson)
const customFieldObject = ['AND', [[1, 'exact', 'other']]]
expect(
component['normalizeFilterValue'](
TriggerFilterType.CustomFieldQuery,
customFieldObject
)
).toEqual(JSON.stringify(customFieldObject))
expect(
component['normalizeFilterValue'](
TriggerFilterType.CustomFieldQuery,
false
)
).toBeNull()
})
it('should add and remove filter form groups', () => {
component['changeDetector'] = { detectChanges: jest.fn() } as any
component.object = undefined
component.addTrigger()
const triggerGroup = component.triggerFields.at(0) as FormGroup
component.addFilter(triggerGroup)
component.removeFilter(triggerGroup, 0)
expect(component.getFiltersFormArray(triggerGroup).length).toBe(0)
component.addFilter(triggerGroup)
const filterArrayAfterAdd = component.getFiltersFormArray(triggerGroup)
filterArrayAfterAdd.at(0).get('type').setValue(TriggerFilterType.TagsAll)
expect(component.getFiltersFormArray(triggerGroup).length).toBe(1)
})
it('should remove selected custom field from the form group', () => {
const formGroup = new FormGroup({
assign_custom_fields: new FormControl([1, 2, 3]),

View File

@@ -6,6 +6,7 @@ import {
import { NgTemplateOutlet } from '@angular/common'
import { Component, OnInit, inject } from '@angular/core'
import {
AbstractControl,
FormArray,
FormControl,
FormGroup,
@@ -14,7 +15,7 @@ import {
} from '@angular/forms'
import { NgbAccordionModule } from '@ng-bootstrap/ng-bootstrap'
import { NgxBootstrapIconsModule } from 'ngx-bootstrap-icons'
import { first } from 'rxjs'
import { Subscription, first, takeUntil } from 'rxjs'
import { Correspondent } from 'src/app/data/correspondent'
import { CustomField, CustomFieldDataType } from 'src/app/data/custom-field'
import { DocumentType } from 'src/app/data/document-type'
@@ -45,7 +46,12 @@ import { StoragePathService } from 'src/app/services/rest/storage-path.service'
import { UserService } from 'src/app/services/rest/user.service'
import { WorkflowService } from 'src/app/services/rest/workflow.service'
import { SettingsService } from 'src/app/services/settings.service'
import { CustomFieldQueryExpression } from 'src/app/utils/custom-field-query-element'
import { ConfirmButtonComponent } from '../../confirm-button/confirm-button.component'
import {
CustomFieldQueriesModel,
CustomFieldsQueryDropdownComponent,
} from '../../custom-fields-query-dropdown/custom-fields-query-dropdown.component'
import { CheckComponent } from '../../input/check/check.component'
import { CustomFieldsValuesComponent } from '../../input/custom-fields-values/custom-fields-values.component'
import { EntriesComponent } from '../../input/entries/entries.component'
@@ -135,10 +141,235 @@ export const WORKFLOW_ACTION_OPTIONS = [
},
]
export enum TriggerFilterType {
TagsAny = 'tags_any',
TagsAll = 'tags_all',
TagsNone = 'tags_none',
CorrespondentIs = 'correspondent_is',
CorrespondentNot = 'correspondent_not',
DocumentTypeIs = 'document_type_is',
DocumentTypeNot = 'document_type_not',
StoragePathIs = 'storage_path_is',
StoragePathNot = 'storage_path_not',
CustomFieldQuery = 'custom_field_query',
}
interface TriggerFilterDefinition {
id: TriggerFilterType
name: string
inputType: 'tags' | 'select' | 'customFieldQuery'
allowMultipleEntries: boolean
allowMultipleValues: boolean
selectItems?: 'correspondents' | 'documentTypes' | 'storagePaths'
disabled?: boolean
}
type TriggerFilterOption = TriggerFilterDefinition & {
disabled?: boolean
}
type TriggerFilterAggregate = {
filter_has_tags: number[]
filter_has_all_tags: number[]
filter_has_not_tags: number[]
filter_has_not_correspondents: number[]
filter_has_not_document_types: number[]
filter_has_not_storage_paths: number[]
filter_has_correspondent: number | null
filter_has_document_type: number | null
filter_has_storage_path: number | null
filter_custom_field_query: string | null
}
interface FilterHandler {
apply: (aggregate: TriggerFilterAggregate, values: any) => void
extract: (trigger: WorkflowTrigger) => any
hasValue: (value: any) => boolean
}
const CUSTOM_FIELD_QUERY_MODEL_KEY = Symbol('customFieldQueryModel')
const CUSTOM_FIELD_QUERY_SUBSCRIPTION_KEY = Symbol(
'customFieldQuerySubscription'
)
type CustomFieldFilterGroup = FormGroup & {
[CUSTOM_FIELD_QUERY_MODEL_KEY]?: CustomFieldQueriesModel
[CUSTOM_FIELD_QUERY_SUBSCRIPTION_KEY]?: Subscription
}
const TRIGGER_FILTER_DEFINITIONS: TriggerFilterDefinition[] = [
{
id: TriggerFilterType.TagsAny,
name: $localize`Has any of these tags`,
inputType: 'tags',
allowMultipleEntries: false,
allowMultipleValues: true,
},
{
id: TriggerFilterType.TagsAll,
name: $localize`Has all of these tags`,
inputType: 'tags',
allowMultipleEntries: false,
allowMultipleValues: true,
},
{
id: TriggerFilterType.TagsNone,
name: $localize`Does not have these tags`,
inputType: 'tags',
allowMultipleEntries: false,
allowMultipleValues: true,
},
{
id: TriggerFilterType.CorrespondentIs,
name: $localize`Has correspondent`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: false,
selectItems: 'correspondents',
},
{
id: TriggerFilterType.CorrespondentNot,
name: $localize`Does not have correspondents`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: true,
selectItems: 'correspondents',
},
{
id: TriggerFilterType.DocumentTypeIs,
name: $localize`Has document type`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: false,
selectItems: 'documentTypes',
},
{
id: TriggerFilterType.DocumentTypeNot,
name: $localize`Does not have document types`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: true,
selectItems: 'documentTypes',
},
{
id: TriggerFilterType.StoragePathIs,
name: $localize`Has storage path`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: false,
selectItems: 'storagePaths',
},
{
id: TriggerFilterType.StoragePathNot,
name: $localize`Does not have storage paths`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: true,
selectItems: 'storagePaths',
},
{
id: TriggerFilterType.CustomFieldQuery,
name: $localize`Matches custom field query`,
inputType: 'customFieldQuery',
allowMultipleEntries: false,
allowMultipleValues: false,
},
]
const TRIGGER_MATCHING_ALGORITHMS = MATCHING_ALGORITHMS.filter(
(a) => a.id !== MATCH_AUTO
)
const FILTER_HANDLERS: Record<TriggerFilterType, FilterHandler> = {
[TriggerFilterType.TagsAny]: {
apply: (aggregate, values) => {
aggregate.filter_has_tags = Array.isArray(values) ? [...values] : [values]
},
extract: (trigger) => trigger.filter_has_tags,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.TagsAll]: {
apply: (aggregate, values) => {
aggregate.filter_has_all_tags = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_all_tags,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.TagsNone]: {
apply: (aggregate, values) => {
aggregate.filter_has_not_tags = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_not_tags,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.CorrespondentIs]: {
apply: (aggregate, values) => {
aggregate.filter_has_correspondent = Array.isArray(values)
? (values[0] ?? null)
: values
},
extract: (trigger) => trigger.filter_has_correspondent,
hasValue: (value) => value !== null && value !== undefined,
},
[TriggerFilterType.CorrespondentNot]: {
apply: (aggregate, values) => {
aggregate.filter_has_not_correspondents = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_not_correspondents,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.DocumentTypeIs]: {
apply: (aggregate, values) => {
aggregate.filter_has_document_type = Array.isArray(values)
? (values[0] ?? null)
: values
},
extract: (trigger) => trigger.filter_has_document_type,
hasValue: (value) => value !== null && value !== undefined,
},
[TriggerFilterType.DocumentTypeNot]: {
apply: (aggregate, values) => {
aggregate.filter_has_not_document_types = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_not_document_types,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.StoragePathIs]: {
apply: (aggregate, values) => {
aggregate.filter_has_storage_path = Array.isArray(values)
? (values[0] ?? null)
: values
},
extract: (trigger) => trigger.filter_has_storage_path,
hasValue: (value) => value !== null && value !== undefined,
},
[TriggerFilterType.StoragePathNot]: {
apply: (aggregate, values) => {
aggregate.filter_has_not_storage_paths = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_not_storage_paths,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.CustomFieldQuery]: {
apply: (aggregate, values) => {
aggregate.filter_custom_field_query = values as string
},
extract: (trigger) => trigger.filter_custom_field_query,
hasValue: (value) =>
typeof value === 'string' && value !== null && value.trim().length > 0,
},
}
@Component({
selector: 'pngx-workflow-edit-dialog',
templateUrl: './workflow-edit-dialog.component.html',
@@ -153,6 +384,7 @@ const TRIGGER_MATCHING_ALGORITHMS = MATCHING_ALGORITHMS.filter(
TextAreaComponent,
TagsComponent,
CustomFieldsValuesComponent,
CustomFieldsQueryDropdownComponent,
PermissionsGroupComponent,
PermissionsUserComponent,
ConfirmButtonComponent,
@@ -170,6 +402,8 @@ export class WorkflowEditDialogComponent
{
public WorkflowTriggerType = WorkflowTriggerType
public WorkflowActionType = WorkflowActionType
public TriggerFilterType = TriggerFilterType
public filterDefinitions = TRIGGER_FILTER_DEFINITIONS
private correspondentService: CorrespondentService
private documentTypeService: DocumentTypeService
@@ -189,6 +423,11 @@ export class WorkflowEditDialogComponent
private allowedActionTypes = []
private readonly triggerFilterOptionsMap = new WeakMap<
FormArray,
TriggerFilterOption[]
>()
constructor() {
super()
this.service = inject(WorkflowService)
@@ -390,6 +629,416 @@ export class WorkflowEditDialogComponent
return this.objectForm.get('actions') as FormArray
}
protected override getFormValues(): any {
const formValues = super.getFormValues()
if (formValues?.triggers?.length) {
formValues.triggers = formValues.triggers.map(
(trigger: any, index: number) => {
const triggerFormGroup = this.triggerFields.at(index) as FormGroup
const filters = this.getFiltersFormArray(triggerFormGroup)
const aggregate: TriggerFilterAggregate = {
filter_has_tags: [],
filter_has_all_tags: [],
filter_has_not_tags: [],
filter_has_not_correspondents: [],
filter_has_not_document_types: [],
filter_has_not_storage_paths: [],
filter_has_correspondent: null,
filter_has_document_type: null,
filter_has_storage_path: null,
filter_custom_field_query: null,
}
for (const control of filters.controls) {
const type = control.get('type').value as TriggerFilterType
const values = control.get('values').value
if (values === null || values === undefined) {
continue
}
if (Array.isArray(values) && values.length === 0) {
continue
}
const handler = FILTER_HANDLERS[type]
handler?.apply(aggregate, values)
}
trigger.filter_has_tags = aggregate.filter_has_tags
trigger.filter_has_all_tags = aggregate.filter_has_all_tags
trigger.filter_has_not_tags = aggregate.filter_has_not_tags
trigger.filter_has_not_correspondents =
aggregate.filter_has_not_correspondents
trigger.filter_has_not_document_types =
aggregate.filter_has_not_document_types
trigger.filter_has_not_storage_paths =
aggregate.filter_has_not_storage_paths
trigger.filter_has_correspondent =
aggregate.filter_has_correspondent ?? null
trigger.filter_has_document_type =
aggregate.filter_has_document_type ?? null
trigger.filter_has_storage_path =
aggregate.filter_has_storage_path ?? null
trigger.filter_custom_field_query =
aggregate.filter_custom_field_query ?? null
delete trigger.filters
return trigger
}
)
}
return formValues
}
public matchingPatternRequired(formGroup: FormGroup): boolean {
return formGroup.get('matching_algorithm').value !== MATCH_NONE
}
private createFilterFormGroup(
type: TriggerFilterType,
initialValue?: any
): FormGroup {
const group = new FormGroup({
type: new FormControl(type),
values: new FormControl(this.normalizeFilterValue(type, initialValue)),
})
group.get('type').valueChanges.subscribe((newType: TriggerFilterType) => {
if (newType === TriggerFilterType.CustomFieldQuery) {
this.ensureCustomFieldQueryModel(group)
} else {
this.clearCustomFieldQueryModel(group)
group.get('values').setValue(this.getDefaultFilterValue(newType), {
emitEvent: false,
})
}
})
if (type === TriggerFilterType.CustomFieldQuery) {
this.ensureCustomFieldQueryModel(group, initialValue)
}
return group
}
private buildFiltersFormArray(trigger: WorkflowTrigger): FormArray {
const filters = new FormArray([])
for (const definition of this.filterDefinitions) {
const handler = FILTER_HANDLERS[definition.id]
if (!handler) {
continue
}
const value = handler.extract(trigger)
if (!handler.hasValue(value)) {
continue
}
filters.push(this.createFilterFormGroup(definition.id, value))
}
return filters
}
getFiltersFormArray(formGroup: FormGroup): FormArray {
return formGroup.get('filters') as FormArray
}
getFilterTypeOptions(formGroup: FormGroup, filterIndex: number) {
const filters = this.getFiltersFormArray(formGroup)
const options = this.getFilterTypeOptionsForArray(filters)
const currentType = filters.at(filterIndex).get('type')
.value as TriggerFilterType
const usedTypes = new Set(
filters.controls.map(
(control) => control.get('type').value as TriggerFilterType
)
)
for (const option of options) {
if (option.allowMultipleEntries) {
option.disabled = false
continue
}
option.disabled = usedTypes.has(option.id) && option.id !== currentType
}
return options
}
canAddFilter(formGroup: FormGroup): boolean {
const filters = this.getFiltersFormArray(formGroup)
const usedTypes = new Set(
filters.controls.map(
(control) => control.get('type').value as TriggerFilterType
)
)
return this.filterDefinitions.some((definition) => {
if (definition.allowMultipleEntries) {
return true
}
return !usedTypes.has(definition.id)
})
}
addFilter(triggerFormGroup: FormGroup): FormGroup | null {
const triggerIndex = this.triggerFields.controls.indexOf(triggerFormGroup)
if (triggerIndex === -1) {
return null
}
const filters = this.getFiltersFormArray(triggerFormGroup)
const availableDefinition = this.filterDefinitions.find((definition) => {
if (definition.allowMultipleEntries) {
return true
}
return !filters.controls.some(
(control) => control.get('type').value === definition.id
)
})
if (!availableDefinition) {
return null
}
filters.push(this.createFilterFormGroup(availableDefinition.id))
triggerFormGroup.markAsDirty()
triggerFormGroup.markAsTouched()
return filters.at(-1) as FormGroup
}
removeFilter(triggerFormGroup: FormGroup, filterIndex: number) {
const triggerIndex = this.triggerFields.controls.indexOf(triggerFormGroup)
if (triggerIndex === -1) {
return
}
const filters = this.getFiltersFormArray(triggerFormGroup)
const filterGroup = filters.at(filterIndex) as FormGroup
if (filterGroup?.get('type').value === TriggerFilterType.CustomFieldQuery) {
this.clearCustomFieldQueryModel(filterGroup)
}
filters.removeAt(filterIndex)
triggerFormGroup.markAsDirty()
triggerFormGroup.markAsTouched()
}
getFilterDefinition(
type: TriggerFilterType
): TriggerFilterDefinition | undefined {
return this.filterDefinitions.find((definition) => definition.id === type)
}
getFilterName(type: TriggerFilterType): string {
return this.getFilterDefinition(type)?.name ?? ''
}
isTagsFilter(type: TriggerFilterType): boolean {
return this.getFilterDefinition(type)?.inputType === 'tags'
}
isCustomFieldQueryFilter(type: TriggerFilterType): boolean {
return this.getFilterDefinition(type)?.inputType === 'customFieldQuery'
}
isMultiValueFilter(type: TriggerFilterType): boolean {
switch (type) {
case TriggerFilterType.TagsAny:
case TriggerFilterType.TagsAll:
case TriggerFilterType.TagsNone:
case TriggerFilterType.CorrespondentNot:
case TriggerFilterType.DocumentTypeNot:
case TriggerFilterType.StoragePathNot:
return true
default:
return false
}
}
isSelectMultiple(type: TriggerFilterType): boolean {
return !this.isTagsFilter(type) && this.isMultiValueFilter(type)
}
getFilterSelectItems(type: TriggerFilterType) {
const definition = this.getFilterDefinition(type)
if (!definition || definition.inputType !== 'select') {
return []
}
switch (definition.selectItems) {
case 'correspondents':
return this.correspondents
case 'documentTypes':
return this.documentTypes
case 'storagePaths':
return this.storagePaths
default:
return []
}
}
getCustomFieldQueryModel(control: AbstractControl): CustomFieldQueriesModel {
return this.ensureCustomFieldQueryModel(control as FormGroup)
}
onCustomFieldQuerySelectionChange(
control: AbstractControl,
model: CustomFieldQueriesModel
) {
this.onCustomFieldQueryModelChanged(control as FormGroup, model)
}
isCustomFieldQueryValid(control: AbstractControl): boolean {
const model = this.getStoredCustomFieldQueryModel(control as FormGroup)
if (!model) {
return true
}
return model.isEmpty() || model.isValid()
}
private getFilterTypeOptionsForArray(
filters: FormArray
): TriggerFilterOption[] {
let cached = this.triggerFilterOptionsMap.get(filters)
if (!cached) {
cached = this.filterDefinitions.map((definition) => ({
...definition,
disabled: false,
}))
this.triggerFilterOptionsMap.set(filters, cached)
}
return cached
}
private ensureCustomFieldQueryModel(
filterGroup: FormGroup,
initialValue?: any
): CustomFieldQueriesModel {
const existingModel = this.getStoredCustomFieldQueryModel(filterGroup)
if (existingModel) {
return existingModel
}
const model = new CustomFieldQueriesModel()
this.setCustomFieldQueryModel(filterGroup, model)
const rawValue =
typeof initialValue === 'string'
? initialValue
: (filterGroup.get('values').value as string)
if (rawValue) {
try {
const parsed = JSON.parse(rawValue)
const expression = new CustomFieldQueryExpression(parsed)
model.queries = [expression]
} catch {
model.clear(false)
model.addInitialAtom()
}
}
const subscription = model.changed
.pipe(takeUntil(this.unsubscribeNotifier))
.subscribe(() => {
this.onCustomFieldQueryModelChanged(filterGroup, model)
})
filterGroup[CUSTOM_FIELD_QUERY_SUBSCRIPTION_KEY]?.unsubscribe()
filterGroup[CUSTOM_FIELD_QUERY_SUBSCRIPTION_KEY] = subscription
this.onCustomFieldQueryModelChanged(filterGroup, model)
return model
}
private clearCustomFieldQueryModel(filterGroup: FormGroup) {
const group = filterGroup as CustomFieldFilterGroup
group[CUSTOM_FIELD_QUERY_SUBSCRIPTION_KEY]?.unsubscribe()
delete group[CUSTOM_FIELD_QUERY_SUBSCRIPTION_KEY]
delete group[CUSTOM_FIELD_QUERY_MODEL_KEY]
}
private getStoredCustomFieldQueryModel(
filterGroup: FormGroup
): CustomFieldQueriesModel | null {
return (
(filterGroup as CustomFieldFilterGroup)[CUSTOM_FIELD_QUERY_MODEL_KEY] ??
null
)
}
private setCustomFieldQueryModel(
filterGroup: FormGroup,
model: CustomFieldQueriesModel
) {
const group = filterGroup as CustomFieldFilterGroup
group[CUSTOM_FIELD_QUERY_MODEL_KEY] = model
}
private onCustomFieldQueryModelChanged(
filterGroup: FormGroup,
model: CustomFieldQueriesModel
) {
const control = filterGroup.get('values')
if (!control) {
return
}
if (!model.isValid()) {
control.setValue(null, { emitEvent: false })
return
}
if (model.isEmpty()) {
control.setValue(null, { emitEvent: false })
return
}
const serialized = JSON.stringify(model.queries[0].serialize())
control.setValue(serialized, { emitEvent: false })
}
private getDefaultFilterValue(type: TriggerFilterType) {
if (type === TriggerFilterType.CustomFieldQuery) {
return null
}
return this.isMultiValueFilter(type) ? [] : null
}
private normalizeFilterValue(type: TriggerFilterType, value?: any) {
if (value === undefined || value === null) {
return this.getDefaultFilterValue(type)
}
if (type === TriggerFilterType.CustomFieldQuery) {
if (typeof value === 'string') {
return value
}
return value ? JSON.stringify(value) : null
}
if (this.isMultiValueFilter(type)) {
return Array.isArray(value) ? [...value] : [value]
}
if (Array.isArray(value)) {
return value.length > 0 ? value[0] : null
}
return value
}
private createTriggerField(
trigger: WorkflowTrigger,
emitEvent: boolean = false
@@ -405,16 +1054,7 @@ export class WorkflowEditDialogComponent
matching_algorithm: new FormControl(trigger.matching_algorithm),
match: new FormControl(trigger.match),
is_insensitive: new FormControl(trigger.is_insensitive),
filter_has_tags: new FormControl(trigger.filter_has_tags),
filter_has_correspondent: new FormControl(
trigger.filter_has_correspondent
),
filter_has_document_type: new FormControl(
trigger.filter_has_document_type
),
filter_has_storage_path: new FormControl(
trigger.filter_has_storage_path
),
filters: this.buildFiltersFormArray(trigger),
schedule_offset_days: new FormControl(trigger.schedule_offset_days),
schedule_is_recurring: new FormControl(trigger.schedule_is_recurring),
schedule_recurring_interval_days: new FormControl(
@@ -537,6 +1177,12 @@ export class WorkflowEditDialogComponent
filter_path: null,
filter_mailrule: null,
filter_has_tags: [],
filter_has_all_tags: [],
filter_has_not_tags: [],
filter_has_not_correspondents: [],
filter_has_not_document_types: [],
filter_has_not_storage_paths: [],
filter_custom_field_query: null,
filter_has_correspondent: null,
filter_has_document_type: null,
filter_has_storage_path: null,

View File

@@ -1,66 +1,68 @@
<div class="mb-3 paperless-input-select" [class.disabled]="disabled">
<div class="row">
<div class="d-flex align-items-center position-relative hidden-button-container" [class.col-md-3]="horizontal">
@if (title) {
<label class="form-label" [class.mb-md-0]="horizontal" [for]="inputId">{{title}}</label>
}
@if (removable) {
<button type="button" class="btn btn-sm btn-danger position-absolute left-0" (click)="removed.emit(this)">
<i-bs name="x"></i-bs>&nbsp;<ng-container i18n>Remove</ng-container>
@if (title || removable) {
<div class="d-flex align-items-center position-relative hidden-button-container" [class.col-md-3]="horizontal">
@if (title) {
<label class="form-label" [class.mb-md-0]="horizontal" [for]="inputId">{{title}}</label>
}
@if (removable) {
<button type="button" class="btn btn-sm btn-danger position-absolute left-0" (click)="removed.emit(this)">
<i-bs name="x"></i-bs>&nbsp;<ng-container i18n>Remove</ng-container>
</button>
}
</div>
}
<div [class.col-md-9]="horizontal">
<div [class.input-group]="allowCreateNew || showFilter" [class.is-invalid]="error">
<ng-select name="inputId" [(ngModel)]="value"
[disabled]="disabled"
[style.color]="textColor"
[style.background]="backgroundColor"
[class.private]="isPrivate"
[clearable]="allowNull"
[items]="items"
[addTag]="allowCreateNew && addItemRef"
addTagText="Add item"
i18n-addTagText="Used for both types, correspondents, storage paths"
[placeholder]="placeholder"
[notFoundText]="notFoundText"
[multiple]="multiple"
[bindLabel]="bindLabel"
bindValue="id"
(change)="onChange(value)"
(search)="onSearch($event)"
(focus)="clearLastSearchTerm()"
(clear)="clearLastSearchTerm()"
(blur)="onBlur()">
<ng-template ng-option-tmp let-item="item">
<span [title]="item[bindLabel]">{{item[bindLabel]}}</span>
</ng-template>
</ng-select>
@if (allowCreateNew && !hideAddButton) {
<button class="btn btn-outline-secondary" type="button" (click)="addItem()" [disabled]="disabled">
<i-bs width="1.2em" height="1.2em" name="plus"></i-bs>
</button>
}
@if (showFilter) {
<button class="btn btn-outline-secondary" type="button" (click)="onFilterDocuments()" [disabled]="isPrivate || this.value === null" title="{{ filterButtonTitle }}">
<i-bs width="1.2em" height="1.2em" name="filter"></i-bs>
</button>
}
</div>
<div [class.col-md-9]="horizontal">
<div [class.input-group]="allowCreateNew || showFilter" [class.is-invalid]="error">
<ng-select name="inputId" [(ngModel)]="value"
[disabled]="disabled"
[style.color]="textColor"
[style.background]="backgroundColor"
[class.private]="isPrivate"
[clearable]="allowNull"
[items]="items"
[addTag]="allowCreateNew && addItemRef"
addTagText="Add item"
i18n-addTagText="Used for both types, correspondents, storage paths"
[placeholder]="placeholder"
[notFoundText]="notFoundText"
[multiple]="multiple"
[bindLabel]="bindLabel"
bindValue="id"
(change)="onChange(value)"
(search)="onSearch($event)"
(focus)="clearLastSearchTerm()"
(clear)="clearLastSearchTerm()"
(blur)="onBlur()">
<ng-template ng-option-tmp let-item="item">
<span [title]="item[bindLabel]">{{item[bindLabel]}}</span>
</ng-template>
</ng-select>
@if (allowCreateNew && !hideAddButton) {
<button class="btn btn-outline-secondary" type="button" (click)="addItem()" [disabled]="disabled">
<i-bs width="1.2em" height="1.2em" name="plus"></i-bs>
</button>
}
@if (showFilter) {
<button class="btn btn-outline-secondary" type="button" (click)="onFilterDocuments()" [disabled]="isPrivate || this.value === null" title="{{ filterButtonTitle }}">
<i-bs width="1.2em" height="1.2em" name="filter"></i-bs>
</button>
}
</div>
<div class="invalid-feedback">
{{error}}
</div>
@if (hint) {
<small class="form-text text-muted">{{hint}}</small>
}
@if (getSuggestions().length > 0) {
<small>
<span i18n>Suggestions:</span>&nbsp;
@for (s of getSuggestions(); track s) {
<a (click)="value = s.id; onChange(value)" [routerLink]="[]">{{s.name}}</a>&nbsp;
}
</small>
}
<div class="invalid-feedback">
{{error}}
</div>
@if (hint) {
<small class="form-text text-muted">{{hint}}</small>
}
@if (getSuggestions().length > 0) {
<small>
<span i18n>Suggestions:</span>&nbsp;
@for (s of getSuggestions(); track s) {
<a (click)="value = s.id; onChange(value)" [routerLink]="[]">{{s.name}}</a>&nbsp;
}
</small>
}
</div>
</div>
</div>

View File

@@ -1,8 +1,10 @@
<div class="mb-3 paperless-input-select paperless-input-tags" [class.disabled]="disabled" [class.pb-3]="getSuggestions().length > 0">
<div class="row">
<div class="d-flex align-items-center" [class.col-md-3]="horizontal">
<label class="form-label" [class.mb-md-0]="horizontal" for="tags">{{title}}</label>
</div>
@if (title) {
<div class="d-flex align-items-center" [class.col-md-3]="horizontal">
<label class="form-label" [class.mb-md-0]="horizontal" for="tags">{{title}}</label>
</div>
}
<div class="position-relative" [class.col-md-9]="horizontal">
<div class="input-group flex-nowrap">
<ng-select #tagSelect name="tags" [items]="tags" bindLabel="name" bindValue="id" [(ngModel)]="value"

View File

@@ -40,6 +40,18 @@ export interface WorkflowTrigger extends ObjectWithId {
filter_has_tags?: number[] // Tag.id[]
filter_has_all_tags?: number[] // Tag.id[]
filter_has_not_tags?: number[] // Tag.id[]
filter_has_not_correspondents?: number[] // Correspondent.id[]
filter_has_not_document_types?: number[] // DocumentType.id[]
filter_has_not_storage_paths?: number[] // StoragePath.id[]
filter_custom_field_query?: string
filter_has_correspondent?: number // Correspondent.id
filter_has_document_type?: number // DocumentType.id

View File

@@ -6,8 +6,11 @@ from fnmatch import fnmatch
from fnmatch import translate as fnmatch_translate
from typing import TYPE_CHECKING
from rest_framework import serializers
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentSource
from documents.filters import CustomFieldQueryParser
from documents.models import Correspondent
from documents.models import Document
from documents.models import DocumentType
@@ -360,10 +363,9 @@ def existing_document_matches_workflow(
)
trigger_matched = False
# Document tags vs trigger has_tags
if (
trigger.filter_has_tags.all().count() > 0
and document.tags.filter(
# Document tags vs trigger has_tags (any of)
if trigger.filter_has_tags.all().count() > 0 and (
document.tags.filter(
id__in=trigger.filter_has_tags.all().values_list("id"),
).count()
== 0
@@ -374,35 +376,126 @@ def existing_document_matches_workflow(
)
trigger_matched = False
# Document correspondent vs trigger has_correspondent
# Document tags vs trigger has_all_tags (all of)
if trigger.filter_has_all_tags.all().count() > 0 and trigger_matched:
required_tag_ids = set(
trigger.filter_has_all_tags.all().values_list("id", flat=True),
)
document_tag_ids = set(
document.tags.all().values_list("id", flat=True),
)
missing_tags = required_tag_ids - document_tag_ids
if missing_tags:
reason = (
f"Document tags {document.tags.all()} do not contain all of"
f" {trigger.filter_has_all_tags.all()}",
)
trigger_matched = False
# Document tags vs trigger has_not_tags (none of)
if (
trigger.filter_has_correspondent is not None
and document.correspondent != trigger.filter_has_correspondent
trigger.filter_has_not_tags.all().count() > 0
and trigger_matched
and document.tags.filter(
id__in=trigger.filter_has_not_tags.all().values_list("id"),
).exists()
):
reason = (
f"Document correspondent {document.correspondent} does not match {trigger.filter_has_correspondent}",
f"Document tags {document.tags.all()} include excluded tags"
f" {trigger.filter_has_not_tags.all()}",
)
trigger_matched = False
# Document correspondent vs trigger has_correspondent
if trigger_matched:
if (
trigger.filter_has_correspondent is not None
and document.correspondent != trigger.filter_has_correspondent
):
reason = (
f"Document correspondent {document.correspondent} does not match {trigger.filter_has_correspondent}",
)
trigger_matched = False
if (
trigger.filter_has_not_correspondents.all().count() > 0
and document.correspondent
and trigger.filter_has_not_correspondents.filter(
id=document.correspondent_id,
).exists()
):
reason = (
f"Document correspondent {document.correspondent} is excluded by"
f" {trigger.filter_has_not_correspondents.all()}",
)
trigger_matched = False
# Document document_type vs trigger has_document_type
if (
trigger.filter_has_document_type is not None
and document.document_type != trigger.filter_has_document_type
):
reason = (
f"Document doc type {document.document_type} does not match {trigger.filter_has_document_type}",
)
trigger_matched = False
if trigger_matched:
if (
trigger.filter_has_document_type is not None
and document.document_type != trigger.filter_has_document_type
):
reason = (
f"Document doc type {document.document_type} does not match {trigger.filter_has_document_type}",
)
trigger_matched = False
if (
trigger.filter_has_not_document_types.all().count() > 0
and document.document_type
and trigger.filter_has_not_document_types.filter(
id=document.document_type_id,
).exists()
):
reason = (
f"Document doc type {document.document_type} is excluded by"
f" {trigger.filter_has_not_document_types.all()}",
)
trigger_matched = False
# Document storage_path vs trigger has_storage_path
if (
trigger.filter_has_storage_path is not None
and document.storage_path != trigger.filter_has_storage_path
):
reason = (
f"Document storage path {document.storage_path} does not match {trigger.filter_has_storage_path}",
)
trigger_matched = False
if trigger_matched:
if (
trigger.filter_has_storage_path is not None
and document.storage_path != trigger.filter_has_storage_path
):
reason = (
f"Document storage path {document.storage_path} does not match {trigger.filter_has_storage_path}",
)
trigger_matched = False
if (
trigger.filter_has_not_storage_paths.all().count() > 0
and document.storage_path
and trigger.filter_has_not_storage_paths.filter(
id=document.storage_path_id,
).exists()
):
reason = (
f"Document storage path {document.storage_path} is excluded by"
f" {trigger.filter_has_not_storage_paths.all()}",
)
trigger_matched = False
if trigger_matched and trigger.filter_custom_field_query:
parser = CustomFieldQueryParser("filter_custom_field_query")
try:
custom_field_q, annotations = parser.parse(
trigger.filter_custom_field_query,
)
except serializers.ValidationError:
reason = "Invalid custom field query configuration"
trigger_matched = False
else:
qs = (
Document.objects.filter(id=document.id)
.annotate(**annotations)
.filter(custom_field_q)
)
if not qs.exists():
reason = "Document custom fields do not match the configured custom field query"
trigger_matched = False
# Document original_filename vs trigger filename
if (
@@ -438,21 +531,57 @@ def prefilter_documents_by_workflowtrigger(
tags__in=trigger.filter_has_tags.all(),
).distinct()
if trigger.filter_has_all_tags.all().count() > 0:
for tag_id in trigger.filter_has_all_tags.all().values_list("id", flat=True):
documents = documents.filter(tags__id=tag_id)
documents = documents.distinct()
if trigger.filter_has_not_tags.all().count() > 0:
documents = documents.exclude(
tags__in=trigger.filter_has_not_tags.all(),
).distinct()
if trigger.filter_has_correspondent is not None:
documents = documents.filter(
correspondent=trigger.filter_has_correspondent,
)
if trigger.filter_has_not_correspondents.all().count() > 0:
documents = documents.exclude(
correspondent__in=trigger.filter_has_not_correspondents.all(),
)
if trigger.filter_has_document_type is not None:
documents = documents.filter(
document_type=trigger.filter_has_document_type,
)
if trigger.filter_has_not_document_types.all().count() > 0:
documents = documents.exclude(
document_type__in=trigger.filter_has_not_document_types.all(),
)
if trigger.filter_has_storage_path is not None:
documents = documents.filter(
storage_path=trigger.filter_has_storage_path,
)
if trigger.filter_has_not_storage_paths.all().count() > 0:
documents = documents.exclude(
storage_path__in=trigger.filter_has_not_storage_paths.all(),
)
if trigger.filter_custom_field_query:
parser = CustomFieldQueryParser("filter_custom_field_query")
try:
custom_field_q, annotations = parser.parse(
trigger.filter_custom_field_query,
)
except serializers.ValidationError:
return documents.none()
documents = documents.annotate(**annotations).filter(custom_field_q)
if trigger.filter_filename is not None and len(trigger.filter_filename) > 0:
# the true fnmatch will actually run later so we just want a loose filter here
regex = fnmatch_translate(trigger.filter_filename).lstrip("^").rstrip("$")

View File

@@ -0,0 +1,73 @@
# Generated by Django 5.2.6 on 2025-10-07 18:52
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1071_tag_tn_ancestors_count_tag_tn_ancestors_pks_and_more"),
]
operations = [
migrations.AddField(
model_name="workflowtrigger",
name="filter_custom_field_query",
field=models.TextField(
blank=True,
help_text="JSON-encoded custom field query expression.",
null=True,
verbose_name="filter custom field query",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_all_tags",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_all",
to="documents.tag",
verbose_name="has all of these tag(s)",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_not_correspondents",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_not_correspondent",
to="documents.correspondent",
verbose_name="does not have these correspondent(s)",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_not_document_types",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_not_document_type",
to="documents.documenttype",
verbose_name="does not have these document type(s)",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_not_storage_paths",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_not_storage_path",
to="documents.storagepath",
verbose_name="does not have these storage path(s)",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_not_tags",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_not",
to="documents.tag",
verbose_name="does not have these tag(s)",
),
),
]

View File

@@ -1065,6 +1065,20 @@ class WorkflowTrigger(models.Model):
verbose_name=_("has these tag(s)"),
)
filter_has_all_tags = models.ManyToManyField(
Tag,
blank=True,
related_name="workflowtriggers_has_all",
verbose_name=_("has all of these tag(s)"),
)
filter_has_not_tags = models.ManyToManyField(
Tag,
blank=True,
related_name="workflowtriggers_has_not",
verbose_name=_("does not have these tag(s)"),
)
filter_has_document_type = models.ForeignKey(
DocumentType,
null=True,
@@ -1073,6 +1087,13 @@ class WorkflowTrigger(models.Model):
verbose_name=_("has this document type"),
)
filter_has_not_document_types = models.ManyToManyField(
DocumentType,
blank=True,
related_name="workflowtriggers_has_not_document_type",
verbose_name=_("does not have these document type(s)"),
)
filter_has_correspondent = models.ForeignKey(
Correspondent,
null=True,
@@ -1081,6 +1102,13 @@ class WorkflowTrigger(models.Model):
verbose_name=_("has this correspondent"),
)
filter_has_not_correspondents = models.ManyToManyField(
Correspondent,
blank=True,
related_name="workflowtriggers_has_not_correspondent",
verbose_name=_("does not have these correspondent(s)"),
)
filter_has_storage_path = models.ForeignKey(
StoragePath,
null=True,
@@ -1089,6 +1117,20 @@ class WorkflowTrigger(models.Model):
verbose_name=_("has this storage path"),
)
filter_has_not_storage_paths = models.ManyToManyField(
StoragePath,
blank=True,
related_name="workflowtriggers_has_not_storage_path",
verbose_name=_("does not have these storage path(s)"),
)
filter_custom_field_query = models.TextField(
_("filter custom field query"),
null=True,
blank=True,
help_text=_("JSON-encoded custom field query expression."),
)
schedule_offset_days = models.IntegerField(
_("schedule offset days"),
default=0,

View File

@@ -43,6 +43,7 @@ if settings.AUDIT_LOG_ENABLED:
from documents import bulk_edit
from documents.data_models import DocumentSource
from documents.filters import CustomFieldQueryParser
from documents.models import Correspondent
from documents.models import CustomField
from documents.models import CustomFieldInstance
@@ -2194,6 +2195,12 @@ class WorkflowTriggerSerializer(serializers.ModelSerializer):
"match",
"is_insensitive",
"filter_has_tags",
"filter_has_all_tags",
"filter_has_not_tags",
"filter_custom_field_query",
"filter_has_not_correspondents",
"filter_has_not_document_types",
"filter_has_not_storage_paths",
"filter_has_correspondent",
"filter_has_document_type",
"filter_has_storage_path",
@@ -2219,6 +2226,20 @@ class WorkflowTriggerSerializer(serializers.ModelSerializer):
):
attrs["filter_path"] = None
if (
"filter_custom_field_query" in attrs
and attrs["filter_custom_field_query"] is not None
and len(attrs["filter_custom_field_query"]) == 0
):
attrs["filter_custom_field_query"] = None
if (
"filter_custom_field_query" in attrs
and attrs["filter_custom_field_query"] is not None
):
parser = CustomFieldQueryParser("filter_custom_field_query")
parser.parse(attrs["filter_custom_field_query"])
trigger_type = attrs.get("type", getattr(self.instance, "type", None))
if (
trigger_type == WorkflowTrigger.WorkflowTriggerType.CONSUMPTION
@@ -2414,6 +2435,20 @@ class WorkflowSerializer(serializers.ModelSerializer):
if triggers is not None and triggers is not serializers.empty:
for trigger in triggers:
filter_has_tags = trigger.pop("filter_has_tags", None)
filter_has_all_tags = trigger.pop("filter_has_all_tags", None)
filter_has_not_tags = trigger.pop("filter_has_not_tags", None)
filter_has_not_correspondents = trigger.pop(
"filter_has_not_correspondents",
None,
)
filter_has_not_document_types = trigger.pop(
"filter_has_not_document_types",
None,
)
filter_has_not_storage_paths = trigger.pop(
"filter_has_not_storage_paths",
None,
)
# Convert sources to strings to handle django-multiselectfield v1.0 changes
WorkflowTriggerSerializer.normalize_workflow_trigger_sources(trigger)
trigger_instance, _ = WorkflowTrigger.objects.update_or_create(
@@ -2422,6 +2457,22 @@ class WorkflowSerializer(serializers.ModelSerializer):
)
if filter_has_tags is not None:
trigger_instance.filter_has_tags.set(filter_has_tags)
if filter_has_all_tags is not None:
trigger_instance.filter_has_all_tags.set(filter_has_all_tags)
if filter_has_not_tags is not None:
trigger_instance.filter_has_not_tags.set(filter_has_not_tags)
if filter_has_not_correspondents is not None:
trigger_instance.filter_has_not_correspondents.set(
filter_has_not_correspondents,
)
if filter_has_not_document_types is not None:
trigger_instance.filter_has_not_document_types.set(
filter_has_not_document_types,
)
if filter_has_not_storage_paths is not None:
trigger_instance.filter_has_not_storage_paths.set(
filter_has_not_storage_paths,
)
set_triggers.append(trigger_instance)
if actions is not None and actions is not serializers.empty:

View File

@@ -184,6 +184,17 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
"filter_filename": "*",
"filter_path": "*/samples/*",
"filter_has_tags": [self.t1.id],
"filter_has_all_tags": [self.t2.id],
"filter_has_not_tags": [self.t3.id],
"filter_has_not_correspondents": [self.c2.id],
"filter_has_not_document_types": [self.dt2.id],
"filter_has_not_storage_paths": [self.sp2.id],
"filter_custom_field_query": json.dumps(
[
"AND",
[[self.cf1.id, "exact", "value"]],
],
),
"filter_has_document_type": self.dt.id,
"filter_has_correspondent": self.c.id,
"filter_has_storage_path": self.sp.id,
@@ -223,6 +234,36 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
)
self.assertEqual(response.status_code, status.HTTP_201_CREATED)
self.assertEqual(Workflow.objects.count(), 2)
workflow = Workflow.objects.get(name="Workflow 2")
trigger = workflow.triggers.first()
self.assertSetEqual(
set(trigger.filter_has_tags.values_list("id", flat=True)),
{self.t1.id},
)
self.assertSetEqual(
set(trigger.filter_has_all_tags.values_list("id", flat=True)),
{self.t2.id},
)
self.assertSetEqual(
set(trigger.filter_has_not_tags.values_list("id", flat=True)),
{self.t3.id},
)
self.assertSetEqual(
set(trigger.filter_has_not_correspondents.values_list("id", flat=True)),
{self.c2.id},
)
self.assertSetEqual(
set(trigger.filter_has_not_document_types.values_list("id", flat=True)),
{self.dt2.id},
)
self.assertSetEqual(
set(trigger.filter_has_not_storage_paths.values_list("id", flat=True)),
{self.sp2.id},
)
self.assertEqual(
trigger.filter_custom_field_query,
json.dumps(["AND", [[self.cf1.id, "exact", "value"]]]),
)
def test_api_create_invalid_workflow_trigger(self):
"""
@@ -376,6 +417,14 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
{
"type": WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
"filter_has_tags": [self.t1.id],
"filter_has_all_tags": [self.t2.id],
"filter_has_not_tags": [self.t3.id],
"filter_has_not_correspondents": [self.c2.id],
"filter_has_not_document_types": [self.dt2.id],
"filter_has_not_storage_paths": [self.sp2.id],
"filter_custom_field_query": json.dumps(
["AND", [[self.cf1.id, "exact", "value"]]],
),
"filter_has_correspondent": self.c.id,
"filter_has_document_type": self.dt.id,
},
@@ -393,6 +442,30 @@ class TestApiWorkflows(DirectoriesMixin, APITestCase):
workflow = Workflow.objects.get(id=response.data["id"])
self.assertEqual(workflow.name, "Workflow Updated")
self.assertEqual(workflow.triggers.first().filter_has_tags.first(), self.t1)
self.assertEqual(
workflow.triggers.first().filter_has_all_tags.first(),
self.t2,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_tags.first(),
self.t3,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_correspondents.first(),
self.c2,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_document_types.first(),
self.dt2,
)
self.assertEqual(
workflow.triggers.first().filter_has_not_storage_paths.first(),
self.sp2,
)
self.assertEqual(
workflow.triggers.first().filter_custom_field_query,
json.dumps(["AND", [[self.cf1.id, "exact", "value"]]]),
)
self.assertEqual(workflow.actions.first().assign_title, "Action New Title")
def test_api_update_workflow_no_trigger_actions(self):

View File

@@ -1,4 +1,5 @@
import datetime
import json
import shutil
import socket
from datetime import timedelta
@@ -31,6 +32,7 @@ from documents import tasks
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentSource
from documents.matching import document_matches_workflow
from documents.matching import existing_document_matches_workflow
from documents.matching import prefilter_documents_by_workflowtrigger
from documents.models import Correspondent
from documents.models import CustomField
@@ -46,6 +48,7 @@ from documents.models import WorkflowActionEmail
from documents.models import WorkflowActionWebhook
from documents.models import WorkflowRun
from documents.models import WorkflowTrigger
from documents.serialisers import WorkflowTriggerSerializer
from documents.signals import document_consumption_finished
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import DummyProgressManager
@@ -1083,6 +1086,406 @@ class TestWorkflows(
expected_str = f"Document tags {doc.tags.all()} do not include {trigger.filter_has_tags.all()}"
self.assertIn(expected_str, cm.output[1])
def test_document_added_no_match_all_tags(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
)
trigger.filter_has_all_tags.set([self.t1, self.t2])
action = WorkflowAction.objects.create(
assign_title="Doc assign owner",
assign_owner=self.user2,
)
w = Workflow.objects.create(
name="Workflow 1",
order=0,
)
w.triggers.add(trigger)
w.actions.add(action)
w.save()
doc = Document.objects.create(
title="sample test",
correspondent=self.c,
original_filename="sample.pdf",
)
doc.tags.set([self.t1])
doc.save()
with self.assertLogs("paperless.matching", level="DEBUG") as cm:
document_consumption_finished.send(
sender=self.__class__,
document=doc,
)
expected_str = f"Document did not match {w}"
self.assertIn(expected_str, cm.output[0])
expected_str = (
f"Document tags {doc.tags.all()} do not contain all of"
f" {trigger.filter_has_all_tags.all()}"
)
self.assertIn(expected_str, cm.output[1])
def test_document_added_excluded_tags(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
)
trigger.filter_has_not_tags.set([self.t3])
action = WorkflowAction.objects.create(
assign_title="Doc assign owner",
assign_owner=self.user2,
)
w = Workflow.objects.create(
name="Workflow 1",
order=0,
)
w.triggers.add(trigger)
w.actions.add(action)
w.save()
doc = Document.objects.create(
title="sample test",
correspondent=self.c,
original_filename="sample.pdf",
)
doc.tags.set([self.t3])
doc.save()
with self.assertLogs("paperless.matching", level="DEBUG") as cm:
document_consumption_finished.send(
sender=self.__class__,
document=doc,
)
expected_str = f"Document did not match {w}"
self.assertIn(expected_str, cm.output[0])
expected_str = (
f"Document tags {doc.tags.all()} include excluded tags"
f" {trigger.filter_has_not_tags.all()}"
)
self.assertIn(expected_str, cm.output[1])
def test_document_added_excluded_correspondent(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
)
trigger.filter_has_not_correspondents.set([self.c])
action = WorkflowAction.objects.create(
assign_title="Doc assign owner",
assign_owner=self.user2,
)
w = Workflow.objects.create(
name="Workflow 1",
order=0,
)
w.triggers.add(trigger)
w.actions.add(action)
w.save()
doc = Document.objects.create(
title="sample test",
correspondent=self.c,
original_filename="sample.pdf",
)
with self.assertLogs("paperless.matching", level="DEBUG") as cm:
document_consumption_finished.send(
sender=self.__class__,
document=doc,
)
expected_str = f"Document did not match {w}"
self.assertIn(expected_str, cm.output[0])
expected_str = (
f"Document correspondent {doc.correspondent} is excluded by"
f" {trigger.filter_has_not_correspondents.all()}"
)
self.assertIn(expected_str, cm.output[1])
def test_document_added_excluded_document_types(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
)
trigger.filter_has_not_document_types.set([self.dt])
action = WorkflowAction.objects.create(
assign_title="Doc assign owner",
assign_owner=self.user2,
)
w = Workflow.objects.create(
name="Workflow 1",
order=0,
)
w.triggers.add(trigger)
w.actions.add(action)
w.save()
doc = Document.objects.create(
title="sample test",
document_type=self.dt,
original_filename="sample.pdf",
)
with self.assertLogs("paperless.matching", level="DEBUG") as cm:
document_consumption_finished.send(
sender=self.__class__,
document=doc,
)
expected_str = f"Document did not match {w}"
self.assertIn(expected_str, cm.output[0])
expected_str = (
f"Document doc type {doc.document_type} is excluded by"
f" {trigger.filter_has_not_document_types.all()}"
)
self.assertIn(expected_str, cm.output[1])
def test_document_added_excluded_storage_paths(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
)
trigger.filter_has_not_storage_paths.set([self.sp])
action = WorkflowAction.objects.create(
assign_title="Doc assign owner",
assign_owner=self.user2,
)
w = Workflow.objects.create(
name="Workflow 1",
order=0,
)
w.triggers.add(trigger)
w.actions.add(action)
w.save()
doc = Document.objects.create(
title="sample test",
storage_path=self.sp,
original_filename="sample.pdf",
)
with self.assertLogs("paperless.matching", level="DEBUG") as cm:
document_consumption_finished.send(
sender=self.__class__,
document=doc,
)
expected_str = f"Document did not match {w}"
self.assertIn(expected_str, cm.output[0])
expected_str = (
f"Document storage path {doc.storage_path} is excluded by"
f" {trigger.filter_has_not_storage_paths.all()}"
)
self.assertIn(expected_str, cm.output[1])
def test_document_added_custom_field_query_no_match(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
filter_custom_field_query=json.dumps(
[
"AND",
[[self.cf1.id, "exact", "expected"]],
],
),
)
action = WorkflowAction.objects.create(
assign_title="Doc assign owner",
assign_owner=self.user2,
)
workflow = Workflow.objects.create(name="Workflow 1", order=0)
workflow.triggers.add(trigger)
workflow.actions.add(action)
workflow.save()
doc = Document.objects.create(
title="sample test",
correspondent=self.c,
original_filename="sample.pdf",
)
CustomFieldInstance.objects.create(
document=doc,
field=self.cf1,
value_text="other",
)
with self.assertLogs("paperless.matching", level="DEBUG") as cm:
document_consumption_finished.send(
sender=self.__class__,
document=doc,
)
expected_str = f"Document did not match {workflow}"
self.assertIn(expected_str, cm.output[0])
self.assertIn(
"Document custom fields do not match the configured custom field query",
cm.output[1],
)
def test_document_added_custom_field_query_match(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
filter_custom_field_query=json.dumps(
[
"AND",
[[self.cf1.id, "exact", "expected"]],
],
),
)
doc = Document.objects.create(
title="sample test",
correspondent=self.c,
original_filename="sample.pdf",
)
CustomFieldInstance.objects.create(
document=doc,
field=self.cf1,
value_text="expected",
)
matched, reason = existing_document_matches_workflow(doc, trigger)
self.assertTrue(matched)
self.assertEqual(reason, "")
def test_prefilter_documents_custom_field_query(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
filter_custom_field_query=json.dumps(
[
"AND",
[[self.cf1.id, "exact", "match"]],
],
),
)
doc1 = Document.objects.create(
title="doc 1",
correspondent=self.c,
original_filename="doc1.pdf",
checksum="checksum1",
)
CustomFieldInstance.objects.create(
document=doc1,
field=self.cf1,
value_text="match",
)
doc2 = Document.objects.create(
title="doc 2",
correspondent=self.c,
original_filename="doc2.pdf",
checksum="checksum2",
)
CustomFieldInstance.objects.create(
document=doc2,
field=self.cf1,
value_text="different",
)
filtered = prefilter_documents_by_workflowtrigger(
Document.objects.all(),
trigger,
)
self.assertIn(doc1, filtered)
self.assertNotIn(doc2, filtered)
def test_consumption_trigger_requires_filter_configuration(self):
serializer = WorkflowTriggerSerializer(
data={
"type": WorkflowTrigger.WorkflowTriggerType.CONSUMPTION,
},
)
self.assertFalse(serializer.is_valid())
errors = serializer.errors.get("non_field_errors", [])
self.assertIn(
"File name, path or mail rule filter are required",
[str(error) for error in errors],
)
def test_workflow_trigger_serializer_clears_empty_custom_field_query(self):
serializer = WorkflowTriggerSerializer(
data={
"type": WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
"filter_custom_field_query": "",
},
)
self.assertTrue(serializer.is_valid(), serializer.errors)
self.assertIsNone(serializer.validated_data.get("filter_custom_field_query"))
def test_existing_document_invalid_custom_field_query_configuration(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
filter_custom_field_query="{ not json",
)
document = Document.objects.create(
title="doc invalid query",
original_filename="invalid.pdf",
checksum="checksum-invalid-query",
)
matched, reason = existing_document_matches_workflow(document, trigger)
self.assertFalse(matched)
self.assertEqual(reason, "Invalid custom field query configuration")
def test_prefilter_documents_returns_none_for_invalid_custom_field_query(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
filter_custom_field_query="{ not json",
)
Document.objects.create(
title="doc",
original_filename="doc.pdf",
checksum="checksum-prefilter-invalid",
)
filtered = prefilter_documents_by_workflowtrigger(
Document.objects.all(),
trigger,
)
self.assertEqual(list(filtered), [])
def test_prefilter_documents_applies_all_filters(self):
other_document_type = DocumentType.objects.create(name="Other Type")
other_storage_path = StoragePath.objects.create(
name="Blocked path",
path="/blocked/",
)
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,
filter_has_correspondent=self.c,
filter_has_document_type=self.dt,
filter_has_storage_path=self.sp,
)
trigger.filter_has_tags.set([self.t1])
trigger.filter_has_all_tags.set([self.t1, self.t2])
trigger.filter_has_not_tags.set([self.t3])
trigger.filter_has_not_correspondents.set([self.c2])
trigger.filter_has_not_document_types.set([other_document_type])
trigger.filter_has_not_storage_paths.set([other_storage_path])
allowed_document = Document.objects.create(
title="allowed",
correspondent=self.c,
document_type=self.dt,
storage_path=self.sp,
original_filename="allow.pdf",
checksum="checksum-prefilter-allowed",
)
allowed_document.tags.set([self.t1, self.t2])
blocked_document = Document.objects.create(
title="blocked",
correspondent=self.c2,
document_type=other_document_type,
storage_path=other_storage_path,
original_filename="block.pdf",
checksum="checksum-prefilter-blocked",
)
blocked_document.tags.set([self.t1, self.t3])
filtered = prefilter_documents_by_workflowtrigger(
Document.objects.all(),
trigger,
)
self.assertIn(allowed_document, filtered)
self.assertNotIn(blocked_document, filtered)
def test_document_added_no_match_doctype(self):
trigger = WorkflowTrigger.objects.create(
type=WorkflowTrigger.WorkflowTriggerType.DOCUMENT_ADDED,

View File

@@ -322,7 +322,6 @@ INSTALLED_APPS = [
"paperless_tesseract.apps.PaperlessTesseractConfig",
"paperless_text.apps.PaperlessTextConfig",
"paperless_mail.apps.PaperlessMailConfig",
"paperless_remote.apps.PaperlessRemoteParserConfig",
"django.contrib.admin",
"rest_framework",
"rest_framework.authtoken",
@@ -1402,10 +1401,3 @@ WEBHOOKS_ALLOW_INTERNAL_REQUESTS = __get_boolean(
"PAPERLESS_WEBHOOKS_ALLOW_INTERNAL_REQUESTS",
"true",
)
###############################################################################
# Remote Parser #
###############################################################################
REMOTE_OCR_ENGINE = os.getenv("PAPERLESS_REMOTE_OCR_ENGINE")
REMOTE_OCR_API_KEY = os.getenv("PAPERLESS_REMOTE_OCR_API_KEY")
REMOTE_OCR_ENDPOINT = os.getenv("PAPERLESS_REMOTE_OCR_ENDPOINT")

View File

@@ -1,4 +0,0 @@
# this is here so that django finds the checks.
from paperless_remote.checks import check_remote_parser_configured
__all__ = ["check_remote_parser_configured"]

View File

@@ -1,14 +0,0 @@
from django.apps import AppConfig
from paperless_remote.signals import remote_consumer_declaration
class PaperlessRemoteParserConfig(AppConfig):
name = "paperless_remote"
def ready(self):
from documents.signals import document_consumer_declaration
document_consumer_declaration.connect(remote_consumer_declaration)
AppConfig.ready(self)

View File

@@ -1,17 +0,0 @@
from django.conf import settings
from django.core.checks import Error
from django.core.checks import register
@register()
def check_remote_parser_configured(app_configs, **kwargs):
if settings.REMOTE_OCR_ENGINE == "azureai" and not (
settings.REMOTE_OCR_ENDPOINT and settings.REMOTE_OCR_API_KEY
):
return [
Error(
"Azure AI remote parser requires endpoint and API key to be configured.",
),
]
return []

View File

@@ -1,113 +0,0 @@
from pathlib import Path
from django.conf import settings
from paperless_tesseract.parsers import RasterisedDocumentParser
class RemoteEngineConfig:
def __init__(
self,
engine: str,
api_key: str | None = None,
endpoint: str | None = None,
):
self.engine = engine
self.api_key = api_key
self.endpoint = endpoint
def engine_is_valid(self):
valid = self.engine in ["azureai"] and self.api_key is not None
if self.engine == "azureai":
valid = valid and self.endpoint is not None
return valid
class RemoteDocumentParser(RasterisedDocumentParser):
"""
This parser uses a remote OCR engine to parse documents. Currently, it supports Azure AI Vision
as this is the only service that provides a remote OCR API with text-embedded PDF output.
"""
logging_name = "paperless.parsing.remote"
def get_settings(self) -> RemoteEngineConfig:
"""
Returns the configuration for the remote OCR engine, loaded from Django settings.
"""
return RemoteEngineConfig(
engine=settings.REMOTE_OCR_ENGINE,
api_key=settings.REMOTE_OCR_API_KEY,
endpoint=settings.REMOTE_OCR_ENDPOINT,
)
def supported_mime_types(self):
if self.settings.engine_is_valid():
return {
"application/pdf": ".pdf",
"image/png": ".png",
"image/jpeg": ".jpg",
"image/tiff": ".tiff",
"image/bmp": ".bmp",
"image/gif": ".gif",
"image/webp": ".webp",
}
else:
return {}
def azure_ai_vision_parse(
self,
file: Path,
) -> str | None:
"""
Uses Azure AI Vision to parse the document and return the text content.
It requests a searchable PDF output with embedded text.
The PDF is saved to the archive_path attribute.
Returns the text content extracted from the document.
If the parsing fails, it returns None.
"""
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest
from azure.ai.documentintelligence.models import AnalyzeOutputOption
from azure.ai.documentintelligence.models import DocumentContentFormat
from azure.core.credentials import AzureKeyCredential
client = DocumentIntelligenceClient(
endpoint=self.settings.endpoint,
credential=AzureKeyCredential(self.settings.api_key),
)
with file.open("rb") as f:
analyze_request = AnalyzeDocumentRequest(bytes_source=f.read())
poller = client.begin_analyze_document(
model_id="prebuilt-read",
body=analyze_request,
output_content_format=DocumentContentFormat.TEXT,
output=[AnalyzeOutputOption.PDF], # request searchable PDF output
content_type="application/json",
)
poller.wait()
result_id = poller.details["operation_id"]
result = poller.result()
# Download the PDF with embedded text
self.archive_path = self.tempdir / "archive.pdf"
with self.archive_path.open("wb") as f:
for chunk in client.get_analyze_result_pdf(
model_id="prebuilt-read",
result_id=result_id,
):
f.write(chunk)
client.close()
return result.content
def parse(self, document_path: Path, mime_type, file_name=None):
if not self.settings.engine_is_valid():
self.log.warning(
"No valid remote parser engine is configured, content will be empty.",
)
self.text = ""
elif self.settings.engine == "azureai":
self.text = self.azure_ai_vision_parse(document_path)

View File

@@ -1,18 +0,0 @@
def get_parser(*args, **kwargs):
from paperless_remote.parsers import RemoteDocumentParser
return RemoteDocumentParser(*args, **kwargs)
def get_supported_mime_types():
from paperless_remote.parsers import RemoteDocumentParser
return RemoteDocumentParser(None).supported_mime_types()
def remote_consumer_declaration(sender, **kwargs):
return {
"parser": get_parser,
"weight": 5,
"mime_types": get_supported_mime_types(),
}

View File

@@ -1,24 +0,0 @@
from unittest import TestCase
from django.test import override_settings
from paperless_remote import check_remote_parser_configured
class TestChecks(TestCase):
@override_settings(REMOTE_OCR_ENGINE=None)
def test_no_engine(self):
msgs = check_remote_parser_configured(None)
self.assertEqual(len(msgs), 0)
@override_settings(REMOTE_OCR_ENGINE="azureai")
@override_settings(REMOTE_OCR_API_KEY="somekey")
@override_settings(REMOTE_OCR_ENDPOINT=None)
def test_azure_no_endpoint(self):
msgs = check_remote_parser_configured(None)
self.assertEqual(len(msgs), 1)
self.assertTrue(
msgs[0].msg.startswith(
"Azure AI remote parser requires endpoint and API key to be configured.",
),
)

View File

@@ -1,101 +0,0 @@
import uuid
from pathlib import Path
from unittest import mock
from django.test import TestCase
from django.test import override_settings
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import FileSystemAssertsMixin
from paperless_remote.parsers import RemoteDocumentParser
from paperless_remote.signals import get_parser
class TestParser(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
SAMPLE_FILES = Path(__file__).resolve().parent / "samples"
def assertContainsStrings(self, content: str, strings: list[str]):
# Asserts that all strings appear in content, in the given order.
indices = []
for s in strings:
if s in content:
indices.append(content.index(s))
else:
self.fail(f"'{s}' is not in '{content}'")
self.assertListEqual(indices, sorted(indices))
@mock.patch("paperless_tesseract.parsers.run_subprocess")
@mock.patch("azure.ai.documentintelligence.DocumentIntelligenceClient")
def test_get_text_with_azure(self, mock_client_cls, mock_subprocess):
# Arrange mock Azure client
mock_client = mock.Mock()
mock_client_cls.return_value = mock_client
# Simulate poller result and its `.details`
mock_poller = mock.Mock()
mock_poller.wait.return_value = None
mock_poller.details = {"operation_id": "fake-op-id"}
mock_client.begin_analyze_document.return_value = mock_poller
mock_poller.result.return_value.content = "This is a test document."
# Return dummy PDF bytes
mock_client.get_analyze_result_pdf.return_value = [
b"%PDF-",
b"1.7 ",
b"FAKEPDF",
]
# Simulate pdftotext by writing dummy text to sidecar file
def fake_run(cmd, *args, **kwargs):
with Path(cmd[-1]).open("w", encoding="utf-8") as f:
f.write("This is a test document.")
mock_subprocess.side_effect = fake_run
with override_settings(
REMOTE_OCR_ENGINE="azureai",
REMOTE_OCR_API_KEY="somekey",
REMOTE_OCR_ENDPOINT="https://endpoint.cognitiveservices.azure.com",
):
parser = get_parser(uuid.uuid4())
parser.parse(
self.SAMPLE_FILES / "simple-digital.pdf",
"application/pdf",
)
self.assertContainsStrings(
parser.text.strip(),
["This is a test document."],
)
@override_settings(
REMOTE_OCR_ENGINE="azureai",
REMOTE_OCR_API_KEY="key",
REMOTE_OCR_ENDPOINT="https://endpoint.cognitiveservices.azure.com",
)
def test_supported_mime_types_valid_config(self):
parser = RemoteDocumentParser(uuid.uuid4())
expected_types = {
"application/pdf": ".pdf",
"image/png": ".png",
"image/jpeg": ".jpg",
"image/tiff": ".tiff",
"image/bmp": ".bmp",
"image/gif": ".gif",
"image/webp": ".webp",
}
self.assertEqual(parser.supported_mime_types(), expected_types)
def test_supported_mime_types_invalid_config(self):
parser = get_parser(uuid.uuid4())
self.assertEqual(parser.supported_mime_types(), {})
@override_settings(
REMOTE_OCR_ENGINE=None,
REMOTE_OCR_API_KEY=None,
REMOTE_OCR_ENDPOINT=None,
)
def test_parse_with_invalid_config(self):
parser = get_parser(uuid.uuid4())
parser.parse(self.SAMPLE_FILES / "simple-digital.pdf", "application/pdf")
self.assertEqual(parser.text, "")

39
uv.lock generated
View File

@@ -95,34 +95,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/02/ff/1175b0b7371e46244032d43a56862d0af455823b5280a50c63d99cc50f18/automat-25.4.16-py3-none-any.whl", hash = "sha256:04e9bce696a8d5671ee698005af6e5a9fa15354140a87f4870744604dcdd3ba1", size = 42842, upload-time = "2025-04-16T20:12:14.447Z" },
]
[[package]]
name = "azure-ai-documentintelligence"
version = "1.0.2"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "azure-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "isodate", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/44/7b/8115cd713e2caa5e44def85f2b7ebd02a74ae74d7113ba20bdd41fd6dd80/azure_ai_documentintelligence-1.0.2.tar.gz", hash = "sha256:4d75a2513f2839365ebabc0e0e1772f5601b3a8c9a71e75da12440da13b63484", size = 170940 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/d9/75/c9ec040f23082f54ffb1977ff8f364c2d21c79a640a13d1c1809e7fd6b1a/azure_ai_documentintelligence-1.0.2-py3-none-any.whl", hash = "sha256:e1fb446abbdeccc9759d897898a0fe13141ed29f9ad11fc705f951925822ed59", size = 106005 },
]
[[package]]
name = "azure-core"
version = "1.33.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "requests", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "six", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/75/aa/7c9db8edd626f1a7d99d09ef7926f6f4fb34d5f9fa00dc394afdfe8e2a80/azure_core-1.33.0.tar.gz", hash = "sha256:f367aa07b5e3005fec2c1e184b882b0b039910733907d001c20fb08ebb8c0eb9", size = 295633 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/07/b7/76b7e144aa53bd206bf1ce34fa75350472c3f69bf30e5c8c18bc9881035d/azure_core-1.33.0-py3-none-any.whl", hash = "sha256:9b5b6d0223a1d38c37500e6971118c1e0f13f54951e6893968b38910bc9cda8f", size = 207071 },
]
[[package]]
name = "babel"
version = "2.17.0"
@@ -1479,15 +1451,6 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/c7/fc/4e5a141c3f7c7bed550ac1f69e599e92b6be449dd4677ec09f325cad0955/inotifyrecursive-0.3.5-py3-none-any.whl", hash = "sha256:7e5f4a2e1dc2bef0efa3b5f6b339c41fb4599055a2b54909d020e9e932cc8d2f", size = 8009, upload-time = "2020-11-20T12:38:46.981Z" },
]
[[package]]
name = "isodate"
version = "0.7.2"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/54/4d/e940025e2ce31a8ce1202635910747e5a87cc3a6a6bb2d00973375014749/isodate-0.7.2.tar.gz", hash = "sha256:4cd1aa0f43ca76f4a6c6c0292a85f40b35ec2e43e315b59f06e6d32171a953e6", size = 29705 }
wheels = [
{ url = "https://files.pythonhosted.org/packages/15/aa/0aca39a37d3c7eb941ba736ede56d689e7be91cab5d9ca846bde3999eba6/isodate-0.7.2-py3-none-any.whl", hash = "sha256:28009937d8031054830160fce6d409ed342816b543597cece116d966c6d99e15", size = 22320 },
]
[[package]]
name = "jinja2"
version = "3.1.6"
@@ -2156,7 +2119,6 @@ name = "paperless-ngx"
version = "2.18.4"
source = { virtual = "." }
dependencies = [
{ name = "azure-ai-documentintelligence", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "babel", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "bleach", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "celery", extra = ["redis"], marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -2293,7 +2255,6 @@ typing = [
[package.metadata]
requires-dist = [
{ name = "azure-ai-documentintelligence", specifier = ">=1.0.2" },
{ name = "babel", specifier = ">=2.17" },
{ name = "bleach", specifier = "~=6.2.0" },
{ name = "celery", extras = ["redis"], specifier = "~=5.5.1" },