Compare commits

22 Commits

Author | SHA1 | Message | Date
shamoon | 496a4035cd | Testing | 2026-01-26 15:31:00 -08:00
shamoon | 761044c0d3 | Oops circular import | 2026-01-26 15:21:13 -08:00
shamoon | 1b7e4cc286 | Add LLM index update queuing and improve error handling | 2026-01-26 15:21:13 -08:00
GitHub Actions | 6997a2ab8b | Auto translate strings | 2026-01-26 20:58:22 +00:00
Jan Kleine | f82f31f383 | Enhancement: improve relative dates in date filter (#11899) | 2026-01-26 12:56:29 -08:00
GitHub Actions | ac76710296 | Auto translate strings | 2026-01-26 20:12:45 +00:00
Antoine Mérino | df07b8a03e | Performance: faster statistics panel on dashboard (#11760) | 2026-01-26 12:10:57 -08:00
GitHub Actions | cac1b721b9 | Auto translate strings | 2026-01-26 18:57:50 +00:00
shamoon | 4428354150 | Feature: allow duplicates with warnings, UI for discovery (#11815) | 2026-01-26 18:55:08 +00:00
GitHub Actions | df1aa13551 | Auto translate strings | 2026-01-26 18:32:50 +00:00
Gabgobie | e9e138e62c | Enhancement: configurable SSO groups claim (#11841); Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com> | 2026-01-26 18:31:01 +00:00
GitHub Actions | cafb0f2022 | Auto translate strings | 2026-01-26 17:51:20 +00:00
shamoon | 1d2e3393ac | Enhancement: support select all for management lists (#11889) | 2026-01-26 09:49:16 -08:00
shamoon | 857aaca493 | Merge branch 'release/v2.20.x' into dev | 2026-01-26 09:25:58 -08:00
shamoon | 891f4a2faf | Fix: correctly extract all ids for nested tags (#11888) | 2026-01-26 09:12:03 -08:00
GitHub Actions | ae816a01b2 | Auto translate strings | 2026-01-26 16:32:52 +00:00
shamoon | b6531aed2f | Tweakhancement: display document id, with copy (#11896) | 2026-01-26 08:30:43 -08:00
GitHub Actions | 991d3cef88 | Auto translate strings | 2026-01-26 08:31:35 +00:00
Paul Gessinger | f2bb6c9725 | Enhancement: Add support for app oidc (#11756); Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com> | 2026-01-26 00:29:36 -08:00
shamoon | 2312314aa7 | Performance: improve treenode inefficiencies (#11606) | 2026-01-25 21:47:08 -08:00
shamoon | 72e8b73108 | Fix test | 2026-01-25 17:08:15 -08:00
shamoon | 5c9ff367e3 | Fixhancement: change date calculation for 'this year' to include future documents (#11884) | 2026-01-25 16:56:51 -08:00
44 changed files with 1530 additions and 681 deletions


@@ -8,7 +8,7 @@ Further documentation is provided here for some endpoints and features.
## Authorization
The REST api provides four different forms of authentication.
The REST api provides five different forms of authentication.
1. Basic authentication
@@ -52,6 +52,14 @@ The REST api provides four different forms of authentication.
[configuration](configuration.md#PAPERLESS_ENABLE_HTTP_REMOTE_USER_API)),
you can authenticate against the API using Remote User auth.
5. Headless OIDC via [`django-allauth`](https://codeberg.org/allauth/django-allauth)
`django-allauth` exposes API endpoints under `api/auth/` which enable tools
like third-party apps to authenticate with social accounts that are
configured. See
[here](advanced_usage.md#openid-connect-and-social-authentication) for more
information on social accounts.
## Searching for documents
Full text searching is available on the `/api/documents/` endpoint. Two


@@ -659,7 +659,7 @@ system. See the corresponding
: Sync groups from the third party authentication system (e.g. OIDC) to Paperless-ngx. When enabled, users will be added or removed from groups based on their group membership in the third party authentication system. Groups must already exist in Paperless-ngx and have the same name as in the third party authentication system. Groups are updated upon logging in via the third party authentication system, see the corresponding [django-allauth documentation](https://docs.allauth.org/en/dev/socialaccount/signals.html).
: In order to pass groups from the authentication system you will need to update your [PAPERLESS_SOCIALACCOUNT_PROVIDERS](#PAPERLESS_SOCIALACCOUNT_PROVIDERS) setting by adding a top-level "SCOPES" setting which includes "groups", e.g.:
: In order to pass groups from the authentication system you will need to update your [PAPERLESS_SOCIALACCOUNT_PROVIDERS](#PAPERLESS_SOCIALACCOUNT_PROVIDERS) setting by adding a top-level "SCOPES" setting which includes "groups", or the custom groups claim configured in [`PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM`](#PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM) e.g.:
```json
{"openid_connect":{"SCOPE": ["openid","profile","email","groups"]...
@@ -667,6 +667,12 @@ system. See the corresponding
Defaults to False
#### [`PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM=<str>`](#PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM) {#PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM}
: Allows you to define a custom groups claim. See [PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS](#PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS) which is required for this setting to take effect.
Defaults to "groups"
#### [`PAPERLESS_SOCIAL_ACCOUNT_DEFAULT_GROUPS=<comma-separated-list>`](#PAPERLESS_SOCIAL_ACCOUNT_DEFAULT_GROUPS) {#PAPERLESS_SOCIAL_ACCOUNT_DEFAULT_GROUPS}
: A list of group names that users who signup via social accounts will be added to upon signup. Groups listed here must already exist.
@@ -1146,8 +1152,9 @@ via the consumption directory, you can disable the consumer to save resources.
#### [`PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool>`](#PAPERLESS_CONSUMER_DELETE_DUPLICATES) {#PAPERLESS_CONSUMER_DELETE_DUPLICATES}
: When the consumer detects a duplicate document, it will not touch
the original document. This default behavior can be changed here.
: As of version 3.0 Paperless-ngx allows duplicate documents to be consumed by default, _except_ when
this setting is enabled. When enabled, Paperless will check if a document with the same hash already
exists in the system and delete the duplicate file from the consumption directory without consuming it.
Defaults to false.

File diff suppressed because it is too large.


@@ -97,6 +97,12 @@
<br/><em>(<ng-container i18n>click for full output</ng-container>)</em>
}
</ng-template>
@if (task.duplicate_documents?.length > 0) {
<div class="small text-warning-emphasis d-flex align-items-center gap-1">
<i-bs class="lh-1" width="1em" height="1em" name="exclamation-triangle"></i-bs>
<span i18n>Duplicate(s) detected</span>
</div>
}
</td>
}
<td class="d-lg-none">


@@ -164,9 +164,11 @@
{{ item.name }}
<span class="ms-auto text-muted small">
@if (item.dateEnd) {
{{ item.date | customDate:'MMM d' }} &ndash; {{ item.dateEnd | customDate:'mediumDate' }}
{{ item.date | customDate:'mediumDate' }} &ndash; {{ item.dateEnd | customDate:'mediumDate' }}
} @else if (item.dateTilNow) {
{{ item.dateTilNow | customDate:'mediumDate' }} &ndash; <ng-container i18n>now</ng-container>
} @else {
{{ item.date | customDate:'mediumDate' }} &ndash; <ng-container i18n>now</ng-container>
{{ item.date | customDate:'mediumDate' }}
}
</span>
</div>


@@ -79,32 +79,34 @@ export class DatesDropdownComponent implements OnInit, OnDestroy {
{
id: RelativeDate.WITHIN_1_WEEK,
name: $localize`Within 1 week`,
date: new Date().setDate(new Date().getDate() - 7),
dateTilNow: new Date().setDate(new Date().getDate() - 7),
},
{
id: RelativeDate.WITHIN_1_MONTH,
name: $localize`Within 1 month`,
date: new Date().setMonth(new Date().getMonth() - 1),
dateTilNow: new Date().setMonth(new Date().getMonth() - 1),
},
{
id: RelativeDate.WITHIN_3_MONTHS,
name: $localize`Within 3 months`,
date: new Date().setMonth(new Date().getMonth() - 3),
dateTilNow: new Date().setMonth(new Date().getMonth() - 3),
},
{
id: RelativeDate.WITHIN_1_YEAR,
name: $localize`Within 1 year`,
date: new Date().setFullYear(new Date().getFullYear() - 1),
dateTilNow: new Date().setFullYear(new Date().getFullYear() - 1),
},
{
id: RelativeDate.THIS_YEAR,
name: $localize`This year`,
date: new Date('1/1/' + new Date().getFullYear()),
dateEnd: new Date('12/31/' + new Date().getFullYear()),
},
{
id: RelativeDate.THIS_MONTH,
name: $localize`This month`,
date: new Date().setDate(1),
dateEnd: new Date(new Date().getFullYear(), new Date().getMonth() + 1, 0),
},
{
id: RelativeDate.TODAY,
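
Reviewer note on the hunk above: the "This year" and "This month" entries now carry an explicit `dateEnd`, so the dropdown can show a closed range instead of "start to now". A minimal sketch of the same range arithmetic follows. The names `DateRange`, `thisYearRange` and `thisMonthRange` are hypothetical and not part of this diff, and the sketch uses the numeric `Date` constructor rather than the string form seen in the hunk.

```typescript
// Hypothetical helpers for illustration only; not code from this change set.
interface DateRange {
  start: Date
  end: Date
}

// Calendar year containing `now`: 1 January through 31 December, local time.
function thisYearRange(now: Date): DateRange {
  const year = now.getFullYear()
  return {
    start: new Date(year, 0, 1), // months are zero-based: 0 = January
    end: new Date(year, 11, 31), // 11 = December
  }
}

// Calendar month containing `now`. Day 0 of the following month resolves
// to the last day of the current month, which handles 28/29/30/31 uniformly.
function thisMonthRange(now: Date): DateRange {
  const year = now.getFullYear()
  const month = now.getMonth()
  return {
    start: new Date(year, month, 1),
    end: new Date(year, month + 1, 0),
  }
}
```

The numeric constructor is used here because parsing of non-ISO strings such as `'12/31/2026'` is left to the implementation by the ECMAScript specification; whether that matters for the browsers this project targets is not something the diff shows.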


@@ -1,9 +1,18 @@
<div class="row pt-3 pb-3 pb-md-2 align-items-center">
<div class="col-md text-truncate">
<h3 class="text-truncate" style="line-height: 1.4">
{{title}}
<h3 class="d-flex align-items-center mb-1" style="line-height: 1.4">
<span class="text-truncate">{{title}}</span>
@if (id) {
<span class="badge bg-primary text-primary-text-contrast ms-3 small fs-normal cursor-pointer" (click)="copyID()">
@if (copied) {
<i-bs width="1em" height="1em" name="clipboard-check"></i-bs>&nbsp;<ng-container i18n>Copied!</ng-container>
} @else {
ID: {{id}}
}
</span>
}
@if (subTitle) {
<span class="h6 mb-0 d-block d-md-inline fw-normal ms-md-3 text-truncate" style="line-height: 1.4">{{subTitle}}</span>
<span class="h6 mb-0 mt-1 d-block d-md-inline fw-normal ms-md-3 text-truncate" style="line-height: 1.4">{{subTitle}}</span>
}
@if (info) {
<button class="btn btn-sm btn-link text-muted me-auto p-0 p-md-2" title="What's this?" i18n-title type="button" [ngbPopover]="infoPopover" [autoClose]="true">


@@ -1,5 +1,10 @@
h3 {
min-height: calc(1.325rem + 0.9vw);
.badge {
font-size: 0.65rem;
line-height: 1;
}
}
@media (min-width: 1200px) {


@@ -1,3 +1,4 @@
import { Clipboard } from '@angular/cdk/clipboard'
import { ComponentFixture, TestBed } from '@angular/core/testing'
import { Title } from '@angular/platform-browser'
import { environment } from 'src/environments/environment'
@@ -7,6 +8,7 @@ describe('PageHeaderComponent', () => {
let component: PageHeaderComponent
let fixture: ComponentFixture<PageHeaderComponent>
let titleService: Title
let clipboard: Clipboard
beforeEach(async () => {
TestBed.configureTestingModule({
@@ -15,6 +17,7 @@ describe('PageHeaderComponent', () => {
}).compileComponents()
titleService = TestBed.inject(Title)
clipboard = TestBed.inject(Clipboard)
fixture = TestBed.createComponent(PageHeaderComponent)
component = fixture.componentInstance
fixture.detectChanges()
@@ -24,7 +27,8 @@ describe('PageHeaderComponent', () => {
component.title = 'Foo'
component.subTitle = 'Bar'
fixture.detectChanges()
expect(fixture.nativeElement.textContent).toContain('Foo Bar')
expect(fixture.nativeElement.textContent).toContain('Foo')
expect(fixture.nativeElement.textContent).toContain('Bar')
})
it('should set html title', () => {
@@ -32,4 +36,16 @@ describe('PageHeaderComponent', () => {
component.title = 'Foo Bar'
expect(titleSpy).toHaveBeenCalledWith(`Foo Bar - ${environment.appTitle}`)
})
it('should copy id to clipboard, reset after 3 seconds', () => {
jest.useFakeTimers()
component.id = 42 as any
jest.spyOn(clipboard, 'copy').mockReturnValue(true)
component.copyID()
expect(clipboard.copy).toHaveBeenCalledWith('42')
expect(component.copied).toBe(true)
jest.advanceTimersByTime(3000)
expect(component.copied).toBe(false)
})
})


@@ -1,3 +1,4 @@
import { Clipboard } from '@angular/cdk/clipboard'
import { Component, Input, inject } from '@angular/core'
import { Title } from '@angular/platform-browser'
import { NgbPopoverModule } from '@ng-bootstrap/ng-bootstrap'
@@ -13,8 +14,11 @@ import { environment } from 'src/environments/environment'
})
export class PageHeaderComponent {
private titleService = inject(Title)
private clipboard = inject(Clipboard)
_title = ''
private _title = ''
public copied: boolean = false
private copyTimeout: any
@Input()
set title(title: string) {
@@ -26,6 +30,9 @@ export class PageHeaderComponent {
return this._title
}
@Input()
id: number
@Input()
subTitle: string = ''
@@ -34,4 +41,12 @@ export class PageHeaderComponent {
@Input()
infoLink: string
public copyID() {
this.copied = this.clipboard.copy(this.id.toString())
clearTimeout(this.copyTimeout)
this.copyTimeout = setTimeout(() => {
this.copied = false
}, 3000)
}
}
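
Reviewer note on `copyID()` above: the method sets a `copied` flag from the clipboard result and schedules a reset after 3 seconds, clearing any earlier timer so that repeated clicks do not reset the flag early. A framework-free sketch of that pattern is below. `CopyFeedback` and its injected `copy` callback are hypothetical stand-ins (the component itself uses the Angular CDK `Clipboard` service), and `dispose()` is an addition of this sketch, not something the diff contains.

```typescript
// Hypothetical, framework-free illustration of the "flag with timed reset" pattern.
class CopyFeedback {
  copied = false
  private timer: ReturnType<typeof setTimeout> | undefined
  private readonly copy: (text: string) => boolean
  private readonly resetAfterMs: number

  // `copy` stands in for a clipboard call that reports success as a boolean.
  constructor(copy: (text: string) => boolean, resetAfterMs = 3000) {
    this.copy = copy
    this.resetAfterMs = resetAfterMs
  }

  trigger(text: string): void {
    this.copied = this.copy(text)
    // Cancel a pending reset so a second click restarts the full delay.
    if (this.timer !== undefined) clearTimeout(this.timer)
    this.timer = setTimeout(() => {
      this.copied = false
    }, this.resetAfterMs)
  }

  // Cancels the pending reset, e.g. when the owning view is destroyed.
  dispose(): void {
    if (this.timer !== undefined) clearTimeout(this.timer)
    this.timer = undefined
  }
}
```

Whether the real component needs an equivalent of `dispose()` depends on its lifecycle handling, which is outside the hunks shown here.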


@@ -1,4 +1,4 @@
<pngx-page-header [(title)]="title">
<pngx-page-header [(title)]="title" [id]="documentId">
@if (archiveContentRenderType === ContentRenderType.PDF && !useNativePdfViewer) {
@if (previewNumPages) {
<div class="input-group input-group-sm d-none d-md-flex">
@@ -370,6 +370,37 @@
</ng-template>
</li>
}
@if (document?.duplicate_documents?.length) {
<li [ngbNavItem]="DocumentDetailNavIDs.Duplicates">
<a class="text-nowrap" ngbNavLink i18n>
Duplicates
<span class="badge text-bg-secondary ms-1">{{ document.duplicate_documents.length }}</span>
</a>
<ng-template ngbNavContent>
<div class="d-flex flex-column gap-2">
<div class="fst-italic" i18n>Duplicate documents detected:</div>
<div class="list-group">
@for (duplicate of document.duplicate_documents; track duplicate.id) {
<a
class="list-group-item list-group-item-action d-flex justify-content-between align-items-center"
[routerLink]="['/documents', duplicate.id, 'details']"
[class.disabled]="duplicate.deleted_at"
>
<span class="d-flex align-items-center gap-2">
<span>{{ duplicate.title || ('#' + duplicate.id) }}</span>
@if (duplicate.deleted_at) {
<span class="badge text-bg-secondary" i18n>In trash</span>
}
</span>
<span class="text-secondary">#{{ duplicate.id }}</span>
</a>
}
</div>
</div>
</ng-template>
</li>
}
</ul>
<div [ngbNavOutlet]="nav" class="mt-3"></div>


@@ -301,16 +301,16 @@ describe('DocumentDetailComponent', () => {
.spyOn(openDocumentsService, 'openDocument')
.mockReturnValueOnce(of(true))
fixture.detectChanges()
expect(component.activeNavID).toEqual(5) // DocumentDetailNavIDs.Notes
expect(component.activeNavID).toEqual(component.DocumentDetailNavIDs.Notes)
})
it('should change url on tab switch', () => {
initNormally()
const navigateSpy = jest.spyOn(router, 'navigate')
component.nav.select(5)
component.nav.select(component.DocumentDetailNavIDs.Notes)
component.nav.navChange.next({
activeId: 1,
nextId: 5,
nextId: component.DocumentDetailNavIDs.Notes,
preventDefault: () => {},
})
fixture.detectChanges()
@@ -352,6 +352,18 @@ describe('DocumentDetailComponent', () => {
expect(component.document).toEqual(doc)
})
it('should fall back to details tab when duplicates tab is active but no duplicates', () => {
initNormally()
component.activeNavID = component.DocumentDetailNavIDs.Duplicates
const noDupDoc = { ...doc, duplicate_documents: [] }
component.updateComponent(noDupDoc)
expect(component.activeNavID).toEqual(
component.DocumentDetailNavIDs.Details
)
})
it('should load already-opened document via param', () => {
initNormally()
jest.spyOn(documentService, 'get').mockReturnValueOnce(of(doc))
@@ -367,6 +379,38 @@ describe('DocumentDetailComponent', () => {
expect(component.document).toEqual(doc)
})
it('should update cached open document duplicates when reloading an open doc', () => {
const openDoc = { ...doc, duplicate_documents: [{ id: 1, title: 'Old' }] }
const updatedDuplicates = [
{ id: 2, title: 'Newer duplicate', deleted_at: null },
]
jest
.spyOn(activatedRoute, 'paramMap', 'get')
.mockReturnValue(of(convertToParamMap({ id: 3, section: 'details' })))
jest.spyOn(documentService, 'get').mockReturnValue(
of({
...doc,
modified: new Date('2024-01-02T00:00:00Z'),
duplicate_documents: updatedDuplicates,
})
)
jest.spyOn(openDocumentsService, 'getOpenDocument').mockReturnValue(openDoc)
const saveSpy = jest.spyOn(openDocumentsService, 'save')
jest.spyOn(openDocumentsService, 'openDocument').mockReturnValue(of(true))
jest.spyOn(customFieldsService, 'listAll').mockReturnValue(
of({
count: customFields.length,
all: customFields.map((f) => f.id),
results: customFields,
})
)
fixture.detectChanges()
expect(openDoc.duplicate_documents).toEqual(updatedDuplicates)
expect(saveSpy).toHaveBeenCalled()
})
it('should disable form if user cannot edit', () => {
currentUserHasObjectPermissions = false
initNormally()


@@ -8,7 +8,7 @@ import {
FormsModule,
ReactiveFormsModule,
} from '@angular/forms'
import { ActivatedRoute, Router } from '@angular/router'
import { ActivatedRoute, Router, RouterModule } from '@angular/router'
import {
NgbDateStruct,
NgbDropdownModule,
@@ -124,6 +124,7 @@ enum DocumentDetailNavIDs {
Notes = 5,
Permissions = 6,
History = 7,
Duplicates = 8,
}
enum ContentRenderType {
@@ -181,6 +182,7 @@ export enum ZoomSetting {
NgxBootstrapIconsModule,
PdfViewerModule,
TextAreaComponent,
RouterModule,
],
})
export class DocumentDetailComponent
@@ -454,6 +456,11 @@ export class DocumentDetailComponent
const openDocument = this.openDocumentService.getOpenDocument(
this.documentId
)
// update duplicate documents if present
if (openDocument && doc?.duplicate_documents) {
openDocument.duplicate_documents = doc.duplicate_documents
this.openDocumentService.save()
}
const useDoc = openDocument || doc
if (openDocument) {
if (
@@ -704,6 +711,13 @@ export class DocumentDetailComponent
}
this.title = this.documentTitlePipe.transform(doc.title)
this.prepareForm(doc)
if (
this.activeNavID === DocumentDetailNavIDs.Duplicates &&
!doc?.duplicate_documents?.length
) {
this.activeNavID = DocumentDetailNavIDs.Details
}
}
get customFieldFormFields(): FormArray {
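
Reviewer note on the last hunk above: when the Duplicates tab is active but the reloaded document has no duplicates, the component falls back to the Details tab. The guard can be sketched as a pure function. `NAV`, `DocLike` and `resolveActiveTab` are hypothetical names, and the value `1` for Details is an assumption (only `Notes = 5` through `Duplicates = 8` are visible in this diff).

```typescript
// Hypothetical stand-in for the component's nav ID enum.
// Duplicates = 8 comes from the diff; Details = 1 is assumed for illustration.
const NAV = { Details: 1, Duplicates: 8 } as const

interface DocLike {
  duplicate_documents?: unknown[]
}

// Returns the tab that should be active after `doc` has been (re)loaded.
function resolveActiveTab(active: number, doc: DocLike | undefined): number {
  const hasDuplicates = (doc?.duplicate_documents?.length ?? 0) > 0
  if (active === NAV.Duplicates && !hasDuplicates) {
    // The Duplicates tab is not rendered without duplicates, so leave it.
    return NAV.Details
  }
  return active
}
```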


@@ -14,6 +14,7 @@ import { SortableDirective } from 'src/app/directives/sortable.directive'
import { CustomDatePipe } from 'src/app/pipes/custom-date.pipe'
import { PermissionType } from 'src/app/services/permissions.service'
import { CorrespondentService } from 'src/app/services/rest/correspondent.service'
import { ClearableBadgeComponent } from '../../common/clearable-badge/clearable-badge.component'
import { CorrespondentEditDialogComponent } from '../../common/edit-dialog/correspondent-edit-dialog/correspondent-edit-dialog.component'
import { PageHeaderComponent } from '../../common/page-header/page-header.component'
import { ManagementListComponent } from '../management-list/management-list.component'
@@ -36,6 +37,7 @@ import { ManagementListComponent } from '../management-list/management-list.comp
NgbDropdownModule,
NgbPaginationModule,
NgxBootstrapIconsModule,
ClearableBadgeComponent,
],
})
export class CorrespondentListComponent extends ManagementListComponent<Correspondent> {


@@ -13,6 +13,7 @@ import { IfPermissionsDirective } from 'src/app/directives/if-permissions.direct
import { SortableDirective } from 'src/app/directives/sortable.directive'
import { PermissionType } from 'src/app/services/permissions.service'
import { DocumentTypeService } from 'src/app/services/rest/document-type.service'
import { ClearableBadgeComponent } from '../../common/clearable-badge/clearable-badge.component'
import { DocumentTypeEditDialogComponent } from '../../common/edit-dialog/document-type-edit-dialog/document-type-edit-dialog.component'
import { PageHeaderComponent } from '../../common/page-header/page-header.component'
import { ManagementListComponent } from '../management-list/management-list.component'
@@ -34,6 +35,7 @@ import { ManagementListComponent } from '../management-list/management-list.comp
NgbDropdownModule,
NgbPaginationModule,
NgxBootstrapIconsModule,
ClearableBadgeComponent,
],
})
export class DocumentTypeListComponent extends ManagementListComponent<DocumentType> {


@@ -1,8 +1,39 @@
<pngx-page-header title="{{ typeNamePlural | titlecase }}" info="View, add, edit and delete {{ typeNamePlural }}." infoLink="usage/#terms-and-definitions">
<button class="btn btn-sm btn-outline-secondary" (click)="clearSelection()" [hidden]="selectedObjects.size === 0">
<i-bs name="x"></i-bs>&nbsp;<ng-container i18n>Clear selection</ng-container>
<div ngbDropdown class="btn-group flex-fill d-sm-none">
<button class="btn btn-sm btn-outline-primary" id="dropdownSelectMobile" ngbDropdownToggle>
<i-bs name="text-indent-left"></i-bs>
<div class="d-none d-sm-inline">&nbsp;<ng-container i18n>Select</ng-container></div>
@if (selectedObjects.size > 0) {
<pngx-clearable-badge [selected]="selectedObjects.size > 0" [number]="selectedObjects.size" (cleared)="selectNone()"></pngx-clearable-badge><span class="visually-hidden">selected</span>
}
</button>
<div ngbDropdownMenu aria-labelledby="dropdownSelectMobile" class="shadow">
<button ngbDropdownItem (click)="selectNone()" i18n>Select none</button>
<button ngbDropdownItem (click)="selectPage(true)" i18n>Select page</button>
<button ngbDropdownItem (click)="selectAll()" i18n>Select all</button>
</div>
</div>
<div class="d-none d-sm-flex flex-fill me-3">
<div class="input-group input-group-sm">
<span class="input-group-text border-0" i18n>Select:</span>
</div>
<div class="btn-group btn-group-sm flex-nowrap">
@if (selectedObjects.size > 0) {
<button class="btn btn-sm btn-outline-secondary" (click)="selectNone()">
<i-bs name="slash-circle"></i-bs>&nbsp;<ng-container i18n>None</ng-container>
</button>
}
<button class="btn btn-sm btn-outline-primary" (click)="selectPage(true)">
<i-bs name="file-earmark-check"></i-bs>&nbsp;<ng-container i18n>Page</ng-container>
</button>
<button class="btn btn-sm btn-outline-primary" (click)="selectAll()">
<i-bs name="check-all"></i-bs>&nbsp;<ng-container i18n>All</ng-container>
</button>
</div>
</div>
<button type="button" class="btn btn-sm btn-outline-primary" (click)="setPermissions()" [disabled]="!userCanBulkEdit(PermissionAction.Change) || selectedObjects.size === 0">
<i-bs name="person-fill-lock"></i-bs>&nbsp;<ng-container i18n>Permissions</ng-container>
</button>
@@ -31,7 +62,7 @@
<tr>
<th scope="col">
<div class="form-check m-0 ms-2 me-n2">
<input type="checkbox" class="form-check-input" id="all-objects" [(ngModel)]="togggleAll" [disabled]="data.length === 0" (click)="toggleAll($event); $event.stopPropagation();">
<input type="checkbox" class="form-check-input" id="all-objects" [(ngModel)]="togggleAll" [disabled]="data.length === 0" (change)="selectPage($event.target.checked); $event.stopPropagation();">
<label class="form-check-label" for="all-objects"></label>
</div>
</th>


@@ -163,8 +163,7 @@ describe('ManagementListComponent', () => {
const toastInfoSpy = jest.spyOn(toastService, 'showInfo')
const reloadSpy = jest.spyOn(component, 'reloadData')
const createButton = fixture.debugElement.queryAll(By.css('button'))[4]
createButton.triggerEventHandler('click')
component.openCreateDialog()
expect(modal).not.toBeUndefined()
const editDialog = modal.componentInstance as EditDialogComponent<Tag>
@@ -187,8 +186,7 @@ describe('ManagementListComponent', () => {
const toastInfoSpy = jest.spyOn(toastService, 'showInfo')
const reloadSpy = jest.spyOn(component, 'reloadData')
const editButton = fixture.debugElement.queryAll(By.css('button'))[7]
editButton.triggerEventHandler('click')
component.openEditDialog(tags[0])
expect(modal).not.toBeUndefined()
const editDialog = modal.componentInstance as EditDialogComponent<Tag>
@@ -212,8 +210,7 @@ describe('ManagementListComponent', () => {
const deleteSpy = jest.spyOn(tagService, 'delete')
const reloadSpy = jest.spyOn(component, 'reloadData')
const deleteButton = fixture.debugElement.queryAll(By.css('button'))[8]
deleteButton.triggerEventHandler('click')
component.openDeleteDialog(tags[0])
expect(modal).not.toBeUndefined()
const editDialog = modal.componentInstance as ConfirmDialogComponent
@@ -230,6 +227,21 @@ describe('ManagementListComponent', () => {
expect(reloadSpy).toHaveBeenCalled()
})
it('should use the all list length for collection size when provided', fakeAsync(() => {
jest.spyOn(tagService, 'listFiltered').mockReturnValueOnce(
of({
count: 1,
all: [1, 2, 3],
results: tags.slice(0, 1),
})
)
component.reloadData()
tick(100)
expect(component.collectionSize).toBe(3)
}))
it('should support quick filter for objects', () => {
const expectedUrl = documentListViewService.getQuickFilterUrl([
{ rule_type: FILTER_HAS_TAGS_ALL, value: tags[0].id.toString() },
@@ -264,19 +276,84 @@ describe('ManagementListComponent', () => {
expect(component.page).toEqual(1)
})
it('should support toggle all items in view', () => {
it('should support toggle select page in vew', () => {
expect(component.selectedObjects.size).toEqual(0)
const toggleAllSpy = jest.spyOn(component, 'toggleAll')
const selectPageSpy = jest.spyOn(component, 'selectPage')
const checkButton = fixture.debugElement.queryAll(
By.css('input.form-check-input')
)[0]
checkButton.nativeElement.dispatchEvent(new Event('click'))
checkButton.nativeElement.dispatchEvent(new Event('change'))
checkButton.nativeElement.checked = true
checkButton.nativeElement.dispatchEvent(new Event('click'))
expect(toggleAllSpy).toHaveBeenCalled()
checkButton.nativeElement.dispatchEvent(new Event('change'))
expect(selectPageSpy).toHaveBeenCalled()
expect(component.selectedObjects.size).toEqual(tags.length)
})
it('selectNone should clear selection and reset toggle flag', () => {
component.selectedObjects = new Set([tags[0].id, tags[1].id])
component.togggleAll = true
component.selectNone()
expect(component.selectedObjects.size).toBe(0)
expect(component.togggleAll).toBe(false)
})
it('selectPage should select current page items or clear selection', () => {
component.selectPage(true)
expect(component.selectedObjects).toEqual(new Set(tags.map((t) => t.id)))
expect(component.togggleAll).toBe(true)
component.togggleAll = true
component.selectPage(false)
expect(component.selectedObjects.size).toBe(0)
expect(component.togggleAll).toBe(false)
})
it('selectAll should use all IDs when collection size exists', () => {
;(component as any).allIDs = [1, 2, 3, 4]
component.collectionSize = 4
component.selectAll()
expect(component.selectedObjects).toEqual(new Set([1, 2, 3, 4]))
expect(component.togggleAll).toBe(true)
})
it('selectAll should clear selection when collection size is zero', () => {
component.selectedObjects = new Set([1])
component.collectionSize = 0
component.togggleAll = true
component.selectAll()
expect(component.selectedObjects.size).toBe(0)
expect(component.togggleAll).toBe(false)
})
it('toggleSelected should toggle object selection and update toggle state', () => {
component.toggleSelected(tags[0])
expect(component.selectedObjects.has(tags[0].id)).toBe(true)
expect(component.togggleAll).toBe(false)
component.toggleSelected(tags[1])
component.toggleSelected(tags[2])
expect(component.togggleAll).toBe(true)
component.toggleSelected(tags[1])
expect(component.selectedObjects.has(tags[1].id)).toBe(false)
expect(component.togggleAll).toBe(false)
})
it('areAllPageItemsSelected should return false when page has no selectable items', () => {
component.data = []
component.selectedObjects.clear()
expect((component as any).areAllPageItemsSelected()).toBe(false)
component.data = tags
})
it('should support bulk edit permissions', () => {
const bulkEditPermsSpy = jest.spyOn(tagService, 'bulk_edit_objects')
component.toggleSelected(tags[0])


@@ -84,6 +84,7 @@ export abstract class ManagementListComponent<T extends MatchingModel>
public data: T[] = []
private unfilteredData: T[] = []
private allIDs: number[] = []
public page = 1
@@ -171,7 +172,8 @@ export abstract class ManagementListComponent<T extends MatchingModel>
tap((c) => {
this.unfilteredData = c.results
this.data = this.filterData(c.results)
this.collectionSize = c.count
this.collectionSize = c.all?.length ?? c.count
this.allIDs = c.all
}),
delay(100)
)
@@ -300,16 +302,6 @@ export abstract class ManagementListComponent<T extends MatchingModel>
return ownsAll
}
toggleAll(event: PointerEvent) {
const checked = (event.target as HTMLInputElement).checked
this.togggleAll = checked
if (checked) {
this.selectedObjects = new Set(this.getSelectableIDs(this.data))
} else {
this.clearSelection()
}
}
protected getSelectableIDs(objects: T[]): number[] {
return objects.map((o) => o.id)
}
@@ -319,10 +311,38 @@ export abstract class ManagementListComponent<T extends MatchingModel>
this.selectedObjects.clear()
}
selectNone() {
this.clearSelection()
}
selectPage(select: boolean) {
if (select) {
this.selectedObjects = new Set(this.getSelectableIDs(this.data))
this.togggleAll = this.areAllPageItemsSelected()
} else {
this.clearSelection()
}
}
selectAll() {
if (!this.collectionSize) {
this.clearSelection()
return
}
this.selectedObjects = new Set(this.allIDs)
this.togggleAll = this.areAllPageItemsSelected()
}
toggleSelected(object) {
this.selectedObjects.has(object.id)
? this.selectedObjects.delete(object.id)
: this.selectedObjects.add(object.id)
this.togggleAll = this.areAllPageItemsSelected()
}
protected areAllPageItemsSelected(): boolean {
const ids = this.getSelectableIDs(this.data)
return ids.length > 0 && ids.every((id) => this.selectedObjects.has(id))
}
setPermissions() {
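
Reviewer note on the selection changes above: the header checkbox state is now derived from whether every item on the current page is selected, and the collection size prefers the length of the `all` ID list over `count`. The core rules can be sketched as pure functions. `ListResult` and the free-function form are hypothetical; in the diff this logic lives in methods on `ManagementListComponent`.

```typescript
// Hypothetical result shape: `all` holds every matching ID, not just the current page.
interface ListResult {
  count: number
  all?: number[]
}

// Prefer the full ID list when the API returns it; otherwise fall back to `count`.
function collectionSize(result: ListResult): number {
  return result.all?.length ?? result.count
}

// True only when the page has items and every one of them is selected.
// An empty page reports false, so the header checkbox never shows as checked.
function areAllPageItemsSelected(
  pageIds: number[],
  selected: ReadonlySet<number>
): boolean {
  return pageIds.length > 0 && pageIds.every((id) => selected.has(id))
}

// Toggles one ID in place and returns the new "all on page selected" state.
function toggleSelected(
  id: number,
  pageIds: number[],
  selected: Set<number>
): boolean {
  if (selected.has(id)) {
    selected.delete(id)
  } else {
    selected.add(id)
  }
  return areAllPageItemsSelected(pageIds, selected)
}
```

Deriving the checkbox state from the selection, rather than storing it separately, keeps "select all" (which may include IDs from other pages) and per-row toggles consistent with each other.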


@@ -13,6 +13,7 @@ import { IfPermissionsDirective } from 'src/app/directives/if-permissions.direct
import { SortableDirective } from 'src/app/directives/sortable.directive'
import { PermissionType } from 'src/app/services/permissions.service'
import { StoragePathService } from 'src/app/services/rest/storage-path.service'
import { ClearableBadgeComponent } from '../../common/clearable-badge/clearable-badge.component'
import { StoragePathEditDialogComponent } from '../../common/edit-dialog/storage-path-edit-dialog/storage-path-edit-dialog.component'
import { PageHeaderComponent } from '../../common/page-header/page-header.component'
import { ManagementListComponent } from '../management-list/management-list.component'
@@ -34,6 +35,7 @@ import { ManagementListComponent } from '../management-list/management-list.comp
NgbDropdownModule,
NgbPaginationModule,
NgxBootstrapIconsModule,
ClearableBadgeComponent,
],
})
export class StoragePathListComponent extends ManagementListComponent<StoragePath> {


@@ -138,16 +138,12 @@ describe('TagListComponent', () => {
}
component.data = [parent as any]
const selectEvent = { target: { checked: true } } as unknown as PointerEvent
component.toggleAll(selectEvent)
component.selectPage(true)
expect(component.selectedObjects.has(10)).toBe(true)
expect(component.selectedObjects.has(11)).toBe(true)
const deselectEvent = {
target: { checked: false },
} as unknown as PointerEvent
component.toggleAll(deselectEvent)
component.selectPage(false)
expect(component.selectedObjects.size).toBe(0)
})
})


@@ -13,6 +13,7 @@ import { IfPermissionsDirective } from 'src/app/directives/if-permissions.direct
import { SortableDirective } from 'src/app/directives/sortable.directive'
import { PermissionType } from 'src/app/services/permissions.service'
import { TagService } from 'src/app/services/rest/tag.service'
import { ClearableBadgeComponent } from '../../common/clearable-badge/clearable-badge.component'
import { TagEditDialogComponent } from '../../common/edit-dialog/tag-edit-dialog/tag-edit-dialog.component'
import { PageHeaderComponent } from '../../common/page-header/page-header.component'
import { ManagementListComponent } from '../management-list/management-list.component'
@@ -34,6 +35,7 @@ import { ManagementListComponent } from '../management-list/management-list.comp
NgbDropdownModule,
NgbPaginationModule,
NgxBootstrapIconsModule,
ClearableBadgeComponent,
],
})
export class TagListComponent extends ManagementListComponent<Tag> {


@@ -159,6 +159,8 @@ export interface Document extends ObjectWithPermissions {
page_count?: number
duplicate_documents?: Document[]
// Frontend only
__changedFields?: string[]
}


@@ -1,3 +1,4 @@
import { Document } from './document'
import { ObjectWithId } from './object-with-id'
export enum PaperlessTaskType {
@@ -42,5 +43,7 @@ export interface PaperlessTask extends ObjectWithId {
related_document?: number
duplicate_documents?: Document[]
owner?: number
}


@@ -779,18 +779,44 @@ class ConsumerPreflightPlugin(
Q(checksum=checksum) | Q(archive_checksum=checksum),
)
if existing_doc.exists():
msg = ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS
log_msg = f"Not consuming {self.filename}: It is a duplicate of {existing_doc.get().title} (#{existing_doc.get().pk})."
existing_doc = existing_doc.order_by("-created")
duplicates_in_trash = existing_doc.filter(deleted_at__isnull=False)
log_msg = (
f"Consuming duplicate {self.filename}: "
f"{existing_doc.count()} existing document(s) share the same content."
)
if existing_doc.first().deleted_at is not None:
msg = ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS_IN_TRASH
log_msg += " Note: existing document is in the trash."
if duplicates_in_trash.exists():
log_msg += " Note: at least one existing document is in the trash."
self.log.warning(log_msg)
if settings.CONSUMER_DELETE_DUPLICATES:
duplicate = existing_doc.first()
duplicate_label = (
duplicate.title
or duplicate.original_filename
or (Path(duplicate.filename).name if duplicate.filename else None)
or str(duplicate.pk)
)
Path(self.input_doc.original_file).unlink()
failure_msg = (
f"Not consuming {self.filename}: "
f"It is a duplicate of {duplicate_label} (#{duplicate.pk})"
)
status_msg = ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS
if duplicates_in_trash.exists():
status_msg = (
ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS_IN_TRASH
)
failure_msg += " Note: existing document is in the trash."
self._fail(
msg,
log_msg,
status_msg,
failure_msg,
)
def pre_check_directories(self):
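
The `duplicate_label` fallback in the hunk above picks the first non-empty identifier for the existing document. A minimal standalone sketch of that `or` chain, assuming plain values instead of a `Document` instance (the function name and arguments are illustrative, not part of the commit):

```python
from pathlib import Path


def duplicate_label(title, original_filename, filename, pk):
    """Return the first truthy identifier, mirroring the `or` chain in the hunk."""
    return (
        title
        or original_filename
        # Only the final path component of the stored filename is used.
        or (Path(filename).name if filename else None)
        # The primary key is the last resort, so a label always exists.
        or str(pk)
    )


print(duplicate_label("Invoice", "scan.pdf", "2026/scan.pdf", 7))
print(duplicate_label("", None, "2026/scan.pdf", 7))
print(duplicate_label("", None, None, 7))
```

Because empty strings are falsy, a document with a blank title falls through to the next candidate rather than producing an empty label in the failure message.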

View File

@@ -602,7 +602,7 @@ def rewrite_natural_date_keywords(query_string: str) -> str:
case "this year":
start = datetime(local_now.year, 1, 1, 0, 0, 0, tzinfo=tz)
end = datetime.combine(today, time.max, tzinfo=tz)
end = datetime(local_now.year, 12, 31, 23, 59, 59, tzinfo=tz)
case "previous week":
days_since_monday = local_now.weekday()
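
The "this year" change above moves the end of the range from today to December 31. A small sketch of the new bounds, assuming a timezone-aware `local_now` (the helper name `this_year_range` is hypothetical):

```python
from datetime import datetime, timezone


def this_year_range(local_now):
    """Return (start, end) covering the whole calendar year of local_now."""
    tz = local_now.tzinfo
    start = datetime(local_now.year, 1, 1, 0, 0, 0, tzinfo=tz)
    # The end is now the last second of the year, not the end of today.
    end = datetime(local_now.year, 12, 31, 23, 59, 59, tzinfo=tz)
    return start, end


now = datetime(2025, 7, 15, 12, 0, 0, tzinfo=timezone.utc)
start, end = this_year_range(now)
print(start.strftime("%Y%m%d"), end.strftime("%Y%m%d"))
```

For the mid-July date used here, the formatted bounds are `20250101` and `20251231`, matching the updated expectation in the test hunk for `added:this year`.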

View File

@@ -0,0 +1,23 @@
# Generated by Django 5.2.7 on 2026-01-14 17:45
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0005_workflowtrigger_filter_has_any_correspondents_and_more"),
]
operations = [
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
max_length=32,
verbose_name="checksum",
help_text="The checksum of the original document.",
),
),
]

View File

@@ -0,0 +1,25 @@
# Generated by Django 5.2.6 on 2026-01-24 07:33
import django.db.models.functions.text
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0006_alter_document_checksum_unique"),
]
operations = [
migrations.AddField(
model_name="document",
name="content_length",
field=models.GeneratedField(
db_persist=True,
expression=django.db.models.functions.text.Length("content"),
null=False,
help_text="Length of the content field in characters. Automatically maintained by the database for faster statistics computation.",
output_field=models.PositiveIntegerField(default=0),
),
),
]

View File

@@ -20,7 +20,9 @@ if settings.AUDIT_LOG_ENABLED:
from auditlog.registry import auditlog
from django.db.models import Case
from django.db.models import PositiveIntegerField
from django.db.models.functions import Cast
from django.db.models.functions import Length
from django.db.models.functions import Substr
from django_softdelete.models import SoftDeleteModel
@@ -192,6 +194,15 @@ class Document(SoftDeleteModel, ModelWithOwner):
),
)
content_length = models.GeneratedField(
expression=Length("content"),
output_field=PositiveIntegerField(default=0),
db_persist=True,
null=False,
serialize=False,
help_text="Length of the content field in characters. Automatically maintained by the database for faster statistics computation.",
)
mime_type = models.CharField(_("mime type"), max_length=256, editable=False)
tags = models.ManyToManyField(
@@ -205,7 +216,6 @@ class Document(SoftDeleteModel, ModelWithOwner):
_("checksum"),
max_length=32,
editable=False,
unique=True,
help_text=_("The checksum of the original document."),
)
@@ -946,7 +956,7 @@ if settings.AUDIT_LOG_ENABLED:
auditlog.register(
Document,
m2m_fields={"tags"},
exclude_fields=["modified"],
exclude_fields=["content_length", "modified"],
)
auditlog.register(Correspondent)
auditlog.register(Tag)

View File

@@ -148,13 +148,29 @@ def get_document_count_filter_for_user(user):
)
def get_objects_for_user_owner_aware(user, perms, Model) -> QuerySet:
objects_owned = Model.objects.filter(owner=user)
objects_unowned = Model.objects.filter(owner__isnull=True)
def get_objects_for_user_owner_aware(
user,
perms,
Model,
*,
include_deleted=False,
) -> QuerySet:
"""
Returns objects the user owns, are unowned, or has explicit perms.
When include_deleted is True, soft-deleted items are also included.
"""
manager = (
Model.global_objects
if include_deleted and hasattr(Model, "global_objects")
else Model.objects
)
objects_owned = manager.filter(owner=user)
objects_unowned = manager.filter(owner__isnull=True)
objects_with_perms = get_objects_for_user(
user=user,
perms=perms,
klass=Model,
klass=manager.all(),
accept_global_perms=False,
)
return objects_owned | objects_unowned | objects_with_perms
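
The manager selection above only switches to `global_objects` when the caller asks for deleted rows and the model actually provides that manager. A framework-free sketch of the same guard, using stand-in classes (both classes and `pick_manager` are assumptions for illustration, not Django or django-softdelete APIs):

```python
class SoftDeleteLike:
    """Stand-in for a model exposing both managers."""

    objects = "live rows only"
    global_objects = "live and soft-deleted rows"


class PlainLike:
    """Stand-in for a model without soft-delete support."""

    objects = "live rows only"


def pick_manager(model, *, include_deleted=False):
    # Fall back to the default manager when the model has no global manager.
    if include_deleted and hasattr(model, "global_objects"):
        return model.global_objects
    return model.objects


print(pick_manager(SoftDeleteLike, include_deleted=True))
print(pick_manager(PlainLike, include_deleted=True))
```

The `hasattr` check keeps the helper usable for models such as tags that are passed through the same function without a soft-delete manager.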

View File

@@ -23,6 +23,7 @@ from django.core.validators import MinValueValidator
from django.core.validators import RegexValidator
from django.core.validators import integer_validator
from django.db.models import Count
from django.db.models import Q
from django.db.models.functions import Lower
from django.utils.crypto import get_random_string
from django.utils.dateparse import parse_datetime
@@ -72,6 +73,7 @@ from documents.models import WorkflowTrigger
from documents.parsers import is_mime_type_supported
from documents.permissions import get_document_count_filter_for_user
from documents.permissions import get_groups_with_only_permission
from documents.permissions import get_objects_for_user_owner_aware
from documents.permissions import set_permissions_for_object
from documents.regex import validate_regex_pattern
from documents.templating.filepath import validate_filepath_template_and_render
@@ -82,6 +84,9 @@ from documents.validators import url_validator
if TYPE_CHECKING:
from collections.abc import Iterable
from django.db.models.query import QuerySet
logger = logging.getLogger("paperless.serializers")
@@ -1014,6 +1019,32 @@ class NotesSerializer(serializers.ModelSerializer):
return ret
def _get_viewable_duplicates(
document: Document,
user: User | None,
) -> QuerySet[Document]:
checksums = {document.checksum}
if document.archive_checksum:
checksums.add(document.archive_checksum)
duplicates = Document.global_objects.filter(
Q(checksum__in=checksums) | Q(archive_checksum__in=checksums),
).exclude(pk=document.pk)
duplicates = duplicates.order_by("-created")
allowed = get_objects_for_user_owner_aware(
user,
"documents.view_document",
Document,
include_deleted=True,
)
return duplicates.filter(id__in=allowed)
class DuplicateDocumentSummarySerializer(serializers.Serializer):
id = serializers.IntegerField()
title = serializers.CharField()
deleted_at = serializers.DateTimeField(allow_null=True)
@extend_schema_serializer(
deprecate_fields=["created_date"],
)
@@ -1031,6 +1062,7 @@ class DocumentSerializer(
archived_file_name = SerializerMethodField()
created_date = serializers.DateField(required=False)
page_count = SerializerMethodField()
duplicate_documents = SerializerMethodField()
notes = NotesSerializer(many=True, required=False, read_only=True)
@@ -1056,6 +1088,16 @@ class DocumentSerializer(
def get_page_count(self, obj) -> int | None:
return obj.page_count
@extend_schema_field(DuplicateDocumentSummarySerializer(many=True))
def get_duplicate_documents(self, obj):
view = self.context.get("view")
if view and getattr(view, "action", None) != "retrieve":
return []
request = self.context.get("request")
user = request.user if request else None
duplicates = _get_viewable_duplicates(obj, user)
return list(duplicates.values("id", "title", "deleted_at"))
def get_original_file_name(self, obj) -> str | None:
return obj.original_filename
@@ -1233,6 +1275,7 @@ class DocumentSerializer(
"archive_serial_number",
"original_file_name",
"archived_file_name",
"duplicate_documents",
"owner",
"permissions",
"user_can_change",
@@ -2094,10 +2137,12 @@ class TasksViewSerializer(OwnedObjectSerializer):
"result",
"acknowledged",
"related_document",
"duplicate_documents",
"owner",
)
related_document = serializers.SerializerMethodField()
duplicate_documents = serializers.SerializerMethodField()
created_doc_re = re.compile(r"New document id (\d+) created")
duplicate_doc_re = re.compile(r"It is a duplicate of .* \(#(\d+)\)")
@@ -2122,6 +2167,17 @@ class TasksViewSerializer(OwnedObjectSerializer):
return result
@extend_schema_field(DuplicateDocumentSummarySerializer(many=True))
def get_duplicate_documents(self, obj):
related_document = self.get_related_document(obj)
request = self.context.get("request")
user = request.user if request else None
document = Document.global_objects.filter(pk=related_document).first()
if not related_document or not user or not document:
return []
duplicates = _get_viewable_duplicates(document, user)
return list(duplicates.values("id", "title", "deleted_at"))
class RunTaskViewSerializer(serializers.Serializer):
task_name = serializers.ChoiceField(
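
The two patterns `created_doc_re` and `duplicate_doc_re` shown above pull a document id out of a task result string. The body of `get_related_document` is not part of this diff, so the following is only a sketch of how such patterns could be applied, with a hypothetical helper name:

```python
import re

created_doc_re = re.compile(r"New document id (\d+) created")
duplicate_doc_re = re.compile(r"It is a duplicate of .* \(#(\d+)\)")


def extract_document_id(result):
    """Return the captured id as a string, or None when neither pattern matches."""
    for pattern in (created_doc_re, duplicate_doc_re):
        match = pattern.search(result or "")
        if match:
            return match.group(1)
    return None


print(extract_document_id("Success. New document id 42 created"))
print(extract_document_id(
    "Not consuming task_one.pdf: It is a duplicate of task_one_existing.pdf (#1234).",
))
print(extract_document_id("test.pdf: Unexpected error during ingestion."))
```

A generic failure message yields no id, which is consistent with the updated test fixture that no longer uses a duplicate message without a `(#id)` suffix.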

View File

@@ -131,6 +131,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertIn("content", results_full[0])
self.assertIn("id", results_full[0])
# Content length is used internally for performance reasons.
# No need to expose this field.
self.assertNotIn("content_length", results_full[0])
response = self.client.get("/api/documents/?fields=id", format="json")
self.assertEqual(response.status_code, status.HTTP_200_OK)
results = response.data["results"]

View File

@@ -7,6 +7,7 @@ from django.contrib.auth.models import User
from rest_framework import status
from rest_framework.test import APITestCase
from documents.models import Document
from documents.models import PaperlessTask
from documents.tests.utils import DirectoriesMixin
from documents.views import TasksViewSet
@@ -258,7 +259,7 @@ class TestTasks(DirectoriesMixin, APITestCase):
task_id=str(uuid.uuid4()),
task_file_name="task_one.pdf",
status=celery.states.FAILURE,
result="test.pdf: Not consuming test.pdf: It is a duplicate.",
result="test.pdf: Unexpected error during ingestion.",
)
response = self.client.get(self.ENDPOINT)
@@ -270,7 +271,7 @@ class TestTasks(DirectoriesMixin, APITestCase):
self.assertEqual(
returned_data["result"],
"test.pdf: Not consuming test.pdf: It is a duplicate.",
"test.pdf: Unexpected error during ingestion.",
)
def test_task_name_webui(self):
@@ -325,20 +326,34 @@ class TestTasks(DirectoriesMixin, APITestCase):
self.assertEqual(returned_data["task_file_name"], "anothertest.pdf")
def test_task_result_failed_duplicate_includes_related_doc(self):
def test_task_result_duplicate_warning_includes_count(self):
"""
GIVEN:
- A celery task failed with a duplicate error
- A celery task succeeds, but a duplicate exists
WHEN:
- API call is made to get tasks
THEN:
- The returned data includes a related document link
- The returned data includes duplicate warning metadata
"""
checksum = "duplicate-checksum"
Document.objects.create(
title="Existing",
content="",
mime_type="application/pdf",
checksum=checksum,
)
created_doc = Document.objects.create(
title="Created",
content="",
mime_type="application/pdf",
checksum=checksum,
archive_checksum="another-checksum",
)
PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_file_name="task_one.pdf",
status=celery.states.FAILURE,
result="Not consuming task_one.pdf: It is a duplicate of task_one_existing.pdf (#1234).",
status=celery.states.SUCCESS,
result=f"Success. New document id {created_doc.pk} created",
)
response = self.client.get(self.ENDPOINT)
@@ -348,7 +363,7 @@ class TestTasks(DirectoriesMixin, APITestCase):
returned_data = response.data[0]
self.assertEqual(returned_data["related_document"], "1234")
self.assertEqual(returned_data["related_document"], str(created_doc.pk))
def test_run_train_classifier_task(self):
"""

View File

@@ -485,21 +485,21 @@ class TestConsumer(
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
with self.assertRaisesMessage(ConsumerError, "It is a duplicate"):
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
self._assert_first_last_send_progress(last_status="FAILED")
self.assertEqual(Document.objects.count(), 2)
self._assert_first_last_send_progress()
def testDuplicates2(self):
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
with self.assertRaisesMessage(ConsumerError, "It is a duplicate"):
with self.get_consumer(self.get_test_archive_file()) as consumer:
consumer.run()
self._assert_first_last_send_progress(last_status="FAILED")
self.assertEqual(Document.objects.count(), 2)
self._assert_first_last_send_progress()
def testDuplicates3(self):
with self.get_consumer(self.get_test_archive_file()) as consumer:
@@ -513,10 +513,11 @@ class TestConsumer(
Document.objects.all().delete()
with self.assertRaisesMessage(ConsumerError, "document is in the trash"):
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
self.assertEqual(Document.objects.count(), 1)
def testAsnExists(self):
with self.get_consumer(
self.get_test_file(),
@@ -718,12 +719,45 @@ class TestConsumer(
dst = self.get_test_file()
self.assertIsFile(dst)
with self.assertRaises(ConsumerError):
expected_message = (
f"{dst.name}: Not consuming {dst.name}: "
f"It is a duplicate of {document.title} (#{document.pk})"
)
with self.assertRaisesMessage(ConsumerError, expected_message):
with self.get_consumer(dst) as consumer:
consumer.run()
self.assertIsNotFile(dst)
self._assert_first_last_send_progress(last_status="FAILED")
self.assertEqual(Document.objects.count(), 1)
self._assert_first_last_send_progress(last_status=ProgressStatusOptions.FAILED)
@override_settings(CONSUMER_DELETE_DUPLICATES=True)
def test_delete_duplicate_in_trash(self):
dst = self.get_test_file()
with self.get_consumer(dst) as consumer:
consumer.run()
# Move the existing document to trash
document = Document.objects.first()
document.delete()
dst = self.get_test_file()
self.assertIsFile(dst)
expected_message = (
f"{dst.name}: Not consuming {dst.name}: "
f"It is a duplicate of {document.title} (#{document.pk})"
f" Note: existing document is in the trash."
)
with self.assertRaisesMessage(ConsumerError, expected_message):
with self.get_consumer(dst) as consumer:
consumer.run()
self.assertIsNotFile(dst)
self.assertEqual(Document.global_objects.count(), 1)
self.assertEqual(Document.objects.count(), 0)
@override_settings(CONSUMER_DELETE_DUPLICATES=False)
def test_no_delete_duplicate(self):
@@ -743,15 +777,12 @@ class TestConsumer(
dst = self.get_test_file()
self.assertIsFile(dst)
with self.assertRaisesRegex(
ConsumerError,
r"sample\.pdf: Not consuming sample\.pdf: It is a duplicate of sample \(#\d+\)",
):
with self.get_consumer(dst) as consumer:
consumer.run()
self.assertIsFile(dst)
self._assert_first_last_send_progress(last_status="FAILED")
self.assertIsNotFile(dst)
self.assertEqual(Document.objects.count(), 2)
self._assert_first_last_send_progress()
@override_settings(FILENAME_FORMAT="{title}")
@mock.patch("documents.parsers.document_consumer_declaration.send")

View File

@@ -180,7 +180,7 @@ class TestRewriteNaturalDateKeywords(SimpleTestCase):
(
"added:this year",
datetime(2025, 7, 15, 12, 0, 0, tzinfo=timezone.utc),
("added:[20250101", "TO 20250715"),
("added:[20250101", "TO 20251231"),
),
(
"added:previous year",

View File

@@ -241,6 +241,10 @@ class TestExportImport(
checksum = hashlib.md5(f.read()).hexdigest()
self.assertEqual(checksum, element["fields"]["checksum"])
# Generated field "content_length" should not be exported,
# it is automatically computed during import.
self.assertNotIn("content_length", element["fields"])
if document_exporter.EXPORTER_ARCHIVE_NAME in element:
fname = (
self.target / element[document_exporter.EXPORTER_ARCHIVE_NAME]

View File

@@ -35,7 +35,6 @@ from django.db.models import Model
from django.db.models import Q
from django.db.models import Sum
from django.db.models import When
from django.db.models.functions import Length
from django.db.models.functions import Lower
from django.db.models.manager import Manager
from django.http import FileResponse
@@ -479,11 +478,11 @@ class TagViewSet(ModelViewSet, PermissionsAwareDocumentCountMixin):
if descendant_pks:
filter_q = self.get_document_count_filter()
children_source = (
children_source = list(
Tag.objects.filter(pk__in=descendant_pks | {t.pk for t in all_tags})
.select_related("owner")
.annotate(document_count=Count("documents", filter=filter_q))
.order_by(*ordering)
.order_by(*ordering),
)
else:
children_source = all_tags
@@ -495,7 +494,11 @@ class TagViewSet(ModelViewSet, PermissionsAwareDocumentCountMixin):
page = self.paginate_queryset(queryset)
serializer = self.get_serializer(page, many=True)
return self.get_paginated_response(serializer.data)
response = self.get_paginated_response(serializer.data)
if descendant_pks:
# Include children in the "all" field, if needed
response.data["all"] = [tag.pk for tag in children_source]
return response
def perform_update(self, serializer):
old_parent = self.get_object().get_parent()
@@ -2322,7 +2325,6 @@ class StatisticsView(GenericAPIView):
user = request.user if request.user is not None else None
documents = (
(
Document.objects.all()
if user is None
else get_objects_for_user_owner_aware(
@@ -2331,14 +2333,11 @@ class StatisticsView(GenericAPIView):
Document,
)
)
.only("mime_type", "content")
.prefetch_related("tags")
)
tags = (
Tag.objects.all()
if user is None
else get_objects_for_user_owner_aware(user, "documents.view_tag", Tag)
)
).only("id", "is_inbox_tag")
correspondent_count = (
Correspondent.objects.count()
if user is None
@@ -2367,31 +2366,33 @@ class StatisticsView(GenericAPIView):
).count()
)
documents_total = documents.count()
inbox_tags = tags.filter(is_inbox_tag=True)
inbox_tag_pks = list(
tags.filter(is_inbox_tag=True).values_list("pk", flat=True),
)
documents_inbox = (
documents.filter(tags__id__in=inbox_tags).distinct().count()
if inbox_tags.exists()
documents.filter(tags__id__in=inbox_tag_pks).values("id").distinct().count()
if inbox_tag_pks
else None
)
document_file_type_counts = (
# Single SQL request for document stats and mime type counts
mime_type_stats = list(
documents.values("mime_type")
.annotate(mime_type_count=Count("mime_type"))
.order_by("-mime_type_count")
if documents_total > 0
else []
.annotate(
mime_type_count=Count("id"),
mime_type_chars=Sum("content_length"),
)
.order_by("-mime_type_count"),
)
character_count = (
documents.annotate(
characters=Length("content"),
)
.aggregate(Sum("characters"))
.get("characters__sum")
)
# Calculate totals from grouped results
documents_total = sum(row["mime_type_count"] for row in mime_type_stats)
character_count = sum(row["mime_type_chars"] or 0 for row in mime_type_stats)
document_file_type_counts = [
{"mime_type": row["mime_type"], "mime_type_count": row["mime_type_count"]}
for row in mime_type_stats
]
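
The totals above are derived in Python from the grouped rows instead of issuing separate count and sum queries. A sketch with hand-written rows in the shape the grouped query is expected to return (the sample values are made up for illustration):

```python
# Rows as they might come back from values("mime_type").annotate(...).
mime_type_stats = [
    {"mime_type": "application/pdf", "mime_type_count": 3, "mime_type_chars": 1200},
    # A sum can be None in SQL, hence the `or 0` guard below.
    {"mime_type": "image/png", "mime_type_count": 1, "mime_type_chars": None},
]

documents_total = sum(row["mime_type_count"] for row in mime_type_stats)
character_count = sum(row["mime_type_chars"] or 0 for row in mime_type_stats)
document_file_type_counts = [
    {"mime_type": row["mime_type"], "mime_type_count": row["mime_type_count"]}
    for row in mime_type_stats
]

print(documents_total, character_count)
```

With an empty result list both sums are 0 and the per-type list is empty, so the earlier `documents_total > 0` special case is no longer needed.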
current_asn = Document.objects.aggregate(
Max("archive_serial_number", default=0),
@@ -2404,11 +2405,9 @@ class StatisticsView(GenericAPIView):
"documents_total": documents_total,
"documents_inbox": documents_inbox,
"inbox_tag": (
inbox_tags.first().pk if inbox_tags.exists() else None
inbox_tag_pks[0] if inbox_tag_pks else None
), # backwards compatibility
"inbox_tags": (
[tag.pk for tag in inbox_tags] if inbox_tags.exists() else None
),
"inbox_tags": (inbox_tag_pks if inbox_tag_pks else None),
"document_file_type_counts": document_file_type_counts,
"character_count": character_count,
"tag_count": len(tags),

File diff suppressed because it is too large

View File

@@ -3,12 +3,15 @@ from urllib.parse import quote
from allauth.account.adapter import DefaultAccountAdapter
from allauth.core import context
from allauth.headless.tokens.sessions import SessionTokenStrategy
from allauth.socialaccount.adapter import DefaultSocialAccountAdapter
from django.conf import settings
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
from django.forms import ValidationError
from django.http import HttpRequest
from django.urls import reverse
from rest_framework.authtoken.models import Token
from documents.models import Document
from paperless.signals import handle_social_account_updated
@@ -159,3 +162,11 @@ class CustomSocialAccountAdapter(DefaultSocialAccountAdapter):
exception,
extra_context,
)
class DrfTokenStrategy(SessionTokenStrategy):
def create_access_token(self, request: HttpRequest) -> str | None:
if not request.user.is_authenticated:
return None
token, _ = Token.objects.get_or_create(user=request.user)
return token.key

View File

@@ -345,6 +345,7 @@ INSTALLED_APPS = [
"allauth.account",
"allauth.socialaccount",
"allauth.mfa",
"allauth.headless",
"drf_spectacular",
"drf_spectacular_sidecar",
"treenode",
@@ -539,6 +540,12 @@ SOCIALACCOUNT_PROVIDERS = json.loads(
)
SOCIAL_ACCOUNT_DEFAULT_GROUPS = __get_list("PAPERLESS_SOCIAL_ACCOUNT_DEFAULT_GROUPS")
SOCIAL_ACCOUNT_SYNC_GROUPS = __get_boolean("PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS")
SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM: Final[str] = os.getenv(
"PAPERLESS_SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM",
"groups",
)
HEADLESS_TOKEN_STRATEGY = "paperless.adapter.DrfTokenStrategy"
MFA_TOTP_ISSUER = "Paperless-ngx"

View File

@@ -40,15 +40,19 @@ def handle_social_account_updated(sender, request, sociallogin, **kwargs):
extra_data = sociallogin.account.extra_data or {}
social_account_groups = extra_data.get(
"groups",
settings.SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM,
[],
) # pre-allauth 65.11.0 structure
if not social_account_groups:
# allauth 65.11.0+ nests claims under `userinfo`/`id_token`
social_account_groups = (
extra_data.get("userinfo", {}).get("groups")
or extra_data.get("id_token", {}).get("groups")
extra_data.get("userinfo", {}).get(
settings.SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM,
)
or extra_data.get("id_token", {}).get(
settings.SOCIAL_ACCOUNT_SYNC_GROUPS_CLAIM,
)
or []
)
if settings.SOCIAL_ACCOUNT_SYNC_GROUPS and social_account_groups is not None:
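
The lookup above reads the configured claim first from the flat `extra_data` and then from the nested `userinfo` / `id_token` sections. A plain-dict sketch of that order, assuming the claim name is passed in directly rather than read from settings (`extract_groups` is a hypothetical helper):

```python
def extract_groups(extra_data, claim="groups"):
    """Return the group list for `claim`, checking flat then nested layouts."""
    extra_data = extra_data or {}
    # Flat layout: the claim sits at the top level of extra_data.
    groups = extra_data.get(claim, [])
    if not groups:
        # Nested layout: the claim lives under userinfo or id_token.
        groups = (
            extra_data.get("userinfo", {}).get(claim)
            or extra_data.get("id_token", {}).get(claim)
            or []
        )
    return groups


print(extract_groups({"groups": ["staff"]}))
print(extract_groups({"userinfo": {"roles": ["admin"]}}, claim="roles"))
print(extract_groups({"id_token": {"groups": ["ops"]}}))
print(extract_groups(None))
```

With a custom claim such as `roles`, a provider that only sends `groups` produces an empty list, so no groups are synced from the wrong field.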

View File

@@ -4,6 +4,7 @@ from allauth.account.adapter import get_adapter
from allauth.core import context
from allauth.socialaccount.adapter import get_adapter as get_social_adapter
from django.conf import settings
from django.contrib.auth.models import AnonymousUser
from django.contrib.auth.models import Group
from django.contrib.auth.models import User
from django.forms import ValidationError
@@ -11,6 +12,9 @@ from django.http import HttpRequest
from django.test import TestCase
from django.test import override_settings
from django.urls import reverse
from rest_framework.authtoken.models import Token
from paperless.adapter import DrfTokenStrategy
class TestCustomAccountAdapter(TestCase):
@@ -181,3 +185,74 @@ class TestCustomSocialAccountAdapter(TestCase):
self.assertTrue(
any("Test authentication error" in message for message in log_cm.output),
)
class TestDrfTokenStrategy(TestCase):
def test_create_access_token_creates_new_token(self):
"""
GIVEN:
- A user with no existing DRF token
WHEN:
- create_access_token is called
THEN:
- A new token is created and its key is returned
"""
user = User.objects.create_user("testuser")
request = HttpRequest()
request.user = user
strategy = DrfTokenStrategy()
token_key = strategy.create_access_token(request)
# Verify a token was created
self.assertIsNotNone(token_key)
self.assertTrue(Token.objects.filter(user=user).exists())
# Verify the returned key matches the created token
token = Token.objects.get(user=user)
self.assertEqual(token_key, token.key)
def test_create_access_token_returns_existing_token(self):
"""
GIVEN:
- A user with an existing DRF token
WHEN:
- create_access_token is called again
THEN:
- The same token key is returned (no new token created)
"""
user = User.objects.create_user("testuser")
existing_token = Token.objects.create(user=user)
request = HttpRequest()
request.user = user
strategy = DrfTokenStrategy()
token_key = strategy.create_access_token(request)
# Verify the existing token key is returned
self.assertEqual(token_key, existing_token.key)
# Verify only one token exists (no duplicate created)
self.assertEqual(Token.objects.filter(user=user).count(), 1)
def test_create_access_token_returns_none_for_unauthenticated_user(self):
"""
GIVEN:
- An unauthenticated request
WHEN:
- create_access_token is called
THEN:
- None is returned and no token is created
"""
request = HttpRequest()
request.user = AnonymousUser()
strategy = DrfTokenStrategy()
token_key = strategy.create_access_token(request)
self.assertIsNone(token_key)
self.assertEqual(Token.objects.count(), 0)

View File

@@ -228,6 +228,7 @@ urlpatterns = [
],
),
),
re_path("^auth/headless/", include("allauth.headless.urls")),
re_path(
"^$", # Redirect to the API swagger view
RedirectView.as_view(url="schema/view/"),

View File

@@ -1,11 +1,14 @@
import logging
import shutil
from datetime import timedelta
from pathlib import Path
import faiss
import llama_index.core.settings as llama_settings
import tqdm
from celery import states
from django.conf import settings
from django.utils import timezone
from llama_index.core import Document as LlamaDocument
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex
@@ -21,6 +24,7 @@ from llama_index.core.text_splitter import TokenTextSplitter
from llama_index.vector_stores.faiss import FaissVectorStore
from documents.models import Document
from documents.models import PaperlessTask
from paperless_ai.embedding import build_llm_index_text
from paperless_ai.embedding import get_embedding_dim
from paperless_ai.embedding import get_embedding_model
@@ -28,6 +32,29 @@ from paperless_ai.embedding import get_embedding_model
logger = logging.getLogger("paperless_ai.indexing")
def queue_llm_index_update_if_needed(*, rebuild: bool, reason: str) -> bool:
from documents.tasks import llmindex_index
has_running = PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
status__in=[states.PENDING, states.STARTED],
).exists()
has_recent = PaperlessTask.objects.filter(
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
date_created__gte=(timezone.now() - timedelta(minutes=5)),
).exists()
if has_running or has_recent:
return False
llmindex_index.delay(rebuild=rebuild, scheduled=False, auto=True)
logger.warning(
"Queued LLM index update%s: %s",
" (rebuild)" if rebuild else "",
reason,
)
return True
def get_or_create_storage_context(*, rebuild=False):
"""
Loads or creates the StorageContext (vector store, docstore, index store).
@@ -93,6 +120,10 @@ def load_or_build_index(nodes=None):
except ValueError as e:
logger.warning("Failed to load index from storage: %s", e)
if not nodes:
queue_llm_index_update_if_needed(
rebuild=vector_store_file_exists(),
reason="LLM index missing or invalid while loading.",
)
logger.info("No nodes provided for index creation.")
raise
return VectorStoreIndex(
@@ -250,6 +281,13 @@ def query_similar_documents(
"""
Runs a similarity query and returns top-k similar Document objects.
"""
if not vector_store_file_exists():
queue_llm_index_update_if_needed(
rebuild=False,
reason="LLM index not found for similarity query.",
)
return []
index = load_or_build_index()
# constrain only the node(s) that match the document IDs, if given
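
`queue_llm_index_update_if_needed` above skips queuing when an update task is pending or started, or when one was created within the last five minutes. A sketch of that decision over in-memory records, assuming each task is a dict with `status` and `created` keys (this shape and the `should_queue` name are illustrative, not the `PaperlessTask` model):

```python
from datetime import datetime, timedelta, timezone

ACTIVE_STATES = {"PENDING", "STARTED"}


def should_queue(tasks, now, window=timedelta(minutes=5)):
    """Return True only if no task is active and none was created within `window`."""
    has_running = any(t["status"] in ACTIVE_STATES for t in tasks)
    has_recent = any(t["created"] >= now - window for t in tasks)
    return not (has_running or has_recent)


now = datetime(2026, 1, 26, 15, 0, tzinfo=timezone.utc)
old_done = {"status": "SUCCESS", "created": now - timedelta(hours=1)}
recent_done = {"status": "SUCCESS", "created": now - timedelta(minutes=2)}
old_running = {"status": "STARTED", "created": now - timedelta(hours=1)}

print(should_queue([], now))
print(should_queue([old_done], now))
print(should_queue([recent_done], now))
print(should_queue([old_running], now))
```

The recency check applies to tasks in any state, so a task that failed a minute ago also suppresses a new one, which limits repeated queuing while the index stays missing.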

View File

@@ -3,11 +3,13 @@ from unittest.mock import MagicMock
from unittest.mock import patch
import pytest
from celery import states
from django.test import override_settings
from django.utils import timezone
from llama_index.core.base.embeddings.base import BaseEmbedding
from documents.models import Document
from documents.models import PaperlessTask
from paperless_ai import indexing
@@ -288,6 +290,36 @@ def test_update_llm_index_no_documents(
)
@pytest.mark.django_db
def test_queue_llm_index_update_if_needed_enqueues_when_idle_or_skips_recent():
# No existing tasks
with patch("documents.tasks.llmindex_index") as mock_task:
result = indexing.queue_llm_index_update_if_needed(
rebuild=True,
reason="test enqueue",
)
assert result is True
mock_task.delay.assert_called_once_with(rebuild=True, scheduled=False, auto=True)
PaperlessTask.objects.create(
task_id="task-1",
task_name=PaperlessTask.TaskName.LLMINDEX_UPDATE,
status=states.STARTED,
date_created=timezone.now(),
)
# Existing running task
with patch("documents.tasks.llmindex_index") as mock_task:
result = indexing.queue_llm_index_update_if_needed(
rebuild=False,
reason="should skip",
)
assert result is False
mock_task.delay.assert_not_called()
@override_settings(
LLM_EMBEDDING_BACKEND="huggingface",
LLM_BACKEND="ollama",
@@ -299,11 +331,15 @@ def test_query_similar_documents(
with (
patch("paperless_ai.indexing.get_or_create_storage_context") as mock_storage,
patch("paperless_ai.indexing.load_or_build_index") as mock_load_or_build_index,
patch(
"paperless_ai.indexing.vector_store_file_exists",
) as mock_vector_store_exists,
patch("paperless_ai.indexing.VectorIndexRetriever") as mock_retriever_cls,
patch("paperless_ai.indexing.Document.objects.filter") as mock_filter,
):
mock_storage.return_value = MagicMock()
mock_storage.return_value.persist_dir = temp_llm_index_dir
mock_vector_store_exists.return_value = True
mock_index = MagicMock()
mock_load_or_build_index.return_value = mock_index
@@ -332,3 +368,31 @@ def test_query_similar_documents(
mock_filter.assert_called_once_with(pk__in=[1, 2])
assert result == mock_filtered_docs
@pytest.mark.django_db
def test_query_similar_documents_triggers_update_when_index_missing(
temp_llm_index_dir,
real_document,
):
with (
patch(
"paperless_ai.indexing.vector_store_file_exists",
return_value=False,
),
patch(
"paperless_ai.indexing.queue_llm_index_update_if_needed",
) as mock_queue,
patch("paperless_ai.indexing.load_or_build_index") as mock_load,
):
result = indexing.query_similar_documents(
real_document,
top_k=2,
)
mock_queue.assert_called_once_with(
rebuild=False,
reason="LLM index not found for similarity query.",
)
mock_load.assert_not_called()
assert result == []