Compare commits

..

11 Commits

Author SHA1 Message Date
shamoon
90561857e8 Documentation: add note about WAL mode with SQLite 2025-02-24 14:08:49 -08:00
shamoon
fc68f55d1a Documentation: correct modify_tags param requirement 2025-02-14 08:20:43 -08:00
shamoon
6a8ec182fa Documentation: clarify encryption docs 2025-02-08 08:07:04 -08:00
Stéphane Brunner
69541546ea Fix URL to django-rest-framework (#8998) 2025-02-02 08:00:02 -08:00
github-actions[bot]
16d6bb7334 Documentation: Add v2.14.7 changelog (#8972)
* Changelog v2.14.7 - GHA

* Update changelog.md

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2025-01-31 08:36:29 -08:00
shamoon
49b658a944 Bump version to 2.14.7 2025-01-31 07:46:43 -08:00
shamoon
e1d8680698 Fix: also ensure symmetric doc link removal on bulk edit (#8963) 2025-01-31 07:44:47 -08:00
shamoon
ee72e2d1fd Fix: resolve error in trackBy 2025-01-31 07:44:47 -08:00
shamoon
e0ea4a4625 Fix: reflect doc links in bulk modify custom fields (#8962) 2025-01-31 07:44:47 -08:00
shamoon
c2a9ac332a Enhancement: require totp code for obtain auth token (#8936) 2025-01-31 07:44:47 -08:00
dependabot[bot]
bf368aadd0 Chore(deps-dev): Bump ruff from 0.9.2 to 0.9.3 in the development group (#8928)
* Chore(deps-dev): Bump ruff from 0.9.2 to 0.9.3 in the development group

Bumps the development group with 1 update: [ruff](https://github.com/astral-sh/ruff).


Updates `ruff` from 0.9.2 to 0.9.3
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.9.2...0.9.3)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update .pre-commit-config.yaml

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2025-01-31 07:44:46 -08:00
23 changed files with 484 additions and 448 deletions

View File

@@ -38,6 +38,7 @@ ignore = ["DJ001", "SIM105", "RUF012"]
[lint.per-file-ignores]
".github/scripts/*.py" = ["E501", "INP001", "SIM117"]
"docker/wait-for-redis.py" = ["INP001", "T201"]
"src/documents/consumer.py" = ["PTH"] # TODO Enable & remove
"src/documents/file_handling.py" = ["PTH"] # TODO Enable & remove
"src/documents/management/commands/document_consumer.py" = ["PTH"] # TODO Enable & remove
"src/documents/management/commands/document_exporter.py" = ["PTH"] # TODO Enable & remove
@@ -50,6 +51,8 @@ ignore = ["DJ001", "SIM105", "RUF012"]
"src/documents/signals/handlers.py" = ["PTH"] # TODO Enable & remove
"src/documents/tasks.py" = ["PTH"] # TODO Enable & remove
"src/documents/tests/test_api_app_config.py" = ["PTH"] # TODO Enable & remove
"src/documents/tests/test_api_bulk_download.py" = ["PTH"] # TODO Enable & remove
"src/documents/tests/test_api_documents.py" = ["PTH"] # TODO Enable & remove
"src/documents/tests/test_classifier.py" = ["PTH"] # TODO Enable & remove
"src/documents/tests/test_consumer.py" = ["PTH"] # TODO Enable & remove
"src/documents/tests/test_file_handling.py" = ["PTH"] # TODO Enable & remove

View File

@@ -565,19 +565,15 @@ document.
### Managing encryption {#encryption}
Documents can be stored in Paperless using GnuPG encryption.
!!! warning
Encryption is deprecated since [paperless-ng 0.9](changelog.md#paperless-ng-090) and doesn't really
provide any additional security, since you have to store the passphrase
in a configuration file on the same system as the encrypted documents
for paperless to work. Furthermore, the entire text content of the
documents is stored in plain text in the database, even if your documents
are encrypted. Filenames are not encrypted either.
Also, the web server provides transparent access to your encrypted
documents.
Encryption was removed in [paperless-ng 0.9](changelog.md#paperless-ng-090)
because it did not really provide any additional security: the passphrase
was stored in a configuration file on the same system as the documents.
Furthermore, the entire text content of the documents is stored in plain
text in the database, even if your documents are encrypted. Filenames are
not encrypted either. Finally, the web server provides transparent access
to your encrypted documents.
Consider running paperless on an encrypted filesystem instead, which
will then at least provide security against physical hardware theft.

View File

@@ -1,7 +1,7 @@
# The REST API
Paperless makes use of the [Django REST
Framework](https://django-rest-framework.org/) standard API interface. It
Framework](https://www.django-rest-framework.org/) standard API interface. It
provides a browsable API for most of its endpoints, which you can
inspect at `http://<paperless-host>:<port>/api/`. This also documents
most of the available filters and ordering fields.
@@ -444,7 +444,7 @@ The following methods are supported:
- `remove_tag`
- Requires `parameters`: `{ "tag": TAG_ID }`
- `modify_tags`
- Requires `parameters`: `{ "add_tags": [LIST_OF_TAG_IDS] }` and / or `{ "remove_tags": [LIST_OF_TAG_IDS] }`
- Requires `parameters`: `{ "add_tags": [LIST_OF_TAG_IDS] }` and `{ "remove_tags": [LIST_OF_TAG_IDS] }`
- `delete`
- No `parameters` required
- `reprocess`
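
To make the corrected `modify_tags` requirement concrete, here is a minimal
sketch of a bulk edit call against this endpoint; the host, token, and all
IDs are placeholder assumptions, not values from this diff.

```python
import requests

BASE = "http://localhost:8000"  # assumed Paperless host
HEADERS = {"Authorization": "Token 0123456789abcdef"}  # assumed API token

resp = requests.post(
    f"{BASE}/api/documents/bulk_edit/",
    json={
        "documents": [10, 11],  # documents to edit (assumed IDs)
        "method": "modify_tags",
        "parameters": {"add_tags": [1], "remove_tags": [2]},  # both keys required
    },
    headers=HEADERS,
)
resp.raise_for_status()
print(resp.json())
```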

View File

@@ -1,5 +1,28 @@
# Changelog
## paperless-ngx 2.14.7
### Features
- Enhancement: require totp code for obtain auth token by [@shamoon](https://github.com/shamoon) [#8936](https://github.com/paperless-ngx/paperless-ngx/pull/8936)
### Bug Fixes
- Fix: reflect doc links in bulk modify custom fields by [@shamoon](https://github.com/shamoon) [#8962](https://github.com/paperless-ngx/paperless-ngx/pull/8962)
- Fix: also ensure symmetric doc link removal on bulk edit by [@shamoon](https://github.com/shamoon) [#8963](https://github.com/paperless-ngx/paperless-ngx/pull/8963)
### All App Changes
<details>
<summary>4 changes</summary>
- Chore(deps-dev): Bump ruff from 0.9.2 to 0.9.3 in the development group by @[dependabot[bot]](https://github.com/apps/dependabot) [#8928](https://github.com/paperless-ngx/paperless-ngx/pull/8928)
- Enhancement: require totp code for obtain auth token by [@shamoon](https://github.com/shamoon) [#8936](https://github.com/paperless-ngx/paperless-ngx/pull/8936)
- Fix: reflect doc links in bulk modify custom fields by [@shamoon](https://github.com/shamoon) [#8962](https://github.com/paperless-ngx/paperless-ngx/pull/8962)
- Fix: also ensure symmetric doc link removal on bulk edit by [@shamoon](https://github.com/shamoon) [#8963](https://github.com/paperless-ngx/paperless-ngx/pull/8963)
</details>
## paperless-ngx 2.14.6
### Bug Fixes

View File

@@ -713,7 +713,8 @@ Paperless runs on Raspberry Pi. However, some things are rather slow on
the Pi and configuring some options in paperless can help improve
performance immensely:
- Stick with SQLite to save some resources.
- Stick with SQLite to save some resources. See [troubleshooting](troubleshooting.md#log-reports-creating-paperlesstask-failed)
if you encounter issues with SQLite locking.
- Consider setting [`PAPERLESS_OCR_PAGES`](configuration.md#PAPERLESS_OCR_PAGES) to 1, so that paperless will
only OCR the first page of your documents. In most cases, this page
contains enough information to be able to find it.

View File

@@ -320,7 +320,9 @@ many workers attempting to access the database simultaneously.
Consider changing to the PostgreSQL database if you often process many
documents at once. Otherwise, try tweaking the
[`PAPERLESS_DB_TIMEOUT`](configuration.md#PAPERLESS_DB_TIMEOUT) setting to allow more time for the database to
unlock. This may have minor performance implications.
unlock. Additionally, you can change your SQLite database to use ["Write-Ahead Logging"](https://sqlite.org/wal.html).
These changes may have minor performance implications but can help
prevent database locking issues.
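
As a concrete reference, Write-Ahead Logging is a persistent, one-time
setting on the database file itself. A minimal sketch, assuming the SQLite
database lives at the path shown and Paperless is stopped first:

```python
import sqlite3

# Assumed database path; adjust to your data directory.
con = sqlite3.connect("/opt/paperless/data/db.sqlite3")
con.execute("PRAGMA journal_mode=WAL;")  # enable Write-Ahead Logging
print(con.execute("PRAGMA journal_mode;").fetchone())  # expect ('wal',)
con.close()
```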
## gunicorn fails to start with "is not a valid port number"

View File

@@ -5,7 +5,6 @@ import { first, Subscription } from 'rxjs'
import { ToastsComponent } from './components/common/toasts/toasts.component'
import { FileDropComponent } from './components/file-drop/file-drop.component'
import { SETTINGS_KEYS } from './data/ui-settings'
import { ComponentRouterService } from './services/component-router.service'
import { ConsumerStatusService } from './services/consumer-status.service'
import { HotKeyService } from './services/hot-key.service'
import {
@@ -42,8 +41,7 @@ export class AppComponent implements OnInit, OnDestroy {
public tourService: TourService,
private renderer: Renderer2,
private permissionsService: PermissionsService,
private hotKeyService: HotKeyService,
private componentRouterService: ComponentRouterService
private hotKeyService: HotKeyService
) {
let anyWindow = window as any
anyWindow.pdfWorkerSrc = 'assets/js/pdf.worker.min.mjs'

View File

@@ -45,7 +45,6 @@ import { Tag } from 'src/app/data/tag'
import { PermissionsGuard } from 'src/app/guards/permissions.guard'
import { CustomDatePipe } from 'src/app/pipes/custom-date.pipe'
import { DocumentTitlePipe } from 'src/app/pipes/document-title.pipe'
import { ComponentRouterService } from 'src/app/services/component-router.service'
import { DocumentListViewService } from 'src/app/services/document-list-view.service'
import { OpenDocumentsService } from 'src/app/services/open-documents.service'
import { PermissionsService } from 'src/app/services/permissions.service'
@@ -128,7 +127,6 @@ describe('DocumentDetailComponent', () => {
let settingsService: SettingsService
let customFieldsService: CustomFieldsService
let httpTestingController: HttpTestingController
let componentRouterService: ComponentRouterService
let currentUserCan = true
let currentUserHasObjectPermissions = true
@@ -266,7 +264,6 @@ describe('DocumentDetailComponent', () => {
customFieldsService = TestBed.inject(CustomFieldsService)
fixture = TestBed.createComponent(DocumentDetailComponent)
httpTestingController = TestBed.inject(HttpTestingController)
componentRouterService = TestBed.inject(ComponentRouterService)
component = fixture.componentInstance
})
@@ -571,16 +568,6 @@ describe('DocumentDetailComponent', () => {
expect(navigateSpy).toHaveBeenCalledWith(['documents'])
})
it('should allow close and navigate to the last view if available', () => {
initNormally()
jest
.spyOn(componentRouterService, 'getComponentURLBefore')
.mockReturnValue('dashboard')
const navigateSpy = jest.spyOn(router, 'navigate')
component.close()
expect(navigateSpy).toHaveBeenCalledWith(['dashboard'])
})
it('should allow close and navigate to documents by default', () => {
initNormally()
jest

View File

@@ -59,7 +59,6 @@ import { CustomDatePipe } from 'src/app/pipes/custom-date.pipe'
import { DocumentTitlePipe } from 'src/app/pipes/document-title.pipe'
import { FileSizePipe } from 'src/app/pipes/file-size.pipe'
import { SafeUrlPipe } from 'src/app/pipes/safeurl.pipe'
import { ComponentRouterService } from 'src/app/services/component-router.service'
import { DocumentListViewService } from 'src/app/services/document-list-view.service'
import { HotKeyService } from 'src/app/services/hot-key.service'
import { OpenDocumentsService } from 'src/app/services/open-documents.service'
@@ -273,8 +272,7 @@ export class DocumentDetailComponent
private userService: UserService,
private customFieldsService: CustomFieldsService,
private http: HttpClient,
private hotKeyService: HotKeyService,
private componentRouterService: ComponentRouterService
private hotKeyService: HotKeyService
) {
super()
}
@@ -890,10 +888,6 @@ export class DocumentDetailComponent
'view',
this.documentListViewService.activeSavedViewId,
])
} else if (this.componentRouterService.getComponentURLBefore()) {
this.router.navigate([
this.componentRouterService.getComponentURLBefore(),
])
} else {
this.router.navigate(['documents'])
}

View File

@@ -32,7 +32,7 @@
{{document.title | documentTitle}}
}
@if (displayFields.includes(DisplayField.TAGS)) {
@for (tagID of document.tags; track t) {
@for (tagID of document.tags; track tagID) {
<pngx-tag [tagID]="tagID" linkTitle="Filter by tag" i18n-linkTitle class="ms-1" (click)="clickTag.emit(tagID);$event.stopPropagation()" [clickable]="clickTag.observers.length"></pngx-tag>
}
}

View File

@@ -1,102 +0,0 @@
import { TestBed } from '@angular/core/testing'
import { ActivationStart, Router } from '@angular/router'
import { Subject } from 'rxjs'
import { ComponentRouterService } from './component-router.service'
describe('ComponentRouterService', () => {
let service: ComponentRouterService
let router: Router
let eventsSubject: Subject<any>
beforeEach(() => {
eventsSubject = new Subject<any>()
TestBed.configureTestingModule({
providers: [
ComponentRouterService,
{
provide: Router,
useValue: {
events: eventsSubject.asObservable(),
},
},
],
})
service = TestBed.inject(ComponentRouterService)
router = TestBed.inject(Router)
})
it('should add to history and componentHistory on ActivationStart event', () => {
eventsSubject.next(
new ActivationStart({
url: 'test-url',
component: { name: 'TestComponent' },
} as any)
)
expect((service as any).history).toEqual(['test-url'])
expect((service as any).componentHistory).toEqual(['TestComponent'])
})
it('should not add duplicate component names to componentHistory', () => {
eventsSubject.next(
new ActivationStart({
url: 'test-url-1',
component: { name: 'TestComponent' },
} as any)
)
eventsSubject.next(
new ActivationStart({
url: 'test-url-2',
component: { name: 'TestComponent' },
} as any)
)
expect((service as any).componentHistory.length).toBe(1)
expect((service as any).componentHistory).toEqual(['TestComponent'])
})
it('should return the URL of the component before the current one', () => {
eventsSubject.next(
new ActivationStart({
url: 'test-url-1',
component: { name: 'TestComponent1' },
} as any)
)
eventsSubject.next(
new ActivationStart({
url: 'test-url-2',
component: { name: 'TestComponent2' },
} as any)
)
expect(service.getComponentURLBefore()).toBe('test-url-1')
})
it('should update the URL of the current component if the same component is loaded via a different URL', () => {
eventsSubject.next(
new ActivationStart({
url: 'test-url-1',
component: { name: 'TestComponent' },
} as any)
)
eventsSubject.next(
new ActivationStart({
url: 'test-url-2',
component: { name: 'TestComponent' },
} as any)
)
expect((service as any).history).toEqual(['test-url-2'])
})
it('should return null if there is no previous component', () => {
eventsSubject.next(
new ActivationStart({
url: 'test-url',
component: { name: 'TestComponent' },
} as any)
)
expect(service.getComponentURLBefore()).toBeNull()
})
})

View File

@@ -1,35 +0,0 @@
import { Injectable } from '@angular/core'
import { ActivationStart, Event, Router } from '@angular/router'
import { filter } from 'rxjs'
@Injectable({
providedIn: 'root',
})
export class ComponentRouterService {
private history: string[] = []
private componentHistory: any[] = []
constructor(private router: Router) {
this.router.events
.pipe(filter((event: Event) => event instanceof ActivationStart))
.subscribe((event: ActivationStart) => {
if (
this.componentHistory[this.componentHistory.length - 1] !==
event.snapshot.component.name
) {
this.history.push(event.snapshot.url.toString())
this.componentHistory.push(event.snapshot.component.name)
} else {
// Update the URL of the current component in case the same component was loaded via a different URL
this.history[this.history.length - 1] = event.snapshot.url.toString()
}
})
}
public getComponentURLBefore(): any {
if (this.componentHistory.length > 1) {
return this.history[this.history.length - 2]
}
return null
}
}

View File

@@ -5,7 +5,7 @@ export const environment = {
apiBaseUrl: document.baseURI + 'api/',
apiVersion: '7',
appTitle: 'Paperless-ngx',
version: '2.14.6',
version: '2.14.7',
webSocketHost: window.location.host,
webSocketProtocol: window.location.protocol == 'https:' ? 'wss:' : 'ws:',
webSocketBaseUrl: base_url.pathname + 'ws/',

View File

@@ -12,6 +12,7 @@ from celery import shared_task
from django.conf import settings
from django.contrib.auth.models import User
from django.db.models import Q
from django.utils import timezone
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentMetadataOverrides
@@ -177,6 +178,27 @@ def modify_custom_fields(
field_id=field_id,
defaults=defaults,
)
if custom_field.data_type == CustomField.FieldDataType.DOCUMENTLINK:
doc = Document.objects.get(id=doc_id)
reflect_doclinks(doc, custom_field, value)
# For doc link fields that are being removed, remove symmetrical links
for doclink_being_removed_instance in CustomFieldInstance.objects.filter(
document_id__in=affected_docs,
field__id__in=remove_custom_fields,
field__data_type=CustomField.FieldDataType.DOCUMENTLINK,
value_document_ids__isnull=False,
):
for target_doc_id in doclink_being_removed_instance.value:
remove_doclink(
document=Document.objects.get(
id=doclink_being_removed_instance.document.id,
),
field=doclink_being_removed_instance.field,
target_doc_id=target_doc_id,
)
# Finally, remove the custom fields
CustomFieldInstance.objects.filter(
document_id__in=affected_docs,
field_id__in=remove_custom_fields,
@@ -447,3 +469,87 @@ def delete_pages(doc_ids: list[int], pages: list[int]) -> Literal["OK"]:
logger.exception(f"Error deleting pages from document {doc.id}: {e}")
return "OK"
def reflect_doclinks(
document: Document,
field: CustomField,
target_doc_ids: list[int],
):
"""
Add or remove 'symmetrical' links to `document` on all `target_doc_ids`
"""
if target_doc_ids is None:
target_doc_ids = []
# Check if any documents are going to be removed from the current list of links and remove the symmetrical links
current_field_instance = CustomFieldInstance.objects.filter(
field=field,
document=document,
).first()
if current_field_instance is not None and current_field_instance.value is not None:
for doc_id in current_field_instance.value:
if doc_id not in target_doc_ids:
remove_doclink(
document=document,
field=field,
target_doc_id=doc_id,
)
# Create an instance if target doc doesn't have this field or append it to an existing one
existing_custom_field_instances = {
custom_field.document_id: custom_field
for custom_field in CustomFieldInstance.objects.filter(
field=field,
document_id__in=target_doc_ids,
)
}
custom_field_instances_to_create = []
custom_field_instances_to_update = []
for target_doc_id in target_doc_ids:
target_doc_field_instance = existing_custom_field_instances.get(
target_doc_id,
)
if target_doc_field_instance is None:
custom_field_instances_to_create.append(
CustomFieldInstance(
document_id=target_doc_id,
field=field,
value_document_ids=[document.id],
),
)
elif target_doc_field_instance.value is None:
target_doc_field_instance.value_document_ids = [document.id]
custom_field_instances_to_update.append(target_doc_field_instance)
elif document.id not in target_doc_field_instance.value:
target_doc_field_instance.value_document_ids.append(document.id)
custom_field_instances_to_update.append(target_doc_field_instance)
CustomFieldInstance.objects.bulk_create(custom_field_instances_to_create)
CustomFieldInstance.objects.bulk_update(
custom_field_instances_to_update,
["value_document_ids"],
)
Document.objects.filter(id__in=target_doc_ids).update(modified=timezone.now())
def remove_doclink(
document: Document,
field: CustomField,
target_doc_id: int,
):
"""
Removes a 'symmetrical' link to `document` from the target document's existing custom field instance
"""
target_doc_field_instance = CustomFieldInstance.objects.filter(
document_id=target_doc_id,
field=field,
).first()
if (
target_doc_field_instance is not None
and document.id in target_doc_field_instance.value
):
target_doc_field_instance.value.remove(document.id)
target_doc_field_instance.save()
Document.objects.filter(id=target_doc_id).update(modified=timezone.now())
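
A short sketch of how these helpers are exercised, using the same call shape
as the tests later in this diff; the field and document IDs are assumptions:

```python
from documents import bulk_edit

# Assume field id 5 is a DOCUMENTLINK custom field and documents 1, 2, 3 exist.
bulk_edit.modify_custom_fields(
    [1],                            # documents being edited
    add_custom_fields={5: [2, 3]},  # link document 1 to documents 2 and 3
    remove_custom_fields=[],
)
# reflect_doclinks() mirrors the links: documents 2 and 3 now reference doc 1.

bulk_edit.modify_custom_fields(
    [1],
    add_custom_fields={},
    remove_custom_fields=[5],
)
# The removal branch above calls remove_doclink() for each linked target,
# stripping the symmetric references from documents 2 and 3 again.
```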

View File

@@ -4,7 +4,6 @@ import os
import tempfile
from enum import Enum
from pathlib import Path
from typing import TYPE_CHECKING
import magic
from django.conf import settings
@@ -155,11 +154,7 @@ class ConsumerPlugin(
"""
Confirm the input file still exists where it should
"""
if TYPE_CHECKING:
assert isinstance(self.input_doc.original_file, Path), (
self.input_doc.original_file
)
if not self.input_doc.original_file.is_file():
if not os.path.isfile(self.input_doc.original_file):
self._fail(
ConsumerStatusShortMessage.FILE_NOT_FOUND,
f"Cannot consume {self.input_doc.original_file}: File not found.",
@@ -169,7 +164,7 @@ class ConsumerPlugin(
"""
Using the MD5 of the file, check this exact file doesn't already exist
"""
with Path(self.input_doc.original_file).open("rb") as f:
with open(self.input_doc.original_file, "rb") as f:
checksum = hashlib.md5(f.read()).hexdigest()
existing_doc = Document.global_objects.filter(
Q(checksum=checksum) | Q(archive_checksum=checksum),
@@ -183,7 +178,7 @@ class ConsumerPlugin(
log_msg += " Note: existing document is in the trash."
if settings.CONSUMER_DELETE_DUPLICATES:
Path(self.input_doc.original_file).unlink()
os.unlink(self.input_doc.original_file)
self._fail(
msg,
log_msg,
@@ -242,7 +237,7 @@ class ConsumerPlugin(
if not settings.PRE_CONSUME_SCRIPT:
return
if not Path(settings.PRE_CONSUME_SCRIPT).is_file():
if not os.path.isfile(settings.PRE_CONSUME_SCRIPT):
self._fail(
ConsumerStatusShortMessage.PRE_CONSUME_SCRIPT_NOT_FOUND,
f"Configured pre-consume script "
@@ -285,7 +280,7 @@ class ConsumerPlugin(
if not settings.POST_CONSUME_SCRIPT:
return
if not Path(settings.POST_CONSUME_SCRIPT).is_file():
if not os.path.isfile(settings.POST_CONSUME_SCRIPT):
self._fail(
ConsumerStatusShortMessage.POST_CONSUME_SCRIPT_NOT_FOUND,
f"Configured post-consume script "
@@ -587,7 +582,7 @@ class ConsumerPlugin(
document.thumbnail_path,
)
if archive_path and Path(archive_path).is_file():
if archive_path and os.path.isfile(archive_path):
document.archive_filename = generate_unique_filename(
document,
archive_filename=True,
@@ -599,7 +594,7 @@ class ConsumerPlugin(
document.archive_path,
)
with Path(archive_path).open("rb") as f:
with open(archive_path, "rb") as f:
document.archive_checksum = hashlib.md5(
f.read(),
).hexdigest()
@@ -617,14 +612,14 @@ class ConsumerPlugin(
self.unmodified_original.unlink()
# https://github.com/jonaswinkler/paperless-ng/discussions/1037
shadow_file = (
Path(self.input_doc.original_file).parent
/ f"._{Path(self.input_doc.original_file).name}"
shadow_file = os.path.join(
os.path.dirname(self.input_doc.original_file),
"._" + os.path.basename(self.input_doc.original_file),
)
if Path(shadow_file).is_file():
if os.path.isfile(shadow_file):
self.log.debug(f"Deleting file {shadow_file}")
Path(shadow_file).unlink()
os.unlink(shadow_file)
except Exception as e:
self._fail(
@@ -709,7 +704,7 @@ class ConsumerPlugin(
create_date = date
self.log.debug(f"Creation date from parse_date: {create_date}")
else:
stats = Path(self.input_doc.original_file).stat()
stats = os.stat(self.input_doc.original_file)
create_date = timezone.make_aware(
datetime.datetime.fromtimestamp(stats.st_mtime),
)
@@ -805,10 +800,7 @@ class ConsumerPlugin(
) # adds to document
def _write(self, storage_type, source, target):
with (
Path(source).open("rb") as read_file,
Path(target).open("wb") as write_file,
):
with open(source, "rb") as read_file, open(target, "wb") as write_file:
write_file.write(read_file.read())
# Attempt to copy file's original stats, but it's ok if we can't

View File

@@ -1,5 +1,6 @@
import logging
import os
from concurrent.futures import ThreadPoolExecutor
from fnmatch import filter
from pathlib import Path
from pathlib import PurePath
@@ -12,9 +13,8 @@ from django import db
from django.conf import settings
from django.core.management.base import BaseCommand
from django.core.management.base import CommandError
from watchfiles import Change
from watchfiles import DefaultFilter
from watchfiles import watch
from watchdog.events import FileSystemEventHandler
from watchdog.observers.polling import PollingObserver
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentMetadataOverrides
@@ -141,6 +141,53 @@ def _consume(filepath: str) -> None:
logger.exception("Error while consuming document")
def _consume_wait_unmodified(file: str) -> None:
"""
Waits for the given file to appear unmodified based on file size
and modification time. Will wait a configured number of seconds
and retry a configured number of times before either consuming or
giving up
"""
if _is_ignored(file):
return
logger.debug(f"Waiting for file {file} to remain unmodified")
mtime = -1
size = -1
current_try = 0
while current_try < settings.CONSUMER_POLLING_RETRY_COUNT:
try:
stat_data = os.stat(file)
new_mtime = stat_data.st_mtime
new_size = stat_data.st_size
except FileNotFoundError:
logger.debug(
f"File {file} moved while waiting for it to remain unmodified.",
)
return
if new_mtime == mtime and new_size == size:
_consume(file)
return
mtime = new_mtime
size = new_size
sleep(settings.CONSUMER_POLLING_DELAY)
current_try += 1
logger.error(f"Timeout while waiting on file {file} to remain unmodified.")
class Handler(FileSystemEventHandler):
def __init__(self, pool: ThreadPoolExecutor) -> None:
super().__init__()
self._pool = pool
def on_created(self, event):
self._pool.submit(_consume_wait_unmodified, event.src_path)
def on_moved(self, event):
self._pool.submit(_consume_wait_unmodified, event.dest_path)
class Command(BaseCommand):
"""
On every iteration of an infinite loop, consume what we can from the
@@ -152,7 +199,7 @@ class Command(BaseCommand):
# Also only for testing, configures in one place the timeout used before checking
# the stop flag
testing_timeout_s: Final[float] = 0.5
testing_timeout_ms: Final[int] = int(testing_timeout_s * 1000)
testing_timeout_ms: Final[float] = testing_timeout_s * 1000.0
def add_arguments(self, parser):
parser.add_argument(
@@ -174,121 +221,139 @@ class Command(BaseCommand):
)
def handle(self, *args, **options):
directory: Final[Path] = Path(options["directory"]).resolve()
is_recursive: Final[bool] = settings.CONSUMER_RECURSIVE
is_oneshot: Final[bool] = options["oneshot"]
is_testing: Final[bool] = options["testing"]
directory = options["directory"]
recursive = settings.CONSUMER_RECURSIVE
if not directory:
raise CommandError("CONSUMPTION_DIR does not appear to be set.")
if not directory.exists():
raise CommandError(f"Consumption directory {directory} does not exist")
directory = os.path.abspath(directory)
if not directory.is_dir():
raise CommandError(f"Consumption directory {directory} is not a directory")
if not os.path.isdir(directory):
raise CommandError(f"Consumption directory {directory} does not exist")
# Consumer will need this
settings.SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
# Check for existing files at startup
glob_str = "**/*" if is_recursive else "*"
if recursive:
for dirpath, _, filenames in os.walk(directory):
for filename in filenames:
filepath = os.path.join(dirpath, filename)
_consume(filepath)
else:
for entry in os.scandir(directory):
_consume(entry.path)
for filepath in directory.glob(glob_str):
_consume(filepath)
if is_oneshot:
logger.info("One shot consume requested, exiting")
if options["oneshot"]:
return
use_polling: Final[bool] = settings.CONSUMER_POLLING != 0
poll_delay_ms: Final[int] = int(settings.CONSUMER_POLLING * 1000)
if use_polling:
logger.info(
f"Polling {directory} for changes every {settings.CONSUMER_POLLING}s ",
)
if settings.CONSUMER_POLLING == 0 and INotify:
self.handle_inotify(directory, recursive, options["testing"])
else:
logger.info(f"Using inotify to watch {directory} for changes")
read_timeout_ms = 0
if options["testing"]:
read_timeout_ms = self.testing_timeout_ms
logger.debug(f"Configuring initial timeout to {read_timeout_ms}ms")
inotify_debounce_secs: Final[float] = settings.CONSUMER_INOTIFY_DELAY
inotify_debounce_ms: Final[int] = int(inotify_debounce_secs * 1000)
filter = DefaultFilter(ignore_entity_patterns={r"__paperless_write_test_\d+__"})
notified_files: dict[Path, float] = {}
while not self.stop_flag.is_set():
try:
for changes in watch(
directory,
watch_filter=filter,
rust_timeout=read_timeout_ms,
yield_on_timeout=True,
force_polling=use_polling,
poll_delay_ms=poll_delay_ms,
recursive=is_recursive,
stop_event=self.stop_flag,
):
for change_type, path in changes:
path = Path(path).resolve()
logger.info(f"Got {change_type.name} for {path}")
match change_type:
case Change.added | Change.modified:
logger.info(
f"New event time for {path} at {monotonic()}",
)
notified_files[path] = monotonic()
case Change.deleted:
notified_files.pop(path, None)
logger.info("Checking for files that are ready")
# Check the files against the timeout
still_waiting = {}
# last_event_time is time of the last inotify event for this file
for filepath, last_event_time in notified_files.items():
# Current time - last time over the configured timeout
waited_long_enough = (
monotonic() - last_event_time
) > inotify_debounce_secs
# Also make sure the file exists still, some scanners might write a
# temporary file first
file_still_exists = filepath.exists() and filepath.is_file()
logger.info(
f"{filepath} - {waited_long_enough} - {file_still_exists}",
)
if waited_long_enough and file_still_exists:
logger.info(f"Consuming {filepath}")
_consume(filepath)
elif file_still_exists:
still_waiting[filepath] = last_event_time
# These files are still waiting to hit the timeout
notified_files = still_waiting
# Always exit the watch loop to reconfigure the timeout
break
if len(notified_files) > 0:
logger.info("Using inotify_debounce_ms")
read_timeout_ms = inotify_debounce_ms
elif is_testing:
logger.info("Using testing_timeout_ms")
read_timeout_ms = self.testing_timeout_ms
else:
logger.info("No files in waiting, configuring indefinite timeout")
read_timeout_ms = 0
logger.info(f"Configuring timeout to {read_timeout_ms}ms")
except KeyboardInterrupt:
self.stop_flag.set()
if INotify is None and settings.CONSUMER_POLLING == 0: # pragma: no cover
logger.warning("Using polling as INotify import failed")
self.handle_polling(directory, recursive, options["testing"])
logger.debug("Consumer exiting.")
def handle_polling(self, directory, recursive, is_testing: bool):
logger.info(f"Polling directory for changes: {directory}")
timeout = None
if is_testing:
timeout = self.testing_timeout_s
logger.debug(f"Configuring timeout to {timeout}s")
polling_interval = settings.CONSUMER_POLLING
if polling_interval == 0: # pragma: no cover
# Only happens if INotify failed to import
logger.warning("Using polling of 10s, consider setting this")
polling_interval = 10
with ThreadPoolExecutor(max_workers=4) as pool:
observer = PollingObserver(timeout=polling_interval)
observer.schedule(Handler(pool), directory, recursive=recursive)
observer.start()
try:
while observer.is_alive():
observer.join(timeout)
if self.stop_flag.is_set():
observer.stop()
except KeyboardInterrupt:
observer.stop()
observer.join()
def handle_inotify(self, directory, recursive, is_testing: bool):
logger.info(f"Using inotify to watch directory for changes: {directory}")
timeout_ms = None
if is_testing:
timeout_ms = self.testing_timeout_ms
logger.debug(f"Configuring timeout to {timeout_ms}ms")
inotify = INotify()
inotify_flags = flags.CLOSE_WRITE | flags.MOVED_TO | flags.MODIFY
if recursive:
descriptor = inotify.add_watch_recursive(directory, inotify_flags)
else:
descriptor = inotify.add_watch(directory, inotify_flags)
inotify_debounce_secs: Final[float] = settings.CONSUMER_INOTIFY_DELAY
inotify_debounce_ms: Final[int] = inotify_debounce_secs * 1000
finished = False
notified_files = {}
while not finished:
try:
for event in inotify.read(timeout=timeout_ms):
path = inotify.get_path(event.wd) if recursive else directory
filepath = os.path.join(path, event.name)
if flags.MODIFY in flags.from_mask(event.mask):
notified_files.pop(filepath, None)
else:
notified_files[filepath] = monotonic()
# Check the files against the timeout
still_waiting = {}
# last_event_time is time of the last inotify event for this file
for filepath, last_event_time in notified_files.items():
# Current time - last time over the configured timeout
waited_long_enough = (
monotonic() - last_event_time
) > inotify_debounce_secs
# Also make sure the file exists still, some scanners might write a
# temporary file first
file_still_exists = os.path.exists(filepath) and os.path.isfile(
filepath,
)
if waited_long_enough and file_still_exists:
_consume(filepath)
elif file_still_exists:
still_waiting[filepath] = last_event_time
# These files are still waiting to hit the timeout
notified_files = still_waiting
# If files are waiting, need to exit read() to check them
# Otherwise, go back to infinite sleep time, but only if not testing
if len(notified_files) > 0:
timeout_ms = inotify_debounce_ms
elif is_testing:
timeout_ms = self.testing_timeout_ms
else:
timeout_ms = None
if self.stop_flag.is_set():
logger.debug("Finishing because event is set")
finished = True
except KeyboardInterrupt:
logger.info("Received SIGINT, stopping inotify")
finished = True
inotify.rm_watch(descriptor)
inotify.close()

View File

@@ -16,7 +16,6 @@ from django.core.validators import DecimalValidator
from django.core.validators import MaxLengthValidator
from django.core.validators import RegexValidator
from django.core.validators import integer_validator
from django.utils import timezone
from django.utils.crypto import get_random_string
from django.utils.text import slugify
from django.utils.translation import gettext as _
@@ -647,7 +646,7 @@ class CustomFieldInstanceSerializer(serializers.ModelSerializer):
if custom_field.data_type == CustomField.FieldDataType.DOCUMENTLINK:
# prior to update so we can look for any docs that are going to be removed
self.reflect_doclinks(document, custom_field, validated_data["value"])
bulk_edit.reflect_doclinks(document, custom_field, validated_data["value"])
# Actually update or create the instance, providing the value
# to fill in the correct attribute based on the type
@@ -767,89 +766,6 @@ class CustomFieldInstanceSerializer(serializers.ModelSerializer):
return ret
def reflect_doclinks(
self,
document: Document,
field: CustomField,
target_doc_ids: list[int],
):
"""
Add or remove 'symmetrical' links to `document` on all `target_doc_ids`
"""
if target_doc_ids is None:
target_doc_ids = []
# Check if any documents are going to be removed from the current list of links and remove the symmetrical links
current_field_instance = CustomFieldInstance.objects.filter(
field=field,
document=document,
).first()
if (
current_field_instance is not None
and current_field_instance.value is not None
):
for doc_id in current_field_instance.value:
if doc_id not in target_doc_ids:
self.remove_doclink(document, field, doc_id)
# Create an instance if target doc doesn't have this field or append it to an existing one
existing_custom_field_instances = {
custom_field.document_id: custom_field
for custom_field in CustomFieldInstance.objects.filter(
field=field,
document_id__in=target_doc_ids,
)
}
custom_field_instances_to_create = []
custom_field_instances_to_update = []
for target_doc_id in target_doc_ids:
target_doc_field_instance = existing_custom_field_instances.get(
target_doc_id,
)
if target_doc_field_instance is None:
custom_field_instances_to_create.append(
CustomFieldInstance(
document_id=target_doc_id,
field=field,
value_document_ids=[document.id],
),
)
elif target_doc_field_instance.value is None:
target_doc_field_instance.value_document_ids = [document.id]
custom_field_instances_to_update.append(target_doc_field_instance)
elif document.id not in target_doc_field_instance.value:
target_doc_field_instance.value_document_ids.append(document.id)
custom_field_instances_to_update.append(target_doc_field_instance)
CustomFieldInstance.objects.bulk_create(custom_field_instances_to_create)
CustomFieldInstance.objects.bulk_update(
custom_field_instances_to_update,
["value_document_ids"],
)
Document.objects.filter(id__in=target_doc_ids).update(modified=timezone.now())
@staticmethod
def remove_doclink(
document: Document,
field: CustomField,
target_doc_id: int,
):
"""
Removes a 'symmetrical' link to `document` from the target document's existing custom field instance
"""
target_doc_field_instance = CustomFieldInstance.objects.filter(
document_id=target_doc_id,
field=field,
).first()
if (
target_doc_field_instance is not None
and document.id in target_doc_field_instance.value
):
target_doc_field_instance.value.remove(document.id)
target_doc_field_instance.save()
Document.objects.filter(id=target_doc_id).update(modified=timezone.now())
class Meta:
model = CustomFieldInstance
fields = [
@@ -951,7 +867,7 @@ class DocumentSerializer(
):
# Doc link field is being removed entirely
for doc_id in custom_field_instance.value:
CustomFieldInstanceSerializer.remove_doclink(
bulk_edit.remove_doclink(
instance,
custom_field_instance.field,
doc_id,

View File

@@ -353,7 +353,7 @@ def cleanup_document_deletion(sender, instance, **kwargs):
f"{filename} could not be deleted: {e}",
)
elif filename and not os.path.isfile(filename):
logger.warning(f"Expected {filename} to exist, but it did not")
logger.warn(f"Expected {filename} tp exist, but it did not")
delete_empty_directories(
os.path.dirname(instance.source_path),

View File

@@ -1,6 +1,7 @@
import datetime
import io
import json
import os
import shutil
import zipfile
@@ -14,10 +15,9 @@ from documents.models import Correspondent
from documents.models import Document
from documents.models import DocumentType
from documents.tests.utils import DirectoriesMixin
from documents.tests.utils import SampleDirMixin
class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
class TestBulkDownload(DirectoriesMixin, APITestCase):
ENDPOINT = "/api/documents/bulk_download/"
def setUp(self):
@@ -51,10 +51,22 @@ class TestBulkDownload(DirectoriesMixin, SampleDirMixin, APITestCase):
archive_checksum="D",
)
shutil.copy(self.SAMPLE_DIR / "simple.pdf", self.doc2.source_path)
shutil.copy(self.SAMPLE_DIR / "simple.png", self.doc2b.source_path)
shutil.copy(self.SAMPLE_DIR / "simple.jpg", self.doc3.source_path)
shutil.copy(self.SAMPLE_DIR / "test_with_bom.pdf", self.doc3.archive_path)
shutil.copy(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
self.doc2.source_path,
)
shutil.copy(
os.path.join(os.path.dirname(__file__), "samples", "simple.png"),
self.doc2b.source_path,
)
shutil.copy(
os.path.join(os.path.dirname(__file__), "samples", "simple.jpg"),
self.doc3.source_path,
)
shutil.copy(
os.path.join(os.path.dirname(__file__), "samples", "test_with_bom.pdf"),
self.doc3.archive_path,
)
def test_download_originals(self):
response = self.client.post(

View File

@@ -211,7 +211,7 @@ class TestBulkEditAPI(DirectoriesMixin, APITestCase):
def test_api_modify_tags_not_provided(self, m):
"""
GIVEN:
- API data to modify tags is missing modify_tags field
- API data to modify tags is missing remove_tags field
WHEN:
- API to edit tags is called
THEN:

View File

@@ -1,4 +1,5 @@
import datetime
import os
import shutil
import tempfile
import uuid
@@ -7,7 +8,6 @@ from binascii import hexlify
from datetime import date
from datetime import timedelta
from pathlib import Path
from typing import TYPE_CHECKING
from unittest import mock
import celery
@@ -171,18 +171,19 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
content = b"This is a test"
content_thumbnail = b"thumbnail content"
with Path(filename).open("wb") as f:
with open(filename, "wb") as f:
f.write(content)
doc = Document.objects.create(
title="none",
filename=Path(filename).name,
filename=os.path.basename(filename),
mime_type="application/pdf",
)
if TYPE_CHECKING:
assert isinstance(self.dirs.thumbnail_dir, Path), self.dirs.thumbnail_dir
with (self.dirs.thumbnail_dir / f"{doc.pk:07d}.webp").open("wb") as f:
with open(
os.path.join(self.dirs.thumbnail_dir, f"{doc.pk:07d}.webp"),
"wb",
) as f:
f.write(content_thumbnail)
response = self.client.get(f"/api/documents/{doc.pk}/download/")
@@ -216,7 +217,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
content = b"This is a test"
content_thumbnail = b"thumbnail content"
with Path(filename).open("wb") as f:
with open(filename, "wb") as f:
f.write(content)
user1 = User.objects.create_user(username="test1")
@@ -228,12 +229,15 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
doc = Document.objects.create(
title="none",
filename=Path(filename).name,
filename=os.path.basename(filename),
mime_type="application/pdf",
owner=user1,
)
with (Path(self.dirs.thumbnail_dir) / f"{doc.pk:07d}.webp").open("wb") as f:
with open(
os.path.join(self.dirs.thumbnail_dir, f"{doc.pk:07d}.webp"),
"wb",
) as f:
f.write(content_thumbnail)
response = self.client.get(f"/api/documents/{doc.pk}/download/")
@@ -268,10 +272,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
mime_type="application/pdf",
)
with Path(doc.source_path).open("wb") as f:
with open(doc.source_path, "wb") as f:
f.write(content)
with Path(doc.archive_path).open("wb") as f:
with open(doc.archive_path, "wb") as f:
f.write(content_archive)
response = self.client.get(f"/api/documents/{doc.pk}/download/")
@@ -301,7 +305,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
def test_document_actions_not_existing_file(self):
doc = Document.objects.create(
title="none",
filename=Path("asd").name,
filename=os.path.basename("asd"),
mime_type="application/pdf",
)
@@ -1022,7 +1026,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f},
@@ -1054,7 +1061,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{
@@ -1085,7 +1095,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"documenst": f},
@@ -1098,7 +1111,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.zip").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.zip"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f},
@@ -1111,7 +1127,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "title": "my custom title"},
@@ -1133,7 +1152,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
c = Correspondent.objects.create(name="test-corres")
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "correspondent": c.id},
@@ -1154,7 +1176,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "correspondent": 3456},
@@ -1169,7 +1194,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
dt = DocumentType.objects.create(name="invoice")
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "document_type": dt.id},
@@ -1190,7 +1218,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "document_type": 34578},
@@ -1205,7 +1236,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
)
sp = StoragePath.objects.create(name="invoices")
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "storage_path": sp.id},
@@ -1226,7 +1260,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "storage_path": 34578},
@@ -1242,7 +1279,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
t1 = Tag.objects.create(name="tag1")
t2 = Tag.objects.create(name="tag2")
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "tags": [t2.id, t1.id]},
@@ -1265,7 +1305,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
t1 = Tag.objects.create(name="tag1")
t2 = Tag.objects.create(name="tag2")
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "tags": [t2.id, t1.id, 734563]},
@@ -1289,7 +1332,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
0,
tzinfo=zoneinfo.ZoneInfo("America/Los_Angeles"),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "created": created},
@@ -1307,7 +1353,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f, "archive_serial_number": 500},
@@ -1336,7 +1385,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
data_type=CustomField.FieldDataType.STRING,
)
with (Path(__file__).parent / "samples" / "simple.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{
@@ -1365,7 +1417,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
id=str(uuid.uuid4()),
)
with (Path(__file__).parent / "samples" / "invalid_pdf.pdf").open("rb") as f:
with open(
os.path.join(os.path.dirname(__file__), "samples", "invalid_pdf.pdf"),
"rb",
) as f:
response = self.client.post(
"/api/documents/post_document/",
{"document": f},
@@ -1382,14 +1437,14 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
archive_filename="archive.pdf",
)
source_file: Path = (
Path(__file__).parent
/ "samples"
/ "documents"
/ "thumbnails"
/ "0000001.webp"
source_file = os.path.join(
os.path.dirname(__file__),
"samples",
"documents",
"thumbnails",
"0000001.webp",
)
archive_file: Path = Path(__file__).parent / "samples" / "simple.pdf"
archive_file = os.path.join(os.path.dirname(__file__), "samples", "simple.pdf")
shutil.copy(source_file, doc.source_path)
shutil.copy(archive_file, doc.archive_path)
@@ -1405,8 +1460,8 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
self.assertGreater(len(meta["archive_metadata"]), 0)
self.assertEqual(meta["media_filename"], "file.pdf")
self.assertEqual(meta["archive_media_filename"], "archive.pdf")
self.assertEqual(meta["original_size"], Path(source_file).stat().st_size)
self.assertEqual(meta["archive_size"], Path(archive_file).stat().st_size)
self.assertEqual(meta["original_size"], os.stat(source_file).st_size)
self.assertEqual(meta["archive_size"], os.stat(archive_file).st_size)
response = self.client.get(f"/api/documents/{doc.pk}/metadata/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
@@ -1422,7 +1477,10 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
mime_type="application/pdf",
)
shutil.copy(Path(__file__).parent / "samples" / "simple.pdf", doc.source_path)
shutil.copy(
os.path.join(os.path.dirname(__file__), "samples", "simple.pdf"),
doc.source_path,
)
response = self.client.get(f"/api/documents/{doc.pk}/metadata/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
@@ -1881,9 +1939,9 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
def test_get_logs(self):
log_data = "test\ntest2\n"
with (Path(settings.LOGGING_DIR) / "mail.log").open("w") as f:
with open(os.path.join(settings.LOGGING_DIR, "mail.log"), "w") as f:
f.write(log_data)
with (Path(settings.LOGGING_DIR) / "paperless.log").open("w") as f:
with open(os.path.join(settings.LOGGING_DIR, "paperless.log"), "w") as f:
f.write(log_data)
response = self.client.get("/api/logs/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
@@ -1891,7 +1949,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
def test_get_logs_only_when_exist(self):
log_data = "test\ntest2\n"
with (Path(settings.LOGGING_DIR) / "paperless.log").open("w") as f:
with open(os.path.join(settings.LOGGING_DIR, "paperless.log"), "w") as f:
f.write(log_data)
response = self.client.get("/api/logs/")
self.assertEqual(response.status_code, status.HTTP_200_OK)
@@ -1908,7 +1966,7 @@ class TestDocumentApi(DirectoriesMixin, DocumentConsumeDelayMixin, APITestCase):
def test_get_log(self):
log_data = "test\ntest2\n"
with (Path(settings.LOGGING_DIR) / "paperless.log").open("w") as f:
with open(os.path.join(settings.LOGGING_DIR, "paperless.log"), "w") as f:
f.write(log_data)
response = self.client.get("/api/logs/paperless/")
self.assertEqual(response.status_code, status.HTTP_200_OK)

View File

@@ -268,7 +268,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
cf3 = CustomField.objects.create(
name="cf3",
data_type=CustomField.FieldDataType.STRING,
data_type=CustomField.FieldDataType.DOCUMENTLINK,
)
CustomFieldInstance.objects.create(
document=self.doc2,
@@ -284,7 +284,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
bulk_edit.modify_custom_fields(
[self.doc1.id, self.doc2.id],
add_custom_fields={cf2.id: None, cf3.id: "value"},
add_custom_fields={cf2.id: None, cf3.id: [self.doc3.id]},
remove_custom_fields=[cf.id],
)
@@ -301,7 +301,7 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.assertEqual(
self.doc1.custom_fields.get(field=cf3).value,
"value",
[self.doc3.id],
)
self.assertEqual(
self.doc2.custom_fields.count(),
@@ -309,13 +309,33 @@ class TestBulkEdit(DirectoriesMixin, TestCase):
)
self.assertEqual(
self.doc2.custom_fields.get(field=cf3).value,
"value",
[self.doc3.id],
)
# assert reflect document link
self.assertEqual(
self.doc3.custom_fields.first().value,
[self.doc2.id, self.doc1.id],
)
self.async_task.assert_called_once()
args, kwargs = self.async_task.call_args
self.assertCountEqual(kwargs["document_ids"], [self.doc1.id, self.doc2.id])
# removal of document link cf, should also remove symmetric link
bulk_edit.modify_custom_fields(
[self.doc3.id],
add_custom_fields={},
remove_custom_fields=[cf3.id],
)
self.assertNotIn(
self.doc3.id,
self.doc1.custom_fields.filter(field=cf3).first().value,
)
self.assertNotIn(
self.doc3.id,
self.doc2.custom_fields.filter(field=cf3).first().value,
)
def test_delete(self):
self.assertEqual(Document.objects.count(), 5)
bulk_edit.delete([self.doc1.id, self.doc2.id])

View File

@@ -1,6 +1,6 @@
from typing import Final
__version__: Final[tuple[int, int, int]] = (2, 14, 6)
__version__: Final[tuple[int, int, int]] = (2, 14, 7)
# Version string like X.Y.Z
__full_version_str__: Final[str] = ".".join(map(str, __version__))
# Version string like X.Y