Compare commits


31 Commits

Author SHA1 Message Date
shamoon
4f941d3190 Update test_api_tasks.py 2026-01-18 19:30:35 -08:00
shamoon
7f658dc93a Fix/refactor 2026-01-18 16:46:40 -08:00
shamoon
5944c21be5 Backend coverage 2026-01-18 16:27:52 -08:00
shamoon
12ac170a67 Refactor serializer 2026-01-18 16:27:52 -08:00
shamoon
31ba831a9a Frontend coverage 2026-01-18 16:27:52 -08:00
shamoon
47ddb266dd Some random cleanups 2026-01-18 16:27:52 -08:00
shamoon
681ae581bd Fix schema 2026-01-18 16:27:52 -08:00
shamoon
aa4b685a07 Nice, UX for doc in trash 2026-01-18 16:27:52 -08:00
shamoon
cd1070bd3f Make these anchors 2026-01-18 16:27:52 -08:00
shamoon
ef661ae101 Treat CONSUMER_DELETE_DUPLICATES as a hard no 2026-01-18 16:27:52 -08:00
shamoon
b5413525c4 Ok lets make duplicates a tab, nice 2026-01-18 16:27:52 -08:00
shamoon
efbd0c1bfa Drop DuplicateDocument 2026-01-18 16:27:52 -08:00
shamoon
1e595a5aab Core elements, migration, consumer modifications 2026-01-18 16:27:52 -08:00
shamoon
62248f5702 Chore: use consts in doc details 2026-01-18 16:04:51 -08:00
shamoon
fa6a0a81f4 Chore: reverse migration order (#11813) 2026-01-18 11:20:54 -08:00
shamoon
b2541f3e8c Fix: ensure horizontal scroll for long tag names in list, wrap tags without parent (#11811) 2026-01-18 08:21:20 -08:00
dependabot[bot]
f8ab81cef7 Chore(deps): Bump the utilities-patch group across 1 directory with 7 updates (#11793)
Bumps the utilities-patch group with 7 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [channels](https://github.com/django/channels) | `4.3.1` | `4.3.2` |
| [django-soft-delete](https://github.com/san4ezy/django_softdelete) | `1.0.21` | `1.0.22` |
| [django-treenode](https://github.com/fabiocaccamo/django-treenode) | `0.23.2` | `0.23.3` |
| [imap-tools](https://github.com/ikvk/imap_tools) | `1.11.0` | `1.11.1` |
| [python-gnupg](https://github.com/vsajip/python-gnupg) | `0.5.5` | `0.5.6` |
| [mkdocs-material](https://github.com/squidfunk/mkdocs-material) | `9.7.0` | `9.7.1` |
| [ruff](https://github.com/astral-sh/ruff) | `0.14.5` | `0.14.13` |



Updates `channels` from 4.3.1 to 4.3.2
- [Changelog](https://github.com/django/channels/blob/main/CHANGELOG.txt)
- [Commits](https://github.com/django/channels/compare/4.3.1...4.3.2)

Updates `django-soft-delete` from 1.0.21 to 1.0.22
- [Changelog](https://github.com/san4ezy/django_softdelete/blob/master/CHANGELOG.md)
- [Commits](https://github.com/san4ezy/django_softdelete/commits)

Updates `django-treenode` from 0.23.2 to 0.23.3
- [Release notes](https://github.com/fabiocaccamo/django-treenode/releases)
- [Changelog](https://github.com/fabiocaccamo/django-treenode/blob/main/CHANGELOG.md)
- [Commits](https://github.com/fabiocaccamo/django-treenode/compare/0.23.2...0.23.3)

Updates `imap-tools` from 1.11.0 to 1.11.1
- [Release notes](https://github.com/ikvk/imap_tools/releases)
- [Changelog](https://github.com/ikvk/imap_tools/blob/master/docs/release_notes.rst)
- [Commits](https://github.com/ikvk/imap_tools/compare/v1.11.0...v1.11.1)

Updates `python-gnupg` from 0.5.5 to 0.5.6
- [Release notes](https://github.com/vsajip/python-gnupg/releases)
- [Changelog](https://github.com/vsajip/python-gnupg/blob/master/release)
- [Commits](https://github.com/vsajip/python-gnupg/compare/0.5.5...0.5.6)

Updates `mkdocs-material` from 9.7.0 to 9.7.1
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.7.0...9.7.1)

Updates `ruff` from 0.14.5 to 0.14.13
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.14.5...0.14.13)

---
updated-dependencies:
- dependency-name: channels
  dependency-version: 4.3.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: django-soft-delete
  dependency-version: 1.0.22
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: django-treenode
  dependency-version: 0.23.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: imap-tools
  dependency-version: 1.11.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: python-gnupg
  dependency-version: 0.5.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: mkdocs-material
  dependency-version: 9.7.1
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
- dependency-name: ruff
  dependency-version: 0.14.13
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: utilities-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 15:14:01 -08:00
dependabot[bot]
e9f7993ba5 Chore(deps): Bump the utilities-minor group across 1 directory with 10 updates (#11799)
* Chore(deps): Bump the utilities-minor group across 1 directory with 10 updates

Bumps the utilities-minor group with 10 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [django-auditlog](https://github.com/jazzband/django-auditlog) | `3.3.0` | `3.4.1` |
| [drf-spectacular](https://github.com/tfranzel/drf-spectacular) | `0.28.0` | `0.29.0` |
| [faiss-cpu](https://github.com/kyamagu/faiss-wheels) | `1.10.0` | `1.13.2` |
| [gotenberg-client](https://github.com/stumpylog/gotenberg-client) | `0.12.0` | `0.13.1` |
| [ocrmypdf](https://github.com/ocrmypdf/OCRmyPDF) | `16.12.0` | `16.13.0` |
| [torch](https://github.com/pytorch/pytorch) | `2.7.1` | `2.9.1` |
| [psycopg-pool](https://github.com/psycopg/psycopg) | `3.2.7` | `3.3.0` |
| [pre-commit](https://github.com/pre-commit/pre-commit) | `4.4.0` | `4.5.1` |
| [celery-types](https://github.com/sbdchd/celery-types) | `0.23.0` | `0.24.0` |
| [mypy](https://github.com/python/mypy) | `1.18.2` | `1.19.1` |

Updates `django-auditlog` from 3.3.0 to 3.4.1
- [Release notes](https://github.com/jazzband/django-auditlog/releases)
- [Changelog](https://github.com/jazzband/django-auditlog/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jazzband/django-auditlog/compare/v3.3.0...v3.4.1)

Updates `drf-spectacular` from 0.28.0 to 0.29.0
- [Release notes](https://github.com/tfranzel/drf-spectacular/releases)
- [Changelog](https://github.com/tfranzel/drf-spectacular/blob/master/CHANGELOG.rst)
- [Commits](https://github.com/tfranzel/drf-spectacular/compare/0.28.0...0.29.0)

Updates `faiss-cpu` from 1.10.0 to 1.13.2
- [Release notes](https://github.com/kyamagu/faiss-wheels/releases)
- [Commits](https://github.com/kyamagu/faiss-wheels/compare/v1.10.0...v1.13.2)

Updates `gotenberg-client` from 0.12.0 to 0.13.1
- [Release notes](https://github.com/stumpylog/gotenberg-client/releases)
- [Changelog](https://github.com/stumpylog/gotenberg-client/blob/main/CHANGELOG.md)
- [Commits](https://github.com/stumpylog/gotenberg-client/compare/0.12.0...0.13.1)

Updates `ocrmypdf` from 16.12.0 to 16.13.0
- [Release notes](https://github.com/ocrmypdf/OCRmyPDF/releases)
- [Changelog](https://github.com/ocrmypdf/OCRmyPDF/blob/main/docs/release_notes.md)
- [Commits](https://github.com/ocrmypdf/OCRmyPDF/compare/v16.12.0...v16.13.0)

Updates `torch` from 2.7.1 to 2.9.1
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.7.1...v2.9.1)

Updates `psycopg-pool` from 3.2.7 to 3.3.0
- [Changelog](https://github.com/psycopg/psycopg/blob/master/docs/news.rst)
- [Commits](https://github.com/psycopg/psycopg/compare/3.2.7...3.3.0)

Updates `pre-commit` from 4.4.0 to 4.5.1
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v4.4.0...v4.5.1)

Updates `celery-types` from 0.23.0 to 0.24.0
- [Commits](https://github.com/sbdchd/celery-types/commits)

Updates `mypy` from 1.18.2 to 1.19.1
- [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/python/mypy/compare/v1.18.2...v1.19.1)

---
updated-dependencies:
- dependency-name: django-auditlog
  dependency-version: 3.4.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: drf-spectacular
  dependency-version: 0.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: faiss-cpu
  dependency-version: 1.13.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: gotenberg-client
  dependency-version: 0.13.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: ocrmypdf
  dependency-version: 16.13.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: torch
  dependency-version: 2.9.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: psycopg-pool
  dependency-version: 3.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: pre-commit
  dependency-version: 4.5.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: celery-types
  dependency-version: 0.24.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
- dependency-name: mypy
  dependency-version: 1.19.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: utilities-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Apply suggestion from @shamoon

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-01-16 14:40:42 -08:00
dependabot[bot]
3ea5e05137 Chore(deps): Bump pyasn1 from 0.6.1 to 0.6.2 (#11801)
Bumps [pyasn1](https://github.com/pyasn1/pyasn1) from 0.6.1 to 0.6.2.
- [Release notes](https://github.com/pyasn1/pyasn1/releases)
- [Changelog](https://github.com/pyasn1/pyasn1/blob/main/CHANGES.rst)
- [Commits](https://github.com/pyasn1/pyasn1/compare/v0.6.1...v0.6.2)

---
updated-dependencies:
- dependency-name: pyasn1
  dependency-version: 0.6.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 14:06:20 -08:00
dependabot[bot]
56fddf1e58 Chore(deps): Bump torch from 2.7.1 to 2.8.0 (#11800)
Bumps [torch](https://github.com/pytorch/pytorch) from 2.7.1 to 2.8.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v2.7.1...v2.8.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.8.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 14:03:04 -08:00
dependabot[bot]
d447a9fb32 docker(deps): Bump astral-sh/uv (#11762)
Bumps [astral-sh/uv](https://github.com/astral-sh/uv) from 0.9.15-python3.12-trixie-slim to 0.9.24-python3.12-trixie-slim.
- [Release notes](https://github.com/astral-sh/uv/releases)
- [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/uv/compare/0.9.15...0.9.24)

---
updated-dependencies:
- dependency-name: astral-sh/uv
  dependency-version: 0.9.24-python3.12-trixie-slim
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 13:43:43 -08:00
dependabot[bot]
155d69b211 Chore(deps): Bump brotli from 1.1.0 to 1.2.0 (#11796)
Bumps [brotli](https://github.com/google/brotli) from 1.1.0 to 1.2.0.
- [Release notes](https://github.com/google/brotli/releases)
- [Changelog](https://github.com/google/brotli/blob/master/CHANGELOG.md)
- [Commits](https://github.com/google/brotli/compare/go/cbrotli/v1.1.0...v1.2.0)

---
updated-dependencies:
- dependency-name: brotli
  dependency-version: 1.2.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 11:16:34 -08:00
dependabot[bot]
4a7f9fa984 Chore(deps): Bump transformers from 4.51.3 to 4.53.0 (#11797)
Bumps [transformers](https://github.com/huggingface/transformers) from 4.51.3 to 4.53.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.51.3...v4.53.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 4.53.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 17:25:52 +00:00
dependabot[bot]
c471c201ee Chore(deps): Bump django from 5.2.7 to 5.2.9 (#11794)
Bumps [django](https://github.com/django/django) from 5.2.7 to 5.2.9.
- [Commits](https://github.com/django/django/compare/5.2.7...5.2.9)

---
updated-dependencies:
- dependency-name: django
  dependency-version: 5.2.9
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 09:11:04 -08:00
dependabot[bot]
a9548afb42 Chore(deps): Bump the ai-group (#11798)
* Chore(deps): Bump llama-index-core from 0.12.33.post1 to 0.13.0

Bumps [llama-index-core](https://github.com/run-llama/llama_index) from 0.12.33.post1 to 0.13.0.
- [Release notes](https://github.com/run-llama/llama_index/releases)
- [Changelog](https://github.com/run-llama/llama_index/blob/main/CHANGELOG.md)
- [Commits](https://github.com/run-llama/llama_index/commits/v0.13.0)

---
updated-dependencies:
- dependency-name: llama-index-core
  dependency-version: 0.13.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update llama-index to latest versions

* Fix embedding mock

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2026-01-16 16:31:47 +00:00
Trenton H
939b2f7553 Chore: Fixes Docker image pushing for every PR we get (#11777) 2026-01-16 07:35:49 -08:00
dependabot[bot]
8b58718fff Chore(deps): Bump marshmallow from 3.26.1 to 3.26.2 (#11790)
Bumps [marshmallow](https://github.com/marshmallow-code/marshmallow) from 3.26.1 to 3.26.2.
- [Changelog](https://github.com/marshmallow-code/marshmallow/blob/dev/CHANGELOG.rst)
- [Commits](https://github.com/marshmallow-code/marshmallow/compare/3.26.1...3.26.2)

---
updated-dependencies:
- dependency-name: marshmallow
  dependency-version: 3.26.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 15:25:09 +00:00
dependabot[bot]
ad78c436c0 Chore(deps): Bump uv from 0.9.3 to 0.9.6 (#11795)
Bumps [uv](https://github.com/astral-sh/uv) from 0.9.3 to 0.9.6.
- [Release notes](https://github.com/astral-sh/uv/releases)
- [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/uv/compare/0.9.3...0.9.6)

---
updated-dependencies:
- dependency-name: uv
  dependency-version: 0.9.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-16 07:14:59 -08:00
dependabot[bot]
c6697cd82b Chore(deps): Bump aiohttp from 3.11.18 to 3.13.3 (#11789)
---
updated-dependencies:
- dependency-name: aiohttp
  dependency-version: 3.13.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-15 21:26:01 -08:00
dependabot[bot]
0689c8ad3a Chore(deps): Bump urllib3 from 2.5.0 to 2.6.3 (#11792)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.3.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.3)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-15 20:15:52 -08:00
dependabot[bot]
825e9ca14c Chore(deps): Bump virtualenv from 20.34.0 to 20.36.1 (#11774)
Bumps [virtualenv](https://github.com/pypa/virtualenv) from 20.34.0 to 20.36.1.
- [Release notes](https://github.com/pypa/virtualenv/releases)
- [Changelog](https://github.com/pypa/virtualenv/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pypa/virtualenv/compare/20.34.0...20.36.1)

---
updated-dependencies:
- dependency-name: virtualenv
  dependency-version: 20.36.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-15 20:12:12 -08:00
23 changed files with 893 additions and 483 deletions

View File

@@ -35,7 +35,7 @@ jobs:
contents: read
packages: write
outputs:
can-push: ${{ steps.check-push.outputs.can-push }}
should-push: ${{ steps.check-push.outputs.should-push }}
push-external: ${{ steps.check-push.outputs.push-external }}
repository: ${{ steps.repo.outputs.name }}
ref-name: ${{ steps.ref.outputs.name }}
@@ -59,16 +59,28 @@ jobs:
env:
REF_NAME: ${{ steps.ref.outputs.name }}
run: |
# can-push: Can we push to GHCR?
# True for: pushes, or PRs from the same repo (not forks)
can_push=${{ github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository }}
echo "can-push=${can_push}"
echo "can-push=${can_push}" >> $GITHUB_OUTPUT
# should-push: Should we push to GHCR?
# True for:
# 1. Pushes (tags/dev/beta) - filtered via the workflow triggers
# 2. Internal PRs where the branch name starts with 'feature-' - filtered here when a PR is synced
should_push="false"
if [[ "${{ github.event_name }}" == "push" ]]; then
should_push="true"
elif [[ "${{ github.event_name }}" == "pull_request" && "${{ github.event.pull_request.head.repo.full_name }}" == "${{ github.repository }}" ]]; then
if [[ "${REF_NAME}" == feature-* || "${REF_NAME}" == fix-* ]]; then
should_push="true"
fi
fi
echo "should-push=${should_push}"
echo "should-push=${should_push}" >> $GITHUB_OUTPUT
# push-external: Should we also push to Docker Hub and Quay.io?
# Only for main repo on dev/beta branches or version tags
push_external="false"
if [[ "${can_push}" == "true" && "${{ github.repository_owner }}" == "paperless-ngx" ]]; then
if [[ "${should_push}" == "true" && "${{ github.repository_owner }}" == "paperless-ngx" ]]; then
case "${REF_NAME}" in
dev|beta)
push_external="true"
@@ -125,20 +137,20 @@ jobs:
labels: ${{ steps.docker-meta.outputs.labels }}
build-args: |
PNGX_TAG_VERSION=${{ steps.docker-meta.outputs.version }}
outputs: type=image,name=${{ env.REGISTRY }}/${{ steps.repo.outputs.name }},push-by-digest=true,name-canonical=true,push=${{ steps.check-push.outputs.can-push }}
outputs: type=image,name=${{ env.REGISTRY }}/${{ steps.repo.outputs.name }},push-by-digest=true,name-canonical=true,push=${{ steps.check-push.outputs.should-push }}
cache-from: |
type=registry,ref=${{ env.REGISTRY }}/${{ steps.repo.outputs.name }}/cache/app:${{ steps.ref.outputs.cache-ref }}-${{ matrix.arch }}
type=registry,ref=${{ env.REGISTRY }}/${{ steps.repo.outputs.name }}/cache/app:dev-${{ matrix.arch }}
cache-to: ${{ steps.check-push.outputs.can-push == 'true' && format('type=registry,mode=max,ref={0}/{1}/cache/app:{2}-{3}', env.REGISTRY, steps.repo.outputs.name, steps.ref.outputs.cache-ref, matrix.arch) || '' }}
cache-to: ${{ steps.check-push.outputs.should-push == 'true' && format('type=registry,mode=max,ref={0}/{1}/cache/app:{2}-{3}', env.REGISTRY, steps.repo.outputs.name, steps.ref.outputs.cache-ref, matrix.arch) || '' }}
- name: Export digest
if: steps.check-push.outputs.can-push == 'true'
if: steps.check-push.outputs.should-push == 'true'
run: |
mkdir -p /tmp/digests
digest="${{ steps.build.outputs.digest }}"
echo "digest=${digest}"
touch "/tmp/digests/${digest#sha256:}"
- name: Upload digest
if: steps.check-push.outputs.can-push == 'true'
if: steps.check-push.outputs.should-push == 'true'
uses: actions/upload-artifact@v6.0.0
with:
name: digests-${{ matrix.arch }}
@@ -149,7 +161,7 @@ jobs:
name: Merge and Push Manifest
runs-on: ubuntu-24.04
needs: build-arch
if: needs.build-arch.outputs.can-push == 'true'
if: needs.build-arch.outputs.should-push == 'true'
permissions:
contents: read
packages: write
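
For orientation, a hedged Python sketch of the push gating this workflow change introduces. The actual logic is the bash `check-push` step shown above; the version-tag pattern is an assumption here, since the `case` statement is truncated in this excerpt.

```python
def compute_push_flags(event_name: str, head_repo: str, repo: str,
                       repo_owner: str, ref_name: str) -> dict[str, bool]:
    """Hypothetical model of the check-push step after this change."""
    # should-push: branch/tag pushes always push to GHCR; internal (non-fork)
    # PRs push only when the branch is named feature-* or fix-*.
    should_push = event_name == "push" or (
        event_name == "pull_request"
        and head_repo == repo
        and (ref_name.startswith("feature-") or ref_name.startswith("fix-"))
    )
    # push-external: additionally push to Docker Hub / Quay.io, restricted to
    # the paperless-ngx org on dev/beta branches or version tags (tag pattern
    # assumed; not visible in the excerpt above).
    push_external = should_push and repo_owner == "paperless-ngx" and (
        ref_name in ("dev", "beta") or ref_name.startswith("v")
    )
    return {"should-push": should_push, "push-external": push_external}


# Example: an internal PR on a fix-* branch now pushes to GHCR but not externally.
flags = compute_push_flags(
    "pull_request",
    "paperless-ngx/paperless-ngx",
    "paperless-ngx/paperless-ngx",
    "paperless-ngx",
    "fix-docker-push",
)
assert flags == {"should-push": True, "push-external": False}
```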

View File

@@ -30,7 +30,7 @@ RUN set -eux \
# Purpose: Installs s6-overlay and rootfs
# Comments:
# - Don't leave anything extra in here either
FROM ghcr.io/astral-sh/uv:0.9.15-python3.12-trixie-slim AS s6-overlay-base
FROM ghcr.io/astral-sh/uv:0.9.26-python3.12-trixie-slim AS s6-overlay-base
WORKDIR /usr/src/s6

View File

@@ -1146,8 +1146,9 @@ via the consumption directory, you can disable the consumer to save resources.
#### [`PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool>`](#PAPERLESS_CONSUMER_DELETE_DUPLICATES) {#PAPERLESS_CONSUMER_DELETE_DUPLICATES}
: When the consumer detects a duplicate document, it will not touch
the original document. This default behavior can be changed here.
: As of version 3.0 Paperless-ngx allows duplicate documents to be consumed by default, _except_ when
this setting is enabled. When enabled, Paperless will check if a document with the same hash already
exists in the system and delete the duplicate file from the consumption directory without consuming it.
Defaults to false.
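
A minimal sketch of the documented behavior, assuming an MD5 content hash like the 32-character checksum stored on documents. This is an illustration only, not the project's consumer code (the real preflight logic appears in a later diff).

```python
import hashlib
from pathlib import Path


def handle_incoming_file(path: Path, known_checksums: set[str],
                         delete_duplicates: bool) -> str:
    """Illustration of what PAPERLESS_CONSUMER_DELETE_DUPLICATES changes."""
    checksum = hashlib.md5(path.read_bytes()).hexdigest()
    if checksum in known_checksums:
        if delete_duplicates:  # PAPERLESS_CONSUMER_DELETE_DUPLICATES=true
            path.unlink()      # drop the file from the consumption dir, do not consume
            return "deleted duplicate"
        return "consumed (duplicates allowed by default as of 3.0)"
    return "consumed"
```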

View File

@@ -28,7 +28,7 @@ dependencies = [
# Only patch versions are guaranteed to not introduce breaking changes.
"django~=5.2.5",
"django-allauth[mfa,socialaccount]~=65.12.1",
"django-auditlog~=3.3.0",
"django-auditlog~=3.4.1",
"django-cachalot~=2.8.0",
"django-celery-results~=2.6.0",
"django-compression-middleware~=0.5.0",
@@ -47,20 +47,20 @@ dependencies = [
"faiss-cpu>=1.10",
"filelock~=3.20.0",
"flower~=2.0.1",
"gotenberg-client~=0.12.0",
"gotenberg-client~=0.13.1",
"httpx-oauth~=0.16",
"imap-tools~=1.11.0",
"inotifyrecursive~=0.3",
"jinja2~=3.1.5",
"langdetect~=1.0.9",
"llama-index-core>=0.12.33.post1",
"llama-index-embeddings-huggingface>=0.5.3",
"llama-index-embeddings-openai>=0.3.1",
"llama-index-llms-ollama>=0.5.4",
"llama-index-llms-openai>=0.3.38",
"llama-index-vector-stores-faiss>=0.3",
"llama-index-core>=0.14.12",
"llama-index-embeddings-huggingface>=0.6.1",
"llama-index-embeddings-openai>=0.5.1",
"llama-index-llms-ollama>=0.9.1",
"llama-index-llms-openai>=0.6.13",
"llama-index-vector-stores-faiss>=0.5.2",
"nltk~=3.9.1",
"ocrmypdf~=16.12.0",
"ocrmypdf~=16.13.0",
"openai>=1.76",
"pathvalidate~=3.3.1",
"pdf2image~=1.17.0",
@@ -77,7 +77,7 @@ dependencies = [
"sentence-transformers>=4.1",
"setproctitle~=1.3.4",
"tika-client~=0.10.0",
"torch~=2.7.0",
"torch~=2.9.1",
"tqdm~=4.67.1",
"watchdog~=6.0",
"whitenoise~=6.9",
@@ -92,7 +92,7 @@ optional-dependencies.postgres = [
"psycopg[c,pool]==3.2.12",
# Direct dependency for proper resolution of the pre-built wheels
"psycopg-c==3.2.12",
"psycopg-pool==3.2.7",
"psycopg-pool==3.3",
]
optional-dependencies.webserver = [
"granian[uvloop]~=2.5.1",
@@ -127,7 +127,7 @@ testing = [
]
lint = [
"pre-commit~=4.4.0",
"pre-commit~=4.5.1",
"pre-commit-uv~=4.2.0",
"ruff~=0.14.0",
]

View File

@@ -97,6 +97,12 @@
<br/><em>(<ng-container i18n>click for full output</ng-container>)</em>
}
</ng-template>
@if (task.duplicate_documents?.length > 0) {
<div class="small text-warning-emphasis d-flex align-items-center gap-1">
<i-bs class="lh-1" width="1em" height="1em" name="exclamation-triangle"></i-bs>
<span i18n>Duplicate(s) detected</span>
</div>
}
</td>
}
<td class="d-lg-none">

View File

@@ -28,7 +28,7 @@
</button>
</ng-template>
<ng-template ng-option-tmp let-item="item" let-index="index" let-search="searchTerm">
<div class="tag-option-row d-flex align-items-center">
<div class="tag-option-row d-flex align-items-center" [class.w-auto]="!getTag(item.id)?.parent">
@if (item.id && tags) {
@if (getTag(item.id)?.parent) {
<i-bs name="list-nested" class="me-1"></i-bs>

View File

@@ -23,7 +23,7 @@
// Dropdown hierarchy reveal for ng-select options
::ng-deep .ng-dropdown-panel .ng-option {
overflow-x: scroll;
overflow-x: scroll !important;
.tag-option-row {
font-size: 1rem;

View File

@@ -370,6 +370,37 @@
</ng-template>
</li>
}
@if (document?.duplicate_documents?.length) {
<li [ngbNavItem]="DocumentDetailNavIDs.Duplicates">
<a class="text-nowrap" ngbNavLink i18n>
Duplicates
<span class="badge text-bg-secondary ms-1">{{ document.duplicate_documents.length }}</span>
</a>
<ng-template ngbNavContent>
<div class="d-flex flex-column gap-2">
<div class="fst-italic" i18n>Duplicate documents detected:</div>
<div class="list-group">
@for (duplicate of document.duplicate_documents; track duplicate.id) {
<a
class="list-group-item list-group-item-action d-flex justify-content-between align-items-center"
[routerLink]="['/documents', duplicate.id, 'details']"
[class.disabled]="duplicate.deleted_at"
>
<span class="d-flex align-items-center gap-2">
<span>{{ duplicate.title || ('#' + duplicate.id) }}</span>
@if (duplicate.deleted_at) {
<span class="badge text-bg-secondary" i18n>In trash</span>
}
</span>
<span class="text-secondary">#{{ duplicate.id }}</span>
</a>
}
</div>
</div>
</ng-template>
</li>
}
</ul>
<div [ngbNavOutlet]="nav" class="mt-3"></div>

View File

@@ -301,16 +301,16 @@ describe('DocumentDetailComponent', () => {
.spyOn(openDocumentsService, 'openDocument')
.mockReturnValueOnce(of(true))
fixture.detectChanges()
expect(component.activeNavID).toEqual(5) // DocumentDetailNavIDs.Notes
expect(component.activeNavID).toEqual(component.DocumentDetailNavIDs.Notes)
})
it('should change url on tab switch', () => {
initNormally()
const navigateSpy = jest.spyOn(router, 'navigate')
component.nav.select(5)
component.nav.select(component.DocumentDetailNavIDs.Notes)
component.nav.navChange.next({
activeId: 1,
nextId: 5,
nextId: component.DocumentDetailNavIDs.Notes,
preventDefault: () => {},
})
fixture.detectChanges()
@@ -352,6 +352,18 @@ describe('DocumentDetailComponent', () => {
expect(component.document).toEqual(doc)
})
it('should fall back to details tab when duplicates tab is active but no duplicates', () => {
initNormally()
component.activeNavID = component.DocumentDetailNavIDs.Duplicates
const noDupDoc = { ...doc, duplicate_documents: [] }
component.updateComponent(noDupDoc)
expect(component.activeNavID).toEqual(
component.DocumentDetailNavIDs.Details
)
})
it('should load already-opened document via param', () => {
initNormally()
jest.spyOn(documentService, 'get').mockReturnValueOnce(of(doc))
@@ -367,6 +379,38 @@ describe('DocumentDetailComponent', () => {
expect(component.document).toEqual(doc)
})
it('should update cached open document duplicates when reloading an open doc', () => {
const openDoc = { ...doc, duplicate_documents: [{ id: 1, title: 'Old' }] }
const updatedDuplicates = [
{ id: 2, title: 'Newer duplicate', deleted_at: null },
]
jest
.spyOn(activatedRoute, 'paramMap', 'get')
.mockReturnValue(of(convertToParamMap({ id: 3, section: 'details' })))
jest.spyOn(documentService, 'get').mockReturnValue(
of({
...doc,
modified: new Date('2024-01-02T00:00:00Z'),
duplicate_documents: updatedDuplicates,
})
)
jest.spyOn(openDocumentsService, 'getOpenDocument').mockReturnValue(openDoc)
const saveSpy = jest.spyOn(openDocumentsService, 'save')
jest.spyOn(openDocumentsService, 'openDocument').mockReturnValue(of(true))
jest.spyOn(customFieldsService, 'listAll').mockReturnValue(
of({
count: customFields.length,
all: customFields.map((f) => f.id),
results: customFields,
})
)
fixture.detectChanges()
expect(openDoc.duplicate_documents).toEqual(updatedDuplicates)
expect(saveSpy).toHaveBeenCalled()
})
it('should disable form if user cannot edit', () => {
currentUserHasObjectPermissions = false
initNormally()

View File

@@ -8,7 +8,7 @@ import {
FormsModule,
ReactiveFormsModule,
} from '@angular/forms'
import { ActivatedRoute, Router } from '@angular/router'
import { ActivatedRoute, Router, RouterModule } from '@angular/router'
import {
NgbDateStruct,
NgbDropdownModule,
@@ -124,6 +124,7 @@ enum DocumentDetailNavIDs {
Notes = 5,
Permissions = 6,
History = 7,
Duplicates = 8,
}
enum ContentRenderType {
@@ -181,6 +182,7 @@ export enum ZoomSetting {
NgxBootstrapIconsModule,
PdfViewerModule,
TextAreaComponent,
RouterModule,
],
})
export class DocumentDetailComponent
@@ -285,10 +287,10 @@ export class DocumentDetailComponent
if (
element &&
element.nativeElement.offsetParent !== null &&
this.nav?.activeId == 4
this.nav?.activeId == DocumentDetailNavIDs.Preview
) {
// its visible
setTimeout(() => this.nav?.select(1))
setTimeout(() => this.nav?.select(DocumentDetailNavIDs.Details))
}
}
@@ -454,6 +456,11 @@ export class DocumentDetailComponent
const openDocument = this.openDocumentService.getOpenDocument(
this.documentId
)
// update duplicate documents if present
if (openDocument && doc?.duplicate_documents) {
openDocument.duplicate_documents = doc.duplicate_documents
this.openDocumentService.save()
}
const useDoc = openDocument || doc
if (openDocument) {
if (
@@ -704,6 +711,13 @@ export class DocumentDetailComponent
}
this.title = this.documentTitlePipe.transform(doc.title)
this.prepareForm(doc)
if (
this.activeNavID === DocumentDetailNavIDs.Duplicates &&
!doc?.duplicate_documents?.length
) {
this.activeNavID = DocumentDetailNavIDs.Details
}
}
get customFieldFormFields(): FormArray {

View File

@@ -159,6 +159,8 @@ export interface Document extends ObjectWithPermissions {
page_count?: number
duplicate_documents?: Document[]
// Frontend only
__changedFields?: string[]
}

View File

@@ -1,3 +1,4 @@
import { Document } from './document'
import { ObjectWithId } from './object-with-id'
export enum PaperlessTaskType {
@@ -42,5 +43,7 @@ export interface PaperlessTask extends ObjectWithId {
related_document?: number
duplicate_documents?: Document[]
owner?: number
}

View File

@@ -785,19 +785,45 @@ class ConsumerPreflightPlugin(
Q(checksum=checksum) | Q(archive_checksum=checksum),
)
if existing_doc.exists():
msg = ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS
log_msg = f"Not consuming {self.filename}: It is a duplicate of {existing_doc.get().title} (#{existing_doc.get().pk})."
existing_doc = existing_doc.order_by("-created")
duplicates_in_trash = existing_doc.filter(deleted_at__isnull=False)
log_msg = (
f"Consuming duplicate {self.filename}: "
f"{existing_doc.count()} existing document(s) share the same content."
)
if existing_doc.first().deleted_at is not None:
msg = ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS_IN_TRASH
log_msg += " Note: existing document is in the trash."
if duplicates_in_trash.exists():
log_msg += " Note: at least one existing document is in the trash."
self.log.warning(log_msg)
if settings.CONSUMER_DELETE_DUPLICATES:
duplicate = existing_doc.first()
duplicate_label = (
duplicate.title
or duplicate.original_filename
or (Path(duplicate.filename).name if duplicate.filename else None)
or str(duplicate.pk)
)
Path(self.input_doc.original_file).unlink()
self._fail(
msg,
log_msg,
)
failure_msg = (
f"Not consuming {self.filename}: "
f"It is a duplicate of {duplicate_label} (#{duplicate.pk})"
)
status_msg = ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS
if duplicates_in_trash.exists():
status_msg = (
ConsumerStatusShortMessage.DOCUMENT_ALREADY_EXISTS_IN_TRASH
)
failure_msg += " Note: existing document is in the trash."
self._fail(
status_msg,
failure_msg,
)
def pre_check_directories(self):
"""

View File

@@ -12,7 +12,7 @@ def populate_action_order(apps, schema_editor):
class Migration(migrations.Migration):
dependencies = [
("documents", "1075_alter_paperlesstask_task_name"),
("documents", "1074_workflowrun_deleted_at_workflowrun_restored_at_and_more"),
]
operations = [

View File

@@ -6,7 +6,7 @@ from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1074_workflowrun_deleted_at_workflowrun_restored_at_and_more"),
("documents", "1075_workflowaction_order"),
]
operations = [

View File

@@ -0,0 +1,23 @@
# Generated by Django 5.2.7 on 2026-01-14 17:45
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1076_alter_paperlesstask_task_name"),
]
operations = [
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
max_length=32,
verbose_name="checksum",
help_text="The checksum of the original document.",
),
),
]

View File

@@ -212,7 +212,6 @@ class Document(SoftDeleteModel, ModelWithOwner):
_("checksum"),
max_length=32,
editable=False,
unique=True,
help_text=_("The checksum of the original document."),
)

View File

@@ -148,13 +148,29 @@ def get_document_count_filter_for_user(user):
)
def get_objects_for_user_owner_aware(user, perms, Model) -> QuerySet:
objects_owned = Model.objects.filter(owner=user)
objects_unowned = Model.objects.filter(owner__isnull=True)
def get_objects_for_user_owner_aware(
user,
perms,
Model,
*,
include_deleted=False,
) -> QuerySet:
"""
Returns objects the user owns, are unowned, or has explicit perms.
When include_deleted is True, soft-deleted items are also included.
"""
manager = (
Model.global_objects
if include_deleted and hasattr(Model, "global_objects")
else Model.objects
)
objects_owned = manager.filter(owner=user)
objects_unowned = manager.filter(owner__isnull=True)
objects_with_perms = get_objects_for_user(
user=user,
perms=perms,
klass=Model,
klass=manager.all(),
accept_global_perms=False,
)
return objects_owned | objects_unowned | objects_with_perms
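
A hedged usage sketch of the extended helper, mirroring the call made in the serializer diff further below; the wrapper function itself is hypothetical.

```python
from documents.models import Document
from documents.permissions import get_objects_for_user_owner_aware


def viewable_documents(user, *, with_trashed: bool = False):
    # include_deleted=True switches to Document.global_objects, so duplicates
    # sitting in the trash are still returned for users allowed to view them.
    return get_objects_for_user_owner_aware(
        user,
        "documents.view_document",
        Document,
        include_deleted=with_trashed,
    )
```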

View File

@@ -23,6 +23,7 @@ from django.core.validators import MinValueValidator
from django.core.validators import RegexValidator
from django.core.validators import integer_validator
from django.db.models import Count
from django.db.models import Q
from django.db.models.functions import Lower
from django.utils.crypto import get_random_string
from django.utils.dateparse import parse_datetime
@@ -72,6 +73,7 @@ from documents.models import WorkflowTrigger
from documents.parsers import is_mime_type_supported
from documents.permissions import get_document_count_filter_for_user
from documents.permissions import get_groups_with_only_permission
from documents.permissions import get_objects_for_user_owner_aware
from documents.permissions import set_permissions_for_object
from documents.regex import validate_regex_pattern
from documents.templating.filepath import validate_filepath_template_and_render
@@ -1014,6 +1016,29 @@ class NotesSerializer(serializers.ModelSerializer):
return ret
def _get_viewable_duplicates(document: Document, user: User | None):
checksums = {document.checksum}
if document.archive_checksum:
checksums.add(document.archive_checksum)
duplicates = Document.global_objects.filter(
Q(checksum__in=checksums) | Q(archive_checksum__in=checksums),
).exclude(pk=document.pk)
duplicates = duplicates.order_by("-created")
allowed = get_objects_for_user_owner_aware(
user,
"documents.view_document",
Document,
include_deleted=True,
)
return duplicates.filter(id__in=allowed.values_list("id", flat=True))
class DuplicateDocumentSummarySerializer(serializers.Serializer):
id = serializers.IntegerField()
title = serializers.CharField()
deleted_at = serializers.DateTimeField(allow_null=True)
@extend_schema_serializer(
deprecate_fields=["created_date"],
)
@@ -1031,6 +1056,7 @@ class DocumentSerializer(
archived_file_name = SerializerMethodField()
created_date = serializers.DateField(required=False)
page_count = SerializerMethodField()
duplicate_documents = SerializerMethodField()
notes = NotesSerializer(many=True, required=False, read_only=True)
@@ -1056,6 +1082,16 @@ class DocumentSerializer(
def get_page_count(self, obj) -> int | None:
return obj.page_count
@extend_schema_field(DuplicateDocumentSummarySerializer(many=True))
def get_duplicate_documents(self, obj):
view = self.context.get("view")
if view and getattr(view, "action", None) != "retrieve":
return []
request = self.context.get("request")
user = request.user if request else None
duplicates = _get_viewable_duplicates(obj, user)
return list(duplicates.values("id", "title", "deleted_at"))
def get_original_file_name(self, obj) -> str | None:
return obj.original_filename
@@ -1233,6 +1269,7 @@ class DocumentSerializer(
"archive_serial_number",
"original_file_name",
"archived_file_name",
"duplicate_documents",
"owner",
"permissions",
"user_can_change",
@@ -2094,10 +2131,12 @@ class TasksViewSerializer(OwnedObjectSerializer):
"result",
"acknowledged",
"related_document",
"duplicate_documents",
"owner",
)
related_document = serializers.SerializerMethodField()
duplicate_documents = serializers.SerializerMethodField()
created_doc_re = re.compile(r"New document id (\d+) created")
duplicate_doc_re = re.compile(r"It is a duplicate of .* \(#(\d+)\)")
@@ -2122,6 +2161,17 @@ class TasksViewSerializer(OwnedObjectSerializer):
return result
@extend_schema_field(DuplicateDocumentSummarySerializer(many=True))
def get_duplicate_documents(self, obj):
related_document = self.get_related_document(obj)
request = self.context.get("request")
user = request.user if request else None
document = Document.global_objects.filter(pk=related_document).first()
if not related_document or not user or not document:
return []
duplicates = _get_viewable_duplicates(document, user)
return list(duplicates.values("id", "title", "deleted_at"))
class RunTaskViewSerializer(serializers.Serializer):
task_name = serializers.ChoiceField(

View File

@@ -7,6 +7,7 @@ from django.contrib.auth.models import User
from rest_framework import status
from rest_framework.test import APITestCase
from documents.models import Document
from documents.models import PaperlessTask
from documents.tests.utils import DirectoriesMixin
from documents.views import TasksViewSet
@@ -258,7 +259,7 @@ class TestTasks(DirectoriesMixin, APITestCase):
task_id=str(uuid.uuid4()),
task_file_name="task_one.pdf",
status=celery.states.FAILURE,
result="test.pdf: Not consuming test.pdf: It is a duplicate.",
result="test.pdf: Unexpected error during ingestion.",
)
response = self.client.get(self.ENDPOINT)
@@ -270,7 +271,7 @@ class TestTasks(DirectoriesMixin, APITestCase):
self.assertEqual(
returned_data["result"],
"test.pdf: Not consuming test.pdf: It is a duplicate.",
"test.pdf: Unexpected error during ingestion.",
)
def test_task_name_webui(self):
@@ -325,20 +326,34 @@ class TestTasks(DirectoriesMixin, APITestCase):
self.assertEqual(returned_data["task_file_name"], "anothertest.pdf")
def test_task_result_failed_duplicate_includes_related_doc(self):
def test_task_result_duplicate_warning_includes_count(self):
"""
GIVEN:
- A celery task failed with a duplicate error
- A celery task succeeds, but a duplicate exists
WHEN:
- API call is made to get tasks
THEN:
- The returned data includes a related document link
- The returned data includes duplicate warning metadata
"""
checksum = "duplicate-checksum"
Document.objects.create(
title="Existing",
content="",
mime_type="application/pdf",
checksum=checksum,
archive_checksum="another-checksum",
)
created_doc = Document.objects.create(
title="Created",
content="",
mime_type="application/pdf",
checksum=checksum,
)
PaperlessTask.objects.create(
task_id=str(uuid.uuid4()),
task_file_name="task_one.pdf",
status=celery.states.FAILURE,
result="Not consuming task_one.pdf: It is a duplicate of task_one_existing.pdf (#1234).",
status=celery.states.SUCCESS,
result=f"Success. New document id {created_doc.pk} created",
)
response = self.client.get(self.ENDPOINT)
@@ -348,7 +363,7 @@ class TestTasks(DirectoriesMixin, APITestCase):
returned_data = response.data[0]
self.assertEqual(returned_data["related_document"], "1234")
self.assertEqual(returned_data["related_document"], str(created_doc.pk))
def test_run_train_classifier_task(self):
"""

View File

@@ -485,21 +485,21 @@ class TestConsumer(
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
with self.assertRaisesMessage(ConsumerError, "It is a duplicate"):
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
self._assert_first_last_send_progress(last_status="FAILED")
self.assertEqual(Document.objects.count(), 2)
self._assert_first_last_send_progress()
def testDuplicates2(self):
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
with self.assertRaisesMessage(ConsumerError, "It is a duplicate"):
with self.get_consumer(self.get_test_archive_file()) as consumer:
consumer.run()
with self.get_consumer(self.get_test_archive_file()) as consumer:
consumer.run()
self._assert_first_last_send_progress(last_status="FAILED")
self.assertEqual(Document.objects.count(), 2)
self._assert_first_last_send_progress()
def testDuplicates3(self):
with self.get_consumer(self.get_test_archive_file()) as consumer:
@@ -513,9 +513,10 @@ class TestConsumer(
Document.objects.all().delete()
with self.assertRaisesMessage(ConsumerError, "document is in the trash"):
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
with self.get_consumer(self.get_test_file()) as consumer:
consumer.run()
self.assertEqual(Document.objects.count(), 1)
def testAsnExists(self):
with self.get_consumer(
@@ -718,12 +719,45 @@ class TestConsumer(
dst = self.get_test_file()
self.assertIsFile(dst)
with self.assertRaises(ConsumerError):
expected_message = (
f"{dst.name}: Not consuming {dst.name}: "
f"It is a duplicate of {document.title} (#{document.pk})"
)
with self.assertRaisesMessage(ConsumerError, expected_message):
with self.get_consumer(dst) as consumer:
consumer.run()
self.assertIsNotFile(dst)
self._assert_first_last_send_progress(last_status="FAILED")
self.assertEqual(Document.objects.count(), 1)
self._assert_first_last_send_progress(last_status=ProgressStatusOptions.FAILED)
@override_settings(CONSUMER_DELETE_DUPLICATES=True)
def test_delete_duplicate_in_trash(self):
dst = self.get_test_file()
with self.get_consumer(dst) as consumer:
consumer.run()
# Move the existing document to trash
document = Document.objects.first()
document.delete()
dst = self.get_test_file()
self.assertIsFile(dst)
expected_message = (
f"{dst.name}: Not consuming {dst.name}: "
f"It is a duplicate of {document.title} (#{document.pk})"
f" Note: existing document is in the trash."
)
with self.assertRaisesMessage(ConsumerError, expected_message):
with self.get_consumer(dst) as consumer:
consumer.run()
self.assertIsNotFile(dst)
self.assertEqual(Document.global_objects.count(), 1)
self.assertEqual(Document.objects.count(), 0)
@override_settings(CONSUMER_DELETE_DUPLICATES=False)
def test_no_delete_duplicate(self):
@@ -743,15 +777,12 @@ class TestConsumer(
dst = self.get_test_file()
self.assertIsFile(dst)
with self.assertRaisesRegex(
ConsumerError,
r"sample\.pdf: Not consuming sample\.pdf: It is a duplicate of sample \(#\d+\)",
):
with self.get_consumer(dst) as consumer:
consumer.run()
with self.get_consumer(dst) as consumer:
consumer.run()
self.assertIsFile(dst)
self._assert_first_last_send_progress(last_status="FAILED")
self.assertIsNotFile(dst)
self.assertEqual(Document.objects.count(), 2)
self._assert_first_last_send_progress()
@override_settings(FILENAME_FORMAT="{title}")
@mock.patch("documents.parsers.document_consumer_declaration.send")

View File

@@ -11,14 +11,12 @@ from paperless_ai.chat import stream_chat_with_documents
@pytest.fixture(autouse=True)
def patch_embed_model():
from llama_index.core import settings as llama_settings
from llama_index.core.embeddings.mock_embed_model import MockEmbedding
mock_embed_model = MagicMock()
mock_embed_model._get_text_embedding_batch.return_value = [
[0.1] * 1536,
] # 1 vector per input
llama_settings.Settings._embed_model = mock_embed_model
# Use a real BaseEmbedding subclass to satisfy llama-index 0.14 validation
llama_settings.Settings.embed_model = MockEmbedding(embed_dim=1536)
yield
llama_settings.Settings._embed_model = None
llama_settings.Settings.embed_model = None
@pytest.fixture(autouse=True)

uv.lock (generated): 933 changes

File diff suppressed because it is too large