Files
paperless-ngx/src/documents/tests/test_management_fuzzy.py
dependabot[bot] edb8c06e2a Chore(deps): Bump the django group across 1 directory with 9 updates (#10538)
* Chore(deps): Bump the django group across 1 directory with 9 updates

Bumps the django group with 9 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [django](https://github.com/django/django) | `5.1.8` | `5.2.5` |
| [django-auditlog](https://github.com/jazzband/django-auditlog) | `3.1.2` | `3.2.1` |
| [django-guardian](https://github.com/django-guardian/django-guardian) | `2.4.0` | `3.0.3` |
| [django-multiselectfield](https://github.com/goinnn/django-multiselectfield) | `0.1.13` | `1.0.1` |
| [django-soft-delete](https://github.com/san4ezy/django_softdelete) | `1.0.18` | `1.0.19` |
| [djangorestframework](https://github.com/encode/django-rest-framework) | `3.16.0` | `3.16.1` |
| [djangorestframework-guardian](https://github.com/rpkilby/django-rest-framework-guardian) | `0.3.0` | `0.4.0` |
| [drf-spectacular-sidecar](https://github.com/tfranzel/drf-spectacular-sidecar) | `2025.4.1` | `2025.8.1` |
| [pytest-django](https://github.com/pytest-dev/pytest-django) | `4.10.0` | `4.11.1` |



Updates `django` from 5.1.8 to 5.2.5
- [Commits](https://github.com/django/django/compare/5.1.8...5.2.5)

Updates `django-auditlog` from 3.1.2 to 3.2.1
- [Release notes](https://github.com/jazzband/django-auditlog/releases)
- [Changelog](https://github.com/jazzband/django-auditlog/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jazzband/django-auditlog/compare/v3.1.2...v3.2.1)

Updates `django-guardian` from 2.4.0 to 3.0.3
- [Release notes](https://github.com/django-guardian/django-guardian/releases)
- [Commits](https://github.com/django-guardian/django-guardian/compare/v2.4.0...3.0.3)

Updates `django-multiselectfield` from 0.1.13 to 1.0.1
- [Release notes](https://github.com/goinnn/django-multiselectfield/releases)
- [Changelog](https://github.com/goinnn/django-multiselectfield/blob/master/CHANGES.rst)
- [Commits](https://github.com/goinnn/django-multiselectfield/compare/v0.1.13...v1.0.1)

Updates `django-soft-delete` from 1.0.18 to 1.0.19
- [Changelog](https://github.com/san4ezy/django_softdelete/blob/master/CHANGELOG.md)
- [Commits](https://github.com/san4ezy/django_softdelete/commits)

Updates `djangorestframework` from 3.16.0 to 3.16.1
- [Release notes](https://github.com/encode/django-rest-framework/releases)
- [Commits](https://github.com/encode/django-rest-framework/compare/3.16.0...3.16.1)

Updates `djangorestframework-guardian` from 0.3.0 to 0.4.0
- [Changelog](https://github.com/rpkilby/django-rest-framework-guardian/blob/master/CHANGELOG)
- [Commits](https://github.com/rpkilby/django-rest-framework-guardian/compare/0.3.0...0.4.0)

Updates `drf-spectacular-sidecar` from 2025.4.1 to 2025.8.1
- [Commits](https://github.com/tfranzel/drf-spectacular-sidecar/compare/2025.4.1...2025.8.1)

Updates `pytest-django` from 4.10.0 to 4.11.1
- [Release notes](https://github.com/pytest-dev/pytest-django/releases)
- [Changelog](https://github.com/pytest-dev/pytest-django/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pytest-dev/pytest-django/compare/v4.10.0...v4.11.1)

---
updated-dependencies:
- dependency-name: django
  dependency-version: 5.2.5
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: django
- dependency-name: django-auditlog
  dependency-version: 3.2.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: django
- dependency-name: django-guardian
  dependency-version: 3.0.3
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: django
- dependency-name: django-multiselectfield
  dependency-version: 1.0.1
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: django
- dependency-name: django-soft-delete
  dependency-version: 1.0.19
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: django
- dependency-name: djangorestframework
  dependency-version: 3.16.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: django
- dependency-name: djangorestframework-guardian
  dependency-version: 0.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: django
- dependency-name: drf-spectacular-sidecar
  dependency-version: 2025.8.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: django
- dependency-name: pytest-django
  dependency-version: 4.11.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: django
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix log matches related to newlines, add newlines to stdout.writelines

* Fix disable api remote auth test, Django 5.2 no longer uses process_request

* Remove postgres version check

* Update administration.md

* Handle django-multiselectfield v1.0 changes

* Update administration.md

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2025-08-11 13:45:14 -07:00

209 lines
6.4 KiB
Python

from io import StringIO
from django.core.management import CommandError
from django.core.management import call_command
from django.test import TestCase
from documents.models import Document
class TestFuzzyMatchCommand(TestCase):
MSG_REGEX = r"Document \d fuzzy match to \d \(confidence \d\d\.\d\d\d\)"
def call_command(self, *args, **kwargs):
stdout = StringIO()
stderr = StringIO()
call_command(
"document_fuzzy_match",
"--no-progress-bar",
*args,
stdout=stdout,
stderr=stderr,
**kwargs,
)
return stdout.getvalue(), stderr.getvalue()
def test_invalid_ratio_lower_limit(self):
"""
GIVEN:
- Invalid ratio below lower limit
WHEN:
- Command is called
THEN:
- Error is raised indicating issue
"""
with self.assertRaises(CommandError) as e:
self.call_command("--ratio", "-1")
self.assertIn("The ratio must be between 0 and 100", str(e))
def test_invalid_ratio_upper_limit(self):
"""
GIVEN:s
- Invalid ratio above upper
WHEN:
- Command is called
THEN:
- Error is raised indicating issue
"""
with self.assertRaises(CommandError) as e:
self.call_command("--ratio", "101")
self.assertIn("The ratio must be between 0 and 100", str(e))
def test_invalid_process_count(self):
"""
GIVEN:
- Invalid process count less than 0 above upper
WHEN:
- Command is called
THEN:
- Error is raised indicating issue
"""
with self.assertRaises(CommandError) as e:
self.call_command("--processes", "0")
self.assertIn("There must be at least 1 process", str(e))
def test_no_matches(self):
"""
GIVEN:
- 2 documents exist
- Similarity between content is 82.32
WHEN:
- Command is called
THEN:
- No matches are found
"""
Document.objects.create(
checksum="BEEFCAFE",
title="A",
content="first document",
mime_type="application/pdf",
filename="test.pdf",
)
Document.objects.create(
checksum="DEADBEAF",
title="A",
content="other first document",
mime_type="application/pdf",
filename="other_test.pdf",
)
stdout, _ = self.call_command()
self.assertIn("No matches found", stdout)
def test_with_matches(self):
"""
GIVEN:
- 2 documents exist
- Similarity between content is 86.667
WHEN:
- Command is called
THEN:
- 1 match is returned from doc 1 to doc 2
- No match from doc 2 to doc 1 reported
"""
# Content similarity is 86.667
Document.objects.create(
checksum="BEEFCAFE",
title="A",
content="first document scanned by bob",
mime_type="application/pdf",
filename="test.pdf",
)
Document.objects.create(
checksum="DEADBEAF",
title="A",
content="first document scanned by alice",
mime_type="application/pdf",
filename="other_test.pdf",
)
stdout, _ = self.call_command("--processes", "1")
self.assertRegex(stdout, self.MSG_REGEX)
def test_with_3_matches(self):
"""
GIVEN:
- 3 documents exist
- All documents have similarity over 85.0
WHEN:
- Command is called
THEN:
- 3 matches is returned from each document to the others
- No duplication of matches returned
"""
# Content similarity is 86.667
Document.objects.create(
checksum="BEEFCAFE",
title="A",
content="first document scanned by bob",
mime_type="application/pdf",
filename="test.pdf",
)
Document.objects.create(
checksum="DEADBEAF",
title="A",
content="first document scanned by alice",
mime_type="application/pdf",
filename="other_test.pdf",
)
Document.objects.create(
checksum="CATTLE",
title="A",
content="first document scanned by pete",
mime_type="application/pdf",
filename="final_test.pdf",
)
stdout, _ = self.call_command()
lines = [x.strip() for x in stdout.splitlines() if x.strip()]
self.assertEqual(len(lines), 3)
for line in lines:
self.assertRegex(line, self.MSG_REGEX)
def test_document_deletion(self):
"""
GIVEN:
- 3 documents exist
- Document 1 to document 3 has a similarity over 85.0
WHEN:
- Command is called with the --delete option
THEN:
- User is warned about the deletion flag
- Document 3 is deleted
- Documents 1 and 2 remain
"""
# Content similarity is 86.667
Document.objects.create(
checksum="BEEFCAFE",
title="A",
content="first document scanned by bob",
mime_type="application/pdf",
filename="test.pdf",
)
Document.objects.create(
checksum="DEADBEAF",
title="A",
content="second document scanned by alice",
mime_type="application/pdf",
filename="other_test.pdf",
)
Document.objects.create(
checksum="CATTLE",
title="A",
content="first document scanned by pete",
mime_type="application/pdf",
filename="final_test.pdf",
)
self.assertEqual(Document.objects.count(), 3)
stdout, _ = self.call_command("--delete")
self.assertIn(
"The command is configured to delete documents. Use with caution",
stdout,
)
self.assertRegex(stdout, self.MSG_REGEX)
self.assertIn("Deleting 1 documents based on ratio matches", stdout)
self.assertEqual(Document.objects.count(), 2)
self.assertIsNotNone(Document.objects.get(pk=1))
self.assertIsNotNone(Document.objects.get(pk=2))