mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2026-02-28 01:19:36 -06:00
Compare commits
3 Commits
dependabot
...
feature-py
| Author | SHA1 | Date |
|---|---|---|
| | 98d5d665f0 | |
| | 3fa9e75fa0 | |
| | c94b6ce792 | |
@@ -39,6 +39,3 @@ max_line_length = off
|
||||
|
||||
[Dockerfile*]
|
||||
indent_style = space
|
||||
|
||||
[*.toml]
|
||||
indent_style = space
|
||||
|
||||
2
.github/workflows/ci-backend.yml
vendored
@@ -31,7 +31,7 @@ jobs:
|
||||
runs-on: ubuntu-24.04
|
||||
strategy:
|
||||
matrix:
|
||||
python-version: ['3.10', '3.11', '3.12']
|
||||
python-version: ['3.11', '3.12', '3.13', '3.14']
|
||||
fail-fast: false
|
||||
steps:
|
||||
- name: Checkout
|
||||
|
||||
@@ -62,10 +62,6 @@ copies you created in the steps above.
|
||||
|
||||
## Updating Paperless {#updating}
|
||||
|
||||
!!! warning
|
||||
|
||||
Please review the [migration instructions](migration-v3.md) before upgrading Paperless-ngx to v3.0. The release includes breaking changes that require manual intervention.
|
||||
|
||||
### Docker Route {#docker-updating}
|
||||
|
||||
If a new release of paperless-ngx is available, upgrading depends on how
|
||||
|
||||
@@ -51,172 +51,137 @@ matcher.
|
||||
### Database
|
||||
|
||||
By default, Paperless uses **SQLite** with a database stored at `data/db.sqlite3`.
|
||||
For multi-user or higher-throughput deployments, **PostgreSQL** (recommended) or
|
||||
**MariaDB** can be used instead by setting [`PAPERLESS_DBENGINE`](#PAPERLESS_DBENGINE)
|
||||
and the relevant connection variables.
|
||||
|
||||
#### [`PAPERLESS_DBENGINE=<engine>`](#PAPERLESS_DBENGINE) {#PAPERLESS_DBENGINE}
|
||||
|
||||
: Specifies the database engine to use. Accepted values are `sqlite`, `postgresql`,
|
||||
and `mariadb`.
|
||||
|
||||
Defaults to `sqlite` if not set.
|
||||
|
||||
PostgreSQL and MariaDB both require [`PAPERLESS_DBHOST`](#PAPERLESS_DBHOST) to be
|
||||
set. SQLite does not use any other connection variables; the database file is always
|
||||
located at `<PAPERLESS_DATA_DIR>/db.sqlite3`.
|
||||
|
||||
!!! warning
|
||||
Using MariaDB comes with some caveats.
|
||||
See [MySQL Caveats](advanced_usage.md#mysql-caveats).
|
||||
To switch to **PostgreSQL** or **MariaDB**, set [`PAPERLESS_DBHOST`](#PAPERLESS_DBHOST) and optionally configure other
|
||||
database-related environment variables.
|
||||
|
||||
#### [`PAPERLESS_DBHOST=<hostname>`](#PAPERLESS_DBHOST) {#PAPERLESS_DBHOST}
|
||||
|
||||
: Hostname of the PostgreSQL or MariaDB database server. Required when
|
||||
`PAPERLESS_DBENGINE` is `postgresql` or `mariadb`.
|
||||
: If unset, Paperless uses **SQLite** by default.
|
||||
|
||||
Set `PAPERLESS_DBHOST` to switch to PostgreSQL or MariaDB instead.
|
||||
|
||||
#### [`PAPERLESS_DBENGINE=<engine_name>`](#PAPERLESS_DBENGINE) {#PAPERLESS_DBENGINE}
|
||||
|
||||
: Optional. Specifies the database engine to use when connecting to a remote database.
|
||||
Available options are `postgresql` and `mariadb`.
|
||||
|
||||
Defaults to `postgresql` if `PAPERLESS_DBHOST` is set.
|
||||
|
||||
!!! warning
|
||||
|
||||
Using MariaDB comes with some caveats. See [MySQL Caveats](advanced_usage.md#mysql-caveats).
|
||||
|
||||
#### [`PAPERLESS_DBPORT=<port>`](#PAPERLESS_DBPORT) {#PAPERLESS_DBPORT}
|
||||
|
||||
: Port to use when connecting to PostgreSQL or MariaDB.
|
||||
|
||||
Defaults to `5432` for PostgreSQL and `3306` for MariaDB.
|
||||
Default is `5432` for PostgreSQL and `3306` for MariaDB.
|
||||
|
||||
#### [`PAPERLESS_DBNAME=<name>`](#PAPERLESS_DBNAME) {#PAPERLESS_DBNAME}
|
||||
|
||||
: Name of the PostgreSQL or MariaDB database to connect to.
|
||||
: Name of the database to connect to when using PostgreSQL or MariaDB.
|
||||
|
||||
Defaults to `paperless`.
|
||||
Defaults to "paperless".
|
||||
|
||||
#### [`PAPERLESS_DBUSER=<user>`](#PAPERLESS_DBUSER) {#PAPERLESS_DBUSER}
|
||||
#### [`PAPERLESS_DBUSER=<name>`](#PAPERLESS_DBUSER) {#PAPERLESS_DBUSER}
|
||||
|
||||
: Username for authenticating with the PostgreSQL or MariaDB database.
|
||||
|
||||
Defaults to `paperless`.
|
||||
Defaults to "paperless".
|
||||
|
||||
#### [`PAPERLESS_DBPASS=<password>`](#PAPERLESS_DBPASS) {#PAPERLESS_DBPASS}
|
||||
|
||||
: Password for the PostgreSQL or MariaDB database user.
|
||||
|
||||
Defaults to `paperless`.
|
||||
Defaults to "paperless".
|
||||
|
||||
#### [`PAPERLESS_DB_OPTIONS=<options>`](#PAPERLESS_DB_OPTIONS) {#PAPERLESS_DB_OPTIONS}
|
||||
#### [`PAPERLESS_DBSSLMODE=<mode>`](#PAPERLESS_DBSSLMODE) {#PAPERLESS_DBSSLMODE}
|
||||
|
||||
: Advanced database connection options as a semicolon-delimited key-value string.
|
||||
Keys and values are separated by `=`. Dot-notation produces nested option
|
||||
dictionaries; for example, `pool.max_size=20` sets
|
||||
`OPTIONS["pool"]["max_size"] = 20`.
|
||||
: SSL mode to use when connecting to PostgreSQL or MariaDB.
|
||||
|
||||
Options specified here are merged over the engine defaults. Unrecognised keys
|
||||
are passed through to the underlying database driver without validation, so a
|
||||
typo will be silently ignored rather than producing an error.
|
||||
See [the official documentation about
|
||||
sslmode for PostgreSQL](https://www.postgresql.org/docs/current/libpq-ssl.html).
|
||||
|
||||
Refer to your database driver's documentation for the full set of accepted keys:
|
||||
See [the official documentation about
|
||||
sslmode for MySQL and MariaDB](https://dev.mysql.com/doc/refman/8.0/en/connection-options.html#option_general_ssl-mode).
|
||||
|
||||
- PostgreSQL: [libpq connection parameters](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS)
|
||||
- MariaDB: [MariaDB Connector/Python](https://mariadb.com/kb/en/mariadb-connector-python/)
|
||||
- SQLite: [SQLite PRAGMA statements](https://www.sqlite.org/pragma.html)
|
||||
*Note*: SSL mode values differ between PostgreSQL and MariaDB.
|
||||
|
||||
!!! note "PostgreSQL connection pooling"
|
||||
Default is `prefer` for PostgreSQL and `PREFERRED` for MariaDB.
|
||||
|
||||
Pool size is controlled via `pool.min_size` and `pool.max_size`. When
|
||||
configuring pooling, ensure your PostgreSQL `max_connections` is large enough
|
||||
to handle all pool connections across all workers:
|
||||
`(web_workers + celery_workers) * pool.max_size + safety_margin`.
|
||||
#### [`PAPERLESS_DBSSLROOTCERT=<ca-path>`](#PAPERLESS_DBSSLROOTCERT) {#PAPERLESS_DBSSLROOTCERT}
|
||||
|
||||
**Examples:**
|
||||
: Path to the SSL root certificate used to verify the database server.
|
||||
|
||||
```bash title="PostgreSQL: require SSL, set a custom CA certificate, and limit the pool size"
|
||||
PAPERLESS_DB_OPTIONS="sslmode=require;sslrootcert=/certs/ca.pem;pool.max_size=5"
|
||||
```
|
||||
See [the official documentation about
|
||||
sslmode for PostgreSQL](https://www.postgresql.org/docs/current/libpq-ssl.html).
|
||||
Changes the location of `root.crt`.
|
||||
|
||||
```bash title="MariaDB: require SSL with a custom CA certificate"
|
||||
PAPERLESS_DB_OPTIONS="ssl_mode=REQUIRED;ssl.ca=/certs/ca.pem"
|
||||
```
|
||||
See [the official documentation about
|
||||
sslmode for MySQL and MariaDB](https://dev.mysql.com/doc/refman/8.0/en/connection-options.html#option_general_ssl-ca).
|
||||
|
||||
```bash title="SQLite: set a busy timeout of 30 seconds"
|
||||
PAPERLESS_DB_OPTIONS="timeout=30"
|
||||
```
|
||||
Defaults to unset, using the standard location in the home directory.
|
||||
|
||||
#### ~~[`PAPERLESS_DBSSLMODE`](#PAPERLESS_DBSSLMODE)~~ {#PAPERLESS_DBSSLMODE}
|
||||
#### [`PAPERLESS_DBSSLCERT=<client-cert-path>`](#PAPERLESS_DBSSLCERT) {#PAPERLESS_DBSSLCERT}
|
||||
|
||||
!!! failure "Removed in v3"
|
||||
: Path to the client SSL certificate used when connecting securely.
|
||||
|
||||
Use [`PAPERLESS_DB_OPTIONS`](#PAPERLESS_DB_OPTIONS) instead.
|
||||
See [the official documentation about
|
||||
sslmode for PostgreSQL](https://www.postgresql.org/docs/current/libpq-ssl.html).
|
||||
|
||||
```bash title="PostgreSQL"
|
||||
PAPERLESS_DB_OPTIONS="sslmode=require"
|
||||
```
|
||||
See [the official documentation about
|
||||
sslmode for MySQL and MariaDB](https://dev.mysql.com/doc/refman/8.0/en/connection-options.html#option_general_ssl-cert).
|
||||
|
||||
```bash title="MariaDB"
|
||||
PAPERLESS_DB_OPTIONS="ssl_mode=REQUIRED"
|
||||
```
|
||||
Changes the location of `postgresql.crt`.
|
||||
|
||||
#### ~~[`PAPERLESS_DBSSLROOTCERT`](#PAPERLESS_DBSSLROOTCERT)~~ {#PAPERLESS_DBSSLROOTCERT}
|
||||
Defaults to unset, using the standard location in the home directory.
|
||||
|
||||
!!! failure "Removed in v3"
|
||||
#### [`PAPERLESS_DBSSLKEY=<client-cert-key>`](#PAPERLESS_DBSSLKEY) {#PAPERLESS_DBSSLKEY}
|
||||
|
||||
Use [`PAPERLESS_DB_OPTIONS`](#PAPERLESS_DB_OPTIONS) instead.
|
||||
: Path to the client SSL private key used when connecting securely.
|
||||
|
||||
```bash title="PostgreSQL"
|
||||
PAPERLESS_DB_OPTIONS="sslrootcert=/path/to/ca.pem"
|
||||
```
|
||||
See [the official documentation about
|
||||
sslmode for PostgreSQL](https://www.postgresql.org/docs/current/libpq-ssl.html).
|
||||
|
||||
```bash title="MariaDB"
|
||||
PAPERLESS_DB_OPTIONS="ssl.ca=/path/to/ca.pem"
|
||||
```
|
||||
See [the official documentation about
|
||||
sslmode for MySQL and MariaDB](https://dev.mysql.com/doc/refman/8.0/en/connection-options.html#option_general_ssl-key).
|
||||
|
||||
#### ~~[`PAPERLESS_DBSSLCERT`](#PAPERLESS_DBSSLCERT)~~ {#PAPERLESS_DBSSLCERT}
|
||||
Changes the location of `postgresql.key`.
|
||||
|
||||
!!! failure "Removed in v3"
|
||||
Defaults to unset, using the standard location in the home directory.
|
||||
|
||||
Use [`PAPERLESS_DB_OPTIONS`](#PAPERLESS_DB_OPTIONS) instead.
|
||||
#### [`PAPERLESS_DB_TIMEOUT=<int>`](#PAPERLESS_DB_TIMEOUT) {#PAPERLESS_DB_TIMEOUT}
|
||||
|
||||
```bash title="PostgreSQL"
|
||||
PAPERLESS_DB_OPTIONS="sslcert=/path/to/client.crt"
|
||||
```
|
||||
: Sets how long a database connection should wait before timing out.
|
||||
|
||||
```bash title="MariaDB"
|
||||
PAPERLESS_DB_OPTIONS="ssl.cert=/path/to/client.crt"
|
||||
```
|
||||
For SQLite, this sets how long to wait if the database is locked.
|
||||
For PostgreSQL or MariaDB, this sets the connection timeout.
|
||||
|
||||
#### ~~[`PAPERLESS_DBSSLKEY`](#PAPERLESS_DBSSLKEY)~~ {#PAPERLESS_DBSSLKEY}
|
||||
Defaults to unset, which uses Django’s built-in defaults.
|
||||
|
||||
!!! failure "Removed in v3"
|
||||
#### [`PAPERLESS_DB_POOLSIZE=<int>`](#PAPERLESS_DB_POOLSIZE) {#PAPERLESS_DB_POOLSIZE}
|
||||
|
||||
Use [`PAPERLESS_DB_OPTIONS`](#PAPERLESS_DB_OPTIONS) instead.
|
||||
: Defines the maximum number of database connections to keep in the pool.
|
||||
|
||||
```bash title="PostgreSQL"
|
||||
PAPERLESS_DB_OPTIONS="sslkey=/path/to/client.key"
|
||||
```
|
||||
Only applies to PostgreSQL. This setting is ignored for other database engines.
|
||||
|
||||
```bash title="MariaDB"
|
||||
PAPERLESS_DB_OPTIONS="ssl.key=/path/to/client.key"
|
||||
```
|
||||
The value must be greater than or equal to 1 to be used.
|
||||
Defaults to unset, which disables connection pooling.
|
||||
|
||||
#### ~~[`PAPERLESS_DB_TIMEOUT`](#PAPERLESS_DB_TIMEOUT)~~ {#PAPERLESS_DB_TIMEOUT}
|
||||
!!! note
|
||||
|
||||
!!! failure "Removed in v3"
|
||||
A pool of 8-10 connections per worker is typically sufficient.
|
||||
If you encounter error messages such as `couldn't get a connection`
|
||||
or database connection timeouts, you probably need to increase the pool size.
|
||||
|
||||
Use [`PAPERLESS_DB_OPTIONS`](#PAPERLESS_DB_OPTIONS) instead.
|
||||
!!! warning
|
||||
Make sure your PostgreSQL `max_connections` setting is large enough to handle the connection pools:
|
||||
`(NB_PAPERLESS_WORKERS + NB_CELERY_WORKERS) × POOL_SIZE + SAFETY_MARGIN`. For example, with
|
||||
4 Paperless workers and 2 Celery workers, and a pool size of 8: `(4 + 2) × 8 + 10 = 58`,
|
||||
so `max_connections = 60` (or even more) is appropriate.
|
||||
|
||||
```bash title="SQLite"
|
||||
PAPERLESS_DB_OPTIONS="timeout=30"
|
||||
```
|
||||
|
||||
```bash title="PostgreSQL or MariaDB"
|
||||
PAPERLESS_DB_OPTIONS="connect_timeout=30"
|
||||
```
|
||||
|
||||
#### ~~[`PAPERLESS_DB_POOLSIZE`](#PAPERLESS_DB_POOLSIZE)~~ {#PAPERLESS_DB_POOLSIZE}
|
||||
|
||||
!!! failure "Removed in v3"
|
||||
|
||||
Use [`PAPERLESS_DB_OPTIONS`](#PAPERLESS_DB_OPTIONS) instead.
|
||||
|
||||
```bash
|
||||
PAPERLESS_DB_OPTIONS="pool.max_size=10"
|
||||
```
|
||||
This assumes only Paperless-ngx connects to your PostgreSQL instance. If you have other applications,
|
||||
you should increase `max_connections` accordingly.
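The `max_connections` sizing rule is plain arithmetic, so it is easy to check for a given deployment. The following is a minimal Python sketch; the function name and the default safety margin of 10 are illustrative assumptions taken from the worked example, not something Paperless-ngx provides.

```python
def required_max_connections(
    web_workers: int,
    celery_workers: int,
    pool_size: int,
    safety_margin: int = 10,  # assumption: headroom borrowed from the worked example
) -> int:
    """Lower bound for PostgreSQL max_connections when each worker owns one pool."""
    return (web_workers + celery_workers) * pool_size + safety_margin


# 4 Paperless workers, 2 Celery workers, pool size 8
print(required_max_connections(4, 2, 8))  # 58
```

Connections used by other applications on the same PostgreSQL instance would need to be added on top of this figure.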
|
||||
|
||||
#### [`PAPERLESS_DB_READ_CACHE_ENABLED=<bool>`](#PAPERLESS_DB_READ_CACHE_ENABLED) {#PAPERLESS_DB_READ_CACHE_ENABLED}
|
||||
|
||||
|
||||
@@ -48,58 +48,3 @@ The `CONSUMER_BARCODE_SCANNER` setting has been removed. zxing-cpp is now the on
|
||||
reliability.
|
||||
- The `libzbar0` / `libzbar-dev` system packages are no longer required and can be removed from any custom Docker
|
||||
images or host installations.
|
||||
|
||||
## Database Engine
|
||||
|
||||
`PAPERLESS_DBENGINE` is now required to use PostgreSQL or MariaDB. Previously, the
|
||||
engine was inferred from the presence of `PAPERLESS_DBHOST`, with `PAPERLESS_DBENGINE`
|
||||
only needed to select MariaDB over PostgreSQL.
|
||||
|
||||
SQLite users require no changes, though they may explicitly set their engine if desired.
|
||||
|
||||
#### Action Required
|
||||
|
||||
PostgreSQL and MariaDB users must add `PAPERLESS_DBENGINE` to their environment:
|
||||
|
||||
```yaml
|
||||
# v2 (PostgreSQL inferred from PAPERLESS_DBHOST)
|
||||
PAPERLESS_DBHOST: postgres
|
||||
|
||||
# v3 (engine must be explicit)
|
||||
PAPERLESS_DBENGINE: postgresql
|
||||
PAPERLESS_DBHOST: postgres
|
||||
```
|
||||
|
||||
See [`PAPERLESS_DBENGINE`](configuration.md#PAPERLESS_DBENGINE) for accepted values.
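The difference between the two selection rules can be sketched in a few lines of Python. This is an illustration of the documented behavior only: the function names are hypothetical, and raising `ValueError` for an unknown engine or a missing host is an assumption, since the actual error handling is not shown here.

```python
ACCEPTED_ENGINES = {"sqlite", "postgresql", "mariadb"}


def resolve_engine_v2(env: dict[str, str]) -> str:
    """v2 rule: a set PAPERLESS_DBHOST implies a remote engine, PostgreSQL by default."""
    if not env.get("PAPERLESS_DBHOST"):
        return "sqlite"
    return env.get("PAPERLESS_DBENGINE", "postgresql")


def resolve_engine_v3(env: dict[str, str]) -> str:
    """v3 rule: the engine comes from PAPERLESS_DBENGINE alone and defaults to SQLite."""
    engine = env.get("PAPERLESS_DBENGINE", "sqlite")
    if engine not in ACCEPTED_ENGINES:
        # Assumption: the real failure mode for a bad value is not documented here.
        raise ValueError(f"unsupported engine: {engine!r}")
    if engine != "sqlite" and not env.get("PAPERLESS_DBHOST"):
        # Assumption: a missing host is treated as a configuration error.
        raise ValueError("PAPERLESS_DBHOST is required for PostgreSQL and MariaDB")
    return engine


legacy_env = {"PAPERLESS_DBHOST": "postgres"}
print(resolve_engine_v2(legacy_env))  # postgresql
print(resolve_engine_v3(legacy_env))  # sqlite
```

Under the documented default, an environment that sets only `PAPERLESS_DBHOST` no longer selects PostgreSQL, which is why the explicit `PAPERLESS_DBENGINE` line is needed.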
|
||||
|
||||
## Database Advanced Options
|
||||
|
||||
The individual SSL, timeout, and pooling variables have been removed in favor of a
|
||||
single [`PAPERLESS_DB_OPTIONS`](configuration.md#PAPERLESS_DB_OPTIONS) string. This
|
||||
consolidates a growing set of engine-specific variables into one place, and allows
|
||||
any option supported by the underlying database driver to be set without requiring a
|
||||
dedicated environment variable for each.
|
||||
|
||||
The removed variables and their replacements are:
|
||||
|
||||
| Removed Variable | Replacement in `PAPERLESS_DB_OPTIONS` |
|
||||
| ------------------------- | ---------------------------------------------------------------------------- |
|
||||
| `PAPERLESS_DBSSLMODE` | `sslmode=<value>` (PostgreSQL) or `ssl_mode=<value>` (MariaDB) |
|
||||
| `PAPERLESS_DBSSLROOTCERT` | `sslrootcert=<path>` (PostgreSQL) or `ssl.ca=<path>` (MariaDB) |
|
||||
| `PAPERLESS_DBSSLCERT` | `sslcert=<path>` (PostgreSQL) or `ssl.cert=<path>` (MariaDB) |
|
||||
| `PAPERLESS_DBSSLKEY` | `sslkey=<path>` (PostgreSQL) or `ssl.key=<path>` (MariaDB) |
|
||||
| `PAPERLESS_DB_POOLSIZE` | `pool.max_size=<value>` (PostgreSQL only) |
|
||||
| `PAPERLESS_DB_TIMEOUT` | `timeout=<value>` (SQLite) or `connect_timeout=<value>` (PostgreSQL/MariaDB) |
|
||||
|
||||
The deprecated variables will continue to function for now but will be removed in a
|
||||
future release. A deprecation warning is logged at startup for each deprecated variable
|
||||
that is still set.
|
||||
|
||||
#### Action Required
|
||||
|
||||
Users with any of the deprecated variables set should migrate to `PAPERLESS_DB_OPTIONS`.
|
||||
Multiple options are combined in a single value:
|
||||
|
||||
```bash
|
||||
PAPERLESS_DB_OPTIONS="sslmode=require;sslrootcert=/certs/ca.pem;pool.max_size=10"
|
||||
```
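To make the combined format concrete, here is a minimal Python sketch of how a semicolon-delimited string with dot-notation keys could be turned into a nested options dictionary. It is not the parser Paperless-ngx uses; `parse_db_options` is a hypothetical name, and converting purely numeric values to integers is an assumption based on the `pool.max_size` example in the configuration reference.

```python
from typing import Any


def parse_db_options(raw: str) -> dict[str, Any]:
    """Parse 'a=1;b.c=x' into {'a': 1, 'b': {'c': 'x'}}."""
    options: dict[str, Any] = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty segments such as a trailing semicolon
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        # Everything before the last dot names a nested dictionary.
        *parents, leaf = key.strip().split(".")
        target = options
        for part in parents:
            target = target.setdefault(part, {})
        value = value.strip()
        # Assumption: purely numeric values become integers.
        target[leaf] = int(value) if value.isdecimal() else value
    return options


print(parse_db_options("sslmode=require;sslrootcert=/certs/ca.pem;pool.max_size=10"))
# {'sslmode': 'require', 'sslrootcert': '/certs/ca.pem', 'pool': {'max_size': 10}}
```

Splitting on the first `=` only keeps values that themselves contain `=` intact.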
|
||||
@@ -504,7 +504,8 @@ installation. Keep these points in mind:
|
||||
- Read the [changelog](changelog.md) and
|
||||
take note of breaking changes.
|
||||
- Decide whether to stay on SQLite or migrate to PostgreSQL.
|
||||
Both work fine with
|
||||
See [documentation](#sqlite_to_psql) for details on moving data
|
||||
from SQLite to PostgreSQL. Both work fine with
|
||||
Paperless. However, if you already have a database server running
|
||||
for other services, you might as well use it for Paperless as well.
|
||||
- The task scheduler of Paperless, which is used to execute periodic
|
||||
|
||||
@@ -3,10 +3,9 @@ name = "paperless-ngx"
|
||||
version = "2.20.8"
|
||||
description = "A community-supported supercharged document management system: scan, index and archive all your physical documents"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.10"
|
||||
requires-python = ">=3.11"
|
||||
classifiers = [
|
||||
"Programming Language :: Python :: 3 :: Only",
|
||||
"Programming Language :: Python :: 3.10",
|
||||
"Programming Language :: Python :: 3.11",
|
||||
"Programming Language :: Python :: 3.12",
|
||||
"Programming Language :: Python :: 3.13",
|
||||
@@ -37,7 +36,6 @@ dependencies = [
|
||||
"django-filter~=25.1",
|
||||
"django-guardian~=3.3.0",
|
||||
"django-multiselectfield~=1.0.1",
|
||||
"django-rich~=2.2.0",
|
||||
"django-soft-delete~=1.0.18",
|
||||
"django-treenode>=0.23.2",
|
||||
"djangorestframework~=3.16",
|
||||
@@ -77,6 +75,7 @@ dependencies = [
|
||||
"setproctitle~=1.3.4",
|
||||
"tika-client~=0.10.0",
|
||||
"torch~=2.10.0",
|
||||
"tqdm~=4.67.1",
|
||||
"watchfiles>=1.1.1",
|
||||
"whitenoise~=6.11",
|
||||
"whoosh-reloaded>=2.7.5",
|
||||
@@ -149,6 +148,7 @@ typing = [
|
||||
"types-pytz",
|
||||
"types-redis",
|
||||
"types-setuptools",
|
||||
"types-tqdm",
|
||||
]
|
||||
|
||||
[tool.uv]
|
||||
@@ -176,7 +176,7 @@ torch = [
|
||||
]
|
||||
|
||||
[tool.ruff]
|
||||
target-version = "py310"
|
||||
target-version = "py311"
|
||||
line-length = 88
|
||||
src = [
|
||||
"src",
|
||||
@@ -303,7 +303,6 @@ markers = [
|
||||
"tika: Tests requiring Tika service",
|
||||
"greenmail: Tests requiring Greenmail service",
|
||||
"date_parsing: Tests which cover date parsing from content or filename",
|
||||
"management: Tests which cover management commands/functionality",
|
||||
]
|
||||
|
||||
[tool.pytest_env]
|
||||
|
||||
@@ -57,7 +57,7 @@
|
||||
}
|
||||
</div>
|
||||
@for (version of versions; track version.id) {
|
||||
<div class="dropdown-item border-top px-0" [class.pe-3]="versions.length === 1">
|
||||
<div class="dropdown-item border-top px-0">
|
||||
<div class="d-flex align-items-center w-100 py-2 version-item">
|
||||
<div class="btn btn-link link-underline link-underline-opacity-0 d-flex align-items-center small text-start p-0 version-link"
|
||||
(click)="selectVersion(version.id)"
|
||||
@@ -88,7 +88,7 @@
|
||||
@if (version.version_label) {
|
||||
{{ version.version_label }}
|
||||
} @else {
|
||||
<ng-container i18n>Version</ng-container> {{ versions.length - $index }} <span class="text-muted small">(#{{ version.id }})</span>
|
||||
<span i18n>Version</span> #{{ version.id }}
|
||||
}
|
||||
</span>
|
||||
}
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
from datetime import UTC
|
||||
from datetime import datetime
|
||||
from datetime import timezone
|
||||
from typing import Any
|
||||
|
||||
from django.conf import settings
|
||||
@@ -139,7 +139,7 @@ def thumbnail_last_modified(request: Any, pk: int) -> datetime | None:
|
||||
# No cache, get the timestamp and cache the datetime
|
||||
last_modified = datetime.fromtimestamp(
|
||||
doc.thumbnail_path.stat().st_mtime,
|
||||
tz=timezone.utc,
|
||||
tz=UTC,
|
||||
)
|
||||
cache.set(doc_key, last_modified, CACHE_50_MINUTES)
|
||||
return last_modified
|
||||
|
||||
@@ -2,7 +2,7 @@ import datetime
|
||||
import hashlib
|
||||
import os
|
||||
import tempfile
|
||||
from enum import Enum
|
||||
from enum import StrEnum
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING
|
||||
from typing import Final
|
||||
@@ -80,7 +80,7 @@ class ConsumerError(Exception):
|
||||
pass
|
||||
|
||||
|
||||
class ConsumerStatusShortMessage(str, Enum):
|
||||
class ConsumerStatusShortMessage(StrEnum):
|
||||
DOCUMENT_ALREADY_EXISTS = "document_already_exists"
|
||||
DOCUMENT_ALREADY_EXISTS_IN_TRASH = "document_already_exists_in_trash"
|
||||
ASN_ALREADY_EXISTS = "asn_already_exists"
|
||||
|
||||
@@ -5,10 +5,10 @@ import math
|
||||
import re
|
||||
from collections import Counter
|
||||
from contextlib import contextmanager
|
||||
from datetime import UTC
|
||||
from datetime import datetime
|
||||
from datetime import time
|
||||
from datetime import timedelta
|
||||
from datetime import timezone
|
||||
from shutil import rmtree
|
||||
from time import sleep
|
||||
from typing import TYPE_CHECKING
|
||||
@@ -437,7 +437,7 @@ class ManualResults:
|
||||
class LocalDateParser(English):
|
||||
def reverse_timezone_offset(self, d):
|
||||
return (d.replace(tzinfo=django_timezone.get_current_timezone())).astimezone(
|
||||
timezone.utc,
|
||||
UTC,
|
||||
)
|
||||
|
||||
def date_from(self, *args, **kwargs):
|
||||
@@ -641,8 +641,8 @@ def rewrite_natural_date_keywords(query_string: str) -> str:
|
||||
end = datetime(local_now.year - 1, 12, 31, 23, 59, 59, tzinfo=tz)
|
||||
|
||||
# Convert to UTC and format
|
||||
start_str = start.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
|
||||
end_str = end.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
|
||||
start_str = start.astimezone(UTC).strftime("%Y%m%d%H%M%S")
|
||||
end_str = end.astimezone(UTC).strftime("%Y%m%d%H%M%S")
|
||||
return f"{field}:[{start_str} TO {end_str}]"
|
||||
|
||||
return re.sub(pattern, repl, query_string, flags=re.IGNORECASE)
|
||||
|
||||
@@ -1,320 +0,0 @@
|
||||
"""
|
||||
Base command class for Paperless-ngx management commands.
|
||||
|
||||
Provides automatic progress bar and multiprocessing support with minimal boilerplate.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from collections.abc import Iterable
|
||||
from collections.abc import Sized
|
||||
from concurrent.futures import ProcessPoolExecutor
|
||||
from concurrent.futures import as_completed
|
||||
from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING
|
||||
from typing import Any
|
||||
from typing import ClassVar
|
||||
from typing import Generic
|
||||
from typing import TypeVar
|
||||
|
||||
from django import db
|
||||
from django.core.management import CommandError
|
||||
from django.db.models import QuerySet
|
||||
from django_rich.management import RichCommand
|
||||
from rich.console import Console
|
||||
from rich.progress import BarColumn
|
||||
from rich.progress import MofNCompleteColumn
|
||||
from rich.progress import Progress
|
||||
from rich.progress import SpinnerColumn
|
||||
from rich.progress import TextColumn
|
||||
from rich.progress import TimeElapsedColumn
|
||||
from rich.progress import TimeRemainingColumn
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable
|
||||
from collections.abc import Generator
|
||||
from collections.abc import Iterable
|
||||
from collections.abc import Sequence
|
||||
|
||||
from django.core.management import CommandParser
|
||||
|
||||
T = TypeVar("T")
|
||||
R = TypeVar("R")
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class ProcessResult(Generic[T, R]):
|
||||
"""
|
||||
Result of processing a single item in parallel.
|
||||
|
||||
Attributes:
|
||||
item: The input item that was processed.
|
||||
result: The return value from the processing function, or None if an error occurred.
|
||||
error: The exception if processing failed, or None on success.
|
||||
"""
|
||||
|
||||
item: T
|
||||
result: R | None
|
||||
error: BaseException | None
|
||||
|
||||
@property
|
||||
def success(self) -> bool:
|
||||
"""Return True if the item was processed successfully."""
|
||||
return self.error is None
|
||||
|
||||
|
||||
class PaperlessCommand(RichCommand):
|
||||
"""
|
||||
Base command class with automatic progress bar and multiprocessing support.
|
||||
|
||||
Features are opt-in via class attributes:
|
||||
supports_progress_bar: Adds --no-progress-bar argument (default: True)
|
||||
supports_multiprocessing: Adds --processes argument (default: False)
|
||||
|
||||
Example usage:
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
help = "Process all documents"
|
||||
|
||||
def handle(self, *args, **options):
|
||||
documents = Document.objects.all()
|
||||
for doc in self.track(documents, description="Processing..."):
|
||||
process_document(doc)
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
help = "Regenerate thumbnails"
|
||||
supports_multiprocessing = True
|
||||
|
||||
def handle(self, *args, **options):
|
||||
ids = list(Document.objects.values_list("id", flat=True))
|
||||
for result in self.process_parallel(process_doc, ids):
|
||||
if result.error:
|
||||
self.console.print(f"[red]Failed: {result.error}[/red]")
|
||||
"""
|
||||
|
||||
supports_progress_bar: ClassVar[bool] = True
|
||||
supports_multiprocessing: ClassVar[bool] = False
|
||||
|
||||
# Instance attributes set by execute() before handle() runs
|
||||
no_progress_bar: bool
|
||||
process_count: int
|
||||
|
||||
def add_arguments(self, parser: CommandParser) -> None:
|
||||
"""Add arguments based on supported features."""
|
||||
super().add_arguments(parser)
|
||||
|
||||
if self.supports_progress_bar:
|
||||
parser.add_argument(
|
||||
"--no-progress-bar",
|
||||
default=False,
|
||||
action="store_true",
|
||||
help="Disable the progress bar",
|
||||
)
|
||||
|
||||
if self.supports_multiprocessing:
|
||||
default_processes = max(1, (os.cpu_count() or 1) // 4)
|
||||
parser.add_argument(
|
||||
"--processes",
|
||||
default=default_processes,
|
||||
type=int,
|
||||
help=f"Number of processes to use (default: {default_processes})",
|
||||
)
|
||||
|
||||
def execute(self, *args: Any, **options: Any) -> str | None:
|
||||
"""
|
||||
Set up instance state before handle() is called.
|
||||
|
||||
This is called by Django's command infrastructure after argument parsing
|
||||
but before handle(). We use it to set instance attributes from options.
|
||||
"""
|
||||
# Set progress bar state
|
||||
if self.supports_progress_bar:
|
||||
self.no_progress_bar = options.get("no_progress_bar", False)
|
||||
else:
|
||||
self.no_progress_bar = True
|
||||
|
||||
# Set multiprocessing state
|
||||
if self.supports_multiprocessing:
|
||||
self.process_count = options.get("processes", 1)
|
||||
if self.process_count < 1:
|
||||
raise CommandError("--processes must be at least 1")
|
||||
else:
|
||||
self.process_count = 1
|
||||
|
||||
return super().execute(*args, **options)
|
||||
|
||||
def _create_progress(self, description: str) -> Progress:
|
||||
"""
|
||||
Create a configured Progress instance.
|
||||
|
||||
Progress output is directed to stderr to match the convention that
|
||||
progress bars are transient UI feedback, not command output. This
|
||||
mirrors tqdm's default behavior and prevents progress bar rendering
|
||||
from interfering with stdout-based assertions in tests or piped
|
||||
command output.
|
||||
|
||||
Args:
|
||||
description: Text to display alongside the progress bar.
|
||||
|
||||
Returns:
|
||||
A Progress instance configured with appropriate columns.
|
||||
"""
|
||||
return Progress(
|
||||
SpinnerColumn(),
|
||||
TextColumn("[progress.description]{task.description}"),
|
||||
BarColumn(),
|
||||
MofNCompleteColumn(),
|
||||
TimeElapsedColumn(),
|
||||
TimeRemainingColumn(),
|
||||
console=Console(stderr=True),
|
||||
transient=False,
|
||||
)
|
||||
|
||||
def _get_iterable_length(self, iterable: Iterable[object]) -> int | None:
|
||||
"""
|
||||
Attempt to determine the length of an iterable without consuming it.
|
||||
|
||||
Tries .count() first (for Django querysets - executes SELECT COUNT(*)),
|
||||
then falls back to len() for sequences.
|
||||
|
||||
Args:
|
||||
iterable: The iterable to measure.
|
||||
|
||||
Returns:
|
||||
The length if determinable, None otherwise.
|
||||
"""
|
||||
if isinstance(iterable, QuerySet):
|
||||
return iterable.count()
|
||||
|
||||
if isinstance(iterable, Sized):
|
||||
return len(iterable)
|
||||
|
||||
return None
|
||||
|
||||
def track(
|
||||
self,
|
||||
iterable: Iterable[T],
|
||||
*,
|
||||
description: str = "Processing...",
|
||||
total: int | None = None,
|
||||
) -> Generator[T, None, None]:
|
||||
"""
|
||||
Iterate over items with an optional progress bar.
|
||||
|
||||
Respects --no-progress-bar flag. When disabled, simply yields items
|
||||
without any progress display.
|
||||
|
||||
Args:
|
||||
iterable: The items to iterate over.
|
||||
description: Text to display alongside the progress bar.
|
||||
total: Total number of items. If None, attempts to determine
|
||||
automatically via .count() (for querysets) or len().
|
||||
|
||||
Yields:
|
||||
Items from the iterable.
|
||||
|
||||
Example:
|
||||
for doc in self.track(documents, description="Renaming..."):
|
||||
process(doc)
|
||||
"""
|
||||
if self.no_progress_bar:
|
||||
yield from iterable
|
||||
return
|
||||
|
||||
# Attempt to determine total if not provided
|
||||
if total is None:
|
||||
total = self._get_iterable_length(iterable)
|
||||
|
||||
with self._create_progress(description) as progress:
|
||||
task_id = progress.add_task(description, total=total)
|
||||
for item in iterable:
|
||||
yield item
|
||||
progress.advance(task_id)
|
||||
|
||||
def process_parallel(
|
||||
self,
|
||||
fn: Callable[[T], R],
|
||||
items: Sequence[T],
|
||||
*,
|
||||
description: str = "Processing...",
|
||||
) -> Generator[ProcessResult[T, R], None, None]:
|
||||
"""
|
||||
Process items in parallel with progress tracking.
|
||||
|
||||
When --processes=1, runs sequentially in the main process without
|
||||
spawning subprocesses. This is critical for testing, as multiprocessing
|
||||
breaks fixtures, mocks, and database transactions.
|
||||
|
||||
        When --processes > 1, uses ProcessPoolExecutor and automatically closes
        database connections before spawning workers (required for PostgreSQL).

        Args:
            fn: Function to apply to each item. Must be picklable for parallel
                execution (i.e., defined at module level, not a lambda or closure).
            items: Sequence of items to process.
            description: Text to display alongside the progress bar.

        Yields:
            ProcessResult for each item, containing the item, result, and any error.

        Example:
            def regenerate_thumbnail(doc_id: int) -> Path:
                ...

            for result in self.process_parallel(regenerate_thumbnail, doc_ids):
                if result.error:
                    self.console.print(f"[red]Failed {result.item}[/red]")
        """
        total = len(items)

        if self.process_count == 1:
            # Sequential execution in main process - critical for testing
            yield from self._process_sequential(fn, items, description, total)
        else:
            # Parallel execution with ProcessPoolExecutor
            yield from self._process_parallel(fn, items, description, total)

    def _process_sequential(
        self,
        fn: Callable[[T], R],
        items: Sequence[T],
        description: str,
        total: int,
    ) -> Generator[ProcessResult[T, R], None, None]:
        """Process items sequentially in the main process."""
        for item in self.track(items, description=description, total=total):
            try:
                result = fn(item)
                yield ProcessResult(item=item, result=result, error=None)
            except Exception as e:
                yield ProcessResult(item=item, result=None, error=e)

    def _process_parallel(
        self,
        fn: Callable[[T], R],
        items: Sequence[T],
        description: str,
        total: int,
    ) -> Generator[ProcessResult[T, R], None, None]:
        """Process items in parallel using ProcessPoolExecutor."""
        # Close database connections before forking - required for PostgreSQL
        db.connections.close_all()

        with self._create_progress(description) as progress:
            task_id = progress.add_task(description, total=total)

            with ProcessPoolExecutor(max_workers=self.process_count) as executor:
                # Submit all tasks and map futures back to items
                future_to_item = {executor.submit(fn, item): item for item in items}

                # Yield results as they complete
                for future in as_completed(future_to_item):
                    item = future_to_item[future]
                    try:
                        result = future.result()
                        yield ProcessResult(item=item, result=result, error=None)
                    except Exception as e:
                        yield ProcessResult(item=item, result=None, error=e)
                    finally:
                        progress.advance(task_id)
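Reviewer sketch: the submit-then-`as_completed` pattern used by `_process_parallel` can be tried outside Django roughly as below. This is only an illustration under stated assumptions: `Outcome`, `run_all`, and the `executor_cls` parameter are hypothetical names that do not appear in the diff, and the progress bar and database connection handling are left out.

```python
from __future__ import annotations

from collections.abc import Callable, Iterator, Sequence
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Outcome:
    """Hypothetical stand-in for the diff's ProcessResult."""

    item: Any
    result: Any = None
    error: BaseException | None = None


def run_all(
    fn: Callable[[Any], Any],
    items: Sequence[Any],
    workers: int = 2,
    executor_cls: type = ProcessPoolExecutor,  # assumption: injectable for tests
) -> Iterator[Outcome]:
    """Submit every item, then yield one Outcome per item as futures finish."""
    with executor_cls(max_workers=workers) as executor:
        # Map each future back to the item it was created for
        future_to_item = {executor.submit(fn, item): item for item in items}
        for future in as_completed(future_to_item):
            item = future_to_item[future]
            try:
                yield Outcome(item=item, result=future.result())
            except Exception as exc:
                # A failing item is reported, not raised, so the loop continues
                yield Outcome(item=item, error=exc)
```

With the default `ProcessPoolExecutor`, `fn` has to be picklable (a module-level function), as the docstring in the diff notes. Passing `executor_cls=ThreadPoolExecutor` is a convenient way to exercise the same control flow in a test without spawning processes. Results arrive in completion order, not submission order.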
@@ -1,15 +1,20 @@
|
||||
import logging
|
||||
import multiprocessing
|
||||
|
||||
import tqdm
|
||||
from django import db
|
||||
from django.conf import settings
|
||||
from django.core.management.base import BaseCommand
|
||||
|
||||
from documents.management.commands.base import PaperlessCommand
|
||||
from documents.management.commands.mixins import MultiProcessMixin
|
||||
from documents.management.commands.mixins import ProgressBarMixin
|
||||
from documents.models import Document
|
||||
from documents.tasks import update_document_content_maybe_archive_file
|
||||
|
||||
logger = logging.getLogger("paperless.management.archiver")
|
||||
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
class Command(MultiProcessMixin, ProgressBarMixin, BaseCommand):
|
||||
help = (
|
||||
"Using the current classification model, assigns correspondents, tags "
|
||||
"and document types to all documents, effectively allowing you to "
|
||||
@@ -17,10 +22,7 @@ class Command(PaperlessCommand):
|
||||
"modified) after their initial import."
|
||||
)
|
||||
|
||||
supports_multiprocessing = True
|
||||
|
||||
def add_arguments(self, parser):
|
||||
super().add_arguments(parser)
|
||||
parser.add_argument(
|
||||
"-f",
|
||||
"--overwrite",
|
||||
@@ -42,8 +44,13 @@ class Command(PaperlessCommand):
|
||||
"run on this specific document."
|
||||
),
|
||||
)
|
||||
self.add_argument_progress_bar_mixin(parser)
|
||||
self.add_argument_processes_mixin(parser)
|
||||
|
||||
def handle(self, *args, **options):
|
||||
self.handle_processes_mixin(**options)
|
||||
self.handle_progress_bar_mixin(**options)
|
||||
|
||||
settings.SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
overwrite = options["overwrite"]
|
||||
@@ -53,21 +60,35 @@ class Command(PaperlessCommand):
|
||||
else:
|
||||
documents = Document.objects.all()
|
||||
|
||||
document_ids = [
|
||||
doc.id for doc in documents if overwrite or not doc.has_archive_version
|
||||
]
|
||||
document_ids = list(
|
||||
map(
|
||||
lambda doc: doc.id,
|
||||
filter(lambda d: overwrite or not d.has_archive_version, documents),
|
||||
),
|
||||
)
|
||||
|
||||
# Note to future self: this prevents django from reusing database
|
||||
# connections between processes, which is bad and does not work
|
||||
# with postgres.
|
||||
db.connections.close_all()
|
||||
|
||||
try:
|
||||
logging.getLogger().handlers[0].level = logging.ERROR
|
||||
|
||||
for result in self.process_parallel(
|
||||
update_document_content_maybe_archive_file,
|
||||
document_ids,
|
||||
description="Archiving...",
|
||||
):
|
||||
if result.error:
|
||||
self.console.print(
|
||||
f"[red]Failed document {result.item}: {result.error}[/red]",
|
||||
if self.process_count == 1:
|
||||
for doc_id in document_ids:
|
||||
update_document_content_maybe_archive_file(doc_id)
|
||||
else: # pragma: no cover
|
||||
with multiprocessing.Pool(self.process_count) as pool:
|
||||
list(
|
||||
tqdm.tqdm(
|
||||
pool.imap_unordered(
|
||||
update_document_content_maybe_archive_file,
|
||||
document_ids,
|
||||
),
|
||||
total=len(document_ids),
|
||||
disable=self.no_progress_bar,
|
||||
),
|
||||
)
|
||||
except KeyboardInterrupt: # pragma: no cover
|
||||
self.console.print("[yellow]Aborting...[/yellow]")
|
||||
except KeyboardInterrupt:
|
||||
self.stdout.write(self.style.NOTICE("Aborting..."))
|
||||
|
||||
@@ -1,20 +1,24 @@
|
||||
import dataclasses
|
||||
import multiprocessing
|
||||
from typing import Final
|
||||
|
||||
import rapidfuzz
|
||||
import tqdm
|
||||
from django.core.management import BaseCommand
|
||||
from django.core.management import CommandError
|
||||
|
||||
from documents.management.commands.base import PaperlessCommand
|
||||
from documents.management.commands.mixins import MultiProcessMixin
|
||||
from documents.management.commands.mixins import ProgressBarMixin
|
||||
from documents.models import Document
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True, slots=True)
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class _WorkPackage:
|
||||
first_doc: Document
|
||||
second_doc: Document
|
||||
|
||||
|
||||
@dataclasses.dataclass(frozen=True, slots=True)
|
||||
@dataclasses.dataclass(frozen=True)
|
||||
class _WorkResult:
|
||||
doc_one_pk: int
|
||||
doc_two_pk: int
|
||||
@@ -27,23 +31,22 @@ class _WorkResult:
|
||||
def _process_and_match(work: _WorkPackage) -> _WorkResult:
|
||||
"""
|
||||
Does basic processing of document content, gets the basic ratio
|
||||
and returns the result package.
|
||||
and returns the result package
|
||||
"""
|
||||
# Normalize the string some, lower case, whitespace, etc
|
||||
first_string = rapidfuzz.utils.default_process(work.first_doc.content)
|
||||
second_string = rapidfuzz.utils.default_process(work.second_doc.content)
|
||||
|
||||
# Basic matching ratio
|
||||
match = rapidfuzz.fuzz.ratio(first_string, second_string)
|
||||
|
||||
return _WorkResult(work.first_doc.pk, work.second_doc.pk, match)
|
||||
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
class Command(MultiProcessMixin, ProgressBarMixin, BaseCommand):
|
||||
help = "Searches for documents where the content almost matches"
|
||||
|
||||
supports_multiprocessing = True
|
||||
|
||||
def add_arguments(self, parser):
|
||||
super().add_arguments(parser)
|
||||
parser.add_argument(
|
||||
"--ratio",
|
||||
default=85.0,
|
||||
@@ -56,11 +59,16 @@ class Command(PaperlessCommand):
|
||||
action="store_true",
|
||||
help="If set, one document of matches above the ratio WILL BE DELETED",
|
||||
)
|
||||
self.add_argument_progress_bar_mixin(parser)
|
||||
self.add_argument_processes_mixin(parser)
|
||||
|
||||
def handle(self, *args, **options):
|
||||
RATIO_MIN: Final[float] = 0.0
|
||||
RATIO_MAX: Final[float] = 100.0
|
||||
|
||||
self.handle_processes_mixin(**options)
|
||||
self.handle_progress_bar_mixin(**options)
|
||||
|
||||
if options["delete"]:
|
||||
self.stdout.write(
|
||||
self.style.WARNING(
|
||||
@@ -72,58 +80,66 @@ class Command(PaperlessCommand):
|
||||
checked_pairs: set[tuple[int, int]] = set()
|
||||
work_pkgs: list[_WorkPackage] = []
|
||||
|
||||
# Ratio is a float from 0.0 to 100.0
|
||||
if opt_ratio < RATIO_MIN or opt_ratio > RATIO_MAX:
|
||||
raise CommandError("The ratio must be between 0 and 100")
|
||||
|
||||
all_docs = Document.objects.all().order_by("id")
|
||||
|
||||
# Build work packages for processing
|
||||
for first_doc in all_docs:
|
||||
for second_doc in all_docs:
|
||||
# doc to doc is obviously not useful
|
||||
if first_doc.pk == second_doc.pk:
|
||||
continue
|
||||
# Skip empty documents (e.g. password-protected)
|
||||
if first_doc.content.strip() == "" or second_doc.content.strip() == "":
|
||||
continue
|
||||
# Skip matching which have already been matched together
|
||||
# doc 1 to doc 2 is the same as doc 2 to doc 1
|
||||
doc_1_to_doc_2 = (first_doc.pk, second_doc.pk)
|
||||
doc_2_to_doc_1 = doc_1_to_doc_2[::-1]
|
||||
if doc_1_to_doc_2 in checked_pairs or doc_2_to_doc_1 in checked_pairs:
|
||||
continue
|
||||
checked_pairs.update([doc_1_to_doc_2, doc_2_to_doc_1])
|
||||
# Actually something useful to work on now
|
||||
work_pkgs.append(_WorkPackage(first_doc, second_doc))
|
||||
|
||||
results: list[_WorkResult] = []
|
||||
# Don't spin up a pool of 1 process
|
||||
if self.process_count == 1:
|
||||
for work in self.track(work_pkgs, description="Matching..."):
|
||||
results = []
|
||||
for work in tqdm.tqdm(work_pkgs, disable=self.no_progress_bar):
|
||||
results.append(_process_and_match(work))
|
||||
else: # pragma: no cover
|
||||
for proc_result in self.process_parallel(
|
||||
_process_and_match,
|
||||
work_pkgs,
|
||||
description="Matching...",
|
||||
):
|
||||
if proc_result.error:
|
||||
self.console.print(
|
||||
f"[red]Failed: {proc_result.error}[/red]",
|
||||
)
|
||||
elif proc_result.result is not None:
|
||||
results.append(proc_result.result)
|
||||
|
||||
messages: list[str] = []
|
||||
maybe_delete_ids: list[int] = []
|
||||
for match_result in sorted(results):
|
||||
if match_result.ratio >= opt_ratio:
|
||||
messages.append(
|
||||
self.style.NOTICE(
|
||||
f"Document {match_result.doc_one_pk} fuzzy match"
|
||||
f" to {match_result.doc_two_pk}"
|
||||
f" (confidence {match_result.ratio:.3f})\n",
|
||||
with multiprocessing.Pool(processes=self.process_count) as pool:
|
||||
results = list(
|
||||
tqdm.tqdm(
|
||||
pool.imap_unordered(_process_and_match, work_pkgs),
|
||||
total=len(work_pkgs),
|
||||
disable=self.no_progress_bar,
|
||||
),
|
||||
)
|
||||
maybe_delete_ids.append(match_result.doc_two_pk)
|
||||
|
||||
# Check results
|
||||
messages = []
|
||||
maybe_delete_ids = []
|
||||
for result in sorted(results):
|
||||
if result.ratio >= opt_ratio:
|
||||
messages.append(
|
||||
self.style.NOTICE(
|
||||
f"Document {result.doc_one_pk} fuzzy match"
|
||||
f" to {result.doc_two_pk} (confidence {result.ratio:.3f})\n",
|
||||
),
|
||||
)
|
||||
maybe_delete_ids.append(result.doc_two_pk)
|
||||
|
||||
if len(messages) == 0:
|
||||
messages.append(self.style.SUCCESS("No matches found\n"))
|
||||
self.stdout.writelines(messages)
|
||||
|
||||
messages.append(
|
||||
self.style.SUCCESS("No matches found\n"),
|
||||
)
|
||||
self.stdout.writelines(
|
||||
messages,
|
||||
)
|
||||
if options["delete"]:
|
||||
self.stdout.write(
|
||||
self.style.NOTICE(
|
||||
|
||||
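Reviewer sketch: the nested loop with `checked_pairs` above builds every unordered pair of documents exactly once. Assuming the primary keys are unique, the same set of pairs could also be produced with `itertools.combinations`, as sketched below. The names `unique_pairs` and `similarity` are hypothetical, and `difflib.SequenceMatcher` is used only as a standard-library stand-in for `rapidfuzz.fuzz.ratio`; it is not the same metric and will generally give different scores.

```python
from difflib import SequenceMatcher
from itertools import combinations


def unique_pairs(ids):
    """Return each unordered pair once, e.g. (1, 2) but never (2, 1) or (1, 1)."""
    return list(combinations(ids, 2))


def similarity(first: str, second: str) -> float:
    """Stand-in score from 0.0 to 100.0 (not rapidfuzz's ratio)."""
    a = first.lower().strip()
    b = second.lower().strip()
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def matches_above(contents: dict, ratio: float):
    """Yield (pk_one, pk_two, score) for non-empty pairs scoring at least ratio."""
    for pk_one, pk_two in unique_pairs(sorted(contents)):
        if not contents[pk_one].strip() or not contents[pk_two].strip():
            continue  # skip empty documents, as the command does
        score = similarity(contents[pk_one], contents[pk_two])
        if score >= ratio:
            yield pk_one, pk_two, score
```

For n documents this yields n(n-1)/2 pairs, so the work still grows quadratically with the number of documents.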
@@ -1,12 +1,25 @@
|
||||
import logging
|
||||
|
||||
import tqdm
|
||||
from django.core.management.base import BaseCommand
|
||||
from django.db.models.signals import post_save
|
||||
|
||||
from documents.management.commands.base import PaperlessCommand
|
||||
from documents.management.commands.mixins import ProgressBarMixin
|
||||
from documents.models import Document
|
||||
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
help = "Rename all documents"
|
||||
class Command(ProgressBarMixin, BaseCommand):
|
||||
help = "This will rename all documents to match the latest filename format."
|
||||
|
||||
def add_arguments(self, parser):
|
||||
self.add_argument_progress_bar_mixin(parser)
|
||||
|
||||
def handle(self, *args, **options):
|
||||
for document in self.track(Document.objects.all(), description="Renaming..."):
|
||||
self.handle_progress_bar_mixin(**options)
|
||||
logging.getLogger().handlers[0].level = logging.ERROR
|
||||
|
||||
for document in tqdm.tqdm(
|
||||
Document.objects.all(),
|
||||
disable=self.no_progress_bar,
|
||||
):
|
||||
post_save.send(Document, instance=document, created=False)
|
||||
|
||||
@@ -1,7 +1,10 @@
|
||||
import logging
|
||||
|
||||
import tqdm
|
||||
from django.core.management.base import BaseCommand
|
||||
|
||||
from documents.classifier import load_classifier
|
||||
from documents.management.commands.base import PaperlessCommand
|
||||
from documents.management.commands.mixins import ProgressBarMixin
|
||||
from documents.models import Document
|
||||
from documents.signals.handlers import set_correspondent
|
||||
from documents.signals.handlers import set_document_type
|
||||
@@ -11,7 +14,7 @@ from documents.signals.handlers import set_tags
|
||||
logger = logging.getLogger("paperless.management.retagger")
|
||||
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
class Command(ProgressBarMixin, BaseCommand):
|
||||
help = (
|
||||
"Using the current classification model, assigns correspondents, tags "
|
||||
"and document types to all documents, effectively allowing you to "
|
||||
@@ -20,7 +23,6 @@ class Command(PaperlessCommand):
|
||||
)
|
||||
|
||||
def add_arguments(self, parser):
|
||||
super().add_arguments(parser)
|
||||
parser.add_argument("-c", "--correspondent", default=False, action="store_true")
|
||||
parser.add_argument("-T", "--tags", default=False, action="store_true")
|
||||
parser.add_argument("-t", "--document_type", default=False, action="store_true")
|
||||
@@ -32,7 +34,7 @@ class Command(PaperlessCommand):
|
||||
action="store_true",
|
||||
help=(
|
||||
"By default this command won't try to assign a correspondent "
|
||||
"if more than one matches the document. Use this flag if "
|
||||
"if more than one matches the document. Use this flag if "
|
||||
"you'd rather it just pick the first one it finds."
|
||||
),
|
||||
)
|
||||
@@ -47,6 +49,7 @@ class Command(PaperlessCommand):
|
||||
"and tags that do not match anymore due to changed rules."
|
||||
),
|
||||
)
|
||||
self.add_argument_progress_bar_mixin(parser)
|
||||
parser.add_argument(
|
||||
"--suggest",
|
||||
default=False,
|
||||
@@ -65,6 +68,8 @@ class Command(PaperlessCommand):
|
||||
)
|
||||
|
||||
def handle(self, *args, **options):
|
||||
self.handle_progress_bar_mixin(**options)
|
||||
|
||||
if options["inbox_only"]:
|
||||
queryset = Document.objects.filter(tags__is_inbox_tag=True)
|
||||
else:
|
||||
@@ -79,7 +84,7 @@ class Command(PaperlessCommand):
|
||||
|
||||
classifier = load_classifier()
|
||||
|
||||
for document in self.track(documents, description="Retagging..."):
|
||||
for document in tqdm.tqdm(documents, disable=self.no_progress_bar):
|
||||
if options["correspondent"]:
|
||||
set_correspondent(
|
||||
sender=None,
|
||||
@@ -117,7 +122,6 @@ class Command(PaperlessCommand):
|
||||
stdout=self.stdout,
|
||||
style_func=self.style,
|
||||
)
|
||||
|
||||
if options["storage_path"]:
|
||||
set_storage_path(
|
||||
sender=None,
|
||||
|
||||
@@ -1,45 +1,43 @@
|
||||
import logging
|
||||
import multiprocessing
|
||||
import shutil
|
||||
|
||||
from documents.management.commands.base import PaperlessCommand
|
||||
import tqdm
|
||||
from django import db
|
||||
from django.core.management.base import BaseCommand
|
||||
|
||||
from documents.management.commands.mixins import MultiProcessMixin
|
||||
from documents.management.commands.mixins import ProgressBarMixin
|
||||
from documents.models import Document
|
||||
from documents.parsers import get_parser_class_for_mime_type
|
||||
|
||||
logger = logging.getLogger("paperless.management.thumbnails")
|
||||
|
||||
|
||||
def _process_document(doc_id: int) -> None:
|
||||
def _process_document(doc_id) -> None:
|
||||
document: Document = Document.objects.get(id=doc_id)
|
||||
parser_class = get_parser_class_for_mime_type(document.mime_type)
|
||||
|
||||
if parser_class is None:
|
||||
logger.warning(
|
||||
"%s: No parser for mime type %s",
|
||||
document,
|
||||
document.mime_type,
|
||||
)
|
||||
if parser_class:
|
||||
parser = parser_class(logging_group=None)
|
||||
else:
|
||||
print(f"{document} No parser for mime type {document.mime_type}") # noqa: T201
|
||||
return
|
||||
|
||||
parser = parser_class(logging_group=None)
|
||||
|
||||
try:
|
||||
thumb = parser.get_thumbnail(
|
||||
document.source_path,
|
||||
document.mime_type,
|
||||
document.get_public_filename(),
|
||||
)
|
||||
|
||||
shutil.move(thumb, document.thumbnail_path)
|
||||
finally:
|
||||
parser.cleanup()
|
||||
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
class Command(MultiProcessMixin, ProgressBarMixin, BaseCommand):
|
||||
help = "This will regenerate the thumbnails for all documents."
|
||||
|
||||
supports_multiprocessing = True
|
||||
|
||||
def add_arguments(self, parser) -> None:
|
||||
super().add_arguments(parser)
|
||||
parser.add_argument(
|
||||
"-d",
|
||||
"--document",
|
||||
@@ -51,23 +49,36 @@ class Command(PaperlessCommand):
|
||||
"run on this specific document."
|
||||
),
|
||||
)
|
||||
self.add_argument_progress_bar_mixin(parser)
|
||||
self.add_argument_processes_mixin(parser)
|
||||
|
||||
def handle(self, *args, **options):
|
||||
logging.getLogger().handlers[0].level = logging.ERROR
|
||||
|
||||
self.handle_processes_mixin(**options)
|
||||
self.handle_progress_bar_mixin(**options)
|
||||
|
||||
if options["document"]:
|
||||
documents = Document.objects.filter(pk=options["document"])
|
||||
else:
|
||||
documents = Document.objects.all()
|
||||
|
||||
ids = list(documents.values_list("id", flat=True))
|
||||
ids = [doc.id for doc in documents]
|
||||
|
||||
for result in self.process_parallel(
|
||||
_process_document,
|
||||
ids,
|
||||
description="Regenerating thumbnails...",
|
||||
):
|
||||
if result.error: # pragma: no cover
|
||||
self.console.print(
|
||||
f"[red]Failed document {result.item}: {result.error}[/red]",
|
||||
# Note to future self: this prevents django from reusing database
|
||||
# connections between processes, which is bad and does not work
|
||||
# with postgres.
|
||||
db.connections.close_all()
|
||||
|
||||
if self.process_count == 1:
|
||||
for doc_id in ids:
|
||||
_process_document(doc_id)
|
||||
else: # pragma: no cover
|
||||
with multiprocessing.Pool(processes=self.process_count) as pool:
|
||||
list(
|
||||
tqdm.tqdm(
|
||||
pool.imap_unordered(_process_document, ids),
|
||||
total=len(ids),
|
||||
disable=self.no_progress_bar,
|
||||
),
|
||||
)
|
||||
|
||||
@@ -21,6 +21,26 @@ class CryptFields(TypedDict):
|
||||
fields: list[str]
|
||||
|
||||
|
||||
class MultiProcessMixin:
|
||||
"""
|
||||
Small class to handle adding an argument and validating it
|
||||
for the use of multiple processes
|
||||
"""
|
||||
|
||||
def add_argument_processes_mixin(self, parser: ArgumentParser) -> None:
|
||||
parser.add_argument(
|
||||
"--processes",
|
||||
default=max(1, os.cpu_count() // 4),
|
||||
type=int,
|
||||
help="Number of processes to distribute work amongst",
|
||||
)
|
||||
|
||||
def handle_processes_mixin(self, *args, **options) -> None:
|
||||
self.process_count = options["processes"]
|
||||
if self.process_count < 1:
|
||||
raise CommandError("There must be at least 1 process")
|
||||
|
||||
|
||||
class ProgressBarMixin:
|
||||
"""
|
||||
Many commands use a progress bar, which can be disabled
|
||||
|
||||
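Reviewer sketch: the `--processes` default in `MultiProcessMixin` is `max(1, os.cpu_count() // 4)`. `os.cpu_count()` is documented to return `None` when the count cannot be determined, in which case the floor division would raise a `TypeError`. A guarded variant could look like the following; `default_process_count`, `build_parser`, and `validated_processes` are hypothetical helper names, and `ValueError` stands in for Django's `CommandError` so the example runs without Django.

```python
import argparse
import os


def default_process_count() -> int:
    """A quarter of the CPUs, at least 1, even when the CPU count is unknown."""
    return max(1, (os.cpu_count() or 1) // 4)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--processes",
        default=default_process_count(),
        type=int,
        help="Number of processes to distribute work amongst",
    )
    return parser


def validated_processes(argv) -> int:
    """Parse argv and reject process counts below 1."""
    count = build_parser().parse_args(argv).processes
    if count < 1:
        # The mixin raises CommandError here; ValueError is a stand-in
        raise ValueError("There must be at least 1 process")
    return count
```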
@@ -1,21 +1,27 @@
|
||||
from auditlog.models import LogEntry
|
||||
from django.core.management.base import BaseCommand
|
||||
from django.db import transaction
|
||||
from tqdm import tqdm
|
||||
|
||||
from documents.management.commands.base import PaperlessCommand
|
||||
from documents.management.commands.mixins import ProgressBarMixin
|
||||
|
||||
|
||||
class Command(PaperlessCommand):
|
||||
"""Prune the audit logs of objects that no longer exist."""
|
||||
class Command(BaseCommand, ProgressBarMixin):
|
||||
"""
|
||||
Prune the audit logs of objects that no longer exist.
|
||||
"""
|
||||
|
||||
help = "Prunes the audit logs of objects that no longer exist."
|
||||
|
||||
def handle(self, *args, **options):
|
||||
def add_arguments(self, parser):
|
||||
self.add_argument_progress_bar_mixin(parser)
|
||||
|
||||
def handle(self, **options):
|
||||
self.handle_progress_bar_mixin(**options)
|
||||
with transaction.atomic():
|
||||
for log_entry in self.track(
|
||||
LogEntry.objects.all(),
|
||||
description="Pruning audit logs...",
|
||||
):
|
||||
for log_entry in tqdm(LogEntry.objects.all(), disable=self.no_progress_bar):
|
||||
model_class = log_entry.content_type.model_class()
|
||||
# use global_objects for SoftDeleteModel
|
||||
objects = (
|
||||
model_class.global_objects
|
||||
if hasattr(model_class, "global_objects")
|
||||
@@ -26,8 +32,8 @@ class Command(PaperlessCommand):
|
||||
and not objects.filter(pk=log_entry.object_id).exists()
|
||||
):
|
||||
log_entry.delete()
|
||||
self.console.print(
|
||||
f"Deleted audit log entry for "
|
||||
f"{model_class.__name__} #{log_entry.object_id}",
|
||||
style="yellow",
|
||||
tqdm.write(
|
||||
self.style.NOTICE(
|
||||
f"Deleted audit log entry for {model_class.__name__} #{log_entry.object_id}",
|
||||
),
|
||||
)
|
||||
|
||||
@@ -9,7 +9,7 @@ from types import TracebackType
|
||||
try:
|
||||
from typing import Self
|
||||
except ImportError:
|
||||
from typing_extensions import Self
|
||||
from typing import Self
|
||||
|
||||
import dateparser
|
||||
|
||||
|
||||
@@ -8,7 +8,7 @@ if TYPE_CHECKING:
|
||||
from channels_redis.pubsub import RedisPubSubChannelLayer
|
||||
|
||||
|
||||
class ProgressStatusOptions(str, enum.Enum):
|
||||
class ProgressStatusOptions(enum.StrEnum):
|
||||
STARTED = "STARTED"
|
||||
WORKING = "WORKING"
|
||||
SUCCESS = "SUCCESS"
|
||||
|
||||
@@ -24,7 +24,7 @@ def base_config() -> DateParserConfig:
|
||||
12,
|
||||
0,
|
||||
0,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
),
|
||||
filename_date_order="YMD",
|
||||
content_date_order="DMY",
|
||||
@@ -45,7 +45,7 @@ def config_with_ignore_dates() -> DateParserConfig:
|
||||
12,
|
||||
0,
|
||||
0,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
),
|
||||
filename_date_order="DMY",
|
||||
content_date_order="MDY",
|
||||
|
||||
@@ -101,50 +101,50 @@ class TestFilterDate:
|
||||
[
|
||||
# Valid Dates
|
||||
pytest.param(
|
||||
datetime.datetime(2024, 1, 10, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 10, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 10, tzinfo=datetime.UTC),
|
||||
datetime.datetime(2024, 1, 10, tzinfo=datetime.UTC),
|
||||
id="valid_past_date",
|
||||
),
|
||||
pytest.param(
|
||||
datetime.datetime(2024, 1, 15, 12, 0, 0, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 15, 12, 0, 0, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 15, 12, 0, 0, tzinfo=datetime.UTC),
|
||||
datetime.datetime(2024, 1, 15, 12, 0, 0, tzinfo=datetime.UTC),
|
||||
id="exactly_at_reference",
|
||||
),
|
||||
pytest.param(
|
||||
datetime.datetime(1901, 1, 1, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(1901, 1, 1, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(1901, 1, 1, tzinfo=datetime.UTC),
|
||||
datetime.datetime(1901, 1, 1, tzinfo=datetime.UTC),
|
||||
id="year_1901_valid",
|
||||
),
|
||||
# Date is > reference_time
|
||||
pytest.param(
|
||||
datetime.datetime(2024, 1, 16, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 16, tzinfo=datetime.UTC),
|
||||
None,
|
||||
id="future_date_day_after",
|
||||
),
|
||||
# date.date() in ignore_dates
|
||||
pytest.param(
|
||||
datetime.datetime(2024, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 1, 0, 0, 0, tzinfo=datetime.UTC),
|
||||
None,
|
||||
id="ignored_date_midnight_jan1",
|
||||
),
|
||||
pytest.param(
|
||||
datetime.datetime(2024, 1, 1, 10, 30, 0, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 1, 1, 10, 30, 0, tzinfo=datetime.UTC),
|
||||
None,
|
||||
id="ignored_date_midday_jan1",
|
||||
),
|
||||
pytest.param(
|
||||
datetime.datetime(2024, 12, 25, 15, 0, 0, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2024, 12, 25, 15, 0, 0, tzinfo=datetime.UTC),
|
||||
None,
|
||||
id="ignored_date_dec25_future",
|
||||
),
|
||||
# date.year <= 1900
|
||||
pytest.param(
|
||||
datetime.datetime(1899, 12, 31, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(1899, 12, 31, tzinfo=datetime.UTC),
|
||||
None,
|
||||
id="year_1899",
|
||||
),
|
||||
pytest.param(
|
||||
datetime.datetime(1900, 1, 1, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(1900, 1, 1, tzinfo=datetime.UTC),
|
||||
None,
|
||||
id="year_1900_boundary",
|
||||
),
|
||||
@@ -176,7 +176,7 @@ class TestFilterDate:
|
||||
1,
|
||||
12,
|
||||
0,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
)
|
||||
another_ignored = datetime.datetime(
|
||||
2024,
|
||||
@@ -184,7 +184,7 @@ class TestFilterDate:
|
||||
25,
|
||||
15,
|
||||
30,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
)
|
||||
allowed_date = datetime.datetime(
|
||||
2024,
|
||||
@@ -192,7 +192,7 @@ class TestFilterDate:
|
||||
2,
|
||||
12,
|
||||
0,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
)
|
||||
|
||||
assert parser._filter_date(ignored_date) is None
|
||||
@@ -204,7 +204,7 @@ class TestFilterDate:
|
||||
regex_parser: RegexDateParserPlugin,
|
||||
) -> None:
|
||||
"""Should work with timezone-aware datetimes."""
|
||||
date_utc = datetime.datetime(2024, 1, 10, 12, 0, tzinfo=datetime.timezone.utc)
|
||||
date_utc = datetime.datetime(2024, 1, 10, 12, 0, tzinfo=datetime.UTC)
|
||||
|
||||
result = regex_parser._filter_date(date_utc)
|
||||
|
||||
@@ -221,8 +221,8 @@ class TestRegexDateParser:
|
||||
"report-2023-12-25.txt",
|
||||
"Event recorded on 25/12/2022.",
|
||||
[
|
||||
datetime.datetime(2023, 12, 25, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2022, 12, 25, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2023, 12, 25, tzinfo=datetime.UTC),
|
||||
datetime.datetime(2022, 12, 25, tzinfo=datetime.UTC),
|
||||
],
|
||||
id="filename-y-m-d_and_content-d-m-y",
|
||||
),
|
||||
@@ -230,8 +230,8 @@ class TestRegexDateParser:
|
||||
"img_2023.01.02.jpg",
|
||||
"Taken on 01/02/2023",
|
||||
[
|
||||
datetime.datetime(2023, 1, 2, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2023, 2, 1, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2023, 1, 2, tzinfo=datetime.UTC),
|
||||
datetime.datetime(2023, 2, 1, tzinfo=datetime.UTC),
|
||||
],
|
||||
id="ambiguous-dates-respect-orders",
|
||||
),
|
||||
@@ -239,7 +239,7 @@ class TestRegexDateParser:
|
||||
"notes.txt",
|
||||
"bad date 99/99/9999 and 25/12/2022",
|
||||
[
|
||||
datetime.datetime(2022, 12, 25, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2022, 12, 25, tzinfo=datetime.UTC),
|
||||
],
|
||||
id="parse-exception-skips-bad-and-yields-good",
|
||||
),
|
||||
@@ -275,24 +275,24 @@ class TestRegexDateParser:
|
||||
or "2023.12.25" in date_string
|
||||
or "2023-12-25" in date_string
|
||||
):
|
||||
return datetime.datetime(2023, 12, 25, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 12, 25, tzinfo=datetime.UTC)
|
||||
|
||||
# content DMY 25/12/2022
|
||||
if "25/12/2022" in date_string or "25-12-2022" in date_string:
|
||||
return datetime.datetime(2022, 12, 25, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2022, 12, 25, tzinfo=datetime.UTC)
|
||||
|
||||
# filename YMD 2023.01.02
|
||||
if "2023.01.02" in date_string or "2023-01-02" in date_string:
|
||||
return datetime.datetime(2023, 1, 2, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 1, 2, tzinfo=datetime.UTC)
|
||||
|
||||
# ambiguous 01/02/2023 -> respect DATE_ORDER setting
|
||||
if "01/02/2023" in date_string:
|
||||
if date_order == "DMY":
|
||||
return datetime.datetime(2023, 2, 1, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 2, 1, tzinfo=datetime.UTC)
|
||||
if date_order == "YMD":
|
||||
return datetime.datetime(2023, 1, 2, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 1, 2, tzinfo=datetime.UTC)
|
||||
# fallback
|
||||
return datetime.datetime(2023, 2, 1, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 2, 1, tzinfo=datetime.UTC)
|
||||
|
||||
# simulate parse failure for malformed input
|
||||
if "99/99/9999" in date_string or "bad date" in date_string:
|
||||
@@ -328,7 +328,7 @@ class TestRegexDateParser:
|
||||
12,
|
||||
0,
|
||||
0,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
),
|
||||
filename_date_order="YMD",
|
||||
content_date_order="DMY",
|
||||
@@ -344,13 +344,13 @@ class TestRegexDateParser:
|
||||
) -> datetime.datetime | None:
|
||||
if "10/12/2023" in date_string or "10-12-2023" in date_string:
|
||||
# ignored date
|
||||
return datetime.datetime(2023, 12, 10, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 12, 10, tzinfo=datetime.UTC)
|
||||
if "01/02/2024" in date_string or "01-02-2024" in date_string:
|
||||
# future relative to reference_time -> filtered
|
||||
return datetime.datetime(2024, 2, 1, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2024, 2, 1, tzinfo=datetime.UTC)
|
||||
if "05/01/2023" in date_string or "05-01-2023" in date_string:
|
||||
# valid
|
||||
return datetime.datetime(2023, 1, 5, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 1, 5, tzinfo=datetime.UTC)
|
||||
return None
|
||||
|
||||
mocker.patch(target, side_effect=fake_parse)
|
||||
@@ -358,7 +358,7 @@ class TestRegexDateParser:
|
||||
content = "Ignored: 10/12/2023, Future: 01/02/2024, Keep: 05/01/2023"
|
||||
results = list(parser.parse("whatever.txt", content))
|
||||
|
||||
assert results == [datetime.datetime(2023, 1, 5, tzinfo=datetime.timezone.utc)]
|
||||
assert results == [datetime.datetime(2023, 1, 5, tzinfo=datetime.UTC)]
|
||||
|
||||
def test_parse_handles_no_matches_and_returns_empty_list(
|
||||
self,
|
||||
@@ -392,7 +392,7 @@ class TestRegexDateParser:
|
||||
12,
|
||||
0,
|
||||
0,
|
||||
tzinfo=datetime.timezone.utc,
|
||||
tzinfo=datetime.UTC,
|
||||
),
|
||||
filename_date_order=None,
|
||||
content_date_order="DMY",
|
||||
@@ -409,9 +409,9 @@ class TestRegexDateParser:
|
||||
) -> datetime.datetime | None:
|
||||
# return distinct datetimes so we can tell which source was parsed
|
||||
if "25/12/2022" in date_string:
|
||||
return datetime.datetime(2022, 12, 25, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2022, 12, 25, tzinfo=datetime.UTC)
|
||||
if "2023-12-25" in date_string:
|
||||
return datetime.datetime(2023, 12, 25, tzinfo=datetime.timezone.utc)
|
||||
return datetime.datetime(2023, 12, 25, tzinfo=datetime.UTC)
|
||||
return None
|
||||
|
||||
mock = mocker.patch(target, side_effect=fake_parse)
|
||||
@@ -429,5 +429,5 @@ class TestRegexDateParser:
|
||||
assert "25/12/2022" in called_date_string
|
||||
# And the parser should have yielded the corresponding datetime
|
||||
assert results == [
|
||||
datetime.datetime(2022, 12, 25, tzinfo=datetime.timezone.utc),
|
||||
datetime.datetime(2022, 12, 25, tzinfo=datetime.UTC),
|
||||
]
|
||||
|
||||
@@ -1,518 +0,0 @@
"""Tests for PaperlessCommand base class."""

from __future__ import annotations

import io
from typing import TYPE_CHECKING

import pytest
from django.core.management import CommandError
from django.db.models import QuerySet
from rich.console import Console

from documents.management.commands.base import PaperlessCommand
from documents.management.commands.base import ProcessResult

if TYPE_CHECKING:
    from pytest_mock import MockerFixture


# --- Test Commands ---
# These simulate real command implementations for testing


class SimpleCommand(PaperlessCommand):
    """Command with default settings (progress bar, no multiprocessing)."""

    help = "Simple test command"

    def handle(self, *args, **options):
        items = list(range(5))
        results = []
        for item in self.track(items, description="Processing..."):
            results.append(item * 2)
        self.stdout.write(f"Results: {results}")


class NoProgressBarCommand(PaperlessCommand):
    """Command with progress bar disabled."""

    help = "No progress bar command"
    supports_progress_bar = False

    def handle(self, *args, **options):
        items = list(range(3))
        for _ in self.track(items):
            # We don't need to actually work
            pass
        self.stdout.write("Done")


class MultiprocessCommand(PaperlessCommand):
    """Command with multiprocessing support."""

    help = "Multiprocess test command"
    supports_multiprocessing = True

    def handle(self, *args, **options):
        items = list(range(5))
        results = []
        for result in self.process_parallel(
            _double_value,
            items,
            description="Processing...",
        ):
            results.append(result)
        successes = sum(1 for r in results if r.success)
        self.stdout.write(f"Successes: {successes}")


# --- Helper Functions for Multiprocessing ---
# Must be at module level to be picklable


def _double_value(x: int) -> int:
    """Double the input value."""
    return x * 2


def _divide_ten_by(x: int) -> float:
    """Divide 10 by x. Raises ZeroDivisionError if x is 0."""
    return 10 / x


# --- Fixtures ---


@pytest.fixture
def console() -> Console:
    """Create a non-interactive console for testing."""
    return Console(force_terminal=False, force_interactive=False)


@pytest.fixture
def simple_command(console: Console) -> SimpleCommand:
    """Create a SimpleCommand instance configured for testing."""
    command = SimpleCommand()
    command.stdout = io.StringIO()
    command.stderr = io.StringIO()
    command.console = console
    command.no_progress_bar = True
    command.process_count = 1
    return command


@pytest.fixture
def multiprocess_command(console: Console) -> MultiprocessCommand:
    """Create a MultiprocessCommand instance configured for testing."""
    command = MultiprocessCommand()
    command.stdout = io.StringIO()
    command.stderr = io.StringIO()
    command.console = console
    command.no_progress_bar = True
    command.process_count = 1
    return command


@pytest.fixture
def mock_queryset():
    """
    Create a mock Django QuerySet that tracks method calls.

    This verifies we use .count() instead of len() for querysets.
|
||||
"""
|
||||
|
||||
class MockQuerySet(QuerySet):
|
||||
def __init__(self, items: list):
|
||||
self._items = items
|
||||
self.count_called = False
|
||||
|
||||
def count(self) -> int:
|
||||
self.count_called = True
|
||||
return len(self._items)
|
||||
|
||||
def __iter__(self):
|
||||
return iter(self._items)
|
||||
|
||||
def __len__(self):
|
||||
raise AssertionError("len() should not be called on querysets")
|
||||
|
||||
return MockQuerySet
|
||||
|
||||
|
||||
# --- Test Classes ---
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestProcessResult:
|
||||
"""Tests for the ProcessResult dataclass."""
|
||||
|
||||
def test_success_result(self):
|
||||
result = ProcessResult(item=1, result=2, error=None)
|
||||
|
||||
assert result.item == 1
|
||||
assert result.result == 2
|
||||
assert result.error is None
|
||||
assert result.success is True
|
||||
|
||||
def test_error_result(self):
|
||||
error = ValueError("test error")
|
||||
result = ProcessResult(item=1, result=None, error=error)
|
||||
|
||||
assert result.item == 1
|
||||
assert result.result is None
|
||||
assert result.error is error
|
||||
assert result.success is False
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestPaperlessCommandArguments:
|
||||
"""Tests for argument parsing behavior."""
|
||||
|
||||
def test_progress_bar_argument_added_by_default(self):
|
||||
command = SimpleCommand()
|
||||
parser = command.create_parser("manage.py", "simple")
|
||||
|
||||
options = parser.parse_args(["--no-progress-bar"])
|
||||
assert options.no_progress_bar is True
|
||||
|
||||
options = parser.parse_args([])
|
||||
assert options.no_progress_bar is False
|
||||
|
||||
def test_progress_bar_argument_not_added_when_disabled(self):
|
||||
command = NoProgressBarCommand()
|
||||
parser = command.create_parser("manage.py", "noprogress")
|
||||
|
||||
options = parser.parse_args([])
|
||||
assert not hasattr(options, "no_progress_bar")
|
||||
|
||||
def test_processes_argument_added_when_multiprocessing_enabled(self):
|
||||
command = MultiprocessCommand()
|
||||
parser = command.create_parser("manage.py", "multiprocess")
|
||||
|
||||
options = parser.parse_args(["--processes", "4"])
|
||||
assert options.processes == 4
|
||||
|
||||
options = parser.parse_args([])
|
||||
assert options.processes >= 1
|
||||
|
||||
def test_processes_argument_not_added_when_multiprocessing_disabled(self):
|
||||
command = SimpleCommand()
|
||||
parser = command.create_parser("manage.py", "simple")
|
||||
|
||||
options = parser.parse_args([])
|
||||
assert not hasattr(options, "processes")
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestPaperlessCommandExecute:
|
||||
"""Tests for the execute() setup behavior."""
|
||||
|
||||
@pytest.fixture
|
||||
def base_options(self) -> dict:
|
||||
"""Base options required for execute()."""
|
||||
return {
|
||||
"verbosity": 1,
|
||||
"no_color": True,
|
||||
"force_color": False,
|
||||
"skip_checks": True,
|
||||
}
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
("no_progress_bar_flag", "expected"),
|
||||
[
|
||||
pytest.param(False, False, id="progress-bar-enabled"),
|
||||
pytest.param(True, True, id="progress-bar-disabled"),
|
||||
],
|
||||
)
|
||||
def test_no_progress_bar_state_set(
|
||||
self,
|
||||
base_options: dict,
|
||||
*,
|
||||
no_progress_bar_flag: bool,
|
||||
expected: bool,
|
||||
):
|
||||
command = SimpleCommand()
|
||||
command.stdout = io.StringIO()
|
||||
command.stderr = io.StringIO()
|
||||
|
||||
options = {**base_options, "no_progress_bar": no_progress_bar_flag}
|
||||
command.execute(**options)
|
||||
|
||||
assert command.no_progress_bar is expected
|
||||
|
||||
def test_no_progress_bar_always_true_when_not_supported(self, base_options: dict):
|
||||
command = NoProgressBarCommand()
|
||||
command.stdout = io.StringIO()
|
||||
command.stderr = io.StringIO()
|
||||
|
||||
command.execute(**base_options)
|
||||
|
||||
assert command.no_progress_bar is True
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
("processes", "expected"),
|
||||
[
|
||||
pytest.param(1, 1, id="single-process"),
|
||||
pytest.param(4, 4, id="four-processes"),
|
||||
],
|
||||
)
|
||||
def test_process_count_set(
|
||||
self,
|
||||
base_options: dict,
|
||||
processes: int,
|
||||
expected: int,
|
||||
):
|
||||
command = MultiprocessCommand()
|
||||
command.stdout = io.StringIO()
|
||||
command.stderr = io.StringIO()
|
||||
|
||||
options = {**base_options, "processes": processes, "no_progress_bar": True}
|
||||
command.execute(**options)
|
||||
|
||||
assert command.process_count == expected
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"invalid_count",
|
||||
[
|
||||
pytest.param(0, id="zero"),
|
||||
pytest.param(-1, id="negative"),
|
||||
],
|
||||
)
|
||||
def test_process_count_validation_rejects_invalid(
|
||||
self,
|
||||
base_options: dict,
|
||||
invalid_count: int,
|
||||
):
|
||||
command = MultiprocessCommand()
|
||||
command.stdout = io.StringIO()
|
||||
command.stderr = io.StringIO()
|
||||
|
||||
options = {**base_options, "processes": invalid_count, "no_progress_bar": True}
|
||||
|
||||
with pytest.raises(CommandError, match="--processes must be at least 1"):
|
||||
command.execute(**options)
|
||||
|
||||
def test_process_count_defaults_to_one_when_not_supported(self, base_options: dict):
|
||||
command = SimpleCommand()
|
||||
command.stdout = io.StringIO()
|
||||
command.stderr = io.StringIO()
|
||||
|
||||
options = {**base_options, "no_progress_bar": True}
|
||||
command.execute(**options)
|
||||
|
||||
assert command.process_count == 1
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestGetIterableLength:
|
||||
"""Tests for the _get_iterable_length() method."""
|
||||
|
||||
def test_uses_count_for_querysets(
|
||||
self,
|
||||
simple_command: SimpleCommand,
|
||||
mock_queryset,
|
||||
):
|
||||
"""Should call .count() on Django querysets rather than len()."""
|
||||
queryset = mock_queryset([1, 2, 3, 4, 5])
|
||||
|
||||
result = simple_command._get_iterable_length(queryset)
|
||||
|
||||
assert result == 5
|
||||
assert queryset.count_called is True
|
||||
|
||||
def test_uses_len_for_sized(self, simple_command: SimpleCommand):
|
||||
"""Should use len() for sequences and other Sized types."""
|
||||
result = simple_command._get_iterable_length([1, 2, 3, 4])
|
||||
|
||||
assert result == 4
|
||||
|
||||
def test_returns_none_for_unsized_iterables(self, simple_command: SimpleCommand):
|
||||
"""Should return None for generators and other iterables without len()."""
|
||||
result = simple_command._get_iterable_length(x for x in [1, 2, 3])
|
||||
|
||||
assert result is None
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestTrack:
|
||||
"""Tests for the track() method."""
|
||||
|
||||
def test_with_progress_bar_disabled(self, simple_command: SimpleCommand):
|
||||
simple_command.no_progress_bar = True
|
||||
items = ["a", "b", "c"]
|
||||
|
||||
result = list(simple_command.track(items, description="Test..."))
|
||||
|
||||
assert result == items
|
||||
|
||||
def test_with_progress_bar_enabled(self, simple_command: SimpleCommand):
|
||||
simple_command.no_progress_bar = False
|
||||
items = [1, 2, 3]
|
||||
|
||||
result = list(simple_command.track(items, description="Processing..."))
|
||||
|
||||
assert result == items
|
||||
|
||||
def test_with_explicit_total(self, simple_command: SimpleCommand):
|
||||
simple_command.no_progress_bar = False
|
||||
|
||||
def gen():
|
||||
yield from [1, 2, 3]
|
||||
|
||||
result = list(simple_command.track(gen(), total=3))
|
||||
|
||||
assert result == [1, 2, 3]
|
||||
|
||||
def test_with_generator_no_total(self, simple_command: SimpleCommand):
|
||||
def gen():
|
||||
yield from [1, 2, 3]
|
||||
|
||||
result = list(simple_command.track(gen()))
|
||||
|
||||
assert result == [1, 2, 3]
|
||||
|
||||
def test_empty_iterable(self, simple_command: SimpleCommand):
|
||||
result = list(simple_command.track([]))
|
||||
|
||||
assert result == []
|
||||
|
||||
def test_uses_queryset_count(
|
||||
self,
|
||||
simple_command: SimpleCommand,
|
||||
mock_queryset,
|
||||
mocker: MockerFixture,
|
||||
):
|
||||
"""Verify track() uses .count() for querysets."""
|
||||
simple_command.no_progress_bar = False
|
||||
queryset = mock_queryset([1, 2, 3])
|
||||
|
||||
spy = mocker.spy(simple_command, "_get_iterable_length")
|
||||
|
||||
result = list(simple_command.track(queryset))
|
||||
|
||||
assert result == [1, 2, 3]
|
||||
spy.assert_called_once_with(queryset)
|
||||
assert queryset.count_called is True
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestProcessParallel:
|
||||
"""Tests for the process_parallel() method."""
|
||||
|
||||
def test_sequential_processing_single_process(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
):
|
||||
multiprocess_command.process_count = 1
|
||||
items = [1, 2, 3, 4, 5]
|
||||
|
||||
results = list(multiprocess_command.process_parallel(_double_value, items))
|
||||
|
||||
assert len(results) == 5
|
||||
assert all(r.success for r in results)
|
||||
|
||||
result_map = {r.item: r.result for r in results}
|
||||
assert result_map == {1: 2, 2: 4, 3: 6, 4: 8, 5: 10}
|
||||
|
||||
def test_sequential_processing_handles_errors(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
):
|
||||
multiprocess_command.process_count = 1
|
||||
items = [1, 2, 0, 4] # 0 causes ZeroDivisionError
|
||||
|
||||
results = list(multiprocess_command.process_parallel(_divide_ten_by, items))
|
||||
|
||||
assert len(results) == 4
|
||||
|
||||
successes = [r for r in results if r.success]
|
||||
failures = [r for r in results if not r.success]
|
||||
|
||||
assert len(successes) == 3
|
||||
assert len(failures) == 1
|
||||
assert failures[0].item == 0
|
||||
assert isinstance(failures[0].error, ZeroDivisionError)
|
||||
|
||||
def test_parallel_closes_db_connections(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
mocker: MockerFixture,
|
||||
):
|
||||
multiprocess_command.process_count = 2
|
||||
items = [1, 2, 3]
|
||||
|
||||
mock_connections = mocker.patch(
|
||||
"documents.management.commands.base.db.connections",
|
||||
)
|
||||
|
||||
results = list(multiprocess_command.process_parallel(_double_value, items))
|
||||
|
||||
mock_connections.close_all.assert_called_once()
|
||||
assert len(results) == 3
|
||||
|
||||
def test_parallel_processing_handles_errors(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
mocker: MockerFixture,
|
||||
):
|
||||
multiprocess_command.process_count = 2
|
||||
items = [1, 2, 0, 4]
|
||||
|
||||
mocker.patch("documents.management.commands.base.db.connections")
|
||||
|
||||
results = list(multiprocess_command.process_parallel(_divide_ten_by, items))
|
||||
|
||||
failures = [r for r in results if not r.success]
|
||||
assert len(failures) == 1
|
||||
assert failures[0].item == 0
|
||||
|
||||
def test_empty_items(self, multiprocess_command: MultiprocessCommand):
|
||||
results = list(multiprocess_command.process_parallel(_double_value, []))
|
||||
|
||||
assert results == []
|
||||
|
||||
def test_result_contains_original_item(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
):
|
||||
items = [10, 20, 30]
|
||||
|
||||
results = list(multiprocess_command.process_parallel(_double_value, items))
|
||||
|
||||
for result in results:
|
||||
assert result.item in items
|
||||
assert result.result == result.item * 2
|
||||
|
||||
def test_sequential_path_used_for_single_process(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
mocker: MockerFixture,
|
||||
):
|
||||
"""Verify single process uses sequential path (important for testing)."""
|
||||
multiprocess_command.process_count = 1
|
||||
|
||||
spy_sequential = mocker.spy(multiprocess_command, "_process_sequential")
|
||||
spy_parallel = mocker.spy(multiprocess_command, "_process_parallel")
|
||||
|
||||
list(multiprocess_command.process_parallel(_double_value, [1, 2, 3]))
|
||||
|
||||
spy_sequential.assert_called_once()
|
||||
spy_parallel.assert_not_called()
|
||||
|
||||
def test_parallel_path_used_for_multiple_processes(
|
||||
self,
|
||||
multiprocess_command: MultiprocessCommand,
|
||||
mocker: MockerFixture,
|
||||
):
|
||||
"""Verify multiple processes uses parallel path."""
|
||||
multiprocess_command.process_count = 2
|
||||
|
||||
mocker.patch("documents.management.commands.base.db.connections")
|
||||
spy_sequential = mocker.spy(multiprocess_command, "_process_sequential")
|
||||
spy_parallel = mocker.spy(multiprocess_command, "_process_parallel")
|
||||
|
||||
list(multiprocess_command.process_parallel(_double_value, [1, 2, 3]))
|
||||
|
||||
spy_parallel.assert_called_once()
|
||||
spy_sequential.assert_not_called()
|
||||
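Reviewer note on the deleted test file above: the tests imply a result record with `item`, `result`, `error` and a derived `success` flag, plus a sequential path that captures exceptions per item instead of aborting. The sketch below is a hypothetical stand-in written only from what the tests assert; it is not the code in `documents.management.commands.base`, and `process_sequential` and `divide_ten_by` are names invented here.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable, Iterator
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ProcessResult:
    """Hypothetical stand-in: the outcome of processing a single item."""

    item: Any
    result: Any
    error: BaseException | None

    @property
    def success(self) -> bool:
        # An item succeeded exactly when no exception was recorded for it.
        return self.error is None


def process_sequential(
    fn: Callable[[Any], Any],
    items: Iterable[Any],
) -> Iterator[ProcessResult]:
    """Apply fn to each item in order, recording failures instead of raising."""
    for item in items:
        try:
            yield ProcessResult(item=item, result=fn(item), error=None)
        except Exception as exc:
            yield ProcessResult(item=item, result=None, error=exc)


def divide_ten_by(x: int) -> float:
    return 10 / x


results = list(process_sequential(divide_ten_by, [1, 2, 0, 4]))
failures = [r for r in results if not r.success]
```

The same input list is used by `test_sequential_processing_handles_errors`, which expects four results with a single `ZeroDivisionError` failure for the item `0`.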
@@ -21,16 +21,6 @@ class TestApiUiSettings(DirectoriesMixin, APITestCase):
         self.test_user.save()
         self.client.force_authenticate(user=self.test_user)

-    @override_settings(
-        APP_TITLE=None,
-        APP_LOGO=None,
-        AUDIT_LOG_ENABLED=True,
-        EMPTY_TRASH_DELAY=30,
-        ENABLE_UPDATE_CHECK="default",
-        EMAIL_ENABLED=False,
-        GMAIL_OAUTH_ENABLED=False,
-        OUTLOOK_OAUTH_ENABLED=False,
-    )
     def test_api_get_ui_settings(self) -> None:
         response = self.client.get(self.ENDPOINT, format="json")
         self.assertEqual(response.status_code, status.HTTP_200_OK)
|
||||
@@ -919,7 +919,6 @@ class TestTagBarcode(DirectoriesMixin, SampleDirMixin, GetReaderPluginMixin, Tes
|
||||
@override_settings(
|
||||
CONSUMER_ENABLE_TAG_BARCODE=True,
|
||||
CONSUMER_TAG_BARCODE_MAPPING={"ASN(.*)": "\\g<1>"},
|
||||
CONSUMER_ENABLE_ASN_BARCODE=False,
|
||||
)
|
||||
def test_scan_file_for_many_custom_tags(self) -> None:
|
||||
"""
|
||||
|
||||
@@ -329,14 +329,18 @@ class TestFileHandling(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
         FILENAME_FORMAT="{added_year}-{added_month}-{added_day}",
     )
     def test_added_year_month_day(self) -> None:
-        d1 = timezone.make_aware(datetime.datetime(1232, 1, 9, 1, 1, 1))
+        d1 = timezone.make_aware(datetime.datetime(232, 1, 9, 1, 1, 1))
         doc1 = Document.objects.create(
             title="doc1",
             mime_type="application/pdf",
             added=d1,
         )

-        self.assertEqual(generate_filename(doc1), Path("1232-01-09.pdf"))
+        # Account for 3.14 padding changes
+        expected_year: str = d1.strftime("%Y")
+        expected_filename: Path = Path(f"{expected_year}-01-09.pdf")
+
+        self.assertEqual(generate_filename(doc1), expected_filename)

         doc1.added = timezone.make_aware(datetime.datetime(2020, 11, 16, 1, 1, 1))

|
||||
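Reviewer note on the year-padding change in this hunk: the new test derives the expected year from `strftime("%Y")` rather than hard-coding it, which the in-code comment attributes to padding changes in 3.14. A small sketch of the idea, assuming only the standard library; the variable names are illustrative, and the exact `strftime` output for years below 1000 is deliberately not asserted because it can vary by platform and Python version.

```python
import datetime

d1 = datetime.datetime(232, 1, 9, 1, 1, 1)

# Whether "%Y" zero-pads a three-digit year depends on the runtime,
# so a test that must pass everywhere should not hard-code this string.
year_from_strftime = d1.strftime("%Y")

# Explicit formatting always yields four digits, independent of strftime.
year_padded = f"{d1.year:04d}"
```

Computing the expectation with the same call the code under test uses keeps the assertion valid whichever padding behavior the interpreter has.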
@@ -21,7 +21,7 @@ class TestDateLocalization:
         14,
         30,
         5,
-        tzinfo=datetime.timezone.utc,
+        tzinfo=datetime.UTC,
     )

     TEST_DATETIME_STRING: str = "2023-10-26T14:30:05+00:00"
|
||||
@@ -4,7 +4,6 @@ from io import StringIO
|
||||
from pathlib import Path
|
||||
from unittest import mock
|
||||
|
||||
import pytest
|
||||
from auditlog.models import LogEntry
|
||||
from django.contrib.contenttypes.models import ContentType
|
||||
from django.core.management import call_command
|
||||
@@ -20,7 +19,6 @@ from documents.tests.utils import FileSystemAssertsMixin
|
||||
sample_file: Path = Path(__file__).parent / "samples" / "simple.pdf"
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
@override_settings(FILENAME_FORMAT="{correspondent}/{title}")
|
||||
class TestArchiver(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
|
||||
def make_models(self):
|
||||
@@ -96,7 +94,6 @@ class TestArchiver(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
|
||||
self.assertEqual(doc2.archive_filename, "document_01.pdf")
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestMakeIndex(TestCase):
|
||||
@mock.patch("documents.management.commands.document_index.index_reindex")
|
||||
def test_reindex(self, m) -> None:
|
||||
@@ -109,7 +106,6 @@ class TestMakeIndex(TestCase):
|
||||
m.assert_called_once()
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestRenamer(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
|
||||
@override_settings(FILENAME_FORMAT="")
|
||||
def test_rename(self) -> None:
|
||||
@@ -144,7 +140,6 @@ class TestCreateClassifier(TestCase):
|
||||
m.assert_called_once()
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestSanityChecker(DirectoriesMixin, TestCase):
|
||||
def test_no_issues(self) -> None:
|
||||
with self.assertLogs() as capture:
|
||||
@@ -170,7 +165,6 @@ class TestSanityChecker(DirectoriesMixin, TestCase):
|
||||
self.assertIn("Checksum mismatch. Stored: abc, actual:", capture.output[1])
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestConvertMariaDBUUID(TestCase):
|
||||
@mock.patch("django.db.connection.schema_editor")
|
||||
def test_convert(self, m) -> None:
|
||||
@@ -184,7 +178,6 @@ class TestConvertMariaDBUUID(TestCase):
|
||||
self.assertIn("Successfully converted", stdout.getvalue())
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestPruneAuditLogs(TestCase):
|
||||
def test_prune_audit_logs(self) -> None:
|
||||
LogEntry.objects.create(
|
||||
|
||||
@@ -577,7 +577,6 @@ class TestTagsFromPath:
|
||||
assert len(tag_ids) == 0
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestCommandValidation:
|
||||
"""Tests for command argument validation."""
|
||||
|
||||
@@ -606,7 +605,6 @@ class TestCommandValidation:
|
||||
cmd.handle(directory=str(sample_pdf), oneshot=True, testing=False)
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
@pytest.mark.usefixtures("mock_supported_extensions")
|
||||
class TestCommandOneshot:
|
||||
"""Tests for oneshot mode."""
|
||||
@@ -777,7 +775,6 @@ def start_consumer(
|
||||
)
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
@pytest.mark.django_db
|
||||
class TestCommandWatch:
|
||||
"""Integration tests for the watch loop."""
|
||||
@@ -899,7 +896,6 @@ class TestCommandWatch:
|
||||
assert not thread.is_alive()
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
@pytest.mark.django_db
|
||||
class TestCommandWatchPolling:
|
||||
"""Tests for polling mode."""
|
||||
@@ -932,7 +928,6 @@ class TestCommandWatchPolling:
|
||||
mock_consume_file_delay.delay.assert_called()
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
@pytest.mark.django_db
|
||||
class TestCommandWatchRecursive:
|
||||
"""Tests for recursive watching."""
|
||||
@@ -996,7 +991,6 @@ class TestCommandWatchRecursive:
|
||||
assert len(overrides.tag_ids) == 2
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
@pytest.mark.django_db
|
||||
class TestCommandWatchEdgeCases:
|
||||
"""Tests for edge cases and error handling."""
|
||||
|
||||
@@ -7,7 +7,6 @@ from pathlib import Path
|
||||
from unittest import mock
|
||||
from zipfile import ZipFile
|
||||
|
||||
import pytest
|
||||
from allauth.socialaccount.models import SocialAccount
|
||||
from allauth.socialaccount.models import SocialApp
|
||||
from allauth.socialaccount.models import SocialToken
|
||||
@@ -46,7 +45,6 @@ from documents.tests.utils import paperless_environment
|
||||
from paperless_mail.models import MailAccount
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestExportImport(
|
||||
DirectoriesMixin,
|
||||
FileSystemAssertsMixin,
|
||||
@@ -848,7 +846,6 @@ class TestExportImport(
|
||||
self.assertEqual(Document.objects.all().count(), 4)
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestCryptExportImport(
|
||||
DirectoriesMixin,
|
||||
FileSystemAssertsMixin,
|
||||
|
||||
@@ -1,6 +1,5 @@
|
||||
from io import StringIO
|
||||
|
||||
import pytest
|
||||
from django.core.management import CommandError
|
||||
from django.core.management import call_command
|
||||
from django.test import TestCase
|
||||
@@ -8,7 +7,6 @@ from django.test import TestCase
|
||||
from documents.models import Document
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestFuzzyMatchCommand(TestCase):
|
||||
MSG_REGEX = r"Document \d fuzzy match to \d \(confidence \d\d\.\d\d\d\)"
|
||||
|
||||
@@ -51,6 +49,19 @@ class TestFuzzyMatchCommand(TestCase):
             self.call_command("--ratio", "101")
         self.assertIn("The ratio must be between 0 and 100", str(e.exception))

+    def test_invalid_process_count(self) -> None:
+        """
+        GIVEN:
+            - Invalid process count less than 0 above upper
+        WHEN:
+            - Command is called
+        THEN:
+            - Error is raised indicating issue
+        """
+        with self.assertRaises(CommandError) as e:
+            self.call_command("--processes", "0")
+        self.assertIn("There must be at least 1 process", str(e.exception))
+
     def test_no_matches(self) -> None:
         """
         GIVEN:
@@ -140,7 +151,7 @@ class TestFuzzyMatchCommand(TestCase):
             mime_type="application/pdf",
             filename="final_test.pdf",
         )
-        stdout, _ = self.call_command("--no-progress-bar", "--processes", "1")
+        stdout, _ = self.call_command()
         lines = [x.strip() for x in stdout.splitlines() if x.strip()]
         self.assertEqual(len(lines), 3)
         for line in lines:
@@ -183,12 +194,7 @@ class TestFuzzyMatchCommand(TestCase):

         self.assertEqual(Document.objects.count(), 3)

-        stdout, _ = self.call_command(
-            "--delete",
-            "--no-progress-bar",
-            "--processes",
-            "1",
-        )
+        stdout, _ = self.call_command("--delete")

         self.assertIn(
             "The command is configured to delete documents. Use with caution",
|
||||
@@ -4,7 +4,6 @@ from io import StringIO
|
||||
from pathlib import Path
|
||||
from zipfile import ZipFile
|
||||
|
||||
import pytest
|
||||
from django.contrib.auth.models import User
|
||||
from django.core.management import call_command
|
||||
from django.core.management.base import CommandError
|
||||
@@ -19,7 +18,6 @@ from documents.tests.utils import FileSystemAssertsMixin
|
||||
from documents.tests.utils import SampleDirMixin
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestCommandImport(
|
||||
DirectoriesMixin,
|
||||
FileSystemAssertsMixin,
|
||||
|
||||
@@ -1,4 +1,3 @@
|
||||
import pytest
|
||||
from django.core.management import call_command
|
||||
from django.core.management.base import CommandError
|
||||
from django.test import TestCase
|
||||
@@ -11,7 +10,6 @@ from documents.models import Tag
|
||||
from documents.tests.utils import DirectoriesMixin
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestRetagger(DirectoriesMixin, TestCase):
|
||||
def make_models(self) -> None:
|
||||
self.sp1 = StoragePath.objects.create(
|
||||
|
||||
@@ -2,7 +2,6 @@ import os
|
||||
from io import StringIO
|
||||
from unittest import mock
|
||||
|
||||
import pytest
|
||||
from django.contrib.auth.models import User
|
||||
from django.core.management import call_command
|
||||
from django.test import TestCase
|
||||
@@ -10,7 +9,6 @@ from django.test import TestCase
|
||||
from documents.tests.utils import DirectoriesMixin
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestManageSuperUser(DirectoriesMixin, TestCase):
|
||||
def call_command(self, environ):
|
||||
out = StringIO()
|
||||
|
||||
@@ -2,7 +2,6 @@ import shutil
|
||||
from pathlib import Path
|
||||
from unittest import mock
|
||||
|
||||
import pytest
|
||||
from django.core.management import call_command
|
||||
from django.test import TestCase
|
||||
|
||||
@@ -13,7 +12,6 @@ from documents.tests.utils import DirectoriesMixin
|
||||
from documents.tests.utils import FileSystemAssertsMixin
|
||||
|
||||
|
||||
@pytest.mark.management
|
||||
class TestMakeThumbnails(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
|
||||
def make_models(self) -> None:
|
||||
self.d1 = Document.objects.create(
|
||||
|
||||
@@ -4570,7 +4570,7 @@ class TestDateWorkflowLocalization(
         14,
         30,
         5,
-        tzinfo=datetime.timezone.utc,
+        tzinfo=datetime.UTC,
     )

     @pytest.mark.parametrize(
|
||||
@@ -33,11 +33,11 @@ from documents.plugins.helpers import ProgressStatusOptions
 def setup_directories():
     dirs = namedtuple("Dirs", ())

-    dirs.data_dir = Path(tempfile.mkdtemp()).resolve()
-    dirs.scratch_dir = Path(tempfile.mkdtemp()).resolve()
-    dirs.media_dir = Path(tempfile.mkdtemp()).resolve()
-    dirs.consumption_dir = Path(tempfile.mkdtemp()).resolve()
-    dirs.static_dir = Path(tempfile.mkdtemp()).resolve()
+    dirs.data_dir = Path(tempfile.mkdtemp())
+    dirs.scratch_dir = Path(tempfile.mkdtemp())
+    dirs.media_dir = Path(tempfile.mkdtemp())
+    dirs.consumption_dir = Path(tempfile.mkdtemp())
+    dirs.static_dir = Path(tempfile.mkdtemp())
     dirs.index_dir = dirs.data_dir / "index"
     dirs.originals_dir = dirs.media_dir / "documents" / "originals"
     dirs.thumbnail_dir = dirs.media_dir / "documents" / "thumbnails"
|
||||
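Reviewer note on the `.resolve()` calls that differ between the two sides of this hunk: a minimal sketch of what the call changes, assuming only the standard library. The names `raw` and `resolved` are illustrative. On systems where the temporary directory sits behind a symlink, the resolved path can differ textually from the raw one while still pointing at the same directory, which matters for tests that compare paths as strings.

```python
import tempfile
from pathlib import Path

raw = Path(tempfile.mkdtemp())

# resolve() makes the path absolute and follows any symlinks along the way.
resolved = raw.resolve()

# Both spellings refer to the same directory, even if their text differs.
same_dir = raw.samefile(resolved)
textually_equal = raw == resolved  # may be False on symlinked temp locations

# Clean up the temporary directory created for this demonstration.
raw.rmdir()
```

Comparing with `samefile()` rather than `==` is one way to stay independent of whether a fixture resolves its paths.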
@@ -1,7 +1,7 @@
 from __future__ import annotations

 from dataclasses import dataclass
-from enum import Enum
+from enum import StrEnum
 from typing import TYPE_CHECKING
 from typing import Any

@@ -11,7 +11,7 @@ if TYPE_CHECKING:
     from django.http import HttpRequest


-class VersionResolutionError(str, Enum):
+class VersionResolutionError(StrEnum):
     INVALID = "invalid"
     NOT_FOUND = "not_found"

|
||||
@@ -2,7 +2,7 @@ msgid ""
|
||||
msgstr ""
|
||||
"Project-Id-Version: paperless-ngx\n"
|
||||
"Report-Msgid-Bugs-To: \n"
|
||||
"POT-Creation-Date: 2026-02-27 22:38+0000\n"
|
||||
"POT-Creation-Date: 2026-02-26 18:09+0000\n"
|
||||
"PO-Revision-Date: 2022-02-17 04:17\n"
|
||||
"Last-Translator: \n"
|
||||
"Language-Team: English\n"
|
||||
@@ -1856,151 +1856,151 @@ msgstr ""
|
||||
msgid "paperless application settings"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:746
|
||||
#: paperless/settings.py:819
|
||||
msgid "English (US)"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:747
|
||||
#: paperless/settings.py:820
|
||||
msgid "Arabic"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:748
|
||||
#: paperless/settings.py:821
|
||||
msgid "Afrikaans"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:749
|
||||
#: paperless/settings.py:822
|
||||
msgid "Belarusian"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:750
|
||||
#: paperless/settings.py:823
|
||||
msgid "Bulgarian"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:751
|
||||
#: paperless/settings.py:824
|
||||
msgid "Catalan"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:752
|
||||
#: paperless/settings.py:825
|
||||
msgid "Czech"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:753
|
||||
#: paperless/settings.py:826
|
||||
msgid "Danish"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:754
|
||||
#: paperless/settings.py:827
|
||||
msgid "German"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:755
|
||||
#: paperless/settings.py:828
|
||||
msgid "Greek"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:756
|
||||
#: paperless/settings.py:829
|
||||
msgid "English (GB)"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:757
|
||||
#: paperless/settings.py:830
|
||||
msgid "Spanish"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:758
|
||||
#: paperless/settings.py:831
|
||||
msgid "Persian"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:759
|
||||
#: paperless/settings.py:832
|
||||
msgid "Finnish"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:760
|
||||
#: paperless/settings.py:833
|
||||
msgid "French"
|
||||
msgstr ""
|
||||
|
||||
#: paperless/settings/__init__.py:761
#: paperless/settings.py:834
msgid "Hungarian"
msgstr ""

#: paperless/settings/__init__.py:762
#: paperless/settings.py:835
msgid "Indonesian"
msgstr ""

#: paperless/settings/__init__.py:763
#: paperless/settings.py:836
msgid "Italian"
msgstr ""

#: paperless/settings/__init__.py:764
#: paperless/settings.py:837
msgid "Japanese"
msgstr ""

#: paperless/settings/__init__.py:765
#: paperless/settings.py:838
msgid "Korean"
msgstr ""

#: paperless/settings/__init__.py:766
#: paperless/settings.py:839
msgid "Luxembourgish"
msgstr ""

#: paperless/settings/__init__.py:767
#: paperless/settings.py:840
msgid "Norwegian"
msgstr ""

#: paperless/settings/__init__.py:768
#: paperless/settings.py:841
msgid "Dutch"
msgstr ""

#: paperless/settings/__init__.py:769
#: paperless/settings.py:842
msgid "Polish"
msgstr ""

#: paperless/settings/__init__.py:770
#: paperless/settings.py:843
msgid "Portuguese (Brazil)"
msgstr ""

#: paperless/settings/__init__.py:771
#: paperless/settings.py:844
msgid "Portuguese"
msgstr ""

#: paperless/settings/__init__.py:772
#: paperless/settings.py:845
msgid "Romanian"
msgstr ""

#: paperless/settings/__init__.py:773
#: paperless/settings.py:846
msgid "Russian"
msgstr ""

#: paperless/settings/__init__.py:774
#: paperless/settings.py:847
msgid "Slovak"
msgstr ""

#: paperless/settings/__init__.py:775
#: paperless/settings.py:848
msgid "Slovenian"
msgstr ""

#: paperless/settings/__init__.py:776
#: paperless/settings.py:849
msgid "Serbian"
msgstr ""

#: paperless/settings/__init__.py:777
#: paperless/settings.py:850
msgid "Swedish"
msgstr ""

#: paperless/settings/__init__.py:778
#: paperless/settings.py:851
msgid "Turkish"
msgstr ""

#: paperless/settings/__init__.py:779
#: paperless/settings.py:852
msgid "Ukrainian"
msgstr ""

#: paperless/settings/__init__.py:780
#: paperless/settings.py:853
msgid "Vietnamese"
msgstr ""

#: paperless/settings/__init__.py:781
#: paperless/settings.py:854
msgid "Chinese Simplified"
msgstr ""

#: paperless/settings/__init__.py:782
#: paperless/settings.py:855
msgid "Chinese Traditional"
msgstr ""

@@ -202,43 +202,3 @@ def audit_log_check(app_configs, **kwargs):
        )

    return result


@register()
def check_deprecated_db_settings(
    app_configs: object,
    **kwargs: object,
) -> list[Warning]:
    """Check for deprecated database environment variables.

    Detects legacy advanced options that should be migrated to
    PAPERLESS_DB_OPTIONS. Returns one Warning per deprecated variable found.
    """
    deprecated_vars: dict[str, str] = {
        "PAPERLESS_DB_TIMEOUT": "timeout",
        "PAPERLESS_DB_POOLSIZE": "pool.min_size / pool.max_size",
        "PAPERLESS_DBSSLMODE": "sslmode",
        "PAPERLESS_DBSSLROOTCERT": "sslrootcert",
        "PAPERLESS_DBSSLCERT": "sslcert",
        "PAPERLESS_DBSSLKEY": "sslkey",
    }

    warnings: list[Warning] = []

    for var_name, db_option_key in deprecated_vars.items():
        if not os.getenv(var_name):
            continue
        warnings.append(
            Warning(
                f"Deprecated environment variable: {var_name}",
                hint=(
                    f"{var_name} is no longer supported and will be removed in v3.2. "
                    f"Set the equivalent option via PAPERLESS_DB_OPTIONS instead. "
                    f'Example: PAPERLESS_DB_OPTIONS=\'{{"{db_option_key}": "<value>"}}\'. '
                    "See https://docs.paperless-ngx.com/migration/ for the full reference."
                ),
                id="paperless.W001",
            ),
        )

    return warnings

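Reviewer note: the check above maps each deprecated variable to an option key, and the tests later in this diff pass `PAPERLESS_DB_OPTIONS` as `key=value` pairs separated by `;`. A migration from the legacy variables could be sketched as below. This is a minimal sketch for the PostgreSQL key names only; `legacy_postgres_options` and its mapping table are hypothetical and not part of paperless-ngx, and mapping `PAPERLESS_DB_POOLSIZE` to `pool.max_size` alone is an assumption.

```python
# Hypothetical helper, not part of paperless-ngx: builds a
# PAPERLESS_DB_OPTIONS string from the legacy PostgreSQL variables.
LEGACY_TO_OPTION = {
    "PAPERLESS_DBSSLMODE": "sslmode",
    "PAPERLESS_DBSSLROOTCERT": "sslrootcert",
    "PAPERLESS_DBSSLCERT": "sslcert",
    "PAPERLESS_DBSSLKEY": "sslkey",
    # Assumption: the legacy pool size only maps to the upper bound.
    "PAPERLESS_DB_POOLSIZE": "pool.max_size",
    # Non-SQLite engines use connect_timeout in parse_db_settings.
    "PAPERLESS_DB_TIMEOUT": "connect_timeout",
}


def legacy_postgres_options(env: dict[str, str]) -> str:
    """Return 'key=value' pairs joined by ';' for every legacy variable set."""
    pairs = [
        f"{option}={env[var]}"
        for var, option in LEGACY_TO_OPTION.items()
        if env.get(var)  # skip unset and empty variables, like the check does
    ]
    return ";".join(pairs)


if __name__ == "__main__":
    print(legacy_postgres_options({"PAPERLESS_DBSSLMODE": "require"}))
```

Passing a plain dict instead of reading `os.environ` keeps the sketch easy to try without touching the real environment.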
@@ -17,8 +17,6 @@ from dateparser.languages.loader import LocaleDataLoader
from django.utils.translation import gettext_lazy as _
from dotenv import load_dotenv

from paperless.settings.custom import parse_db_settings

logger = logging.getLogger("paperless.settings")

# Tap paperless.conf if it's available
@@ -284,7 +282,7 @@ DEBUG = __get_boolean("PAPERLESS_DEBUG", "NO")
# Directories #
###############################################################################

BASE_DIR: Path = Path(__file__).resolve().parent.parent.parent
BASE_DIR: Path = Path(__file__).resolve().parent.parent

STATIC_ROOT = __get_path("PAPERLESS_STATICDIR", BASE_DIR.parent / "static")

@@ -724,8 +722,83 @@ EMAIL_CERTIFICATE_FILE = __get_optional_path("PAPERLESS_EMAIL_CERTIFICATE_LOCATI
###############################################################################
# Database #
###############################################################################
def _parse_db_settings() -> dict:
    databases = {
        "default": {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": DATA_DIR / "db.sqlite3",
            "OPTIONS": {},
        },
    }
    if os.getenv("PAPERLESS_DBHOST"):
        # Have sqlite available as a second option for management commands
        # This is important when migrating to/from sqlite
        databases["sqlite"] = databases["default"].copy()

DATABASES = parse_db_settings(DATA_DIR)
        databases["default"] = {
            "HOST": os.getenv("PAPERLESS_DBHOST"),
            "NAME": os.getenv("PAPERLESS_DBNAME", "paperless"),
            "USER": os.getenv("PAPERLESS_DBUSER", "paperless"),
            "PASSWORD": os.getenv("PAPERLESS_DBPASS", "paperless"),
            "OPTIONS": {},
        }
        if os.getenv("PAPERLESS_DBPORT"):
            databases["default"]["PORT"] = os.getenv("PAPERLESS_DBPORT")

        # Leave room for future extensibility
        if os.getenv("PAPERLESS_DBENGINE") == "mariadb":
            engine = "django.db.backends.mysql"
            # Contrary to Postgres, Django does not natively support connection pooling for MariaDB.
            # However, since MariaDB uses threads instead of forks, establishing connections is significantly faster
            # compared to PostgreSQL, so the lack of pooling is not an issue
            options = {
                "read_default_file": "/etc/mysql/my.cnf",
                "charset": "utf8mb4",
                "ssl_mode": os.getenv("PAPERLESS_DBSSLMODE", "PREFERRED"),
                "ssl": {
                    "ca": os.getenv("PAPERLESS_DBSSLROOTCERT", None),
                    "cert": os.getenv("PAPERLESS_DBSSLCERT", None),
                    "key": os.getenv("PAPERLESS_DBSSLKEY", None),
                },
            }

        else:  # Default to PostgresDB
            engine = "django.db.backends.postgresql"
            options = {
                "sslmode": os.getenv("PAPERLESS_DBSSLMODE", "prefer"),
                "sslrootcert": os.getenv("PAPERLESS_DBSSLROOTCERT", None),
                "sslcert": os.getenv("PAPERLESS_DBSSLCERT", None),
                "sslkey": os.getenv("PAPERLESS_DBSSLKEY", None),
            }
            if int(os.getenv("PAPERLESS_DB_POOLSIZE", 0)) > 0:
                options.update(
                    {
                        "pool": {
                            "min_size": 1,
                            "max_size": int(os.getenv("PAPERLESS_DB_POOLSIZE")),
                        },
                    },
                )

        databases["default"]["ENGINE"] = engine
        databases["default"]["OPTIONS"].update(options)

    if os.getenv("PAPERLESS_DB_TIMEOUT") is not None:
        if databases["default"]["ENGINE"] == "django.db.backends.sqlite3":
            databases["default"]["OPTIONS"].update(
                {"timeout": int(os.getenv("PAPERLESS_DB_TIMEOUT"))},
            )
        else:
            databases["default"]["OPTIONS"].update(
                {"connect_timeout": int(os.getenv("PAPERLESS_DB_TIMEOUT"))},
            )
            databases["sqlite"]["OPTIONS"].update(
                {"timeout": int(os.getenv("PAPERLESS_DB_TIMEOUT"))},
            )
    return databases


DATABASES = _parse_db_settings()

if os.getenv("PAPERLESS_DBENGINE") == "mariadb":
    # Silence Django error on old MariaDB versions.
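Reviewer note: the two `BASE_DIR` lines in the hunk above differ by one `.parent`. That matches the settings module living either in a package (`paperless/settings/__init__.py`) or in a single file (`paperless/settings.py`), the two locations referenced elsewhere in this diff. A minimal sketch of why the count changes, assuming a hypothetical `/app/src` checkout root:

```python
from pathlib import PurePosixPath

# Hypothetical absolute locations of the settings module in each layout.
package_layout = PurePosixPath("/app/src/paperless/settings/__init__.py")
module_layout = PurePosixPath("/app/src/paperless/settings.py")

# The package layout is one directory deeper, so it needs one more .parent
# to reach the same base directory.
base_from_package = package_layout.parent.parent.parent
base_from_module = module_layout.parent.parent

assert base_from_package == base_from_module
print(base_from_package)
```

`PurePosixPath` is used so the sketch does not touch the filesystem; the real setting calls `Path(__file__).resolve()` first.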
@@ -1,122 +0,0 @@
import os
from pathlib import Path
from typing import Any

from paperless.settings.parsers import get_choice_from_env
from paperless.settings.parsers import get_int_from_env
from paperless.settings.parsers import parse_dict_from_str


def parse_db_settings(data_dir: Path) -> dict[str, dict[str, Any]]:
    """Parse database settings from environment variables.

    Core connection variables (no deprecation):
    - PAPERLESS_DBENGINE (sqlite/postgresql/mariadb)
    - PAPERLESS_DBHOST, PAPERLESS_DBPORT
    - PAPERLESS_DBNAME, PAPERLESS_DBUSER, PAPERLESS_DBPASS

    Advanced options can be set via:
    - Legacy individual env vars (deprecated in v3.0, removed in v3.2)
    - PAPERLESS_DB_OPTIONS (recommended v3+ approach)

    Args:
        data_dir: The data directory path for SQLite database location.

    Returns:
        A databases dict suitable for Django DATABASES setting.
    """
    try:
        engine = get_choice_from_env(
            "PAPERLESS_DBENGINE",
            {"sqlite", "postgresql", "mariadb"},
            default="sqlite",
        )
    except ValueError:
        # MariaDB users already had to set PAPERLESS_DBENGINE, so it was picked up above
        # SQLite users didn't need to set anything
        engine = "postgresql" if "PAPERLESS_DBHOST" in os.environ else "sqlite"

    db_config: dict[str, Any]
    base_options: dict[str, Any]

    match engine:
        case "sqlite":
            db_config = {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": str((data_dir / "db.sqlite3").resolve()),
            }
            base_options = {}

        case "postgresql":
            db_config = {
                "ENGINE": "django.db.backends.postgresql",
                "HOST": os.getenv("PAPERLESS_DBHOST"),
                "NAME": os.getenv("PAPERLESS_DBNAME", "paperless"),
                "USER": os.getenv("PAPERLESS_DBUSER", "paperless"),
                "PASSWORD": os.getenv("PAPERLESS_DBPASS", "paperless"),
            }

            base_options = {
                "sslmode": os.getenv("PAPERLESS_DBSSLMODE", "prefer"),
                "sslrootcert": os.getenv("PAPERLESS_DBSSLROOTCERT"),
                "sslcert": os.getenv("PAPERLESS_DBSSLCERT"),
                "sslkey": os.getenv("PAPERLESS_DBSSLKEY"),
            }

            if (pool_size := get_int_from_env("PAPERLESS_DB_POOLSIZE")) is not None:
                base_options["pool"] = {
                    "min_size": 1,
                    "max_size": pool_size,
                }

        case "mariadb":
            db_config = {
                "ENGINE": "django.db.backends.mysql",
                "HOST": os.getenv("PAPERLESS_DBHOST"),
                "NAME": os.getenv("PAPERLESS_DBNAME", "paperless"),
                "USER": os.getenv("PAPERLESS_DBUSER", "paperless"),
                "PASSWORD": os.getenv("PAPERLESS_DBPASS", "paperless"),
            }

            base_options = {
                "read_default_file": "/etc/mysql/my.cnf",
                "charset": "utf8mb4",
                "collation": "utf8mb4_unicode_ci",
                "ssl_mode": os.getenv("PAPERLESS_DBSSLMODE", "PREFERRED"),
                "ssl": {
                    "ca": os.getenv("PAPERLESS_DBSSLROOTCERT"),
                    "cert": os.getenv("PAPERLESS_DBSSLCERT"),
                    "key": os.getenv("PAPERLESS_DBSSLKEY"),
                },
            }
        case _:  # pragma: no cover
            raise NotImplementedError(engine)

    # Handle port setting for external databases
    if (
        engine in ("postgresql", "mariadb")
        and (port := get_int_from_env("PAPERLESS_DBPORT")) is not None
    ):
        db_config["PORT"] = port

    # Handle timeout setting (common across all engines, different key names)
    if (timeout := get_int_from_env("PAPERLESS_DB_TIMEOUT")) is not None:
        timeout_key = "timeout" if engine == "sqlite" else "connect_timeout"
        base_options[timeout_key] = timeout

    # Apply PAPERLESS_DB_OPTIONS overrides
    db_config["OPTIONS"] = parse_dict_from_str(
        os.getenv("PAPERLESS_DB_OPTIONS"),
        defaults=base_options,
        separator=";",
        type_map={
            # SQLite options
            "timeout": int,
            # Postgres/MariaDB options
            "connect_timeout": int,
            "pool.min_size": int,
            "pool.max_size": int,
        },
    )

    return {"default": db_config}
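Reviewer note: the engine selection at the top of `parse_db_settings` can be restated without the helper. This is a minimal sketch, assuming `get_choice_from_env` behaves as shown in `paperless.settings.parsers` in this diff; `resolve_engine` is a hypothetical name, and it takes a plain dict in place of `os.environ`.

```python
VALID_ENGINES = {"sqlite", "postgresql", "mariadb"}


def resolve_engine(env: dict[str, str]) -> str:
    """Hypothetical restatement of the engine choice in parse_db_settings."""
    # get_choice_from_env(..., default="sqlite") returns the default when the
    # variable is missing, so a missing variable never raises.
    value = env.get("PAPERLESS_DBENGINE", "sqlite")
    if value in VALID_ENGINES:
        return value
    # Equivalent of the `except ValueError` branch: only reached when the
    # variable is set to something outside the allowed choices.
    return "postgresql" if "PAPERLESS_DBHOST" in env else "sqlite"


if __name__ == "__main__":
    for sample in (
        {},
        {"PAPERLESS_DBHOST": "db"},
        {"PAPERLESS_DBENGINE": "postgres", "PAPERLESS_DBHOST": "db"},
    ):
        print(sample, "->", resolve_engine(sample))
```

Under this reading, the fallback to `postgresql` runs only when `PAPERLESS_DBENGINE` holds an unrecognised value; setting `PAPERLESS_DBHOST` alone still selects `sqlite`, which may be worth confirming against the intended upgrade path.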
@@ -1,192 +0,0 @@
import copy
import os
from collections.abc import Callable
from collections.abc import Mapping
from pathlib import Path
from typing import Any
from typing import TypeVar
from typing import overload

T = TypeVar("T")


def str_to_bool(value: str) -> bool:
    """
    Converts a string representation of truth to a boolean value.

    Recognizes 'true', '1', 't', 'y', 'yes' as True, and
    'false', '0', 'f', 'n', 'no' as False. Case-insensitive.

    Args:
        value: The string to convert.

    Returns:
        The boolean representation of the string.

    Raises:
        ValueError: If the string is not a recognized boolean value.
    """
    val_lower = value.strip().lower()
    if val_lower in ("true", "1", "t", "y", "yes"):
        return True
    elif val_lower in ("false", "0", "f", "n", "no"):
        return False
    raise ValueError(f"Cannot convert '{value}' to a boolean.")


@overload
def get_int_from_env(key: str) -> int | None: ...


@overload
def get_int_from_env(key: str, default: None) -> int | None: ...


@overload
def get_int_from_env(key: str, default: int) -> int: ...


def get_int_from_env(key: str, default: int | None = None) -> int | None:
    """
    Return an integer value based on the environment variable.
    If default is provided, returns that value when key is missing.
    If default is None, returns None when key is missing.
    """
    if key not in os.environ:
        return default

    return int(os.environ[key])


def parse_dict_from_str(
    env_str: str | None,
    defaults: dict[str, Any] | None = None,
    type_map: Mapping[str, Callable[[str], Any]] | None = None,
    separator: str = ",",
) -> dict[str, Any]:
    """
    Parses a key-value string into a dictionary, applying defaults and casting types.

    Supports nested keys via dot-notation, e.g.:
        "database.host=localhost,database.port=5432"

    Args:
        env_str: The string from the environment variable (e.g., "port=9090,debug=true").
        defaults: A dictionary of default values (can contain nested dicts).
        type_map: A dictionary mapping keys (dot-notation allowed) to a type or a parsing
                  function (e.g., {'port': int, 'debug': bool, 'database.port': int}).
                  The special `bool` type triggers custom boolean parsing.
        separator: The character used to separate key-value pairs. Defaults to ','.

    Returns:
        A dictionary with the parsed and correctly-typed settings.

    Raises:
        ValueError: If a value cannot be cast to its specified type.
    """

    def _set_nested(d: dict, keys: list[str], value: Any) -> None:
        """Set a nested value, creating intermediate dicts as needed."""
        cur = d
        for k in keys[:-1]:
            if k not in cur or not isinstance(cur[k], dict):
                cur[k] = {}
            cur = cur[k]
        cur[keys[-1]] = value

    def _get_nested(d: dict, keys: list[str]) -> Any:
        """Get nested value or raise KeyError if not present."""
        cur = d
        for k in keys:
            if not isinstance(cur, dict) or k not in cur:
                raise KeyError
            cur = cur[k]
        return cur

    def _has_nested(d: dict, keys: list[str]) -> bool:
        try:
            _get_nested(d, keys)
            return True
        except KeyError:
            return False

    settings: dict[str, Any] = copy.deepcopy(defaults) if defaults else {}
    _type_map = type_map if type_map else {}

    if not env_str:
        return settings

    # Parse the environment string using the specified separator
    pairs = [p.strip() for p in env_str.split(separator) if p.strip()]
    for pair in pairs:
        if "=" not in pair:
            # ignore malformed pairs
            continue
        key, val = pair.split("=", 1)
        key = key.strip()
        val = val.strip()
        if not key:
            continue
        parts = key.split(".")
        _set_nested(settings, parts, val)

    # Apply type casting to the updated settings (supports nested keys in type_map)
    for key, caster in _type_map.items():
        key_parts = key.split(".")
        if _has_nested(settings, key_parts):
            raw_val = _get_nested(settings, key_parts)
            # Only cast if it's a string (i.e. from env parsing). If defaults already provided
            # a different type we leave it as-is.
            if isinstance(raw_val, str):
                try:
                    if caster is bool:
                        parsed = str_to_bool(raw_val)
                    elif caster is Path:
                        parsed = Path(raw_val).resolve()
                    else:
                        parsed = caster(raw_val)
                except (ValueError, TypeError) as e:
                    caster_name = getattr(caster, "__name__", repr(caster))
                    raise ValueError(
                        f"Error casting key '{key}' with value '{raw_val}' "
                        f"to type '{caster_name}'",
                    ) from e
                _set_nested(settings, key_parts, parsed)

    return settings


def get_choice_from_env(
    env_key: str,
    choices: set[str],
    default: str | None = None,
) -> str:
    """
    Gets and validates an environment variable against a set of allowed choices.

    Args:
        env_key: The environment variable key to validate
        choices: Set of valid choices for the environment variable
        default: Optional default value if environment variable is not set

    Returns:
        The validated environment variable value

    Raises:
        ValueError: If the environment variable value is not in choices
                    or if no default is provided and env var is missing
    """
    value = os.environ.get(env_key, default)

    if value is None:
        raise ValueError(
            f"Environment variable '{env_key}' is required but not set.",
        )

    if value not in choices:
        raise ValueError(
            f"Environment variable '{env_key}' has invalid value '{value}'. "
            f"Valid choices are: {', '.join(sorted(choices))}",
        )

    return value
@@ -1,266 +0,0 @@
import os
from pathlib import Path

import pytest
from pytest_mock import MockerFixture

from paperless.settings.custom import parse_db_settings


class TestParseDbSettings:
    """Test suite for parse_db_settings function."""

    @pytest.mark.parametrize(
        ("env_vars", "expected_database_settings"),
        [
            pytest.param(
                {},
                {
                    "default": {
                        "ENGINE": "django.db.backends.sqlite3",
                        "NAME": None,  # Will be replaced with tmp_path
                        "OPTIONS": {},
                    },
                },
                id="default-sqlite",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "sqlite",
                    "PAPERLESS_DB_OPTIONS": "timeout=30",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.sqlite3",
                        "NAME": None,  # Will be replaced with tmp_path
                        "OPTIONS": {
                            "timeout": 30,
                        },
                    },
                },
                id="sqlite-with-timeout-override",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "postgresql",
                    "PAPERLESS_DBHOST": "localhost",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.postgresql",
                        "HOST": "localhost",
                        "NAME": "paperless",
                        "USER": "paperless",
                        "PASSWORD": "paperless",
                        "OPTIONS": {
                            "sslmode": "prefer",
                            "sslrootcert": None,
                            "sslcert": None,
                            "sslkey": None,
                        },
                    },
                },
                id="postgresql-defaults",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "postgresql",
                    "PAPERLESS_DBHOST": "paperless-db-host",
                    "PAPERLESS_DBPORT": "1111",
                    "PAPERLESS_DBNAME": "customdb",
                    "PAPERLESS_DBUSER": "customuser",
                    "PAPERLESS_DBPASS": "custompass",
                    "PAPERLESS_DB_OPTIONS": "pool.max_size=50;pool.min_size=2;sslmode=require",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.postgresql",
                        "HOST": "paperless-db-host",
                        "PORT": 1111,
                        "NAME": "customdb",
                        "USER": "customuser",
                        "PASSWORD": "custompass",
                        "OPTIONS": {
                            "sslmode": "require",
                            "sslrootcert": None,
                            "sslcert": None,
                            "sslkey": None,
                            "pool": {
                                "min_size": 2,
                                "max_size": 50,
                            },
                        },
                    },
                },
                id="postgresql-overrides",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "postgresql",
                    "PAPERLESS_DBHOST": "pghost",
                    "PAPERLESS_DB_POOLSIZE": "10",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.postgresql",
                        "HOST": "pghost",
                        "NAME": "paperless",
                        "USER": "paperless",
                        "PASSWORD": "paperless",
                        "OPTIONS": {
                            "sslmode": "prefer",
                            "sslrootcert": None,
                            "sslcert": None,
                            "sslkey": None,
                            "pool": {
                                "min_size": 1,
                                "max_size": 10,
                            },
                        },
                    },
                },
                id="postgresql-legacy-poolsize",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "postgresql",
                    "PAPERLESS_DBHOST": "pghost",
                    "PAPERLESS_DBSSLMODE": "require",
                    "PAPERLESS_DBSSLROOTCERT": "/certs/ca.crt",
                    "PAPERLESS_DB_TIMEOUT": "30",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.postgresql",
                        "HOST": "pghost",
                        "NAME": "paperless",
                        "USER": "paperless",
                        "PASSWORD": "paperless",
                        "OPTIONS": {
                            "sslmode": "require",
                            "sslrootcert": "/certs/ca.crt",
                            "sslcert": None,
                            "sslkey": None,
                            "connect_timeout": 30,
                        },
                    },
                },
                id="postgresql-legacy-ssl-and-timeout",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "mariadb",
                    "PAPERLESS_DBHOST": "localhost",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.mysql",
                        "HOST": "localhost",
                        "NAME": "paperless",
                        "USER": "paperless",
                        "PASSWORD": "paperless",
                        "OPTIONS": {
                            "read_default_file": "/etc/mysql/my.cnf",
                            "charset": "utf8mb4",
                            "collation": "utf8mb4_unicode_ci",
                            "ssl_mode": "PREFERRED",
                            "ssl": {
                                "ca": None,
                                "cert": None,
                                "key": None,
                            },
                        },
                    },
                },
                id="mariadb-defaults",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "mariadb",
                    "PAPERLESS_DBHOST": "paperless-mariadb-host",
                    "PAPERLESS_DBPORT": "5555",
                    "PAPERLESS_D                    "PAPERLESS_DBUSER": "my-cool-user",
                    "PAPERLESS_DBPASS": "my-secure-password",
                    "PAPERLESS_DB_OPTIONS": "ssl.ca=/path/to/ca.pem;ssl_mode=REQUIRED",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.mysql",
                        "HOST": "paperless-mariadb-host",
                        "PORT": 5555,
                        "NAME": "paperless",
                        "USER": "my-cool-user",
                        "PASSWORD": "my-secure-password",
                        "OPTIONS": {
                            "read_default_file": "/etc/mysql/my.cnf",
                            "charset": "utf8mb4",
                            "collation": "utf8mb4_unicode_ci",
                            "ssl_mode": "REQUIRED",
                            "ssl": {
                                "ca": "/path/to/ca.pem",
                                "cert": None,
                                "key": None,
                            },
                        },
                    },
                },
                id="mariadb-overrides",
            ),
            pytest.param(
                {
                    "PAPERLESS_DBENGINE": "mariadb",
                    "PAPERLESS_DBHOST": "mariahost",
                    "PAPERLESS_DBSSLMODE": "REQUIRED",
                    "PAPERLESS_DBSSLROOTCERT": "/certs/ca.pem",
                    "PAPERLESS_DBSSLCERT": "/certs/client.pem",
                    "PAPERLESS_DBSSLKEY": "/certs/client.key",
                    "PAPERLESS_DB_TIMEOUT": "25",
                },
                {
                    "default": {
                        "ENGINE": "django.db.backends.mysql",
                        "HOST": "mariahost",
                        "NAME": "paperless",
                        "USER": "paperless",
                        "PASSWORD": "paperless",
                        "OPTIONS": {
                            "read_default_file": "/etc/mysql/my.cnf",
                            "charset": "utf8mb4",
                            "collation": "utf8mb4_unicode_ci",
                            "ssl_mode": "REQUIRED",
                            "ssl": {
                                "ca": "/certs/ca.pem",
                                "cert": "/certs/client.pem",
                                "key": "/certs/client.key",
                            },
                            "connect_timeout": 25,
                        },
                    },
                },
                id="mariadb-legacy-ssl-and-timeout",
            ),
        ],
    )
    def test_parse_db_settings(
        self,
        tmp_path: Path,
        mocker: MockerFixture,
        env_vars: dict[str, str],
        expected_database_settings: dict[str, dict],
    ) -> None:
        """Test various database configurations with defaults and overrides."""
        # Clear environment and set test vars
        mocker.patch.dict(os.environ, env_vars, clear=True)

        # Update expected paths with actual tmp_path
        if (
            "default" in expected_database_settings
            and expected_database_settings["default"]["NAME"] is None
        ):
            expected_database_settings["default"]["NAME"] = str(
                tmp_path / "db.sqlite3",
            )

        settings = parse_db_settings(tmp_path)

        assert settings == expected_database_settings
@@ -1,414 +0,0 @@
|
||||
import os
|
||||
from pathlib import Path
|
||||
|
||||
import pytest
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from paperless.settings.parsers import get_choice_from_env
|
||||
from paperless.settings.parsers import get_int_from_env
|
||||
from paperless.settings.parsers import parse_dict_from_str
|
||||
from paperless.settings.parsers import str_to_bool
|
||||
|
||||
|
||||
class TestStringToBool:
|
||||
@pytest.mark.parametrize(
|
||||
"true_value",
|
||||
[
|
||||
pytest.param("true", id="lowercase_true"),
|
||||
pytest.param("1", id="digit_1"),
|
||||
pytest.param("T", id="capital_T"),
|
||||
pytest.param("y", id="lowercase_y"),
|
||||
pytest.param("YES", id="uppercase_YES"),
|
||||
pytest.param(" True ", id="whitespace_true"),
|
||||
],
|
||||
)
|
||||
def test_true_conversion(self, true_value: str):
|
||||
"""Test that various 'true' strings correctly evaluate to True."""
|
||||
assert str_to_bool(true_value) is True
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"false_value",
|
||||
[
|
||||
pytest.param("false", id="lowercase_false"),
|
||||
pytest.param("0", id="digit_0"),
|
||||
pytest.param("f", id="capital_f"),
|
||||
pytest.param("N", id="capital_N"),
|
||||
pytest.param("no", id="lowercase_no"),
|
||||
pytest.param(" False ", id="whitespace_false"),
|
||||
],
|
||||
)
|
||||
def test_false_conversion(self, false_value: str):
|
||||
"""Test that various 'false' strings correctly evaluate to False."""
|
||||
assert str_to_bool(false_value) is False
|
||||
|
||||
def test_invalid_conversion(self):
|
||||
"""Test that an invalid string raises a ValueError."""
|
||||
with pytest.raises(ValueError, match="Cannot convert 'maybe' to a boolean\\."):
|
||||
str_to_bool("maybe")
|
||||
|
||||
|
||||
class TestParseDictFromString:
|
||||
def test_empty_and_none_input(self):
|
||||
"""Test behavior with None or empty string input."""
|
||||
assert parse_dict_from_str(None) == {}
|
||||
assert parse_dict_from_str("") == {}
|
||||
defaults = {"a": 1}
|
||||
res = parse_dict_from_str(None, defaults=defaults)
|
||||
assert res == defaults
|
||||
# Ensure it returns a copy, not the original object
|
||||
assert res is not defaults
|
||||
|
||||
def test_basic_parsing(self):
|
||||
"""Test simple key-value parsing without defaults or types."""
|
||||
env_str = "key1=val1, key2=val2"
|
||||
expected = {"key1": "val1", "key2": "val2"}
|
||||
assert parse_dict_from_str(env_str) == expected
|
||||
|
||||
def test_with_defaults(self):
|
||||
"""Test that environment values override defaults correctly."""
|
||||
defaults = {"host": "localhost", "port": 8000, "user": "default"}
|
||||
env_str = "port=9090, host=db.example.com"
|
||||
expected = {"host": "db.example.com", "port": "9090", "user": "default"}
|
||||
result = parse_dict_from_str(env_str, defaults=defaults)
|
||||
assert result == expected
|
||||
|
||||
def test_type_casting(self):
|
||||
"""Test successful casting of values to specified types."""
|
||||
env_str = "port=9090, debug=true, timeout=12.5, user=admin"
|
||||
type_map = {"port": int, "debug": bool, "timeout": float}
|
||||
expected = {"port": 9090, "debug": True, "timeout": 12.5, "user": "admin"}
|
||||
result = parse_dict_from_str(env_str, type_map=type_map)
|
||||
assert result == expected
|
||||
|
||||
def test_type_casting_with_defaults(self):
|
||||
"""Test casting when values come from both defaults and env string."""
|
||||
defaults = {"port": 8000, "debug": False, "retries": 3}
|
||||
env_str = "port=9090, debug=true"
|
||||
type_map = {"port": int, "debug": bool, "retries": int}
|
||||
|
||||
# The 'retries' value comes from defaults and is already an int,
|
||||
# so it should not be processed by the caster.
|
||||
expected = {"port": 9090, "debug": True, "retries": 3}
|
||||
result = parse_dict_from_str(env_str, defaults=defaults, type_map=type_map)
|
||||
assert result == expected
|
||||
assert isinstance(result["retries"], int)
|
||||
|
||||
def test_path_casting(self, tmp_path: Path):
|
||||
"""Test successful casting of a string to a resolved pathlib.Path object."""
|
||||
# Create a dummy file to resolve against
|
||||
test_file = tmp_path / "test_file.txt"
|
||||
test_file.touch()
|
||||
|
||||
env_str = f"config_path={test_file}"
|
||||
type_map = {"config_path": Path}
|
||||
result = parse_dict_from_str(env_str, type_map=type_map)
|
||||
|
||||
# The result should be a resolved Path object
|
||||
assert isinstance(result["config_path"], Path)
|
||||
assert result["config_path"] == test_file.resolve()
|
||||
|
||||
def test_custom_separator(self):
|
||||
"""Test parsing with a custom separator like a semicolon."""
|
||||
env_str = "host=db; port=5432; user=test"
|
||||
expected = {"host": "db", "port": "5432", "user": "test"}
|
||||
result = parse_dict_from_str(env_str, separator=";")
|
||||
assert result == expected
|
||||
|
||||
def test_edge_cases_in_string(self):
|
||||
"""Test malformed strings to ensure robustness."""
|
||||
# Malformed pair 'debug' is skipped, extra comma is ignored
|
||||
env_str = "key=val,, debug, foo=bar"
|
||||
expected = {"key": "val", "foo": "bar"}
|
||||
assert parse_dict_from_str(env_str) == expected
|
||||
|
||||
# Value can contain the equals sign
|
||||
env_str = "url=postgres://user:pass@host:5432/db"
|
||||
expected = {"url": "postgres://user:pass@host:5432/db"}
|
||||
assert parse_dict_from_str(env_str) == expected
|
||||
|
||||
def test_casting_error_handling(self):
|
||||
"""Test that a ValueError is raised for invalid casting."""
|
||||
env_str = "port=not-a-number"
|
||||
type_map = {"port": int}
|
||||
|
||||
with pytest.raises(ValueError) as excinfo:
|
||||
parse_dict_from_str(env_str, type_map=type_map)
|
||||
|
||||
assert "Error casting key 'port'" in str(excinfo.value)
|
||||
assert "value 'not-a-number'" in str(excinfo.value)
|
||||
assert "to type 'int'" in str(excinfo.value)
|
||||
|
||||
def test_bool_casting_error(self):
|
||||
"""Test that an invalid boolean string raises a ValueError."""
|
||||
env_str = "debug=maybe"
|
||||
type_map = {"debug": bool}
|
||||
with pytest.raises(ValueError, match="Error casting key 'debug'"):
|
||||
parse_dict_from_str(env_str, type_map=type_map)
|
||||
|
||||
def test_nested_key_parsing_basic(self):
|
||||
"""Basic nested key parsing using dot-notation."""
|
||||
env_str = "database.host=db.example.com, database.port=5432, logging.level=INFO"
|
||||
result = parse_dict_from_str(env_str)
|
||||
assert result == {
|
||||
"database": {"host": "db.example.com", "port": "5432"},
|
||||
"logging": {"level": "INFO"},
|
||||
}
|
||||
|
||||
def test_nested_overrides_defaults_and_deepcopy(self):
|
||||
"""Nested env keys override defaults and defaults are deep-copied."""
|
||||
defaults = {"database": {"host": "127.0.0.1", "port": 3306, "user": "default"}}
|
||||
env_str = "database.host=db.example.com, debug=true"
|
||||
result = parse_dict_from_str(
|
||||
env_str,
|
||||
defaults=defaults,
|
||||
type_map={"debug": bool},
|
||||
)
|
||||
|
||||
assert result["database"]["host"] == "db.example.com"
|
||||
# Unchanged default preserved
|
||||
assert result["database"]["port"] == 3306
|
||||
assert result["database"]["user"] == "default"
|
||||
# Default object was deep-copied (no same nested object identity)
|
||||
assert result is not defaults
|
||||
assert result["database"] is not defaults["database"]
|
||||
|
||||
def test_nested_type_casting(self):
|
||||
"""Type casting for nested keys (dot-notation) should work."""
|
||||
env_str = "database.host=db.example.com, database.port=5433, debug=false"
|
||||
type_map = {"database.port": int, "debug": bool}
|
||||
result = parse_dict_from_str(env_str, type_map=type_map)
|
||||
|
||||
assert result["database"]["host"] == "db.example.com"
|
||||
assert result["database"]["port"] == 5433
|
||||
assert isinstance(result["database"]["port"], int)
|
||||
assert result["debug"] is False
|
||||
assert isinstance(result["debug"], bool)
|
||||
|
||||
def test_nested_casting_error_message(self):
|
||||
"""Error messages should include the full dotted key name on failure."""
|
||||
env_str = "database.port=not-a-number"
|
||||
type_map = {"database.port": int}
|
||||
with pytest.raises(ValueError) as excinfo:
|
||||
parse_dict_from_str(env_str, type_map=type_map)
|
||||
|
||||
msg = str(excinfo.value)
|
||||
assert "Error casting key 'database.port'" in msg
|
||||
assert "value 'not-a-number'" in msg
|
||||
assert "to type 'int'" in msg
|
||||
|
||||
def test_type_map_does_not_recast_non_string_defaults(self):
|
||||
"""If a default already provides a non-string value, the caster should skip it."""
|
||||
defaults = {"database": {"port": 3306}}
|
||||
type_map = {"database.port": int}
|
||||
result = parse_dict_from_str(None, defaults=defaults, type_map=type_map)
|
||||
assert result["database"]["port"] == 3306
|
||||
assert isinstance(result["database"]["port"], int)
|
||||
|
||||
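Reviewer note: the nested-key tests above can be read against a minimal sketch of dot-notation parsing. `parse_nested` and `_to_bool` are hypothetical stand-ins, not the project's `parse_dict_from_str`; the sketch assumes comma-separated `key=value` pairs, deep-copied defaults, and a `type_map` keyed by the full dotted name.

```python
import copy


def _to_bool(value: str) -> bool:
    """Hypothetical helper: strict boolean parsing for env-style strings."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes", "on"}:
        return True
    if lowered in {"false", "0", "no", "off"}:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def parse_nested(env_str, defaults=None, type_map=None):
    """Sketch only: parse 'a.b=1, c=2' into a nested dict."""
    # Deep-copy so the caller's defaults are never mutated.
    result = copy.deepcopy(defaults) if defaults else {}
    type_map = type_map or {}
    if not env_str:
        return result
    for pair in env_str.split(","):
        if "=" not in pair:
            continue
        key, value = (part.strip() for part in pair.split("=", 1))
        caster = type_map.get(key)
        if caster is not None:
            # Only values parsed from the string are cast; defaults are left alone.
            cast = _to_bool if caster is bool else caster
            try:
                value = cast(value)
            except (ValueError, TypeError) as exc:
                raise ValueError(
                    f"Error casting key '{key}' with value '{value}' "
                    f"to type '{caster.__name__}'",
                ) from exc
        # Walk (or create) the parent dicts named by the dotted prefix.
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result
```

Because casting happens only on values read from the string, a default such as `{"database": {"port": 3306}}` passes through untouched even when `type_map` names `database.port`.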
|
||||
class TestGetIntFromEnv:
|
||||
@pytest.mark.parametrize(
|
||||
("env_value", "expected"),
|
||||
[
|
||||
pytest.param("42", 42, id="positive"),
|
||||
pytest.param("-10", -10, id="negative"),
|
||||
pytest.param("0", 0, id="zero"),
|
||||
pytest.param("999", 999, id="large_positive"),
|
||||
pytest.param("-999", -999, id="large_negative"),
|
||||
],
|
||||
)
|
||||
def test_existing_env_var_valid_ints(self, mocker, env_value, expected):
|
||||
"""Test that existing environment variables with valid integers return correct values."""
|
||||
mocker.patch.dict(os.environ, {"INT_VAR": env_value})
|
||||
assert get_int_from_env("INT_VAR") == expected
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
("default", "expected"),
|
||||
[
|
||||
pytest.param(100, 100, id="positive_default"),
|
||||
pytest.param(0, 0, id="zero_default"),
|
||||
pytest.param(-50, -50, id="negative_default"),
|
||||
pytest.param(None, None, id="none_default"),
|
||||
],
|
||||
)
|
||||
def test_missing_env_var_with_defaults(self, mocker, default, expected):
|
||||
"""Test that missing environment variables return provided defaults."""
|
||||
mocker.patch.dict(os.environ, {}, clear=True)
|
||||
assert get_int_from_env("MISSING_VAR", default=default) == expected
|
||||
|
||||
def test_missing_env_var_no_default(self, mocker):
|
||||
"""Test that missing environment variable with no default returns None."""
|
||||
mocker.patch.dict(os.environ, {}, clear=True)
|
||||
assert get_int_from_env("MISSING_VAR") is None
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"invalid_value",
|
||||
[
|
||||
pytest.param("not_a_number", id="text"),
|
||||
pytest.param("42.5", id="float"),
|
||||
pytest.param("42a", id="alpha_suffix"),
|
||||
pytest.param("", id="empty"),
|
||||
pytest.param(" ", id="whitespace"),
|
||||
pytest.param("true", id="boolean"),
|
||||
pytest.param("1.0", id="decimal"),
|
||||
],
|
||||
)
|
||||
def test_invalid_int_values_raise_error(self, mocker, invalid_value):
|
||||
"""Test that invalid integer values raise ValueError."""
|
||||
mocker.patch.dict(os.environ, {"INVALID_INT": invalid_value})
|
||||
with pytest.raises(ValueError):
|
||||
get_int_from_env("INVALID_INT")
|
||||
|
||||
|
||||
class TestGetEnvChoice:
|
||||
@pytest.fixture
|
||||
def valid_choices(self) -> set[str]:
|
||||
"""Fixture providing a set of valid environment choices."""
|
||||
return {"development", "staging", "production"}
|
||||
|
||||
def test_returns_valid_env_value(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test that function returns the environment value when it's valid."""
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": "development"})
|
||||
|
||||
result = get_choice_from_env("TEST_ENV", valid_choices)
|
||||
|
||||
assert result == "development"
|
||||
|
||||
def test_returns_default_when_env_not_set(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test that function returns default value when env var is not set."""
|
||||
mocker.patch.dict("os.environ", {}, clear=True)
|
||||
|
||||
result = get_choice_from_env("TEST_ENV", valid_choices, default="staging")
|
||||
|
||||
assert result == "staging"
|
||||
|
||||
def test_raises_error_when_env_not_set_and_no_default(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test that function raises ValueError when env var is missing and no default."""
|
||||
mocker.patch.dict("os.environ", {}, clear=True)
|
||||
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
get_choice_from_env("TEST_ENV", valid_choices)
|
||||
|
||||
assert "Environment variable 'TEST_ENV' is required but not set" in str(
|
||||
exc_info.value,
|
||||
)
|
||||
|
||||
def test_raises_error_when_env_value_invalid(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test that function raises ValueError when env value is not in choices."""
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": "invalid_value"})
|
||||
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
get_choice_from_env("TEST_ENV", valid_choices)
|
||||
|
||||
error_msg = str(exc_info.value)
|
||||
assert (
|
||||
"Environment variable 'TEST_ENV' has invalid value 'invalid_value'"
|
||||
in error_msg
|
||||
)
|
||||
assert "Valid choices are:" in error_msg
|
||||
assert "development" in error_msg
|
||||
assert "staging" in error_msg
|
||||
assert "production" in error_msg
|
||||
|
||||
def test_raises_error_when_default_invalid(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test that function raises ValueError when default value is not in choices."""
|
||||
mocker.patch.dict("os.environ", {}, clear=True)
|
||||
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
get_choice_from_env("TEST_ENV", valid_choices, default="invalid_default")
|
||||
|
||||
error_msg = str(exc_info.value)
|
||||
assert (
|
||||
"Environment variable 'TEST_ENV' has invalid value 'invalid_default'"
|
||||
in error_msg
|
||||
)
|
||||
|
||||
def test_case_sensitive_validation(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test that validation is case sensitive."""
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": "DEVELOPMENT"})
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
get_choice_from_env("TEST_ENV", valid_choices)
|
||||
|
||||
def test_empty_string_env_value(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test behavior with empty string environment value."""
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": ""})
|
||||
|
||||
with pytest.raises(ValueError) as exc_info:
|
||||
get_choice_from_env("TEST_ENV", valid_choices)
|
||||
|
||||
assert "has invalid value ''" in str(exc_info.value)
|
||||
|
||||
def test_whitespace_env_value(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test behavior with whitespace-only environment value."""
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": " development "})
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
get_choice_from_env("TEST_ENV", valid_choices)
|
||||
|
||||
def test_single_choice_set(self, mocker: MockerFixture) -> None:
|
||||
"""Test function works correctly with single choice set."""
|
||||
single_choice: set[str] = {"production"}
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": "production"})
|
||||
|
||||
result = get_choice_from_env("TEST_ENV", single_choice)
|
||||
|
||||
assert result == "production"
|
||||
|
||||
def test_large_choice_set(self, mocker: MockerFixture) -> None:
|
||||
"""Test function works correctly with large choice set."""
|
||||
large_choices: set[str] = {f"option_{i}" for i in range(100)}
|
||||
mocker.patch.dict("os.environ", {"TEST_ENV": "option_50"})
|
||||
|
||||
result = get_choice_from_env("TEST_ENV", large_choices)
|
||||
|
||||
assert result == "option_50"
|
||||
|
||||
def test_different_env_keys(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
valid_choices: set[str],
|
||||
) -> None:
|
||||
"""Test function works with different environment variable keys."""
|
||||
test_cases = [
|
||||
("DJANGO_ENV", "development"),
|
||||
("DATABASE_BACKEND", "staging"),
|
||||
("LOG_LEVEL", "production"),
|
||||
("APP_MODE", "development"),
|
||||
]
|
||||
|
||||
for env_key, env_value in test_cases:
|
||||
mocker.patch.dict("os.environ", {env_key: env_value})
|
||||
result = get_choice_from_env(env_key, valid_choices)
|
||||
assert result == env_value
|
||||
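Reviewer note: the behaviour exercised by `TestGetEnvChoice` can be summarised with a small sketch. `choice_from_env` is a hypothetical name, not the project's `get_choice_from_env`; the error wording mirrors the strings asserted in the tests above, and the exact format of the choices list is an assumption.

```python
import os
from typing import Optional


def choice_from_env(key: str, choices: set, default: Optional[str] = None) -> str:
    """Sketch only: read an env var and validate it against a fixed set."""
    # The default goes through the same validation as a real value.
    value = os.environ.get(key, default)
    if value is None:
        raise ValueError(f"Environment variable '{key}' is required but not set.")
    # Exact membership test: no case folding, no whitespace stripping.
    if value not in choices:
        raise ValueError(
            f"Environment variable '{key}' has invalid value '{value}'. "
            f"Valid choices are: {', '.join(sorted(choices))}",
        )
    return value
```

Keeping the comparison exact is what makes `"DEVELOPMENT"`, `""` and a padded `" development "` all fail, as the tests expect.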
@@ -78,15 +78,11 @@ class TestCustomAccountAdapter(TestCase):
|
||||
adapter = get_adapter()
|
||||
|
||||
# Test when PAPERLESS_URL is None
|
||||
with override_settings(
|
||||
PAPERLESS_URL=None,
|
||||
ACCOUNT_DEFAULT_HTTP_PROTOCOL="https",
|
||||
):
|
||||
expected_url = f"https://foo.org{reverse('account_reset_password_from_key', kwargs={'uidb36': 'UID', 'key': 'KEY'})}"
|
||||
self.assertEqual(
|
||||
adapter.get_reset_password_from_key_url("UID-KEY"),
|
||||
expected_url,
|
||||
)
|
||||
expected_url = f"https://foo.org{reverse('account_reset_password_from_key', kwargs={'uidb36': 'UID', 'key': 'KEY'})}"
|
||||
self.assertEqual(
|
||||
adapter.get_reset_password_from_key_url("UID-KEY"),
|
||||
expected_url,
|
||||
)
|
||||
|
||||
# Test when PAPERLESS_URL is not None
|
||||
with override_settings(PAPERLESS_URL="https://bar.com"):
|
||||
|
||||
@@ -2,17 +2,13 @@ import os
|
||||
from pathlib import Path
|
||||
from unittest import mock
|
||||
|
||||
import pytest
|
||||
from django.core.checks import Warning
|
||||
from django.test import TestCase
|
||||
from django.test import override_settings
|
||||
from pytest_mock import MockerFixture
|
||||
|
||||
from documents.tests.utils import DirectoriesMixin
|
||||
from documents.tests.utils import FileSystemAssertsMixin
|
||||
from paperless.checks import audit_log_check
|
||||
from paperless.checks import binaries_check
|
||||
from paperless.checks import check_deprecated_db_settings
|
||||
from paperless.checks import debug_mode_check
|
||||
from paperless.checks import paths_check
|
||||
from paperless.checks import settings_values_check
|
||||
@@ -241,157 +237,3 @@ class TestAuditLogChecks(TestCase):
|
||||
("auditlog table was found but audit log is disabled."),
|
||||
msg.msg,
|
||||
)
|
||||
|
||||
|
||||
DEPRECATED_VARS: dict[str, str] = {
|
||||
"PAPERLESS_DB_TIMEOUT": "timeout",
|
||||
"PAPERLESS_DB_POOLSIZE": "pool.min_size / pool.max_size",
|
||||
"PAPERLESS_DBSSLMODE": "sslmode",
|
||||
"PAPERLESS_DBSSLROOTCERT": "sslrootcert",
|
||||
"PAPERLESS_DBSSLCERT": "sslcert",
|
||||
"PAPERLESS_DBSSLKEY": "sslkey",
|
||||
}
|
||||
|
||||
|
||||
class TestDeprecatedDbSettings:
|
||||
"""Test suite for the check_deprecated_db_settings system check."""
|
||||
|
||||
def test_no_deprecated_vars_returns_empty(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""No warnings when none of the deprecated vars are present."""
|
||||
# clear=True ensures vars from the outer test environment do not leak in
|
||||
mocker.patch.dict(os.environ, {}, clear=True)
|
||||
result = check_deprecated_db_settings(None)
|
||||
assert result == []
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
("env_var", "db_option_key"),
|
||||
[
|
||||
("PAPERLESS_DB_TIMEOUT", "timeout"),
|
||||
("PAPERLESS_DB_POOLSIZE", "pool.min_size / pool.max_size"),
|
||||
("PAPERLESS_DBSSLMODE", "sslmode"),
|
||||
("PAPERLESS_DBSSLROOTCERT", "sslrootcert"),
|
||||
("PAPERLESS_DBSSLCERT", "sslcert"),
|
||||
("PAPERLESS_DBSSLKEY", "sslkey"),
|
||||
],
|
||||
ids=[
|
||||
"db-timeout",
|
||||
"db-poolsize",
|
||||
"ssl-mode",
|
||||
"ssl-rootcert",
|
||||
"ssl-cert",
|
||||
"ssl-key",
|
||||
],
|
||||
)
|
||||
def test_single_deprecated_var_produces_one_warning(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
env_var: str,
|
||||
db_option_key: str,
|
||||
) -> None:
|
||||
"""Each deprecated var in isolation produces exactly one warning."""
|
||||
mocker.patch.dict(os.environ, {env_var: "some_value"}, clear=True)
|
||||
result = check_deprecated_db_settings(None)
|
||||
|
||||
assert len(result) == 1
|
||||
warning = result[0]
|
||||
assert isinstance(warning, Warning)
|
||||
assert warning.id == "paperless.W001"
|
||||
assert env_var in warning.hint
|
||||
assert db_option_key in warning.hint
|
||||
|
||||
def test_multiple_deprecated_vars_produce_one_warning_each(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""Each deprecated var present in the environment gets its own warning."""
|
||||
set_vars = {
|
||||
"PAPERLESS_DB_TIMEOUT": "30",
|
||||
"PAPERLESS_DB_POOLSIZE": "10",
|
||||
"PAPERLESS_DBSSLMODE": "require",
|
||||
}
|
||||
mocker.patch.dict(os.environ, set_vars, clear=True)
|
||||
result = check_deprecated_db_settings(None)
|
||||
|
||||
assert len(result) == len(set_vars)
|
||||
assert all(isinstance(w, Warning) for w in result)
|
||||
assert all(w.id == "paperless.W001" for w in result)
|
||||
all_hints = " ".join(w.hint for w in result)
|
||||
for var_name in set_vars:
|
||||
assert var_name in all_hints
|
||||
|
||||
def test_all_deprecated_vars_produces_one_warning_each(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""All deprecated vars set simultaneously produces one warning per var."""
|
||||
all_vars = dict.fromkeys(DEPRECATED_VARS, "some_value")
|
||||
mocker.patch.dict(os.environ, all_vars, clear=True)
|
||||
result = check_deprecated_db_settings(None)
|
||||
|
||||
assert len(result) == len(DEPRECATED_VARS)
|
||||
assert all(isinstance(w, Warning) for w in result)
|
||||
assert all(w.id == "paperless.W001" for w in result)
|
||||
|
||||
def test_unset_vars_not_mentioned_in_warnings(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""Vars absent from the environment do not appear in any warning."""
|
||||
mocker.patch.dict(
|
||||
os.environ,
|
||||
{"PAPERLESS_DB_TIMEOUT": "30"},
|
||||
clear=True,
|
||||
)
|
||||
result = check_deprecated_db_settings(None)
|
||||
|
||||
assert len(result) == 1
|
||||
assert "PAPERLESS_DB_TIMEOUT" in result[0].hint
|
||||
unset_vars = [v for v in DEPRECATED_VARS if v != "PAPERLESS_DB_TIMEOUT"]
|
||||
for var_name in unset_vars:
|
||||
assert var_name not in result[0].hint
|
||||
|
||||
def test_empty_string_var_not_treated_as_set(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""A var set to an empty string is not flagged as a deprecated setting."""
|
||||
mocker.patch.dict(
|
||||
os.environ,
|
||||
{"PAPERLESS_DB_TIMEOUT": ""},
|
||||
clear=True,
|
||||
)
|
||||
result = check_deprecated_db_settings(None)
|
||||
assert result == []
|
||||
|
||||
def test_warning_mentions_migration_target(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""Each warning hints at PAPERLESS_DB_OPTIONS as the migration target."""
|
||||
mocker.patch.dict(
|
||||
os.environ,
|
||||
{"PAPERLESS_DBSSLMODE": "require"},
|
||||
clear=True,
|
||||
)
|
||||
result = check_deprecated_db_settings(None)
|
||||
|
||||
assert len(result) == 1
|
||||
assert "PAPERLESS_DB_OPTIONS" in result[0].hint
|
||||
|
||||
def test_warning_message_identifies_var(
|
||||
self,
|
||||
mocker: MockerFixture,
|
||||
) -> None:
|
||||
"""The warning message (not just the hint) identifies the offending var."""
|
||||
mocker.patch.dict(
|
||||
os.environ,
|
||||
{"PAPERLESS_DBSSLCERT": "/path/to/cert.pem"},
|
||||
clear=True,
|
||||
)
|
||||
result = check_deprecated_db_settings(None)
|
||||
|
||||
assert len(result) == 1
|
||||
assert "PAPERLESS_DBSSLCERT" in result[0].msg
|
||||
|
||||
@@ -9,6 +9,7 @@ from celery.schedules import crontab
|
||||
from paperless.settings import _parse_base_paths
|
||||
from paperless.settings import _parse_beat_schedule
|
||||
from paperless.settings import _parse_dateparser_languages
|
||||
from paperless.settings import _parse_db_settings
|
||||
from paperless.settings import _parse_ignore_dates
|
||||
from paperless.settings import _parse_paperless_url
|
||||
from paperless.settings import _parse_redis_url
|
||||
@@ -377,6 +378,64 @@ class TestCeleryScheduleParsing(TestCase):
|
||||
)
|
||||
|
||||
|
||||
class TestDBSettings(TestCase):
|
||||
def test_db_timeout_with_sqlite(self) -> None:
|
||||
"""
|
||||
GIVEN:
|
||||
- PAPERLESS_DB_TIMEOUT is set
|
||||
WHEN:
|
||||
- Settings are parsed
|
||||
THEN:
|
||||
- PAPERLESS_DB_TIMEOUT set for sqlite
|
||||
"""
|
||||
with mock.patch.dict(
|
||||
os.environ,
|
||||
{
|
||||
"PAPERLESS_DB_TIMEOUT": "10",
|
||||
},
|
||||
):
|
||||
databases = _parse_db_settings()
|
||||
|
||||
self.assertDictEqual(
|
||||
{
|
||||
"timeout": 10.0,
|
||||
},
|
||||
databases["default"]["OPTIONS"],
|
||||
)
|
||||
|
||||
def test_db_timeout_with_not_sqlite(self) -> None:
|
||||
"""
|
||||
GIVEN:
|
||||
- PAPERLESS_DB_TIMEOUT is set but db is not sqlite
|
||||
WHEN:
|
||||
- Settings are parsed
|
||||
THEN:
|
||||
- PAPERLESS_DB_TIMEOUT set correctly in non-sqlite db & for fallback sqlite db
|
||||
"""
|
||||
with mock.patch.dict(
|
||||
os.environ,
|
||||
{
|
||||
"PAPERLESS_DBHOST": "127.0.0.1",
|
||||
"PAPERLESS_DB_TIMEOUT": "10",
|
||||
},
|
||||
):
|
||||
databases = _parse_db_settings()
|
||||
|
||||
self.assertDictEqual(
|
||||
databases["default"]["OPTIONS"],
|
||||
databases["default"]["OPTIONS"]
|
||||
| {
|
||||
"connect_timeout": 10.0,
|
||||
},
|
||||
)
|
||||
self.assertDictEqual(
|
||||
{
|
||||
"timeout": 10.0,
|
||||
},
|
||||
databases["sqlite"]["OPTIONS"],
|
||||
)
|
||||
|
||||
|
||||
class TestPaperlessURLSettings(TestCase):
|
||||
def test_paperless_url(self) -> None:
|
||||
"""
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
|
||||
from django.test import override_settings
|
||||
from django.conf import settings
|
||||
|
||||
|
||||
def test_favicon_view(client):
|
||||
@@ -11,14 +11,15 @@ def test_favicon_view(client):
|
||||
favicon_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
favicon_path.write_bytes(b"FAKE ICON DATA")
|
||||
|
||||
with override_settings(STATIC_ROOT=static_dir):
|
||||
response = client.get("/favicon.ico")
|
||||
assert response.status_code == 200
|
||||
assert response["Content-Type"] == "image/x-icon"
|
||||
assert b"".join(response.streaming_content) == b"FAKE ICON DATA"
|
||||
settings.STATIC_ROOT = static_dir
|
||||
|
||||
response = client.get("/favicon.ico")
|
||||
assert response.status_code == 200
|
||||
assert response["Content-Type"] == "image/x-icon"
|
||||
assert b"".join(response.streaming_content) == b"FAKE ICON DATA"
|
||||
|
||||
|
||||
def test_favicon_view_missing_file(client):
|
||||
with override_settings(STATIC_ROOT=Path(tempfile.mkdtemp())):
|
||||
response = client.get("/favicon.ico")
|
||||
assert response.status_code == 404
|
||||
settings.STATIC_ROOT = Path(tempfile.mkdtemp())
|
||||
response = client.get("/favicon.ico")
|
||||
assert response.status_code == 404
|
||||
|
||||
@@ -5,7 +5,6 @@ from pathlib import Path
|
||||
from bleach import clean
|
||||
from bleach import linkify
|
||||
from django.conf import settings
|
||||
from django.utils import timezone
|
||||
from django.utils.timezone import is_naive
|
||||
from django.utils.timezone import make_aware
|
||||
from gotenberg_client import GotenbergClient
|
||||
@@ -333,9 +332,7 @@ class MailDocumentParser(DocumentParser):
|
||||
if data["attachments"]:
|
||||
data["attachments_label"] = "Attachments"
|
||||
|
||||
data["date"] = clean_html(
|
||||
timezone.localtime(mail.date).strftime("%Y-%m-%d %H:%M"),
|
||||
)
|
||||
data["date"] = clean_html(mail.date.astimezone().strftime("%Y-%m-%d %H:%M"))
|
||||
data["content"] = clean_html(mail.text.strip())
|
||||
|
||||
from django.template.loader import render_to_string
|
||||
|
||||
28
src/paperless_mail/templates/package-lock.json
generated
@@ -195,9 +195,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/brace-expansion": {
|
||||
"version": "2.0.2",
|
||||
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.2.tgz",
|
||||
"integrity": "sha512-Jt0vHyM+jmUBqojB7E1NIYadt0vI0Qxjxd2TErW94wDz+E2LAm5vKMXXwg6ZZBTHPuUlDgQHKXvjGBdfcF1ZDQ==",
|
||||
"version": "2.0.1",
|
||||
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.1.tgz",
|
||||
"integrity": "sha512-XnAIvQ8eM+kC6aULx6wuQiwVsnzsi9d3WxzV3FpWTGA19F621kwdbsAcFKXgKUHZWsy+mY6iL1sHTxWEFCytDA==",
|
||||
"dev": true,
|
||||
"dependencies": {
|
||||
"balanced-match": "^1.0.0"
|
||||
@@ -615,12 +615,12 @@
|
||||
}
|
||||
},
|
||||
"node_modules/minimatch": {
|
||||
"version": "9.0.9",
|
||||
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-9.0.9.tgz",
|
||||
"integrity": "sha512-OBwBN9AL4dqmETlpS2zasx+vTeWclWzkblfZk7KTA5j3jeOONz/tRCnZomUyvNg83wL5Zv9Ss6HMJXAgL8R2Yg==",
|
||||
"version": "9.0.4",
|
||||
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-9.0.4.tgz",
|
||||
"integrity": "sha512-KqWh+VchfxcMNRAJjj2tnsSJdNbHsVgnkBhTNrW7AjVo6OvLtxw8zfT9oLw1JSohlFzJ8jCoTgaoXvJ+kHt6fw==",
|
||||
"dev": true,
|
||||
"dependencies": {
|
||||
"brace-expansion": "^2.0.2"
|
||||
"brace-expansion": "^2.0.1"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=16 || 14 >=14.17"
|
||||
@@ -1520,9 +1520,9 @@
|
||||
"dev": true
|
||||
},
|
||||
"brace-expansion": {
|
||||
"version": "2.0.2",
|
||||
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.2.tgz",
|
||||
"integrity": "sha512-Jt0vHyM+jmUBqojB7E1NIYadt0vI0Qxjxd2TErW94wDz+E2LAm5vKMXXwg6ZZBTHPuUlDgQHKXvjGBdfcF1ZDQ==",
|
||||
"version": "2.0.1",
|
||||
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.1.tgz",
|
||||
"integrity": "sha512-XnAIvQ8eM+kC6aULx6wuQiwVsnzsi9d3WxzV3FpWTGA19F621kwdbsAcFKXgKUHZWsy+mY6iL1sHTxWEFCytDA==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"balanced-match": "^1.0.0"
|
||||
@@ -1831,12 +1831,12 @@
|
||||
}
|
||||
},
|
||||
"minimatch": {
|
||||
"version": "9.0.9",
|
||||
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-9.0.9.tgz",
|
||||
"integrity": "sha512-OBwBN9AL4dqmETlpS2zasx+vTeWclWzkblfZk7KTA5j3jeOONz/tRCnZomUyvNg83wL5Zv9Ss6HMJXAgL8R2Yg==",
|
||||
"version": "9.0.4",
|
||||
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-9.0.4.tgz",
|
||||
"integrity": "sha512-KqWh+VchfxcMNRAJjj2tnsSJdNbHsVgnkBhTNrW7AjVo6OvLtxw8zfT9oLw1JSohlFzJ8jCoTgaoXvJ+kHt6fw==",
|
||||
"dev": true,
|
||||
"requires": {
|
||||
"brace-expansion": "^2.0.2"
|
||||
"brace-expansion": "^2.0.1"
|
||||
}
|
||||
},
|
||||
"minipass": {
|
||||
|
||||
@@ -6,7 +6,6 @@ from unittest import mock
|
||||
import httpx
|
||||
import pytest
|
||||
from django.test.html import parse_html
|
||||
from django.utils import timezone
|
||||
from pytest_django.fixtures import SettingsWrapper
|
||||
from pytest_httpx import HTTPXMock
|
||||
from pytest_mock import MockerFixture
|
||||
@@ -635,14 +634,13 @@ class TestParser:
|
||||
THEN:
|
||||
- Resulting HTML is as expected
|
||||
"""
|
||||
with timezone.override("UTC"):
|
||||
mail = mail_parser.parse_file_to_message(html_email_file)
|
||||
html_file = mail_parser.mail_to_html(mail)
|
||||
mail = mail_parser.parse_file_to_message(html_email_file)
|
||||
html_file = mail_parser.mail_to_html(mail)
|
||||
|
||||
expected_html = parse_html(html_email_html_file.read_text())
|
||||
actual_html = parse_html(html_file.read_text())
|
||||
expected_html = parse_html(html_email_html_file.read_text())
|
||||
actual_html = parse_html(html_file.read_text())
|
||||
|
||||
assert expected_html == actual_html
|
||||
assert expected_html == actual_html
|
||||
|
||||
def test_generate_pdf_from_mail(
|
||||
self,
|
||||
|
||||
@@ -1,6 +1,5 @@
|
||||
import shutil
|
||||
import tempfile
|
||||
import unicodedata
|
||||
import uuid
|
||||
from pathlib import Path
|
||||
from unittest import mock
|
||||
@@ -848,18 +847,8 @@ class TestParser(DirectoriesMixin, FileSystemAssertsMixin, TestCase):
|
||||
"application/pdf",
|
||||
)
|
||||
|
||||
# OCR output for RTL text varies across platforms/versions due to
|
||||
# bidi controls and presentation forms; normalize before assertion.
|
||||
normalized_text = "".join(
|
||||
char
|
||||
for char in unicodedata.normalize("NFKC", parser.get_text())
|
||||
if unicodedata.category(char) != "Cf" and not char.isspace()
|
||||
)
|
||||
|
||||
self.assertIn("ةرازو", normalized_text)
|
||||
self.assertTrue(
|
||||
any(token in normalized_text for token in ("ةیلخادلا", "الاخليد")),
|
||||
)
|
||||
# Copied from the PDF to here. Don't even look at it
|
||||
self.assertIn("ةﯾﻠﺧﺎدﻻ ةرازو", parser.get_text())
|
||||
|
||||
@mock.patch("ocrmypdf.ocr")
|
||||
def test_gs_rendering_error(self, m) -> None:
|
||||
|
||||
@@ -18,10 +18,7 @@ nav = [
|
||||
"setup.md",
|
||||
"usage.md",
|
||||
"configuration.md",
|
||||
{ Administration = [
|
||||
"administration.md",
|
||||
{ "v3 Migration Guide" = "migration-v3.md" },
|
||||
] },
|
||||
"administration.md",
|
||||
"advanced_usage.md",
|
||||
"api.md",
|
||||
"development.md",
|
||||
|
||||