Merge remote-tracking branch 'origin/dev'

Trenton H
2024-02-10 11:12:27 -08:00
141 changed files with 7351 additions and 4664 deletions


@@ -517,6 +517,18 @@ existing tables) with:
an older system may fix issues that can arise while setting up Paperless-ngx but
`utf8mb3` can cause issues with consumption (where `utf8mb4` does not).
### Missing timezones
Neither MySQL nor MariaDB loads any timezone information by default (though some
Docker images, such as the official MariaDB image, take care of this for you), which can
cause unexpected behavior with date-based queries.
To fix this, execute one of the following commands:
MySQL: `mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql -p`
MariaDB: `mariadb-tzinfo-to-sql /usr/share/zoneinfo | mariadb -u root mysql -p`
## Barcodes {#barcodes}
Paperless is able to utilize barcodes for automatically performing some tasks.
@@ -628,3 +640,42 @@ single-sided split marker page, the split document(s) will have an empty page at
whatever else was on the backside of the split marker page.) You can work around that by having
a split marker page that has the split barcode on _both_ sides. This way, the extra page will
get automatically removed.
## SSO and third party authentication with Paperless-ngx
Paperless-ngx has a built-in authentication system from Django but you can easily integrate an
external authentication solution using one of the following methods:
### Remote User authentication
This is a simple option that uses remote user authentication made available by certain SSO
applications. See the relevant configuration options for more information:
[PAPERLESS_ENABLE_HTTP_REMOTE_USER](configuration.md#PAPERLESS_ENABLE_HTTP_REMOTE_USER) and
[PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME](configuration.md#PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME)
### OpenID Connect and social authentication
Version 2.5.0 of Paperless-ngx added support for integrating other authentication systems via
the [django-allauth](https://github.com/pennersr/django-allauth) package. Once set up, users
can either log in or (optionally) sign up using any third party systems you integrate. See the
relevant [configuration settings](configuration.md#PAPERLESS_SOCIALACCOUNT_PROVIDERS) and
[django-allauth docs](https://docs.allauth.org/en/latest/socialaccount/configuration.html)
for more information.
As an example, to set up login via GitHub, the following environment variables would need to be
set:
```conf
PAPERLESS_APPS="allauth.socialaccount.providers.github"
PAPERLESS_SOCIALACCOUNT_PROVIDERS='{"github": {"APPS": [{"provider_id": "github","name": "Github","client_id": "<CLIENT_ID>","secret": "<CLIENT_SECRET>"}]}}'
```
Or, to use OpenID Connect ("OIDC"), via Keycloak in this example:
```conf
PAPERLESS_APPS="allauth.socialaccount.providers.openid_connect"
PAPERLESS_SOCIALACCOUNT_PROVIDERS='
{"openid_connect": {"APPS": [{"provider_id": "keycloak","name": "Keycloak","client_id": "paperless","secret": "<CLIENT_SECRET>","settings": { "server_url": "https://<KEYCLOAK_SERVER>/realms/<REALM>/.well-known/openid-configuration"}}]}}'
```
More details about configuration options for various providers can be found in the [allauth documentation](https://docs.allauth.org/en/latest/socialaccount/providers/index.html#provider-specifics).
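Because `PAPERLESS_SOCIALACCOUNT_PROVIDERS` has to be valid JSON, quoting it by hand is easy to get wrong. The following is a minimal sketch, using only the Python standard library, that generates the two variables from the GitHub example above. The helper name `github_provider_env` is hypothetical and not part of Paperless-ngx.

```python
import json


def github_provider_env(client_id: str, secret: str) -> dict[str, str]:
    """Build the two environment variables from the GitHub example."""
    providers = {
        "github": {
            "APPS": [
                {
                    "provider_id": "github",
                    "name": "Github",
                    "client_id": client_id,
                    "secret": secret,
                }
            ]
        }
    }
    return {
        "PAPERLESS_APPS": "allauth.socialaccount.providers.github",
        # Compact separators keep the JSON on a single line for env files.
        "PAPERLESS_SOCIALACCOUNT_PROVIDERS": json.dumps(
            providers, separators=(",", ":")
        ),
    }


if __name__ == "__main__":
    # Placeholders as in the example; substitute real credentials.
    for key, value in github_provider_env("<CLIENT_ID>", "<CLIENT_SECRET>").items():
        print(f"{key}='{value}'")
```

Generating the value with `json.dumps` guarantees balanced braces and correct escaping, which is harder to verify by eye in a long single-line string.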


@@ -139,7 +139,7 @@ document. Paperless only reports PDF metadata at this point.
## Authorization
The REST api provides three different forms of authentication.
The REST api provides four different forms of authentication.
1. Basic authentication
@@ -177,6 +177,12 @@ The REST api provides three different forms of authentication.
Tokens can also be managed in the Django admin.
4. Remote User authentication
If enabled (see
[configuration](configuration.md#PAPERLESS_ENABLE_HTTP_REMOTE_USER_API)),
you can authenticate against the API using Remote User auth.
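As a rough illustration, the sketch below prepares (without sending) an API request that carries the username in a `Remote-User` header. It assumes the default header name, a local instance at `http://localhost:8000` and a user called `alice`, all of which are placeholders. In a real deployment the header should be set by a trusted reverse proxy, never accepted from the internet.

```python
from urllib.request import Request


def remote_user_request(
    base_url: str, username: str, header: str = "Remote-User"
) -> Request:
    """Prepare an API request authenticated via a remote-user header."""
    url = base_url.rstrip("/") + "/api/documents/"
    return Request(url, headers={header: username, "Accept": "application/json"})


# Hypothetical host and user; the request is only built here, not sent.
req = remote_user_request("http://localhost:8000", "alice")
# To send it: urllib.request.urlopen(req), from a trusted network only.
```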
## Searching for documents
Full text searching is available on the `/api/documents/` endpoint. Two
@@ -185,7 +191,7 @@ results:
- `/api/documents/?query=your%20search%20query`: Search for a document
using a full text query. For details on the syntax, see [Basic Usage - Searching](usage.md#basic-usage_searching).
- `/api/documents/?more_like=1234`: Search for documents similar to
- `/api/documents/?more_like_id=1234`: Search for documents similar to
the document with id 1234.
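A small sketch of how these two query URLs could be built with proper percent-encoding, using only the Python standard library. The helper name `search_url` and the base URL are assumptions for illustration.

```python
from urllib.parse import quote, urlencode


def search_url(base_url: str, **params: object) -> str:
    """Build a /api/documents/ search URL from query parameters."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode(params, quote_via=quote)
    return f"{base_url.rstrip('/')}/api/documents/?{query}"


full_text = search_url("http://localhost:8000", query="your search query")
similar = search_url("http://localhost:8000", more_like_id=1234)
```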
Pagination works exactly the same as it does for normal requests on this
@@ -324,6 +330,65 @@ granted). You can pass the parameter `full_perms=true` to API calls to view the
full permissions of objects in a format that mirrors the `set_permissions`
parameter above.
## Bulk Editing
The API supports various bulk-editing operations which are executed asynchronously.
### Documents
For bulk operations on documents, use the endpoint `/api/bulk_edit/`, which accepts
a JSON payload of the format:
```json
{
"documents": [LIST_OF_DOCUMENT_IDS],
"method": METHOD, // see below
"parameters": args // see below
}
```
The following methods are supported:
- `set_correspondent`
- Requires `parameters`: `{ "correspondent": CORRESPONDENT_ID }`
- `set_document_type`
- Requires `parameters`: `{ "document_type": DOCUMENT_TYPE_ID }`
- `set_storage_path`
- Requires `parameters`: `{ "storage_path": STORAGE_PATH_ID }`
- `add_tag`
- Requires `parameters`: `{ "tag": TAG_ID }`
- `remove_tag`
- Requires `parameters`: `{ "tag": TAG_ID }`
- `modify_tags`
- Requires `parameters`: `{ "add_tags": [LIST_OF_TAG_IDS] }` and / or `{ "remove_tags": [LIST_OF_TAG_IDS] }`
- `delete`
- No `parameters` required
- `redo_ocr`
- No `parameters` required
- `set_permissions`
- Requires `parameters`:
- `"permissions": PERMISSIONS_OBJ` (see format [above](#permissions)) and / or
- `"owner": OWNER_ID or null`
- `"merge": true or false` (defaults to false)
- The `merge` flag determines if the supplied permissions will overwrite all existing permissions (including
removing them) or be merged with existing permissions.
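To make the payload format concrete, here is a minimal sketch that serializes a bulk-edit request body. The helper name `bulk_edit_payload` is hypothetical, and sending an empty `parameters` object for methods that need none is an assumption rather than documented behavior.

```python
import json
from typing import Any, Optional


def bulk_edit_payload(
    documents: list[int],
    method: str,
    parameters: Optional[dict[str, Any]] = None,
) -> str:
    """Serialize a request body for /api/bulk_edit/."""
    return json.dumps(
        {
            "documents": documents,
            "method": method,
            # Assumption: an empty object is sent when nothing is required.
            "parameters": parameters or {},
        }
    )


# Add tags 1 and 2 to, and remove tag 3 from, documents 10 and 11.
body = bulk_edit_payload(
    [10, 11], "modify_tags", {"add_tags": [1, 2], "remove_tags": [3]}
)
# Re-run OCR on document 12; no parameters needed.
redo = bulk_edit_payload([12], "redo_ocr")
```

The resulting string would be sent as the body of a `POST` request with a `Content-Type: application/json` header.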
### Objects
Bulk editing for objects (tags, document types, etc.) currently supports the set-permissions and delete
operations via the endpoint `/api/bulk_edit_objects/`, which requires a JSON payload of the format:
```json
{
"objects": [LIST_OF_OBJECT_IDS],
"object_type": "tags", "correspondents", "document_types" or "storage_paths",
"operation": "set_permissions" or "delete",
"owner": OWNER_ID, // optional
"permissions": { "view": { "users": [] ... }, "change": { ... } }, // (see 'set_permissions' format above)
"merge": true / false // defaults to false, see above
}
```
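Since the template above is not literal JSON, a sketch of a concrete payload may help. The helper `bulk_edit_objects_payload` is hypothetical; it checks `object_type` and `operation` against the values listed above, and the `groups` key in the permissions object is an assumption about the `set_permissions` format.

```python
import json
from typing import Any, Optional

OBJECT_TYPES = {"tags", "correspondents", "document_types", "storage_paths"}
OPERATIONS = {"set_permissions", "delete"}


def bulk_edit_objects_payload(
    objects: list[int],
    object_type: str,
    operation: str,
    owner: Optional[int] = None,
    permissions: Optional[dict[str, Any]] = None,
    merge: bool = False,
) -> str:
    """Serialize a request body for /api/bulk_edit_objects/."""
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"unsupported object_type: {object_type!r}")
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    payload: dict[str, Any] = {
        "objects": objects,
        "object_type": object_type,
        "operation": operation,
    }
    if operation == "set_permissions":
        # owner and permissions are optional; merge defaults to false.
        if owner is not None:
            payload["owner"] = owner
        if permissions is not None:
            payload["permissions"] = permissions
        payload["merge"] = merge
    return json.dumps(payload)


# Give user 3 view access to tags 1 and 2, merging with existing permissions.
perms = {"view": {"users": [3], "groups": []}, "change": {"users": [], "groups": []}}
body = bulk_edit_objects_payload(
    [1, 2], "tags", "set_permissions", permissions=perms, merge=True
)
```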
## API Versioning
The REST API has been versioned since Paperless-ngx 1.3.0.
@@ -380,3 +445,13 @@ Initial API version.
color to use for a specific tag, which is either black or white
depending on the brightness of `Tag.color`.
- Removed field `Tag.colour`.
#### Version 3
- Permissions endpoints have been added.
- The format of the `/api/ui_settings/` endpoint has changed.
#### Version 4
- Consumption templates were refactored into workflows and the API endpoints
changed accordingly.


@@ -34,6 +34,8 @@ matcher.
`redis://<username>:<password>@<host>:<port>`
- With the requirepass option PAPERLESS_REDIS =
`redis://:<password>@<host>:<port>`
- To include the redis database index PAPERLESS_REDIS =
`redis://<username>:<password>@<host>:<port>/<DBIndex>`
[More information on securing your Redis
Instance](https://redis.io/docs/getting-started/#securing-redis).
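The URL forms above can be assembled programmatically. The following is a minimal sketch with a hypothetical helper `redis_url`; percent-encoding the credentials is an assumption about how the URL is later parsed, and is only relevant when they contain characters such as `@` or `/`.

```python
from typing import Optional
from urllib.parse import quote


def redis_url(
    host: str,
    port: int = 6379,
    username: str = "",
    password: str = "",
    db_index: Optional[int] = None,
) -> str:
    """Build a value for PAPERLESS_REDIS from its parts."""
    auth = ""
    if username or password:
        # An empty username yields the ':<password>@' form used with requirepass.
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    url = f"redis://{auth}{host}:{port}"
    if db_index is not None:
        url += f"/{db_index}"
    return url


# Hypothetical host and credentials for illustration only.
print(redis_url("broker", password="s3cret"))
print(redis_url("broker", 6379, "paperless", "p@ss", db_index=2))
```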
@@ -463,9 +465,21 @@ applications.
Defaults to "false" which disables this feature.
#### [`PAPERLESS_ENABLE_HTTP_REMOTE_USER_API=<bool>`](#PAPERLESS_ENABLE_HTTP_REMOTE_USER_API) {#PAPERLESS_ENABLE_HTTP_REMOTE_USER_API}
: Allows authentication via HTTP_REMOTE_USER directly against the API.
!!! warning
See the warning above about securing your installation when using remote user header authentication. This setting is separate from
`PAPERLESS_ENABLE_HTTP_REMOTE_USER` to avoid introducing a security vulnerability to existing reverse proxy setups. As above,
ensure that your reverse proxy does not simply pass the `Remote-User` header from the internet to paperless.
Defaults to "false" which disables this feature.
#### [`PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>`](#PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME) {#PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME}
: If "PAPERLESS_ENABLE_HTTP_REMOTE_USER" is enabled, this
: If `PAPERLESS_ENABLE_HTTP_REMOTE_USER` or `PAPERLESS_ENABLE_HTTP_REMOTE_USER_API` is enabled, this
property lets you customize the name of the HTTP header from which
the authenticated username is extracted. Values are in terms of
[HttpRequest.META](https://docs.djangoproject.com/en/4.1/ref/request-response/#django.http.HttpRequest.META).
@@ -522,6 +536,42 @@ This is for use with self-signed certificates against local IMAP servers.
Setting this value has implications for the security of your email.
Understand what it does and be sure you need to before setting.
#### [`PAPERLESS_SOCIALACCOUNT_PROVIDERS=<json>`](#PAPERLESS_SOCIALACCOUNT_PROVIDERS) {#PAPERLESS_SOCIALACCOUNT_PROVIDERS}
: This variable is used to set up login and signup via social account providers which are compatible with django-allauth.
See the corresponding [django-allauth documentation](https://docs.allauth.org/en/0.60.0/socialaccount/providers/index.html)
for a list of provider configurations. You will also likely need to include the relevant Django 'application' inside the
[PAPERLESS_APPS](#PAPERLESS_APPS) setting.
Defaults to None, which does not enable any third party authentication systems.
#### [`PAPERLESS_SOCIAL_AUTO_SIGNUP=<bool>`](#PAPERLESS_SOCIAL_AUTO_SIGNUP) {#PAPERLESS_SOCIAL_AUTO_SIGNUP}
: Attempt to sign up the user using the email, username, etc. retrieved from the third-party authentication
system. See the corresponding
[django-allauth documentation](https://docs.allauth.org/en/0.60.0/socialaccount/configuration.html) for more information.
Defaults to False
#### [`PAPERLESS_SOCIALACCOUNT_ALLOW_SIGNUPS=<bool>`](#PAPERLESS_SOCIALACCOUNT_ALLOW_SIGNUPS) {#PAPERLESS_SOCIALACCOUNT_ALLOW_SIGNUPS}
: Allow users to sign up for a new Paperless-ngx account using any configured third-party authentication system.
Defaults to True
#### [`PAPERLESS_ACCOUNT_ALLOW_SIGNUPS=<bool>`](#PAPERLESS_ACCOUNT_ALLOW_SIGNUPS) {#PAPERLESS_ACCOUNT_ALLOW_SIGNUPS}
: Allow users to sign up for a new Paperless-ngx account.
Defaults to False
#### [`PAPERLESS_ACCOUNT_DEFAULT_HTTP_PROTOCOL=<string>`](#PAPERLESS_ACCOUNT_DEFAULT_HTTP_PROTOCOL) {#PAPERLESS_ACCOUNT_DEFAULT_HTTP_PROTOCOL}
: The protocol used when generating URLs, e.g. login callback URLs. See the corresponding
[django-allauth documentation](https://docs.allauth.org/en/latest/account/configuration.html) for more information.
Defaults to 'https'
## OCR settings {#ocr}
Paperless uses [OCRmyPDF](https://ocrmypdf.readthedocs.io/en/latest/)
@@ -892,6 +942,14 @@ documents.
Default is none, which disables the temporary directory.
#### [`PAPERLESS_APPS=<string>`](#PAPERLESS_APPS) {#PAPERLESS_APPS}
: A comma-separated list of Django apps to be included in Django's
[`INSTALLED_APPS`](https://docs.djangoproject.com/en/5.0/ref/applications/). This setting should
be used with caution!
Defaults to None, which does not add any additional apps.
## Document Consumption {#consume_config}
#### [`PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool>`](#PAPERLESS_CONSUMER_DELETE_DUPLICATES) {#PAPERLESS_CONSUMER_DELETE_DUPLICATES}
@@ -1162,6 +1220,55 @@ combination with PAPERLESS_CONSUMER_BARCODE_UPSCALE bigger than 1.0.
Defaults to "300"
#### [`PAPERLESS_CONSUMER_ENABLE_TAG_BARCODE=<bool>`](#PAPERLESS_CONSUMER_ENABLE_TAG_BARCODE) {#PAPERLESS_CONSUMER_ENABLE_TAG_BARCODE}
: Enables the detection of barcodes in the scanned document and
assigns or creates tags if a properly formatted barcode is detected.
The barcode must match one of the (configurable) regular expressions.
If the barcode text contains ',' (comma), it is split into multiple
barcodes which are individually processed for tagging.
Matching is case insensitive.
Defaults to false.
#### [`PAPERLESS_CONSUMER_TAG_BARCODE_MAPPING=<json dict>`](#PAPERLESS_CONSUMER_TAG_BARCODE_MAPPING) {#PAPERLESS_CONSUMER_TAG_BARCODE_MAPPING}
: Defines a dictionary of filter regex and substitute expressions.
Syntax: {"<regex>": "<substitute>" [,...]}
A barcode is considered for tagging if the barcode text matches
at least one of the provided <regex> patterns.
If a match is found, the <substitute> rule is applied. This allows very
versatile reformatting and mapping of barcode patterns to tag values.
If a tag is not found it will be created.
Defaults to:
{"TAG:(.*)": "\\g<1>"} which defines
- a regex TAG:(.*) which includes barcodes beginning with TAG:
followed by any text that gets stored into match group #1 and
- a substitute \\g<1> that replaces the original barcode text
by the content in match group #1.
Consequently, the tag is the barcode text without its TAG: prefix.
More examples:
{"ASN12.*": "JOHN", "ASN13.*": "SMITH"} for example maps
- ASN12nnnn barcodes to the tag JOHN and
- ASN13nnnn barcodes to the tag SMITH.
{"T-J": "JOHN", "T-S": "SMITH", "T-D": "DOE"} directly maps
- T-J barcodes to the tag JOHN,
- T-S barcodes to the tag SMITH and
- T-D barcodes to the tag DOE.
Please refer to the Python regex documentation for more information.
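The mapping rules described above can be tried out with Python's `re` module. This is a sketch of the described behavior under stated assumptions, not the actual Paperless-ngx implementation: the function name `tags_from_barcode` is hypothetical, the first matching pattern is assumed to win, and surrounding whitespace is stripped for convenience.

```python
import json
import re

# The default mapping as JSON text; "\\g<1>" decodes to the regex
# replacement \g<1>, i.e. the content of match group 1.
DEFAULT_MAPPING = json.loads(r'{"TAG:(.*)": "\\g<1>"}')


def tags_from_barcode(text: str, mapping: dict[str, str]) -> list[str]:
    """Derive tag names from one barcode text using a regex mapping."""
    tags = []
    # A comma splits the barcode text into individually processed barcodes.
    for part in text.split(","):
        part = part.strip()  # assumption: whitespace is not significant
        for pattern, substitute in mapping.items():
            # Matching is case insensitive; assumption: first match wins.
            if re.match(pattern, part, flags=re.IGNORECASE):
                tags.append(
                    re.sub(pattern, substitute, part, flags=re.IGNORECASE)
                )
                break
    return tags


print(tags_from_barcode("TAG:invoice,tag:2024", DEFAULT_MAPPING))
print(tags_from_barcode("ASN12345", {"ASN12.*": "JOHN", "ASN13.*": "SMITH"}))
```

Testing a mapping this way before deploying it helps catch patterns that match more barcodes than intended, since `re.match` only anchors at the start of the text.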
## Audit Trail
#### [`PAPERLESS_AUDIT_LOG_ENABLED=<bool>`](#PAPERLESS_AUDIT_LOG_ENABLED) {#PAPERLESS_AUDIT_LOG_ENABLED}
@@ -1332,6 +1439,12 @@ started by the container.
You can read more about this in the [advanced documentation](advanced_usage.md#celery-monitoring).
#### [`PAPERLESS_SUPERVISORD_WORKING_DIR=<defined>`](#PAPERLESS_SUPERVISORD_WORKING_DIR) {#PAPERLESS_SUPERVISORD_WORKING_DIR}
: If this environment variable is defined, the `supervisord.log` and `supervisord.pid` files will be created under the path specified in `PAPERLESS_SUPERVISORD_WORKING_DIR`. Setting `PAPERLESS_SUPERVISORD_WORKING_DIR=/tmp` and `PYTHONPYCACHEPREFIX=/tmp/pycache` would allow paperless to work on a read-only filesystem.
Please take note that the `PAPERLESS_DATA_DIR` and `PAPERLESS_MEDIA_ROOT` paths still have to be writable, just like the `PAPERLESS_SUPERVISORD_WORKING_DIR`. This can be achieved by using bind or volume mounts. This only works if the container is run as user *paperless*.
## Frontend Settings
#### [`PAPERLESS_APP_TITLE=<bool>`](#PAPERLESS_APP_TITLE) {#PAPERLESS_APP_TITLE}