Mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-04-02 13:45:10 -05:00)
Documentation: Fix list indentation (#8050)
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
Parent: 149d770ad1
Commit: 605aa50b00
@ -7,9 +7,9 @@
|
|||||||
"trailingComma": "es5",
|
"trailingComma": "es5",
|
||||||
"overrides": [
|
"overrides": [
|
||||||
{
|
{
|
||||||
"files": ["index.md", "administration.md"],
|
"files": ["docs/*.md"],
|
||||||
"options": {
|
"options": {
|
||||||
"tabWidth": 4
|
"tabWidth": 4,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
|
@ -25,20 +25,20 @@ documents.
|
|||||||
|
|
||||||
The following algorithms are available:
|
The following algorithms are available:
|
||||||
|
|
||||||
- **None:** No matching will be performed.
|
- **None:** No matching will be performed.
|
||||||
- **Any:** Looks for any occurrence of any word provided in match in
|
- **Any:** Looks for any occurrence of any word provided in match in
|
||||||
the PDF. If you define the match as `Bank1 Bank2`, it will match
|
the PDF. If you define the match as `Bank1 Bank2`, it will match
|
||||||
documents containing either of these terms.
|
documents containing either of these terms.
|
||||||
- **All:** Requires that every word provided appears in the PDF,
|
- **All:** Requires that every word provided appears in the PDF,
|
||||||
albeit not in the order provided.
|
albeit not in the order provided.
|
||||||
- **Exact:** Matches only if the match appears exactly as provided
|
- **Exact:** Matches only if the match appears exactly as provided
|
||||||
(i.e. preserve ordering) in the PDF.
|
(i.e. preserve ordering) in the PDF.
|
||||||
- **Regular expression:** Parses the match as a regular expression and
|
- **Regular expression:** Parses the match as a regular expression and
|
||||||
tries to find a match within the document.
|
tries to find a match within the document.
|
||||||
- **Fuzzy match:** Uses a partial matching based on locating the tag text
|
- **Fuzzy match:** Uses a partial matching based on locating the tag text
|
||||||
inside the document, using a [partial ratio](https://rapidfuzz.github.io/RapidFuzz/Usage/fuzz.html#partial-ratio)
|
inside the document, using a [partial ratio](https://rapidfuzz.github.io/RapidFuzz/Usage/fuzz.html#partial-ratio)
|
||||||
- **Auto:** Tries to automatically match new documents. This does not
|
- **Auto:** Tries to automatically match new documents. This does not
|
||||||
require you to set a match. See the [notes below](#automatic-matching).
|
require you to set a match. See the [notes below](#automatic-matching).
|
||||||
|
|
||||||
When using the _any_ or _all_ matching algorithms, you can search for
|
When using the _any_ or _all_ matching algorithms, you can search for
|
||||||
terms that consist of multiple words by enclosing them in double quotes.
|
terms that consist of multiple words by enclosing them in double quotes.
|
||||||
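The difference between the _any_, _all_ and _exact_ algorithms can be illustrated with a small sketch. This is not Paperless's own matching code: the function names are hypothetical, and both the case-insensitive comparison and the whole-word matching are assumptions made for the example. It does follow the rule above that double quotes group several words into one term.

```python
import re

# Either a double-quoted phrase or a single whitespace-delimited word.
TERM = re.compile(r'"([^"]+)"|(\S+)')


def split_terms(match: str) -> list[str]:
    """Split a match string into lower-cased terms, keeping quoted phrases whole."""
    return [(quoted or word).lower() for quoted, word in TERM.findall(match)]


def _contains(term: str, text: str) -> bool:
    # Whole-word search; an assumption made for this sketch.
    return re.search(rf"\b{re.escape(term)}\b", text) is not None


def matches_any(match: str, content: str) -> bool:
    text = content.lower()
    return any(_contains(term, text) for term in split_terms(match))


def matches_all(match: str, content: str) -> bool:
    text = content.lower()
    return all(_contains(term, text) for term in split_terms(match))


def matches_exact(match: str, content: str) -> bool:
    # The whole match string must appear as written, in the same order.
    return _contains(match.lower(), content.lower())
```

With the match `Bank1 Bank2`, a document containing only `Bank2` satisfies `matches_any` but not `matches_all`, and a document containing `Bank2 Bank1` satisfies `matches_all` but not `matches_exact`.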
@ -69,33 +69,33 @@ Paperless tries to hide much of the involved complexity with this
|
|||||||
approach. However, there are a couple of caveats you need to keep in mind
|
approach. However, there are a couple of caveats you need to keep in mind
|
||||||
when using this feature:
|
when using this feature:
|
||||||
|
|
||||||
- Changes to your documents are not immediately reflected by the
|
- Changes to your documents are not immediately reflected by the
|
||||||
matching algorithm. The neural network needs to be _trained_ on your
|
matching algorithm. The neural network needs to be _trained_ on your
|
||||||
documents after changes. Paperless periodically (default: once each
|
documents after changes. Paperless periodically (default: once each
|
||||||
hour) checks for changes and does this automatically for you.
|
hour) checks for changes and does this automatically for you.
|
||||||
- The Auto matching algorithm only takes documents into account which
|
- The Auto matching algorithm only takes documents into account which
|
||||||
are NOT placed in your inbox (i.e. do not have any inbox tags assigned to
|
are NOT placed in your inbox (i.e. do not have any inbox tags assigned to
|
||||||
them). This ensures that the neural network only learns from
|
them). This ensures that the neural network only learns from
|
||||||
documents which you have correctly tagged before.
|
documents which you have correctly tagged before.
|
||||||
- The matching algorithm can only work if there is a correlation
|
- The matching algorithm can only work if there is a correlation
|
||||||
between the tag, correspondent, document type, or storage path and
|
between the tag, correspondent, document type, or storage path and
|
||||||
the document itself. Your bank statements usually contain your bank
|
the document itself. Your bank statements usually contain your bank
|
||||||
account number and the name of the bank, so this works reasonably
|
account number and the name of the bank, so this works reasonably
|
||||||
well. However, tags such as "TODO" cannot be automatically
|
well. However, tags such as "TODO" cannot be automatically
|
||||||
assigned.
|
assigned.
|
||||||
- The matching algorithm needs a reasonable number of documents to
|
- The matching algorithm needs a reasonable number of documents to
|
||||||
identify when to assign tags, correspondents, storage paths, and
|
identify when to assign tags, correspondents, storage paths, and
|
||||||
types. If one out of a thousand documents has the correspondent
|
types. If one out of a thousand documents has the correspondent
|
||||||
"Very obscure web shop I bought something five years ago", it will
|
"Very obscure web shop I bought something five years ago", it will
|
||||||
probably not assign this correspondent automatically if you buy
|
probably not assign this correspondent automatically if you buy
|
||||||
something from them again. The more documents, the better.
|
something from them again. The more documents, the better.
|
||||||
- Paperless also needs a reasonable number of negative examples to
|
- Paperless also needs a reasonable number of negative examples to
|
||||||
decide when not to assign a certain tag, correspondent, document
|
decide when not to assign a certain tag, correspondent, document
|
||||||
type, or storage path. This will usually be the case as you start
|
type, or storage path. This will usually be the case as you start
|
||||||
filling up paperless with documents. Example: If all your documents
|
filling up paperless with documents. Example: If all your documents
|
||||||
are either from "Webshop" or "Bank", paperless will assign one
|
are either from "Webshop" or "Bank", paperless will assign one
|
||||||
of these correspondents to ANY new document, if both are set to
|
of these correspondents to ANY new document, if both are set to
|
||||||
automatic matching.
|
automatic matching.
|
||||||
|
|
||||||
## Hooking into the consumption process {#consume-hooks}
|
## Hooking into the consumption process {#consume-hooks}
|
||||||
|
|
||||||
@ -242,12 +242,12 @@ webserver:
|
|||||||
|
|
||||||
Troubleshooting:
|
Troubleshooting:
|
||||||
|
|
||||||
- Monitor the Docker Compose log
|
- Monitor the Docker Compose log
|
||||||
`cd ~/paperless-ngx; docker compose logs -f`
|
`cd ~/paperless-ngx; docker compose logs -f`
|
||||||
- Check your script's permissions, e.g. in case of a permission error
|
- Check your script's permissions, e.g. in case of a permission error
|
||||||
`sudo chmod 755 post-consumption-example.sh`
|
`sudo chmod 755 post-consumption-example.sh`
|
||||||
- Pipe your script's output to a log file e.g.
|
- Pipe your script's output to a log file e.g.
|
||||||
`echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log`
|
`echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log`
|
||||||
|
|
||||||
## File name handling {#file-name-handling}
|
## File name handling {#file-name-handling}
|
||||||
|
|
||||||
@ -302,35 +302,35 @@ will create a directory structure as follows:
|
|||||||
|
|
||||||
Paperless provides the following variables for use within filenames:
|
Paperless provides the following variables for use within filenames:
|
||||||
|
|
||||||
- `{{ asn }}`: The archive serial number of the document, or "none".
|
- `{{ asn }}`: The archive serial number of the document, or "none".
|
||||||
- `{{ correspondent }}`: The name of the correspondent, or "none".
|
- `{{ correspondent }}`: The name of the correspondent, or "none".
|
||||||
- `{{ document_type }}`: The name of the document type, or "none".
|
- `{{ document_type }}`: The name of the document type, or "none".
|
||||||
- `{{ tag_list }}`: A comma-separated list of all tags assigned to the
|
- `{{ tag_list }}`: A comma-separated list of all tags assigned to the
|
||||||
document.
|
document.
|
||||||
- `{{ title }}`: The title of the document.
|
- `{{ title }}`: The title of the document.
|
||||||
- `{{ created }}`: The full date (ISO format) the document was created.
|
- `{{ created }}`: The full date (ISO format) the document was created.
|
||||||
- `{{ created_year }}`: Year created only, formatted as the year with
|
- `{{ created_year }}`: Year created only, formatted as the year with
|
||||||
century.
|
century.
|
||||||
- `{{ created_year_short }}`: Year created only, formatted as the year
|
- `{{ created_year_short }}`: Year created only, formatted as the year
|
||||||
without century, zero padded.
|
without century, zero padded.
|
||||||
- `{{ created_month }}`: Month created only (number 01-12).
|
- `{{ created_month }}`: Month created only (number 01-12).
|
||||||
- `{{ created_month_name }}`: Month created name, as per locale
|
- `{{ created_month_name }}`: Month created name, as per locale
|
||||||
- `{{ created_month_name_short }}`: Month created abbreviated name, as per
|
- `{{ created_month_name_short }}`: Month created abbreviated name, as per
|
||||||
locale
|
locale
|
||||||
- `{{ created_day }}`: Day created only (number 01-31).
|
- `{{ created_day }}`: Day created only (number 01-31).
|
||||||
- `{{ added }}`: The full date (ISO format) the document was added to
|
- `{{ added }}`: The full date (ISO format) the document was added to
|
||||||
paperless.
|
paperless.
|
||||||
- `{{ added_year }}`: Year added only.
|
- `{{ added_year }}`: Year added only.
|
||||||
- `{{ added_year_short }}`: Year added only, formatted as the year without
|
- `{{ added_year_short }}`: Year added only, formatted as the year without
|
||||||
century, zero padded.
|
century, zero padded.
|
||||||
- `{{ added_month }}`: Month added only (number 01-12).
|
- `{{ added_month }}`: Month added only (number 01-12).
|
||||||
- `{{ added_month_name }}`: Month added name, as per locale
|
- `{{ added_month_name }}`: Month added name, as per locale
|
||||||
- `{{ added_month_name_short }}`: Month added abbreviated name, as per
|
- `{{ added_month_name_short }}`: Month added abbreviated name, as per
|
||||||
locale
|
locale
|
||||||
- `{{ added_day }}`: Day added only (number 01-31).
|
- `{{ added_day }}`: Day added only (number 01-31).
|
||||||
- `{{ owner_username }}`: Username of document owner, if any, or "none"
|
- `{{ owner_username }}`: Username of document owner, if any, or "none"
|
||||||
- `{{ original_name }}`: Document original filename, minus the extension, if any, or "none"
|
- `{{ original_name }}`: Document original filename, minus the extension, if any, or "none"
|
||||||
- `{{ doc_pk }}`: The paperless identifier (primary key) for the document.
|
- `{{ doc_pk }}`: The paperless identifier (primary key) for the document.
|
||||||
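To make the placeholder list above more concrete, the following sketch fills a format string from a plain dictionary. It is only an illustration: Paperless renders these templates itself, `render_filename` is a hypothetical helper, and substituting "none" for every missing value is a simplification (the list above documents that fallback only for some variables).

```python
import re
from datetime import date

# Matches "{{ name }}" with optional spaces around the variable name.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_filename(template: str, values: dict[str, str]) -> str:
    """Replace each placeholder with its value, or "none" when it is missing or empty."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1)) or "none", template)


created = date(2024, 3, 7)  # example creation date
values = {
    "created_year": f"{created:%Y}",        # year with century
    "created_year_short": f"{created:%y}",  # year without century, zero padded
    "created_month": f"{created:%m}",       # 01-12
    "created_day": f"{created:%d}",         # 01-31
    "correspondent": "Bank",
    "title": "Statement",
}

example = render_filename("{{ created_year }}/{{ correspondent }}/{{ title }}", values)
```

Here `example` is `2024/Bank/Statement`; a template that uses `{{ owner_username }}` with no owner set would produce `none` in that position.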
|
|
||||||
!!! warning
|
!!! warning
|
||||||
|
|
||||||
@ -381,10 +381,10 @@ before empty placeholders are removed as well, empty directories are omitted.
|
|||||||
When a single storage layout is not sufficient for your use case, storage paths allow for more complex
|
When a single storage layout is not sufficient for your use case, storage paths allow for more complex
|
||||||
structure to set precisely where each document is stored in the file system.
|
structure to set precisely where each document is stored in the file system.
|
||||||
|
|
||||||
- Each storage path is a [`PAPERLESS_FILENAME_FORMAT`](configuration.md#PAPERLESS_FILENAME_FORMAT) and
|
- Each storage path is a [`PAPERLESS_FILENAME_FORMAT`](configuration.md#PAPERLESS_FILENAME_FORMAT) and
|
||||||
follows the rules described above
|
follows the rules described above
|
||||||
- Each document is assigned a storage path using the matching algorithms described above, but can be
|
- Each document is assigned a storage path using the matching algorithms described above, but can be
|
||||||
overwritten at any time
|
overwritten at any time
|
||||||
|
|
||||||
For example, you could define the following two storage paths:
|
For example, you could define the following two storage paths:
|
||||||
|
|
||||||
@ -435,8 +435,8 @@ with more complex logic.
|
|||||||
|
|
||||||
#### Additional Variables
|
#### Additional Variables
|
||||||
|
|
||||||
- `{{ tag_name_list }}`: A list of tag names applied to the document, ordered by the tag name. Note this is a list, not a single string
|
- `{{ tag_name_list }}`: A list of tag names applied to the document, ordered by the tag name. Note this is a list, not a single string
|
||||||
- `{{ custom_fields }}`: A mapping of custom field names to their type and value. A user can access the mapping by field name or check if a field is applied by checking its existence in the variable.
|
- `{{ custom_fields }}`: A mapping of custom field names to their type and value. A user can access the mapping by field name or check if a field is applied by checking its existence in the variable.
|
||||||
|
|
||||||
!!! tip
|
!!! tip
|
||||||
|
|
||||||
@ -532,15 +532,15 @@ installation, you can use volumes to accomplish this:
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
services:
|
services:
|
||||||
# ...
|
|
||||||
webserver:
|
|
||||||
environment:
|
|
||||||
- PAPERLESS_ENABLE_FLOWER
|
|
||||||
ports:
|
|
||||||
- 5555:5555 # (2)!
|
|
||||||
# ...
|
# ...
|
||||||
volumes:
|
webserver:
|
||||||
- /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro # (1)!
|
environment:
|
||||||
|
- PAPERLESS_ENABLE_FLOWER
|
||||||
|
ports:
|
||||||
|
- 5555:5555 # (2)!
|
||||||
|
# ...
|
||||||
|
volumes:
|
||||||
|
- /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro # (1)!
|
||||||
```
|
```
|
||||||
|
|
||||||
1. Note the `:ro` tag means the file will be mounted as read only.
|
1. Note the `:ro` tag means the file will be mounted as read only.
|
||||||
@ -571,11 +571,11 @@ For example, using Docker Compose:
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
services:
|
services:
|
||||||
# ...
|
|
||||||
webserver:
|
|
||||||
# ...
|
# ...
|
||||||
volumes:
|
webserver:
|
||||||
- /path/to/my/scripts:/custom-cont-init.d:ro # (1)!
|
# ...
|
||||||
|
volumes:
|
||||||
|
- /path/to/my/scripts:/custom-cont-init.d:ro # (1)!
|
||||||
```
|
```
|
||||||
|
|
||||||
1. Note the `:ro` tag means the folder will be mounted as read only. This is for extra security against changes
|
1. Note the `:ro` tag means the folder will be mounted as read only. This is for extra security against changes
|
||||||
@ -623,16 +623,16 @@ Paperless is able to utilize barcodes for automatically performing some tasks.
|
|||||||
|
|
||||||
At this time, the library utilized for detection of barcodes supports the following types:
|
At this time, the library utilized for detection of barcodes supports the following types:
|
||||||
|
|
||||||
- EAN-13/UPC-A
|
- EAN-13/UPC-A
|
||||||
- UPC-E
|
- UPC-E
|
||||||
- EAN-8
|
- EAN-8
|
||||||
- Code 128
|
- Code 128
|
||||||
- Code 93
|
- Code 93
|
||||||
- Code 39
|
- Code 39
|
||||||
- Codabar
|
- Codabar
|
||||||
- Interleaved 2 of 5
|
- Interleaved 2 of 5
|
||||||
- QR Code
|
- QR Code
|
||||||
- SQ Code
|
- SQ Code
|
||||||
|
|
||||||
You may check for updates on the [zbar library homepage](https://github.com/mchehab/zbar).
|
You may check for updates on the [zbar library homepage](https://github.com/mchehab/zbar).
|
||||||
For usage in Paperless, the type of barcode does not matter, only the contents of it.
|
For usage in Paperless, the type of barcode does not matter, only the contents of it.
|
||||||
@ -819,9 +819,9 @@ If using docker, you'll need to add the following volume mounts to your `docker-
|
|||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
webserver:
|
webserver:
|
||||||
volumes:
|
volumes:
|
||||||
- /home/user/.gnupg/pubring.gpg:/usr/src/paperless/.gnupg/pubring.gpg
|
- /home/user/.gnupg/pubring.gpg:/usr/src/paperless/.gnupg/pubring.gpg
|
||||||
- <path to gpg-agent.extra socket>:/usr/src/paperless/.gnupg/S.gpg-agent
|
- <path to gpg-agent.extra socket>:/usr/src/paperless/.gnupg/S.gpg-agent
|
||||||
```
|
```
|
||||||
|
|
||||||
For a 'bare-metal' installation no further configuration is necessary. If you
|
For a 'bare-metal' installation no further configuration is necessary. If you
|
||||||
@ -829,9 +829,9 @@ want to use a separate `GNUPG_HOME`, you can do so by configuring the [PAPERLESS
|
|||||||
|
|
||||||
### Troubleshooting
|
### Troubleshooting
|
||||||
|
|
||||||
- Make sure that `gpg-agent` is running on your host machine
|
- Make sure that `gpg-agent` is running on your host machine
|
||||||
- Make sure that encryption and decryption work from inside the container using the `gpg` commands from above.
|
- Make sure that encryption and decryption work from inside the container using the `gpg` commands from above.
|
||||||
- Check that all files in `/usr/src/paperless/.gnupg` have correct permissions
|
- Check that all files in `/usr/src/paperless/.gnupg` have correct permissions
|
||||||
|
|
||||||
```shell
|
```shell
|
||||||
paperless@9da1865df327:~/.gnupg$ ls -al
|
paperless@9da1865df327:~/.gnupg$ ls -al
|
||||||
|
332
docs/api.md
@ -8,23 +8,23 @@ most of the available filters and ordering fields.
|
|||||||
|
|
||||||
The API provides the following main endpoints:
|
The API provides the following main endpoints:
|
||||||
|
|
||||||
- `/api/correspondents/`: Full CRUD support.
|
- `/api/correspondents/`: Full CRUD support.
|
||||||
- `/api/custom_fields/`: Full CRUD support.
|
- `/api/custom_fields/`: Full CRUD support.
|
||||||
- `/api/documents/`: Full CRUD support, except POSTing new documents.
|
- `/api/documents/`: Full CRUD support, except POSTing new documents.
|
||||||
See [below](#file-uploads).
|
See [below](#file-uploads).
|
||||||
- `/api/document_types/`: Full CRUD support.
|
- `/api/document_types/`: Full CRUD support.
|
||||||
- `/api/groups/`: Full CRUD support.
|
- `/api/groups/`: Full CRUD support.
|
||||||
- `/api/logs/`: Read-only.
|
- `/api/logs/`: Read-only.
|
||||||
- `/api/mail_accounts/`: Full CRUD support.
|
- `/api/mail_accounts/`: Full CRUD support.
|
||||||
- `/api/mail_rules/`: Full CRUD support.
|
- `/api/mail_rules/`: Full CRUD support.
|
||||||
- `/api/profile/`: GET, PATCH
|
- `/api/profile/`: GET, PATCH
|
||||||
- `/api/share_links/`: Full CRUD support.
|
- `/api/share_links/`: Full CRUD support.
|
||||||
- `/api/storage_paths/`: Full CRUD support.
|
- `/api/storage_paths/`: Full CRUD support.
|
||||||
- `/api/tags/`: Full CRUD support.
|
- `/api/tags/`: Full CRUD support.
|
||||||
- `/api/tasks/`: Read-only.
|
- `/api/tasks/`: Read-only.
|
||||||
- `/api/users/`: Full CRUD support.
|
- `/api/users/`: Full CRUD support.
|
||||||
- `/api/workflows/`: Full CRUD support.
|
- `/api/workflows/`: Full CRUD support.
|
||||||
- `/api/search/`: GET, see [below](#global-search).
|
- `/api/search/`: GET, see [below](#global-search).
|
||||||
|
|
||||||
All of these endpoints except for the logging endpoint allow you to
|
All of these endpoints except for the logging endpoint allow you to
|
||||||
fetch (and edit and delete where appropriate) individual objects by
|
fetch (and edit and delete where appropriate) individual objects by
|
||||||
@ -33,32 +33,32 @@ appending their primary key to the path, e.g. `/api/documents/454/`.
|
|||||||
The objects served by the document endpoint contain the following
|
The objects served by the document endpoint contain the following
|
||||||
fields:
|
fields:
|
||||||
|
|
||||||
- `id`: ID of the document. Read-only.
|
- `id`: ID of the document. Read-only.
|
||||||
- `title`: Title of the document.
|
- `title`: Title of the document.
|
||||||
- `content`: Plain text content of the document.
|
- `content`: Plain text content of the document.
|
||||||
- `tags`: List of IDs of tags assigned to this document, or empty
|
- `tags`: List of IDs of tags assigned to this document, or empty
|
||||||
list.
|
list.
|
||||||
- `document_type`: Document type of this document, or null.
|
- `document_type`: Document type of this document, or null.
|
||||||
- `correspondent`: Correspondent of this document or null.
|
- `correspondent`: Correspondent of this document or null.
|
||||||
- `created`: The date time at which this document was created.
|
- `created`: The date time at which this document was created.
|
||||||
- `created_date`: The date (YYYY-MM-DD) at which this document was
|
- `created_date`: The date (YYYY-MM-DD) at which this document was
|
||||||
created. Optional. If also passed with created, this is ignored.
|
created. Optional. If also passed with created, this is ignored.
|
||||||
- `modified`: The date at which this document was last edited in
|
- `modified`: The date at which this document was last edited in
|
||||||
paperless. Read-only.
|
paperless. Read-only.
|
||||||
- `added`: The date at which this document was added to paperless.
|
- `added`: The date at which this document was added to paperless.
|
||||||
Read-only.
|
Read-only.
|
||||||
- `archive_serial_number`: The identifier of this document in a
|
- `archive_serial_number`: The identifier of this document in a
|
||||||
physical document archive.
|
physical document archive.
|
||||||
- `original_file_name`: Verbose filename of the original document.
|
- `original_file_name`: Verbose filename of the original document.
|
||||||
Read-only.
|
Read-only.
|
||||||
- `archived_file_name`: Verbose filename of the archived document.
|
- `archived_file_name`: Verbose filename of the archived document.
|
||||||
Read-only. Null if no archived document is available.
|
Read-only. Null if no archived document is available.
|
||||||
- `notes`: Array of notes associated with the document.
|
- `notes`: Array of notes associated with the document.
|
||||||
- `page_count`: Number of pages.
|
- `page_count`: Number of pages.
|
||||||
- `set_permissions`: Allows setting document permissions. Optional,
|
- `set_permissions`: Allows setting document permissions. Optional,
|
||||||
write-only. See [below](#permissions).
|
write-only. See [below](#permissions).
|
||||||
- `custom_fields`: Array of custom fields & values, specified as
|
- `custom_fields`: Array of custom fields & values, specified as
|
||||||
`{ field: CUSTOM_FIELD_ID, value: VALUE }`
|
`{ field: CUSTOM_FIELD_ID, value: VALUE }`
|
||||||
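As an illustration of the field list above, this sketch builds a JSON body that could be sent when editing a document, for example with a PATCH request to `/api/documents/454/`. All IDs and values are hypothetical, and the read-only fields (`id`, `modified`, `added`, and so on) are deliberately left out.

```python
import json

CUSTOM_FIELD_ID = 7  # hypothetical ID of an existing custom field

payload = {
    "title": "Electricity bill March 2024",
    "tags": [3, 12],  # hypothetical tag IDs; an empty list clears all tags
    "custom_fields": [
        # Same shape as described above: { field: CUSTOM_FIELD_ID, value: VALUE }
        {"field": CUSTOM_FIELD_ID, "value": "2024-03-31"},
    ],
}

# Serialized request body, to be sent with Content-Type: application/json.
body = json.dumps(payload)
```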
|
|
||||||
!!! note
|
!!! note
|
||||||
|
|
||||||
@ -69,11 +69,11 @@ fields:
|
|||||||
In addition to that, the document endpoint offers these additional
|
In addition to that, the document endpoint offers these additional
|
||||||
actions on individual documents:
|
actions on individual documents:
|
||||||
|
|
||||||
- `/api/documents/<pk>/download/`: Download the document.
|
- `/api/documents/<pk>/download/`: Download the document.
|
||||||
- `/api/documents/<pk>/preview/`: Display the document inline, without
|
- `/api/documents/<pk>/preview/`: Display the document inline, without
|
||||||
downloading it.
|
downloading it.
|
||||||
- `/api/documents/<pk>/thumb/`: Download the PNG thumbnail of a
|
- `/api/documents/<pk>/thumb/`: Download the PNG thumbnail of a
|
||||||
document.
|
document.
|
||||||
|
|
||||||
Paperless generates archived PDF/A documents from consumed files and
|
Paperless generates archived PDF/A documents from consumed files and
|
||||||
stores both the original files as well as the archived files. By
|
stores both the original files as well as the archived files. By
|
||||||
@ -107,30 +107,30 @@ Access the metadata of a document with an ID `id` at
|
|||||||
|
|
||||||
The endpoint reports the following data:
|
The endpoint reports the following data:
|
||||||
|
|
||||||
- `original_checksum`: MD5 checksum of the original document.
|
- `original_checksum`: MD5 checksum of the original document.
|
||||||
- `original_size`: Size of the original document, in bytes.
|
- `original_size`: Size of the original document, in bytes.
|
||||||
- `original_mime_type`: Mime type of the original document.
|
- `original_mime_type`: Mime type of the original document.
|
||||||
- `media_filename`: Current filename of the document, under which it
|
- `media_filename`: Current filename of the document, under which it
|
||||||
is stored inside the media directory.
|
is stored inside the media directory.
|
||||||
- `has_archive_version`: True, if this document is archived, false
|
- `has_archive_version`: True, if this document is archived, false
|
||||||
otherwise.
|
otherwise.
|
||||||
- `original_metadata`: A list of metadata associated with the original
|
- `original_metadata`: A list of metadata associated with the original
|
||||||
document. See below.
|
document. See below.
|
||||||
- `archive_checksum`: MD5 checksum of the archived document, or null.
|
- `archive_checksum`: MD5 checksum of the archived document, or null.
|
||||||
- `archive_size`: Size of the archived document in bytes, or null.
|
- `archive_size`: Size of the archived document in bytes, or null.
|
||||||
- `archive_metadata`: Metadata associated with the archived document,
|
- `archive_metadata`: Metadata associated with the archived document,
|
||||||
or null. See below.
|
or null. See below.
|
||||||
|
|
||||||
File metadata is reported as a list of objects in the following form:
|
File metadata is reported as a list of objects in the following form:
|
||||||
|
|
||||||
```json
|
```json
|
||||||
[
|
[
|
||||||
{
|
{
|
||||||
"namespace": "http://ns.adobe.com/pdf/1.3/",
|
"namespace": "http://ns.adobe.com/pdf/1.3/",
|
||||||
"prefix": "pdf",
|
"prefix": "pdf",
|
||||||
"key": "Producer",
|
"key": "Producer",
|
||||||
"value": "SparklePDF, Fancy edition"
|
"value": "SparklePDF, Fancy edition"
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
```
|
```
|
||||||
|
|
||||||
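Because the metadata arrives as a flat list, clients may find it convenient to index it by prefix and key. The sketch below does that for the sample above; `index_metadata` is a hypothetical helper and not part of the API.

```python
import json

# The sample response from above, as it would arrive over the wire.
raw = """
[
  {
    "namespace": "http://ns.adobe.com/pdf/1.3/",
    "prefix": "pdf",
    "key": "Producer",
    "value": "SparklePDF, Fancy edition"
  }
]
"""


def index_metadata(entries: list[dict]) -> dict[str, str]:
    """Map "prefix:key" to value; later duplicates overwrite earlier ones."""
    return {f'{entry["prefix"]}:{entry["key"]}': entry["value"] for entry in entries}


indexed = index_metadata(json.loads(raw))
producer = indexed.get("pdf:Producer")
```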
@ -140,9 +140,9 @@ document. Paperless only reports PDF metadata at this point.
|
|||||||
|
|
||||||
## Documents additional endpoints
|
## Documents additional endpoints
|
||||||
|
|
||||||
- `/api/documents/<id>/notes/`: Retrieve notes for a document.
|
- `/api/documents/<id>/notes/`: Retrieve notes for a document.
|
||||||
- `/api/documents/<id>/share_links/`: Retrieve share links for a document.
|
- `/api/documents/<id>/share_links/`: Retrieve share links for a document.
|
||||||
- `/api/documents/<id>/history/`: Retrieve history of changes for a document.
|
- `/api/documents/<id>/history/`: Retrieve history of changes for a document.
|
||||||
|
|
||||||
## Authorization
|
## Authorization
|
||||||
|
|
||||||
@ -228,10 +228,10 @@ Full text searching is available on the `/api/documents/` endpoint. Two
|
|||||||
specific query parameters cause the API to return full text search
|
specific query parameters cause the API to return full text search
|
||||||
results:
|
results:
|
||||||
|
|
||||||
- `/api/documents/?query=your%20search%20query`: Search for a document
|
- `/api/documents/?query=your%20search%20query`: Search for a document
|
||||||
using a full text query. For details on the syntax, see [Basic Usage - Searching](usage.md#basic-usage_searching).
|
using a full text query. For details on the syntax, see [Basic Usage - Searching](usage.md#basic-usage_searching).
|
||||||
- `/api/documents/?more_like_id=1234`: Search for documents similar to
|
- `/api/documents/?more_like_id=1234`: Search for documents similar to
|
||||||
the document with id 1234.
|
the document with id 1234.
|
||||||
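The two query parameters above must be URL-encoded when they are built programmatically. A minimal sketch using only the Python standard library follows; `BASE_URL` is an assumption standing in for your own installation, and `documents_url` is a hypothetical helper.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # assumption: replace with your own instance


def documents_url(**params) -> str:
    """Build a /api/documents/ URL, encoding spaces as %20 rather than "+"."""
    return f"{BASE_URL}/api/documents/?{urlencode(params, quote_via=quote)}"


full_text_url = documents_url(query="your search query")
similar_url = documents_url(more_like_id=1234)
```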
|
|
||||||
Pagination works exactly the same as it does for normal requests on this
|
Pagination works exactly the same as it does for normal requests on this
|
||||||
endpoint.
|
endpoint.
|
||||||
@ -268,12 +268,12 @@ attribute with various information about the search results:
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
- `score` is an indication of how well this document matches the query
|
- `score` is an indication of how well this document matches the query
|
||||||
relative to the other search results.
|
relative to the other search results.
|
||||||
- `highlights` is an excerpt from the document content and highlights
|
- `highlights` is an excerpt from the document content and highlights
|
||||||
the search terms with `<span>` tags as shown above.
|
the search terms with `<span>` tags as shown above.
|
||||||
- `rank` is the index of the search results. The first result will
|
- `rank` is the index of the search results. The first result will
|
||||||
have rank 0.
|
have rank 0.
|
||||||
|
|
||||||
### Filtering by custom fields
|
### Filtering by custom fields
|
||||||
|
|
||||||
@ -284,33 +284,33 @@ use cases:
|
|||||||
1. Documents with a custom field "due" (date) between Aug 1, 2024 and
|
1. Documents with a custom field "due" (date) between Aug 1, 2024 and
|
||||||
Sept 1, 2024 (inclusive):
|
Sept 1, 2024 (inclusive):
|
||||||
|
|
||||||
`?custom_field_query=["due", "range", ["2024-08-01", "2024-09-01"]]`
|
`?custom_field_query=["due", "range", ["2024-08-01", "2024-09-01"]]`
|
||||||
|
|
||||||
2. Documents with a custom field "customer" (text) that equals "bob"
|
2. Documents with a custom field "customer" (text) that equals "bob"
|
||||||
(case sensitive):
|
(case sensitive):
|
||||||
|
|
||||||
`?custom_field_query=["customer", "exact", "bob"]`
|
`?custom_field_query=["customer", "exact", "bob"]`
|
||||||
|
|
||||||
3. Documents with a custom field "answered" (boolean) set to `true`:
|
3. Documents with a custom field "answered" (boolean) set to `true`:
|
||||||
|
|
||||||
`?custom_field_query=["answered", "exact", true]`
|
`?custom_field_query=["answered", "exact", true]`
|
||||||
|
|
||||||
4. Documents with a custom field "favorite animal" (select) set to either
|
4. Documents with a custom field "favorite animal" (select) set to either
|
||||||
"cat" or "dog":
|
"cat" or "dog":
|
||||||
|
|
||||||
`?custom_field_query=["favorite animal", "in", ["cat", "dog"]]`
|
`?custom_field_query=["favorite animal", "in", ["cat", "dog"]]`
|
||||||
|
|
||||||
5. Documents with a custom field "address" (text) that is empty:
|
5. Documents with a custom field "address" (text) that is empty:
|
||||||
|
|
||||||
`?custom_field_query=["OR", ["address", "isnull", true], ["address", "exact", ""]]`
|
`?custom_field_query=["OR", ["address", "isnull", true], ["address", "exact", ""]]`
|
||||||
|
|
||||||
6. Documents that don't have a field called "foo":
|
6. Documents that don't have a field called "foo":
|
||||||
|
|
||||||
`?custom_field_query=["foo", "exists", false]`
|
`?custom_field_query=["foo", "exists", false]`
|
||||||
|
|
||||||
7. Documents that have document links "references" to both document 3 and 7:
|
7. Documents that have document links "references" to both document 3 and 7:
|
||||||
|
|
||||||
`?custom_field_query=["references", "contains", [3, 7]]`
|
`?custom_field_query=["references", "contains", [3, 7]]`
|
||||||
|
|
||||||
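The query expressions above are JSON arrays, so one way to avoid quoting mistakes is to build them as native data structures and serialize them. The sketch below assumes nothing beyond the Python standard library; `custom_field_query` is a hypothetical helper name.

```python
import json
from urllib.parse import parse_qs, quote, urlencode


def custom_field_query(expression) -> str:
    """Serialize an expression to JSON and percent-encode it as a query string."""
    return urlencode({"custom_field_query": json.dumps(expression)}, quote_via=quote)


# Use case 1: "due" between Aug 1, 2024 and Sept 1, 2024 (inclusive).
due_in_range = custom_field_query(["due", "range", ["2024-08-01", "2024-09-01"]])

# Use case 5: "address" is empty. Python's True is serialized as JSON true.
empty_address = custom_field_query(
    ["OR", ["address", "isnull", True], ["address", "exact", ""]]
)
```

Append the resulting string to `/api/documents/?` to issue the request. Decoding it again with `parse_qs` and `json.loads` returns the original expression, which is a quick way to check that nothing was lost in encoding.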
All field types support basic operations including `exact`, `in`, `isnull`,
|
All field types support basic operations including `exact`, `in`, `isnull`,
|
||||||
and `exists`. String, URL, and monetary fields support case-insensitive
|
and `exists`. String, URL, and monetary fields support case-insensitive
|
||||||
@ -326,8 +326,8 @@ Get auto completions for a partial search term.

Query parameters:

- `term`: The incomplete term.
- `limit`: Number of results to return. Defaults to 10.

Results returned by the endpoint are ordered by importance of the term
in the document index. The first result is the term that has the highest
@ -351,19 +351,19 @@ from there.

The endpoint supports the following optional form fields:

- `title`: Specify a title that the consumer should use for the
  document.
- `created`: Specify the date and time at which the document was created
  (e.g. "2016-04-19" or "2016-04-19 06:15:00+02:00").
- `correspondent`: Specify the ID of a correspondent that the consumer
  should use for the document.
- `document_type`: Similar to correspondent.
- `storage_path`: Similar to correspondent.
- `tags`: Similar to correspondent. Specify this multiple times to
  have multiple tags added to the document.
- `archive_serial_number`: An optional archive serial number to set.
- `custom_fields`: An array of custom field IDs to assign (with an empty
  value) to the document.

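Because `tags` may be specified several times, the form fields are easiest to hold as a list of name/value pairs rather than a dictionary. The following is a minimal sketch of assembling those pairs; the function name `build_upload_fields` is hypothetical, and `custom_fields` is left out because its exact encoding is not described here:

```python
def build_upload_fields(title=None, created=None, correspondent=None,
                        document_type=None, storage_path=None, tags=(),
                        archive_serial_number=None):
    """Return (name, value) pairs for the optional upload form fields."""
    single_valued = {
        "title": title,
        "created": created,
        "correspondent": correspondent,
        "document_type": document_type,
        "storage_path": storage_path,
        "archive_serial_number": archive_serial_number,
    }
    # Only send the fields that were actually provided.
    fields = [(name, str(value))
              for name, value in single_valued.items()
              if value is not None]
    # `tags` is repeated once per tag ID.
    fields.extend(("tags", str(tag_id)) for tag_id in tags)
    return fields


fields = build_upload_fields(
    title="Invoice", created="2016-04-19", correspondent=3, tags=[1, 5]
)
```

A list of pairs keeps the repeated `tags` entries intact when it is handed to an HTTP client's form-data argument.
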
The endpoint will immediately return HTTP 200 if the document consumption
process was started successfully, with the UUID of the consumption task
@ -429,50 +429,50 @@ a json payload of the format:

The following methods are supported:

- `set_correspondent`
    - Requires `parameters`: `{ "correspondent": CORRESPONDENT_ID }`
- `set_document_type`
    - Requires `parameters`: `{ "document_type": DOCUMENT_TYPE_ID }`
- `set_storage_path`
    - Requires `parameters`: `{ "storage_path": STORAGE_PATH_ID }`
- `add_tag`
    - Requires `parameters`: `{ "tag": TAG_ID }`
- `remove_tag`
    - Requires `parameters`: `{ "tag": TAG_ID }`
- `modify_tags`
    - Requires `parameters`: `{ "add_tags": [LIST_OF_TAG_IDS] }` and / or `{ "remove_tags": [LIST_OF_TAG_IDS] }`
- `delete`
    - No `parameters` required
- `reprocess`
    - No `parameters` required
- `set_permissions`
    - Requires `parameters`:
        - `"set_permissions": PERMISSIONS_OBJ` (see format [above](#permissions)) and / or
        - `"owner": OWNER_ID or null`
        - `"merge": true or false` (defaults to false)
    - The `merge` flag determines whether the supplied permissions overwrite all existing permissions (including
      removing them) or are merged with existing permissions.
- `merge`
    - No additional `parameters` required.
    - The ordering of the merged document is determined by the list of IDs.
    - Optional `parameters`:
        - `"metadata_document_id": DOC_ID` to apply metadata (tags, correspondent, etc.) from this document to the merged document.
        - `"delete_originals": true` to delete the original documents. This requires the calling user to be the owner of
          all documents that are merged.
- `split`
    - Requires `parameters`:
        - `"pages": [..]` The list should contain pages and/or page ranges, separated by commas, e.g. `"[1,2-3,4,5-7]"`
    - Optional `parameters`:
        - `"delete_originals": true` to delete the original document after consumption. This requires the calling user to be the owner of
          the document.
    - The split operation only accepts a single document.
- `rotate`
    - Requires `parameters`:
        - `"degrees": DEGREES`. Must be an integer, e.g. 90, 180 or 270
- `delete_pages`
    - Requires `parameters`:
        - `"pages": [..]` The list should be a list of integers, e.g. `"[2,3,4]"`
    - The delete_pages operation only accepts a single document.

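As an illustration of the method list above, the sketch below builds a bulk-edit request body and checks that the parameters a method requires are present. The top-level keys `documents` and `method` are assumptions based on the payload format described earlier in this section, and `bulk_edit_payload` is a hypothetical helper, not part of Paperless-ngx. Only methods with a single, unconditional set of required parameters are covered:

```python
import json

# Methods from the list above mapped to the parameter keys they require.
REQUIRED_PARAMETERS = {
    "set_correspondent": {"correspondent"},
    "set_document_type": {"document_type"},
    "set_storage_path": {"storage_path"},
    "add_tag": {"tag"},
    "remove_tag": {"tag"},
    "delete": set(),
    "reprocess": set(),
    "rotate": {"degrees"},
    "delete_pages": {"pages"},
}


def bulk_edit_payload(document_ids, method, parameters=None):
    """Return a JSON request body, refusing calls with missing parameters."""
    parameters = dict(parameters or {})
    missing = REQUIRED_PARAMETERS.get(method, set()) - set(parameters)
    if missing:
        raise ValueError(f"{method} is missing parameters: {sorted(missing)}")
    # "documents" and "method" are assumed key names; see the lead-in.
    return json.dumps(
        {"documents": list(document_ids), "method": method, "parameters": parameters}
    )


body = bulk_edit_payload([12, 13], "add_tag", {"tag": 4})
```

Validating on the client side is optional, but it surfaces a forgotten parameter before the request is sent.
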
### Objects

@ -494,16 +494,16 @@ operations, using the endpoint: `/api/bulk_edit_objects/`, which requires a json

The REST API has been versioned since Paperless-ngx 1.3.0.

- Versioning ensures that changes to the API don't break older
  clients.
- Clients specify the version of the API they wish to use
  with every request and Paperless will handle the request using the
  specified API version.
- Even if the underlying data model changes, older API versions will
  always serve compatible data.
- If no version is specified, Paperless will serve version 1 to ensure
  compatibility with older clients that do not request a specific API
  version.

API versions are specified by submitting an additional HTTP `Accept`
header with every request:
@ -540,19 +540,19 @@ Initial API version.

#### Version 2

- Added field `Tag.color`. This read/write string field contains a hex
  color such as `#a6cee3`.
- Added read-only field `Tag.text_color`. This field contains the text
  color to use for a specific tag, which is either black or white
  depending on the brightness of `Tag.color`.
- Removed field `Tag.colour`.

#### Version 3

- Permissions endpoints have been added.
- The format of the `/api/ui_settings/` endpoint has changed.

#### Version 4

- Consumption templates were refactored to workflows and API endpoints
  changed accordingly.
7560
docs/changelog.md
File diff suppressed because it is too large
@ -8,17 +8,17 @@ common [OCR](#ocr) related settings and some frontend settings. If set, these wi
preference over the settings via environment variables. If not set, the environment setting
or applicable default will be used instead.

- If you run paperless on Docker, `paperless.conf` is not used.
  Instead, configure paperless by copying the necessary options to
  `docker-compose.env`.

- If you are running paperless on anything else, paperless will search
  for the configuration file in these locations and use the first one
  it finds:
    - The path given by the environment variable `PAPERLESS_CONFIGURATION_PATH`
    - `/path/to/paperless/paperless.conf`
    - `/etc/paperless.conf`
    - `/usr/local/etc/paperless.conf`

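The "first one it finds" lookup described above can be pictured with the following sketch. It is an illustration of the search order only, not the code Paperless-ngx actually runs; `find_config_file` is a hypothetical name, and treating an unset `PAPERLESS_CONFIGURATION_PATH` as "skip this entry" is an assumption:

```python
import os
from pathlib import Path


def find_config_file(candidates, environ=os.environ):
    """Return the first existing file, checking the environment variable first."""
    env_path = environ.get("PAPERLESS_CONFIGURATION_PATH")
    ordered = ([env_path] if env_path else []) + list(candidates)
    for candidate in ordered:
        if Path(candidate).is_file():
            return Path(candidate)
    # Nothing found: the caller falls back to environment variables and defaults.
    return None


# The two fixed system-wide locations from the list above.
config = find_config_file(["/etc/paperless.conf", "/usr/local/etc/paperless.conf"])
```

The result is `None` when no file exists, which matches the documented fallback to environment settings or defaults.
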
## Required services

@ -6,23 +6,23 @@ on Paperless-ngx.
Check out the source from GitHub. The repository is organized in the
following way:

- `main` always represents the latest release and will only see
  changes when a new release is made.
- `dev` contains the code that will be in the next release.
- `feature-X` contains bigger changes that will be in some release, but
  not necessarily the next one.

When making functional changes to Paperless-ngx, _always_ make your changes
on the `dev` branch.

Apart from that, the folder structure is as follows:

- `docs/` - Documentation.
- `src-ui/` - Code of the front end.
- `src/` - Code of the back end.
- `scripts/` - Various scripts that help with different parts of
  development.
- `docker/` - Files required to build the docker image.

## Contributing to Paperless-ngx

@ -99,17 +99,17 @@ first-time setup.

7. You can now either ...

    - install redis or

    - use the included `scripts/start_services.sh` to use docker to fire
      up a redis instance (and some other services such as tika,
      gotenberg and a database server) or

    - spin up a bare redis container

        ```
        $ docker run -d -p 6379:6379 --restart unless-stopped redis:latest
        ```

8. Continue with either back-end or front-end development – or both :-).

@ -122,9 +122,9 @@ work well for development, but you can use whatever you want.
Configure the IDE to use the `src/`-folder as the base source folder.
Configure the following launch configurations in your IDE:

- `python3 manage.py runserver`
- `python3 manage.py document_consumer`
- `celery --app paperless worker -l DEBUG` (or any other log level)

To start them all:

@ -150,11 +150,11 @@ $ ng build --configuration production

### Testing

- Run `pytest` in the `src/` directory to execute all tests. This also
  generates an HTML coverage report. When running tests, `paperless.conf`
  is loaded as well. However, the tests rely on the default
  configuration. This is not ideal, but for now, make sure no settings
  except for DEBUG are overridden when testing.

!!! note

@ -245,14 +245,14 @@ these parts have to be translated separately.

### Front end localization

- The Angular front end does localization according to the [Angular
  documentation](https://angular.io/guide/i18n).
- The source language of the project is "en_US".
- The source strings end up in the file `src-ui/messages.xlf`.
- The translated strings need to be placed in the
  `src-ui/src/locale/` folder.
- In order to extract added or changed strings from the source files,
  call `ng extract-i18n`.

Adding new languages requires adding the translated files in the
`src-ui/src/locale/` folder and adjusting a couple of files.
@ -298,18 +298,18 @@ A majority of the strings that appear in the back end appear only when
the admin is used. However, some of these are still shown on the front
end (such as error messages).

- The Django application does localization according to the [Django
  documentation](https://docs.djangoproject.com/en/3.1/topics/i18n/translation/).
- The source language of the project is "en_US".
- Localization files end up in the folder `src/locale/`.
- In order to extract strings from the application, call
  `python3 manage.py makemessages -l en_US`. This is important after
  making changes to translatable strings.
- The message files need to be compiled for them to show up in the
  application. Call `python3 manage.py compilemessages` to do this.
  The generated files don't get committed into git, since these are
  derived artifacts. The build pipeline takes care of executing this
  command.

Adding new languages requires adding the translated files in the
`src/locale/`-folder and adjusting the file
@ -378,10 +378,10 @@ base code.
Paperless-ngx uses parsers to add documents. A parser is
responsible for:

- Retrieving the content from the original
- Creating a thumbnail
- _optional:_ Retrieving a created date from the original
- _optional:_ Creating an archived document from the original

Custom parsers can be added to Paperless-ngx to support more file types. In
order to do that, you need to write the parser itself and announce its
@ -439,14 +439,14 @@ def myparser_consumer_declaration(sender, **kwargs):
    }
```

- `parser` is a reference to a class that extends `DocumentParser`.
- `weight` is used whenever two or more parsers are able to parse a
  file: The parser with the higher weight wins. This can be used to
  override the parsers provided by Paperless-ngx.
- `mime_types` is a dictionary. The keys are the mime types your
  parser supports and the value is the default file extension that
  Paperless-ngx should use when storing files and serving them for
  download. We could guess that from the file extensions, but some
  mime types have many extensions associated with them and the Python
  methods responsible for guessing the extension do not always return
  the same value.
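To make the role of `weight` concrete, here is a minimal sketch of choosing among several declarations for one mime type. The dictionary keys follow the declaration described above; the two parser classes and the `pick_parser` function are hypothetical stand-ins, and the actual selection code in Paperless-ngx may differ:

```python
class BuiltinTextParser:
    """Hypothetical stand-in for a parser shipped with Paperless-ngx."""


class MyTextParser:
    """Hypothetical custom parser that should take precedence."""


declarations = [
    {"parser": BuiltinTextParser, "weight": 0, "mime_types": {"text/plain": ".txt"}},
    {"parser": MyTextParser, "weight": 10, "mime_types": {"text/plain": ".txt"}},
]


def pick_parser(declarations, mime_type):
    """Return the parser class with the highest weight for a mime type."""
    candidates = [d for d in declarations if mime_type in d["mime_types"]]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d["weight"])["parser"]


chosen = pick_parser(declarations, "text/plain")
```

Declaring a custom parser with a higher weight than the built-in one is what lets it override the default for that mime type.
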
44
docs/faq.md
@ -40,28 +40,28 @@ system. On Linux, chances are high that this location is
You can always drag those files out of that folder to use them
elsewhere. Here are a couple of notes about that.

- Paperless-ngx never modifies your original documents. It keeps
  checksums of all documents and uses a scheduled sanity checker to
  check that they remain the same.
- By default, paperless uses the internal ID of each document as its
  filename. This might not be very convenient for export. However, you
  can adjust the way files are stored in paperless by
  [configuring the filename format](advanced_usage.md#file-name-handling).
- [The exporter](administration.md#exporter) is
  another easy way to get your files out of paperless with reasonable
  file names.

## _What file types does paperless-ngx support?_

**A:** Currently, the following files are supported:

- PDF documents, PNG images, JPEG images, TIFF images, GIF images and
  WebP images are processed with OCR and converted into PDF documents.
- Plain text documents are supported as well and are added verbatim to
  paperless.
- With the optional Tika integration enabled (see [Tika configuration](https://docs.paperless-ngx.com/configuration#tika)),
  Paperless also supports various Office documents (.docx, .doc, .odt,
  .ppt, .pptx, .odp, .xls, .xlsx, .ods).

Paperless-ngx determines the type of a file by inspecting its content.
The file extensions do not matter.
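Content-based detection generally works by comparing the first bytes of a file against known signatures. The sketch below illustrates the idea with a handful of well-known signatures using only the standard library; it is a simplified stand-in, not the detection Paperless-ngx performs (the setup instructions list `libmagic-dev` for mime type detection), and `sniff_mime_type` is a hypothetical name:

```python
# Leading byte signatures for a few of the supported formats.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime_type(data):
    """Guess a mime type from the leading bytes, ignoring any file name."""
    for signature, mime_type in SIGNATURES:
        if data.startswith(signature):
            return mime_type
    return None


# A PDF is recognized from its header even if the file were named "scan.txt".
detected = sniff_mime_type(b"%PDF-1.7\n...")
```

This is why renaming a file does not change how it is handled: the decision rests on the bytes, not the extension.
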
@ -127,11 +127,11 @@ ASGI-enabled web server as well that processes WebSocket connections,
and configure Apache to redirect WebSocket connections to this server.
Multiple options for ASGI servers exist:

- `gunicorn` with `uvicorn` as the worker implementation (the default
  of paperless)
- `daphne` as a standalone server, which is the reference
  implementation for ASGI.
- `uvicorn` as a standalone server

## _What about the Redis licensing change and using one of the open source forks_?

222
docs/setup.md
@ -2,11 +2,11 @@

You can go multiple routes to set up and run Paperless:

- [Use the easy install docker script](#docker_script)
- [Pull the image from Docker Hub](#docker_hub)
- [Build the Docker image yourself](#docker_build)
- [Install Paperless directly on your system manually (bare metal)](#bare_metal)
- A user-maintained list of commercial hosting providers can be found [in the wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Related-Projects)

The Docker routes are quick & easy. These are the recommended routes.
This configures all the stuff from the above automatically so that it
@ -105,14 +105,14 @@ steps described in [Docker setup](#docker_hub) automatically.

```yaml
ports:
  - 8000:8000
```

Replace the part BEFORE the colon with a port of your choice:

```yaml
ports:
  - 8010:8000
```

Don't change the part after the colon or edit other lines that
@ -129,11 +129,11 @@ steps described in [Docker setup](#docker_hub) automatically.
If you want to run Paperless as a rootless container, you will need
to do the following in your `docker-compose.yml`:

- Set the `user` running the container to map to the `paperless`
  user in the container. This value (`user_id` below) should be
  the same ID that `USERMAP_UID` and `USERMAP_GID` are set to in
  the next step. See `USERMAP_UID` and `USERMAP_GID`
  [here](configuration.md#docker).

Your entry for Paperless should contain something like:

@ -222,7 +222,7 @@ steps described in [Docker setup](#docker_hub) automatically.

```yaml
webserver:
  image: ghcr.io/paperless-ngx/paperless-ngx:latest
```

and replace it with a line that instructs Docker Compose to build
@ -230,8 +230,8 @@ steps described in [Docker setup](#docker_hub) automatically.

```yaml
webserver:
  build:
    context: .
```

4. Follow steps 3 to 8 of [Docker Setup](#docker_hub). When asked to run
@ -257,20 +257,20 @@ are released, dependency support is confirmed, etc.

1. Install dependencies. Paperless requires the following packages.

    - `python3`
    - `python3-pip`
    - `python3-dev`
    - `default-libmysqlclient-dev` for MariaDB
    - `pkg-config` for mysqlclient (python dependency)
    - `fonts-liberation` for generating thumbnails for plain text
      files
    - `imagemagick` >= 6 for PDF conversion
    - `gnupg` for handling encrypted documents
    - `libpq-dev` for PostgreSQL
    - `libmagic-dev` for mime type detection
    - `mariadb-client` for MariaDB compile time
    - `libzbar0` for barcode detection
    - `poppler-utils` for barcode detection

    Use this list for your preferred package management:

@ -281,17 +281,17 @@ are released, dependency support is confirmed, etc.
|
|||||||
These dependencies are required for OCRmyPDF, which is used for text
|
These dependencies are required for OCRmyPDF, which is used for text
|
||||||
recognition.
|
recognition.
|
||||||
|
|
||||||
- `unpaper`
|
- `unpaper`
|
||||||
- `ghostscript`
|
- `ghostscript`
|
||||||
- `icc-profiles-free`
|
- `icc-profiles-free`
|
||||||
- `qpdf`
|
- `qpdf`
|
||||||
- `liblept5`
|
- `liblept5`
|
||||||
- `libxml2`
|
- `libxml2`
|
||||||
- `pngquant` (suggested for certain PDF image optimizations)
|
- `pngquant` (suggested for certain PDF image optimizations)
|
||||||
- `zlib1g`
|
- `zlib1g`
|
||||||
- `tesseract-ocr` >= 4.0.0 for OCR
|
- `tesseract-ocr` >= 4.0.0 for OCR
|
||||||
- `tesseract-ocr` language packs (`tesseract-ocr-eng`,
|
- `tesseract-ocr` language packs (`tesseract-ocr-eng`,
|
||||||
`tesseract-ocr-deu`, etc)
|
`tesseract-ocr-deu`, etc)
|
||||||
|
|
||||||
Use this list for your preferred package manager:
|
Use this list for your preferred package manager:
|
||||||
|
|
||||||
@ -301,15 +301,15 @@ are released, dependency support is confirmed, etc.
|
|||||||
|
|
||||||
On Raspberry Pi, these libraries are required as well:
|
On Raspberry Pi, these libraries are required as well:
|
||||||
|
|
||||||
- `libatlas-base-dev`
|
- `libatlas-base-dev`
|
||||||
- `libxslt1-dev`
|
- `libxslt1-dev`
|
||||||
- `mime-support`
|
- `mime-support`
|
||||||
|
|
||||||
You will also need these for installing some of the python dependencies:
|
You will also need these for installing some of the python dependencies:
|
||||||
|
|
||||||
- `build-essential`
|
- `build-essential`
|
||||||
- `python3-setuptools`
|
- `python3-setuptools`
|
||||||
- `python3-wheel`
|
- `python3-wheel`
|
||||||
|
|
||||||
Use this list for your preferred package manager:
|
Use this list for your preferred package manager:
|
||||||
|
|
||||||
@ -361,33 +361,33 @@ are released, dependency support is confirmed, etc.
|
|||||||
needs. Required settings for getting
|
needs. Required settings for getting
|
||||||
paperless running are:
|
paperless running are:
|
||||||
|
|
||||||
- [`PAPERLESS_REDIS`](configuration.md#PAPERLESS_REDIS) should point to your redis server, such as
|
- [`PAPERLESS_REDIS`](configuration.md#PAPERLESS_REDIS) should point to your redis server, such as
|
||||||
<redis://localhost:6379>.
|
<redis://localhost:6379>.
|
||||||
- [`PAPERLESS_DBENGINE`](configuration.md#PAPERLESS_DBENGINE) is optional and should be one of `postgres`,
|
- [`PAPERLESS_DBENGINE`](configuration.md#PAPERLESS_DBENGINE) is optional and should be one of `postgres`,
|
||||||
`mariadb`, or `sqlite`
|
`mariadb`, or `sqlite`
|
||||||
- [`PAPERLESS_DBHOST`](configuration.md#PAPERLESS_DBHOST) should be the hostname on which your
|
- [`PAPERLESS_DBHOST`](configuration.md#PAPERLESS_DBHOST) should be the hostname on which your
|
||||||
PostgreSQL server is running. Leave this unset to use
|
PostgreSQL server is running. Leave this unset to use
|
||||||
SQLite instead. Also configure port, database name, user and
|
SQLite instead. Also configure port, database name, user and
|
||||||
password as necessary.
|
password as necessary.
|
||||||
- [`PAPERLESS_CONSUMPTION_DIR`](configuration.md#PAPERLESS_CONSUMPTION_DIR) should point to a folder which
|
- [`PAPERLESS_CONSUMPTION_DIR`](configuration.md#PAPERLESS_CONSUMPTION_DIR) should point to a folder which
|
||||||
paperless should watch for documents. You might want to have
|
paperless should watch for documents. You might want to have
|
||||||
this somewhere else. Likewise, [`PAPERLESS_DATA_DIR`](configuration.md#PAPERLESS_DATA_DIR) and
|
this somewhere else. Likewise, [`PAPERLESS_DATA_DIR`](configuration.md#PAPERLESS_DATA_DIR) and
|
||||||
[`PAPERLESS_MEDIA_ROOT`](configuration.md#PAPERLESS_MEDIA_ROOT) define where paperless stores its data.
|
[`PAPERLESS_MEDIA_ROOT`](configuration.md#PAPERLESS_MEDIA_ROOT) define where paperless stores its data.
|
||||||
If you like, you can point both to the same directory.
|
If you like, you can point both to the same directory.
|
||||||
- [`PAPERLESS_SECRET_KEY`](configuration.md#PAPERLESS_SECRET_KEY) should be a random sequence of
|
- [`PAPERLESS_SECRET_KEY`](configuration.md#PAPERLESS_SECRET_KEY) should be a random sequence of
|
||||||
characters. It's used for authentication. Failing to set it
|
characters. It's used for authentication. Failing to set it
|
||||||
allows third parties to forge authentication credentials.
|
allows third parties to forge authentication credentials.
|
||||||
- [`PAPERLESS_URL`](configuration.md#PAPERLESS_URL) if you are behind a reverse proxy. This should
|
- [`PAPERLESS_URL`](configuration.md#PAPERLESS_URL) if you are behind a reverse proxy. This should
|
||||||
point to your domain. Please see
|
point to your domain. Please see
|
||||||
[configuration](configuration.md) for more
|
[configuration](configuration.md) for more
|
||||||
information.
|
information.
|
||||||
|
|
||||||
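A random value for `PAPERLESS_SECRET_KEY` can be produced in several ways. As a minimal sketch, assuming Python 3 is available, the standard-library `secrets` module is one option. The helper name `generate_secret_key` and the 64-byte length are illustrative choices, not something paperless prescribes.

```python
import secrets


def generate_secret_key(num_bytes: int = 64) -> str:
    """Return a URL-safe random string built from `num_bytes` random bytes.

    Hypothetical helper for illustration; any sufficiently long random
    string should serve as PAPERLESS_SECRET_KEY.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a line that could be pasted into the configuration file.
    print(f"PAPERLESS_SECRET_KEY={generate_secret_key()}")
```

Each call yields a different value, so generate the key once and keep it stable afterwards.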
Many more adjustments can be made to paperless, especially the OCR
|
Many more adjustments can be made to paperless, especially the OCR
|
||||||
part. The following options are recommended for everyone:
|
part. The following options are recommended for everyone:
|
||||||
|
|
||||||
- Set [`PAPERLESS_OCR_LANGUAGE`](configuration.md#PAPERLESS_OCR_LANGUAGE) to the language most of your
|
- Set [`PAPERLESS_OCR_LANGUAGE`](configuration.md#PAPERLESS_OCR_LANGUAGE) to the language most of your
|
||||||
documents are written in.
|
documents are written in.
|
||||||
- Set [`PAPERLESS_TIME_ZONE`](configuration.md#PAPERLESS_TIME_ZONE) to your local time zone.
|
- Set [`PAPERLESS_TIME_ZONE`](configuration.md#PAPERLESS_TIME_ZONE) to your local time zone.
|
||||||
|
|
||||||
!!! warning
|
!!! warning
|
||||||
|
|
||||||
@ -395,9 +395,9 @@ are released, dependency support is confirmed, etc.
|
|||||||
|
|
||||||
7. Create the following directories if they are missing:
|
7. Create the following directories if they are missing:
|
||||||
|
|
||||||
- `/opt/paperless/media`
|
- `/opt/paperless/media`
|
||||||
- `/opt/paperless/data`
|
- `/opt/paperless/data`
|
||||||
- `/opt/paperless/consume`
|
- `/opt/paperless/consume`
|
||||||
|
|
||||||
Adjust as necessary if you configured different folders.
|
Adjust as necessary if you configured different folders.
|
||||||
Ensure that the paperless user has write permissions for every one
|
Ensure that the paperless user has write permissions for every one
|
||||||
@ -586,21 +586,21 @@ your setup depending on how you installed paperless.
|
|||||||
This setup describes how to update an existing paperless Docker
|
This setup describes how to update an existing paperless Docker
|
||||||
installation. The important things to keep in mind are as follows:
|
installation. The important things to keep in mind are as follows:
|
||||||
|
|
||||||
- Read the [changelog](changelog.md) and
|
- Read the [changelog](changelog.md) and
|
||||||
take note of breaking changes.
|
take note of breaking changes.
|
||||||
- You should decide if you want to stick with SQLite or want to
|
- You should decide if you want to stick with SQLite or want to
|
||||||
migrate your database to PostgreSQL. See [documentation](#sqlite_to_psql)
|
migrate your database to PostgreSQL. See [documentation](#sqlite_to_psql)
|
||||||
for details on
|
for details on
|
||||||
how to move your data from SQLite to PostgreSQL. Both work fine with
|
how to move your data from SQLite to PostgreSQL. Both work fine with
|
||||||
paperless. However, if you already have a database server running
|
paperless. However, if you already have a database server running
|
||||||
for other services, you might as well use it for paperless as well.
|
for other services, you might as well use it for paperless as well.
|
||||||
- The task scheduler of paperless, which is used to execute periodic
|
- The task scheduler of paperless, which is used to execute periodic
|
||||||
tasks such as email checking and maintenance, requires a
|
tasks such as email checking and maintenance, requires a
|
||||||
[redis](https://redis.io/) message broker instance. The
|
[redis](https://redis.io/) message broker instance. The
|
||||||
Docker Compose route takes care of that.
|
Docker Compose route takes care of that.
|
||||||
- The layout of the folder structure for your documents and data
|
- The layout of the folder structure for your documents and data
|
||||||
remains the same, so you can just plug your old docker volumes into
|
remains the same, so you can just plug your old docker volumes into
|
||||||
paperless-ngx and expect it to find everything where it should be.
|
paperless-ngx and expect it to find everything where it should be.
|
||||||
|
|
||||||
Migration to paperless-ngx is then performed in a few simple steps:
|
Migration to paperless-ngx is then performed in a few simple steps:
|
||||||
|
|
||||||
@ -763,30 +763,30 @@ Paperless runs on Raspberry Pi. However, some things are rather slow on
|
|||||||
the Pi and configuring some options in paperless can help improve
|
the Pi and configuring some options in paperless can help improve
|
||||||
performance immensely:
|
performance immensely:
|
||||||
|
|
||||||
- Stick with SQLite to save some resources.
|
- Stick with SQLite to save some resources.
|
||||||
- Consider setting [`PAPERLESS_OCR_PAGES`](configuration.md#PAPERLESS_OCR_PAGES) to 1, so that paperless will
|
- Consider setting [`PAPERLESS_OCR_PAGES`](configuration.md#PAPERLESS_OCR_PAGES) to 1, so that paperless will
|
||||||
only OCR the first page of your documents. In most cases, this page
|
only OCR the first page of your documents. In most cases, this page
|
||||||
contains enough information to be able to find it.
|
contains enough information to be able to find it.
|
||||||
- [`PAPERLESS_TASK_WORKERS`](configuration.md#PAPERLESS_TASK_WORKERS) and [`PAPERLESS_THREADS_PER_WORKER`](configuration.md#PAPERLESS_THREADS_PER_WORKER) are
|
- [`PAPERLESS_TASK_WORKERS`](configuration.md#PAPERLESS_TASK_WORKERS) and [`PAPERLESS_THREADS_PER_WORKER`](configuration.md#PAPERLESS_THREADS_PER_WORKER) are
|
||||||
configured to use all cores. The Raspberry Pi models 3 and up have 4
|
configured to use all cores. The Raspberry Pi models 3 and up have 4
|
||||||
cores, meaning that paperless will use 2 workers and 2 threads per
|
cores, meaning that paperless will use 2 workers and 2 threads per
|
||||||
worker. This may result in sluggish response times during
|
worker. This may result in sluggish response times during
|
||||||
consumption, so you might want to lower these settings (example: 2
|
consumption, so you might want to lower these settings (example: 2
|
||||||
workers and 1 thread to always have some computing power left for
|
workers and 1 thread to always have some computing power left for
|
||||||
other tasks).
|
other tasks).
|
||||||
- Keep [`PAPERLESS_OCR_MODE`](configuration.md#PAPERLESS_OCR_MODE) at its default value `skip` and consider
|
- Keep [`PAPERLESS_OCR_MODE`](configuration.md#PAPERLESS_OCR_MODE) at its default value `skip` and consider
|
||||||
OCR'ing your documents before feeding them into paperless. Some
|
OCR'ing your documents before feeding them into paperless. Some
|
||||||
scanners are able to do this!
|
scanners are able to do this!
|
||||||
- Set [`PAPERLESS_OCR_SKIP_ARCHIVE_FILE`](configuration.md#PAPERLESS_OCR_SKIP_ARCHIVE_FILE) to `with_text` to skip archive
|
- Set [`PAPERLESS_OCR_SKIP_ARCHIVE_FILE`](configuration.md#PAPERLESS_OCR_SKIP_ARCHIVE_FILE) to `with_text` to skip archive
|
||||||
file generation for already OCR'ed documents, or `always` to skip it
|
file generation for already OCR'ed documents, or `always` to skip it
|
||||||
for all documents.
|
for all documents.
|
||||||
- If you want to perform OCR on the device, consider using
|
- If you want to perform OCR on the device, consider using
|
||||||
`PAPERLESS_OCR_CLEAN=none`. This will speed up OCR times and use
|
`PAPERLESS_OCR_CLEAN=none`. This will speed up OCR times and use
|
||||||
less memory at the expense of slightly worse OCR results.
|
less memory at the expense of slightly worse OCR results.
|
||||||
- If using docker, consider setting [`PAPERLESS_WEBSERVER_WORKERS`](configuration.md#PAPERLESS_WEBSERVER_WORKERS) to 1. This will save some memory.
|
- If using docker, consider setting [`PAPERLESS_WEBSERVER_WORKERS`](configuration.md#PAPERLESS_WEBSERVER_WORKERS) to 1. This will save some memory.
|
||||||
- Consider setting [`PAPERLESS_ENABLE_NLTK`](configuration.md#PAPERLESS_ENABLE_NLTK) to false, to disable the
|
- Consider setting [`PAPERLESS_ENABLE_NLTK`](configuration.md#PAPERLESS_ENABLE_NLTK) to false, to disable the
|
||||||
more advanced language processing, which can take more memory and
|
more advanced language processing, which can take more memory and
|
||||||
processing time.
|
processing time.
|
||||||
|
|
||||||
For details, refer to [configuration](configuration.md).
|
For details, refer to [configuration](configuration.md).
|
||||||
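To make the worker and thread arithmetic concrete, here is a small sketch of one way to split cores that reproduces the 4-core example above (2 workers with 2 threads each). The function name `split_cores` and the square-root rule are assumptions chosen for illustration; they are not claimed to be the exact formula paperless uses.

```python
import math


def split_cores(cores: int) -> tuple[int, int]:
    """Split `cores` into (workers, threads_per_worker).

    Assumed rule for illustration: workers = floor(sqrt(cores)),
    threads = floor(cores / workers), each at least 1.
    """
    cores = max(cores, 1)
    workers = max(math.isqrt(cores), 1)
    threads = max(cores // workers, 1)
    return workers, threads


if __name__ == "__main__":
    workers, threads = split_cores(4)
    # Values that could then be lowered by hand on a Raspberry Pi.
    print(f"PAPERLESS_TASK_WORKERS={workers}")
    print(f"PAPERLESS_THREADS_PER_WORKER={threads}")
```

On a Pi, lowering the second value to 1 by hand leaves some capacity for other tasks, as suggested in the list above.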
|
|
||||||
|
@ -4,27 +4,27 @@
|
|||||||
|
|
||||||
Check for the following issues:
|
Check for the following issues:
|
||||||
|
|
||||||
- Ensure that the directory you're putting your documents in is the
|
- Ensure that the directory you're putting your documents in is the
|
||||||
folder paperless is watching. With docker, this setting is performed
|
folder paperless is watching. With docker, this setting is performed
|
||||||
in the `docker-compose.yml` file. Without Docker, look at the
|
in the `docker-compose.yml` file. Without Docker, look at the
|
||||||
`CONSUMPTION_DIR` setting. Don't adjust this setting if you're
|
`CONSUMPTION_DIR` setting. Don't adjust this setting if you're
|
||||||
using docker.
|
using docker.
|
||||||
|
|
||||||
- Ensure that redis is up and running. Paperless does its task
|
- Ensure that redis is up and running. Paperless does its task
|
||||||
processing asynchronously, and for documents to arrive at the task
|
processing asynchronously, and for documents to arrive at the task
|
||||||
processor, it needs redis to run.
|
processor, it needs redis to run.
|
||||||
|
|
||||||
- Ensure that the task processor is running. Docker does this
|
- Ensure that the task processor is running. Docker does this
|
||||||
automatically. Manually invoke the task processor by executing
|
automatically. Manually invoke the task processor by executing
|
||||||
|
|
||||||
```shell-session
|
```shell-session
|
||||||
$ celery --app paperless worker
|
$ celery --app paperless worker
|
||||||
```
|
```
|
||||||
|
|
||||||
- Look at the output of paperless and inspect it for any errors.
|
- Look at the output of paperless and inspect it for any errors.
|
||||||
|
|
||||||
- Go to the admin interface, and check if there are failed tasks. If
|
- Go to the admin interface, and check if there are failed tasks. If
|
||||||
so, the tasks will contain an error message.
|
so, the tasks will contain an error message.
|
||||||
|
|
||||||
## Consumer warns `OCR for XX failed`
|
## Consumer warns `OCR for XX failed`
|
||||||
|
|
||||||
@ -78,12 +78,12 @@ Ensure that `chown` is possible on these directories.
|
|||||||
This indicates that the Auto matching algorithm found no documents to
|
This indicates that the Auto matching algorithm found no documents to
|
||||||
learn from. This may have two reasons:
|
learn from. This may have two reasons:
|
||||||
|
|
||||||
- You don't use the Auto matching algorithm: The error can be safely
|
- You don't use the Auto matching algorithm: The error can be safely
|
||||||
ignored in this case.
|
ignored in this case.
|
||||||
- You are using the Auto matching algorithm: The classifier explicitly
|
- You are using the Auto matching algorithm: The classifier explicitly
|
||||||
excludes documents with Inbox tags. Verify that there are documents
|
excludes documents with Inbox tags. Verify that there are documents
|
||||||
in your archive without inbox tags. The algorithm will only learn
|
in your archive without inbox tags. The algorithm will only learn
|
||||||
from documents not in your inbox.
|
from documents not in your inbox.
|
||||||
|
|
||||||
## UserWarning in sklearn on every single document
|
## UserWarning in sklearn on every single document
|
||||||
|
|
||||||
@ -127,10 +127,10 @@ change in the `docker-compose.yml` file:
|
|||||||
# The gotenberg chromium route is used to convert .eml files. We do not
|
# The gotenberg chromium route is used to convert .eml files. We do not
|
||||||
# want to allow external content like tracking pixels or even javascript.
|
# want to allow external content like tracking pixels or even javascript.
|
||||||
command:
|
command:
|
||||||
- 'gotenberg'
|
- 'gotenberg'
|
||||||
- '--chromium-disable-javascript=true'
|
- '--chromium-disable-javascript=true'
|
||||||
- '--chromium-allow-list=file:///tmp/.*'
|
- '--chromium-allow-list=file:///tmp/.*'
|
||||||
- '--api-timeout=60'
|
- '--api-timeout=60'
|
||||||
```
|
```
|
||||||
|
|
||||||
## Permission denied errors in the consumption directory
|
## Permission denied errors in the consumption directory
|
||||||
|
338
docs/usage.md
@ -10,37 +10,37 @@ and provides many utilities for finding and managing your documents.
|
|||||||
Paperless essentially consists of two different parts for managing your
|
Paperless essentially consists of two different parts for managing your
|
||||||
documents:
|
documents:
|
||||||
|
|
||||||
- The _consumer_ watches a specified folder and adds all documents in
|
- The _consumer_ watches a specified folder and adds all documents in
|
||||||
that folder to paperless.
|
that folder to paperless.
|
||||||
- The _web server_ provides a UI that you use to manage and search for
|
- The _web server_ provides a UI that you use to manage and search for
|
||||||
your scanned documents.
|
your scanned documents.
|
||||||
|
|
||||||
Each document has a couple of fields that you can assign to it:
|
Each document has a couple of fields that you can assign to it:
|
||||||
|
|
||||||
- A _Document_ is a piece of paper that sometimes contains valuable
|
- A _Document_ is a piece of paper that sometimes contains valuable
|
||||||
information.
|
information.
|
||||||
- The _correspondent_ of a document is the person, institution or
|
- The _correspondent_ of a document is the person, institution or
|
||||||
company that a document either originates from, or is sent to.
|
company that a document either originates from, or is sent to.
|
||||||
- A _tag_ is a label that you can assign to documents. Think of labels
|
- A _tag_ is a label that you can assign to documents. Think of labels
|
||||||
as more powerful folders: Multiple documents can be grouped together
|
as more powerful folders: Multiple documents can be grouped together
|
||||||
with a single tag; however, a single document can also have multiple
|
with a single tag; however, a single document can also have multiple
|
||||||
tags. This is not possible with folders. The reason folders are not
|
tags. This is not possible with folders. The reason folders are not
|
||||||
implemented in paperless is simply that tags are much more versatile
|
implemented in paperless is simply that tags are much more versatile
|
||||||
than folders.
|
than folders.
|
||||||
- A _document type_ is used to demarcate the type of a document such
|
- A _document type_ is used to demarcate the type of a document such
|
||||||
as letter, bank statement, invoice, contract, etc. It is used to
|
as letter, bank statement, invoice, contract, etc. It is used to
|
||||||
identify what a document is about.
|
identify what a document is about.
|
||||||
- The _date added_ of a document is the date the document was scanned
|
- The _date added_ of a document is the date the document was scanned
|
||||||
into paperless. You cannot and should not change this date.
|
into paperless. You cannot and should not change this date.
|
||||||
- The _date created_ of a document is the date the document was
|
- The _date created_ of a document is the date the document was
|
||||||
initially issued. This can be the date you bought a product, the
|
initially issued. This can be the date you bought a product, the
|
||||||
date you signed a contract, or the date a letter was sent to you.
|
date you signed a contract, or the date a letter was sent to you.
|
||||||
- The _archive serial number_ (short: ASN) of a document is the
|
- The _archive serial number_ (short: ASN) of a document is the
|
||||||
identifier of the document in your physical document binders. See
|
identifier of the document in your physical document binders. See
|
||||||
[recommended workflow](#usage-recommended-workflow) below.
|
[recommended workflow](#usage-recommended-workflow) below.
|
||||||
- The _content_ of a document is the text that was OCR'ed from the
|
- The _content_ of a document is the text that was OCR'ed from the
|
||||||
document. This text is fed into the search engine and is used for
|
document. This text is fed into the search engine and is used for
|
||||||
matching tags, correspondents and document types.
|
matching tags, correspondents and document types.
|
||||||
|
|
||||||
## Adding documents to paperless
|
## Adding documents to paperless
|
||||||
|
|
||||||
@ -142,21 +142,21 @@ patterns can include wildcards and multiple patterns separated by a comma.
|
|||||||
The actions all ensure that the same mail is not consumed twice by
|
The actions all ensure that the same mail is not consumed twice by
|
||||||
different means. These are as follows:
|
different means. These are as follows:
|
||||||
|
|
||||||
- **Delete:** Immediately deletes mail that paperless has consumed
|
- **Delete:** Immediately deletes mail that paperless has consumed
|
||||||
documents from. Use with caution.
|
documents from. Use with caution.
|
||||||
- **Mark as read:** Mark consumed mail as read. Paperless will not
|
- **Mark as read:** Mark consumed mail as read. Paperless will not
|
||||||
consume documents from already read mails. If you read a mail before
|
consume documents from already read mails. If you read a mail before
|
||||||
paperless sees it, it will be ignored.
|
paperless sees it, it will be ignored.
|
||||||
- **Flag:** Sets the 'important' flag on mails with consumed
|
- **Flag:** Sets the 'important' flag on mails with consumed
|
||||||
documents. Paperless will not consume flagged mails.
|
documents. Paperless will not consume flagged mails.
|
||||||
- **Move to folder:** Moves consumed mails out of the way so that
|
- **Move to folder:** Moves consumed mails out of the way so that
|
||||||
paperless won't consume them again.
|
paperless won't consume them again.
|
||||||
- **Add custom Tag:** Adds a custom tag to mails with consumed
|
- **Add custom Tag:** Adds a custom tag to mails with consumed
|
||||||
documents (the IMAP standard calls these "keywords"). Paperless
|
documents (the IMAP standard calls these "keywords"). Paperless
|
||||||
will not consume mails already tagged. Not all mail servers support
|
will not consume mails already tagged. Not all mail servers support
|
||||||
this feature!
|
this feature!
|
||||||
|
|
||||||
- **Apple Mail support:** Apple Mail clients allow differently colored tags. For this to work use `apple:<color>` (e.g. _apple:green_) as a custom tag. Available colors are _red_, _orange_, _yellow_, _blue_, _green_, _violet_ and _grey_.
|
- **Apple Mail support:** Apple Mail clients allow differently colored tags. For this to work use `apple:<color>` (e.g. _apple:green_) as a custom tag. Available colors are _red_, _orange_, _yellow_, _blue_, _green_, _violet_ and _grey_.
|
||||||
|
|
||||||
!!! warning
|
!!! warning
|
||||||
|
|
||||||
@ -360,32 +360,32 @@ flowchart TD
|
|||||||
|
|
||||||
Workflows allow you to filter by:
|
Workflows allow you to filter by:
|
||||||
|
|
||||||
- Source, e.g. documents uploaded via consume folder, API (& the web UI) and mail fetch
|
- Source, e.g. documents uploaded via consume folder, API (& the web UI) and mail fetch
|
||||||
- File name, including wildcards e.g. \*.pdf will apply to all pdfs
|
- File name, including wildcards e.g. \*.pdf will apply to all pdfs
|
||||||
- File path, including wildcards. Note that enabling `PAPERLESS_CONSUMER_RECURSIVE` would allow, for
|
- File path, including wildcards. Note that enabling `PAPERLESS_CONSUMER_RECURSIVE` would allow, for
|
||||||
example, automatically assigning documents to different owners based on the upload directory.
|
example, automatically assigning documents to different owners based on the upload directory.
|
||||||
- Mail rule. Choosing this option will force 'mail fetch' to be the workflow source.
|
- Mail rule. Choosing this option will force 'mail fetch' to be the workflow source.
|
||||||
- Content matching (`Added` and `Updated` triggers only). Filter document content using the matching settings.
|
- Content matching (`Added` and `Updated` triggers only). Filter document content using the matching settings.
|
||||||
- Tags (`Added` and `Updated` triggers only). Filter for documents with any of the specified tags
|
- Tags (`Added` and `Updated` triggers only). Filter for documents with any of the specified tags
|
||||||
- Document type (`Added` and `Updated` triggers only). Filter documents with this doc type
|
- Document type (`Added` and `Updated` triggers only). Filter documents with this doc type
|
||||||
- Correspondent (`Added` and `Updated` triggers only). Filter documents with this correspondent
|
- Correspondent (`Added` and `Updated` triggers only). Filter documents with this correspondent
|
||||||
|
|
||||||
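As a rough illustration of how a wildcard file name filter such as `*.pdf` behaves, the sketch below uses Python's standard `fnmatch` module. The helper `filename_matches` and the case-insensitive comparison are assumptions made for this example; the matching in paperless itself may differ in detail.

```python
from fnmatch import fnmatch


def filename_matches(filename: str, pattern: str) -> bool:
    """Return True if `filename` matches the shell-style `pattern`.

    Both sides are lowercased so that `*.pdf` also matches `SCAN.PDF`
    (an assumption made for this sketch).
    """
    return fnmatch(filename.lower(), pattern.lower())


if __name__ == "__main__":
    for name in ("invoice.pdf", "SCAN.PDF", "notes.txt"):
        print(name, filename_matches(name, "*.pdf"))
```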
### Workflow Actions
|
### Workflow Actions
|
||||||
|
|
||||||
There are currently two types of workflow actions, "Assignment", which can assign:
|
There are currently two types of workflow actions, "Assignment", which can assign:
|
||||||
|
|
||||||
- Title, see [title placeholders](usage.md#title-placeholders) below
|
- Title, see [title placeholders](usage.md#title-placeholders) below
|
||||||
- Tags, correspondent, document type and storage path
|
- Tags, correspondent, document type and storage path
|
||||||
- Document owner
|
- Document owner
|
||||||
- View and / or edit permissions to users or groups
|
- View and / or edit permissions to users or groups
|
||||||
- Custom fields. Note that no value for the field will be set
|
- Custom fields. Note that no value for the field will be set
|
||||||
|
|
||||||
and "Removal" actions, which can remove either all of or specific sets of the following:
|
and "Removal" actions, which can remove either all of or specific sets of the following:
|
||||||
|
|
||||||
- Tags, correspondents, document types or storage paths
|
- Tags, correspondents, document types or storage paths
|
||||||
- Document owner
|
- Document owner
|
||||||
- View and / or edit permissions
|
- View and / or edit permissions
|
||||||
- Custom fields
|
- Custom fields
|
||||||
|
|
||||||
#### Title placeholders
|
#### Title placeholders
|
||||||
|
|
||||||
@ -393,29 +393,29 @@ Workflow titles can include placeholders but the available options differ depend
|
|||||||
workflow trigger. This is because at the time of consumption (when the title is to be set), no automatic tags etc. have been
|
workflow trigger. This is because at the time of consumption (when the title is to be set), no automatic tags etc. have been
|
||||||
applied. You can use the following placeholders with any trigger type:
|
applied. You can use the following placeholders with any trigger type:
|
||||||
|
|
||||||
- `{correspondent}`: assigned correspondent name
|
- `{correspondent}`: assigned correspondent name
|
||||||
- `{document_type}`: assigned document type name
|
- `{document_type}`: assigned document type name
|
||||||
- `{owner_username}`: assigned owner username
|
- `{owner_username}`: assigned owner username
|
||||||
- `{added}`: added datetime
|
- `{added}`: added datetime
|
||||||
- `{added_year}`: added year
|
- `{added_year}`: added year
|
||||||
- `{added_year_short}`: added year (short form)
|
- `{added_year_short}`: added year (short form)
|
||||||
- `{added_month}`: added month
|
- `{added_month}`: added month
|
||||||
- `{added_month_name}`: added month name
|
- `{added_month_name}`: added month name
|
||||||
- `{added_month_name_short}`: added month short name
|
- `{added_month_name_short}`: added month short name
|
||||||
- `{added_day}`: added day
|
- `{added_day}`: added day
|
||||||
- `{added_time}`: added time in HH:MM format
|
- `{added_time}`: added time in HH:MM format
|
||||||
- `{original_filename}`: original file name without extension
|
- `{original_filename}`: original file name without extension
|
||||||
|
|
||||||
The following placeholders are only available for "added" or "updated" triggers
|
The following placeholders are only available for "added" or "updated" triggers
|
||||||
|
|
||||||
- `{created}`: created datetime
|
- `{created}`: created datetime
|
||||||
- `{created_year}`: created year
|
- `{created_year}`: created year
|
||||||
- `{created_year_short}`: created year (short form)
|
- `{created_year_short}`: created year (short form)
|
||||||
- `{created_month}`: created month
|
- `{created_month}`: created month
|
||||||
- `{created_month_name}`: created month name
|
- `{created_month_name}`: created month name
|
||||||
- `{created_month_name_short}`: created month short name
|
- `{created_month_name_short}`: created month short name
|
||||||
- `{created_day}`: created day
|
- `{created_day}`: created day
|
||||||
- `{created_time}`: created time in HH:MM format
|
- `{created_time}`: created time in HH:MM format
|
||||||
|
|
||||||
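The placeholders listed above behave much like named fields in a format string. The sketch below, which is not the paperless implementation, fills a title template from an `added` datetime using Python's `str.format`. The helper `render_title` and the exact format of each field (for example a two-digit value for `added_year_short`) are assumptions.

```python
from datetime import datetime


def render_title(template: str, added: datetime, original_filename: str) -> str:
    """Fill a subset of the `added_*` placeholders in `template`.

    Hypothetical helper; field formats are assumed, not taken from paperless.
    """
    values = {
        "added": added.isoformat(),
        "added_year": added.strftime("%Y"),
        "added_year_short": added.strftime("%y"),
        "added_month": added.strftime("%m"),
        "added_day": added.strftime("%d"),
        "added_time": added.strftime("%H:%M"),
        "original_filename": original_filename,
    }
    return template.format(**values)


if __name__ == "__main__":
    when = datetime(2024, 3, 9, 14, 5)
    template = "{added_year}-{added_month}-{added_day} {original_filename}"
    print(render_title(template, when, "scan_001"))
```

A template that names a placeholder missing from `values` raises `KeyError` in this sketch, which mirrors the idea that some placeholders are only available for certain triggers.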
### Workflow permissions
|
### Workflow permissions
|
||||||
|
|
||||||
@ -450,24 +450,24 @@ Multiple fields may be attached to a document but the same field name cannot be
|
|||||||
|
|
||||||
The following custom field types are supported:
|
The following custom field types are supported:
|
||||||
|
|
||||||
- `Text`: any text
|
- `Text`: any text
|
||||||
- `Boolean`: true / false (checked / unchecked) field
|
- `Boolean`: true / false (checked / unchecked) field
|
||||||
- `Date`: date
|
- `Date`: date
|
||||||
- `URL`: a valid URL
|
- `URL`: a valid URL
|
||||||
- `Integer`: integer number e.g. 12
|
- `Integer`: integer number e.g. 12
|
||||||
- `Number`: float number e.g. 12.3456
|
- `Number`: float number e.g. 12.3456
|
||||||
- `Monetary`: [ISO 4217 currency code](https://en.wikipedia.org/wiki/ISO_4217#List_of_ISO_4217_currency_codes) and a number with exactly two decimals, e.g. USD12.30
|
- `Monetary`: [ISO 4217 currency code](https://en.wikipedia.org/wiki/ISO_4217#List_of_ISO_4217_currency_codes) and a number with exactly two decimals, e.g. USD12.30
|
||||||
- `Document Link`: reference(s) to other document(s) displayed as links, automatically creates a symmetrical link in reverse
|
- `Document Link`: reference(s) to other document(s) displayed as links, automatically creates a symmetrical link in reverse
|
||||||
- `Select`: a pre-defined list of strings from which the user can choose
|
- `Select`: a pre-defined list of strings from which the user can choose
|
||||||
|
|
||||||
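The `Monetary` format described above (a currency code followed by a number with exactly two decimals) can be checked with a short regular expression. This is a sketch under the assumption that the code is three uppercase letters; it does not verify that the code is actually listed in ISO 4217, and `is_monetary` is a hypothetical name rather than a paperless function.

```python
import re

# Three uppercase letters, then digits, a dot, and exactly two decimals.
MONETARY_RE = re.compile(r"[A-Z]{3}\d+\.\d{2}")


def is_monetary(value: str) -> bool:
    """Return True if `value` looks like `USD12.30`."""
    return MONETARY_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ("USD12.30", "EUR5.5", "12.30"):
        print(sample, is_monetary(sample))
```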
## Share Links
|
## Share Links
|
||||||
|
|
||||||
Paperless-ngx added the ability to create shareable links to files in version 2.0. You can find the button for this on the document detail screen.
|
Paperless-ngx added the ability to create shareable links to files in version 2.0. You can find the button for this on the document detail screen.
|
||||||
|
|
||||||
- Share links do not require a user to log in and thus link directly to a file.
|
- Share links do not require a user to log in and thus link directly to a file.
|
||||||
- Links are unique and are of the form `{paperless-url}/share/{randomly-generated-slug}`.
|
- Links are unique and are of the form `{paperless-url}/share/{randomly-generated-slug}`.
|
||||||
- Links can optionally have an expiration time set.
|
- Links can optionally have an expiration time set.
|
||||||
- After a link expires or is deleted, users will be redirected to the regular paperless-ngx login.
|
- After a link expires or is deleted, users will be redirected to the regular paperless-ngx login.
|
||||||
|
|
||||||
!!! tip
|
!!! tip
|
||||||
|
|
||||||
@ -477,10 +477,10 @@ Paperless-ngx added the ability to create shareable links to files in version 2.
|
|||||||
|
|
||||||
Paperless-ngx supports four basic editing operations for PDFs (these operations currently cannot be performed on non-PDF files):
|
Paperless-ngx supports four basic editing operations for PDFs (these operations currently cannot be performed on non-PDF files):
|
||||||
|
|
||||||
- Merging documents: available when selecting multiple documents for 'bulk editing'.
|
- Merging documents: available when selecting multiple documents for 'bulk editing'.
|
||||||
- Rotating documents: available when selecting multiple documents for 'bulk editing' and from an individual document's details page.
|
- Rotating documents: available when selecting multiple documents for 'bulk editing' and from an individual document's details page.
|
||||||
- Splitting documents: available from an individual document's details page.
|
- Splitting documents: available from an individual document's details page.
|
||||||
- Deleting pages: available from an individual document's details page.
|
- Deleting pages: available from an individual document's details page.
|
||||||
|
|
||||||
!!! important
|
!!! important
|
||||||
|
|
||||||
@ -558,18 +558,18 @@ the system.
|
|||||||
Here are a couple of examples of tags and types that you could use in your
|
Here are a couple of examples of tags and types that you could use in your
|
||||||
collection.
|
collection.
|
||||||
|
|
||||||
- An `inbox` tag for newly added documents that you haven't manually
|
- An `inbox` tag for newly added documents that you haven't manually
|
||||||
edited yet.
|
edited yet.
|
||||||
- A tag `car` for everything car related (repairs, registration,
|
- A tag `car` for everything car related (repairs, registration,
|
||||||
insurance, etc.)
|
insurance, etc.)
|
||||||
- A tag `todo` for documents that you still need to do something with,
|
- A tag `todo` for documents that you still need to do something with,
|
||||||
such as reply, or perform some task online.
|
such as reply, or perform some task online.
|
||||||
- A tag `bank account x` for all bank statements related to that
|
- A tag `bank account x` for all bank statements related to that
|
||||||
account.
|
account.
|
||||||
- A tag `mail` for anything that you added to paperless via its mail
|
- A tag `mail` for anything that you added to paperless via its mail
|
||||||
processing capabilities.
|
processing capabilities.
|
||||||
- A tag `missing_metadata` when you still need to add some metadata to
|
- A tag `missing_metadata` when you still need to add some metadata to
|
||||||
a document, but can't or don't want to do this right now.
|
a document, but can't or don't want to do this right now.
|
||||||
|
|
||||||
## Searching {#basic-usage_searching}
|
## Searching {#basic-usage_searching}
|
||||||
|
|
||||||
@ -658,8 +658,8 @@ The following diagram shows how easy it is to manage your documents.
|
|||||||
|
|
||||||
### Preparations in paperless
|
### Preparations in paperless
|
||||||
|
|
||||||
- Create an inbox tag that gets assigned to all new documents.
|
- Create an inbox tag that gets assigned to all new documents.
|
||||||
- Create a TODO tag.
|
- Create a TODO tag.
|
||||||
|
|
||||||
### Processing of the physical documents
|
### Processing of the physical documents
|
||||||
|
|
||||||
@ -733,78 +733,78 @@ Some documents require attention and require you to act on the document.
|
|||||||
You may take two different approaches to handle these documents based on
|
You may take two different approaches to handle these documents based on
|
||||||
how regularly you intend to scan documents and use paperless.
|
how regularly you intend to scan documents and use paperless.
|
||||||
|
|
||||||
- If you scan and process your documents in paperless regularly,
|
- If you scan and process your documents in paperless regularly,
|
||||||
assign a TODO tag to all scanned documents that you need to process.
|
assign a TODO tag to all scanned documents that you need to process.
|
||||||
Create a saved view on the dashboard that shows all documents with
|
Create a saved view on the dashboard that shows all documents with
|
||||||
this tag.
|
this tag.
|
||||||
- If you do not scan documents regularly and use paperless solely for
|
- If you do not scan documents regularly and use paperless solely for
|
||||||
archiving, create a physical todo box next to your physical inbox
|
archiving, create a physical todo box next to your physical inbox
|
||||||
and put documents you need to process in the TODO box. Once you have
|
and put documents you need to process in the TODO box. Once you have
|
||||||
performed the task associated with the document, move it to the
|
performed the task associated with the document, move it to the
|
||||||
inbox.
|
inbox.
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
Paperless-ngx consists of the following components:
|
Paperless-ngx consists of the following components:
|
||||||
|
|
||||||
- **The webserver:** This serves the administration pages, the API,
|
- **The webserver:** This serves the administration pages, the API,
|
||||||
and the new frontend. This is the main tool you'll be using to interact
|
and the new frontend. This is the main tool you'll be using to interact
|
||||||
with paperless. You may start the webserver directly with
|
with paperless. You may start the webserver directly with
|
||||||
|
|
||||||
```shell-session
|
```shell-session
|
||||||
$ cd /path/to/paperless/src/
|
$ cd /path/to/paperless/src/
|
||||||
$ gunicorn -c ../gunicorn.conf.py paperless.wsgi
|
$ gunicorn -c ../gunicorn.conf.py paperless.wsgi
|
||||||
```
|
```
|
||||||
|
|
||||||
or by any other means such as Apache `mod_wsgi`.
|
or by any other means such as Apache `mod_wsgi`.
|
||||||
|
|
||||||
- **The consumer:** This is what watches your consumption folder for
|
- **The consumer:** This is what watches your consumption folder for
|
||||||
documents. However, the consumer itself does not really consume your
|
documents. However, the consumer itself does not really consume your
|
||||||
documents. Instead, it notifies the task processor that a new file is ready
|
documents. Instead, it notifies the task processor that a new file is ready
|
||||||
for consumption, so its name is somewhat misleading. The consumer
|
for consumption, so its name is somewhat misleading. The consumer
|
||||||
also used to check your emails, but that is now done by the task
|
also used to check your emails, but that is now done by the task
|
||||||
processor as well.
|
processor as well.
|
||||||
|
|
||||||
Start the consumer with the management command `document_consumer`:
|
Start the consumer with the management command `document_consumer`:
|
||||||
|
|
||||||
```shell-session
|
```shell-session
|
||||||
$ cd /path/to/paperless/src/
|
$ cd /path/to/paperless/src/
|
||||||
$ python3 manage.py document_consumer
|
$ python3 manage.py document_consumer
|
||||||
```
|
```
|
||||||
|
|
||||||
- **The task processor:** Paperless relies on [Celery - Distributed
|
- **The task processor:** Paperless relies on [Celery - Distributed
|
||||||
Task Queue](https://docs.celeryq.dev/en/stable/index.html) for doing
|
Task Queue](https://docs.celeryq.dev/en/stable/index.html) for doing
|
||||||
most of the heavy lifting. This is a task queue that accepts tasks
|
most of the heavy lifting. This is a task queue that accepts tasks
|
||||||
from multiple sources and processes these in parallel. It also comes
|
from multiple sources and processes these in parallel. It also comes
|
||||||
with a scheduler that executes certain commands periodically.
|
with a scheduler that executes certain commands periodically.
|
||||||
|
|
||||||
This task processor is responsible for:
|
This task processor is responsible for:
|
||||||
|
|
||||||
- Consuming documents. When the consumer finds new documents, it
|
- Consuming documents. When the consumer finds new documents, it
|
||||||
notifies the task processor to start a consumption task.
|
notifies the task processor to start a consumption task.
|
||||||
- The task processor also performs the consumption of any
|
- The task processor also performs the consumption of any
|
||||||
documents you upload through the web interface.
|
documents you upload through the web interface.
|
||||||
- Consuming emails. It periodically checks your configured
|
- Consuming emails. It periodically checks your configured
|
||||||
accounts for new emails and notifies the task processor to
|
accounts for new emails and notifies the task processor to
|
||||||
consume the attachment of an email.
|
consume the attachment of an email.
|
||||||
- Maintaining the search index and the automatic matching
|
- Maintaining the search index and the automatic matching
|
||||||
algorithm. These are things that paperless needs to do from time
|
algorithm. These are things that paperless needs to do from time
|
||||||
to time in order to operate properly.
|
to time in order to operate properly.
|
||||||
|
|
||||||
This allows paperless to process multiple documents from your
|
This allows paperless to process multiple documents from your
|
||||||
consumption folder in parallel! On a modern multi core system, this
|
consumption folder in parallel! On a modern multi core system, this
|
||||||
makes the consumption process with full OCR blazingly fast.
|
makes the consumption process with full OCR blazingly fast.
|
||||||
|
|
||||||
The task processor comes with a built-in admin interface that you
|
The task processor comes with a built-in admin interface that you
|
||||||
can use to check whether any of the tasks have failed and inspect the
|
can use to check whether any of the tasks have failed and inspect the
|
||||||
errors (e.g., wrong email credentials, errors while consuming a
|
errors (e.g., wrong email credentials, errors while consuming a
|
||||||
specific file, etc).
|
specific file, etc).
|
||||||
|
|
||||||
- A [redis](https://redis.io/) message broker: This is a really
|
- A [redis](https://redis.io/) message broker: This is a really
|
||||||
lightweight service that is responsible for getting the tasks from
|
lightweight service that is responsible for getting the tasks from
|
||||||
the webserver and the consumer to the task scheduler. These run in a
|
the webserver and the consumer to the task scheduler. These run in a
|
||||||
different processes (maybe even on different machines!), and
|
different processes (maybe even on different machines!), and
|
||||||
therefore, this is necessary.
|
therefore, this is necessary.
|
||||||
|
|
||||||
- Optional: A database server. Paperless supports PostgreSQL, MariaDB
|
- Optional: A database server. Paperless supports PostgreSQL, MariaDB
|
||||||
and SQLite for storing its data.
|
and SQLite for storing its data.
|
||||||
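The division of labour between the consumer, the message broker and the task processor can be pictured with a small stand-in. This sketch deliberately uses Python's standard `queue` and `threading` modules in place of Redis and Celery, and all function names are hypothetical; it only shows the hand-off pattern, not how paperless-ngx is implemented.

```python
import queue
import threading


def consumer(paths, broker):
    """Stand-in for the consumer: announce new files, do not process them."""
    for path in paths:
        broker.put(path)
    broker.put(None)  # sentinel telling the worker to stop


def task_processor(broker, results):
    """Stand-in for the task processor: do the actual work for each task."""
    while True:
        path = broker.get()
        if path is None:
            break
        results.append(f"consumed {path}")


def run_pipeline(paths):
    """Wire consumer and task processor together through an in-memory broker."""
    broker = queue.Queue()  # plays the role of the Redis message broker
    results = []
    worker = threading.Thread(target=task_processor, args=(broker, results))
    worker.start()
    consumer(paths, broker)
    worker.join()
    return results
```

With a single worker the tasks are handled in the order they were announced; a real deployment would typically run several workers so that documents are processed in parallel.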
|