Further cleanup of docs, including fixing autoconvert issues and general cleanups

Trenton Holmes 2022-12-04 08:34:49 -08:00 committed by Trenton H
parent 32d546740b
commit 7788d93227
10 changed files with 147 additions and 138 deletions


@ -9,7 +9,7 @@ Before making backups, make sure that paperless is not running.
Options available to any installation of paperless: Options available to any installation of paperless:
- Use the [document exporter](administration#exporter). The document exporter exports all your documents, - Use the [document exporter](#exporter). The document exporter exports all your documents,
thumbnails and metadata to a specific folder. You may import your thumbnails and metadata to a specific folder. You may import your
documents into a fresh instance of paperless again or store your documents into a fresh instance of paperless again or store your
documents in another DMS with this export. documents in another DMS with this export.
@ -52,7 +52,7 @@ Options available to bare-metal and non-docker installations:
## Updating Paperless {#updating} ## Updating Paperless {#updating}
### Docker Route ### Docker Route {#docker-updating}
If a new release of paperless-ngx is available, upgrading depends on how If a new release of paperless-ngx is available, upgrading depends on how
you installed paperless-ngx in the first place. The releases are you installed paperless-ngx in the first place. The releases are
@ -70,7 +70,7 @@ After that, [make a backup](#backup).
A. If you pull the image from the docker hub, all you need to do is: A. If you pull the image from the docker hub, all you need to do is:
``` shell-session ```shell-session
$ docker-compose pull $ docker-compose pull
$ docker-compose up $ docker-compose up
``` ```
@ -80,7 +80,7 @@ A. If you pull the image from the docker hub, all you need to do is:
B. If you built the image yourself, do the following: B. If you built the image yourself, do the following:
``` shell-session ```shell-session
$ git pull $ git pull
$ docker-compose build $ docker-compose build
$ docker-compose up $ docker-compose up
@ -131,7 +131,7 @@ the background.
image: ghcr.io/paperless-ngx/paperless-ngx:1.7 image: ghcr.io/paperless-ngx/paperless-ngx:1.7
``` ```
### Bare Metal Route ### Bare Metal Route {#bare-metal-updating}
After grabbing the new release and unpacking the contents, do the After grabbing the new release and unpacking the contents, do the
following: following:
@ -158,7 +158,7 @@ following:
This might not actually do anything. Not every new paperless version This might not actually do anything. Not every new paperless version
comes with new database migrations. comes with new database migrations.
## Downgrading Paperless ## Downgrading Paperless {#downgrade-paperless}
Downgrades are possible. However, some updates also contain database Downgrades are possible. However, some updates also contain database
migrations (these change the layout of the database and may move data). migrations (these change the layout of the database and may move data).
@ -366,7 +366,7 @@ task scheduler.
### Managing filenames {#renamer} ### Managing filenames {#renamer}
If you use paperless' feature to If you use paperless' feature to
[assign custom filenames to your documents](/advanced_usage#file_name_handling), you can use this command to move all your files after [assign custom filenames to your documents](/advanced_usage#file-name-handling), you can use this command to move all your files after
changing the naming scheme. changing the naming scheme.
!!! warning !!! warning
@ -430,9 +430,7 @@ rules.
As of October 2022 Microsoft no longer supports IMAP authentication As of October 2022 Microsoft no longer supports IMAP authentication
for Exchange servers, thus Exchange is no longer supported until a for Exchange servers, thus Exchange is no longer supported until a
solution is implemented in the Python IMAP library used by Paperless. solution is implemented in the Python IMAP library used by Paperless.
See See [learn.microsoft.com](https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online)
[learn.microsoft.com](https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online)
### Creating archived documents {#archiver} ### Creating archived documents {#archiver}


@ -50,7 +50,7 @@ and run another document through the consumer. Once complete, you should
see the newly-created document, automatically tagged with the see the newly-created document, automatically tagged with the
appropriate data. appropriate data.
### Automatic matching {#automatic_matching} ### Automatic matching {#automatic-matching}
Paperless-ngx comes with a new matching algorithm called _Auto_. This Paperless-ngx comes with a new matching algorithm called _Auto_. This
matching algorithm tries to assign tags, correspondents, document types, matching algorithm tries to assign tags, correspondents, document types,
@ -59,8 +59,8 @@ assigned these on existing documents. It uses a neural network under the
hood. hood.
If, for example, all your bank statements of your account 123 at the If, for example, all your bank statements of your account 123 at the
Bank of America are tagged with the tag "bofa*123" and the matching Bank of America are tagged with the tag "bofa123" and the matching
algorithm of this tag is set to \_Auto*, this neural network will examine algorithm of this tag is set to _Auto_, this neural network will examine
your documents and automatically learn when to assign this tag. your documents and automatically learn when to assign this tag.
Paperless tries to hide much of the involved complexity with this Paperless tries to hide much of the involved complexity with this
@ -95,7 +95,7 @@ when using this feature:
of these correspondents to ANY new document, if both are set to of these correspondents to ANY new document, if both are set to
automatic matching. automatic matching.
## Hooking into the consumption process ## Hooking into the consumption process {#consume-hooks}
Sometimes you may want to do something arbitrary whenever a document is Sometimes you may want to do something arbitrary whenever a document is
consumed. Rather than try to predict what you may want to do, Paperless consumed. Rather than try to predict what you may want to do, Paperless
@ -115,7 +115,7 @@ and then put the path to that script in `paperless.conf` or
asynchronously, you'll have to fork the process in your script and asynchronously, you'll have to fork the process in your script and
exit. exit.
### Pre-consumption script ### Pre-consumption script {#pre-consume-script}
Executed after the consumer sees a new document in the consumption Executed after the consumer sees a new document in the consumption
folder, but before any processing of the document is performed. This folder, but before any processing of the document is performed. This
@ -151,7 +151,7 @@ with the newly modified file.
The script's stdout and stderr will be logged line by line to the The script's stdout and stderr will be logged line by line to the
webserver log, along with the exit code of the script. webserver log, along with the exit code of the script.
### Post-consumption script {#post_consume_script} ### Post-consumption script {#post-consume-script}
Executed after the consumer has successfully processed a document and Executed after the consumer has successfully processed a document and
has moved it into paperless. It receives the following environment has moved it into paperless. It receives the following environment
@ -181,33 +181,34 @@ The post consumption script cannot cancel the consumption process.
The script's stdout and stderr will be logged line by line to the The script's stdout and stderr will be logged line by line to the
webserver log, along with the exit code of the script. webserver log, along with the exit code of the script.
#### Docker ### Docker {#docker-consume-hooks}
Assumed you have To hook into the consumption process when using Docker, you
`/home/foo/paperless-ngx/scripts/post-consumption-example.sh`. will need to pass the scripts into the container via a host mount
in your `docker-compose.yml`.
You can pass that script into the consumer container via a host mount in Assuming you have
your `docker-compose.yml`. `/home/paperless-ngx/scripts/post-consumption-example.sh` as a
script which you'd like to run.
```bash You can pass that script into the consumer container via a host mount:
```yaml
... ...
consumer: webserver:
... ...
volumes: volumes:
... ...
- /home/paperless-ngx/scripts:/path/in/container/scripts/ - /home/paperless-ngx/scripts:/path/in/container/scripts/ # (1)!
environment: # (3)!
...
PAPERLESS_POST_CONSUME_SCRIPT: /path/in/container/scripts/post-consumption-example.sh # (2)!
... ...
``` ```
Example (docker-compose.yml): 1. The external scripts directory is mounted to a location inside the container.
`- /home/foo/paperless-ngx/scripts:/usr/src/paperless/scripts` 2. The internal location of the script is used to set the script to run
3. This can also be set in `docker-compose.env`
which in turn requires the variable `PAPERLESS_POST_CONSUME_SCRIPT` in
`docker-compose.env` to point to
`/path/in/container/scripts/post-consumption-example.sh`.
Example (docker-compose.env):
`PAPERLESS_POST_CONSUME_SCRIPT=/usr/src/paperless/scripts/post-consumption-example.sh`
Troubleshooting: Troubleshooting:
@ -218,7 +219,7 @@ Troubleshooting:
- Pipe your script's output to a log file e.g. - Pipe your script's output to a log file e.g.
`echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log` `echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log`
## File name handling {#file_name_handling} ## File name handling {#file-name-handling}
By default, paperless stores your documents in the media directory and By default, paperless stores your documents in the media directory and
renames them using the identifier which it has assigned to each renames them using the identifier which it has assigned to each
@ -316,7 +317,7 @@ value.
Paperless checks the filename of a document whenever it is saved. Paperless checks the filename of a document whenever it is saved.
Therefore, you need to update the filenames of your documents and move Therefore, you need to update the filenames of your documents and move
them after altering this setting by invoking the them after altering this setting by invoking the
[`document renamer <utilities-renamer>`](). [`document renamer`](administration#renamer).
!!! warning !!! warning
@ -344,7 +345,7 @@ When as single storage layout is not sufficient for your use case,
storage paths come to the rescue. Storage paths allow you to configure storage paths come to the rescue. Storage paths allow you to configure
more precisely where each document is stored in the file system. more precisely where each document is stored in the file system.
- Each storage path is a [PAPERLESS_FILENAME_FORMAT]{.title-ref} and - Each storage path is a `PAPERLESS_FILENAME_FORMAT` and
follows the rules described above follows the rules described above
- Each document is assigned a storage path using the matching - Each document is assigned a storage path using the matching
algorithms described above, but can be overwritten at any time algorithms described above, but can be overwritten at any time
@ -352,7 +353,7 @@ more precisely where each document is stored in the file system.
For example, you could define the following two storage paths: For example, you could define the following two storage paths:
1. Normal communications are put into a folder structure sorted by 1. Normal communications are put into a folder structure sorted by
[year/correspondent]{.title-ref} `year/correspondent`
2. Communications with insurance companies are stored in a flat 2. Communications with insurance companies are stored in a flat
structure with longer file names, but containing the full date of structure with longer file names, but containing the full date of
the correspondence. the correspondence.
@ -384,7 +385,7 @@ structure as in the previous example above.
!!! tip !!! tip
Defining a storage path is optional. If no storage path is defined for a Defining a storage path is optional. If no storage path is defined for a
document, the global [PAPERLESS_FILENAME_FORMAT]{.title-ref} is applied. document, the global `PAPERLESS_FILENAME_FORMAT` is applied.
!!! warning !!! warning
@ -403,27 +404,32 @@ queued and completed tasks, timing and more. Flower can also be used
with Prometheus, as it exports metrics. For details on its capabilities, with Prometheus, as it exports metrics. For details on its capabilities,
refer to the Flower documentation. refer to the Flower documentation.
To configure Flower further, create a [flowerconfig.py]{.title-ref} and To configure Flower further, create a `flowerconfig.py` and
place it into the [src/paperless]{.title-ref} directory. For a Docker place it into the `src/paperless` directory. For a Docker
installation, you can use volumes to accomplish this: installation, you can use volumes to accomplish this:
```yaml ```yaml
services: services:
# ... # ...
webserver: webserver:
ports:
- 5555:5555 # (2)!
# ... # ...
volumes: volumes:
- /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro - /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro # (1)!
``` ```
1. Note the `:ro` tag means the file will be mounted as read only.
2. `flower` runs by default on port 5555, but this can be configured
## Custom Container Initialization ## Custom Container Initialization
The Docker image includes the ability to run custom user scripts during The Docker image includes the ability to run custom user scripts during
startup. This could be utilized for installing additional tools or startup. This could be utilized for installing additional tools or
Python packages, for example. Python packages, for example. Scripts are expected to be shell scripts.
To utilize this, mount a folder containing your scripts to the custom To utilize this, mount a folder containing your scripts to the custom
initialization directory, [/custom-cont-init.d]{.title-ref} and place initialization directory, `/custom-cont-init.d` and place
scripts you wish to run inside. For security, the folder must be owned scripts you wish to run inside. For security, the folder must be owned
by `root` and should have permissions of `a=rx`. Additionally, scripts by `root` and should have permissions of `a=rx`. Additionally, scripts
must only be writable by `root`. must only be writable by `root`.
@ -445,9 +451,11 @@ services:
webserver: webserver:
# ... # ...
volumes: volumes:
- /path/to/my/scripts:/custom-cont-init.d:ro - /path/to/my/scripts:/custom-cont-init.d:ro # (1)!
``` ```
1. Note the `:ro` tag means the folder will be mounted as read only. This is for extra security against changes
## MySQL Caveats {#mysql-caveats} ## MySQL Caveats {#mysql-caveats}
### Case Sensitivity ### Case Sensitivity


@ -225,7 +225,7 @@ Query parameters:
Results returned by the endpoint are ordered by importance of the term Results returned by the endpoint are ordered by importance of the term
in the document index. The first result is the term that has the highest in the document index. The first result is the term that has the highest
Tf/Idf score in the index. [Tf/Idf](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) score in the index.
```json ```json
["term1", "term3", "term6", "term4"] ["term1", "term3", "term6", "term4"]


@ -33,15 +33,15 @@ matcher.
[More information on securing your Redis [More information on securing your Redis
Instance](https://redis.io/docs/getting-started/#securing-redis). Instance](https://redis.io/docs/getting-started/#securing-redis).
Defaults to <redis://localhost:6379>. Defaults to `redis://localhost:6379`.
`PAPERLESS_DBENGINE=<engine_name>` `PAPERLESS_DBENGINE=<engine_name>`
: Optional, gives the ability to choose Postgres or MariaDB for : Optional, gives the ability to choose Postgres or MariaDB for
database engine. Available options are [postgresql]{.title-ref} and database engine. Available options are `postgresql` and
[mariadb]{.title-ref}. `mariadb`.
Default is [postgresql]{.title-ref}. Default is `postgresql`.
!!! warning !!! warning
@ -150,25 +150,25 @@ files created using "collectstatic" manager command are stored.
`PAPERLESS_FILENAME_FORMAT=<format>` `PAPERLESS_FILENAME_FORMAT=<format>`
: Changes the filenames paperless uses to store documents in the media : Changes the filenames paperless uses to store documents in the media
directory. See [File name handling](advanced_usage#file_name_handling) for details. directory. See [File name handling](advanced_usage#file-name-handling) for details.
Default is none, which disables this feature. Default is none, which disables this feature.
`PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=<bool>` `PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=<bool>`
: Tells paperless to replace placeholders in : Tells paperless to replace placeholders in
[PAPERLESS_FILENAME_FORMAT]{.title-ref} that would resolve to `PAPERLESS_FILENAME_FORMAT` that would resolve to
'none' to be omitted from the resulting filename. This also holds 'none' to be omitted from the resulting filename. This also holds
true for directory names. See [File name handling](advanced_usage#file_name_handling) for true for directory names. See [File name handling](advanced_usage#file-name-handling) for
details. details.
Defaults to [false]{.title-ref} which disables this feature. Defaults to `false` which disables this feature.
`PAPERLESS_LOGGING_DIR=<path>` `PAPERLESS_LOGGING_DIR=<path>`
: This is where paperless will store log files. : This is where paperless will store log files.
Defaults to "`PAPERLESS_DATA_DIR`/log/". Defaults to `PAPERLESS_DATA_DIR/log/`.
## Logging ## Logging
@ -283,10 +283,10 @@ login with the selected user.
: If this environment variable is specified, Paperless automatically : If this environment variable is specified, Paperless automatically
creates a superuser with the provided username at start. This is creates a superuser with the provided username at start. This is
useful in cases where you can not run the useful in cases where you can not run the
[createsuperuser]{.title-ref} command separately, such as Kubernetes `createsuperuser` command separately, such as Kubernetes
or AWS ECS. or AWS ECS.
Requires [PAPERLESS_ADMIN_PASSWORD]{.title-ref} to be set. Requires PAPERLESS_ADMIN_PASSWORD to be set.
!!! note !!! note
@ -297,13 +297,13 @@ or AWS ECS.
`PAPERLESS_ADMIN_MAIL=<email>` `PAPERLESS_ADMIN_MAIL=<email>`
: (Optional) Specify superuser email address. Only used when : (Optional) Specify superuser email address. Only used when
[PAPERLESS_ADMIN_USER]{.title-ref} is set. PAPERLESS_ADMIN_USER is set.
Defaults to `root@localhost`. Defaults to `root@localhost`.
`PAPERLESS_ADMIN_PASSWORD=<password>` `PAPERLESS_ADMIN_PASSWORD=<password>`
: Only used when [PAPERLESS_ADMIN_USER]{.title-ref} is set. This will : Only used when PAPERLESS_ADMIN_USER is set. This will
be the password of the automatically created superuser. be the password of the automatically created superuser.
`PAPERLESS_COOKIE_PREFIX=<str>` `PAPERLESS_COOKIE_PREFIX=<str>`
@ -331,26 +331,25 @@ applications.
If you're exposing paperless to the internet directly, do not use If you're exposing paperless to the internet directly, do not use
this. this.
Also see the warning [in the official documentation Also see the warning [in the official documentation](https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration).
<https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration>]{.title-ref}.
Defaults to [false]{.title-ref} which disables this feature. Defaults to "false" which disables this feature.
`PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>` `PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>`
: If [PAPERLESS_ENABLE_HTTP_REMOTE_USER]{.title-ref} is enabled, this : If "PAPERLESS_ENABLE_HTTP_REMOTE_USER" is enabled, this
property allows to customize the name of the HTTP header from which property allows to customize the name of the HTTP header from which
the authenticated username is extracted. Values are in terms of the authenticated username is extracted. Values are in terms of
\[HttpRequest.META\](<https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META>). [HttpRequest.META](https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META).
Thus, the configured value must start with [HTTP\_]{.title-ref} Thus, the configured value must start with `HTTP_`
followed by the normalized actual header name. followed by the normalized actual header name.
Defaults to [HTTP_REMOTE_USER]{.title-ref}. Defaults to "HTTP_REMOTE_USER".
`PAPERLESS_LOGOUT_REDIRECT_URL=<str>` `PAPERLESS_LOGOUT_REDIRECT_URL=<str>`
: URL to redirect the user to after a logout. This can be used : URL to redirect the user to after a logout. This can be used
together with [PAPERLESS_ENABLE_HTTP_REMOTE_USER]{.title-ref} to together with PAPERLESS_ENABLE_HTTP_REMOTE_USER to
redirect the user back to the SSO application's logout page. redirect the user back to the SSO application's logout page.
Defaults to None, which disables this feature. Defaults to None, which disables this feature.
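For `PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME`, the "normalized actual header name" follows Django's `HttpRequest.META` convention: the header name is upper-cased, hyphens become underscores, and `HTTP_` is prepended. A small sketch of that mapping; `header_to_meta_key` is an illustrative name, not a paperless or Django function:

```python
def header_to_meta_key(header_name: str) -> str:
    """Convert an HTTP header name to its Django META key (illustrative helper)."""
    return "HTTP_" + header_name.upper().replace("-", "_")


# A proxy that sends "Remote-User" matches the documented default value.
print(header_to_meta_key("Remote-User"))
```

If your reverse proxy sends a different header, for example `X-Forwarded-User`, the value to configure would be the result of the same conversion.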
@ -368,7 +367,7 @@ needs.
parsing documents. parsing documents.
It should be a 3-letter language code consistent with ISO 639: It should be a 3-letter language code consistent with ISO 639:
<https://www.loc.gov/standards/iso639-2/php/code_list.php> https://www.loc.gov/standards/iso639-2/php/code_list.php
Set this to the language most of your documents are written in. Set this to the language most of your documents are written in.
@ -624,8 +623,7 @@ Add the configuration variables to the environment of the webserver
and add the additional services below the webserver service. Watch out and add the additional services below the webserver service. Watch out
for indentation. for indentation.
Make sure to use the correct format [PAPERLESS_TIKA_ENABLED = Make sure to use the correct format `PAPERLESS_TIKA_ENABLED = 1` so python_dotenv can parse the statement correctly.
1]{.title-ref} so python_dotenv can parse the statement correctly.
## Software tweaks {#software_tweaks} ## Software tweaks {#software_tweaks}
@ -648,7 +646,7 @@ paperless will process in parallel on a single document.
Ensure that the product Ensure that the product
`PAPERLESS_TASK_WORKERS \: PAPERLESS_THREADS_PER_WORKER` `PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER`
does not exceed your CPU core count or else paperless will be does not exceed your CPU core count or else paperless will be
extremely slow. If you want paperless to process many documents in extremely slow. If you want paperless to process many documents in
@ -752,7 +750,7 @@ consumption directory as well.
`PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=<bool>` `PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=<bool>`
: Set the names of subdirectories as tags for consumed files. E.g. : Set the names of subdirectories as tags for consumed files. E.g.
<CONSUMPTION_DIR>/foo/bar/file.pdf will add the tags "foo" and `<CONSUMPTION_DIR>/foo/bar/file.pdf` will add the tags "foo" and
"bar" to the consumed file. Paperless will create any tags that "bar" to the consumed file. Paperless will create any tags that
don't exist yet. don't exist yet.
@ -827,7 +825,7 @@ documents.
: After a document is consumed, Paperless can trigger an arbitrary : After a document is consumed, Paperless can trigger an arbitrary
script if you like. This script will be passed a number of arguments script if you like. This script will be passed a number of arguments
for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post_consume_script). for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post-consume-script).
The default is blank, which means nothing will be executed. The default is blank, which means nothing will be executed.
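As a rough illustration of what a post-consumption script might look like, here is a minimal Python sketch. It assumes the script is marked executable and that the document ID is available in the `DOCUMENT_ID` environment variable, as used in the troubleshooting example of the advanced usage page. The function name and message text are made up for this example.

```python
#!/usr/bin/env python3
import os


def build_log_line(environ):
    """Build a one-line message from the environment passed to the script."""
    # DOCUMENT_ID is assumed to be set by paperless; fall back if it is missing.
    document_id = environ.get("DOCUMENT_ID", "<unknown>")
    return f"consumed document {document_id}"


if __name__ == "__main__":
    # Anything printed here ends up in the webserver log, line by line.
    print(build_log_line(os.environ))
```

Point `PAPERLESS_POST_CONSUME_SCRIPT` at the path of such a script to have it run after each consumed document.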
@ -841,8 +839,7 @@ option as specified in
The filename will be checked first, and if nothing is found, the The filename will be checked first, and if nothing is found, the
document text will be checked as normal. document text will be checked as normal.
A date in a filename must have some separators ([.]{.title-ref}, A date in a filename must have some separators (`.`, `,`, `-`, `/`, etc) for it to be parsed.
[-]{.title-ref}, [/]{.title-ref}, etc) for it to be parsed.
Defaults to none, which disables this feature. Defaults to none, which disables this feature.
@ -928,7 +925,7 @@ the literal path for that program.
These options don't have any effect in `paperless.conf`. These options These options don't have any effect in `paperless.conf`. These options
adjust the behavior of the docker container. Configure these in adjust the behavior of the docker container. Configure these in
[docker-compose.env]{.title-ref}. `docker-compose.env`.
`PAPERLESS_WEBSERVER_WORKERS=<num>` `PAPERLESS_WEBSERVER_WORKERS=<num>`
@ -946,7 +943,7 @@ increase RAM usage.
There are special setups where you may need to configure this value There are special setups where you may need to configure this value
to restrict the IP address or interface the webserver listens on. to restrict the IP address or interface the webserver listens on.
Defaults to \[::\], meaning all interfaces, including IPv6. Defaults to `[::]`, meaning all interfaces, including IPv6.
`PAPERLESS_PORT=<port>` `PAPERLESS_PORT=<port>`


@ -39,16 +39,16 @@ guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTIN
## Code formatting with pre-commit Hooks ## Code formatting with pre-commit Hooks
To ensure a consistent style and formatting across the project source, To ensure a consistent style and formatting across the project source,
the project utilizes a Git [pre-commit]{.title-ref} hook to perform some the project utilizes a Git [`pre-commit`](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks)
formatting and linting before a commit is allowed. That way, everyone hook to perform some formatting and linting before a commit is allowed.
uses the same style and some common issues can be caught early on. See That way, everyone uses the same style and some common issues can be caught
below for installation instructions. early on. See below for installation instructions.
Once installed, hooks will run when you commit. If the formatting isn't Once installed, hooks will run when you commit. If the formatting isn't
quite right or a linter catches something, the commit will be rejected. quite right or a linter catches something, the commit will be rejected.
You'll need to look at the output and fix the issue. Some hooks, such You'll need to look at the output and fix the issue. Some hooks, such
as the Python formatting tool [black]{.title-ref}, will format failing as the Python formatting tool `black`, will format failing
files, so all you need to do is [git add]{.title-ref} those files again files, so all you need to do is `git add` those files again
and retry your commit. and retry your commit.
## Initial setup and first start ## Initial setup and first start
@ -58,7 +58,7 @@ first-time setup. To do the setup you need to perform the steps from the
following chapters in a certain order: following chapters in a certain order:
1. Install prerequisites + pipenv as mentioned in 1. Install prerequisites + pipenv as mentioned in
`[Bare metal route](/setup#bare_metal) [Bare metal route](/setup#bare_metal)
2. Copy `paperless.conf.example` to `paperless.conf` and enable debug 2. Copy `paperless.conf.example` to `paperless.conf` and enable debug
mode. mode.
@ -69,7 +69,7 @@ following chapters in a certain order:
$ npm install -g @angular/cli $ npm install -g @angular/cli
``` ```
4. Install pre-commit 4. Install pre-commit hooks
```shell-session ```shell-session
pre-commit install pre-commit install
@ -81,7 +81,7 @@ following chapters in a certain order:
mkdir -p consume media mkdir -p consume media
``` ```
6. You can now either \... 6. You can now either ...
- install redis or - install redis or
@ -91,9 +91,9 @@ following chapters in a certain order:
- spin up a bare redis container - spin up a bare redis container
> ```shell-session ```shell-session
> docker run -d -p 6379:6379 --restart unless-stopped redis:latest docker run -d -p 6379:6379 --restart unless-stopped redis:latest
> ``` ```
7. Install the Python dependencies by running the following in the src/ directory. 7. Install the Python dependencies by running the following in the src/ directory.
@ -101,10 +101,12 @@ following chapters in a certain order:
pipenv install --dev pipenv install --dev
``` ```
> - Make sure you're using python 3.9.x or lower. Otherwise you might !!! note
> get issues with building dependencies. You can use
> [pyenv](https://github.com/pyenv/pyenv) to install a specific Make sure you're using python 3.10.x or lower. Otherwise you might
> python version. get issues with building dependencies. You can use
[pyenv](https://github.com/pyenv/pyenv) to install a specific
python version.
8. Generate the static UI so you can perform a login to get a session 8. Generate the static UI so you can perform a login to get a session
that is required for frontend development (this needs to be done one that is required for frontend development (this needs to be done one
@ -126,9 +128,9 @@ following chapters in a certain order:
you're developing for, you need to have some or all of them you're developing for, you need to have some or all of them
running. running.
> ```shell-session ```shell-session
> python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
> ``` ```
11. Login with the superuser credentials provided in step 8 at 11. Login with the superuser credentials provided in step 8 at
`http://localhost:8000` to create a session that enables you to use `http://localhost:8000` to create a session that enables you to use
@ -140,15 +142,15 @@ development go to `/src-ui` and run `ng serve`. From there you can use
## Back end development ## Back end development
The backend is a django application. PyCharm works well for development, The backend is a [Django](https://www.djangoproject.com/) application. PyCharm works well for development,
but you can use whatever you want. but you can use whatever you want.
Configure the IDE to use the src/ folder as the base source folder. Configure the IDE to use the src/ folder as the base source folder.
Configure the following launch configurations in your IDE: Configure the following launch configurations in your IDE:
- python3 manage.py runserver - `python3 manage.py runserver`
- celery \--app paperless worker - `celery --app paperless worker`
- python3 manage.py document_consumer - `python3 manage.py document_consumer`
To start them all: To start them all:
@ -158,24 +160,26 @@ python3 manage.py runserver & python3 manage.py document_consumer & celery --app
Testing and code style: Testing and code style:
- Run `pytest` in the src/ directory to execute all tests. This also - Run `pytest` in the `src/` directory to execute all tests. This also
generates an HTML coverage report. When running tests, paperless.conf generates an HTML coverage report. When running tests, paperless.conf
is loaded as well. However: the tests rely on the default is loaded as well. However: the tests rely on the default
configuration. This is not ideal. But for now, make sure no settings configuration. This is not ideal. But for now, make sure no settings
except for DEBUG are overridden when testing. except for DEBUG are overridden when testing.
- Coding style is enforced by the Git pre-commit hooks. These will - Coding style is enforced by the Git pre-commit hooks. These will
ensure your code is formatted and do some linting when you do a [git ensure your code is formatted and do some linting when you do a `git commit`.
commit]{.title-ref}.
- You can also run `black` manually to format your code - You can also run `black` manually to format your code
!!! note - The `pre-commit` hooks will modify files and interact with each other.
It may take a couple of `git add`, `git commit` cycles to satisfy them.
!!! note
The line length rule E501 is generally useful for getting multiple The line length rule E501 is generally useful for getting multiple
source files next to each other on the screen. However, in some source files next to each other on the screen. However, in some
cases, it's just not possible to make some lines fit, especially cases, it's just not possible to make some lines fit, especially
complicated IF cases. Append `# NOQA: E501` to disable this check complicated IF cases. Append `# noqa: E501` to disable this check
for certain lines. for certain lines.
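
A minimal sketch of the marker described in the note above. The function name and the string are hypothetical and only serve as an illustration; they are not taken from the paperless-ngx code base. The trailing comment suppresses the line-length check for that one line only:

```python
def build_notice() -> str:
    # Hypothetical example: the literal below is deliberately longer than the
    # configured line length, so the E501 check is silenced for this line only.
    return "This deliberately long string literal stands in for a line that is awkward to wrap, such as a long URL or message."  # noqa: E501
```

The marker applies per line, so every other line in the file is still checked.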
## Front end development ## Front end development
@ -353,7 +357,8 @@ LANGUAGES = [
## Building the documentation ## Building the documentation
The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/). If you want to build the documentation locally, this is how you do it: The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/).
If you want to build the documentation locally, this is how you do it:
1. Install python dependencies. 1. Install python dependencies.
@ -366,7 +371,7 @@ The documentation is built using material-mkdocs, see their [documentation](http
```shell-session ```shell-session
$ cd /path/to/paperless $ cd /path/to/paperless
$ pipenv mkdocs build $ pipenv mkdocs build --config-file mkdocs.yml
``` ```
## Building the Docker image ## Building the Docker image
@ -379,9 +384,9 @@ helper script `build-docker-image.sh`.
Building the docker image from source: Building the docker image from source:
> ```shell-session ```shell-session
> ./build-docker-image.sh Dockerfile -t <your-tag> ./build-docker-image.sh Dockerfile -t <your-tag>
> ``` ```
## Extending Paperless ## Extending Paperless
@ -428,7 +433,7 @@ class MyCustomParser(DocumentParser):
def get_thumbnail(self, document_path, mime_type): def get_thumbnail(self, document_path, mime_type):
# This should return the path to a thumbnail you created for this # This should return the path to a thumbnail you created for this
# document. # document.
return os.path.join(self.tempdir, "thumb.png") return os.path.join(self.tempdir, "thumb.webp")
``` ```
If you encounter any issues during parsing, raise a If you encounter any issues during parsing, raise a

@ -1,6 +1,6 @@
# Frequently Asked Questions # Frequently Asked Questions
### _What's the general plan for Paperless-ngx?_ ## _What's the general plan for Paperless-ngx?_
**A:** While Paperless-ngx is already considered largely **A:** While Paperless-ngx is already considered largely
"feature-complete" it is a community-driven project and development "feature-complete" it is a community-driven project and development
@ -9,7 +9,7 @@ discussions and "up-voted" by the community but this is not a
guarantee the feature will be implemented. This project will always be guarantee the feature will be implemented. This project will always be
open to collaboration in the form of PRs, ideas etc. open to collaboration in the form of PRs, ideas etc.
### _I'm using docker. Where are my documents?_ ## _I'm using docker. Where are my documents?_
**A:** Your documents are stored inside the docker volume **A:** Your documents are stored inside the docker volume
`paperless_media`. Docker manages this volume automatically for you. It `paperless_media`. Docker manages this volume automatically for you. It
@ -27,9 +27,7 @@ system. On Linux, chances are high that this location is
files around manually. This folder is meant to be entirely managed by files around manually. This folder is meant to be entirely managed by
docker and paperless. docker and paperless.
### Let's say I want to switch tools in a year. Can I easily move ## Let's say I want to switch tools in a year. Can I easily move to other systems?
to other systems?\*
**A:** Your documents are stored as plain files inside the media folder. **A:** Your documents are stored as plain files inside the media folder.
You can always drag those files out of that folder to use them You can always drag those files out of that folder to use them
@ -41,17 +39,17 @@ elsewhere. Here are a couple notes about that.
- By default, paperless uses the internal ID of each document as its - By default, paperless uses the internal ID of each document as its
filename. This might not be very convenient for export. However, you filename. This might not be very convenient for export. However, you
can adjust the way files are stored in paperless by can adjust the way files are stored in paperless by
[configuring the filename format](advanced_usage#file_name_handling). [configuring the filename format](advanced_usage#file-name-handling).
- [The exporter](administration#exporter) is - [The exporter](administration#exporter) is
another easy way to get your files out of paperless with reasonable another easy way to get your files out of paperless with reasonable
file names. file names.
### _What file types does paperless-ngx support?_ ## _What file types does paperless-ngx support?_
**A:** Currently, the following files are supported: **A:** Currently, the following files are supported:
- PDF documents, PNG images, JPEG images, TIFF images and GIF images - PDF documents, PNG images, JPEG images, TIFF images, GIF images and
are processed with OCR and converted into PDF documents. WebP images are processed with OCR and converted into PDF documents.
- Plain text documents are supported as well and are added verbatim to - Plain text documents are supported as well and are added verbatim to
paperless. paperless.
- With the optional Tika integration enabled (see [Tika configuration](configuration#tika), - With the optional Tika integration enabled (see [Tika configuration](configuration#tika),
@ -61,7 +59,7 @@ elsewhere. Here are a couple notes about that.
Paperless-ngx determines the type of a file by inspecting its content. Paperless-ngx determines the type of a file by inspecting its content.
The file extensions do not matter. The file extensions do not matter.
### _Will paperless-ngx run on Raspberry Pi?_ ## _Will paperless-ngx run on Raspberry Pi?_
**A:** The short answer is yes. I've tested it on a Raspberry Pi 3 B. **A:** The short answer is yes. I've tested it on a Raspberry Pi 3 B.
The long answer is that certain parts of Paperless will run very slow, The long answer is that certain parts of Paperless will run very slow,
@ -73,11 +71,11 @@ has to do much less work to serve the data.
!!! note !!! note
You can adjust some of the settings so that paperless uses less You can adjust some of the settings so that paperless uses less
processing power. See [setup](setup#less_powerful_devices) for details. processing power. See [setup](setup#less-powerful-devices) for details.
### _How do I install paperless-ngx on Raspberry Pi?_ ## _How do I install paperless-ngx on Raspberry Pi?_
**A:** Docker images are available for arm and arm64 hardware, so just **A:** Docker images are available for armv7 and arm64 hardware, so just
follow the docker-compose instructions. Apart from more required disk follow the docker-compose instructions. Apart from more required disk
space compared to a bare metal installation, docker comes with close to space compared to a bare metal installation, docker comes with close to
zero overhead, even on Raspberry Pi. zero overhead, even on Raspberry Pi.
@ -87,13 +85,13 @@ the python requirements do not have precompiled packages for ARM /
ARM64. Installation of these will require additional development ARM64. Installation of these will require additional development
libraries and compilation will take a long time. libraries and compilation will take a long time.
### _How do I run this on Unraid?_ ## _How do I run this on Unraid?_
**A:** Paperless-ngx is available as [community **A:** Paperless-ngx is available as [community
app](https://unraid.net/community/apps?q=paperless-ngx) in Unraid. [Uli app](https://unraid.net/community/apps?q=paperless-ngx) in Unraid. [Uli
Fahrer](https://github.com/Tooa) created a container template for that. Fahrer](https://github.com/Tooa) created a container template for that.
### _How do I run this on my toaster?_ ## _How do I run this on my toaster?_
**A:** I honestly don't know! As for all other devices that might be **A:** I honestly don't know! As for all other devices that might be
able to run paperless, you're a bit on your own. If you can't run the able to run paperless, you're a bit on your own. If you can't run the
@ -103,11 +101,11 @@ This is also what I use to test new releases with. Apart from that, I
also have a Raspberry Pi, which I occasionally build the image on and also have a Raspberry Pi, which I occasionally build the image on and
see if it works. see if it works.
### _How do I proxy this with NGINX?_ ## _How do I proxy this with NGINX?_
**A:** See [here](setup#nginx). **A:** See [here](setup#nginx).
### _How do I get WebSocket support with Apache mod_wsgi_? ## _How do I get WebSocket support with Apache mod_wsgi_?
**A:** `mod_wsgi` by itself does not support ASGI. Paperless will **A:** `mod_wsgi` by itself does not support ASGI. Paperless will
continue to work with WSGI, but certain features such as status continue to work with WSGI, but certain features such as status

@ -50,7 +50,7 @@ If you want to learn about what's different in paperless-ngx from
Paperless, check out these resources in the documentation: Paperless, check out these resources in the documentation:
- [Some screenshots](#screenshots) of the new UI are available. - [Some screenshots](#screenshots) of the new UI are available.
- Read [this section](/advanced_usage/#advanced-automatic_matching) if you want to learn about how paperless automates all - Read [this section](/advanced_usage/#advanced-automatic-matching) if you want to learn about how paperless automates all
tagging using machine learning. tagging using machine learning.
- Paperless now comes with a [proper email consumer](/usage/#usage-email) that's fully tested and production ready. - Paperless now comes with a [proper email consumer](/usage/#usage-email) that's fully tested and production ready.
- Paperless creates searchable PDF/A documents from whatever you put into the consumption directory. This means - Paperless creates searchable PDF/A documents from whatever you put into the consumption directory. This means

@ -767,7 +767,7 @@ After that, you need to clear your cookies (Paperless-ngx comes with
updated dependencies that do cookie-processing differently) and probably updated dependencies that do cookie-processing differently) and probably
your cache as well. your cache as well.
# Considerations for less powerful devices {#less_powerful_devices} # Considerations for less powerful devices {#less-powerful-devices}
Paperless runs on Raspberry Pi. However, some things are rather slow on Paperless runs on Raspberry Pi. However, some things are rather slow on
the Pi and configuring some options in paperless can help improve the Pi and configuring some options in paperless can help improve
@ -803,7 +803,7 @@ For details, refer to [configuration](configuration).
!!! note !!! note
Updating the Updating the
[automatic matching algorithm](/advanced_usage#automatic_matching) takes quite a bit of time. However, the update mechanism [automatic matching algorithm](/advanced_usage#automatic-matching) takes quite a bit of time. However, the update mechanism
checks if your data has changed before doing the heavy lifting. If you checks if your data has changed before doing the heavy lifting. If you
experience the algorithm taking too much cpu time, consider changing the experience the algorithm taking too much cpu time, consider changing the
schedule in the admin interface to daily. You can also manually invoke schedule in the admin interface to daily. You can also manually invoke

@ -1,9 +1,9 @@
# Usage Overview # Usage Overview
Paperless is an application that manages your personal documents. With Paperless is an application that manages your personal documents. With
the help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)), paperless transforms your wieldy physical document binders the help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)),
into a searchable archive and provides many utilities for finding and paperless transforms your unwieldy physical document binders into a searchable archive
managing your documents. and provides many utilities for finding and managing your documents.
## Terms and definitions ## Terms and definitions
@ -37,7 +37,7 @@ Each document has a couple of fields that you can assign to them:
date you signed a contract, or the date a letter was sent to you. date you signed a contract, or the date a letter was sent to you.
- The _archive serial number_ (short: ASN) of a document is the - The _archive serial number_ (short: ASN) of a document is the
identifier of the document in your physical document binders. See identifier of the document in your physical document binders. See
[recommended workflow](#usage-reccomended_workflow) below. [recommended workflow](#usage-recommended-workflow) below.
- The _content_ of a document is the text that was OCR'ed from the - The _content_ of a document is the text that was OCR'ed from the
document. This text is fed into the search engine and is used for document. This text is fed into the search engine and is used for
matching tags, correspondents and document types. matching tags, correspondents and document types.
@ -74,8 +74,8 @@ following operations on your documents:
### The consumption directory ### The consumption directory
The primary method of getting documents into your database is by putting The primary method of getting documents into your database is by putting
them in the consumption directory. The consumer runs in an infinite them in the consumption directory. The consumer waits patiently, looking
loop, looking for new additions to this directory. When it finds them, for new additions to this directory. When it finds them,
the consumer goes about the process of parsing them with the OCR, the consumer goes about the process of parsing them with the OCR,
indexing what it finds, and storing it in the media directory. indexing what it finds, and storing it in the media directory.
@ -99,7 +99,7 @@ dragging-and-dropping files into your browser window.
### Mobile upload {#usage-mobile_upload} ### Mobile upload {#usage-mobile_upload}
The mobile app over at <https://github.com/qcasey/paperless_share> The mobile app over at [https://github.com/qcasey/paperless_share](https://github.com/qcasey/paperless_share)
allows Android users to share any documents with paperless. This can be allows Android users to share any documents with paperless. This can be
combined with any of the mobile scanning apps out there, such as Office combined with any of the mobile scanning apps out there, such as Office
Lens. Lens.
@ -325,7 +325,7 @@ language](https://whoosh.readthedocs.io/en/latest/querylang.html). For
details on what date parsing utilities are available, see [Date details on what date parsing utilities are available, see [Date
parsing](https://whoosh.readthedocs.io/en/latest/dates.html#parsing-date-queries). parsing](https://whoosh.readthedocs.io/en/latest/dates.html#parsing-date-queries).
## The recommended workflow {#usage-recommended_workflow} ## The recommended workflow {#usage-recommended-workflow}
Once you have familiarized yourself with paperless and are ready to use Once you have familiarized yourself with paperless and are ready to use
it for all your documents, the recommended workflow for managing your it for all your documents, the recommended workflow for managing your

@ -23,6 +23,7 @@ theme:
- navigation.tabs - navigation.tabs
- navigation.top - navigation.top
- toc.integrate - toc.integrate
- content.code.annotate
icon: icon:
repo: fontawesome/brands/github repo: fontawesome/brands/github
favicon: assets/favicon.png favicon: assets/favicon.png
@ -39,6 +40,8 @@ markdown_extensions:
- pymdownx.highlight: - pymdownx.highlight:
anchor_linenums: true anchor_linenums: true
- pymdownx.superfences - pymdownx.superfences
- pymdownx.inlinehilite
strict: true
nav: nav:
- index.md - index.md
- setup.md - setup.md