mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
Further cleanup of docs, including fixing autoconvert issues and general cleanups
parent
32d546740b
commit
7788d93227
@ -9,7 +9,7 @@ Before making backups, make sure that paperless is not running.
|
||||
|
||||
Options available to any installation of paperless:
|
||||
|
||||
- Use the [document exporter](administration#exporter). The document exporter exports all your documents,
|
||||
- Use the [document exporter](#exporter). The document exporter exports all your documents,
|
||||
thumbnails and metadata to a specific folder. You may import your
|
||||
documents into a fresh instance of paperless again or store your
|
||||
documents in another DMS with this export.
|
||||
@ -52,7 +52,7 @@ Options available to bare-metal and non-docker installations:
|
||||
|
||||
## Updating Paperless {#updating}
|
||||
|
||||
### Docker Route
|
||||
### Docker Route {#docker-updating}
|
||||
|
||||
If a new release of paperless-ngx is available, upgrading depends on how
|
||||
you installed paperless-ngx in the first place. The releases are
|
||||
@ -131,7 +131,7 @@ the background.
|
||||
image: ghcr.io/paperless-ngx/paperless-ngx:1.7
|
||||
```
|
||||
|
||||
### Bare Metal Route
|
||||
### Bare Metal Route {#bare-metal-updating}
|
||||
|
||||
After grabbing the new release and unpacking the contents, do the
|
||||
following:
|
||||
@ -158,7 +158,7 @@ following:
|
||||
This might not actually do anything. Not every new paperless version
|
||||
comes with new database migrations.
|
||||
|
||||
## Downgrading Paperless
|
||||
## Downgrading Paperless {#downgrade-paperless}
|
||||
|
||||
Downgrades are possible. However, some updates also contain database
|
||||
migrations (these change the layout of the database and may move data).
|
||||
@ -366,7 +366,7 @@ task scheduler.
|
||||
### Managing filenames {#renamer}
|
||||
|
||||
If you use paperless' feature to
|
||||
[assign custom filenames to your documents](/advanced_usage#file_name_handling), you can use this command to move all your files after
|
||||
[assign custom filenames to your documents](/advanced_usage#file-name-handling), you can use this command to move all your files after
|
||||
changing the naming scheme.
|
||||
|
||||
!!! warning
|
||||
@ -430,9 +430,7 @@ rules.
|
||||
As of October 2022 Microsoft no longer supports IMAP authentication
|
||||
for Exchange servers, thus Exchange is no longer supported until a
|
||||
solution is implemented in the Python IMAP library used by Paperless.
|
||||
See
|
||||
|
||||
[learn.microsoft.com](https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online)
|
||||
See [learn.microsoft.com](https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online)
|
||||
|
||||
### Creating archived documents {#archiver}
|
||||
|
||||
|
@ -50,7 +50,7 @@ and run another document through the consumer. Once complete, you should
|
||||
see the newly-created document, automatically tagged with the
|
||||
appropriate data.
|
||||
|
||||
### Automatic matching {#automatic_matching}
|
||||
### Automatic matching {#automatic-matching}
|
||||
|
||||
Paperless-ngx comes with a new matching algorithm called _Auto_. This
|
||||
matching algorithm tries to assign tags, correspondents, document types,
|
||||
@ -59,8 +59,8 @@ assigned these on existing documents. It uses a neural network under the
|
||||
hood.
|
||||
|
||||
If, for example, all your bank statements of your account 123 at the
|
||||
Bank of America are tagged with the tag "bofa*123" and the matching
|
||||
algorithm of this tag is set to \_Auto*, this neural network will examine
|
||||
Bank of America are tagged with the tag "bofa123" and the matching
|
||||
algorithm of this tag is set to _Auto_, this neural network will examine
|
||||
your documents and automatically learn when to assign this tag.
|
||||
|
||||
Paperless tries to hide much of the involved complexity with this
|
||||
@ -95,7 +95,7 @@ when using this feature:
|
||||
of these correspondents to ANY new document, if both are set to
|
||||
automatic matching.
|
||||
|
||||
## Hooking into the consumption process
|
||||
## Hooking into the consumption process {#consume-hooks}
|
||||
|
||||
Sometimes you may want to do something arbitrary whenever a document is
|
||||
consumed. Rather than try to predict what you may want to do, Paperless
|
||||
@ -115,7 +115,7 @@ and then put the path to that script in `paperless.conf` or
|
||||
asynchronously, you'll have to fork the process in your script and
|
||||
exit.
|
||||
|
||||
### Pre-consumption script
|
||||
### Pre-consumption script {#pre-consume-script}
|
||||
|
||||
Executed after the consumer sees a new document in the consumption
|
||||
folder, but before any processing of the document is performed. This
|
||||
@ -151,7 +151,7 @@ with the newly modified file.
|
||||
The script's stdout and stderr will be logged line by line to the
|
||||
webserver log, along with the exit code of the script.
|
||||
|
||||
### Post-consumption script {#post_consume_script}
|
||||
### Post-consumption script {#post-consume-script}
|
||||
|
||||
Executed after the consumer has successfully processed a document and
|
||||
has moved it into paperless. It receives the following environment
|
||||
@ -181,33 +181,34 @@ The post consumption script cannot cancel the consumption process.
|
||||
The script's stdout and stderr will be logged line by line to the
|
||||
webserver log, along with the exit code of the script.
|
||||
|
||||
#### Docker
|
||||
### Docker {#docker-consume-hooks}
|
||||
|
||||
Assumed you have
|
||||
`/home/foo/paperless-ngx/scripts/post-consumption-example.sh`.
|
||||
To hook into the consumption process when using Docker, you
|
||||
will need to pass the scripts into the container via a host mount
|
||||
in your `docker-compose.yml`.
|
||||
|
||||
You can pass that script into the consumer container via a host mount in
|
||||
your `docker-compose.yml`.
|
||||
Assuming you have
|
||||
`/home/paperless-ngx/scripts/post-consumption-example.sh` as a
|
||||
script which you'd like to run.
|
||||
|
||||
```bash
|
||||
You can pass that script into the consumer container via a host mount:
|
||||
|
||||
```yaml
|
||||
...
|
||||
consumer:
|
||||
webserver:
|
||||
...
|
||||
volumes:
|
||||
...
|
||||
- /home/paperless-ngx/scripts:/path/in/container/scripts/
|
||||
- /home/paperless-ngx/scripts:/path/in/container/scripts/ # (1)!
|
||||
environment: # (3)!
|
||||
...
|
||||
PAPERLESS_POST_CONSUME_SCRIPT: /path/in/container/scripts/post-consumption-example.sh # (2)!
|
||||
...
|
||||
```
|
||||
|
||||
Example (docker-compose.yml):
|
||||
`- /home/foo/paperless-ngx/scripts:/usr/src/paperless/scripts`
|
||||
|
||||
which in turn requires the variable `PAPERLESS_POST_CONSUME_SCRIPT` in
|
||||
`docker-compose.env` to point to
|
||||
`/path/in/container/scripts/post-consumption-example.sh`.
|
||||
|
||||
Example (docker-compose.env):
|
||||
`PAPERLESS_POST_CONSUME_SCRIPT=/usr/src/paperless/scripts/post-consumption-example.sh`
|
||||
1. The external scripts directory is mounted to a location inside the container.
|
||||
2. The internal location of the script is used to set the script to run
|
||||
3. This can also be set in `docker-compose.env`
|
||||
|
||||
Troubleshooting:
|
||||
|
||||
@ -218,7 +219,7 @@ Troubleshooting:
|
||||
- Pipe your script's output to a log file, e.g.
|
||||
`echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log`
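The same logging idea can be sketched in Python for cases where a shell one-liner is not enough. This is only a minimal illustration: it relies on the `DOCUMENT_ID` environment variable from the example above, while the function name `log_document_id` and the default log path are assumptions made for this sketch, not part of Paperless.

```python
#!/usr/bin/env python3
"""Minimal post-consumption logging sketch (illustrative only)."""
import os
import sys
from pathlib import Path


def log_document_id(log_path, environ=None):
    """Append the consumed document's ID to log_path and return the line written."""
    env = os.environ if environ is None else environ
    # DOCUMENT_ID is taken from the example above; use a marker if it is unset.
    document_id = env.get("DOCUMENT_ID", "<unset>")
    line = f"consumed document {document_id}\n"
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(line)
    return line


# Only write the log when the variable is present, i.e. when the script is
# actually run as a post-consumption hook.
if __name__ == "__main__" and "DOCUMENT_ID" in os.environ:
    # Hypothetical default path, mirroring the mount point used above.
    default_log = "/usr/src/paperless/scripts/post-consumption-example.log"
    log_document_id(sys.argv[1] if len(sys.argv) > 1 else default_log)
```

Appending rather than overwriting keeps one line per consumed document, which makes it easy to see whether the hook ran at all.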
|
||||
|
||||
## File name handling {#file_name_handling}
|
||||
## File name handling {#file-name-handling}
|
||||
|
||||
By default, paperless stores your documents in the media directory and
|
||||
renames them using the identifier which it has assigned to each
|
||||
@ -316,7 +317,7 @@ value.
|
||||
Paperless checks the filename of a document whenever it is saved.
|
||||
Therefore, you need to update the filenames of your documents and move
|
||||
them after altering this setting by invoking the
|
||||
[`document renamer <utilities-renamer>`]().
|
||||
[`document renamer`](administration#renamer).
|
||||
|
||||
!!! warning
|
||||
|
||||
@ -344,7 +345,7 @@ When as single storage layout is not sufficient for your use case,
|
||||
storage paths come to the rescue. Storage paths allow you to configure
|
||||
more precisely where each document is stored in the file system.
|
||||
|
||||
- Each storage path is a [PAPERLESS_FILENAME_FORMAT]{.title-ref} and
|
||||
- Each storage path is a `PAPERLESS_FILENAME_FORMAT` and
|
||||
follows the rules described above
|
||||
- Each document is assigned a storage path using the matching
|
||||
algorithms described above, but can be overwritten at any time
|
||||
@ -352,7 +353,7 @@ more precisely where each document is stored in the file system.
|
||||
For example, you could define the following two storage paths:
|
||||
|
||||
1. Normal communications are put into a folder structure sorted by
|
||||
[year/correspondent]{.title-ref}
|
||||
`year/correspondent`
|
||||
2. Communications with insurance companies are stored in a flat
|
||||
structure with longer file names, but containing the full date of
|
||||
the correspondence.
|
||||
@ -384,7 +385,7 @@ structure as in the previous example above.
|
||||
!!! tip
|
||||
|
||||
Defining a storage path is optional. If no storage path is defined for a
|
||||
document, the global [PAPERLESS_FILENAME_FORMAT]{.title-ref} is applied.
|
||||
document, the global `PAPERLESS_FILENAME_FORMAT` is applied.
|
||||
|
||||
!!! warning
|
||||
|
||||
@ -403,27 +404,32 @@ queued and completed tasks, timing and more. Flower can also be used
|
||||
with Prometheus, as it exports metrics. For details on its capabilities,
|
||||
refer to the Flower documentation.
|
||||
|
||||
To configure Flower further, create a [flowerconfig.py]{.title-ref} and
|
||||
place it into the [src/paperless]{.title-ref} directory. For a Docker
|
||||
To configure Flower further, create a `flowerconfig.py` and
|
||||
place it into the `src/paperless` directory. For a Docker
|
||||
installation, you can use volumes to accomplish this:
|
||||
|
||||
```yaml
|
||||
services:
|
||||
# ...
|
||||
webserver:
|
||||
ports:
|
||||
- 5555:5555 # (2)!
|
||||
# ...
|
||||
volumes:
|
||||
- /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro
|
||||
- /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro # (1)!
|
||||
```
|
||||
|
||||
1. Note the `:ro` tag means the file will be mounted as read only.
|
||||
2. `flower` runs by default on port 5555, but this can be configured
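As a rough illustration of what such a file might contain, the sketch below assumes that Flower reads its option names (`port`, `address`) as plain Python assignments from `flowerconfig.py`. The values are placeholders; check the Flower documentation before relying on any of them.

```python
# flowerconfig.py: illustrative values only, not defaults shipped with Paperless.

# Port Flower listens on inside the container. If this changes, the `ports`
# mapping in docker-compose.yml needs to change with it.
port = 5555

# Listen on all interfaces so the published port is reachable from the host.
address = "0.0.0.0"
```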
|
||||
|
||||
## Custom Container Initialization
|
||||
|
||||
The Docker image includes the ability to run custom user scripts during
|
||||
startup. This could be utilized for installing additional tools or
|
||||
Python packages, for example.
|
||||
Python packages, for example. Scripts are expected to be shell scripts.
|
||||
|
||||
To utilize this, mount a folder containing your scripts to the custom
|
||||
initialization directory, [/custom-cont-init.d]{.title-ref} and place
|
||||
initialization directory, `/custom-cont-init.d` and place
|
||||
scripts you wish to run inside. For security, the folder must be owned
|
||||
by `root` and should have permissions of `a=rx`. Additionally, scripts
|
||||
must only be writable by `root`.
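The ownership and permission requirements can be checked before starting the container. The sketch below is one possible reading of those rules, under stated assumptions: "owned by `root`" is taken to mean uid 0, `a=rx` to mean mode `0o555`, and "only writable by `root`" to mean root-owned with no group or other write bit. The helper names are hypothetical.

```python
import os
import stat


def folder_ok(st):
    """True if owned by root (uid 0) with exactly a=rx (0o555) permissions."""
    return st.st_uid == 0 and stat.S_IMODE(st.st_mode) == 0o555


def script_ok(st):
    """True if owned by root and not writable by group or others."""
    writable_by_others = stat.S_IMODE(st.st_mode) & (stat.S_IWGRP | stat.S_IWOTH)
    return st.st_uid == 0 and not writable_by_others


def check_init_dir(path):
    """Return a list of problems found under path (an empty list means none)."""
    problems = []
    if not folder_ok(os.stat(path)):
        problems.append(f"{path}: expected owner root and mode a=rx")
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if entry.is_file() and not script_ok(entry.stat()):
            problems.append(f"{entry.path}: must only be writable by root")
    return problems
```

Calling `check_init_dir("/path/to/my/scripts")` on the host returns human-readable messages for anything that does not match these assumptions.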
|
||||
@ -445,9 +451,11 @@ services:
|
||||
webserver:
|
||||
# ...
|
||||
volumes:
|
||||
- /path/to/my/scripts:/custom-cont-init.d:ro
|
||||
- /path/to/my/scripts:/custom-cont-init.d:ro # (1)!
|
||||
```
|
||||
|
||||
1. Note the `:ro` tag means the folder will be mounted as read only. This is for extra security against changes
|
||||
|
||||
## MySQL Caveats {#mysql-caveats}
|
||||
|
||||
### Case Sensitivity
|
||||
|
@ -225,7 +225,7 @@ Query parameters:
|
||||
|
||||
Results returned by the endpoint are ordered by importance of the term
|
||||
in the document index. The first result is the term that has the highest
|
||||
Tf/Idf score in the index.
|
||||
[Tf/Idf](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) score in the index.
|
||||
|
||||
```json
|
||||
["term1", "term3", "term6", "term4"]
|
||||
|
@ -33,15 +33,15 @@ matcher.
|
||||
[More information on securing your Redis
|
||||
Instance](https://redis.io/docs/getting-started/#securing-redis).
|
||||
|
||||
Defaults to <redis://localhost:6379>.
|
||||
Defaults to `redis://localhost:6379`.
|
||||
|
||||
`PAPERLESS_DBENGINE=<engine_name>`
|
||||
|
||||
: Optional, gives the ability to choose Postgres or MariaDB for
|
||||
database engine. Available options are [postgresql]{.title-ref} and
|
||||
[mariadb]{.title-ref}.
|
||||
database engine. Available options are `postgresql` and
|
||||
`mariadb`.
|
||||
|
||||
Default is [postgresql]{.title-ref}.
|
||||
Default is `postgresql`.
|
||||
|
||||
!!! warning
|
||||
|
||||
@ -150,25 +150,25 @@ files created using "collectstatic" manager command are stored.
|
||||
`PAPERLESS_FILENAME_FORMAT=<format>`
|
||||
|
||||
: Changes the filenames paperless uses to store documents in the media
|
||||
directory. See [File name handling](advanced_usage#file_name_handling) for details.
|
||||
directory. See [File name handling](advanced_usage#file-name-handling) for details.
|
||||
|
||||
Default is none, which disables this feature.
|
||||
|
||||
`PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=<bool>`
|
||||
|
||||
: Tells paperless to replace placeholders in
|
||||
[PAPERLESS_FILENAME_FORMAT]{.title-ref} that would resolve to
|
||||
`PAPERLESS_FILENAME_FORMAT` that would resolve to
|
||||
'none' to be omitted from the resulting filename. This also holds
|
||||
true for directory names. See [File name handling](advanced_usage#file_name_handling) for
|
||||
true for directory names. See [File name handling](advanced_usage#file-name-handling) for
|
||||
details.
|
||||
|
||||
Defaults to [false]{.title-ref} which disables this feature.
|
||||
Defaults to `false` which disables this feature.
|
||||
|
||||
`PAPERLESS_LOGGING_DIR=<path>`
|
||||
|
||||
: This is where paperless will store log files.
|
||||
|
||||
Defaults to "`PAPERLESS_DATA_DIR`/log/".
|
||||
Defaults to `PAPERLESS_DATA_DIR/log/`.
|
||||
|
||||
## Logging
|
||||
|
||||
@ -283,10 +283,10 @@ login with the selected user.
|
||||
: If this environment variable is specified, Paperless automatically
|
||||
creates a superuser with the provided username at start. This is
|
||||
useful in cases where you can not run the
|
||||
[createsuperuser]{.title-ref} command separately, such as Kubernetes
|
||||
`createsuperuser` command separately, such as Kubernetes
|
||||
or AWS ECS.
|
||||
|
||||
Requires [PAPERLESS_ADMIN_PASSWORD]{.title-ref} to be set.
|
||||
Requires `PAPERLESS_ADMIN_PASSWORD` to be set.
|
||||
|
||||
!!! note
|
||||
|
||||
@ -297,13 +297,13 @@ or AWS ECS.
|
||||
`PAPERLESS_ADMIN_MAIL=<email>`
|
||||
|
||||
: (Optional) Specify superuser email address. Only used when
|
||||
[PAPERLESS_ADMIN_USER]{.title-ref} is set.
|
||||
`PAPERLESS_ADMIN_USER` is set.
|
||||
|
||||
Defaults to `root@localhost`.
|
||||
|
||||
`PAPERLESS_ADMIN_PASSWORD=<password>`
|
||||
|
||||
: Only used when [PAPERLESS_ADMIN_USER]{.title-ref} is set. This will
|
||||
: Only used when `PAPERLESS_ADMIN_USER` is set. This will
|
||||
be the password of the automatically created superuser.
|
||||
|
||||
`PAPERLESS_COOKIE_PREFIX=<str>`
|
||||
@ -331,26 +331,25 @@ applications.
|
||||
If you're exposing paperless to the internet directly, do not use
|
||||
this.
|
||||
|
||||
Also see the warning [in the official documentation
|
||||
<https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration>]{.title-ref}.
|
||||
Also see the warning [in the official documentation](https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration).
|
||||
|
||||
Defaults to [false]{.title-ref} which disables this feature.
|
||||
Defaults to "false" which disables this feature.
|
||||
|
||||
`PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>`
|
||||
|
||||
: If [PAPERLESS_ENABLE_HTTP_REMOTE_USER]{.title-ref} is enabled, this
|
||||
: If `PAPERLESS_ENABLE_HTTP_REMOTE_USER` is enabled, this
|
||||
property allows you to customize the name of the HTTP header from which
|
||||
the authenticated username is extracted. Values are in terms of
|
||||
\[HttpRequest.META\](<https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META>).
|
||||
Thus, the configured value must start with [HTTP\_]{.title-ref}
|
||||
[HttpRequest.META](https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META).
|
||||
Thus, the configured value must start with `HTTP_`
|
||||
followed by the normalized actual header name.
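The normalization can be sketched as follows. This is a small illustration of the rule Django documents for `HttpRequest.META` (uppercase the header name, replace hyphens with underscores, add the `HTTP_` prefix, with `Content-Length` and `Content-Type` as the unprefixed exceptions). The function name `meta_key` is made up for this example.

```python
def meta_key(header_name):
    """Map an HTTP header name to the key Django uses in HttpRequest.META."""
    normalized = header_name.upper().replace("-", "_")
    # These two headers are stored without the HTTP_ prefix.
    if normalized in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return normalized
    return "HTTP_" + normalized


print(meta_key("Remote-User"))
print(meta_key("X-Auth-Username"))
```

So a proxy that sends the username in an `X-Auth-Username` header would correspond to the value printed on the second line.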
|
||||
|
||||
Defaults to [HTTP_REMOTE_USER]{.title-ref}.
|
||||
Defaults to "HTTP_REMOTE_USER".
|
||||
|
||||
`PAPERLESS_LOGOUT_REDIRECT_URL=<str>`
|
||||
|
||||
: URL to redirect the user to after a logout. This can be used
|
||||
together with [PAPERLESS_ENABLE_HTTP_REMOTE_USER]{.title-ref} to
|
||||
together with `PAPERLESS_ENABLE_HTTP_REMOTE_USER` to
|
||||
redirect the user back to the SSO application's logout page.
|
||||
|
||||
Defaults to None, which disables this feature.
|
||||
@ -368,7 +367,7 @@ needs.
|
||||
parsing documents.
|
||||
|
||||
It should be a 3-letter language code consistent with ISO 639:
|
||||
<https://www.loc.gov/standards/iso639-2/php/code_list.php>
|
||||
https://www.loc.gov/standards/iso639-2/php/code_list.php
|
||||
|
||||
Set this to the language most of your documents are written in.
|
||||
|
||||
@ -624,8 +623,7 @@ Add the configuration variables to the environment of the webserver
|
||||
and add the additional services below the webserver service. Watch out
|
||||
for indentation.
|
||||
|
||||
Make sure to use the correct format [PAPERLESS_TIKA_ENABLED =
|
||||
1]{.title-ref} so python_dotenv can parse the statement correctly.
|
||||
Make sure to use the correct format `PAPERLESS_TIKA_ENABLED = 1` so python_dotenv can parse the statement correctly.
|
||||
|
||||
## Software tweaks {#software_tweaks}
|
||||
|
||||
@ -648,7 +646,7 @@ paperless will process in parallel on a single document.
|
||||
|
||||
Ensure that the product
|
||||
|
||||
`PAPERLESS_TASK_WORKERS \: PAPERLESS_THREADS_PER_WORKER`
|
||||
`PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER`
|
||||
|
||||
does not exceed your CPU core count or else paperless will be
|
||||
extremely slow. If you want paperless to process many documents in
|
||||
@ -752,7 +750,7 @@ consumption directory as well.
|
||||
`PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=<bool>`
|
||||
|
||||
: Set the names of subdirectories as tags for consumed files. E.g.
|
||||
<CONSUMPTION_DIR>/foo/bar/file.pdf will add the tags "foo" and
|
||||
`<CONSUMPTION_DIR>/foo/bar/file.pdf` will add the tags "foo" and
|
||||
"bar" to the consumed file. Paperless will create any tags that
|
||||
don't exist yet.
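The mapping from path to tags described above can be sketched in a few lines. This is an illustration of the documented behavior, not the code Paperless runs; the function name `tags_from_path` and the `/consume` directory are assumptions.

```python
from pathlib import PurePosixPath


def tags_from_path(consumption_dir, file_path):
    """Return the subdirectory names between consumption_dir and the file, in order."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(consumption_dir))
    # Every path component except the last one (the file name) becomes a tag.
    return list(relative.parts[:-1])


print(tags_from_path("/consume", "/consume/foo/bar/file.pdf"))
```

A file placed directly in the consumption directory yields an empty list, so it would receive no tags from its path.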
|
||||
|
||||
@ -827,7 +825,7 @@ documents.
|
||||
|
||||
: After a document is consumed, Paperless can trigger an arbitrary
|
||||
script if you like. This script will be passed a number of arguments
|
||||
for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post_consume_script).
|
||||
for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post-consume-script).
|
||||
|
||||
The default is blank, which means nothing will be executed.
|
||||
|
||||
@ -841,8 +839,7 @@ option as specified in
|
||||
The filename will be checked first, and if nothing is found, the
|
||||
document text will be checked as normal.
|
||||
|
||||
A date in a filename must have some separators ([.]{.title-ref},
|
||||
[-]{.title-ref}, [/]{.title-ref}, etc) for it to be parsed.
|
||||
A date in a filename must have some separators (`.`, `,`, `-`, `/`, etc) for it to be parsed.
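To make the separator requirement concrete, here is a loose sketch. The pattern is purely illustrative and is not the expression Paperless uses internally; it only shows why a name such as `20230115_invoice.pdf` gives a parser nothing to split on, while `2023-01-15_invoice.pdf` does.

```python
import re

# Illustrative pattern only: three digit groups joined by a separator character.
DATE_WITH_SEPARATORS = re.compile(r"\d{1,4}[.,/-]\d{1,2}[.,/-]\d{1,4}")


def has_separated_date(filename):
    """True if the filename contains digit groups joined by date separators."""
    return DATE_WITH_SEPARATORS.search(filename) is not None


print(has_separated_date("2023-01-15_invoice.pdf"))
print(has_separated_date("20230115_invoice.pdf"))
```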
|
||||
|
||||
Defaults to none, which disables this feature.
|
||||
|
||||
@ -928,7 +925,7 @@ the literal path for that program.
|
||||
|
||||
These options don't have any effect in `paperless.conf`. These options
|
||||
adjust the behavior of the docker container. Configure these in
|
||||
[docker-compose.env]{.title-ref}.
|
||||
`docker-compose.env`.
|
||||
|
||||
`PAPERLESS_WEBSERVER_WORKERS=<num>`
|
||||
|
||||
@ -946,7 +943,7 @@ increase RAM usage.
|
||||
There are special setups where you may need to configure this value
|
||||
to restrict the IP address or interface the webserver listens on.
|
||||
|
||||
Defaults to \[::\], meaning all interfaces, including IPv6.
|
||||
Defaults to `[::]`, meaning all interfaces, including IPv6.
|
||||
|
||||
`PAPERLESS_PORT=<port>`
|
||||
|
||||
|
@ -39,16 +39,16 @@ guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTIN
|
||||
## Code formatting with pre-commit Hooks
|
||||
|
||||
To ensure a consistent style and formatting across the project source,
|
||||
the project utilizes a Git [pre-commit]{.title-ref} hook to perform some
|
||||
formatting and linting before a commit is allowed. That way, everyone
|
||||
uses the same style and some common issues can be caught early on. See
|
||||
below for installation instructions.
|
||||
the project utilizes a Git [`pre-commit`](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks)
|
||||
hook to perform some formatting and linting before a commit is allowed.
|
||||
That way, everyone uses the same style and some common issues can be caught
|
||||
early on. See below for installation instructions.
|
||||
|
||||
Once installed, hooks will run when you commit. If the formatting isn't
|
||||
quite right or a linter catches something, the commit will be rejected.
|
||||
You'll need to look at the output and fix the issue. Some hooks, such
|
||||
as the Python formatting tool [black]{.title-ref}, will format failing
|
||||
files, so all you need to do is [git add]{.title-ref} those files again
|
||||
as the Python formatting tool `black`, will format failing
|
||||
files, so all you need to do is `git add` those files again
|
||||
and retry your commit.
|
||||
|
||||
## Initial setup and first start
|
||||
@ -58,7 +58,7 @@ first-time setup. To do the setup you need to perform the steps from the
|
||||
following chapters in a certain order:
|
||||
|
||||
1. Install prerequisites + pipenv as mentioned in
|
||||
`[Bare metal route](/setup#bare_metal)
|
||||
[Bare metal route](/setup#bare_metal)
|
||||
|
||||
2. Copy `paperless.conf.example` to `paperless.conf` and enable debug
|
||||
mode.
|
||||
@ -69,7 +69,7 @@ following chapters in a certain order:
|
||||
$ npm install -g @angular/cli
|
||||
```
|
||||
|
||||
4. Install pre-commit
|
||||
4. Install pre-commit hooks
|
||||
|
||||
```shell-session
|
||||
pre-commit install
|
||||
@ -81,7 +81,7 @@ following chapters in a certain order:
|
||||
mkdir -p consume media
|
||||
```
|
||||
|
||||
6. You can now either \...
|
||||
6. You can now either ...
|
||||
|
||||
- install redis or
|
||||
|
||||
@ -91,9 +91,9 @@ following chapters in a certain order:
|
||||
|
||||
- spin up a bare redis container
|
||||
|
||||
> ```shell-session
|
||||
> docker run -d -p 6379:6379 --restart unless-stopped redis:latest
|
||||
> ```
|
||||
```shell-session
|
||||
docker run -d -p 6379:6379 --restart unless-stopped redis:latest
|
||||
```
|
||||
|
||||
7. Install the Python dependencies by running the following in the `src/` directory.
|
||||
|
||||
@ -101,10 +101,12 @@ following chapters in a certain order:
|
||||
pipenv install --dev
|
||||
```
|
||||
|
||||
> - Make sure you're using python 3.9.x or lower. Otherwise you might
|
||||
> get issues with building dependencies. You can use
|
||||
> [pyenv](https://github.com/pyenv/pyenv) to install a specific
|
||||
> python version.
|
||||
!!! note
|
||||
|
||||
Make sure you're using python 3.10.x or lower. Otherwise you might
|
||||
get issues with building dependencies. You can use
|
||||
[pyenv](https://github.com/pyenv/pyenv) to install a specific
|
||||
python version.
|
||||
|
||||
8. Generate the static UI so you can log in to get a session
|
||||
that is required for frontend development (this needs to be done one
|
||||
@ -126,9 +128,9 @@ following chapters in a certain order:
|
||||
you're developing for, you need to have some or all of them
|
||||
running.
|
||||
|
||||
> ```shell-session
|
||||
> python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
|
||||
> ```
|
||||
```shell-session
|
||||
python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
|
||||
```
|
||||
|
||||
11. Log in with the superuser credentials provided in step 8 at
|
||||
`http://localhost:8000` to create a session that enables you to use
|
||||
@ -140,15 +142,15 @@ development go to `/src-ui` and run `ng serve`. From there you can use
|
||||
|
||||
## Back end development
|
||||
|
||||
The backend is a django application. PyCharm works well for development,
|
||||
The backend is a [Django](https://www.djangoproject.com/) application. PyCharm works well for development,
|
||||
but you can use whatever you want.
|
||||
|
||||
Configure the IDE to use the src/ folder as the base source folder.
|
||||
Configure the following launch configurations in your IDE:
|
||||
|
||||
- python3 manage.py runserver
|
||||
- celery \--app paperless worker
|
||||
- python3 manage.py document_consumer
|
||||
- `python3 manage.py runserver`
|
||||
- `celery --app paperless worker`
|
||||
- `python3 manage.py document_consumer`
|
||||
|
||||
To start them all:
|
||||
|
||||
@ -158,24 +160,26 @@ python3 manage.py runserver & python3 manage.py document_consumer & celery --app
|
||||
|
||||
Testing and code style:
|
||||
|
||||
- Run `pytest` in the src/ directory to execute all tests. This also
|
||||
- Run `pytest` in the `src/` directory to execute all tests. This also
|
||||
generates an HTML coverage report. When running tests, `paperless.conf`
|
||||
is loaded as well. However: the tests rely on the default
|
||||
configuration. This is not ideal. But for now, make sure no settings
|
||||
except for DEBUG are overridden when testing.
|
||||
|
||||
- Coding style is enforced by the Git pre-commit hooks. These will
|
||||
ensure your code is formatted and do some linting when you do a [git
|
||||
commit]{.title-ref}.
|
||||
ensure your code is formatted and do some linting when you do a `git commit`.
|
||||
|
||||
- You can also run `black` manually to format your code
|
||||
|
||||
- The `pre-commit` hooks will modify files and interact with each other.
|
||||
It may take a couple of `git add`, `git commit` cycles to satisfy them.
|
||||
|
||||
!!! note
|
||||
|
||||
The line length rule E501 is generally useful for getting multiple
|
||||
source files next to each other on the screen. However, in some
|
||||
cases, it's just not possible to make some lines fit, especially
|
||||
complicated IF cases. Append `# NOQA: E501` to disable this check
|
||||
complicated IF cases. Append `# noqa: E501` to disable this check
|
||||
for certain lines.
|
||||
|
||||
## Front end development
|
||||
@ -353,7 +357,8 @@ LANGUAGES = [
|
||||
|
||||
## Building the documentation
|
||||
|
||||
The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/). If you want to build the documentation locally, this is how you do it:
|
||||
The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/).
|
||||
If you want to build the documentation locally, this is how you do it:
|
||||
|
||||
1. Install python dependencies.
|
||||
|
||||
@ -366,7 +371,7 @@ The documentation is built using material-mkdocs, see their [documentation](http
|
||||
|
||||
```shell-session
|
||||
$ cd /path/to/paperless
|
||||
$ pipenv mkdocs build
|
||||
$ pipenv run mkdocs build --config-file mkdocs.yml
|
||||
```
|
||||
|
||||
## Building the Docker image
|
||||
@ -379,9 +384,9 @@ helper script `build-docker-image.sh`.
|
||||
|
||||
Building the docker image from source:
|
||||
|
||||
> ```shell-session
|
||||
> ./build-docker-image.sh Dockerfile -t <your-tag>
|
||||
> ```
|
||||
```shell-session
|
||||
./build-docker-image.sh Dockerfile -t <your-tag>
|
||||
```
|
||||
|
||||
## Extending Paperless
|
||||
|
||||
@ -428,7 +433,7 @@ class MyCustomParser(DocumentParser):
|
||||
def get_thumbnail(self, document_path, mime_type):
|
||||
# This should return the path to a thumbnail you created for this
|
||||
# document.
|
||||
return os.path.join(self.tempdir, "thumb.png")
|
||||
return os.path.join(self.tempdir, "thumb.webp")
|
||||
```
|
||||
|
||||
If you encounter any issues during parsing, raise a
|
||||
|
32
docs/faq.md
@ -1,6 +1,6 @@
|
||||
# Frequently Asked Questions
|
||||
|
||||
### _What's the general plan for Paperless-ngx?_
|
||||
## _What's the general plan for Paperless-ngx?_
|
||||
|
||||
**A:** While Paperless-ngx is already considered largely
|
||||
"feature-complete" it is a community-driven project and development
|
||||
@ -9,7 +9,7 @@ discussions and "up-voted" by the community but this is not a
|
||||
guarantee the feature will be implemented. This project will always be
|
||||
open to collaboration in the form of PRs, ideas etc.
|
||||
|
||||
### _I'm using docker. Where are my documents?_
|
||||
## _I'm using docker. Where are my documents?_
|
||||
|
||||
**A:** Your documents are stored inside the docker volume
|
||||
`paperless_media`. Docker manages this volume automatically for you. It
|
||||
@ -27,9 +27,7 @@ system. On Linux, chances are high that this location is
|
||||
files around manually. This folder is meant to be entirely managed by
|
||||
docker and paperless.
|
||||
|
||||
### Let's say I want to switch tools in a year. Can I easily move
|
||||
|
||||
to other systems?\*
|
||||
## Let's say I want to switch tools in a year. Can I easily move to other systems?
|
||||
|
||||
**A:** Your documents are stored as plain files inside the media folder.
|
||||
You can always drag those files out of that folder to use them
|
||||
@ -41,17 +39,17 @@ elsewhere. Here are a couple notes about that.
|
||||
- By default, paperless uses the internal ID of each document as its
|
||||
filename. This might not be very convenient for export. However, you
|
||||
can adjust the way files are stored in paperless by
|
||||
[configuring the filename format](advanced_usage#file_name_handling).
|
||||
[configuring the filename format](advanced_usage#file-name-handling).
|
||||
- [The exporter](administration#exporter) is
|
||||
another easy way to get your files out of paperless with reasonable
|
||||
file names.
|
||||
|
||||
### _What file types does paperless-ngx support?_
|
||||
## _What file types does paperless-ngx support?_
|
||||
|
||||
**A:** Currently, the following files are supported:
|
||||
|
||||
- PDF documents, PNG images, JPEG images, TIFF images and GIF images
|
||||
are processed with OCR and converted into PDF documents.
|
||||
- PDF documents, PNG images, JPEG images, TIFF images, GIF images and
|
||||
WebP images are processed with OCR and converted into PDF documents.
|
||||
- Plain text documents are supported as well and are added verbatim to
|
||||
paperless.
|
||||
- With the optional Tika integration enabled (see [Tika configuration](configuration#tika),
|
||||
@ -61,7 +59,7 @@ elsewhere. Here are a couple notes about that.
|
||||
Paperless-ngx determines the type of a file by inspecting its content.
|
||||
The file extensions do not matter.
|
||||
|
||||
### _Will paperless-ngx run on Raspberry Pi?_
|
||||
## _Will paperless-ngx run on Raspberry Pi?_
|
||||
|
||||
**A:** The short answer is yes. I've tested it on a Raspberry Pi 3 B.
|
||||
The long answer is that certain parts of Paperless will run very slowly,
|
||||
@ -73,11 +71,11 @@ has to do much less work to serve the data.
|
||||
!!! note
|
||||
|
||||
You can adjust some of the settings so that paperless uses less
|
||||
processing power. See [setup](setup#less_powerful_devices) for details.
|
||||
processing power. See [setup](setup#less-powerful-devices) for details.
|
||||
|
||||
### _How do I install paperless-ngx on Raspberry Pi?_
|
||||
## _How do I install paperless-ngx on Raspberry Pi?_
|
||||
|
||||
**A:** Docker images are available for arm and arm64 hardware, so just
|
||||
**A:** Docker images are available for armv7 and arm64 hardware, so just
|
||||
follow the docker-compose instructions. Apart from more required disk
|
||||
space compared to a bare metal installation, docker comes with close to
|
||||
zero overhead, even on Raspberry Pi.
|
||||
@ -87,13 +85,13 @@ the python requirements do not have precompiled packages for ARM /
|
||||
ARM64. Installation of these will require additional development
|
||||
libraries and compilation will take a long time.
|
||||
|
||||
### _How do I run this on Unraid?_
|
||||
## _How do I run this on Unraid?_
|
||||
|
||||
**A:** Paperless-ngx is available as [community
|
||||
app](https://unraid.net/community/apps?q=paperless-ngx) in Unraid. [Uli
|
||||
Fahrer](https://github.com/Tooa) created a container template for that.
|
||||
|
||||
### _How do I run this on my toaster?_
|
||||
## _How do I run this on my toaster?_
|
||||
|
||||
**A:** I honestly don't know! As for all other devices that might be
|
||||
able to run paperless, you're a bit on your own. If you can't run the
|
||||
@ -103,11 +101,11 @@ This is also what I use to test new releases with. Apart from that, I
|
||||
also have a Raspberry Pi, which I occasionally build the image on and
|
||||
see if it works.
|
||||
|
||||
### _How do I proxy this with NGINX?_
|
||||
## _How do I proxy this with NGINX?_
|
||||
|
||||
**A:** See [here](setup#nginx).
|
||||
|
||||
### _How do I get WebSocket support with Apache mod_wsgi_?
|
||||
## _How do I get WebSocket support with Apache mod_wsgi_?
|
||||
|
||||
**A:** `mod_wsgi` by itself does not support ASGI. Paperless will
|
||||
continue to work with WSGI, but certain features such as status
|
||||
|
@ -50,7 +50,7 @@ If you want to learn about what's different in paperless-ngx from
|
||||
Paperless, check out these resources in the documentation:
|
||||
|
||||
- [Some screenshots](#screenshots) of the new UI are available.
|
||||
- Read [this section](/advanced_usage/#advanced-automatic_matching) if you want to learn about how paperless automates all
|
||||
- Read [this section](/advanced_usage/#advanced-automatic-matching) if you want to learn about how paperless automates all
|
||||
tagging using machine learning.
|
||||
- Paperless now comes with a [proper email consumer](/usage/#usage-email) that's fully tested and production ready.
|
||||
- Paperless creates searchable PDF/A documents from whatever you put into the consumption directory. This means
|
||||
|
@ -767,7 +767,7 @@ After that, you need to clear your cookies (Paperless-ngx comes with
|
||||
updated dependencies that do cookie-processing differently) and probably
|
||||
your cache as well.
|
||||
|
||||
# Considerations for less powerful devices {#less_powerful_devices}
|
||||
# Considerations for less powerful devices {#less-powerful-devices}
|
||||
|
||||
Paperless runs on Raspberry Pi. However, some things are rather slow on
|
||||
the Pi and configuring some options in paperless can help improve
|
||||
@ -803,7 +803,7 @@ For details, refer to [configuration](configuration).
|
||||
!!! note
|
||||
|
||||
Updating the
|
||||
[automatic matching algorithm](/advanced_usage#automatic_matching) takes quite a bit of time. However, the update mechanism
|
||||
[automatic matching algorithm](/advanced_usage#automatic-matching) takes quite a bit of time. However, the update mechanism
|
||||
checks if your data has changed before doing the heavy lifting. If you
|
||||
experience the algorithm taking too much cpu time, consider changing the
|
||||
schedule in the admin interface to daily. You can also manually invoke
|
||||
|
@ -1,9 +1,9 @@
|
||||
# Usage Overview
|
||||
|
||||
Paperless is an application that manages your personal documents. With
|
||||
the help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)), paperless transforms your wieldy physical document binders
|
||||
into a searchable archive and provides many utilities for finding and
|
||||
managing your documents.
|
||||
the help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)),
|
||||
paperless transforms your unwieldy physical document binders into a searchable archive
|
||||
and provides many utilities for finding and managing your documents.
|
||||
|
||||
## Terms and definitions
|
||||
|
||||
@ -37,7 +37,7 @@ Each document has a couple of fields that you can assign to them:
|
||||
date you signed a contract, or the date a letter was sent to you.
|
||||
- The _archive serial number_ (short: ASN) of a document is the
|
||||
identifier of the document in your physical document binders. See
|
||||
[recommended workflow](#usage-reccomended_workflow) below.
|
||||
[recommended workflow](#usage-recommended-workflow) below.
|
||||
- The _content_ of a document is the text that was OCR'ed from the
|
||||
document. This text is fed into the search engine and is used for
|
||||
matching tags, correspondents and document types.
|
||||
@ -74,8 +74,8 @@ following operations on your documents:
|
||||
### The consumption directory
|
||||
|
||||
The primary method of getting documents into your database is by putting
|
||||
them in the consumption directory. The consumer runs in an infinite
|
||||
loop, looking for new additions to this directory. When it finds them,
|
||||
them in the consumption directory. The consumer waits patiently, looking
|
||||
for new additions to this directory. When it finds them,
|
||||
the consumer goes about the process of parsing them with the OCR,
|
||||
indexing what it finds, and storing it in the media directory.
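
The idea of a consumer that looks for new additions to a directory can be sketched roughly as follows. This is only an illustration under stated assumptions, not the consumer shipped with paperless-ngx: the helper `find_new_files` and the `seen` set are hypothetical, and a real consumer would hand each new file to OCR and indexing instead of just returning its path.

```python
import os
import tempfile


def find_new_files(directory, seen):
    """Return regular files in `directory` that are not yet in `seen`.

    `seen` is a set of paths that is updated in place, so calling this
    function repeatedly only reports each file once.
    """
    new_files = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isfile(path) and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files
```

A caller might invoke `find_new_files` on a timer and pass every returned path to whatever processing step follows; relying on file system notifications instead of polling is an equally plausible design.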
|
||||
|
||||
@ -99,7 +99,7 @@ dragging-and-dropping files into your browser window.
|
||||
|
||||
### Mobile upload {#usage-mobile_upload}
|
||||
|
||||
The mobile app over at <https://github.com/qcasey/paperless_share>
|
||||
The mobile app over at [https://github.com/qcasey/paperless_share](https://github.com/qcasey/paperless_share)
|
||||
allows Android users to share any documents with paperless. This can be
|
||||
combined with any of the mobile scanning apps out there, such as Office
|
||||
Lens.
|
||||
@ -325,7 +325,7 @@ language](https://whoosh.readthedocs.io/en/latest/querylang.html). For
|
||||
details on what date parsing utilities are available, see [Date
|
||||
parsing](https://whoosh.readthedocs.io/en/latest/dates.html#parsing-date-queries).
|
||||
|
||||
## The recommended workflow {#usage-recommended_workflow}
|
||||
## The recommended workflow {#usage-recommended-workflow}
|
||||
|
||||
Once you have familiarized yourself with paperless and are ready to use
|
||||
it for all your documents, the recommended workflow for managing your
|
||||
|
@ -23,6 +23,7 @@ theme:
|
||||
- navigation.tabs
|
||||
- navigation.top
|
||||
- toc.integrate
|
||||
- content.code.annotate
|
||||
icon:
|
||||
repo: fontawesome/brands/github
|
||||
favicon: assets/favicon.png
|
||||
@ -39,6 +40,8 @@ markdown_extensions:
|
||||
- pymdownx.highlight:
|
||||
anchor_linenums: true
|
||||
- pymdownx.superfences
|
||||
- pymdownx.inlinehilite
|
||||
strict: true
|
||||
nav:
|
||||
- index.md
|
||||
- setup.md
|
||||
|