Merge branch 'dev' into beta

This commit is contained in:
Trenton H
2022-11-09 13:51:10 -08:00
168 changed files with 19419 additions and 23870 deletions

View File

@@ -258,12 +258,18 @@ Paperless provides the following placeholders within filenames:
* ``{tag_list}``: A comma separated list of all tags assigned to the document.
* ``{title}``: The title of the document.
* ``{created}``: The full date (ISO format) the document was created.
* ``{created_year}``: Year created only.
* ``{created_year}``: Year created only, formatted as the year with century.
* ``{created_year_short}``: Year created only, formatted as the year without century, zero padded.
* ``{created_month}``: Month created only (number 01-12).
* ``{created_month_name}``: Month created name, as per locale
* ``{created_month_name_short}``: Month created abbreviated name, as per locale
* ``{created_day}``: Day created only (number 01-31).
* ``{added}``: The full date (ISO format) the document was added to paperless.
* ``{added_year}``: Year added only.
* ``{added_year_short}``: Year added only, formatted as the year without century, zero padded.
* ``{added_month}``: Month added only (number 01-12).
* ``{added_month_name}``: Month added name, as per locale
* ``{added_month_name_short}``: Month added abbreviated name, as per locale
* ``{added_day}``: Day added only (number 01-31).
@@ -364,3 +370,50 @@ For simplicity, `By Year` defines the same structure as in the previous example
If you adjust the format of an existing storage path, old documents don't get relocated automatically.
You need to run the :ref:`document renamer <utilities-renamer>` to adjust their pathes.
.. _advanced-celery-monitoring:
Celery Monitoring
#################
The monitoring tool `Flower <https://flower.readthedocs.io/en/latest/index.html>`_ can be used to view more
detailed information about the health of the celery workers used for asynchronous tasks. This includes details
on currently running, queued and completed tasks, timing and more. Flower can also be used with Prometheus, as it
exports metrics. For details on its capabilities, refer to the Flower documentation.
To configure Flower further, create a `flowerconfig.py` and place it into the `src/paperless` directory. For
a Docker installation, you can use volumes to accomplish this:
.. code:: yaml
services:
# ...
webserver:
# ...
volumes:
- /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro
Custom Container Initialization
###############################
The Docker image includes the ability to run custom user scripts during startup. This could be
utilized for installing additional tools or Python packages, for example.
To utilize this, mount a folder containing your scripts to the custom initialization directory, `/custom-cont-init.d`
and place scripts you wish to run inside. For security, the folder and its contents must be owned by `root`.
Additionally, scripts must only be writable by `root`.
Your scripts will be run directly before the webserver completes startup. Scripts will be run by the `root` user.
This is an advanced functionality with which you could break functionality or lose data.
For example, using Docker Compose:
.. code:: yaml
services:
# ...
webserver:
# ...
volumes:
- /path/to/my/scripts:/custom-cont-init.d:ro

View File

@@ -538,7 +538,7 @@ requires are as follows:
# ...
gotenberg:
image: gotenberg/gotenberg:7.4
image: gotenberg/gotenberg:7.6
restart: unless-stopped
command:
- "gotenberg"
@@ -701,6 +701,7 @@ PAPERLESS_CONSUMER_ENABLE_BARCODES=<bool>
Defaults to false.
PAPERLESS_CONSUMER_BARCODE_TIFF_SUPPORT=<bool>
Whether TIFF image files should be scanned for barcodes.
This will automatically convert any TIFF image(s) to pdfs for later
@@ -901,6 +902,14 @@ PAPERLESS_OCR_LANGUAGES=<list>
Defaults to none, which does not install any additional languages.
PAPERLESS_ENABLE_FLOWER=<defined>
If this environment variable is defined, the Celery monitoring tool
`Flower <https://flower.readthedocs.io/en/latest/index.html>`_ will
be started by the container.
You can read more about this in the :ref:`advanced setup <advanced-celery-monitoring>`
documentation.
.. _configuration-update-checking:
@@ -908,18 +917,9 @@ Update Checking
###############
PAPERLESS_ENABLE_UPDATE_CHECK=<bool>
Enable (or disable) the automatic check for available updates. This feature is disabled
by default but if it is not explicitly set Paperless-ngx will show a message about this.
If enabled, the feature works by pinging the the Github API for the latest release e.g.
https://api.github.com/repos/paperless-ngx/paperless-ngx/releases/latest
to determine whether a new version is available.
.. note::
Actual updating of the app must still be performed manually.
Note that for users of thirdy-party containers e.g. linuxserver.io this notification
may be 'ahead' of a new release from the third-party maintainers.
In either case, no tracking data is collected by the app in any way.
Defaults to none, which disables the feature.
This setting was deprecated in favor of a frontend setting after v1.9.2. A one-time
migration is performed for users who have this setting set. This setting is always
ignored if the corresponding frontend setting has been set.

View File

@@ -112,7 +112,7 @@ To do the setup you need to perform the steps from the following chapters in a c
.. code:: shell-session
python3 manage.py runserver & python3 manage.py document_consumer & python3 manage.py qcluster
python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
11. Login with the superuser credentials provided in step 8 at ``http://localhost:8000`` to create a session that enables you to use the backend.
@@ -128,14 +128,14 @@ Configure the IDE to use the src/ folder as the base source folder. Configure th
launch configurations in your IDE:
* python3 manage.py runserver
* python3 manage.py qcluster
* celery --app paperless worker
* python3 manage.py document_consumer
To start them all:
.. code:: shell-session
python3 manage.py runserver & python3 manage.py document_consumer & python3 manage.py qcluster
python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
Testing and code style:

View File

@@ -1 +1 @@
myst-parser==0.17.2
myst-parser==0.18.1

View File

@@ -39,7 +39,7 @@ Paperless consists of the following components:
.. _setup-task_processor:
* **The task processor:** Paperless relies on `Django Q <https://django-q.readthedocs.io/en/latest/>`_
* **The task processor:** Paperless relies on `Celery - Distributed Task Queue <https://docs.celeryq.dev/en/stable/index.html>`_
for doing most of the heavy lifting. This is a task queue that accepts tasks from
multiple sources and processes these in parallel. It also comes with a scheduler that executes
certain commands periodically.
@@ -62,13 +62,6 @@ Paperless consists of the following components:
tasks fail and inspect the errors (i.e., wrong email credentials, errors during consuming a specific
file, etc).
You may start the task processor by executing:
.. code:: shell-session
$ cd /path/to/paperless/src/
$ python3 manage.py qcluster
* A `redis <https://redis.io/>`_ message broker: This is a really lightweight service that is responsible
for getting the tasks from the webserver and the consumer to the task scheduler. These run in a different
process (maybe even on different machines!), and therefore, this is necessary.
@@ -291,7 +284,20 @@ Build the Docker image yourself
.. code:: yaml
webserver:
build: .
build:
context: .
args:
QPDF_VERSION: x.y.x
PIKEPDF_VERSION: x.y.z
PSYCOPG2_VERSION: x.y.z
JBIG2ENC_VERSION: 0.29
.. note::
You should match the build argument versions to the version for the release you have
checked out. These are pre-built images with certain, more updated software.
If you want to build these images your self, that is possible, but beyond
the scope of these steps.
4. Follow steps 3 to 8 of :ref:`setup-docker_hub`. When asked to run
``docker-compose pull`` to pull the image, do
@@ -332,7 +338,7 @@ writing. Windows is not and will never be supported.
.. code::
python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev libmagic-dev mime-support libzbar0 poppler-utils
python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev libmagic-dev mime-support libzbar0 poppler-utils
These dependencies are required for OCRmyPDF, which is used for text recognition.
@@ -361,7 +367,7 @@ writing. Windows is not and will never be supported.
You will also need ``build-essential``, ``python3-setuptools`` and ``python3-wheel``
for installing some of the python dependencies.
2. Install ``redis`` >= 5.0 and configure it to start automatically.
2. Install ``redis`` >= 6.0 and configure it to start automatically.
3. Optional. Install ``postgresql`` and configure a database, user and password for paperless. If you do not wish
to use PostgreSQL, MariaDB and SQLite are available as well.
@@ -461,8 +467,9 @@ writing. Windows is not and will never be supported.
as a starting point.
Paperless needs the ``webserver`` script to run the webserver, the
``consumer`` script to watch the input folder, and the ``scheduler``
script to run tasks such as email checking and document consumption.
``consumer`` script to watch the input folder, ``taskqueue`` for the background workers
used to handle things like document consumption and the ``scheduler`` script to run tasks such as
email checking at certain times .
The ``socket`` script enables ``gunicorn`` to run on port 80 without
root privileges. For this you need to uncomment the ``Require=paperless-webserver.socket``
@@ -513,6 +520,13 @@ writing. Windows is not and will never be supported.
to compile this by yourself, because this software has been patented until around 2017 and
binary packages are not available for most distributions.
15. Optional: If using the NLTK machine learning processing (see ``PAPERLESS_ENABLE_NLTK`` in
:ref:`configuration` for details), download the NLTK data for the Snowball Stemmer, Stopwords
and Punkt tokenizer to your ``PAPERLESS_DATA_DIR/nltk``. Refer to
the `NLTK instructions <https://www.nltk.org/data.html>`_ for details on how to
download the data.
Migrating to Paperless-ngx
##########################
@@ -809,6 +823,8 @@ configuring some options in paperless can help improve performance immensely:
OCR results.
* If using docker, consider setting ``PAPERLESS_WEBSERVER_WORKERS`` to
1. This will save some memory.
* Consider setting ``PAPERLESS_ENABLE_NLTK`` to false, to disable the more
advanced language processing, which can take more memory and processing time.
For details, refer to :ref:`configuration`.

View File

@@ -19,7 +19,7 @@ Check for the following issues:
.. code:: shell-session
$ python3 manage.py qcluster
$ celery --app paperless worker
* Look at the output of paperless and inspect it for any errors.
* Go to the admin interface, and check if there are failed tasks. If so, the
@@ -125,7 +125,7 @@ If using docker-compose, this is achieved by the following configuration change
.. code:: yaml
gotenberg:
image: gotenberg/gotenberg:7.4
image: gotenberg/gotenberg:7.6
restart: unless-stopped
command:
- "gotenberg"