Merge remote-tracking branch 'paperless/dev' into feature-consume-eml

2025-12-24 02:05:48 -06:00 · 2022-07-11 23:58:21 +02:00
parent d0a0ae91c4 72ce4405d5
commit cdd2b99b6b
214 changed files with 49850 additions and 27323 deletions
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -31,7 +31,8 @@ The objects served by the document endpoint contain the following fields:
 *   ``tags``: List of IDs of tags assigned to this document, or empty list.
 *   ``document_type``: Document type of this document, or null.
 *   ``correspondent``:  Correspondent of this document or null.
-*   ``created``: The date at which this document was created.
+*   ``created``: The date and time at which this document was created.
+*   ``created_date``: The date (YYYY-MM-DD) at which this document was created. Optional. If ``created`` is also passed, this field is ignored.
 *   ``modified``: The date at which this document was last edited in paperless. Read-only.
 *   ``added``: The date at which this document was added to paperless. Read-only.
 *   ``archive_serial_number``: The identifier of this document in a physical document archive.
--- a/docs/configuration.rst
+++ b/docs/configuration.rst
@@ -424,14 +424,23 @@ PAPERLESS_OCR_IMAGE_DPI=<num>
    the produced PDF documents are A4 sized.

 PAPERLESS_OCR_MAX_IMAGE_PIXELS=<num>
-    Paperless will not OCR images that have more pixels than this limit.
-    This is intended to prevent decompression bombs from overloading paperless.
-    Increasing this limit is desired if you face a DecompressionBombError despite
-    the concerning file not being malicious; this could e.g. be caused by invalidly
-    recognized metadata.
-    If you have enough resources or if you are certain that your uploaded files
-    are not malicious you can increase this value to your needs.
-    The default value is 256000000, an image with more pixels than that would not be parsed.
+    Paperless will raise a warning when OCRing images which are over this limit and
+    will not OCR images which are more than twice this limit.  Note this does not
+    prevent the document from being consumed, but could result in missing text content.
+
+    If unset, will default to the value determined by
+    `Pillow <https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.MAX_IMAGE_PIXELS>`_.
+
+    .. note::
+
+        Increasing this limit could cause Paperless to consume additional resources
+        when consuming a file.  Be sure you have sufficient system resources.
+
+    .. caution::
+
+        The limit is intended to prevent malicious files from consuming system resources
+        and causing crashes and other errors.  Only increase this value if you are certain
+        your documents are not malicious and you need the text which was not OCRed.

 PAPERLESS_OCR_USER_ARGS=<json>
    OCRmyPDF offers many more options. Use this parameter to specify any
@@ -700,13 +709,6 @@ PAPERLESS_CONVERT_TMPDIR=<path>

    Default is none, which disables the temporary directory.

-PAPERLESS_OPTIMIZE_THUMBNAILS=<bool>
-    Use optipng to optimize thumbnails. This usually reduces the size of
-    thumbnails by about 20%, but uses considerable compute time during
-    consumption.
-
-    Defaults to true.
-
 PAPERLESS_POST_CONSUME_SCRIPT=<filename>
    After a document is consumed, Paperless can trigger an arbitrary script if
    you like.  This script will be passed a number of arguments for you to work
@@ -777,9 +779,6 @@ PAPERLESS_CONVERT_BINARY=<path>
 PAPERLESS_GS_BINARY=<path>
    Defaults to "/usr/bin/gs".

-PAPERLESS_OPTIPNG_BINARY=<path>
-    Defaults to "/usr/bin/optipng".
-

 .. _configuration-docker:

--- a/docs/setup.rst
+++ b/docs/setup.rst
@@ -200,6 +200,19 @@ Install Paperless from Docker Hub
        You can copy any setting from the file ``paperless.conf.example`` and paste it here.
        Have a look at :ref:`configuration` to see what's available.

+    .. note::
+
+        You can utilize Docker secrets for some configuration settings by
+        appending ``_FILE`` to the names of some configuration variables.  This is currently
+        supported only by:
+          * PAPERLESS_DBUSER
+          * PAPERLESS_DBPASS
+          * PAPERLESS_SECRET_KEY
+          * PAPERLESS_AUTO_LOGIN_USERNAME
+          * PAPERLESS_ADMIN_USER
+          * PAPERLESS_ADMIN_MAIL
+          * PAPERLESS_ADMIN_PASSWORD
+
    .. caution::

        Some file systems such as NFS network shares don't support file system
@@ -286,7 +299,6 @@ writing. Windows is not and will never be supported.

    *   ``fonts-liberation`` for generating thumbnails for plain text files
    *   ``imagemagick`` >= 6 for PDF conversion
-    *   ``optipng`` for optimizing thumbnails
    *   ``gnupg`` for handling encrypted documents
    *   ``libpq-dev`` for PostgreSQL
    *   ``libmagic-dev`` for mime type detection
@@ -298,7 +310,7 @@ writing. Windows is not and will never be supported.

    .. code::

-        python3 python3-pip python3-dev imagemagick fonts-liberation optipng gnupg libpq-dev libmagic-dev mime-support libzbar0 poppler-utils
+        python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev libmagic-dev mime-support libzbar0 poppler-utils

    These dependencies are required for OCRmyPDF, which is used for text recognition.

@@ -730,8 +742,6 @@ configuring some options in paperless can help improve performance immensely:
 *   If you want to perform OCR on the device, consider using ``PAPERLESS_OCR_CLEAN=none``.
    This will speed up OCR times and use less memory at the expense of slightly worse
    OCR results.
-*   Set ``PAPERLESS_OPTIMIZE_THUMBNAILS`` to 'false' if you want faster consumption
-    times. Thumbnails will be about 20% larger.
 *   If using docker, consider setting ``PAPERLESS_WEBSERVER_WORKERS`` to
    1. This will save some memory.

--- a/docs/usage_overview.rst
+++ b/docs/usage_overview.rst
@@ -161,6 +161,9 @@ These are as follows:
    will not consume flagged mails.
 *   **Move to folder:** Moves consumed mails out of the way so that paperless won't
    consume them again.
+*   **Add custom Tag:** Adds a custom tag to mails with consumed documents (the IMAP
+    standard calls these "keywords"). Paperless will not consume mails already tagged.
+    Not all mail servers support this feature!

 .. caution::