Merge branch 'dev'

2026-01-30 23:08:59 -06:00 · 2020-12-22 15:58:06 +01:00
parent 45bf921a0a 802c765310
commit bdaf6cd1d0
87 changed files with 2052 additions and 983 deletions
--- a/docs/advanced_usage.rst
+++ b/docs/advanced_usage.rst
@@ -5,85 +5,6 @@ Advanced topics
 Paperless offers a couple features that automate certain tasks and make your life
 easier.

-Guesswork
-#########
-
-
-Any document you put into the consumption directory will be consumed, but if
-you name the file right, it'll automatically set some values in the database
-for you.  This is is the logic the consumer follows:
-
-1. Try to find the correspondent, title, and tags in the file name following
-   the pattern: ``Date - Correspondent - Title - tag,tag,tag.pdf``.  Note that
-   the format of the date is **rigidly defined** as ``YYYYMMDDHHMMSSZ`` or
-   ``YYYYMMDDZ``.  The ``Z`` refers "Zulu time" AKA "UTC".
-   The tags are optional, so the format ``Date - Correspondent - Title.pdf``
-   works as well.
-2. If that doesn't work, we skip the date and try this pattern:
-   ``Correspondent - Title - tag,tag,tag.pdf``.
-3. If that doesn't work, we try to find the correspondent and title in the file
-   name following the pattern: ``Correspondent - Title.pdf``.
-4. If that doesn't work, just assume that the name of the file is the title.
-
-So given the above, the following examples would work as you'd expect:
-
-* ``20150314000700Z - Some Company Name - Invoice 2016-01-01 - money,invoices.pdf``
-* ``20150314Z - Some Company Name - Invoice 2016-01-01 - money,invoices.pdf``
-* ``Some Company Name - Invoice 2016-01-01 - money,invoices.pdf``
-* ``Another Company - Letter of Reference.jpg``
-* ``Dad's Recipe for Pancakes.png``
-
-These however wouldn't work:
-
-* ``2015-03-14 00:07:00 UTC - Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
-* ``2015-03-14 - Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
-* ``Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
-* ``Another Company- Letter of Reference.jpg``
-
-Do I have to be so strict about naming?
-=======================================
-
-Rather than using the strict document naming rules, one can also set the option
-``PAPERLESS_FILENAME_DATE_ORDER`` in ``paperless.conf`` to any date order
-that is accepted by dateparser_. Doing so will cause ``paperless`` to default
-to any date format that is found in the title, instead of a date pulled from
-the document's text, without requiring the strict formatting of the document
-filename as described above.
-
-.. _dateparser: https://github.com/scrapinghub/dateparser/blob/v0.7.0/docs/usage.rst#settings
-
-.. _advanced-transforming_filenames:
-
-Transforming filenames for parsing
-==================================
-
-Some devices can't produce filenames that can be parsed by the default
-parser. By configuring the option ``PAPERLESS_FILENAME_PARSE_TRANSFORMS`` in
-``paperless.conf`` one can add transformations that are applied to the filename
-before it's parsed.
-
-The option contains a list of dictionaries of regular expressions (key:
-``pattern``) and replacements (key: ``repl``) in JSON format, which are
-applied in order by passing them to ``re.subn``. Transformation stops
-after the first match, so at most one transformation is applied. The general
-syntax is
-
-.. code:: python
-
-   [{"pattern":"pattern1", "repl":"repl1"}, {"pattern":"pattern2", "repl":"repl2"}, ..., {"pattern":"patternN", "repl":"replN"}]
-
-The example below is for a Brother ADS-2400N, a scanner that allows
-different names to different hardware buttons (useful for handling
-multiple entities in one instance), but insists on adding ``_<count>``
-to the filename.
-
-.. code:: python
-
-   # Brother profile configuration, support "Name_Date_Count" (the default
-   # setting) and "Name_Count" (use "Name" as tag and "Count" as title).
-   PAPERLESS_FILENAME_PARSE_TRANSFORMS=[{"pattern":"^([a-z]+)_(\\d{8})_(\\d{6})_([0-9]+)\\.", "repl":"\\2\\3Z - \\4 - \\1."}, {"pattern":"^([a-z]+)_([0-9]+)\\.", "repl":" - \\2 - \\1."}]
-
-
 .. _advanced-matching:

 Matching tags, correspondents and document types
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -221,21 +221,16 @@ Each fragment contains a list of strings, and some of them are marked as a highl

    [
        [
-            {"text": "This is a sample text with a "},
-            {"text": "highlighted", "term": 0},
-            {"text": " word."}
+            {"text": "This is a sample text with a ", "highlight": false},
+            {"text": "highlighted", "highlight": true},
+            {"text": " word.", "highlight": false}
        ],
        [
-            {"text": "Another", "term": 1},
-            {"text": " fragment with a highlight."}
+            {"text": "Another", "highlight": true},
+            {"text": " fragment with a highlight.", "highlight": false}
        ]
    ]

-
-
-When ``term`` is present within a string, the word within ``text`` should be highlighted.
-The term index groups multiple matches together and words with the same index
-should get identical highlighting.
 A client may use this example to produce the following output:

 ... This is a sample text with a **highlighted** word. ... **Another** fragment with a highlight. ...
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -6,6 +6,40 @@ Changelog
 *********


+paperless-ng 0.9.9
+##################
+
+Christmas release!
+
+* Bulk editing
+
+  * Paperless now supports bulk editing.
+  * The following operations are available: adding and removing correspondents, tags, and document types on selected documents, as well as mass-deleting documents.
+  * We've got a fancier UI in the works that makes these features more accessible, but that's not quite ready yet.
+
+* Searching
+
+  * Paperless now supports searching for similar documents ("More like this"), both from the document detail page and from individual search results.
+  * A search score indicates how well a document matches the search query, or how similar a document is to a given reference document.
+
+* Other additions and changes
+
+  * Clarification in the UI that the fields "Match" and "Is insensitive" are not relevant for the Auto matching algorithm.
+  * New select interface for tags, types and correspondents allows filtering. This also improves tag selection. Thanks again to `Michael Shamoon`_!
+  * Page navigation controls for the document viewer, thanks to `Michael Shamoon`_.
+  * Layout changes to the small cards document list.
+  * The dashboard now displays the username (or the full name, if specified in the admin).
+
+* Fixes
+
+  * An error that caused the document importer to crash was fixed.
+  * An issue that made changes impossible when ``PAPERLESS_COOKIE_PREFIX`` is used was fixed.
+  * The date selection filters now allow manual entry of dates.
+
+* Feature Removal
+
+  * Most of the guesswork features have been removed. Paperless no longer tries to extract correspondents and tags from file names.
+
 paperless-ng 0.9.8
 ##################

--- a/docs/configuration.rst
+++ b/docs/configuration.rst
@@ -400,11 +400,6 @@ PAPERLESS_FILENAME_DATE_ORDER=<format>

    Defaults to none, which disables this feature.

-PAPERLESS_FILENAME_PARSE_TRANSFORMS
-    Transforms filenames before they are processed by paperless. See
-    :ref:`advanced-transforming_filenames` for details.
-
-    Defaults to none, which disables this feature.

 Binaries
 ########
--- a/docs/setup.rst
+++ b/docs/setup.rst
@@ -120,6 +120,8 @@ The `bare metal route`_ is more complicated to setup but makes it easier
 should you want to contribute some code back. You need to configure and
 run the above mentioned components yourself.

+.. _setup-docker_route:
+
 Docker Route
 ============

--- a/docs/troubleshooting.rst
+++ b/docs/troubleshooting.rst
@@ -39,7 +39,7 @@ Operation not permitted

 You might see errors such as:

-.. code::
+.. code:: shell-session

    chown: changing ownership of '../export': Operation not permitted

@@ -49,3 +49,29 @@ to these folders. This happens when pointing these directories to NFS shares,
 for example.

 Ensure that `chown` is possible on these directories.
+
+Classifier error: No training data available
+############################################
+
+This indicates that the Auto matching algorithm found no documents to learn from.
+There are two possible causes:
+
+*   You don't use the Auto matching algorithm: The error can be safely ignored in this case.
+*   You are using the Auto matching algorithm: The classifier explicitly excludes documents
+    with inbox tags. Verify that there are documents in your archive without inbox tags.
+    The algorithm will only learn from documents not in your inbox.
+
+Permission denied errors in the consumption directory
+#####################################################
+
+You might encounter errors such as:
+
+.. code:: shell-session
+
+    The following error occured while consuming document.pdf: [Errno 13] Permission denied: '/usr/src/paperless/src/../consume/document.pdf'
+
+This happens when paperless does not have permission to delete files inside the consumption directory.
+Ensure that ``USERMAP_UID`` and ``USERMAP_GID`` are set to the user id and group id you use on the host operating system, if these are
+different from ``1000``. See :ref:`setup-docker_route`.
+
+Also ensure that you are able to read and write to the consumption directory on the host.