Mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-11-03 03:16:10 -06:00)

many changes to the documentation, mostly typos
@@ -30,7 +30,7 @@ Options available to docker installations:
     Paperless uses 3 volumes:

     *   ``paperless_media``: This is where your documents are stored.
-    *   ``paperless_data``: This is where auxilliary data is stored. This
+    *   ``paperless_data``: This is where auxillary data is stored. This
         folder also contains the SQLite database, if you use it.
     *   ``paperless_pgdata``: Exists only if you use PostgreSQL and contains
         the database.
@@ -109,7 +109,7 @@ B.  If you built the image yourself, grab the new archive and replace your curre
 .. hint::

     You can usually keep your ``docker-compose.env`` file, since this file will
-    never include mandantory configuration options. However, it is worth checking
+    never include mandatory configuration options. However, it is worth checking
     out the new version of this file, since it might have new recommendations
     on what to configure.

@@ -126,8 +126,8 @@ After grabbing the new release and unpacking the contents, do the following:

         $ pip install --upgrade pipenv
         $ cd /path/to/paperless
-        $ pipenv install
         $ pipenv clean
+        $ pipenv install

     This creates a new virtual environment (or uses your existing environment)
     and installs all dependencies into it.
@@ -247,12 +247,12 @@ your already processed documents.

 When multiple document types or correspondents match a single document,
 the retagger won't assign these to the document. Specify ``--use-first``
-to override this behaviour and just use the first correspondent or type
+to override this behavior and just use the first correspondent or type
 it finds. This option does not apply to tags, since any amount of tags
 can be applied to a document.

 Finally, ``-f`` specifies that you wish to overwrite already assigned
-correspondents, types and/or tags. The default behaviour is to not
+correspondents, types and/or tags. The default behavior is to not
 assign correspondents and types to documents that have this data already
 assigned. ``-f`` works differently for tags: By default, only additional tags get
 added to documents, no tags will be removed. With ``-f``, tags that don't
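The ``--use-first`` rule described in the hunk above can be sketched as follows. This is only an illustration, assuming the retagger has already collected the list of matching correspondents (or document types); the function name ``pick_match`` and its arguments are hypothetical and not taken from paperless.

```python
from typing import Optional, Sequence


def pick_match(candidates: Sequence[str], use_first: bool = False) -> Optional[str]:
    """Return the correspondent or type to assign, or None to assign nothing."""
    if len(candidates) == 1:
        # Exactly one match: unambiguous, so assign it.
        return candidates[0]
    if candidates and use_first:
        # Several matches and --use-first given: take the first one found.
        return candidates[0]
    # No match, or an ambiguous match without --use-first.
    return None
```

Tags are not covered by this sketch, since the text says any number of tags can be applied to a document.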
@@ -341,7 +341,7 @@ Documents can be stored in Paperless using GnuPG encryption.

 .. danger::

-    Encryption is depreceated since paperless-ng 0.9 and doesn't really provide any
+    Encryption is deprecated since paperless-ng 0.9 and doesn't really provide any
     additional security, since you have to store the passphrase in a configuration
     file on the same system as the encrypted documents for paperless to work.
     Furthermore, the entire text content of the documents is stored plain in the

@@ -84,6 +84,8 @@ to the filename.
    PAPERLESS_FILENAME_PARSE_TRANSFORMS=[{"pattern":"^([a-z]+)_(\\d{8})_(\\d{6})_([0-9]+)\\.", "repl":"\\2\\3Z - \\4 - \\1."}, {"pattern":"^([a-z]+)_([0-9]+)\\.", "repl":" - \\2 - \\1."}]


+.. _advanced-matching:
+
 Matching tags, correspondents and document types
 ################################################

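To see what the ``PAPERLESS_FILENAME_PARSE_TRANSFORMS`` value in the hunk above does to a filename, the patterns can be tried out with Python's ``re`` module. This is a rough sketch, not paperless' own code: it assumes the transforms are tried in order and that the first pattern that matches wins, which this page does not confirm. The helper name ``apply_transforms`` and the sample filenames are made up for illustration.

```python
import json
import re

# Value copied verbatim from the configuration example (JSON with escaped backslashes).
RAW_TRANSFORMS = r'''[{"pattern":"^([a-z]+)_(\\d{8})_(\\d{6})_([0-9]+)\\.", "repl":"\\2\\3Z - \\4 - \\1."}, {"pattern":"^([a-z]+)_([0-9]+)\\.", "repl":" - \\2 - \\1."}]'''


def apply_transforms(filename: str, raw: str = RAW_TRANSFORMS) -> str:
    """Apply the first transform whose pattern matches (assumed behavior)."""
    for transform in json.loads(raw):
        new_name, count = re.subn(transform["pattern"], transform["repl"], filename)
        if count:
            return new_name
    # No pattern matched: leave the filename untouched.
    return filename


print(apply_transforms("statement_20210102_030405_17.pdf"))
print(apply_transforms("invoice_42.pdf"))
```

The first pattern rearranges a name, date, time and number into a timestamp-first filename; the second handles the shorter ``name_number`` form.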
@@ -253,7 +255,7 @@ By default, paperless stores your documents in the media directory and renames t
 using the identifier which it has assigned to each document. You will end up getting
 files like ``0000123.pdf`` in your media directory. This isn't necessarily a bad
 thing, because you normally don't have to access these files manually. However, if
-you wish to name your files differently, you can do that by adjustng the
+you wish to name your files differently, you can do that by adjusting the
 ``PAPERLESS_FILENAME_FORMAT`` settings variable.

 This variable allows you to configure the filename (folders are allowed!) using
@@ -278,7 +280,7 @@ will create a directory structure as follows:
         my_new_shoes-0000004.pdf

 Paperless appends the unique identifier of each document to the filename. This
-avoides filename clashes.
+avoids filename clashes.

 .. danger::


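The identifier suffix mentioned in the hunk above can be illustrated with a small sketch. The seven-digit zero padding is inferred from the examples ``0000123.pdf`` and ``my_new_shoes-0000004.pdf``; the function name ``stored_filename`` is hypothetical and is not how paperless implements this.

```python
def stored_filename(name: str, doc_id: int, extension: str = "pdf") -> str:
    """Append the zero-padded document id so two equal names never clash."""
    return f"{name}-{doc_id:07d}.{extension}"


print(stored_filename("my_new_shoes", 4))
```

Because every document has a distinct identifier, two documents with the same formatted name still end up in different files.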
@@ -94,7 +94,7 @@ Result object:
     }

 *   ``id``: the primary key of the found document
-*   ``highlights``: an object containing parseable highlights for the result.
+*   ``highlights``: an object containing parsable highlights for the result.
     See below.
 *   ``score``: The score assigned to the document. A higher score indicates a
     better match with the query. Search results are sorted descending by score.

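The ordering rule from the hunk above (higher score first) can be sketched like this. The result dictionaries only carry the three fields named in the text, and their values are invented for illustration; the real API response may contain more.

```python
def sort_results(results):
    """Sort result objects descending by score, best match first."""
    return sorted(results, key=lambda result: result["score"], reverse=True)


# Hypothetical result objects with made-up ids and scores.
sample = [
    {"id": 12, "highlights": {}, "score": 1.5},
    {"id": 7, "highlights": {}, "score": 4.0},
    {"id": 31, "highlights": {}, "score": 2.25},
]
ordered_ids = [result["id"] for result in sort_results(sample)]
```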
@@ -52,7 +52,7 @@ paperless-ng 0.9.0
 * **Added:** New frontend. Features:

   * Single page application: It's much more responsive than the django admin pages.
-  * Dashboard. Shows recently scanned documents, or todos, or other documents
+  * Dashboard. Shows recently scanned documents, or todo notes, or other documents
     at wish. Allows uploading of documents. Shows basic statistics.
   * Better document list with multiple display options.
   * Full text search with result highlighting, auto completion and scoring based
@@ -102,7 +102,7 @@ paperless-ng 0.9.0

 * **Modified [breaking]:** PostgreSQL:

-  * If ``PAPERLESS_DBHOST`` is specified in the settings, paperless uses postgresql instead of sqlite.
+  * If ``PAPERLESS_DBHOST`` is specified in the settings, paperless uses PostgreSQL instead of SQLite.
     Username, database and password all default to ``paperless`` if not specified.

 * **Modified [breaking]:** document_retagger management command rework. See
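The database selection described in the hunk above can be sketched as a small function. Only ``PAPERLESS_DBHOST`` and the ``paperless`` defaults come from the text; the variable names ``PAPERLESS_DBNAME``, ``PAPERLESS_DBUSER`` and ``PAPERLESS_DBPASS`` and the shape of the returned dictionary are assumptions made for this example.

```python
from typing import Dict, Mapping


def database_settings(env: Mapping[str, str]) -> Dict[str, str]:
    """Pick PostgreSQL when PAPERLESS_DBHOST is set, otherwise SQLite."""
    host = env.get("PAPERLESS_DBHOST")
    if not host:
        return {"engine": "sqlite"}
    return {
        "engine": "postgresql",
        "host": host,
        # Database, username and password all default to "paperless".
        "name": env.get("PAPERLESS_DBNAME", "paperless"),
        "user": env.get("PAPERLESS_DBUSER", "paperless"),
        "password": env.get("PAPERLESS_DBPASS", "paperless"),
    }
```

Calling ``database_settings(os.environ)`` would then reflect the current environment; an empty mapping yields the SQLite case.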
@@ -130,7 +130,7 @@ paperless-ng 0.9.0
     Certain language specifics such as umlauts may not get picked up properly.
   * ``PAPERLESS_DEBUG`` defaults to ``false``.
   * The presence of ``PAPERLESS_DBHOST`` now determines whether to use PostgreSQL or
-    sqlite.
+    SQLite.
   * ``PAPERLESS_OCR_THREADS`` is gone and replaced with ``PAPERLESS_TASK_WORKERS`` and
     ``PAPERLESS_THREADS_PER_WORKER``. Refer to the config example for details.
   * ``PAPERLESS_OPTIMIZE_THUMBNAILS`` allows you to disable or enable thumbnail

@@ -69,7 +69,7 @@ PAPERLESS_CONSUMPTION_DIR=<path>
     Defaults to "../consume", relative to the "src" directory.

 PAPERLESS_DATA_DIR=<path>
-    This is where paperless stores all its data (search index, sqlite database,
+    This is where paperless stores all its data (search index, SQLite database,
     classification model, etc).

     Defaults to "../data", relative to the "src" directory.
@@ -100,7 +100,7 @@ Hosting & Security
 ##################

 PAPERLESS_SECRET_KEY=<key>
-    Paperless uses this to make session tokens. If you exose paperless on the
+    Paperless uses this to make session tokens. If you expose paperless on the
     internet, you need to change this, since the default secret is well known.

     Use any sequence of characters. The more, the better. You don't need to
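One possible way to produce a long random value for ``PAPERLESS_SECRET_KEY`` is Python's standard ``secrets`` module. This is a suggestion rather than something the documentation prescribes; the function name and the default of 64 random bytes are arbitrary choices for this sketch.

```python
import secrets


def generate_secret_key(num_bytes: int = 64) -> str:
    """Return a URL-safe random string built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)


# Prints a line that could be pasted into the configuration file.
print(f"PAPERLESS_SECRET_KEY={generate_secret_key()}")
```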
@@ -220,7 +220,7 @@ PAPERLESS_CONSUMER_POLLING=<num>
     specify a polling interval in seconds here, which will then cause paperless
     to periodically check your consumption directory for changes.

-    Defaults to 0, which disables polling and uses filesystem notifiactions.
+    Defaults to 0, which disables polling and uses filesystem notifications.

 PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool>
     When the consumer detects a duplicate document, it will not touch the
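The effect of ``PAPERLESS_CONSUMER_POLLING`` described in the hunk above can be summarised in a short sketch. The function name, the returned tuple and the treatment of negative values as "disabled" are assumptions for illustration only.

```python
from typing import Mapping, Tuple


def consumer_mode(env: Mapping[str, str]) -> Tuple[str, int]:
    """Return ("polling", seconds) or ("notify", 0) based on the setting."""
    interval = int(env.get("PAPERLESS_CONSUMER_POLLING", "0"))
    if interval > 0:
        # A positive interval enables periodic checks of the consumption directory.
        return ("polling", interval)
    # The default of 0 disables polling in favor of filesystem notifications.
    return ("notify", 0)
```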
@@ -264,7 +264,7 @@ PAPERLESS_CONVERT_DENSITY=<num>
     Default is 300.

 PAPERLESS_OPTIMIZE_THUMBNAILS=<bool>
-    Use optipng to optimize thumbnails. This usually reduces the sice of
+    Use optipng to optimize thumbnails. This usually reduces the size of
     thumbnails by about 20%, but uses considerable compute time during
     consumption.


@@ -85,7 +85,7 @@ quoted, or triple-quoted string will do:
     problematic_string = 'This is a "string" with "quotes" in it'

 In HTML templates, please use double-quotes for tag attributes, and single
-quotes for arguments passed to Django tempalte tags:
+quotes for arguments passed to Django template tags:

 .. code:: html


@@ -17,7 +17,7 @@ is

 .. caution::

-    Dont mess with this folder. Don't change permissions and don't move
+    Do not mess with this folder. Don't change permissions and don't move
     files around manually. This folder is meant to be entirely managed by docker
     and paperless.

@@ -36,9 +36,9 @@ file extensions do not matter.

 **A:** The short answer is yes. I've tested it on a Raspberry Pi 3 B.
 The long answer is that certain parts of
-Paperless will run very slow, such as the tesseract OCR. On Rasperry Pi,
+Paperless will run very slow, such as the tesseract OCR. On Raspberry Pi,
 try to OCR documents before feeding them into paperless so that paperless can
-reuse the text. The web interface should be alot snappier, since it runs
+reuse the text. The web interface should be a lot snappier, since it runs
 in your browser and paperless has to do much less work to serve the data.

 .. note::

@@ -8,7 +8,7 @@ Scanner recommendations
 As Paperless operates by watching a folder for new files, doesn't care what
 scanner you use, but sometimes finding a scanner that will write to an FTP,
 NFS, or SMB server can be difficult.  This page is here to help you find one
-that works right for you based on recommentations from other Paperless users.
+that works right for you based on recommendations from other Paperless users.

 +---------+----------------+-----+-----+-----+----------------+
 | Brand   | Model          | Supports        | Recommended By |

@@ -21,7 +21,7 @@ Extensive filtering mechanisms:

 .. image:: _static/screenshots/documents-filter.png

-Side-by-side editing of documents. Optmized for 1080p.
+Side-by-side editing of documents. Optimized for 1080p.

 .. image:: _static/screenshots/editing.png


@@ -85,7 +85,7 @@ Paperless consists of the following components:
         needs to do from time to time in order to operate properly.

     This allows paperless to process multiple documents from your consumption folder in parallel! On
-    a modern multicore system, consumption with full ocr is blazing fast.
+    a modern multi core system, consumption with full ocr is blazing fast.

     The task processor comes with a built-in admin interface that you can use to see whenever any of the
     tasks fail and inspect the errors (i.e., wrong email credentials, errors during consuming a specific
@@ -322,7 +322,7 @@ management commands as below.
             $ cd /path/to/paperless
             $ docker-compose run --rm webserver /bin/bash

-        This will lauch the container and initialize the PostgreSQL database.
+        This will launch the container and initialize the PostgreSQL database.

     b)  Without docker, open a shell in your virtual environment, switch to
         the ``src`` directory and create the database schema:
@@ -372,7 +372,7 @@ configuring some options in paperless can help improve performance immensely:
 *   ``PAPERLESS_TASK_WORKERS`` and ``PAPERLESS_THREADS_PER_WORKER`` are configured
     to use all cores. The Raspberry Pi models 3 and up have 4 cores, meaning that
     paperless will use 2 workers and 2 threads per worker. This may result in
-    slugish response times during consumption, so you might want to lower these
+    sluggish response times during consumption, so you might want to lower these
     settings (example: 2 workers and 1 thread to always have some computing power
     left for other tasks).
 *   Keep ``PAPERLESS_OCR_ALWAYS`` at its default value 'false' and consider OCR'ing

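The arithmetic behind the worker and thread example in the hunk above can be checked with a tiny helper. It assumes that workers multiplied by threads per worker approximates the number of cores kept busy during consumption; the function name ``spare_cores`` is hypothetical.

```python
def spare_cores(cores: int, workers: int, threads_per_worker: int) -> int:
    """Cores left for other tasks when every worker thread is busy."""
    return max(cores - workers * threads_per_worker, 0)


# Raspberry Pi with 4 cores: the default 2 x 2 setup uses every core,
# while 2 workers with 1 thread each leaves two cores free.
default_spare = spare_cores(4, 2, 2)
lowered_spare = spare_cores(4, 2, 1)
```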
@@ -5,13 +5,13 @@ Usage Overview
 Paperless is an application that manages your personal documents. With
 the help of a document scanner (see :ref:`scanners`), paperless transforms
 your wieldy physical document binders into a searchable archive and
-provices many utilities for finding and managing your documents.
+provides many utilities for finding and managing your documents.


 Terms and definitions
 #####################

-Paperless esentially consists of two different parts for managing your
+Paperless essentially consists of two different parts for managing your
 documents:

 * The *consumer* watches a specified folder and adds all documents in that
@@ -30,12 +30,12 @@ Each document has a couple of fields that you can assign to them:
   tag, however, a single document can also have multiple tags. This is not
   possible with folders. The reason folders are not implemented in paperless
   is simply that tags are much more versatile than folders.
-* A *document type* is used to demarkate the type of a document such as letter,
+* A *document type* is used to demarcate the type of a document such as letter,
   bank statement, invoice, contract, etc. It is used to identify what a document
   is about.
 * The *date added* of a document is the date the document was scanned into
   paperless. You cannot and should not change this date.
-* The *date created* of a document is the date the document was intially issued.
+* The *date created* of a document is the date the document was initially issued.
   This can be the date you bought a product, the date you signed a contract, or
   the date a letter was sent to you.
 * The *archive serial number* (short: ASN) of a document is the identifier of
@@ -131,7 +131,7 @@ These are as follows:

     With the correct set of rules, you can completely automate your email documents.
     Create rules for every correspondent you receive digital documents from and
-    paperless will read them automatically. The default acion "mark as read" is
+    paperless will read them automatically. The default action "mark as read" is
     pretty tame and will not cause any damage or data loss whatsoever.

     You can also setup a special folder in your mail account for paperless and use
@@ -182,7 +182,7 @@ Processing of the physical documents
 ====================================

 Keep a physical inbox. Whenever you receive a document that you need to
-archive, put it into your inbox. Regulary, do the following for all documents
+archive, put it into your inbox. Regularly, do the following for all documents
 in your inbox:

 1.  For each document, decide if you need to keep the document in physical
@@ -217,18 +217,24 @@ Once you have scanned in a document, proceed in paperless as follows.

 1.  If the document has an ASN, assign the ASN to the document.
 2.  Assign a correspondent to the document (i.e., your employer, bank, etc)
-    This isnt strictly necessary but helps in finding a document when you need
+    This isn't strictly necessary but helps in finding a document when you need
     it.
 3.  Assign a document type (i.e., invoice, bank statement, etc) to the document
-    This isnt strictly necessary but helps in finding a document when you need
+    This isn't strictly necessary but helps in finding a document when you need
     it.
 4.  Assign a proper title to the document (the name of an item you bought, the
     subject of the letter, etc)
-5.  Check that the date of the document is corrent. Paperless tries to read
+5.  Check that the date of the document is correct. Paperless tries to read
     the date from the content of the document, but this fails sometimes if the
     OCR is bad or multiple dates appear on the document.
 6.  Remove inbox tags from the documents.

+.. hint::
+
+    You can setup manual matching rules for your correspondents and tags and
+    paperless will assign them automatically. After consuming a couple documents,
+    you can even ask paperless to *learn* when to assign tags and correspondents
+    by itself. For details on this feature, see :ref:`advanced-matching`.

 Task management
 ===============