Mirror of https://github.com/paperless-ngx/paperless-ngx.git, synced 2025-11-03 03:16:10 -06:00

Compare commits: v2.11.0...sunset-rtd (3 commits)

| Author | SHA1 | Date | |
|---|---|---|---|
| | 15f4808fec | | |
| | d531805597 | | |
| | 304cfc42a9 | | |
							
								
								
									
docs/_static/css/custom.css (8 changes, vendored)
									
									
								
							
							
						
						
									
									
									
								
@@ -595,3 +595,11 @@ html.writer-html5 .rst-content dl.footnote code {
 .wy-nav-content-wrap {
   z-index: 20;
 }
+
+.rst-content .toctree-wrapper {
+  display: none;
+}
+
+.redirect-notice {
+  font-size: 2.5rem;
+}
							
								
								
									
docs/_templates/layout.html (25 changes, vendored)
									
									
								
							
							
						
						
									
									
									
								
@@ -8,6 +8,31 @@
 
         document.documentElement.classList.toggle("dark-mode", darkModeState);
         document.documentElement.classList.toggle("light-mode", !darkModeState);
+
+        const RTD_TO_MKD = {
+            "index.html": "",
+            "setup.html": "setup",
+            "usage_overview.html": "usage",
+            "advanced_usage.html": "advanced_usage",
+            "administration.html": "administration",
+            "configuration.html": "configuration",
+            "api.html": "api",
+            "faq.html": "faq",
+            "troubleshooting.html": "troubleshooting",
+            "extending.html": "development",
+            "scanners.html": "",
+            "screenshots.html": "",
+            "changelog.html": "changelog",
+        }
+
+        const path = (RTD_TO_MKD[window.location.pathname.substring(window.location.pathname.lastIndexOf("/") + 1)] ?? "") + "/";
+        const hash = window.location.hash;
+        const redirectURL = new URL(path  + hash, "https://docs.paperless-ngx.com/");
+        console.log(`Redirecting to ${redirectURL} in 3 seconds...`);
+
+        setTimeout(() => {
+            window.location.replace(redirectURL);
+        }, 3000);
     </script>
     {{ super() }}
 {% endblock %}
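Note on the `layout.html` change: the redirect target is derived from the last path segment of the old page plus the URL fragment. The following is a minimal sketch for checking that mapping outside a browser, not part of the commit. `redirectTarget` is a hypothetical helper that takes the pathname and hash as arguments instead of reading `window.location`, and the map is abbreviated to a few of the entries shown in the diff.

```javascript
// Abbreviated copy of the lookup table from the diff (assumption: only a
// subset of entries is reproduced here for illustration).
const RTD_TO_MKD = {
  "index.html": "",
  "setup.html": "setup",
  "usage_overview.html": "usage",
  "extending.html": "development",
  "scanners.html": "",
};

// Hypothetical helper mirroring the template script's expressions.
function redirectTarget(pathname, hash) {
  // Last path segment, e.g. "setup.html" from "/en/latest/setup.html".
  const file = pathname.substring(pathname.lastIndexOf("/") + 1);
  // Unknown pages fall back to "", so they resolve to the site root.
  const path = (RTD_TO_MKD[file] ?? "") + "/";
  return new URL(path + hash, "https://docs.paperless-ngx.com/").href;
}

console.log(redirectTarget("/en/latest/setup.html", "#docker"));
// -> "https://docs.paperless-ngx.com/setup/#docker"
```

Pages with no counterpart (mapped to `""`) and unmapped filenames both resolve to the site root, because the `??` fallback yields an empty string before the trailing slash is appended.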
@@ -1,531 +1,11 @@
 | 
				
			|||||||
 | 
					.. _administration:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
**************
 | 
					**************
 | 
				
			||||||
Administration
 | 
					Administration
 | 
				
			||||||
**************
 | 
					**************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
.. _administration-backup:
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Making backups
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
##############
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Multiple options exist for making backups of your paperless instance,
 | 
					    You will be redirected shortly...
 | 
				
			||||||
depending on how you installed paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Before making backups, make sure that paperless is not running.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Options available to any installation of paperless:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Use the :ref:`document exporter <utilities-exporter>`.
 | 
					 | 
				
			||||||
    The document exporter exports all your documents, thumbnails and
 | 
					 | 
				
			||||||
    metadata to a specific folder. You may import your documents into a
 | 
					 | 
				
			||||||
    fresh instance of paperless again or store your documents in another
 | 
					 | 
				
			||||||
    DMS with this export.
 | 
					 | 
				
			||||||
*   The document exporter is also able to update an already existing export.
 | 
					 | 
				
			||||||
    Therefore, incremental backups with ``rsync`` are entirely possible.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You cannot import the export generated with one version of paperless in a
 | 
					 | 
				
			||||||
    different version of paperless. The export contains an exact image of the
 | 
					 | 
				
			||||||
    database, and migrations may change the database layout.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Options available to docker installations:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Backup the docker volumes. These usually reside within
 | 
					 | 
				
			||||||
    ``/var/lib/docker/volumes`` on the host and you need to be root in order
 | 
					 | 
				
			||||||
    to access them.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Paperless uses 4 volumes:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``paperless_media``: This is where your documents are stored.
 | 
					 | 
				
			||||||
    *   ``paperless_data``: This is where auxillary data is stored. This
 | 
					 | 
				
			||||||
        folder also contains the SQLite database, if you use it.
 | 
					 | 
				
			||||||
    *   ``paperless_pgdata``: Exists only if you use PostgreSQL and contains
 | 
					 | 
				
			||||||
        the database.
 | 
					 | 
				
			||||||
    *   ``paperless_dbdata``: Exists only if you use MariaDB and contains
 | 
					 | 
				
			||||||
        the database.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Options available to bare-metal and non-docker installations:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Backup the entire paperless folder. This ensures that if your paperless instance
 | 
					 | 
				
			||||||
    crashes at some point or your disk fails, you can simply copy the folder back
 | 
					 | 
				
			||||||
    into place and it works.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    When using PostgreSQL or MariaDB, you'll also have to backup the database.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _migrating-restoring:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Restoring
 | 
					 | 
				
			||||||
=========
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _administration-updating:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Updating Paperless
 | 
					 | 
				
			||||||
##################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Docker Route
 | 
					 | 
				
			||||||
============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If a new release of paperless-ngx is available, upgrading depends on how you
 | 
					 | 
				
			||||||
installed paperless-ngx in the first place. The releases are available at the
 | 
					 | 
				
			||||||
`release page <https://github.com/paperless-ngx/paperless-ngx/releases>`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
First of all, ensure that paperless is stopped.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ cd /path/to/paperless
 | 
					 | 
				
			||||||
    $ docker-compose down
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
After that, :ref:`make a backup <administration-backup>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
A.  If you pull the image from the docker hub, all you need to do is:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ docker-compose pull
 | 
					 | 
				
			||||||
        $ docker-compose up
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The docker-compose files refer to the ``latest`` version, which is always the latest
 | 
					 | 
				
			||||||
    stable release.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
B.  If you built the image yourself, do the following:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ git pull
 | 
					 | 
				
			||||||
        $ docker-compose build
 | 
					 | 
				
			||||||
        $ docker-compose up
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Running ``docker-compose up`` will also apply any new database migrations.
 | 
					 | 
				
			||||||
If you see everything working, press CTRL+C once to gracefully stop paperless.
 | 
					 | 
				
			||||||
Then you can start paperless-ngx with ``-d`` to have it run in the background.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        In version 0.9.14, the update process was changed. In 0.9.13 and earlier, the
 | 
					 | 
				
			||||||
        docker-compose files specified exact versions and pull won't automatically
 | 
					 | 
				
			||||||
        update to newer versions. In order to enable updates as described above, either
 | 
					 | 
				
			||||||
        get the new ``docker-compose.yml`` file from `here <https://github.com/paperless-ngx/paperless-ngx/tree/master/docker/compose>`_
 | 
					 | 
				
			||||||
        or edit the ``docker-compose.yml`` file, find the line that says
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                image: ghcr.io/paperless-ngx/paperless-ngx:0.9.x
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        and replace the version with ``latest``:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                image: ghcr.io/paperless-ngx/paperless-ngx:latest
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
        In version 1.7.1 and onwards, the Docker image can now be pinned to a release series.
 | 
					 | 
				
			||||||
        This is often combined with automatic updaters such as Watchtower to allow safer
 | 
					 | 
				
			||||||
        unattended upgrading to new bugfix releases only.  It is still recommended to always
 | 
					 | 
				
			||||||
        review release notes before upgrading.  To pin your install to a release series, edit
 | 
					 | 
				
			||||||
        the ``docker-compose.yml`` find the line that says
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                image: ghcr.io/paperless-ngx/paperless-ngx:latest
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        and replace the version with the series you want to track, for example:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                image: ghcr.io/paperless-ngx/paperless-ngx:1.7
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Bare Metal Route
 | 
					 | 
				
			||||||
================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
After grabbing the new release and unpacking the contents, do the following:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Update dependencies. New paperless version may require additional
 | 
					 | 
				
			||||||
    dependencies. The dependencies required are listed in the section about
 | 
					 | 
				
			||||||
    :ref:`bare metal installations <setup-bare_metal>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Update python requirements. Keep in mind to activate your virtual environment
 | 
					 | 
				
			||||||
    before that, if you use one.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ pip install -r requirements.txt
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  Migrate the database.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ cd src
 | 
					 | 
				
			||||||
        $ python3 manage.py migrate
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This might not actually do anything. Not every new paperless version comes with new
 | 
					 | 
				
			||||||
    database migrations.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Downgrading Paperless
 | 
					 | 
				
			||||||
#####################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Downgrades are possible. However, some updates also contain database migrations (these change the layout of the database and may move data).
 | 
					 | 
				
			||||||
In order to move back from a version that applied database migrations, you'll have to revert the database migration *before* downgrading,
 | 
					 | 
				
			||||||
and then downgrade paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This table lists the compatible versions for each database migration number.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
+------------------+-----------------+
 | 
					 | 
				
			||||||
| Migration number | Version range   |
 | 
					 | 
				
			||||||
+------------------+-----------------+
 | 
					 | 
				
			||||||
| 1011             | 1.0.0           |
 | 
					 | 
				
			||||||
+------------------+-----------------+
 | 
					 | 
				
			||||||
| 1012             | 1.1.0 - 1.2.1   |
 | 
					 | 
				
			||||||
+------------------+-----------------+
 | 
					 | 
				
			||||||
| 1014             | 1.3.0 - 1.3.1   |
 | 
					 | 
				
			||||||
+------------------+-----------------+
 | 
					 | 
				
			||||||
| 1016             | 1.3.2 - current |
 | 
					 | 
				
			||||||
+------------------+-----------------+
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Execute the following management command to migrate your database:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ python3 manage.py migrate documents <migration number>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Some migrations cannot be undone. The command will issue errors if that happens.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-management-commands:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Management utilities
 | 
					 | 
				
			||||||
####################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless comes with some management commands that perform various maintenance
 | 
					 | 
				
			||||||
tasks on your paperless instance. You can invoke these commands in the following way:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
With docker-compose, while paperless is running:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ cd /path/to/paperless
 | 
					 | 
				
			||||||
    $ docker-compose exec webserver <command> <arguments>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
With docker, while paperless is running:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ docker exec -it <container-name> <command> <arguments>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Bare metal:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ cd /path/to/paperless/src
 | 
					 | 
				
			||||||
    $ python3 manage.py <command> <arguments>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
All commands have built-in help, which can be accessed by executing them with
 | 
					 | 
				
			||||||
the argument ``--help``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-exporter:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Document exporter
 | 
					 | 
				
			||||||
=================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The document exporter exports all your data from paperless into a folder for
 | 
					 | 
				
			||||||
backup or migration to another DMS.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you use the document exporter within a cronjob to backup your data you might use the ``-T`` flag behind exec to suppress "The input device is not a TTY" errors. For example: ``docker-compose exec -T webserver document_exporter ../export``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_exporter target [-c] [-f] [-d]
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    optional arguments:
 | 
					 | 
				
			||||||
    -c, --compare-checksums
 | 
					 | 
				
			||||||
    -f, --use-filename-format
 | 
					 | 
				
			||||||
    -d, --delete
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
``target`` is a folder to which the data gets written. This includes documents,
 | 
					 | 
				
			||||||
thumbnails and a ``manifest.json`` file. The manifest contains all metadata from
 | 
					 | 
				
			||||||
the database (correspondents, tags, etc).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When you use the provided docker compose script, specify ``../export`` as the
 | 
					 | 
				
			||||||
target. This path inside the container is automatically mounted on your host on
 | 
					 | 
				
			||||||
the folder ``export``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If the target directory already exists and contains files, paperless will assume
 | 
					 | 
				
			||||||
that the contents of the export directory are a previous export and will attempt
 | 
					 | 
				
			||||||
to update the previous export. Paperless will only export changed and added files.
 | 
					 | 
				
			||||||
Paperless determines whether a file has changed by inspecting the file attributes
 | 
					 | 
				
			||||||
"date/time modified" and "size". If that does not work out for you, specify
 | 
					 | 
				
			||||||
``--compare-checksums`` and paperless will attempt to compare file checksums instead.
 | 
					 | 
				
			||||||
This is slower.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless will not remove any existing files in the export directory. If you want
 | 
					 | 
				
			||||||
paperless to also remove files that do not belong to the current export such as files
 | 
					 | 
				
			||||||
from deleted documents, specify ``--delete``. Be careful when pointing paperless to
 | 
					 | 
				
			||||||
a directory that already contains other files.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The filenames generated by this command follow the format
 | 
					 | 
				
			||||||
``[date created] [correspondent] [title].[extension]``.
 | 
					 | 
				
			||||||
If you want paperless to use ``PAPERLESS_FILENAME_FORMAT`` for exported filenames
 | 
					 | 
				
			||||||
instead, specify ``--use-filename-format``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-importer:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Document importer
 | 
					 | 
				
			||||||
=================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The document importer takes the export produced by the `Document exporter`_ and
 | 
					 | 
				
			||||||
imports it into paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The importer works just like the exporter.  You point it at a directory, and
 | 
					 | 
				
			||||||
the script does the rest of the work:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_importer source
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When you use the provided docker compose script, put the export inside the
 | 
					 | 
				
			||||||
``export`` folder in your paperless source directory. Specify ``../export``
 | 
					 | 
				
			||||||
as the ``source``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Importing from a previous version of Paperless may work, but for best results
 | 
					 | 
				
			||||||
    it is suggested to match the versions.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-retagger:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Document retagger
 | 
					 | 
				
			||||||
=================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Say you've imported a few hundred documents and now want to introduce
 | 
					 | 
				
			||||||
a tag or set up a new correspondent, and apply its matching to all of
 | 
					 | 
				
			||||||
the currently-imported docs. This problem is common enough that
 | 
					 | 
				
			||||||
there are tools for it.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_retagger [-h] [-c] [-T] [-t] [-i] [--use-first] [-f]
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    optional arguments:
 | 
					 | 
				
			||||||
    -c, --correspondent
 | 
					 | 
				
			||||||
    -T, --tags
 | 
					 | 
				
			||||||
    -t, --document_type
 | 
					 | 
				
			||||||
    -s, --storage_path
 | 
					 | 
				
			||||||
    -i, --inbox-only
 | 
					 | 
				
			||||||
    --use-first
 | 
					 | 
				
			||||||
    -f, --overwrite
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Run this after changing or adding matching rules. It'll loop over all
 | 
					 | 
				
			||||||
of the documents in your database and attempt to match documents
 | 
					 | 
				
			||||||
according to the new rules.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Specify any combination of ``-c``, ``-T``, ``-t`` and ``-s`` to have the
 | 
					 | 
				
			||||||
retagger perform matching of the specified metadata type. If you don't
 | 
					 | 
				
			||||||
specify any of these options, the document retagger won't do anything.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Specify ``-i`` to have the document retagger work on documents tagged
 | 
					 | 
				
			||||||
with inbox tags only. This is useful when you don't want to mess with
 | 
					 | 
				
			||||||
your already processed documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When multiple document types or correspondents match a single document,
 | 
					 | 
				
			||||||
the retagger won't assign these to the document. Specify ``--use-first``
 | 
					 | 
				
			||||||
to override this behavior and just use the first correspondent or type
 | 
					 | 
				
			||||||
it finds. This option does not apply to tags, since any amount of tags
 | 
					 | 
				
			||||||
can be applied to a document.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Finally, ``-f`` specifies that you wish to overwrite already assigned
 | 
					 | 
				
			||||||
correspondents, types and/or tags. The default behavior is to not
 | 
					 | 
				
			||||||
assign correspondents and types to documents that have this data already
 | 
					 | 
				
			||||||
assigned. ``-f`` works differently for tags: By default, only additional tags get
 | 
					 | 
				
			||||||
added to documents, no tags will be removed. With ``-f``, tags that don't
 | 
					 | 
				
			||||||
match a document anymore get removed as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Managing the Automatic matching algorithm
 | 
					 | 
				
			||||||
=========================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The *Auto* matching algorithm requires a trained neural network to work.
 | 
					 | 
				
			||||||
This network needs to be updated whenever somethings in your data
 | 
					 | 
				
			||||||
changes. The docker image takes care of that automatically with the task
 | 
					 | 
				
			||||||
scheduler. You can manually renew the classifier by invoking the following
 | 
					 | 
				
			||||||
management command:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_create_classifier
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This command takes no arguments.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _`administration-index`:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Managing the document search index
 | 
					 | 
				
			||||||
==================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The document search index is responsible for delivering search results for the
 | 
					 | 
				
			||||||
website. The document index is automatically updated whenever documents get
 | 
					 | 
				
			||||||
added to, changed, or removed from paperless. However, if the search yields
 | 
					 | 
				
			||||||
non-existing documents or won't find anything, you may need to recreate the
 | 
					 | 
				
			||||||
index manually.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_index {reindex,optimize}
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Specify ``reindex`` to have the index created from scratch. This may take some
 | 
					 | 
				
			||||||
time.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Specify ``optimize`` to optimize the index. This updates certain aspects of
 | 
					 | 
				
			||||||
the index and usually makes queries faster and also ensures that the
 | 
					 | 
				
			||||||
autocompletion works properly. This command is regularly invoked by the task
 | 
					 | 
				
			||||||
scheduler.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-renamer:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Managing filenames
 | 
					 | 
				
			||||||
==================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you use paperless' feature to
 | 
					 | 
				
			||||||
:ref:`assign custom filenames to your documents <advanced-file_name_handling>`,
 | 
					 | 
				
			||||||
you can use this command to move all your files after changing
 | 
					 | 
				
			||||||
the naming scheme.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Since this command moves your documents, it is advised to do
 | 
					 | 
				
			||||||
    a backup beforehand. The renaming logic is robust and will never overwrite
 | 
					 | 
				
			||||||
    or delete a file, but you can't ever be careful enough.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_renamer
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The command takes no arguments and processes all your documents at once.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Learn how to use :ref:`Management Utilities<utilities-management-commands>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-sanity-checker:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Sanity checker
 | 
					 | 
				
			||||||
==============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless has a built-in sanity checker that inspects your document collection for issues.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The issues detected by the sanity checker are as follows:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* Missing original files.
 | 
					 | 
				
			||||||
* Missing archive files.
 | 
					 | 
				
			||||||
* Inaccessible original files due to improper permissions.
 | 
					 | 
				
			||||||
* Inaccessible archive files due to improper permissions.
 | 
					 | 
				
			||||||
* Corrupted original documents by comparing their checksum against what is stored in the database.
 | 
					 | 
				
			||||||
* Corrupted archive documents by comparing their checksum against what is stored in the database.
 | 
					 | 
				
			||||||
* Missing thumbnails.
 | 
					 | 
				
			||||||
* Inaccessible thumbnails due to improper permissions.
 | 
					 | 
				
			||||||
* Documents without any content (warning).
 | 
					 | 
				
			||||||
* Orphaned files in the media directory (warning). These are files that are not referenced by any document im paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_sanity_checker
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The command takes no arguments. Depending on the size of your document archive, this may take some time.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Fetching e-mail
 | 
					 | 
				
			||||||
===============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless automatically fetches your e-mail every 10 minutes by default. If
 | 
					 | 
				
			||||||
you want to invoke the email consumer manually, call the following management
 | 
					 | 
				
			||||||
command:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    mail_fetcher
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The command takes no arguments and processes all your mail accounts and rules.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    As of October 2022 Microsoft no longer supports IMAP authentication for Exchange
 | 
					 | 
				
			||||||
    servers, thus Exchange is no longer supported until a solution is implemented in
 | 
					 | 
				
			||||||
    the Python IMAP library used by Paperless. See  `learn.microsoft.com`_
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _learn.microsoft.com: https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-archiver:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Creating archived documents
 | 
					 | 
				
			||||||
===========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless stores archived PDF/A documents alongside your original documents.
 | 
					 | 
				
			||||||
These archived documents will also contain selectable text for image-only
 | 
					 | 
				
			||||||
originals.
 | 
					 | 
				
			||||||
These documents are derived from the originals, which are always stored
 | 
					 | 
				
			||||||
unmodified. If coming from an earlier version of paperless, your documents
 | 
					 | 
				
			||||||
won't have archived versions.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This command creates archived PDF/A versions of your documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    document_archiver --overwrite --document <id>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This command will only attempt to create archived documents when no archived
 | 
					 | 
				
			||||||
document exists yet, unless ``--overwrite`` is specified. If ``--document <id>``
 | 
					 | 
				
			||||||
is specified, the archiver will only process that document.
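The selection rule described above can be sketched in a few lines of Python. The `Doc` class and `select_for_archiving` function are hypothetical names used only for illustration; they are not part of Paperless.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a stored document."""
    id: int
    has_archive: bool


def select_for_archiving(
    docs: Iterable[Doc],
    overwrite: bool = False,
    document_id: Optional[int] = None,
) -> List[int]:
    """Return the ids the archiver would process under the rules in the text."""
    selected = []
    for doc in docs:
        # --document <id>: restrict processing to that single document.
        if document_id is not None and doc.id != document_id:
            continue
        # Without --overwrite, documents that already have an archive are skipped.
        if doc.has_archive and not overwrite:
            continue
        selected.append(doc.id)
    return selected
```

For example, with three documents of which only the second lacks an archived version, a plain run selects just that one, while adding `overwrite=True` selects all three.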
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This command essentially performs OCR on all your documents again,
 | 
					 | 
				
			||||||
    according to your settings. If you run this with ``PAPERLESS_OCR_MODE=redo``,
 | 
					 | 
				
			||||||
    it will potentially run for a very long time. You can cancel the command
 | 
					 | 
				
			||||||
    at any time, since this command will skip already archived versions the next time
 | 
					 | 
				
			||||||
    it is run.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Some documents will cause errors and cannot be converted into PDF/A documents,
 | 
					 | 
				
			||||||
    such as encrypted PDF documents. The archiver will skip over these documents
 | 
					 | 
				
			||||||
    each time it sees them.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _utilities-encyption:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Managing encryption
 | 
					 | 
				
			||||||
===================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Documents can be stored in Paperless using GnuPG encryption.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. danger::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Encryption is deprecated since paperless-ngx 0.9 and doesn't really provide any
 | 
					 | 
				
			||||||
    additional security, since you have to store the passphrase in a configuration
 | 
					 | 
				
			||||||
    file on the same system as the encrypted documents for paperless to work.
 | 
					 | 
				
			||||||
    Furthermore, the entire text content of the documents is stored plain in the
 | 
					 | 
				
			||||||
    database, even if your documents are encrypted. Filenames are not encrypted
    either.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Also, the web server provides transparent access to your encrypted documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Consider running paperless on an encrypted filesystem instead, which will then
 | 
					 | 
				
			||||||
    at least provide security against physical hardware theft.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Enabling encryption
 | 
					 | 
				
			||||||
-------------------
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Enabling encryption is no longer supported.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Disabling encryption
 | 
					 | 
				
			||||||
--------------------
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Basic usage to disable encryption of your document store:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
(Note: If ``PAPERLESS_PASSPHRASE`` isn't set already, you need to specify it here)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    decrypt_documents [--passphrase SECR3TP4SSPHRA$E]
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
@@ -1,447 +1,11 @@
 | 
				
			|||||||
 | 
					.. _advanced_usage:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
***************
 | 
					***************
 | 
				
			||||||
Advanced topics
 | 
					Advanced topics
 | 
				
			||||||
***************
 | 
					***************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless offers a couple features that automate certain tasks and make your life
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
easier.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
.. _advanced-matching:
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Matching tags, correspondents, document types, and storage paths
 | 
					    You will be redirected shortly...
 | 
				
			||||||
################################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless will compare the matching algorithms defined by every tag, correspondent,
 | 
					 | 
				
			||||||
document type, and storage path in your database to see if they apply to the text
 | 
					 | 
				
			||||||
in a document. In other words, if you define a tag called ``Home Utility``
 | 
					 | 
				
			||||||
that had a ``match`` property of ``bc hydro`` and a ``matching_algorithm`` of
 | 
					 | 
				
			||||||
``literal``, Paperless will automatically tag your newly-consumed document with
 | 
					 | 
				
			||||||
your ``Home Utility`` tag so long as the text ``bc hydro`` appears in the body
 | 
					 | 
				
			||||||
of the document somewhere.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The matching logic is quite powerful. It supports searching the text of your
 | 
					 | 
				
			||||||
document with different algorithms, and as such, some experimentation may be
 | 
					 | 
				
			||||||
necessary to get things right.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
In order to have a tag, correspondent, document type, or storage path assigned
 | 
					 | 
				
			||||||
automatically to newly consumed documents, assign a match and matching algorithm
 | 
					 | 
				
			||||||
using the web interface. These settings define when to assign tags, correspondents,
 | 
					 | 
				
			||||||
document types, and storage paths to documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The following algorithms are available:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* **Any:** Looks for any occurrence of any word provided in match in the PDF.
 | 
					 | 
				
			||||||
  If you define the match as ``Bank1 Bank2``, it will match documents containing
 | 
					 | 
				
			||||||
  either of these terms.
 | 
					 | 
				
			||||||
* **All:** Requires that every word provided appears in the PDF, albeit not in the
 | 
					 | 
				
			||||||
  order provided.
 | 
					 | 
				
			||||||
* **Literal:** Matches only if the match appears exactly as provided (i.e. with the word order preserved) in the PDF.
 | 
					 | 
				
			||||||
* **Regular expression:** Parses the match as a regular expression and tries to
 | 
					 | 
				
			||||||
  find a match within the document.
 | 
					 | 
				
			||||||
* **Fuzzy match:** Matches approximately rather than exactly. The precise behavior is not documented here; consult the source.
 | 
					 | 
				
			||||||
* **Auto:** Tries to automatically match new documents. This does not require you
 | 
					 | 
				
			||||||
  to set a match. See the notes below.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When using the *any* or *all* matching algorithms, you can search for terms
 | 
					 | 
				
			||||||
that consist of multiple words by enclosing them in double quotes. For example,
 | 
					 | 
				
			||||||
defining a match text of ``"Bank of America" BofA`` using the *any* algorithm,
 | 
					 | 
				
			||||||
will match documents that contain either "Bank of America" or "BofA", but will
 | 
					 | 
				
			||||||
not match documents containing "Bank of South America".
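The behavior of the *any*, *all*, *literal* and *regular expression* algorithms, including quoted multi-word terms, can be approximated with the following Python sketch. It is not the code Paperless runs: the function names are hypothetical, and case-insensitive whole-word matching is an assumption made for the example.

```python
import re
import shlex
from typing import List


def split_terms(match: str) -> List[str]:
    """Split a match string into terms, keeping double-quoted phrases together."""
    return shlex.split(match)


def _contains(term: str, text: str) -> bool:
    # Whole-word, case-insensitive search (assumption); words inside a phrase
    # may be separated by any run of whitespace.
    pattern = r"\b" + r"\s+".join(re.escape(word) for word in term.split()) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches(algorithm: str, match: str, text: str) -> bool:
    """Evaluate one of the documented algorithms against the document text."""
    if algorithm == "any":
        return any(_contains(term, text) for term in split_terms(match))
    if algorithm == "all":
        return all(_contains(term, text) for term in split_terms(match))
    if algorithm == "literal":
        return _contains(match, text)
    if algorithm == "regex":
        return re.search(match, text, flags=re.IGNORECASE) is not None
    raise ValueError(f"unsupported algorithm: {algorithm}")
```

With this sketch, `matches("any", '"Bank of America" BofA', text)` is true for text mentioning "BofA" and false for text that only mentions "Bank of South America", mirroring the example above.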
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Then just save your tag, correspondent, document type, or storage path and run
 | 
					 | 
				
			||||||
another document through the consumer.  Once complete, you should see the
 | 
					 | 
				
			||||||
newly-created document, automatically tagged with the appropriate data.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _advanced-automatic_matching:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Automatic matching
 | 
					 | 
				
			||||||
==================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx comes with a new matching algorithm called *Auto*. This matching
 | 
					 | 
				
			||||||
algorithm tries to assign tags, correspondents, document types, and storage paths
 | 
					 | 
				
			||||||
to your documents based on how you have already assigned these on existing documents.
 | 
					 | 
				
			||||||
It uses a neural network under the hood.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If, for example, all your bank statements of your account 123 at the Bank of
 | 
					 | 
				
			||||||
America are tagged with the tag "bofa_123" and the matching algorithm of this
 | 
					 | 
				
			||||||
tag is set to *Auto*, this neural network will examine your documents and
 | 
					 | 
				
			||||||
automatically learn when to assign this tag.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless tries to hide much of the involved complexity with this approach.
 | 
					 | 
				
			||||||
However, there are a couple caveats you need to keep in mind when using this
 | 
					 | 
				
			||||||
feature:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* Changes to your documents are not immediately reflected by the matching
 | 
					 | 
				
			||||||
  algorithm. The neural network needs to be *trained* on your documents after
 | 
					 | 
				
			||||||
  changes. Paperless periodically (default: once each hour) checks for changes
 | 
					 | 
				
			||||||
  and does this automatically for you.
 | 
					 | 
				
			||||||
* The Auto matching algorithm only takes documents into account which are NOT
 | 
					 | 
				
			||||||
  placed in your inbox (i.e. documents without any inbox tags assigned to them). This ensures
 | 
					 | 
				
			||||||
  that the neural network only learns from documents which you have correctly
 | 
					 | 
				
			||||||
  tagged before.
 | 
					 | 
				
			||||||
* The matching algorithm can only work if there is a correlation between the
 | 
					 | 
				
			||||||
  tag, correspondent, document type, or storage path and the document itself.
 | 
					 | 
				
			||||||
  Your bank statements usually contain your bank account number and the name
 | 
					 | 
				
			||||||
  of the bank, so this works reasonably well. However, tags such as "TODO"
 | 
					 | 
				
			||||||
  cannot be automatically assigned.
 | 
					 | 
				
			||||||
* The matching algorithm needs a reasonable number of documents to identify when
 | 
					 | 
				
			||||||
  to assign tags, correspondents, storage paths, and types. If one out of a
 | 
					 | 
				
			||||||
  thousand documents has the correspondent "Very obscure web shop I bought
 | 
					 | 
				
			||||||
  something five years ago", it will probably not assign this correspondent
 | 
					 | 
				
			||||||
  automatically if you buy something from them again. The more documents, the better.
 | 
					 | 
				
			||||||
* Paperless also needs a reasonable amount of negative examples to decide when
 | 
					 | 
				
			||||||
  not to assign a certain tag, correspondent, document type, or storage path. This will
 | 
					 | 
				
			||||||
  usually be the case as you start filling up paperless with documents.
 | 
					 | 
				
			||||||
  Example: If all your documents are either from "Webshop" or "Bank", paperless
 | 
					 | 
				
			||||||
  will assign one of these correspondents to ANY new document, if both are set
 | 
					 | 
				
			||||||
  to automatic matching.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Hooking into the consumption process
 | 
					 | 
				
			||||||
####################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Sometimes you may want to do something arbitrary whenever a document is
 | 
					 | 
				
			||||||
consumed.  Rather than try to predict what you may want to do, Paperless lets
 | 
					 | 
				
			||||||
you execute scripts of your own choosing just before or after a document is
 | 
					 | 
				
			||||||
consumed using a couple simple hooks.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Just write a script, put it somewhere that Paperless can read & execute, and
 | 
					 | 
				
			||||||
then put the path to that script in ``paperless.conf`` or ``docker-compose.env`` with the variable name
 | 
					 | 
				
			||||||
of either ``PAPERLESS_PRE_CONSUME_SCRIPT`` or
 | 
					 | 
				
			||||||
``PAPERLESS_POST_CONSUME_SCRIPT``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. important::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    These scripts are executed in a **blocking** process, which means that if
 | 
					 | 
				
			||||||
    a script takes a long time to run, it can significantly slow down your
 | 
					 | 
				
			||||||
    document consumption flow.  If you want things to run asynchronously,
 | 
					 | 
				
			||||||
    you'll have to fork the process in your script and exit.
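One way to hand long-running work to a background process from a Python hook script is sketched below. This is an assumption about how such a script could be written, not a project recommendation, and `run_in_background` is a hypothetical helper name.

```python
import subprocess
from typing import List


def run_in_background(command: List[str]) -> subprocess.Popen:
    """Start a command detached from the hook script and return immediately."""
    return subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,   # output is discarded, so it will not be logged
        stderr=subprocess.DEVNULL,
        start_new_session=True,      # detach from the hook's session (POSIX)
    )
```

The hook script can call this helper and then exit right away, so the consumer is not blocked while the background command runs. Because the output is redirected to the null device, anything the background command prints will not reach the log.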
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Pre-consumption script
 | 
					 | 
				
			||||||
======================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Executed after the consumer sees a new document in the consumption folder, but
 | 
					 | 
				
			||||||
before any processing of the document is performed. This script can access the
 | 
					 | 
				
			||||||
following relevant environment variable:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* ``DOCUMENT_SOURCE_PATH``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
A simple but common example for this would be creating a simple script like
 | 
					 | 
				
			||||||
this:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
``/usr/local/bin/ocr-pdf``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    #!/usr/bin/env bash
 | 
					 | 
				
			||||||
    pdf2pdfocr.py -i "${DOCUMENT_SOURCE_PATH}"
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
``/etc/paperless.conf``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    ...
 | 
					 | 
				
			||||||
    PAPERLESS_PRE_CONSUME_SCRIPT="/usr/local/bin/ocr-pdf"
 | 
					 | 
				
			||||||
    ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This will pass the path to the document about to be consumed to ``/usr/local/bin/ocr-pdf``,
 | 
					 | 
				
			||||||
which will in turn call `pdf2pdfocr.py`_ on your document, which will then
 | 
					 | 
				
			||||||
overwrite the file with an OCR'd version of the file and exit, at which point
 | 
					 | 
				
			||||||
the consumption process will begin with the newly modified file.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The script's stdout and stderr will be logged line by line to the webserver log, along
 | 
					 | 
				
			||||||
with the exit code of the script.
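Since the script can be written in any language, a pre-consumption hook could also be a small Python program. The sketch below only reports the file that is about to be consumed; `describe_source` is a hypothetical function name, and the only variable it relies on is ``DOCUMENT_SOURCE_PATH`` from the list above.

```python
#!/usr/bin/env python3
import os
from pathlib import Path
from typing import Mapping


def describe_source(env: Mapping[str, str]) -> str:
    """Build a one-line log message from the pre-consumption environment."""
    raw = env.get("DOCUMENT_SOURCE_PATH")
    if not raw:
        return "pre-consume: DOCUMENT_SOURCE_PATH is not set"
    path = Path(raw)
    return f"pre-consume: about to consume {path.name} from {path.parent}"


if __name__ == "__main__":
    # Anything printed here ends up in the log, as described above.
    print(describe_source(os.environ))
```

Keeping the logic in a function that takes the environment as a parameter makes the script easy to try out without running the consumer.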
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _pdf2pdfocr.py: https://github.com/LeoFCardoso/pdf2pdfocr
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _advanced-post_consume_script:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Post-consumption script
 | 
					 | 
				
			||||||
=======================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Executed after the consumer has successfully processed a document and has moved it
 | 
					 | 
				
			||||||
into paperless. It receives the following environment variables:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* ``DOCUMENT_ID``
 | 
					 | 
				
			||||||
* ``DOCUMENT_FILE_NAME``
 | 
					 | 
				
			||||||
* ``DOCUMENT_CREATED``
 | 
					 | 
				
			||||||
* ``DOCUMENT_MODIFIED``
 | 
					 | 
				
			||||||
* ``DOCUMENT_ADDED``
 | 
					 | 
				
			||||||
* ``DOCUMENT_SOURCE_PATH``
 | 
					 | 
				
			||||||
* ``DOCUMENT_ARCHIVE_PATH``
 | 
					 | 
				
			||||||
* ``DOCUMENT_THUMBNAIL_PATH``
 | 
					 | 
				
			||||||
* ``DOCUMENT_DOWNLOAD_URL``
 | 
					 | 
				
			||||||
* ``DOCUMENT_THUMBNAIL_URL``
 | 
					 | 
				
			||||||
* ``DOCUMENT_CORRESPONDENT``
 | 
					 | 
				
			||||||
* ``DOCUMENT_TAGS``
 | 
					 | 
				
			||||||
* ``DOCUMENT_ORIGINAL_FILENAME``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The script can be in any language, but for a simple shell script
 | 
					 | 
				
			||||||
example, you can take a look at `post-consumption-example.sh`_ in this project.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The post consumption script cannot cancel the consumption process.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The script's stdout and stderr will be logged line by line to the webserver log, along
 | 
					 | 
				
			||||||
with the exit code of the script.
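As a complement to the shell example, here is a minimal Python sketch of a post-consumption script that echoes a few of the variables listed above. The selection of variables and the `summarize` helper are illustrative choices, and the sketch makes no assumption about the format of the values.

```python
#!/usr/bin/env python3
import os
from typing import List, Mapping

# A subset of the documented variables, chosen only for this example.
REPORTED = (
    "DOCUMENT_ID",
    "DOCUMENT_FILE_NAME",
    "DOCUMENT_CORRESPONDENT",
    "DOCUMENT_TAGS",
)


def summarize(env: Mapping[str, str]) -> List[str]:
    """Return one NAME=value line per reported variable."""
    return [f"{name}={env.get(name, '<unset>')}" for name in REPORTED]


if __name__ == "__main__":
    for line in summarize(os.environ):
        print(line)  # stdout is logged line by line, as noted above
```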
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Docker
 | 
					 | 
				
			||||||
------
 | 
					 | 
				
			||||||
Assume you have ``/home/foo/paperless-ngx/scripts/post-consumption-example.sh``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can pass that script into the consumer container via a host mount in your ``docker-compose.yml``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: yaml
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
  ...
 | 
					 | 
				
			||||||
  consumer:
 | 
					 | 
				
			||||||
    ...
 | 
					 | 
				
			||||||
    volumes:
 | 
					 | 
				
			||||||
      ...
 | 
					 | 
				
			||||||
      - /home/foo/paperless-ngx/scripts:/path/in/container/scripts/
 | 
					 | 
				
			||||||
  ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Example (docker-compose.yml): ``- /home/foo/paperless-ngx/scripts:/usr/src/paperless/scripts``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
which in turn requires the variable ``PAPERLESS_POST_CONSUME_SCRIPT`` in ``docker-compose.env``  to point to ``/path/in/container/scripts/post-consumption-example.sh``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Example (docker-compose.env): ``PAPERLESS_POST_CONSUME_SCRIPT=/usr/src/paperless/scripts/post-consumption-example.sh``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Troubleshooting:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
- Monitor the docker-compose log ``cd ~/paperless-ngx; docker-compose logs -f``
 | 
					 | 
				
			||||||
- Check your script's permissions, e.g. in case of a permission error: ``sudo chmod 755 post-consumption-example.sh``
 | 
					 | 
				
			||||||
- Pipe your script's output to a log file, e.g. ``echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _post-consumption-example.sh: https://github.com/paperless-ngx/paperless-ngx/blob/main/scripts/post-consumption-example.sh
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _advanced-file_name_handling:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
File name handling
 | 
					 | 
				
			||||||
##################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
By default, paperless stores your documents in the media directory and renames them
 | 
					 | 
				
			||||||
using the identifier which it has assigned to each document. You will end up getting
 | 
					 | 
				
			||||||
files like ``0000123.pdf`` in your media directory. This isn't necessarily a bad
 | 
					 | 
				
			||||||
thing, because you normally don't have to access these files manually. However, if
 | 
					 | 
				
			||||||
you wish to name your files differently, you can do that by adjusting the
 | 
					 | 
				
			||||||
``PAPERLESS_FILENAME_FORMAT`` configuration option. Paperless adds the correct
 | 
					 | 
				
			||||||
file extension e.g. ``.pdf``, ``.jpg`` automatically.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This variable allows you to configure the filename (folders are allowed) using
 | 
					 | 
				
			||||||
placeholders. For example, configuring this to
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    PAPERLESS_FILENAME_FORMAT={created_year}/{correspondent}/{title}
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
will create a directory structure as follows:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    2019/
 | 
					 | 
				
			||||||
      My bank/
 | 
					 | 
				
			||||||
        Statement January.pdf
 | 
					 | 
				
			||||||
        Statement February.pdf
 | 
					 | 
				
			||||||
    2020/
 | 
					 | 
				
			||||||
      My bank/
 | 
					 | 
				
			||||||
        Statement January.pdf
 | 
					 | 
				
			||||||
        Letter.pdf
 | 
					 | 
				
			||||||
        Letter_01.pdf
 | 
					 | 
				
			||||||
      Shoe store/
 | 
					 | 
				
			||||||
        My new shoes.pdf
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. danger::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Do not manually move your files in the media folder. Paperless remembers the
 | 
					 | 
				
			||||||
    last filename a document was stored as. If you do rename a file, paperless will
 | 
					 | 
				
			||||||
    report your files as missing and won't be able to find them.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless provides the following placeholders within filenames:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* ``{asn}``: The archive serial number of the document, or "none".
 | 
					 | 
				
			||||||
* ``{correspondent}``: The name of the correspondent, or "none".
 | 
					 | 
				
			||||||
* ``{document_type}``: The name of the document type, or "none".
 | 
					 | 
				
			||||||
* ``{tag_list}``: A comma separated list of all tags assigned to the document.
 | 
					 | 
				
			||||||
* ``{title}``: The title of the document.
 | 
					 | 
				
			||||||
* ``{created}``: The full date (ISO format) the document was created.
 | 
					 | 
				
			||||||
* ``{created_year}``: Year created only, formatted as the year with century.
 | 
					 | 
				
			||||||
* ``{created_year_short}``: Year created only, formatted as the year without century, zero padded.
 | 
					 | 
				
			||||||
* ``{created_month}``: Month created only (number 01-12).
 | 
					 | 
				
			||||||
* ``{created_month_name}``: Month created name, as per locale
 | 
					 | 
				
			||||||
* ``{created_month_name_short}``: Month created abbreviated name, as per locale
 | 
					 | 
				
			||||||
* ``{created_day}``: Day created only (number 01-31).
 | 
					 | 
				
			||||||
* ``{added}``: The full date (ISO format) the document was added to paperless.
 | 
					 | 
				
			||||||
* ``{added_year}``: Year added only.
 | 
					 | 
				
			||||||
* ``{added_year_short}``: Year added only, formatted as the year without century, zero padded.
 | 
					 | 
				
			||||||
* ``{added_month}``: Month added only (number 01-12).
 | 
					 | 
				
			||||||
* ``{added_month_name}``: Month added name, as per locale
 | 
					 | 
				
			||||||
* ``{added_month_name_short}``: Month added abbreviated name, as per locale
 | 
					 | 
				
			||||||
* ``{added_day}``: Day added only (number 01-31).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless will try to conserve the information from your database as much as possible.
 | 
					 | 
				
			||||||
However, some characters that you can use in document titles and correspondent names (such
 | 
					 | 
				
			||||||
as ``: \ /`` and a couple more) are not allowed in filenames and will be replaced with dashes.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If paperless detects that two documents share the same filename, paperless will automatically
 | 
					 | 
				
			||||||
append ``_01``, ``_02``, etc to the filename. This happens if all the placeholders in a filename
 | 
					 | 
				
			||||||
evaluate to the same value.
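The placeholder substitution, character replacement and ``_01`` suffixing described above can be illustrated with a small Python sketch. It is a simplified model, not Paperless's implementation: the function names are hypothetical, and only the three characters named in the text (``: \ /``) are replaced.

```python
from typing import Optional, Set


def sanitize(value: str) -> str:
    """Replace the characters named in the text with dashes."""
    for char in ":\\/":
        value = value.replace(char, "-")
    return value


def render(fmt: str, **fields: Optional[str]) -> str:
    """Fill placeholders; missing values become "none" as described above."""
    safe = {
        name: sanitize("none" if value is None else str(value))
        for name, value in fields.items()
    }
    return fmt.format(**safe)


def unique_name(name: str, taken: Set[str]) -> str:
    """Append _01, _02, ... when the rendered name is already in use."""
    if name not in taken:
        return name
    counter = 1
    while f"{name}_{counter:02d}" in taken:
        counter += 1
    return f"{name}_{counter:02d}"
```

Rendering ``{created_year}/{correspondent}/{title}`` for a 2020 letter from "My bank" yields ``2020/My bank/Letter``; a second document with the same values would become ``2020/My bank/Letter_01``, matching the directory listing shown earlier.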
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. hint::
 | 
					 | 
				
			||||||
    You can affect how empty placeholders are treated by changing the following setting to
 | 
					 | 
				
			||||||
    ``true``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=True
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Doing this results in all empty placeholders resolving to "" instead of "none" as stated above.
 | 
					 | 
				
			||||||
    Spaces before empty placeholders are removed as well, and empty directories are omitted.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Paperless checks the filename of a document whenever it is saved. Therefore,
 | 
					 | 
				
			||||||
    you need to update the filenames of your documents and move them after altering
 | 
					 | 
				
			||||||
    this setting by invoking the :ref:`document renamer <utilities-renamer>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Make absolutely sure you get the spelling of the placeholders right, or else
 | 
					 | 
				
			||||||
    paperless will use the default naming scheme instead.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    As of now, you could totally tell paperless to store your files anywhere outside
 | 
					 | 
				
			||||||
    the media directory by setting
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        PAPERLESS_FILENAME_FORMAT=../../my/custom/location/{title}
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    However, keep in mind that inside docker, if files get stored outside of the
 | 
					 | 
				
			||||||
    predefined volumes, they will be lost after a restart of paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Storage paths
 | 
					 | 
				
			||||||
#############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
One of the best things in Paperless is that you can not only access the documents via the
 | 
					 | 
				
			||||||
web interface, but also via the file system.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When a single storage layout is not sufficient for your use case, storage paths come to
 | 
					 | 
				
			||||||
the rescue. Storage paths allow you to configure more precisely where each document is stored
 | 
					 | 
				
			||||||
in the file system.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
- Each storage path is a `PAPERLESS_FILENAME_FORMAT` and follows the rules described above
 | 
					 | 
				
			||||||
- Each document is assigned a storage path using the matching algorithms described above, but
 | 
					 | 
				
			||||||
  can be overwritten at any time
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
For example, you could define the following two storage paths:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1. Normal communications are put into a folder structure sorted by `year/correspondent`
 | 
					 | 
				
			||||||
2. Communications with insurance companies are stored in a flat structure with longer file names,
 | 
					 | 
				
			||||||
   but containing the full date of the correspondence.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    By Year = {created_year}/{correspondent}/{title}
 | 
					 | 
				
			||||||
    Insurances = Insurances/{correspondent}/{created_year}-{created_month}-{created_day} {title}
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you then map these storage paths to the documents, you might get the following result.
 | 
					 | 
				
			||||||
For simplicity, `By Year` defines the same structure as in the previous example above.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: text
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    2019/                                 # By Year
 | 
					 | 
				
			||||||
      My bank/
 | 
					 | 
				
			||||||
        Statement January.pdf
 | 
					 | 
				
			||||||
        Statement February.pdf
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Insurances/                           # Insurances
 | 
					 | 
				
			||||||
      Healthcare 123/
 | 
					 | 
				
			||||||
        2022-01-01 Statement January.pdf
 | 
					 | 
				
			||||||
        2022-02-02 Letter.pdf
 | 
					 | 
				
			||||||
        2022-02-03 Letter.pdf
 | 
					 | 
				
			||||||
      Dental 456/
 | 
					 | 
				
			||||||
        2021-12-01 New Conditions.pdf
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defining a storage path is optional. If no storage path is defined for a document, the global
 | 
					 | 
				
			||||||
    `PAPERLESS_FILENAME_FORMAT` is applied.
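The fallback from a per-document storage path to the global format can be sketched as follows. The dictionary, the ``GLOBAL_FORMAT`` constant and the `target_path` function are hypothetical stand-ins used for illustration, and the ``.pdf`` extension is hard-coded only to keep the example short.

```python
from typing import Optional

# Stand-in for the global PAPERLESS_FILENAME_FORMAT setting.
GLOBAL_FORMAT = "{created_year}/{correspondent}/{title}"

# The two example storage paths from the text above.
STORAGE_PATHS = {
    "By Year": "{created_year}/{correspondent}/{title}",
    "Insurances": (
        "Insurances/{correspondent}/"
        "{created_year}-{created_month}-{created_day} {title}"
    ),
}


def target_path(storage_path: Optional[str], **fields: str) -> str:
    """Pick the storage path's format, or the global one when none is assigned."""
    fmt = STORAGE_PATHS.get(storage_path, GLOBAL_FORMAT) if storage_path else GLOBAL_FORMAT
    return fmt.format(**fields) + ".pdf"
```

Calling `target_path("Insurances", ...)` with the fields of the first insurance document reproduces ``Insurances/Healthcare 123/2022-01-01 Statement January.pdf`` from the listing, while passing `None` falls back to the global layout.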
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If you adjust the format of an existing storage path, old documents don't get relocated automatically.
 | 
					 | 
				
			||||||
    You need to run the :ref:`document renamer <utilities-renamer>` to adjust their paths.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _advanced-celery-monitoring:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Celery Monitoring
 | 
					 | 
				
			||||||
#################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The monitoring tool `Flower <https://flower.readthedocs.io/en/latest/index.html>`_ can be used to view more
 | 
					 | 
				
			||||||
detailed information about the health of the celery workers used for asynchronous tasks.  This includes details
 | 
					 | 
				
			||||||
on currently running, queued and completed tasks, timing and more.  Flower can also be used with Prometheus, as it
 | 
					 | 
				
			||||||
exports metrics.  For details on its capabilities, refer to the Flower documentation.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
To configure Flower further, create a `flowerconfig.py` and place it into the `src/paperless` directory.  For
 | 
					 | 
				
			||||||
a Docker installation, you can use volumes to accomplish this:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: yaml
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    services:
 | 
					 | 
				
			||||||
      # ...
 | 
					 | 
				
			||||||
      webserver:
 | 
					 | 
				
			||||||
        # ...
 | 
					 | 
				
			||||||
        volumes:
 | 
					 | 
				
			||||||
          - /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
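As a rough illustration, a ``flowerconfig.py`` could look like the sketch below. The option names ``port``, ``url_prefix`` and ``max_tasks`` are settings described in the Flower documentation; the values are placeholders chosen for this example and are not Paperless defaults, so check them against the Flower version you run.

```python
# flowerconfig.py -- illustrative values only, not Paperless defaults.

# Port the Flower web interface listens on (5555 is Flower's usual default).
port = 5555

# Serve Flower under a sub-path, for example behind a reverse proxy.
url_prefix = "flower"

# Upper bound on the number of tasks Flower keeps in memory.
max_tasks = 10000
```

Because the file is plain Python, each setting is just a module-level assignment.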
Custom Container Initialization
###############################

The Docker image includes the ability to run custom user scripts during startup.  This could be
utilized for installing additional tools or Python packages, for example.

To utilize this, mount a folder containing your scripts to the custom initialization directory, `/custom-cont-init.d`
and place scripts you wish to run inside.  For security, the folder and its contents must be owned by `root`.
Additionally, scripts must only be writable by `root`.

Your scripts will be run directly before the webserver completes startup.  Scripts will be run by the `root` user.
This is advanced functionality with which you could break your installation or lose data.

For example, using Docker Compose:


.. code:: yaml

    services:
      # ...
      webserver:
        # ...
        volumes:
          - /path/to/my/scripts:/custom-cont-init.d:ro

.. _advanced-mysql-caveats:

MySQL Caveats
#############

Case Sensitivity
================

The database interface does not provide a method to configure a MySQL database to
be case sensitive.  As a result, a user cannot create both a tag ``Name`` and a tag ``NAME``,
as they are considered the same.

Per the Django documentation, enabling case sensitivity requires manual intervention.  To enable
case sensitive tables, you can execute the following command against each table:

``ALTER TABLE <table_name> CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;``

You can also set the default for new tables (this does NOT affect existing tables) with:

``ALTER DATABASE <db_name> CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;``
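Since the ``ALTER TABLE`` command has to be repeated for each table, it can help to generate the statements. The following is a minimal sketch: the helper name ``case_sensitive_statements`` and the table names in the usage line are assumptions made for this example, and the output is meant to be reviewed before it is run against a database.

```python
def case_sensitive_statements(db_name, table_names):
    """Build the ALTER statements shown above for one database and its tables."""
    charset = "utf8mb4"
    collation = "utf8mb4_bin"
    # One ALTER TABLE per existing table.
    statements = [
        f"ALTER TABLE {table} CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        for table in table_names
    ]
    # The database default only affects tables created afterwards.
    statements.append(
        f"ALTER DATABASE {db_name} CHARACTER SET {charset} COLLATE {collation};"
    )
    return statements


# Hypothetical database and table names, for illustration only.
for statement in case_sensitive_statements("paperless", ["documents_tag"]):
    print(statement)
```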
 
299
docs/api.rst
@@ -1,303 +1,12 @@
 | 
				
			|||||||
 | 
					.. _api:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
************
 | 
					************
 | 
				
			||||||
The REST API
 | 
					The REST API
 | 
				
			||||||
************
 | 
					************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless makes use of the `Django REST Framework`_ standard API interface.
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
It provides a browsable API for most of its endpoints, which you can inspect
 | 
					 | 
				
			||||||
at ``http://<paperless-host>:<port>/api/``. This also documents most of the
 | 
					 | 
				
			||||||
available filters and ordering fields.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
.. _Django REST Framework: http://django-rest-framework.org/
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
The API provides 5 main endpoints:
 | 
					    You will be redirected shortly...
 | 
				
			||||||
 | 
					 | 
				
			||||||
*   ``/api/documents/``: Full CRUD support, except POSTing new documents. See below.
*   ``/api/correspondents/``: Full CRUD support.
*   ``/api/document_types/``: Full CRUD support.
*   ``/api/logs/``: Read-Only.
*   ``/api/tags/``: Full CRUD support.

All of these endpoints except for the logging endpoint
allow you to fetch, edit and delete individual objects
by appending their primary key to the path, for example ``/api/documents/454/``.

The objects served by the document endpoint contain the following fields:

*   ``id``: ID of the document. Read-only.
*   ``title``: Title of the document.
*   ``content``: Plain text content of the document.
*   ``tags``: List of IDs of tags assigned to this document, or empty list.
*   ``document_type``: Document type of this document, or null.
*   ``correspondent``: Correspondent of this document, or null.
*   ``created``: The date and time at which this document was created.
*   ``created_date``: The date (YYYY-MM-DD) at which this document was created. Optional. If ``created`` is also passed, this field is ignored.
*   ``modified``: The date at which this document was last edited in paperless. Read-only.
*   ``added``: The date at which this document was added to paperless. Read-only.
*   ``archive_serial_number``: The identifier of this document in a physical document archive.
*   ``original_file_name``: Verbose filename of the original document. Read-only.
*   ``archived_file_name``: Verbose filename of the archived document. Read-only. Null if no archived document is available.


Downloading documents
#####################

In addition, the document endpoint offers these actions on
individual documents:

*   ``/api/documents/<pk>/download/``: Download the document.
*   ``/api/documents/<pk>/preview/``: Display the document inline,
    without downloading it.
*   ``/api/documents/<pk>/thumb/``: Download the PNG thumbnail of a document.

Paperless generates archived PDF/A documents from consumed files and stores both
the original files as well as the archived files. By default, the endpoints
for previews and downloads serve the archived file, if it is available.
Otherwise, the original file is served.
Some documents cannot be archived.

The endpoints correctly serve the response header fields ``Content-Disposition``
and ``Content-Type`` to indicate the filename for download and the type of content of
the document.

In order to download or preview the original document when an archived document is available,
supply the query parameter ``original=true``.

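The URL patterns above can be assembled with the standard library. This is a minimal sketch, not part of Paperless itself: the helper name ``document_url`` and the host in the usage line are assumptions made for this example.

```python
from urllib.parse import urlencode, urljoin


def document_url(base_url, pk, action="download", original=False):
    """Build the URL for one of the per-document actions listed above."""
    if action not in ("download", "preview", "thumb"):
        raise ValueError(f"unsupported action: {action}")
    url = urljoin(base_url, f"/api/documents/{pk}/{action}/")
    # The text describes ``original=true`` for downloads and previews only.
    if original and action in ("download", "preview"):
        url += "?" + urlencode({"original": "true"})
    return url


# Hypothetical host, for illustration only.
print(document_url("http://localhost:8000", 454, "preview", original=True))
```

Requests to these URLs still need one of the authentication methods the API supports.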
.. hint::

    Paperless used to provide this functionality at ``/fetch/<pk>/preview``,
    ``/fetch/<pk>/thumb`` and ``/fetch/<pk>/doc``. Redirects to the new URLs
    are in place. However, if you use these old URLs to access documents, you
    should update your app or script to use the new URLs.


Getting document metadata
#########################

The API also has an endpoint to retrieve read-only metadata about specific documents. This
information is not served along with the document objects, since it requires reading
files and would therefore slow down document lists considerably.

Access the metadata of a document with an ID ``id`` at ``/api/documents/<id>/metadata/``.

The endpoint reports the following data:

*   ``original_checksum``: MD5 checksum of the original document.
*   ``original_size``: Size of the original document, in bytes.
*   ``original_mime_type``: MIME type of the original document.
*   ``media_filename``: Current filename of the document, under which it is stored inside the media directory.
*   ``has_archive_version``: True, if this document is archived, false otherwise.
*   ``original_metadata``: A list of metadata associated with the original document. See below.
*   ``archive_checksum``: MD5 checksum of the archived document, or null.
*   ``archive_size``: Size of the archived document in bytes, or null.
*   ``archive_metadata``: Metadata associated with the archived document, or null. See below.

File metadata is reported as a list of objects in the following form:

.. code:: json

    [
        {
            "namespace": "http://ns.adobe.com/pdf/1.3/",
            "prefix": "pdf",
            "key": "Producer",
            "value": "SparklePDF, Fancy edition"
        }
    ]

``namespace`` and ``prefix`` can be null. The actual metadata reported depends on the file type and the metadata
available in that specific document. Paperless only reports PDF metadata at this point.

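A client will often want to look up a single metadata value by name. The sketch below indexes a file-metadata list of the form shown above; the helper name ``metadata_by_key`` and the ``prefix:key`` naming scheme are assumptions made for this example, not something the API prescribes.

```python
import json


def metadata_by_key(entries):
    """Index a file-metadata list by 'prefix:key', or by 'key' alone when prefix is null."""
    result = {}
    for entry in entries:
        prefix = entry.get("prefix")
        name = f"{prefix}:{entry['key']}" if prefix else entry["key"]
        result[name] = entry["value"]
    return result


sample = json.loads("""
[
    {
        "namespace": "http://ns.adobe.com/pdf/1.3/",
        "prefix": "pdf",
        "key": "Producer",
        "value": "SparklePDF, Fancy edition"
    }
]
""")
print(metadata_by_key(sample)["pdf:Producer"])
```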
Authorization
#############

The REST API provides three different forms of authentication.

1.  Basic authentication

    Authorize by providing an HTTP header in the form

    .. code::

        Authorization: Basic <credentials>

    where ``credentials`` is a base64-encoded string of ``<username>:<password>``.

2.  Session authentication

    When you're logged into paperless in your browser, you're automatically
    logged into the API as well and don't need to provide any authorization
    headers.

3.  Token authentication

    Paperless also offers an endpoint to acquire authentication tokens.

    POST a username and password as a form or JSON string to ``/api/token/``
    and paperless will respond with a token, if the login data is correct.
    This token can be used to authenticate other requests with the
    following HTTP header:

    .. code::

        Authorization: Token <token>

    Tokens can be managed and revoked in the paperless admin.

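The two header forms above can be built as follows. This is a minimal sketch using only the standard library; the function names and the credentials in the usage lines are placeholders invented for this example.

```python
import base64


def basic_auth_header(username, password):
    """Return the Basic authentication header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    credentials = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {credentials}"}


def token_auth_header(token):
    """Return the Token authentication header for a token obtained from /api/token/."""
    return {"Authorization": f"Token {token}"}


# Placeholder credentials, for illustration only.
print(basic_auth_header("user", "pass"))
print(token_auth_header("<token>"))
```

Either dictionary can be passed as the headers of whichever HTTP client you use.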
Searching for documents
#######################

Full text searching is available on the ``/api/documents/`` endpoint. Two specific
query parameters cause the API to return full text search results:

*   ``/api/documents/?query=your%20search%20query``: Search for a document using a full text query.
    For details on the syntax, see :ref:`basic-usage_searching`.

*   ``/api/documents/?more_like=1234``: Search for documents similar to the document with id 1234.

Pagination works exactly the same as it does for normal requests on this endpoint.

Certain limitations apply to full text queries:

*   Results are always sorted by search score. The results matching the query best will show up first.

*   Only a small subset of filtering parameters are supported.

Furthermore, each returned document has an additional ``__search_hit__`` attribute with various information
about the search results:

.. code::

    {
        "count": 31,
        "next": "http://localhost:8000/api/documents/?page=2&query=test",
        "previous": null,
        "results": [

            ...

            {
                "id": 123,
                "title": "title",
                "content": "content",

                ...

                "__search_hit__": {
                    "score": 0.343,
                    "highlights": "text <span class=\"match\">Test</span> text",
                    "rank": 23
                }
            },

            ...

        ]
    }

*   ``score`` is an indication of how well this document matches the query relative to the other search results.
*   ``highlights`` is an excerpt from the document content and highlights the search terms with ``<span>`` tags as shown above.
*   ``rank`` is the index of the search result. The first result will have rank 0.

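To make the two query parameters concrete, the sketch below builds a search URL and orders one page of results by their ``rank``. The helper names ``search_url`` and ``hits_by_rank`` and the host are assumptions made for this example; the response dictionary is expected to have the shape shown above.

```python
from urllib.parse import quote, urlencode


def search_url(base_url, query=None, more_like=None):
    """Build a full text search URL from exactly one of the two search parameters."""
    if (query is None) == (more_like is None):
        raise ValueError("pass exactly one of query or more_like")
    params = {"query": query} if query is not None else {"more_like": more_like}
    # quote_via=quote encodes spaces as %20, matching the example URL above.
    return f"{base_url.rstrip('/')}/api/documents/?{urlencode(params, quote_via=quote)}"


def hits_by_rank(response):
    """Return one page of results ordered by the rank in ``__search_hit__``."""
    return sorted(response["results"], key=lambda doc: doc["__search_hit__"]["rank"])


# Hypothetical host, for illustration only.
print(search_url("http://localhost:8000", query="your search query"))
```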
``/api/search/autocomplete/``
=============================

Get auto completions for a partial search term.

Query parameters:

*   ``term``: The incomplete term.
*   ``limit``: Number of results to return. Defaults to 10.

Results returned by the endpoint are ordered by importance of the term in the
document index. The first result is the term that has the highest TF-IDF score
in the index.

.. code:: json

    [
        "term1",
        "term3",
        "term6",
        "term4"
    ]


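A request URL for the autocomplete endpoint can be put together like this. The helper name ``autocomplete_url`` and the host in the usage line are assumptions made for this example.

```python
from urllib.parse import urlencode


def autocomplete_url(base_url, term, limit=10):
    """Build the autocomplete URL for a partial search term."""
    params = urlencode({"term": term, "limit": limit})
    return f"{base_url.rstrip('/')}/api/search/autocomplete/?{params}"


# Hypothetical host, for illustration only.
print(autocomplete_url("http://localhost:8000", "invo", limit=5))
```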
.. _api-file_uploads:

POSTing documents
#################

The API provides a special endpoint for file uploads:

``/api/documents/post_document/``

POST a multipart form to this endpoint, where the form field ``document`` contains
the document that you want to upload to paperless. The filename is sanitized and
then used to store the document in a temporary directory, and the consumer will
be instructed to consume the document from there.

The endpoint supports the following optional form fields:

*   ``title``: Specify a title that the consumer should use for the document.
*   ``created``: Specify the date and time at which the document was created (e.g. "2016-04-19" or "2016-04-19 06:15:00+02:00").
*   ``correspondent``: Specify the ID of a correspondent that the consumer should use for the document.
*   ``document_type``: Similar to correspondent.
*   ``tags``: Similar to correspondent. Specify this multiple times to have multiple tags added
    to the document.


The endpoint will immediately return "OK" if the document consumption process
was started successfully. No additional status information about the consumption
process itself is available, since that happens in a different process.


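Because ``tags`` may be given several times, the optional fields are easiest to represent as a list of name/value pairs instead of a dictionary. The sketch below only collects those pairs; the helper name ``upload_fields`` is an assumption made for this example, and sending the multipart request, including the ``document`` file field, is left to your HTTP client.

```python
def upload_fields(title=None, created=None, correspondent=None,
                  document_type=None, tags=()):
    """Collect the optional form fields as (name, value) pairs."""
    fields = []
    if title is not None:
        fields.append(("title", title))
    if created is not None:
        fields.append(("created", created))
    if correspondent is not None:
        fields.append(("correspondent", str(correspondent)))
    if document_type is not None:
        fields.append(("document_type", str(document_type)))
    # One ("tags", id) pair per tag, since the field is repeated for multiple tags.
    for tag in tags:
        fields.append(("tags", str(tag)))
    return fields


print(upload_fields(title="Statement January", created="2016-04-19", tags=[3, 7]))
```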
.. _api-versioning:

API Versioning
##############

The REST API has been versioned since Paperless-ngx 1.3.0.

* Versioning ensures that changes to the API don't break older clients.
* Clients specify the version of the API they wish to use with every request, and Paperless will handle the request using the specified API version.
* Even if the underlying data model changes, older API versions will always serve compatible data.
* If no version is specified, Paperless will serve version 1 to ensure compatibility with older clients that do not request a specific API version.

API versions are specified by submitting an additional HTTP ``Accept`` header with every request:

.. code::

    Accept: application/json; version=6

If an invalid version is specified, Paperless 1.3.0 will respond with "406 Not Acceptable" and an error message in the body.
Earlier versions of Paperless will serve API version 1 regardless of whether a version is specified via the ``Accept`` header.

If a client wishes to verify whether it is compatible with any given server, the following procedure should be performed:

1.  Perform an *authenticated* request against any API endpoint. If the server is on version 1.3.0 or newer, the server will
    add two custom headers to the response:

    .. code::

        X-Api-Version: 2
        X-Version: 1.3.0

2.  Determine whether the client is compatible with this server based on the presence/absence of these headers and their values if present.


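The compatibility check can be sketched as below. Two things here are assumptions made for this example and are not stated in the text: that ``X-Api-Version`` reports the highest API version the server supports, and that the response headers are available as a plain dictionary with exactly this capitalization (many HTTP clients offer case-insensitive lookups instead). The function names are invented for illustration.

```python
def accept_header(version):
    """Return the Accept header that requests a specific API version."""
    return {"Accept": f"application/json; version={version}"}


def server_api_version(response_headers):
    """Return the reported API version, or 1 when the header is absent."""
    value = response_headers.get("X-Api-Version")
    # Servers older than 1.3.0 send no version headers and serve version 1.
    return int(value) if value is not None else 1


def is_compatible(response_headers, required_version):
    """Assumption: the server can serve every version up to the one it reports."""
    return server_api_version(response_headers) >= required_version


print(accept_header(2))
print(is_compatible({"X-Api-Version": "2", "X-Version": "1.3.0"}, 2))
```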
API Changelog
=============

Version 1
---------

Initial API version.

Version 2
---------

* Added field ``Tag.color``. This read/write string field contains a hex color such as ``#a6cee3``.
* Added read-only field ``Tag.text_color``. This field contains the text color to use for a specific tag, which is either black or white depending on the brightness of ``Tag.color``.
* Removed field ``Tag.colour``.
 
2445
docs/changelog.md
File diff suppressed because it is too large
11
docs/changelog.rst
Normal file
@@ -0,0 +1,11 @@
.. _changelog:

*********
Changelog
*********

.. cssclass:: redirect-notice

    The Paperless-ngx documentation has permanently moved.

    You will be redirected shortly...
@@ -4,928 +4,9 @@
 | 
				
			|||||||
Configuration
 | 
					Configuration
 | 
				
			||||||
*************
 | 
					*************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless provides a wide range of customizations.
 | 
					 | 
				
			||||||
Depending on how you run paperless, these settings have to be defined in different
 | 
					 | 
				
			||||||
places.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
*   If you run paperless on docker, ``paperless.conf`` is not used. Rather, configure
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
    paperless by copying necessary options to ``docker-compose.env``.
 | 
					 | 
				
			||||||
*   If you are running paperless on anything else, paperless will search for the
 | 
					 | 
				
			||||||
    configuration file in these locations and use the first one it finds:
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
    .. code::
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
        /path/to/paperless/paperless.conf
 | 
					    You will be redirected shortly...
 | 
				
			||||||
        /etc/paperless.conf
        /usr/local/etc/paperless.conf


Required services
#################

PAPERLESS_REDIS=<url>
    This is required for processing scheduled tasks such as email fetching, index
    optimization and for training the automatic document matcher.

    * If your Redis server needs login credentials: PAPERLESS_REDIS = ``redis://<username>:<password>@<host>:<port>``

    * With the requirepass option: PAPERLESS_REDIS = ``redis://:<password>@<host>:<port>``

    `More information on securing your Redis Instance <https://redis.io/docs/getting-started/#securing-redis>`_.

    Defaults to redis://localhost:6379.

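The URL forms above can be generated with a small helper. This is a minimal sketch: the function name ``redis_url`` is invented for this example, and percent-encoding the user name and password is an assumption based on general URL rules, not something the text requires.

```python
from urllib.parse import quote


def redis_url(host="localhost", port=6379, username=None, password=None):
    """Build a value for PAPERLESS_REDIS in one of the forms shown above."""
    auth = ""
    if password is not None:
        # Assumption: special characters in credentials are percent-encoded.
        user = quote(username, safe="") if username else ""
        auth = f"{user}:{quote(password, safe='')}@"
    return f"redis://{auth}{host}:{port}"


# Placeholder credentials, for illustration only.
print(redis_url(password="s3cret"))
```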
PAPERLESS_DBENGINE=<engine_name>
    Optional. Selects PostgreSQL or MariaDB as the database engine.
    Available options are `postgresql` and `mariadb`.

    Default is `postgresql`.

    .. warning::

      Using MariaDB comes with some caveats.  See :ref:`advanced-mysql-caveats` for details.


PAPERLESS_DBHOST=<hostname>
    By default, sqlite is used as the database backend. This can be changed here.

    Set PAPERLESS_DBHOST and another database will be used instead of sqlite.

PAPERLESS_DBPORT=<port>
    Adjust port if necessary.

    Default is 5432.

PAPERLESS_DBNAME=<name>
    Database name in PostgreSQL or MariaDB.

    Defaults to "paperless".

PAPERLESS_DBUSER=<name>
    Database user in PostgreSQL or MariaDB.

    Defaults to "paperless".

PAPERLESS_DBPASS=<password>
    Database password for PostgreSQL or MariaDB.

    Defaults to "paperless".

PAPERLESS_DBSSLMODE=<mode>
    SSL mode to use when connecting to PostgreSQL.

    See `the official documentation about sslmode <https://www.postgresql.org/docs/current/libpq-ssl.html>`_.

    Default is ``prefer``.

PAPERLESS_DB_TIMEOUT=<float>
    Amount of time for a database connection to wait for the database to unlock.
    Mostly applicable to sqlite based installations; consider changing to postgresql
    if you need to increase this.

    Defaults to unset, keeping the Django defaults.

Paths and folders
 | 
					 | 
				
			||||||
#################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMPTION_DIR=<path>
 | 
					 | 
				
			||||||
    This where your documents should go to be consumed.  Make sure that it exists
 | 
					 | 
				
			||||||
    and that the user running the paperless service can read/write its contents
 | 
					 | 
				
			||||||
    before you start Paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Don't change this when using docker, as it only changes the path within the
 | 
					 | 
				
			||||||
    container. Change the local consumption directory in the docker-compose.yml
 | 
					 | 
				
			||||||
    file instead.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "../consume/", relative to the "src" directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_DATA_DIR=<path>
 | 
					 | 
				
			||||||
    This is where paperless stores all its data (search index, SQLite database,
 | 
					 | 
				
			||||||
    classification model, etc).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "../data/", relative to the "src" directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_TRASH_DIR=<path>
 | 
					 | 
				
			||||||
    Instead of removing deleted documents, they are moved to this directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This must be writeable by the user running paperless. When running inside
 | 
					 | 
				
			||||||
    docker, ensure that this path is within a permanent volume (such as
 | 
					 | 
				
			||||||
    "../media/trash") so it won't get lost on upgrades.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to empty (i.e. really delete documents).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_MEDIA_ROOT=<path>
 | 
					 | 
				
			||||||
    This is where your documents and thumbnails are stored.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You can set this and PAPERLESS_DATA_DIR to the same folder to have paperless
 | 
					 | 
				
			||||||
    store all its data within the same volume.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "../media/", relative to the "src" directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_STATICDIR=<path>
 | 
					 | 
				
			||||||
    Override the default STATIC_ROOT here.  This is where all static files
 | 
					 | 
				
			||||||
    created using the "collectstatic" management command are stored.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Unless you're doing something fancy, there is no need to override this.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "../static/", relative to the "src" directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_FILENAME_FORMAT=<format>
 | 
					 | 
				
			||||||
    Changes the filenames paperless uses to store documents in the media directory.
 | 
					 | 
				
			||||||
    See :ref:`advanced-file_name_handling` for details.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Default is none, which disables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=<bool>
 | 
					 | 
				
			||||||
    Causes placeholders in `PAPERLESS_FILENAME_FORMAT` that would resolve
 | 
					 | 
				
			||||||
    to 'none' to be omitted from the resulting filename. This also holds true for directory
 | 
					 | 
				
			||||||
    names.
 | 
					 | 
				
			||||||
    See :ref:`advanced-file_name_handling` for details.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to `false` which disables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_LOGGING_DIR=<path>
 | 
					 | 
				
			||||||
    This is where paperless will store log files.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "``PAPERLESS_DATA_DIR``/log/".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Logging
 | 
					 | 
				
			||||||
#######
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_LOGROTATE_MAX_SIZE=<num>
 | 
					 | 
				
			||||||
    Maximum file size for log files before they are rotated, in bytes.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 1 MiB.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_LOGROTATE_MAX_BACKUPS=<num>
 | 
					 | 
				
			||||||
    Number of rotated log files to keep.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 20.
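
    These two settings describe ordinary size-based log rotation. A minimal
    sketch using Python's standard ``RotatingFileHandler`` with the documented
    defaults follows. Whether paperless sets up its handlers exactly like this
    is an assumption, and the file and logger names are illustrative only.

    .. code:: python

        import logging.handlers
        import os
        import tempfile

        log_dir = tempfile.mkdtemp()  # stand-in for PAPERLESS_LOGGING_DIR
        handler = logging.handlers.RotatingFileHandler(
            os.path.join(log_dir, "example.log"),
            maxBytes=1024 * 1024,  # PAPERLESS_LOGROTATE_MAX_SIZE default: 1 MiB
            backupCount=20,  # PAPERLESS_LOGROTATE_MAX_BACKUPS default: 20
        )
        logger = logging.getLogger("logrotate-example")
        logger.addHandler(handler)
        # Once example.log would grow past maxBytes it is renamed to
        # example.log.1 and a fresh file is started.
        logger.warning("hello from the rotation example")
        handler.close()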
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _hosting-and-security:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Hosting & Security
 | 
					 | 
				
			||||||
##################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_SECRET_KEY=<key>
 | 
					 | 
				
			||||||
    Paperless uses this to make session tokens. If you expose paperless on the
 | 
					 | 
				
			||||||
    internet, you need to change this, since the default secret is well known.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Use any sequence of characters. The more, the better. You don't need to
 | 
					 | 
				
			||||||
    remember this. Just face-roll your keyboard.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Default is listed in the file ``src/paperless/settings.py``.
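
    If you would rather generate the value than type it, Python's standard
    ``secrets`` module is one possible way to do so. The length of 64 random
    bytes is an arbitrary choice for this sketch, not a paperless requirement.

    .. code:: python

        import secrets

        # 64 random bytes, encoded as URL-safe base64 text.
        secret_key = secrets.token_urlsafe(64)
        print(f"PAPERLESS_SECRET_KEY={secret_key}")

    Copy the printed line into your configuration file or environment.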
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_URL=<url>
 | 
					 | 
				
			||||||
    This setting can be used to set the three options below (ALLOWED_HOSTS,
 | 
					 | 
				
			||||||
    CORS_ALLOWED_HOSTS and CSRF_TRUSTED_ORIGINS). If the other options are
 | 
					 | 
				
			||||||
    set the values will be combined with this one. Do not include a trailing
 | 
					 | 
				
			||||||
    slash. E.g. https://paperless.domain.com
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to empty string, leaving the other settings unaffected.
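
    To illustrate how one URL can feed all three options, here is a minimal
    sketch. The function ``derive_from_url`` is hypothetical, and the exact
    mapping paperless applies internally is an assumption: host validation
    uses the bare host name, while the CORS and CSRF settings use the full
    origin.

    .. code:: python

        from urllib.parse import urlparse


        def derive_from_url(paperless_url: str) -> dict:
            """Split a PAPERLESS_URL-style value into the three related settings."""
            if paperless_url.endswith("/"):
                raise ValueError("PAPERLESS_URL must not end with a trailing slash")
            parsed = urlparse(paperless_url)
            return {
                # Host header validation works on bare host names.
                "ALLOWED_HOSTS": [parsed.hostname],
                # CORS and CSRF checks compare full origins (scheme and host).
                "CORS_ALLOWED_HOSTS": [paperless_url],
                "CSRF_TRUSTED_ORIGINS": [paperless_url],
            }


        print(derive_from_url("https://paperless.domain.com"))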
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CSRF_TRUSTED_ORIGINS=<comma-separated-list>
 | 
					 | 
				
			||||||
    A list of trusted origins for unsafe requests (e.g. POST). As of Django 4.0
 | 
					 | 
				
			||||||
    this is required to access the Django admin via the web.
 | 
					 | 
				
			||||||
    See https://docs.djangoproject.com/en/4.0/ref/settings/#csrf-trusted-origins
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Can also be set using PAPERLESS_URL (see above).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to empty string, which does not add any origins to the trusted list.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ALLOWED_HOSTS=<comma-separated-list>
 | 
					 | 
				
			||||||
    If you're planning on putting Paperless on the open internet, then you
 | 
					 | 
				
			||||||
    really should set this value to the domain name you're using.  Failing to do
 | 
					 | 
				
			||||||
    so leaves you open to HTTP host header attacks:
 | 
					 | 
				
			||||||
    https://docs.djangoproject.com/en/3.1/topics/security/#host-header-validation
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Just remember that this is a comma-separated list, so "example.com" is fine,
 | 
					 | 
				
			||||||
    as is "example.com,www.example.com", but NOT " example.com" or "example.com,"
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Can also be set using PAPERLESS_URL (see above).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If manually set, please remember to include "localhost". Otherwise docker
 | 
					 | 
				
			||||||
    healthcheck will fail.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "*", which is all hosts.
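
    The formatting rule above is easier to see with a small sketch. It
    assumes the value is split on commas without trimming whitespace;
    ``parse_hosts`` is a hypothetical helper used only for illustration.

    .. code:: python

        def parse_hosts(value: str) -> list:
            """Split a comma-separated setting exactly as written."""
            # Plain split, no trimming: stray spaces and empty entries survive.
            return value.split(",")


        for value in ("example.com,www.example.com", " example.com", "example.com,"):
            print(repr(value), "->", parse_hosts(value))

    The second value yields a host name with a leading space and the third
    yields an extra empty entry, neither of which matches a real host.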
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CORS_ALLOWED_HOSTS=<comma-separated-list>
 | 
					 | 
				
			||||||
    You need to add your servers to the list of allowed hosts that can do CORS
 | 
					 | 
				
			||||||
    calls. Set this to your public domain name.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Can also be set using PAPERLESS_URL (see above).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "http://localhost:8000".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_FORCE_SCRIPT_NAME=<path>
 | 
					 | 
				
			||||||
    To host paperless under a subpath url like example.com/paperless you set
 | 
					 | 
				
			||||||
    this value to /paperless. No trailing slash!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to none, which hosts paperless at "/".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_STATIC_URL=<path>
 | 
					 | 
				
			||||||
    Override the STATIC_URL here.  Unless you're hosting Paperless off a
 | 
					 | 
				
			||||||
    subpath like /paperless/, you probably don't need to change this.
 | 
					 | 
				
			||||||
    If you do change it, be sure to include the trailing slash.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "/static/".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        When hosting paperless behind a reverse proxy like Traefik or Nginx at a subpath e.g.
 | 
					 | 
				
			||||||
        example.com/paperlessngx you will also need to set ``PAPERLESS_FORCE_SCRIPT_NAME``
 | 
					 | 
				
			||||||
        (see above).
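
    The two slash rules (no trailing slash for the script name, a trailing
    slash for the static URL) are easy to mix up. The sketch below builds a
    matching pair for a given subpath. The helper ``subpath_settings`` is
    hypothetical, and placing static files under ``<subpath>/static/`` is an
    assumption about a typical reverse proxy setup.

    .. code:: python

        def subpath_settings(subpath: str) -> dict:
            """Build matching values for hosting under a subpath such as /paperless."""
            # Normalize to a single leading slash and no trailing slash.
            script_name = "/" + subpath.strip("/")
            return {
                "PAPERLESS_FORCE_SCRIPT_NAME": script_name,
                # The static URL must end with a slash.
                "PAPERLESS_STATIC_URL": script_name + "/static/",
            }


        for key, value in subpath_settings("/paperless/").items():
            print(f"{key}={value}")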
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_AUTO_LOGIN_USERNAME=<username>
 | 
					 | 
				
			||||||
    Specify a username here so that paperless will automatically perform login
 | 
					 | 
				
			||||||
    with the selected user.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. danger::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Do not use this when exposing paperless on the internet. There are no
 | 
					 | 
				
			||||||
        checks in place that would prevent you from doing this.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to none, which disables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ADMIN_USER=<username>
 | 
					 | 
				
			||||||
    If this environment variable is specified, Paperless automatically creates
 | 
					 | 
				
			||||||
    a superuser with the provided username at start. This is useful in cases
 | 
					 | 
				
			||||||
    where you cannot run the `createsuperuser` command separately, such as Kubernetes
 | 
					 | 
				
			||||||
    or AWS ECS.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Requires `PAPERLESS_ADMIN_PASSWORD` to be set.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This will not change an existing [super]user's password, nor will
 | 
					 | 
				
			||||||
        it recreate a user that already exists. You can leave this throughout
 | 
					 | 
				
			||||||
        the lifecycle of the containers.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ADMIN_MAIL=<email>
 | 
					 | 
				
			||||||
    (Optional) Specify superuser email address. Only used when
 | 
					 | 
				
			||||||
    `PAPERLESS_ADMIN_USER` is set.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``root@localhost``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ADMIN_PASSWORD=<password>
 | 
					 | 
				
			||||||
    Only used when `PAPERLESS_ADMIN_USER` is set.
 | 
					 | 
				
			||||||
    This will be the password of the automatically created superuser.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_COOKIE_PREFIX=<str>
 | 
					 | 
				
			||||||
    Specify a prefix that is added to the cookies used by paperless to identify
 | 
					 | 
				
			||||||
    the currently logged in user. This is useful for when you're running two
 | 
					 | 
				
			||||||
    instances of paperless on the same host.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    After changing this, you will have to login again.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``""``, which does not alter the cookie names.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ENABLE_HTTP_REMOTE_USER=<bool>
 | 
					 | 
				
			||||||
    Allows authentication via HTTP_REMOTE_USER which is used by some SSO
 | 
					 | 
				
			||||||
    applications.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This will allow authentication by simply adding a ``Remote-User: <username>`` header
 | 
					 | 
				
			||||||
        to a request. Use with care! You especially *must* ensure that any such header is not
 | 
					 | 
				
			||||||
        passed from your proxy server to paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        If you're exposing paperless to the internet directly, do not use this.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Also see the warning `in the official documentation <https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration>`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to `false` which disables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>
 | 
					 | 
				
			||||||
    If `PAPERLESS_ENABLE_HTTP_REMOTE_USER` is enabled, this property allows you to
 | 
					 | 
				
			||||||
    customize the name of the HTTP header from which the authenticated username
 | 
					 | 
				
			||||||
    is extracted. Values are in terms of
 | 
					 | 
				
			||||||
    `HttpRequest.META <https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META>`_.
 | 
					 | 
				
			||||||
    Thus, the configured value must start with `HTTP_` followed by the
 | 
					 | 
				
			||||||
    normalized actual header name.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to `HTTP_REMOTE_USER`.
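
    The normalization Django applies to header names is: uppercase
    everything, replace hyphens with underscores, and prepend ``HTTP_``
    (``Content-Length`` and ``Content-Type`` are exceptions and get no
    prefix). A minimal sketch, with ``header_to_meta_key`` as a hypothetical
    helper name:

    .. code:: python

        def header_to_meta_key(header_name: str) -> str:
            """Convert an HTTP header name to its HttpRequest.META key."""
            return "HTTP_" + header_name.upper().replace("-", "_")


        print(header_to_meta_key("Remote-User"))
        print(header_to_meta_key("X-Forwarded-User"))

    Use the resulting key, not the raw header name, as the value of this
    setting.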
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_LOGOUT_REDIRECT_URL=<str>
 | 
					 | 
				
			||||||
    URL to redirect the user to after a logout. This can be used together with
 | 
					 | 
				
			||||||
    `PAPERLESS_ENABLE_HTTP_REMOTE_USER` to redirect the user back to the SSO
 | 
					 | 
				
			||||||
    application's logout page.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to None, which disables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _configuration-ocr:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
OCR settings
 | 
					 | 
				
			||||||
############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless uses `OCRmyPDF <https://ocrmypdf.readthedocs.io/en/latest/>`_ for
 | 
					 | 
				
			||||||
performing OCR on documents and images. Paperless uses sensible defaults for
 | 
					 | 
				
			||||||
most settings, but all of them can be configured to your needs.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_LANGUAGE=<lang>
 | 
					 | 
				
			||||||
    Customize the language that paperless will attempt to use when
 | 
					 | 
				
			||||||
    parsing documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    It should be a 3-letter language code consistent with ISO
 | 
					 | 
				
			||||||
    639: https://www.loc.gov/standards/iso639-2/php/code_list.php
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Set this to the language most of your documents are written in.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This can be a combination of multiple languages such as ``deu+eng``,
 | 
					 | 
				
			||||||
    in which case tesseract will use whatever language matches best.
 | 
					 | 
				
			||||||
    Keep in mind that tesseract uses much more cpu time with multiple
 | 
					 | 
				
			||||||
    languages enabled.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "eng".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Note: If your language contains a '-' such as chi-sim, you must use chi_sim
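
    As a small sketch of the two formatting rules (``+`` between languages,
    ``_`` instead of ``-``), the hypothetical helper below builds a value
    from a list of language codes:

    .. code:: python

        def build_ocr_language(codes: list) -> str:
            """Join language codes into a PAPERLESS_OCR_LANGUAGE value."""
            # Dashes are written as underscores; languages are joined by '+'.
            return "+".join(code.replace("-", "_") for code in codes)


        print(build_ocr_language(["deu", "eng"]))
        print(build_ocr_language(["chi-sim"]))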
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_MODE=<mode>
 | 
					 | 
				
			||||||
    Tell paperless when and how to perform ocr on your documents. Four modes
 | 
					 | 
				
			||||||
    are available:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``skip``: Paperless skips pages that already contain text and will perform ocr only on pages
 | 
					 | 
				
			||||||
        where no text is present. This is the safest option.
 | 
					 | 
				
			||||||
    *   ``skip_noarchive``: In addition to skip, paperless won't create an
 | 
					 | 
				
			||||||
        archived version of your documents when it finds any text in them.
 | 
					 | 
				
			||||||
        This is useful if you don't want to have two almost-identical versions
 | 
					 | 
				
			||||||
        of your digital documents in the media folder. This is the fastest option.
 | 
					 | 
				
			||||||
    *   ``redo``: Paperless will OCR all pages of your documents and attempt to
 | 
					 | 
				
			||||||
        replace any existing text layers with new text. This will be useful for
 | 
					 | 
				
			||||||
        documents from scanners that already performed OCR with insufficient
 | 
					 | 
				
			||||||
        results. It will also perform OCR on purely digital documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This option may fail on some documents that have features that cannot
 | 
					 | 
				
			||||||
        be removed, such as forms. In this case, the text from the document is
 | 
					 | 
				
			||||||
        used instead.
 | 
					 | 
				
			||||||
    *   ``force``: Paperless rasterizes your documents, converting any text
 | 
					 | 
				
			||||||
        into images and puts the OCRed text on top. This works for all documents,
 | 
					 | 
				
			||||||
        however, the resulting document may be significantly larger and text
 | 
					 | 
				
			||||||
        won't appear as sharp when zoomed in.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The default is ``skip``, which only performs OCR when necessary and always
 | 
					 | 
				
			||||||
    creates archived documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Read more about this in the `OCRmyPDF documentation <https://ocrmypdf.readthedocs.io/en/latest/advanced.html#when-ocr-is-skipped>`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_CLEAN=<mode>
 | 
					 | 
				
			||||||
    Tells paperless to use ``unpaper`` to clean any input document before
 | 
					 | 
				
			||||||
    sending it to tesseract. This uses more resources, but generally results
 | 
					 | 
				
			||||||
    in better OCR results. The following modes are available:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``clean``: Apply unpaper.
 | 
					 | 
				
			||||||
    *   ``clean-final``: Apply unpaper, and use the cleaned images to build the
 | 
					 | 
				
			||||||
        output file instead of the original images.
 | 
					 | 
				
			||||||
    *   ``none``: Do not apply unpaper.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``clean``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        ``clean-final`` is incompatible with ocr mode ``redo``. When both
 | 
					 | 
				
			||||||
        ``clean-final`` and the ocr mode ``redo`` are configured, ``clean``
 | 
					 | 
				
			||||||
        is used instead.
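
    The fallback described in the note can be written down as a tiny
    decision function. This is an illustrative sketch only;
    ``effective_clean_mode`` is a hypothetical name and not paperless code.

    .. code:: python

        def effective_clean_mode(ocr_mode: str, clean: str) -> str:
            """Return the cleaning mode that is actually applied."""
            # clean-final cannot be combined with redo, so fall back to clean.
            if ocr_mode == "redo" and clean == "clean-final":
                return "clean"
            return clean


        print(effective_clean_mode("redo", "clean-final"))
        print(effective_clean_mode("skip", "clean-final"))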
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_DESKEW=<bool>
 | 
					 | 
				
			||||||
    Tells paperless to correct skewing (slight rotation of input images mainly
 | 
					 | 
				
			||||||
    due to improper scanning).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``true``, which enables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Deskewing is incompatible with ocr mode ``redo``. Deskewing will get
 | 
					 | 
				
			||||||
        disabled automatically if ``redo`` is used as the ocr mode.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_ROTATE_PAGES=<bool>
 | 
					 | 
				
			||||||
    Tells paperless to correct page rotation (90°, 180° and 270° rotation).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If you notice that paperless is not rotating incorrectly rotated
 | 
					 | 
				
			||||||
    pages (or vice versa), try adjusting the threshold up or down (see below).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``true``, which enables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=<num>
 | 
					 | 
				
			||||||
    Adjust the threshold for automatic page rotation by ``PAPERLESS_OCR_ROTATE_PAGES``.
 | 
					 | 
				
			||||||
    This is an arbitrary value reported by tesseract. "15" is a very conservative value,
 | 
					 | 
				
			||||||
    whereas "2" is a very aggressive option and will often result in correctly rotated pages
 | 
					 | 
				
			||||||
    being rotated as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "12".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_OUTPUT_TYPE=<type>
 | 
					 | 
				
			||||||
    Specify the type of PDF documents that paperless should produce.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``pdf``: Modify the PDF document as little as possible.
 | 
					 | 
				
			||||||
    *   ``pdfa``: Convert PDF documents into PDF/A-2b documents, which is a
 | 
					 | 
				
			||||||
        subset of the entire PDF specification and meant for storing
 | 
					 | 
				
			||||||
        documents long term.
 | 
					 | 
				
			||||||
    *   ``pdfa-1``, ``pdfa-2``, ``pdfa-3`` to specify the exact version of
 | 
					 | 
				
			||||||
        PDF/A you wish to use.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If not specified, ``pdfa`` is used. Remember that paperless also keeps
 | 
					 | 
				
			||||||
    the original input file as well as the archived version.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_PAGES=<num>
 | 
					 | 
				
			||||||
    Tells paperless to use only the specified number of pages for OCR. Documents
 | 
					 | 
				
			||||||
    with fewer than the specified number of pages get OCR'ed completely.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Specifying 1 here will only use the first page.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    When combined with ``PAPERLESS_OCR_MODE=redo`` or ``PAPERLESS_OCR_MODE=force``,
 | 
					 | 
				
			||||||
    paperless will not modify any text it finds on excluded pages and copy it
 | 
					 | 
				
			||||||
    verbatim.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 0, which disables this feature and always uses all pages.
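
    The page limit rule can be summarized in a few lines. The function below
    is a hypothetical illustration of the behavior described above, not the
    actual implementation:

    .. code:: python

        def pages_to_ocr(total_pages: int, ocr_pages: int = 0) -> int:
            """Number of leading pages that get OCR'ed for a document."""
            if ocr_pages <= 0:
                # 0 disables the limit, so every page is used.
                return total_pages
            # Short documents are processed completely.
            return min(total_pages, ocr_pages)


        print(pages_to_ocr(10, 1))
        print(pages_to_ocr(3, 5))
        print(pages_to_ocr(10))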
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_IMAGE_DPI=<num>
 | 
					 | 
				
			||||||
    Paperless will OCR any images you put into the system and convert them
 | 
					 | 
				
			||||||
    into PDF documents. This is useful if your scanner produces images.
 | 
					 | 
				
			||||||
    In order to do so, paperless needs to know the DPI of the image.
 | 
					 | 
				
			||||||
    Most images from scanners will have this information embedded and
 | 
					 | 
				
			||||||
    paperless will detect and use that information. In case this fails, it
 | 
					 | 
				
			||||||
    uses this value as a fallback.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Set this to the DPI your scanner produces images at.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Default is none, which will automatically calculate image DPI so that
 | 
					 | 
				
			||||||
    the produced PDF documents are A4 sized.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_MAX_IMAGE_PIXELS=<num>
 | 
					 | 
				
			||||||
    Paperless will raise a warning when OCRing images which are over this limit and
 | 
					 | 
				
			||||||
    will not OCR images which are more than twice this limit.  Note this does not
 | 
					 | 
				
			||||||
    prevent the document from being consumed, but could result in missing text content.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If unset, will default to the value determined by
 | 
					 | 
				
			||||||
    `Pillow <https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.MAX_IMAGE_PIXELS>`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Increasing this limit could cause Paperless to consume additional resources
 | 
					 | 
				
			||||||
        when consuming a file.  Be sure you have sufficient system resources.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        The limit is intended to prevent malicious files from consuming system resources
 | 
					 | 
				
			||||||
        and causing crashes and other errors.  Only increase this value if you are certain
 | 
					 | 
				
			||||||
        your documents are not malicious and you need the text which was not OCRed.
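
    The two thresholds described above (a warning above the limit, no OCR
    above twice the limit) can be sketched as follows. ``classify_image`` is
    a hypothetical helper, and the limit in the example is an arbitrary
    number chosen for illustration.

    .. code:: python

        def classify_image(pixels: int, limit: int) -> str:
            """Describe how an image of the given size is treated."""
            if pixels > 2 * limit:
                return "skip"  # more than twice the limit: not OCRed
            if pixels > limit:
                return "warn"  # over the limit: OCRed, but with a warning
            return "ok"


        limit = 100_000_000  # illustrative value only
        for width, height in ((4000, 6000), (12000, 12000), (20000, 20000)):
            print(width, height, classify_image(width * height, limit))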
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_USER_ARGS=<json>
 | 
					 | 
				
			||||||
    OCRmyPDF offers many more options. Use this parameter to specify any
 | 
					 | 
				
			||||||
    additional arguments you wish to pass to OCRmyPDF. Since Paperless uses
 | 
					 | 
				
			||||||
    the API of OCRmyPDF, you have to specify these in a format that can be
 | 
					 | 
				
			||||||
    passed to the API. See `the API reference of OCRmyPDF <https://ocrmypdf.readthedocs.io/en/latest/api.html#reference>`_
 | 
					 | 
				
			||||||
    for valid parameters. All command line options are supported, but they
 | 
					 | 
				
			||||||
    use underscores instead of dashes.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Paperless has been tested to work with the OCR options provided
 | 
					 | 
				
			||||||
        above. There are many options that are incompatible with each other,
 | 
					 | 
				
			||||||
        so specifying invalid options may prevent paperless from consuming
 | 
					 | 
				
			||||||
        any documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Specify arguments as a JSON dictionary. Take note of lowercase booleans
 | 
					 | 
				
			||||||
    and double quoted parameter names and strings. Examples:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: json
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        {"deskew": true, "optimize": 3, "unpaper_args": "--pre-rotate 90"}
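
    Before setting the variable, you can check that a value is valid JSON
    with Python's standard ``json`` module. This sketch only validates the
    syntax; it does not check whether OCRmyPDF accepts the individual
    options.

    .. code:: python

        import json

        raw = '{"deskew": true, "optimize": 3, "unpaper_args": "--pre-rotate 90"}'
        user_args = json.loads(raw)
        # JSON true becomes Python True; keys and strings need double quotes.
        print(user_args)

        # Python-style literals (single quotes, capitalized True) are not
        # valid JSON and are rejected.
        try:
            json.loads("{'deskew': True}")
        except json.JSONDecodeError as exc:
            print(f"invalid JSON: {exc}")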
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _configuration-tika:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Tika settings
 | 
					 | 
				
			||||||
#############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless can make use of `Tika <https://tika.apache.org/>`_ and
 | 
					 | 
				
			||||||
`Gotenberg <https://gotenberg.dev/>`_ for parsing and
 | 
					 | 
				
			||||||
converting "Office" documents (such as ".doc", ".xlsx" and ".odt"). If you
 | 
					 | 
				
			||||||
wish to use this, you must provide a Tika server and a Gotenberg server,
 | 
					 | 
				
			||||||
configure their endpoints, and enable the feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_TIKA_ENABLED=<bool>
 | 
					 | 
				
			||||||
    Enable (or disable) the Tika parser.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to false.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_TIKA_ENDPOINT=<url>
 | 
					 | 
				
			||||||
    Set the endpoint URL where Paperless can reach your Tika server.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "http://localhost:9998".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_TIKA_GOTENBERG_ENDPOINT=<url>
 | 
					 | 
				
			||||||
    Set the endpoint URL where Paperless can reach your Gotenberg server.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to "http://localhost:3000".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you run paperless on docker, you can add those services to the docker-compose
 | 
					 | 
				
			||||||
file (see the provided ``docker-compose.sqlite-tika.yml`` file for reference). The changes
 | 
					 | 
				
			||||||
required are as follows:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: yaml
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    services:
 | 
					 | 
				
			||||||
        # ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        webserver:
 | 
					 | 
				
			||||||
            # ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            environment:
 | 
					 | 
				
			||||||
                # ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                PAPERLESS_TIKA_ENABLED: 1
 | 
					 | 
				
			||||||
                PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
 | 
					 | 
				
			||||||
                PAPERLESS_TIKA_ENDPOINT: http://tika:9998
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        # ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        gotenberg:
 | 
					 | 
				
			||||||
            image: gotenberg/gotenberg:7.6
 | 
					 | 
				
			||||||
            restart: unless-stopped
 | 
					 | 
				
			||||||
            command:
 | 
					 | 
				
			||||||
                - "gotenberg"
 | 
					 | 
				
			||||||
                - "--chromium-disable-routes=true"
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        tika:
 | 
					 | 
				
			||||||
            image: ghcr.io/paperless-ngx/tika:latest
 | 
					 | 
				
			||||||
            restart: unless-stopped
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Add the configuration variables to the environment of the webserver (alternatively
 | 
					 | 
				
			||||||
put the configuration in the ``docker-compose.env`` file) and add the additional
 | 
					 | 
				
			||||||
services below the webserver service. Watch out for indentation.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Make sure to use the correct format `PAPERLESS_TIKA_ENABLED = 1` so python_dotenv can parse the statement correctly.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Software tweaks
 | 
					 | 
				
			||||||
###############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_TASK_WORKERS=<num>
 | 
					 | 
				
			||||||
    Paperless does multiple things in the background: Maintain the search index,
 | 
					 | 
				
			||||||
    maintain the automatic matching algorithm, check emails, consume documents,
 | 
					 | 
				
			||||||
    etc. This variable specifies how many things it will do in parallel.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 1.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_THREADS_PER_WORKER=<num>
 | 
					 | 
				
			||||||
    Furthermore, paperless uses multiple threads when consuming documents to
 | 
					 | 
				
			||||||
    speed up OCR. This variable specifies how many pages paperless will process
 | 
					 | 
				
			||||||
    in parallel on a single document.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Ensure that the product
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        does not exceed your CPU core count or else paperless will be extremely slow.
 | 
					 | 
				
			||||||
        If you want paperless to process many documents in parallel, choose a high
 | 
					 | 
				
			||||||
        worker count. If you want paperless to process very large documents faster,
 | 
					 | 
				
			||||||
        use a higher threads-per-worker count.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The default is a balance between the two, according to your CPU core count,
 | 
					 | 
				
			||||||
    slightly favoring threads per worker:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    +----------------+---------+---------+
    | CPU core count | Workers | Threads |
    +----------------+---------+---------+
    |              1 |       1 |       1 |
    +----------------+---------+---------+
    |              2 |       2 |       1 |
    +----------------+---------+---------+
    |              4 |       2 |       2 |
    +----------------+---------+---------+
    |              6 |       2 |       3 |
    +----------------+---------+---------+
    |              8 |       2 |       4 |
    +----------------+---------+---------+
    |             12 |       3 |       4 |
    +----------------+---------+---------+
    |             16 |       4 |       4 |
    +----------------+---------+---------+
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If you only specify PAPERLESS_TASK_WORKERS, paperless will adjust
 | 
					 | 
				
			||||||
    PAPERLESS_THREADS_PER_WORKER automatically.
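
The caution above amounts to a simple product check. The following is a minimal sketch, assuming a hypothetical helper named ``fits_cpu_budget`` that is not part of paperless; it only illustrates the rule that workers times threads should stay within the core count.

```python
import os


def fits_cpu_budget(task_workers: int, threads_per_worker: int, cpu_count: int) -> bool:
    """Return True when workers * threads stays within the given core count.

    Hypothetical helper for illustration only, not a paperless function.
    """
    return task_workers * threads_per_worker <= cpu_count


# Row from the defaults table: 8 cores -> 2 workers, 4 threads per worker.
print(fits_cpu_budget(2, 4, 8))
# 4 workers with 4 threads each would oversubscribe an 8-core machine.
print(fits_cpu_budget(4, 4, 8))

# os.cpu_count() may return None on some platforms, so fall back to 1.
cores = os.cpu_count() or 1
print(fits_cpu_budget(1, 1, cores))
```

Checking a planned combination this way before editing the configuration can help avoid the slowdown described in the caution.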
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_WORKER_TIMEOUT=<num>
    Machines with few or slow cores might not be able to finish OCR on
    large documents within the default 1800 seconds. Extending this timeout
    may prove useful on such hardware.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_WORKER_RETRY=<num>
    If PAPERLESS_WORKER_TIMEOUT has been configured, the retry time for a task can
    also be configured. By default, this value will be set to 10 seconds more than the
    worker timeout. This value should never be set lower than the worker timeout.
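
The relationship between the two values can be written down directly. This is a minimal sketch using hypothetical helper names (``default_retry``, ``is_valid_retry``); how paperless computes the value internally may differ.

```python
DEFAULT_WORKER_TIMEOUT = 1800  # seconds, the documented default


def default_retry(worker_timeout: int = DEFAULT_WORKER_TIMEOUT) -> int:
    """Documented default: 10 seconds more than the worker timeout."""
    return worker_timeout + 10


def is_valid_retry(worker_retry: int, worker_timeout: int) -> bool:
    """The retry time should never be lower than the worker timeout."""
    return worker_retry >= worker_timeout


print(default_retry())           # 1810
print(default_retry(3600))       # 3610
print(is_valid_retry(900, 1800)) # False
```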
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_TIME_ZONE=<timezone>
 | 
					 | 
				
			||||||
    Set the time zone here.
 | 
					 | 
				
			||||||
    See https://docs.djangoproject.com/en/3.1/ref/settings/#std:setting-TIME_ZONE
 | 
					 | 
				
			||||||
    for details on how to set it.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to UTC.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _configuration-polling:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_POLLING=<num>
    If paperless does not find documents added to your consume folder, it might
    not be able to automatically detect filesystem changes. In that case,
    specify a polling interval in seconds here, which will then cause paperless
    to periodically check your consumption directory for changes. This will also
    disable listening for file system changes with ``inotify``.

    Defaults to 0, which disables polling and uses filesystem notifications.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_POLLING_RETRY_COUNT=<num>
 | 
					 | 
				
			||||||
    If consumer polling is enabled, sets the number of times paperless will check for a
 | 
					 | 
				
			||||||
    file to remain unmodified.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 5.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_POLLING_DELAY=<num>
    If consumer polling is enabled, sets the delay in seconds between each of the
    checks described above, which paperless performs while waiting for a file to
    remain unmodified.

    Defaults to 5.
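
The retry count and the delay together form a bounded stability check. The sketch below uses a hypothetical helper, ``wait_until_unmodified``, and assumes that "unmodified" means the file size and modification time are the same on two consecutive checks. The actual consumer in paperless may be implemented differently.

```python
import os
import time


def wait_until_unmodified(path: str, retry_count: int = 5, delay: float = 5.0) -> bool:
    """Check the file up to retry_count times, delay seconds apart.

    Returns True as soon as two consecutive checks see the same size and
    modification time, and False if the checks run out first.
    Hypothetical helper for illustration only.
    """
    previous = None
    for _ in range(retry_count):
        stat = os.stat(path)
        current = (stat.st_size, stat.st_mtime)
        if current == previous:
            return True  # unchanged since the previous check
        previous = current
        time.sleep(delay)
    return False
```

With the documented defaults (5 checks, 5 seconds apart), a file that is still being written would be given roughly 25 seconds to settle under this sketch.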
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _configuration-inotify:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_INOTIFY_DELAY=<num>
 | 
					 | 
				
			||||||
    Sets the time in seconds the consumer will wait for additional events
 | 
					 | 
				
			||||||
    from inotify before the consumer will consider a file ready and begin consumption.
 | 
					 | 
				
			||||||
    Certain scanners or network setups may generate multiple events for a single file,
 | 
					 | 
				
			||||||
    leading to multiple consumers working on the same file.  Configure this to
 | 
					 | 
				
			||||||
    prevent that.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 0.5 seconds.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool>
 | 
					 | 
				
			||||||
    When the consumer detects a duplicate document, it will not touch the
 | 
					 | 
				
			||||||
    original document. This default behavior can be changed here.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to false.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_RECURSIVE=<bool>
 | 
					 | 
				
			||||||
    Enable recursive watching of the consumption directory. Paperless will
    then pick up files in subdirectories within your consumption
    directory as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to false.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=<bool>
 | 
					 | 
				
			||||||
    Set the names of subdirectories as tags for consumed files.
 | 
					 | 
				
			||||||
    E.g. <CONSUMPTION_DIR>/foo/bar/file.pdf will add the tags "foo" and "bar" to
 | 
					 | 
				
			||||||
    the consumed file. Paperless will create any tags that don't exist yet.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This is useful for sorting documents with certain tags such as ``car`` or
 | 
					 | 
				
			||||||
    ``todo`` prior to consumption. These folders won't be deleted.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    PAPERLESS_CONSUMER_RECURSIVE must be enabled for this to work.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to false.
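
The mapping from subdirectories to tags can be illustrated with a short sketch. The helper name ``tags_from_subdirs`` and the ``/consume`` path are assumptions made for this example and are not part of paperless.

```python
from pathlib import PurePosixPath
from typing import List


def tags_from_subdirs(file_path: str, consumption_dir: str) -> List[str]:
    """Return the subdirectory names between consumption_dir and the file.

    Hypothetical helper for illustration only.
    """
    relative = PurePosixPath(file_path).relative_to(consumption_dir)
    # parts[:-1] drops the filename and keeps only the directory names.
    return list(relative.parts[:-1])


print(tags_from_subdirs("/consume/foo/bar/file.pdf", "/consume"))  # ['foo', 'bar']
print(tags_from_subdirs("/consume/file.pdf", "/consume"))          # []
```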
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_ENABLE_BARCODES=<bool>
 | 
					 | 
				
			||||||
    Enables the scanning and page separation based on detected barcodes.
 | 
					 | 
				
			||||||
    This allows for scanning and adding multiple documents per uploaded
 | 
					 | 
				
			||||||
    file, which are separated by one or multiple barcode pages.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    For ease of use, it is suggested to use a standardized separation page,
 | 
					 | 
				
			||||||
    e.g. `here <https://www.alliancegroup.co.uk/patch-codes.htm>`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If no barcodes are detected in the uploaded file, no page separation
 | 
					 | 
				
			||||||
    will happen.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The original document will be removed and the separated pages will be
    saved as PDF.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to false.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_BARCODE_TIFF_SUPPORT=<bool>
 | 
					 | 
				
			||||||
    Whether TIFF image files should be scanned for barcodes.
    This will automatically convert any TIFF image(s) to PDFs for later
    processing.
    This only has an effect if PAPERLESS_CONSUMER_ENABLE_BARCODES has been
    enabled.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to false.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_BARCODE_STRING=PATCHT
    Defines the string to be detected as a separator barcode.
    If paperless is used with the PATCH-T separator pages, users
    shouldn't change this.

    Defaults to "PATCHT".
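
To make the separation idea concrete, the sketch below splits a list of per-page barcode values at the separator string. It uses a hypothetical helper, ``split_at_separators``, and assumes that separator pages themselves are discarded; the actual barcode handling in paperless may differ.

```python
from typing import List, Optional


def split_at_separators(
    page_barcodes: List[Optional[str]], separator: str = "PATCHT"
) -> List[List[int]]:
    """Group page indices into documents, splitting at separator pages.

    page_barcodes holds the decoded barcode text per page, or None when a
    page carries no barcode. Hypothetical helper for illustration only.
    """
    documents: List[List[int]] = []
    current: List[int] = []
    for index, barcode in enumerate(page_barcodes):
        if barcode == separator:
            # A separator closes the current document (if it has any pages).
            if current:
                documents.append(current)
            current = []
        else:
            current.append(index)
    if current:
        documents.append(current)
    return documents


print(split_at_separators([None, None, "PATCHT", None]))  # [[0, 1], [3]]
```

When no page carries the separator string, the sketch returns a single document containing every page, which matches the statement that no page separation happens in that case.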
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONVERT_MEMORY_LIMIT=<num>
 | 
					 | 
				
			||||||
    On smaller systems, or even in the case of Very Large Documents, the consumer
 | 
					 | 
				
			||||||
    may explode, complaining about how it's "unable to extend pixel cache".  In
 | 
					 | 
				
			||||||
    such cases, try setting this to a reasonably low value, like 32.  The
 | 
					 | 
				
			||||||
    default is to use whatever is necessary to do everything without writing to
 | 
					 | 
				
			||||||
    disk, and units are in megabytes.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    For more information on how to use this value, you should search
 | 
					 | 
				
			||||||
    the web for "MAGICK_MEMORY_LIMIT".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 0, which disables the limit.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONVERT_TMPDIR=<path>
 | 
					 | 
				
			||||||
    Similar to the memory limit, if you've got a small system and your OS mounts
 | 
					 | 
				
			||||||
    /tmp as tmpfs, you should set this to a path that's on a physical disk, like
 | 
					 | 
				
			||||||
    /home/your_user/tmp or something.  ImageMagick will use this as scratch space
 | 
					 | 
				
			||||||
    when crunching through very large documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    For more information on how to use this value, you should search
 | 
					 | 
				
			||||||
    the web for "MAGICK_TMPDIR".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Default is none, which disables the temporary directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_POST_CONSUME_SCRIPT=<filename>
 | 
					 | 
				
			||||||
    After a document is consumed, Paperless can trigger an arbitrary script if
 | 
					 | 
				
			||||||
    you like.  This script will be passed a number of arguments for you to work
 | 
					 | 
				
			||||||
    with. For more information, take a look at :ref:`advanced-post_consume_script`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The default is blank, which means nothing will be executed.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_FILENAME_DATE_ORDER=<format>
 | 
					 | 
				
			||||||
    Paperless will check the document text for document date information.
 | 
					 | 
				
			||||||
    Use this setting to enable checking the document filename for date
 | 
					 | 
				
			||||||
    information. The date order can be set to any option as specified in
 | 
					 | 
				
			||||||
    https://dateparser.readthedocs.io/en/latest/settings.html#date-order.
 | 
					 | 
				
			||||||
    The filename will be checked first, and if nothing is found, the document
 | 
					 | 
				
			||||||
    text will be checked as normal.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    A date in a filename must have some separators (`.`, `-`, `/`, etc)
 | 
					 | 
				
			||||||
    for it to be parsed.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to none, which disables this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_NUMBER_OF_SUGGESTED_DATES=<num>
 | 
					 | 
				
			||||||
    Paperless searches an entire document for dates. The first date found will
 | 
					 | 
				
			||||||
    be used as the initial value for the created date. When this variable is
    greater than 0 (or left at its default value), paperless will also suggest
 | 
					 | 
				
			||||||
    other dates found in the document, up to a maximum of this setting. Note that
 | 
					 | 
				
			||||||
    duplicates will be removed, which can result in fewer dates displayed in the
 | 
					 | 
				
			||||||
    frontend than this setting value.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The task to find all dates can be time-consuming and increases with a higher
 | 
					 | 
				
			||||||
    (maximum) number of suggested dates and slower hardware.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 3. Set to 0 to disable this feature.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_THUMBNAIL_FONT_NAME=<filename>
 | 
					 | 
				
			||||||
    Paperless creates thumbnails for plain text files by rendering the content
 | 
					 | 
				
			||||||
    of the file on an image and uses a predefined font for that. This
 | 
					 | 
				
			||||||
    font can be changed here.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Note that this won't have any effect on already generated thumbnails.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``/usr/share/fonts/liberation/LiberationSerif-Regular.ttf``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_IGNORE_DATES=<string>
 | 
					 | 
				
			||||||
    Paperless parses a document's creation date from filename and file content.
    You may specify a comma-separated list of dates that should be ignored during
    this process. This is useful for special dates (like date of birth) that appear
    in documents regularly but are very unlikely to be the document's creation date.

    The date is parsed using the order specified in PAPERLESS_DATE_ORDER.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to an empty string to not ignore any dates.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_DATE_ORDER=<format>
 | 
					 | 
				
			||||||
    Paperless will try to determine the document creation date from its contents.
 | 
					 | 
				
			||||||
    Specify the date format Paperless should expect to see within your documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This option defaults to DMY which translates to day first, month second, and year
 | 
					 | 
				
			||||||
    last order. Characters D, M, or Y can be shuffled to meet the required order.
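
The effect of the order is easiest to see on an ambiguous date. The following sketch uses only the Python standard library and a hypothetical helper, ``parse_with_order``; it is not how paperless parses dates and assumes a fixed separator and four-digit years.

```python
from datetime import date, datetime

# Map each order character to the matching strptime directive.
_DIRECTIVES = {"D": "%d", "M": "%m", "Y": "%Y"}


def parse_with_order(text: str, order: str = "DMY", sep: str = ".") -> date:
    """Parse a numeric date string using the given D/M/Y order.

    Hypothetical helper for illustration only.
    """
    fmt = sep.join(_DIRECTIVES[char] for char in order)
    return datetime.strptime(text, fmt).date()


# The same string yields different dates depending on the order.
print(parse_with_order("01.02.2020", "DMY"))  # 2020-02-01
print(parse_with_order("01.02.2020", "MDY"))  # 2020-01-02
```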
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONSUMER_IGNORE_PATTERNS=<json>
 | 
					 | 
				
			||||||
    By default, paperless ignores certain files and folders in the consumption
    directory, such as system files created by macOS.

    This can be adjusted by configuring a custom JSON array with patterns to exclude.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to ``[".DS_STORE/*", "._*", ".stfolder/*", ".stversions/*", ".localized/*", "desktop.ini"]``.
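
Because the value must be a JSON array, it can help to validate it before putting it into the configuration. The sketch below uses hypothetical helpers (``load_patterns``, ``is_ignored``) and assumes shell-style wildcard matching via the standard ``fnmatch`` module; the matching rules paperless actually applies may differ.

```python
import json
from fnmatch import fnmatchcase
from typing import List

# The documented default value, written as it would appear in the configuration.
DEFAULT_IGNORE_PATTERNS = (
    '[".DS_STORE/*", "._*", ".stfolder/*", ".stversions/*", '
    '".localized/*", "desktop.ini"]'
)


def load_patterns(raw: str) -> List[str]:
    """Parse the setting and make sure it is a JSON array of strings."""
    patterns = json.loads(raw)
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError("expected a JSON array of strings")
    return patterns


def is_ignored(relative_path: str, patterns: List[str]) -> bool:
    """Assumed matching rule: any shell-style pattern matches the path."""
    return any(fnmatchcase(relative_path, pattern) for pattern in patterns)


patterns = load_patterns(DEFAULT_IGNORE_PATTERNS)
print(is_ignored("._scan.pdf", patterns))
print(is_ignored("invoice.pdf", patterns))
```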
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Binaries
 | 
					 | 
				
			||||||
########
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
There are a few external software packages that Paperless expects to find on
 | 
					 | 
				
			||||||
your system when it starts up.  Unless you've done something creative with
 | 
					 | 
				
			||||||
their installation, you probably won't need to edit any of these.  However,
 | 
					 | 
				
			||||||
if you've installed these programs somewhere where simply typing the name of
 | 
					 | 
				
			||||||
the program doesn't automatically execute it (i.e. the program isn't in your
 | 
					 | 
				
			||||||
$PATH), then you'll need to specify the literal path for that program.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_CONVERT_BINARY=<path>
 | 
					 | 
				
			||||||
    Defaults to "convert".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_GS_BINARY=<path>
 | 
					 | 
				
			||||||
    Defaults to "gs".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _configuration-docker:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Docker-specific options
 | 
					 | 
				
			||||||
#######################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
These options don't have any effect in ``paperless.conf``. These options adjust
 | 
					 | 
				
			||||||
the behavior of the docker container. Configure these in ``docker-compose.env``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_WEBSERVER_WORKERS=<num>
 | 
					 | 
				
			||||||
    The number of worker processes the webserver should spawn. More worker processes
    usually let the front end load data much more quickly. However, each worker process
 | 
					 | 
				
			||||||
    also loads the entire application into memory separately, so increasing this value
 | 
					 | 
				
			||||||
    will increase RAM usage.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 1.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_BIND_ADDR=<ip address>
 | 
					 | 
				
			||||||
    The IP address the webserver will listen on inside the container. There are
 | 
					 | 
				
			||||||
    special setups where you may need to configure this value to restrict the
 | 
					 | 
				
			||||||
    IP address or interface the webserver listens on.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to [::], meaning all interfaces, including IPv6.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_PORT=<port>
 | 
					 | 
				
			||||||
    The port number the webserver will listen on inside the container. There are
 | 
					 | 
				
			||||||
    special setups where you may need this to avoid collisions with other
 | 
					 | 
				
			||||||
    services (like using podman with multiple containers in one pod).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Don't change this when using Docker. To change the port on which the webserver
    is reachable outside of the container, refer instead to the "ports" key in
    ``docker-compose.yml``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 8000.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
USERMAP_UID=<uid>
 | 
					 | 
				
			||||||
    The ID of the paperless user in the container. Set this to your actual user ID on the
 | 
					 | 
				
			||||||
    host system, which you can get by executing
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ id -u
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Paperless will change ownership on its folders to this user, so you need to get this right
 | 
					 | 
				
			||||||
    in order to be able to write to the consumption directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 1000.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
USERMAP_GID=<gid>
 | 
					 | 
				
			||||||
    The ID of the paperless group in the container. Set this to your actual group ID on the
 | 
					 | 
				
			||||||
    host system, which you can get by executing
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ id -g
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Paperless will change ownership on its folders to this group, so you need to get this right
 | 
					 | 
				
			||||||
    in order to be able to write to the consumption directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to 1000.
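
The same IDs that ``id -u`` and ``id -g`` print can be read programmatically. This sketch assumes a POSIX host (``os.getuid`` and ``os.getgid`` are not available on Windows) and uses a hypothetical helper, ``usermap_env``, to build the two settings for the current user.

```python
import os
from typing import Dict


def usermap_env() -> Dict[str, str]:
    """Build the USERMAP_* settings from the current user's IDs (POSIX only).

    Hypothetical helper for illustration only.
    """
    return {
        "USERMAP_UID": str(os.getuid()),
        "USERMAP_GID": str(os.getgid()),
    }


# Print the values in KEY=VALUE form, ready to paste into an env file.
for key, value in usermap_env().items():
    print(f"{key}={value}")
```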
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_OCR_LANGUAGES=<list>
 | 
					 | 
				
			||||||
    Additional OCR languages to install. By default, paperless comes with
 | 
					 | 
				
			||||||
    English, German, Italian, Spanish and French. If your language is not in this list, install
 | 
					 | 
				
			||||||
    additional languages with this configuration option:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        PAPERLESS_OCR_LANGUAGES=tur ces
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    To actually use these languages, also set the default OCR language of paperless:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        PAPERLESS_OCR_LANGUAGE=tur
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Defaults to none, which does not install any additional languages.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ENABLE_FLOWER=<defined>
 | 
					 | 
				
			||||||
    If this environment variable is defined, the Celery monitoring tool
 | 
					 | 
				
			||||||
    `Flower <https://flower.readthedocs.io/en/latest/index.html>`_ will
 | 
					 | 
				
			||||||
    be started by the container.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You can read more about this in the :ref:`advanced setup <advanced-celery-monitoring>`
 | 
					 | 
				
			||||||
    documentation.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _configuration-update-checking:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Update Checking
 | 
					 | 
				
			||||||
###############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
PAPERLESS_ENABLE_UPDATE_CHECK=<bool>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            This setting was deprecated in favor of a frontend setting after v1.9.2. A one-time
 | 
					 | 
				
			||||||
            migration is performed for users who have this setting set. This setting is always
 | 
					 | 
				
			||||||
            ignored if the corresponding frontend setting has been set.
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
@@ -1,431 +1,12 @@
 | 
				
			|||||||
.. _extending:
 | 
					.. _extending:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					*************************
 | 
				
			||||||
Paperless-ngx Development
 | 
					Paperless-ngx Development
 | 
				
			||||||
#########################
 | 
					*************************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
This section describes the steps you need to take to start development on paperless-ngx.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Check out the source from github. The repository is organized in the following way:
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
*   ``main`` always represents the latest release and will only see changes
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
    when a new release is made.
 | 
					 | 
				
			||||||
*   ``dev`` contains the code that will be in the next release.
 | 
					 | 
				
			||||||
*   ``feature-X`` contain bigger changes that will be in some release, but not
 | 
					 | 
				
			||||||
    necessarily the next one.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
When making functional changes to paperless, *always* make your changes on the ``dev`` branch.
 | 
					    You will be redirected shortly...
 | 
				
			||||||
 | 
					 | 
				
			||||||
Apart from that, the folder structure is as follows:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   ``docs/`` - Documentation.
 | 
					 | 
				
			||||||
*   ``src-ui/`` - Code of the front end.
 | 
					 | 
				
			||||||
*   ``src/`` - Code of the back end.
 | 
					 | 
				
			||||||
*   ``scripts/`` - Various scripts that help with different parts of development.
 | 
					 | 
				
			||||||
*   ``docker/`` - Files required to build the docker image.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Contributing to Paperless
 | 
					 | 
				
			||||||
=========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Maybe you've been using Paperless for a while and want to add a feature or two,
 | 
					 | 
				
			||||||
or maybe you've come across a bug that you have some ideas how to solve.  The
 | 
					 | 
				
			||||||
beauty of open source software is that you can see what's wrong and help to get
 | 
					 | 
				
			||||||
it fixed for everyone!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Before contributing please review our `code of conduct`_ and other important
 | 
					 | 
				
			||||||
information in the `contributing guidelines`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _code-formatting-with-pre-commit-hooks:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Code formatting with pre-commit Hooks
 | 
					 | 
				
			||||||
=====================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
To ensure a consistent style and formatting across the project source, the project
 | 
					 | 
				
			||||||
utilizes a Git `pre-commit` hook to perform some formatting and linting before a
 | 
					 | 
				
			||||||
commit is allowed. That way, everyone uses the same style and some common issues
 | 
					 | 
				
			||||||
can be caught early on. See below for installation instructions.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Once installed, hooks will run when you commit. If the formatting isn't quite right
 | 
					 | 
				
			||||||
or a linter catches something, the commit will be rejected. You'll need to look at the
 | 
					 | 
				
			||||||
output and fix the issue. Some hooks, such as the Python formatting tool `black`,
 | 
					 | 
				
			||||||
will format failing files, so all you need to do is `git add` those files again and
 | 
					 | 
				
			||||||
retry your commit.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Initial setup and first start
 | 
					 | 
				
			||||||
=============================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
After you have forked and cloned the code from GitHub, you need to perform a first-time setup.
To do the setup, perform the following steps in order:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Install prerequisites + pipenv as mentioned in :ref:`Bare metal route <setup-bare_metal>`
 | 
					 | 
				
			||||||
2.  Copy ``paperless.conf.example`` to ``paperless.conf`` and enable debug mode.
 | 
					 | 
				
			||||||
3.  Install the Angular CLI:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ npm install -g @angular/cli
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Install pre-commit
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        pre-commit install
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
5.  Create ``consume`` and ``media`` folders in the cloned root folder.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        mkdir -p consume media
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
6.  You can now either ...
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *  install redis or
 | 
					 | 
				
			||||||
    *  use the included scripts/start-services.sh to use docker to fire up a redis instance (and some other services such as tika, gotenberg and a database server) or
 | 
					 | 
				
			||||||
    *  spin up a bare redis container
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            docker run -d -p 6379:6379 --restart unless-stopped redis:latest
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
7.  Install the Python dependencies by running the following in the src/ directory:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        pipenv install --dev
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
  * Make sure you're using python 3.9.x or lower. Otherwise you might get issues with building dependencies. You can use `pyenv <https://github.com/pyenv/pyenv>`_ to install a specific python version.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
8.  Generate the static UI so you can log in and get the session that is required for frontend development (this needs to be done one time only). From the src-ui directory:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        npm install .
 | 
					 | 
				
			||||||
        ./node_modules/.bin/ng build --configuration production
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
9.  Apply migrations and create a superuser for your dev instance:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        python3 manage.py migrate
 | 
					 | 
				
			||||||
        python3 manage.py createsuperuser
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
10.  Now spin up the dev backend. Depending on which part of paperless you're developing for, you need to have some or all of them running.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
11. Log in with the superuser credentials created in step 9 at ``http://localhost:8000`` to create a session that enables you to use the backend.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The backend development environment is now ready. To start frontend development, go to ``/src-ui`` and run ``ng serve``. From there you can use ``http://localhost:4200`` for a preview.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Back end development
 | 
					 | 
				
			||||||
====================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The backend is a django application. PyCharm works well for development, but you can use whatever
 | 
					 | 
				
			||||||
you want.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Configure the IDE to use the src/ folder as the base source folder. Configure the following
 | 
					 | 
				
			||||||
launch configurations in your IDE:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   python3 manage.py runserver
 | 
					 | 
				
			||||||
*   celery --app paperless worker
 | 
					 | 
				
			||||||
*   python3 manage.py document_consumer
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
To start them all:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Testing and code style:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Run ``pytest`` in the src/ directory to execute all tests. This also generates an HTML coverage
    report. When running tests, paperless.conf is loaded as well. However, the tests rely on the default
    configuration. This is not ideal, but for now, make sure no settings except for DEBUG are overridden when testing.
 | 
					 | 
				
			||||||
*   Coding style is enforced by the Git pre-commit hooks.  These will ensure your code is formatted and do some
 | 
					 | 
				
			||||||
    linting when you do a ``git commit``.
 | 
					 | 
				
			||||||
*   You can also run ``black`` manually to format your code.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        The line length rule E501 is generally useful for getting multiple source files
 | 
					 | 
				
			||||||
        next to each other on the screen. However, in some cases, it's just not possible
 | 
					 | 
				
			||||||
        to make some lines fit, especially complicated ``if`` conditions. Append ``# NOQA: E501``
 | 
					 | 
				
			||||||
        to disable this check for certain lines.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Front end development
 | 
					 | 
				
			||||||
=====================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The front end is built using Angular. In order to get started, you need ``npm``.
 | 
					 | 
				
			||||||
Install the Angular CLI with
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ npm install -g @angular/cli
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
and make sure that it's on your path. Next, in the src-ui/ directory, install the
 | 
					 | 
				
			||||||
required dependencies of the project.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ npm install
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can launch a development server by running
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ ng serve
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This will automatically update whenever you save. However, in-place compilation might fail
 | 
					 | 
				
			||||||
on syntax errors, in which case you need to restart it.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
By default, the development server is available on ``http://localhost:4200/`` and is configured
 | 
					 | 
				
			||||||
to access the API at ``http://localhost:8000/api/``, which is the default of the backend.
 | 
					 | 
				
			||||||
If you enabled DEBUG on the back end, several security overrides for allowed hosts, CORS and
 | 
					 | 
				
			||||||
X-Frame-Options are in place so that the front end behaves exactly as in production. This also
 | 
					 | 
				
			||||||
relies on you being logged into the back end. Without a valid session, the front end will simply
 | 
					 | 
				
			||||||
not work.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Testing and code style:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   The frontend code (.ts, .html, .scss) uses ``prettier`` for code formatting via the Git
 | 
					 | 
				
			||||||
    ``pre-commit`` hooks which run automatically on commit. See
 | 
					 | 
				
			||||||
    :ref:`above <code-formatting-with-pre-commit-hooks>` for installation. You can also run this
 | 
					 | 
				
			||||||
    via the CLI with a command such as
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ git ls-files -- '*.ts' | xargs pre-commit run prettier --files
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Frontend testing uses jest and cypress. There is currently a need for significantly more
 | 
					 | 
				
			||||||
    frontend tests. Unit tests and e2e tests, respectively, can be run non-interactively with:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ ng test
 | 
					 | 
				
			||||||
        $ npm run e2e:ci
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Cypress also includes a UI which can be run from within the ``src-ui`` directory with
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ ./node_modules/.bin/cypress open
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
In order to build the front end and serve it as part of django, execute
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ ng build --prod
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This will build the front end and put it in a location from which the Django server will serve
 | 
					 | 
				
			||||||
it as static content. This way, you can verify that authentication is working.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Localization
 | 
					 | 
				
			||||||
============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless is available in many different languages. Since paperless consists of both a Django
 | 
					 | 
				
			||||||
application and an Angular front end, these parts have to be translated separately.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Front end localization
 | 
					 | 
				
			||||||
----------------------
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   The Angular front end does localization according to the `Angular documentation <https://angular.io/guide/i18n>`_.
 | 
					 | 
				
			||||||
*   The source language of the project is "en_US".
 | 
					 | 
				
			||||||
*   The source strings end up in the file "src-ui/messages.xlf".
 | 
					 | 
				
			||||||
*   The translated strings need to be placed in the "src-ui/src/locale/" folder.
 | 
					 | 
				
			||||||
*   In order to extract added or changed strings from the source files, call ``ng xi18n --ivy``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Adding new languages requires adding the translated files in the "src-ui/src/locale/" folder and adjusting a couple of files.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Adjust "src-ui/angular.json":
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: json
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        "i18n": {
 | 
					 | 
				
			||||||
            "sourceLocale": "en-US",
 | 
					 | 
				
			||||||
            "locales": {
 | 
					 | 
				
			||||||
                "de": "src/locale/messages.de.xlf",
 | 
					 | 
				
			||||||
                "nl-NL": "src/locale/messages.nl_NL.xlf",
 | 
					 | 
				
			||||||
                "fr": "src/locale/messages.fr.xlf",
 | 
					 | 
				
			||||||
                "en-GB": "src/locale/messages.en_GB.xlf",
 | 
					 | 
				
			||||||
                "pt-BR": "src/locale/messages.pt_BR.xlf",
 | 
					 | 
				
			||||||
                "language-code": "language-file"
 | 
					 | 
				
			||||||
            }
 | 
					 | 
				
			||||||
        }
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Add the language to the available options in "src-ui/src/app/services/settings.service.ts":
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: typescript
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        getLanguageOptions(): LanguageOption[] {
 | 
					 | 
				
			||||||
            return [
 | 
					 | 
				
			||||||
                {code: "en-us", name: $localize`English (US)`, englishName: "English (US)", dateInputFormat: "mm/dd/yyyy"},
 | 
					 | 
				
			||||||
                {code: "en-gb", name: $localize`English (GB)`, englishName: "English (GB)", dateInputFormat: "dd/mm/yyyy"},
 | 
					 | 
				
			||||||
                {code: "de", name: $localize`German`, englishName: "German", dateInputFormat: "dd.mm.yyyy"},
 | 
					 | 
				
			||||||
                {code: "nl", name: $localize`Dutch`, englishName: "Dutch", dateInputFormat: "dd-mm-yyyy"},
 | 
					 | 
				
			||||||
                {code: "fr", name: $localize`French`, englishName: "French", dateInputFormat: "dd/mm/yyyy"},
 | 
					 | 
				
			||||||
                {code: "pt-br", name: $localize`Portuguese (Brazil)`, englishName: "Portuguese (Brazil)", dateInputFormat: "dd/mm/yyyy"},
 | 
					 | 
				
			||||||
                // Add your new language here
 | 
					 | 
				
			||||||
            ]
 | 
					 | 
				
			||||||
        }
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    ``dateInputFormat`` is a special string that defines the behavior of the date input fields and absolutely needs to contain "dd", "mm" and "yyyy".
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  Import and register the Angular data for this locale in "src-ui/src/app/app.module.ts":
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: typescript
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        import localeDe from '@angular/common/locales/de';
 | 
					 | 
				
			||||||
        registerLocaleData(localeDe)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Back end localization
 | 
					 | 
				
			||||||
---------------------
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
A majority of the strings that appear in the back end appear only when the admin is used. However,
 | 
					 | 
				
			||||||
some of these are still shown on the front end (such as error messages).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   The Django application does localization according to the `Django documentation <https://docs.djangoproject.com/en/3.1/topics/i18n/translation/>`_.
 | 
					 | 
				
			||||||
*   The source language of the project is "en_US".
 | 
					 | 
				
			||||||
*   Localization files end up in the folder "src/locale/".
 | 
					 | 
				
			||||||
*   In order to extract strings from the application, call ``python3 manage.py makemessages -l en_US``. This is important after making changes to translatable strings.
 | 
					 | 
				
			||||||
*   The message files need to be compiled for them to show up in the application. Call ``python3 manage.py compilemessages`` to do this. The generated files don't get
 | 
					 | 
				
			||||||
    committed into git, since these are derived artifacts. The build pipeline takes care of executing this command.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Adding new languages requires adding the translated files in the "src/locale/" folder and adjusting the file "src/paperless/settings.py" to include the new language:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: python
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    LANGUAGES = [
 | 
					 | 
				
			||||||
        ("en-us", _("English (US)")),
 | 
					 | 
				
			||||||
        ("en-gb", _("English (GB)")),
 | 
					 | 
				
			||||||
        ("de", _("German")),
 | 
					 | 
				
			||||||
        ("nl-nl", _("Dutch")),
 | 
					 | 
				
			||||||
        ("fr", _("French")),
 | 
					 | 
				
			||||||
        ("pt-br", _("Portuguese (Brazil)")),
 | 
					 | 
				
			||||||
        # Add language here.
 | 
					 | 
				
			||||||
    ]
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Building the documentation
 | 
					 | 
				
			||||||
==========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The documentation is built using Sphinx. I've configured ReadTheDocs to automatically build
 | 
					 | 
				
			||||||
the documentation when changes are pushed. If you want to build the documentation locally,
 | 
					 | 
				
			||||||
this is how you do it:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Install python dependencies.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ cd /path/to/paperless
 | 
					 | 
				
			||||||
        $ pipenv install --dev
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Build the documentation
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ cd /path/to/paperless/docs
 | 
					 | 
				
			||||||
        $ pipenv run make clean html
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This will build the HTML documentation, and put the resulting files in the ``_build/html``
 | 
					 | 
				
			||||||
directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Building the Docker image
 | 
					 | 
				
			||||||
=========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The docker image is primarily built by the GitHub actions workflow, but it can be
 | 
					 | 
				
			||||||
faster when developing to build and tag an image locally.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
To provide the build arguments automatically, build the image using the helper
 | 
					 | 
				
			||||||
script ``build-docker-image.sh``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Building the docker image from source:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        ./build-docker-image.sh Dockerfile -t <your-tag>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Extending Paperless
 | 
					 | 
				
			||||||
===================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless does not have any fancy plugin systems and probably never will. However,
 | 
					 | 
				
			||||||
some parts of the application have been designed to allow easy integration of additional
 | 
					 | 
				
			||||||
features without any modification to the base code.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Making custom parsers
 | 
					 | 
				
			||||||
---------------------
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless uses parsers to add documents to paperless. A parser has the following responsibilities:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Retrieve the content from the original
 | 
					 | 
				
			||||||
*   Create a thumbnail
 | 
					 | 
				
			||||||
*   Optional: Retrieve a created date from the original
 | 
					 | 
				
			||||||
*   Optional: Create an archived document from the original
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Custom parsers can be added to paperless to support more file types. In order to do that,
 | 
					 | 
				
			||||||
you need to write the parser itself and announce its existence to paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The parser itself must extend ``documents.parsers.DocumentParser`` and must implement the
 | 
					 | 
				
			||||||
methods ``parse`` and ``get_thumbnail``. You can provide your own implementation of
 | 
					 | 
				
			||||||
``get_date`` if you don't want to rely on paperless' default date guessing mechanisms.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: python
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    class MyCustomParser(DocumentParser):
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        def parse(self, document_path, mime_type):
 | 
					 | 
				
			||||||
            # This method does not return anything. Rather, you should assign
 | 
					 | 
				
			||||||
            # whatever you got from the document to the following fields:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            # The content of the document.
 | 
					 | 
				
			||||||
            self.text = "content"
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            # Optional: path to a PDF document that you created from the original.
 | 
					 | 
				
			||||||
            self.archive_path = os.path.join(self.tempdir, "archived.pdf")
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            # Optional: "created" date of the document.
 | 
					 | 
				
			||||||
            self.date = get_created_from_metadata(document_path)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        def get_thumbnail(self, document_path, mime_type):
 | 
					 | 
				
			||||||
            # This should return the path to a thumbnail you created for this
 | 
					 | 
				
			||||||
            # document.
 | 
					 | 
				
			||||||
            return os.path.join(self.tempdir, "thumb.png")
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you encounter any issues during parsing, raise a ``documents.parsers.ParseError``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The ``self.tempdir`` directory is a temporary directory that is guaranteed to be empty
 | 
					 | 
				
			||||||
and is removed after consumption has finished. You can use that directory to store any
 | 
					 | 
				
			||||||
intermediate files and also use it to store the thumbnail / archived document.
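
The two points above (raising ``ParseError`` and working inside ``self.tempdir``) can be sketched as follows. This is only an illustration: ``DocumentParser`` and ``ParseError`` are simplified stand-ins defined here so the example runs on its own, not the real classes from ``documents.parsers``, and ``PlainTextParser`` and ``cleanup`` are hypothetical names.

```python
import os
import shutil
import tempfile


class ParseError(Exception):
    """Stand-in for documents.parsers.ParseError (assumption)."""


class DocumentParser:
    """Stand-in base class: only provides an empty temporary directory."""

    def __init__(self):
        self.tempdir = tempfile.mkdtemp(prefix="paperless-sketch-")
        self.text = None

    def cleanup(self):
        # Hypothetical helper that mirrors "removed after consumption".
        shutil.rmtree(self.tempdir, ignore_errors=True)


class PlainTextParser(DocumentParser):
    def parse(self, document_path, mime_type):
        try:
            with open(document_path, encoding="utf-8") as f:
                self.text = f.read()
        except (OSError, UnicodeDecodeError) as e:
            # Report any problem as a ParseError instead of a raw exception.
            raise ParseError(f"Could not read {document_path}: {e}") from e

        # Intermediate files belong in the temporary directory.
        scratch = os.path.join(self.tempdir, "content.txt")
        with open(scratch, "w", encoding="utf-8") as f:
            f.write(self.text)
```

Wrapping low-level exceptions in a single error type gives the caller one thing to catch, whatever went wrong inside the parser.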
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
After that, you need to announce your parser to paperless. You need to connect a
 | 
					 | 
				
			||||||
handler to the ``document_consumer_declaration`` signal. Have a look in the file
 | 
					 | 
				
			||||||
``src/paperless_tesseract/apps.py`` to see how that's done. The handler is a method
 | 
					 | 
				
			||||||
that returns information about your parser:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: python
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    def myparser_consumer_declaration(sender, **kwargs):
 | 
					 | 
				
			||||||
        return {
 | 
					 | 
				
			||||||
            "parser": MyCustomParser,
 | 
					 | 
				
			||||||
            "weight": 0,
 | 
					 | 
				
			||||||
            "mime_types": {
 | 
					 | 
				
			||||||
                "application/pdf": ".pdf",
 | 
					 | 
				
			||||||
                "image/jpeg": ".jpg",
 | 
					 | 
				
			||||||
            }
 | 
					 | 
				
			||||||
        }
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   ``parser`` is a reference to a class that extends ``DocumentParser``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   ``weight`` is used whenever two or more parsers are able to parse a file: the parser with
 | 
					 | 
				
			||||||
    the higher weight wins. This can be used to override the parsers provided by
 | 
					 | 
				
			||||||
    paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   ``mime_types`` is a dictionary. The keys are the mime types your parser supports and the value
 | 
					 | 
				
			||||||
    is the default file extension that paperless should use when storing files and serving them for
 | 
					 | 
				
			||||||
    download. We could guess that from the file extensions, but some mime types have many extensions
 | 
					 | 
				
			||||||
    associated with them and the python methods responsible for guessing the extension do not always
 | 
					 | 
				
			||||||
    return the same value.
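
To make the role of ``weight`` concrete, here is a rough sketch of choosing between several declarations for one mime type. It is not the selection code paperless actually uses; ``pick_parser`` and ``BuiltinPdfParser`` are hypothetical names, and the declarations are plain dictionaries shaped like the handler's return value above.

```python
def pick_parser(declarations, mime_type):
    """Return the parser class with the highest weight that supports mime_type."""
    candidates = [d for d in declarations if mime_type in d["mime_types"]]
    if not candidates:
        return None
    return max(candidates, key=lambda d: d["weight"])["parser"]


class BuiltinPdfParser:
    """Hypothetical placeholder for a parser shipped with the application."""


class MyCustomParser:
    """Placeholder for the custom parser from the example above."""


declarations = [
    {
        "parser": BuiltinPdfParser,
        "weight": 0,
        "mime_types": {"application/pdf": ".pdf"},
    },
    {
        "parser": MyCustomParser,
        "weight": 10,
        "mime_types": {"application/pdf": ".pdf", "image/jpeg": ".jpg"},
    },
]

chosen = pick_parser(declarations, "application/pdf")
```

Here ``chosen`` is ``MyCustomParser``, because its weight of 10 is higher than the weight of 0 declared by the other parser for the same mime type.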
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _code of conduct: https://github.com/paperless-ngx/paperless-ngx/blob/main/CODE_OF_CONDUCT.md
 | 
					 | 
				
			||||||
.. _contributing guidelines: https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTING.md
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
							
								
								
									
										113
									
								
								docs/faq.rst
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							@@ -1,117 +1,12 @@
 | 
				
			|||||||
 | 
					.. _faq:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
**************************
 | 
					**************************
 | 
				
			||||||
Frequently asked questions
 | 
					Frequently asked questions
 | 
				
			||||||
**************************
 | 
					**************************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
**Q:** *What's the general plan for Paperless-ngx?*
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
**A:** While Paperless-ngx is already considered largely "feature-complete", it is a community-driven
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
project and development will be guided in this way. New features can be submitted via
 | 
					 | 
				
			||||||
GitHub discussions and "up-voted" by the community but this is not a guarantee the feature
 | 
					 | 
				
			||||||
will be implemented. This project will always be open to collaboration in the form of PRs,
 | 
					 | 
				
			||||||
ideas etc.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
**Q:** *I'm using docker. Where are my documents?*
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
**A:** Your documents are stored inside the docker volume ``paperless_media``.
 | 
					    You will be redirected shortly...
 | 
				
			||||||
Docker manages this volume automatically for you. It is a persistent storage
 | 
					 | 
				
			||||||
and will persist as long as you don't explicitly delete it. The actual location
 | 
					 | 
				
			||||||
depends on your host operating system. On Linux, chances are high that this location
 | 
					 | 
				
			||||||
is
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    /var/lib/docker/volumes/paperless_media/_data
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Do not mess with this folder. Don't change permissions and don't move
 | 
					 | 
				
			||||||
    files around manually. This folder is meant to be entirely managed by docker
 | 
					 | 
				
			||||||
    and paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *Let's say I want to switch tools in a year. Can I easily move to other systems?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** Your documents are stored as plain files inside the media folder. You can always drag those files
 | 
					 | 
				
			||||||
out of that folder to use them elsewhere. Here are a couple of notes about that.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Paperless-ngx never modifies your original documents. It keeps checksums of all documents and uses a
 | 
					 | 
				
			||||||
    scheduled sanity checker to check that they remain the same.
 | 
					 | 
				
			||||||
*   By default, paperless uses the internal ID of each document as its filename. This might not be very
 | 
					 | 
				
			||||||
    convenient for export. However, you can adjust the way files are stored in paperless by
 | 
					 | 
				
			||||||
    :ref:`configuring the filename format <advanced-file_name_handling>`.
 | 
					 | 
				
			||||||
*   :ref:`The exporter <utilities-exporter>` is another easy way to get your files out of paperless with reasonable file names.
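
The checksum idea mentioned in the first note can be sketched like this. It is an illustration only and not the sanity checker that ships with Paperless-ngx: the hash algorithm (SHA-256) and the names ``file_checksum`` and ``find_modified`` are assumptions made for the example.

```python
import hashlib


def file_checksum(path, chunk_size=65536):
    """Hash a file in chunks so large documents need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_modified(recorded):
    """Return the paths whose current checksum differs from the recorded one.

    ``recorded`` maps a file path to the checksum stored when it was added.
    """
    return [path for path, checksum in recorded.items()
            if file_checksum(path) != checksum]
```

A scheduled job could call ``find_modified`` with the stored checksums and report any path it returns.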
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *What file types does paperless-ngx support?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** Currently, the following files are supported:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   PDF documents, PNG images, JPEG images, TIFF images and GIF images are processed with OCR and converted into PDF documents.
 | 
					 | 
				
			||||||
*   Plain text documents are supported as well and are added verbatim
 | 
					 | 
				
			||||||
    to paperless.
 | 
					 | 
				
			||||||
*   With the optional Tika integration enabled (see :ref:`Configuration <configuration-tika>`), Paperless also supports various
 | 
					 | 
				
			||||||
    Office documents (.docx, .doc, .odt, .ppt, .pptx, .odp, .xls, .xlsx, .ods).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx determines the type of a file by inspecting its content. The
 | 
					 | 
				
			||||||
file extensions do not matter.
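
As an illustration of what "inspecting its content" can mean, the sketch below recognises a few formats by their leading bytes. It is a simplified example and not how Paperless-ngx itself detects file types; ``SIGNATURES`` and ``sniff_mime_type`` are hypothetical names, and only a handful of well-known signatures are covered.

```python
# Leading "magic" bytes of a few common formats.
SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


def sniff_mime_type(data):
    """Guess a mime type from the first bytes of a file, ignoring its name."""
    for magic, mime_type in SIGNATURES.items():
        if data.startswith(magic):
            return mime_type
    return None


# A PDF saved with a misleading ".jpg" name is still recognised as a PDF.
guess = sniff_mime_type(b"%PDF-1.7\n...")
```

Because only the bytes are examined, renaming a file does not change the result.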
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *Will paperless-ngx run on Raspberry Pi?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** The short answer is yes. I've tested it on a Raspberry Pi 3 B.
 | 
					 | 
				
			||||||
The long answer is that certain parts of
 | 
					 | 
				
			||||||
Paperless will run very slowly, such as the OCR. On Raspberry Pi,
 | 
					 | 
				
			||||||
try to OCR documents before feeding them into paperless so that paperless can
 | 
					 | 
				
			||||||
reuse the text. The web interface is a lot snappier, since it runs
 | 
					 | 
				
			||||||
in your browser and paperless has to do much less work to serve the data.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You can adjust some of the settings so that paperless uses less processing
 | 
					 | 
				
			||||||
    power. See :ref:`setup-less_powerful_devices` for details.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *How do I install paperless-ngx on Raspberry Pi?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** Docker images are available for arm and arm64 hardware, so just follow
 | 
					 | 
				
			||||||
the docker-compose instructions. Apart from more required disk space compared to
 | 
					 | 
				
			||||||
a bare metal installation, docker comes with close to zero overhead, even on
 | 
					 | 
				
			||||||
Raspberry Pi.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you decide to go with the bare metal route, be aware that some of the
 | 
					 | 
				
			||||||
python requirements do not have precompiled packages for ARM / ARM64. Installation
 | 
					 | 
				
			||||||
of these will require additional development libraries and compilation will take
 | 
					 | 
				
			||||||
a long time.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *How do I run this on Unraid?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** Paperless-ngx is available as a `community app <https://unraid.net/community/apps?q=paperless-ngx>`_
 | 
					 | 
				
			||||||
in Unraid. `Uli Fahrer <https://github.com/Tooa>`_ created a container template for that.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *How do I run this on my toaster?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** I honestly don't know! As for all other devices that might be able
 | 
					 | 
				
			||||||
to run paperless, you're a bit on your own. If you can't run the docker image,
 | 
					 | 
				
			||||||
the documentation has instructions for bare metal installs. I'm running
 | 
					 | 
				
			||||||
paperless on an i3 processor from 2015 or so. This is also what I use to test
 | 
					 | 
				
			||||||
new releases with. Apart from that, I also have a Raspberry Pi, which I
 | 
					 | 
				
			||||||
occasionally build the image on and see if it works.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *How do I proxy this with NGINX?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** See :ref:`here <setup-nginx>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _faq-mod_wsgi:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**Q:** *How do I get WebSocket support with Apache mod_wsgi?*
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
**A:** ``mod_wsgi`` by itself does not support ASGI. Paperless will continue
 | 
					 | 
				
			||||||
to work with WSGI, but certain features such as status notifications about
 | 
					 | 
				
			||||||
document consumption won't be available.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you want to continue using ``mod_wsgi``, you will have to run an ASGI-enabled
 | 
					 | 
				
			||||||
web server as well that processes WebSocket connections, and configure Apache to
 | 
					 | 
				
			||||||
redirect WebSocket connections to this server. Multiple options for ASGI servers
 | 
					 | 
				
			||||||
exist:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* ``gunicorn`` with ``uvicorn`` as the worker implementation (the default of paperless)
 | 
					 | 
				
			||||||
* ``daphne`` as a standalone server, which is the reference implementation for ASGI.
 | 
					 | 
				
			||||||
* ``uvicorn`` as a standalone server
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
@@ -2,74 +2,24 @@
 | 
				
			|||||||
Paperless
 | 
					Paperless
 | 
				
			||||||
*********
 | 
					*********
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless is a simple Django application running in two parts:
 | 
					 | 
				
			||||||
a *Consumer* (the thing that does the indexing) and
 | 
					 | 
				
			||||||
the *Web server* (the part that lets you search &
 | 
					 | 
				
			||||||
download already-indexed documents). If you want to learn more about its
 | 
					 | 
				
			||||||
functions keep on reading after the installation section.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Why This Exists
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
===============
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paper is a nightmare.  Environmental issues aside, there's no excuse for it in
 | 
					    You will be redirected shortly...
 | 
				
			||||||
the 21st century.  It takes up space, collects dust, doesn't support any form
 | 
					 | 
				
			||||||
of a search feature, indexing is tedious, it's heavy and prone to damage &
 | 
					 | 
				
			||||||
loss.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
I wrote this to make "going paperless" easier.  I do not have to worry about
 | 
					 | 
				
			||||||
finding stuff again. I feed documents right from the post box into the scanner
 | 
					 | 
				
			||||||
and then shred them.  Perhaps you might find it useful too.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx
 | 
					 | 
				
			||||||
=============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx is a document management system that transforms your physical
 | 
					 | 
				
			||||||
documents into a searchable online archive so you can keep, well, *less paper*.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx forked from paperless-ng to continue the great work and
 | 
					 | 
				
			||||||
distribute responsibility of supporting and advancing the project among a team
 | 
					 | 
				
			||||||
of people.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
NG stands for both Angular (the framework used for the
 | 
					 | 
				
			||||||
Frontend) and next-gen. Publishing this project under a different name also
 | 
					 | 
				
			||||||
avoids confusion between paperless and paperless-ngx.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you want to learn about what's different in paperless-ngx from Paperless, check out these
 | 
					 | 
				
			||||||
resources in the documentation:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   :ref:`Some screenshots <screenshots>` of the new UI are available.
 | 
					 | 
				
			||||||
*   Read :ref:`this section <advanced-automatic_matching>` if you want to
 | 
					 | 
				
			||||||
    learn about how paperless automates all tagging using machine learning.
 | 
					 | 
				
			||||||
*   Paperless now comes with a :ref:`proper email consumer <usage-email>`
 | 
					 | 
				
			||||||
    that's fully tested and production ready.
 | 
					 | 
				
			||||||
*   Paperless creates searchable PDF/A documents from whatever you put into
 | 
					 | 
				
			||||||
    the consumption directory. This means that you can select text in
 | 
					 | 
				
			||||||
    image-only documents coming from your scanner.
 | 
					 | 
				
			||||||
*   See :ref:`this note <utilities-encyption>` about GnuPG encryption in
 | 
					 | 
				
			||||||
    paperless-ngx.
 | 
					 | 
				
			||||||
*   Paperless is now integrated with a
 | 
					 | 
				
			||||||
    :ref:`task processing queue <setup-task_processor>` that tells you
 | 
					 | 
				
			||||||
    at a glance when and why something is not working.
 | 
					 | 
				
			||||||
*   The :doc:`changelog </changelog>` contains a detailed list of all changes
 | 
					 | 
				
			||||||
    in paperless-ngx.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Contents
 | 
					 | 
				
			||||||
========
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
.. toctree::
 | 
					.. toctree::
 | 
				
			||||||
   :maxdepth: 1
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
   setup
 | 
					    screenshots
 | 
				
			||||||
   usage_overview
 | 
					    scanners
 | 
				
			||||||
   advanced_usage
 | 
					    administration
 | 
				
			||||||
   administration
 | 
					    advanced_usage
 | 
				
			||||||
   configuration
 | 
					    usage_overview
 | 
				
			||||||
   api
 | 
					    setup
 | 
				
			||||||
   faq
 | 
					    troubleshooting
 | 
				
			||||||
   troubleshooting
 | 
					    changelog
 | 
				
			||||||
   extending
 | 
					    configuration
 | 
				
			||||||
   scanners
 | 
					    extending
 | 
				
			||||||
   screenshots
 | 
					    api
 | 
				
			||||||
   changelog
 | 
					    faq
 | 
				
			||||||
 
 | 
				
			|||||||
@@ -1,8 +1,12 @@
 | 
				
			|||||||
 | 
					 | 
				
			||||||
.. _scanners:
 | 
					.. _scanners:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
*******************
 | 
					*******************
 | 
				
			||||||
Scanners & Software
 | 
					Scanners & Software
 | 
				
			||||||
*******************
 | 
					*******************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless-ngx is compatible with many different scanners and scanning tools. A user-maintained list of scanners and other software is available on `the wiki <https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations>`_.
 | 
					
 | 
				
			||||||
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					    You will be redirected shortly...
 | 
				
			||||||
 
 | 
				
			|||||||
@@ -4,60 +4,9 @@
 | 
				
			|||||||
Screenshots
 | 
					Screenshots
 | 
				
			||||||
***********
 | 
					***********
 | 
				
			||||||
 | 
					
 | 
				
			||||||
This is what Paperless-ngx looks like.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
The dashboard shows customizable views of your documents and allows document uploads:
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
.. image:: _static/screenshots/dashboard.png
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
    :target: _static/screenshots/dashboard.png
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
The document list provides three different styles to scroll through your documents:
 | 
					    You will be redirected shortly...
 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/documents-table.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/documents-table.png
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/documents-smallcards.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/documents-smallcards.png
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/documents-largecards.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/documents-largecards.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx also supports "dark mode":
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/documents-smallcards-dark.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/documents-smallcards-dark.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Extensive filtering mechanisms:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/documents-filter.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/documents-filter.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Bulk editing of document tags, correspondents, etc.:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/bulk-edit.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/bulk-edit.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Side-by-side editing of documents:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/editing.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/editing.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Tag editing. This looks about the same for correspondents and document types.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/new-tag.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/new-tag.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Searching provides autocomplete and highlights the results.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/search-preview.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/search-preview.png
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/search-results.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/search-results.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Fancy mail filters!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/mail-rules-edited.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/mail-rules-edited.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Mobile devices are supported.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/screenshots/mobile.png
 | 
					 | 
				
			||||||
    :target: _static/screenshots/mobile.png
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
							
								
								
									
										890
									
								
								docs/setup.rst
									
									
									
									
									
								
							
							
						
						
									
									
									
									
									
									
								
							@@ -1,894 +1,12 @@
 | 
				
			|||||||
 | 
					.. _setup:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
*****
 | 
					*****
 | 
				
			||||||
Setup
 | 
					Setup
 | 
				
			||||||
*****
 | 
					*****
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Overview of Paperless-ngx
 | 
					 | 
				
			||||||
#########################
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Compared to paperless, paperless-ngx works a little differently under the hood and has
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
more moving parts that work together. While this increases the complexity of
 | 
					 | 
				
			||||||
the system, it also brings many benefits.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless consists of the following components:
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
 | 
					
 | 
				
			||||||
*   **The webserver:** This is pretty much the same as in paperless. It serves
 | 
					    You will be redirected shortly...
 | 
				
			||||||
    the administration pages, the API, and the new frontend. This is the main
 | 
					 | 
				
			||||||
    tool you'll be using to interact with paperless. You may start the webserver
 | 
					 | 
				
			||||||
    with
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ cd /path/to/paperless/src/
 | 
					 | 
				
			||||||
        $ gunicorn -c ../gunicorn.conf.py paperless.wsgi
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    or by any other means such as Apache ``mod_wsgi``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   **The consumer:** This is what watches your consumption folder for documents.
 | 
					 | 
				
			||||||
    However, the consumer itself does not really consume your documents.
 | 
					 | 
				
			||||||
    Now it notifies a task processor that a new file is ready for consumption.
 | 
					 | 
				
			||||||
    I suppose it should be named differently.
 | 
					 | 
				
			||||||
    This was also used to check your emails, but that's now done elsewhere as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Start the consumer with the management command ``document_consumer``:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ cd /path/to/paperless/src/
 | 
					 | 
				
			||||||
        $ python3 manage.py document_consumer
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. _setup-task_processor:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   **The task processor:** Paperless relies on `Celery - Distributed Task Queue <https://docs.celeryq.dev/en/stable/index.html>`_
 | 
					 | 
				
			||||||
    for doing most of the heavy lifting. This is a task queue that accepts tasks from
 | 
					 | 
				
			||||||
    multiple sources and processes these in parallel. It also comes with a scheduler that executes
 | 
					 | 
				
			||||||
    certain commands periodically.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This task processor is responsible for:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   Consuming documents. When the consumer finds new documents, it notifies the task processor to
 | 
					 | 
				
			||||||
        start a consumption task.
 | 
					 | 
				
			||||||
    *   The task processor also performs the consumption of any documents you upload through
 | 
					 | 
				
			||||||
        the web interface.
 | 
					 | 
				
			||||||
    *   Consuming emails. It periodically checks your configured accounts for new emails and
 | 
					 | 
				
			||||||
        notifies the task processor to consume the attachment of an email.
 | 
					 | 
				
			||||||
    *   Maintaining the search index and the automatic matching algorithm. These are things that paperless
 | 
					 | 
				
			||||||
        needs to do from time to time in order to operate properly.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This allows paperless to process multiple documents from your consumption folder in parallel! On
 | 
					 | 
				
			||||||
    a modern multi-core system, this makes the consumption process with full OCR blazingly fast.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The task processor comes with a built-in admin interface that you can use to check whenever any of the
 | 
					 | 
				
			||||||
    tasks fail and inspect the errors (e.g., wrong email credentials, errors while consuming a specific
 | 
					 | 
				
			||||||
    file, etc).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   A `redis <https://redis.io/>`_ message broker: This is a really lightweight service that is responsible
 | 
					 | 
				
			||||||
    for getting the tasks from the webserver and the consumer to the task scheduler. These run in a different
 | 
					 | 
				
			||||||
    process (maybe even on different machines!), and therefore, this is necessary.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Optional: A database server. Paperless supports PostgreSQL, MariaDB and SQLite for storing its data.
 | 
					 | 
				
			||||||
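The division of labor described above (a consumer that only announces new files, a broker that passes the message along, and a task processor that does the actual work in parallel) can be illustrated with a toy sketch. This is not how Paperless-ngx is implemented: it uses Python's standard ``queue`` and ``threading`` modules as stand-ins for redis and Celery, and the names ``notify_new_file``, ``worker`` and ``process_all`` are made up for illustration.

```python
import queue
import threading


def notify_new_file(tasks, path):
    """Stand-in for the consumer: it only enqueues a task, it does no work."""
    tasks.put(path)


def worker(tasks, results, lock):
    """Stand-in for a task processor worker: takes tasks until told to stop."""
    while True:
        path = tasks.get()
        if path is None:  # sentinel: no more work for this worker
            tasks.task_done()
            break
        with lock:
            # Placeholder for the real work (OCR, indexing, matching, ...).
            results.append(f"consumed {path}")
        tasks.task_done()


def process_all(paths, num_workers=2):
    """Enqueue every path, process them in parallel, return sorted results."""
    tasks = queue.Queue()  # plays the role of the message broker
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for path in paths:
        notify_new_file(tasks, path)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return sorted(results)


if __name__ == "__main__":
    print(process_all(["invoice.pdf", "letter.pdf", "receipt.pdf"]))
```

The point of the sketch is the separation: the producer returns immediately after enqueuing, so a slow task never blocks the detection of further files.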
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Installation
 | 
					 | 
				
			||||||
############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can go multiple routes to set up and run Paperless:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* :ref:`Use the easy install docker script <setup-docker_script>`
 | 
					 | 
				
			||||||
* :ref:`Pull the image from Docker Hub <setup-docker_hub>`
 | 
					 | 
				
			||||||
* :ref:`Build the Docker image yourself <setup-docker_build>`
 | 
					 | 
				
			||||||
* :ref:`Install Paperless directly on your system manually (bare metal) <setup-bare_metal>`
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The Docker routes are quick & easy. These are the recommended routes. This configures all the stuff
 | 
					 | 
				
			||||||
from the above automatically so that it just works and uses sensible defaults for all configuration options.
 | 
					 | 
				
			||||||
Here you find a cheat-sheet for docker beginners: `CLI Basics <https://www.sehn.tech/refs/devops-with-docker/>`_
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The bare metal route is complicated to set up but makes it easier
 | 
					 | 
				
			||||||
should you want to contribute some code back. You need to configure and
 | 
					 | 
				
			||||||
run the above mentioned components yourself.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _CLI Basics: https://www.sehn.tech/refs/devops-with-docker/
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _setup-docker_script:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Install Paperless from Docker Hub using the installation script
 | 
					 | 
				
			||||||
===============================================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless provides an interactive installation script. This script will ask you
 | 
					 | 
				
			||||||
for a couple of configuration options, download and create the necessary configuration files, pull the docker image, start paperless and create your user account. This script essentially
 | 
					 | 
				
			||||||
performs all the steps described in :ref:`setup-docker_hub` automatically.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Make sure that docker and docker-compose are installed.
 | 
					 | 
				
			||||||
2.  Download and run the installation script:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ bash -c "$(curl -L https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _setup-docker_hub:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Install Paperless from Docker Hub
 | 
					 | 
				
			||||||
=================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Log in as your user and create a folder in your home directory (``mkdir -v ~/paperless-ngx``) to hold your configuration files and consumption directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Go to the `/docker/compose directory on the project page <https://github.com/paperless-ngx/paperless-ngx/tree/master/docker/compose>`_
 | 
					 | 
				
			||||||
    and download one of the `docker-compose.*.yml` files, depending on which database backend you
 | 
					 | 
				
			||||||
    want to use. Rename this file to `docker-compose.yml`.
 | 
					 | 
				
			||||||
    If you want to enable optional support for Office documents, download a file with `-tika` in the file name.
 | 
					 | 
				
			||||||
    Download the ``docker-compose.env`` file and the ``.env`` file as well and store them
 | 
					 | 
				
			||||||
    in the same directory.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        For new installations, it is recommended to use PostgreSQL as the database
 | 
					 | 
				
			||||||
        backend.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  Install `Docker`_ and `docker-compose`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        If you want to use the included ``docker-compose.*.yml`` file, you
 | 
					 | 
				
			||||||
        need to have at least Docker version **17.09.0** and docker-compose
 | 
					 | 
				
			||||||
        version **1.17.0**.
 | 
					 | 
				
			||||||
        To check, run ``docker-compose -v`` or ``docker -v``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        See the `Docker installation guide`_ on how to install the current
 | 
					 | 
				
			||||||
        version of Docker for your operating system or Linux distribution of
 | 
					 | 
				
			||||||
        choice. To get the latest version of docker-compose, follow the
 | 
					 | 
				
			||||||
        `docker-compose installation guide`_ if your package repository doesn't
 | 
					 | 
				
			||||||
        include it.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        .. _Docker installation guide: https://docs.docker.com/engine/installation/
 | 
					 | 
				
			||||||
        .. _docker-compose installation guide: https://docs.docker.com/compose/install/
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Modify ``docker-compose.yml`` to your preferences. You may want to change the path
 | 
					 | 
				
			||||||
    to the consumption directory. Find the line that specifies where
 | 
					 | 
				
			||||||
    to mount the consumption directory:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        - ./consume:/usr/src/paperless/consume
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Replace the part BEFORE the colon with a local directory of your choice:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        - /home/jonaswinkler/paperless-inbox:/usr/src/paperless/consume
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Don't change the part after the colon or paperless won't find your documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You may also need to change the default port that the webserver will use
 | 
					 | 
				
			||||||
    from the default (8000):
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        ports:
 | 
					 | 
				
			||||||
          - 8000:8000
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Replace the part BEFORE the colon with a port of your choice:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        ports:
 | 
					 | 
				
			||||||
          - 8010:8000
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Don't change the part after the colon or edit other lines that refer to
 | 
					 | 
				
			||||||
    port 8000. Modifying the part before the colon will map requests on another
 | 
					 | 
				
			||||||
    port to the webserver running on the default port.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    **Rootless**
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If you want to run Paperless as a rootless container, you will need to do the
 | 
					 | 
				
			||||||
    following in your ``docker-compose.yml``:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    - set the ``user`` running the container to map to the ``paperless`` user in the
 | 
					 | 
				
			||||||
      container.
 | 
					 | 
				
			||||||
      This value (``user_id`` below), should be the same id that ``USERMAP_UID`` and
 | 
					 | 
				
			||||||
      ``USERMAP_GID`` are set to in the next step.
 | 
					 | 
				
			||||||
      See ``USERMAP_UID`` and ``USERMAP_GID`` :ref:`here <configuration-docker>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Your entry for Paperless should contain something like:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        webserver:
 | 
					 | 
				
			||||||
          image: ghcr.io/paperless-ngx/paperless-ngx:latest
 | 
					 | 
				
			||||||
          user: <user_id>
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
5.  Modify ``docker-compose.env``, following the comments in the file. The
 | 
					 | 
				
			||||||
    most important change is to set ``USERMAP_UID`` and ``USERMAP_GID``
 | 
					 | 
				
			||||||
    to the uid and gid of your user on the host system. Use ``id -u`` and
 | 
					 | 
				
			||||||
    ``id -g`` to get these.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This ensures that
 | 
					 | 
				
			||||||
    both the docker container and you on the host machine have write access
 | 
					 | 
				
			||||||
    to the consumption directory. If your UID and GID on the host system are
 | 
					 | 
				
			||||||
    1000 (the default for the first normal user on most systems), it will
 | 
					 | 
				
			||||||
    work out of the box without any modifications. Run ``id "username"`` to check.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        You can copy any setting from the file ``paperless.conf.example`` and paste it here.
 | 
					 | 
				
			||||||
        Have a look at :ref:`configuration` to see what's available.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        You can utilize Docker secrets for some configuration settings by
 | 
					 | 
				
			||||||
        appending ``_FILE`` to the setting name.  This is currently supported
 | 
					 | 
				
			||||||
        only by:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
          * PAPERLESS_DBUSER
 | 
					 | 
				
			||||||
          * PAPERLESS_DBPASS
 | 
					 | 
				
			||||||
          * PAPERLESS_SECRET_KEY
 | 
					 | 
				
			||||||
          * PAPERLESS_AUTO_LOGIN_USERNAME
 | 
					 | 
				
			||||||
          * PAPERLESS_ADMIN_USER
 | 
					 | 
				
			||||||
          * PAPERLESS_ADMIN_MAIL
 | 
					 | 
				
			||||||
          * PAPERLESS_ADMIN_PASSWORD
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Some file systems such as NFS network shares don't support file system
 | 
					 | 
				
			||||||
        notifications with ``inotify``. When storing the consumption directory
 | 
					 | 
				
			||||||
        on such a file system, paperless will not pick up new files
 | 
					 | 
				
			||||||
        with the default configuration. You will need to use ``PAPERLESS_CONSUMER_POLLING``,
 | 
					 | 
				
			||||||
        which will disable inotify. See :ref:`here <configuration-polling>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
6.  Run ``docker-compose pull``, followed by ``docker-compose up -d``.
 | 
					 | 
				
			||||||
    This will pull the image, create and start the necessary containers.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
7.  To be able to log in, you will need a super user. To create it, execute the
 | 
					 | 
				
			||||||
    following command:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code-block:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ docker-compose run --rm webserver createsuperuser
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This will prompt you to set a username, an optional e-mail address and
 | 
					 | 
				
			||||||
    finally a password (at least 8 characters).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
8.  The default ``docker-compose.yml`` exports the webserver on your local port
 | 
					 | 
				
			||||||
    8000. If you did not change this, you should now be able to visit your
 | 
					 | 
				
			||||||
    Paperless instance at ``http://127.0.0.1:8000`` or at port 8000 of your server's IP address.
 | 
					 | 
				
			||||||
    Use the login credentials you created in the previous step.
 | 
					 | 
				
			||||||
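The minimum versions quoted in step 3 (Docker **17.09.0** and docker-compose **1.17.0**) can be checked against the output of ``docker -v`` and ``docker-compose -v``. The following is a minimal sketch that works on the version strings only; the helper names ``parse_version`` and ``meets_minimum`` are hypothetical, and the sample strings are illustrative rather than captured from a real system.

```python
import re


def parse_version(text):
    """Return the first x.y.z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version_output, minimum):
    """Compare numerically, so that 17.10.0 counts as newer than 17.9.0."""
    return parse_version(version_output) >= parse_version(minimum)


# Minimum versions quoted in step 3.
MIN_DOCKER = "17.09.0"
MIN_COMPOSE = "1.17.0"

if __name__ == "__main__":
    # Illustrative inputs; paste the real output of the two commands instead.
    print(meets_minimum("Docker version 20.10.7, build f0df350", MIN_DOCKER))
    print(meets_minimum("docker-compose version 1.16.1, build 6d1ac21", MIN_COMPOSE))
```

Comparing tuples of integers rather than raw strings avoids the pitfall that ``"1.9.0" > "1.17.0"`` when compared as text.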
 | 
					 | 
				
			||||||
.. _Docker: https://www.docker.com/
 | 
					 | 
				
			||||||
.. _docker-compose: https://docs.docker.com/compose/install/
 | 
					 | 
				
			||||||
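The ``_FILE`` convention mentioned in the note under step 5 is a common pattern for Docker secrets: instead of putting the secret itself into an environment variable, a second variable with the suffix ``_FILE`` points at a file that contains it. The sketch below shows the general idea only. ``read_setting`` is a hypothetical helper, not Paperless-ngx's own code, and the precedence it applies (file first, then the plain variable) is an assumption.

```python
import os
import tempfile


def read_setting(name, environ=None):
    """Return the value of `name`, preferring a `<name>_FILE` path if set.

    Hypothetical helper for illustration; not Paperless-ngx's own code.
    """
    env = os.environ if environ is None else environ
    file_path = env.get(name + "_FILE")
    if file_path:
        with open(file_path, encoding="utf-8") as handle:
            # Secret files often end with a trailing newline; drop it.
            return handle.read().strip()
    return env.get(name)


if __name__ == "__main__":
    # Simulate a Docker secret mounted as a file.
    with tempfile.TemporaryDirectory() as tmp:
        secret_path = os.path.join(tmp, "dbpass")
        with open(secret_path, "w", encoding="utf-8") as handle:
            handle.write("s3cret\n")
        fake_env = {
            "PAPERLESS_DBPASS_FILE": secret_path,
            "PAPERLESS_DBUSER": "paperless",
        }
        print(read_setting("PAPERLESS_DBPASS", fake_env))
        print(read_setting("PAPERLESS_DBUSER", fake_env))
```

Keeping the secret in a file means it does not show up in ``docker inspect`` output or in the compose file itself.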
 | 
					 | 
				
			||||||
.. _setup-docker_build:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Build the Docker image yourself
 | 
					 | 
				
			||||||
===============================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Clone the entire repository of paperless:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ git clone https://github.com/paperless-ngx/paperless-ngx
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The master branch always reflects the latest stable version.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Copy one of the ``docker/compose/docker-compose.*.yml`` to ``docker-compose.yml`` in the root folder,
 | 
					 | 
				
			||||||
    depending on which database backend you want to use. Copy
 | 
					 | 
				
			||||||
    ``docker-compose.env`` into the project root as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  In the ``docker-compose.yml`` file, find the line that instructs docker-compose to pull the paperless image from Docker Hub:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: yaml
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        webserver:
 | 
					 | 
				
			||||||
            image: ghcr.io/paperless-ngx/paperless-ngx:latest
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    and replace it with a line that instructs docker-compose to build the image from the current working directory instead:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: yaml
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        webserver:
 | 
					 | 
				
			||||||
            build:
 | 
					 | 
				
			||||||
              context: .
 | 
					 | 
				
			||||||
              args:
 | 
					 | 
				
			||||||
                QPDF_VERSION: x.y.z
 | 
					 | 
				
			||||||
                PIKEPDF_VERSION: x.y.z
 | 
					 | 
				
			||||||
                PSYCOPG2_VERSION: x.y.z
 | 
					 | 
				
			||||||
                JBIG2ENC_VERSION: 0.29
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        You should match the build argument versions to the version for the release you have
 | 
					 | 
				
			||||||
        checked out.  These are pre-built images with certain, more updated software.
 | 
					 | 
				
			||||||
        If you want to build these images yourself, that is possible, but beyond
 | 
					 | 
				
			||||||
        the scope of these steps.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Follow steps 3 to 8 of :ref:`setup-docker_hub`. When asked to run
 | 
					 | 
				
			||||||
    ``docker-compose pull`` to pull the image, do
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ docker-compose build
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    instead to build the image.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _setup-bare_metal:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Bare Metal Route
 | 
					 | 
				
			||||||
================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless runs on Linux only. The following procedure has been tested on a minimal
 | 
					 | 
				
			||||||
installation of Debian/Buster, which is the current stable release at the time of
 | 
					 | 
				
			||||||
writing. Windows is not and will never be supported.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Install dependencies. Paperless requires the following packages.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``python3`` 3.8, 3.9
 | 
					 | 
				
			||||||
    *   ``python3-pip``
 | 
					 | 
				
			||||||
    *   ``python3-dev``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``default-libmysqlclient-dev`` for MariaDB
 | 
					 | 
				
			||||||
    *   ``fonts-liberation`` for generating thumbnails for plain text files
 | 
					 | 
				
			||||||
    *   ``imagemagick`` >= 6 for PDF conversion
 | 
					 | 
				
			||||||
    *   ``gnupg`` for handling encrypted documents
 | 
					 | 
				
			||||||
    *   ``libpq-dev`` for PostgreSQL
 | 
					 | 
				
			||||||
    *   ``libmagic-dev`` for mime type detection
 | 
					 | 
				
			||||||
    *   ``mariadb-client`` for MariaDB compile time
 | 
					 | 
				
			||||||
    *   ``mime-support`` for mime type detection
 | 
					 | 
				
			||||||
    *   ``libzbar0`` for barcode detection
 | 
					 | 
				
			||||||
    *   ``poppler-utils`` for barcode detection
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Use this list for your preferred package management:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        python3 python3-pip python3-dev imagemagick fonts-liberation gnupg libpq-dev default-libmysqlclient-dev libmagic-dev mime-support libzbar0 poppler-utils
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    These dependencies are required for OCRmyPDF, which is used for text recognition.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``unpaper``
 | 
					 | 
				
			||||||
    *   ``ghostscript``
 | 
					 | 
				
			||||||
    *   ``icc-profiles-free``
 | 
					 | 
				
			||||||
    *   ``qpdf``
 | 
					 | 
				
			||||||
    *   ``liblept5``
 | 
					 | 
				
			||||||
    *   ``libxml2``
 | 
					 | 
				
			||||||
    *   ``pngquant`` (suggested for certain PDF image optimizations)
 | 
					 | 
				
			||||||
    *   ``zlib1g``
 | 
					 | 
				
			||||||
    *   ``tesseract-ocr`` >= 4.0.0 for OCR
 | 
					 | 
				
			||||||
    *   ``tesseract-ocr`` language packs (``tesseract-ocr-eng``, ``tesseract-ocr-deu``, etc)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Use this list for your preferred package management:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        unpaper ghostscript icc-profiles-free qpdf liblept5 libxml2 pngquant zlib1g tesseract-ocr
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    On Raspberry Pi, these libraries are required as well:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``libatlas-base-dev``
 | 
					 | 
				
			||||||
    *   ``libxslt1-dev``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You will also need ``build-essential``, ``python3-setuptools`` and ``python3-wheel``
 | 
					 | 
				
			||||||
    for installing some of the python dependencies.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Install ``redis`` >= 6.0 and configure it to start automatically.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  Optional. Install ``postgresql`` and configure a database, user and password for paperless. If you do not wish
 | 
					 | 
				
			||||||
    to use PostgreSQL, MariaDB and SQLite are available as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        On bare-metal installations using SQLite, ensure the
 | 
					 | 
				
			||||||
        `JSON1 extension <https://code.djangoproject.com/wiki/JSON1Extension>`_ is enabled. This is
 | 
					 | 
				
			||||||
        usually the case, but not always.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Get the release archive from `<https://github.com/paperless-ngx/paperless-ngx/releases>`_.
 | 
					 | 
				
			||||||
    If you clone the Git repository instead, you also have to compile the front end yourself.
 | 
					 | 
				
			||||||
    Extract the archive to a place from where you wish to execute it, such as ``/opt/paperless``.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
5.  Configure paperless. See :ref:`configuration` for details. Edit the included ``paperless.conf`` and adjust the
 | 
					 | 
				
			||||||
    settings to your needs. Required settings for getting paperless running are:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``PAPERLESS_REDIS`` should point to your redis server, such as ``redis://localhost:6379``.
 | 
					 | 
				
			||||||
    *   ``PAPERLESS_DBENGINE`` is optional and should be one of ``postgres``, ``mariadb``, or ``sqlite``.
 | 
					 | 
				
			||||||
    *   ``PAPERLESS_DBHOST`` should be the hostname on which your PostgreSQL server is running. Leave this
        unset to use SQLite instead. Also configure port, database name, user and password as necessary.
 | 
					 | 
				
			||||||
    *   ``PAPERLESS_CONSUMPTION_DIR`` should point to a folder which paperless should watch for documents. You might
 | 
					 | 
				
			||||||
        want to have this somewhere else. Likewise, ``PAPERLESS_DATA_DIR`` and ``PAPERLESS_MEDIA_ROOT`` define where
 | 
					 | 
				
			||||||
        paperless stores its data. If you like, you can point both to the same directory.
 | 
					 | 
				
			||||||
    *   ``PAPERLESS_SECRET_KEY`` should be a random sequence of characters. It's used for authentication. Failing
        to set it allows third parties to forge authentication credentials.
 | 
					 | 
				
			||||||
    *   ``PAPERLESS_URL`` is needed if you are behind a reverse proxy. This should point to your domain. Please see
 | 
					 | 
				
			||||||
        :ref:`configuration` for more information.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Many more adjustments can be made to paperless, especially the OCR part. The following options are recommended
 | 
					 | 
				
			||||||
    for everyone:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   Set ``PAPERLESS_OCR_LANGUAGE`` to the language most of your documents are written in.
 | 
					 | 
				
			||||||
    *   Set ``PAPERLESS_TIME_ZONE`` to your local time zone.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
6.  Create a system user under which you wish to run paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        adduser paperless --system --home /opt/paperless --group
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
7.  Ensure that the following directories exist
    and that the paperless user has write permissions to them:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    *   ``/opt/paperless/media``
 | 
					 | 
				
			||||||
    *   ``/opt/paperless/data``
 | 
					 | 
				
			||||||
    *   ``/opt/paperless/consume``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Adjust as necessary if you configured different folders.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
8.  Install the Python requirements from the ``requirements.txt`` file.
    It is up to you whether to use a virtual environment. First, update pip so that it installs current packages.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        sudo -Hu paperless pip3 install --upgrade pip
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        sudo -Hu paperless pip3 install -r requirements.txt
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This will install all python dependencies in the home directory of
 | 
					 | 
				
			||||||
    the new paperless user.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
9.  Go to ``/opt/paperless/src``, and execute the following commands:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        # This creates the database schema.
 | 
					 | 
				
			||||||
        sudo -Hu paperless python3 manage.py migrate
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        # This creates your first paperless user
 | 
					 | 
				
			||||||
        sudo -Hu paperless python3 manage.py createsuperuser
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
10. Optional: Test that paperless is working by executing
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        # This starts the Django development server.
 | 
					 | 
				
			||||||
        sudo -Hu paperless python3 manage.py runserver
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    and pointing your browser to http://localhost:8000/.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This is a development server which should not be used in
 | 
					 | 
				
			||||||
        production. It is not audited for security and performance
 | 
					 | 
				
			||||||
        is inferior to production ready web servers.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This will not start the consumer. Paperless does this in a
 | 
					 | 
				
			||||||
        separate process.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
11. Set up systemd services to run paperless automatically. You may
 | 
					 | 
				
			||||||
    use the service definition files included in the ``scripts`` folder
 | 
					 | 
				
			||||||
    as a starting point.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Paperless needs the ``webserver`` script to run the webserver, the
    ``consumer`` script to watch the input folder, the ``taskqueue`` script for the background workers
    that handle things like document consumption, and the ``scheduler`` script to run tasks such as
    email checking at certain times.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The ``socket`` script enables ``gunicorn`` to run on port 80 without
    root privileges. For this you need to uncomment the ``Requires=paperless-webserver.socket`` line
    in the ``webserver`` script and configure ``gunicorn`` to listen on port 80 (see ``paperless/gunicorn.conf.py``).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You may need to adjust the path to the ``gunicorn`` executable. This
 | 
					 | 
				
			||||||
    will be installed as part of the python dependencies, and is either located
 | 
					 | 
				
			||||||
    in the ``bin`` folder of your virtual environment, or in ``~/.local/bin/`` if
 | 
					 | 
				
			||||||
    no virtual environment is used.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    These services rely on redis and optionally the database server, but
 | 
					 | 
				
			||||||
    don't need to be started in any particular order. The example files
 | 
					 | 
				
			||||||
    depend on redis being started. If you use a database server, you should
 | 
					 | 
				
			||||||
    add additional dependencies.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        The included scripts run a ``gunicorn`` standalone server,
        which is fine for running paperless. It does support SSL;
        however, the Gunicorn documentation states that you should
        use a proxy server in front of gunicorn instead.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        For instructions on how to use nginx for that,
 | 
					 | 
				
			||||||
        :ref:`see the instructions below <setup-nginx>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
12. Optional: Install a samba server and make the consumption folder
 | 
					 | 
				
			||||||
    available as a network share.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
13. Configure ImageMagick to allow processing of PDF documents. Most distributions have
 | 
					 | 
				
			||||||
    this disabled by default, since PDF documents can contain malware. If
 | 
					 | 
				
			||||||
    you don't do this, paperless will fall back to ghostscript for certain steps
 | 
					 | 
				
			||||||
    such as thumbnail generation.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Edit ``/etc/ImageMagick-6/policy.xml`` and adjust
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        <policy domain="coder" rights="none" pattern="PDF" />
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    to
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        <policy domain="coder" rights="read|write" pattern="PDF" />
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
14. Optional: Install the `jbig2enc <https://ocrmypdf.readthedocs.io/en/latest/jbig2.html>`_
 | 
					 | 
				
			||||||
    encoder. This will reduce the size of generated PDF documents. You'll most likely need
 | 
					 | 
				
			||||||
    to compile this yourself, because this software was patented until around 2017 and
 | 
					 | 
				
			||||||
    binary packages are not available for most distributions.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
15. Optional: If using the NLTK machine learning processing (see ``PAPERLESS_ENABLE_NLTK`` in
 | 
					 | 
				
			||||||
    :ref:`configuration` for details), download the NLTK data for the Snowball Stemmer, Stopwords
 | 
					 | 
				
			||||||
    and Punkt tokenizer to your ``PAPERLESS_DATA_DIR/nltk``.  Refer to
 | 
					 | 
				
			||||||
    the `NLTK instructions <https://www.nltk.org/data.html>`_ for details on how to
 | 
					 | 
				
			||||||
    download the data.
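
The note in step 3 of the list above says that SQLite needs the JSON1 extension. This can be checked from Python before continuing. The following is a minimal sketch and not part of paperless: the helper name ``sqlite_has_json1`` is made up for illustration, and it only tests the SQLite library that your Python interpreter is linked against.

```python
import sqlite3


def sqlite_has_json1() -> bool:
    """Return True if the linked SQLite library provides the JSON1 functions."""
    connection = sqlite3.connect(":memory:")
    try:
        # json() is provided by JSON1; SQLite raises OperationalError if it is missing.
        connection.execute("SELECT json('{\"a\": 1}')")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        connection.close()


if __name__ == "__main__":
    print("JSON1 available:", sqlite_has_json1())
```

Run it with the same ``python3`` that will later run ``manage.py``, since a different interpreter may be linked against a different SQLite build.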
 | 
					 | 
				
			||||||
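
Step 5 of the list above asks for a random ``PAPERLESS_SECRET_KEY``. One way to produce such a value is Python's standard ``secrets`` module. This is only a sketch: the helper name ``make_secret_key`` and the default of 64 random bytes are assumptions for illustration, not requirements stated by paperless.

```python
import secrets


def make_secret_key(num_bytes: int = 64) -> str:
    """Return a URL-safe random string drawn from the OS's secure random source."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into paperless.conf.
    print(f"PAPERLESS_SECRET_KEY={make_secret_key()}")
```

The URL-safe alphabet avoids characters such as ``#`` or quotes that could be misread in a configuration file.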
 | 
					 | 
				
			||||||
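
The directory check in step 7 of the list above can be scripted. The sketch below assumes the default paths from that step and should be run as the paperless user for the write test to be meaningful. ``check_directories`` is a hypothetical helper, not something shipped with paperless.

```python
import os
from pathlib import Path

# Default locations from step 7; adjust if you configured different folders.
DEFAULT_DIRS = [
    "/opt/paperless/media",
    "/opt/paperless/data",
    "/opt/paperless/consume",
]


def check_directories(paths):
    """Map each path to 'ok', 'missing', or 'not writable' for the current user."""
    results = {}
    for raw in paths:
        path = Path(raw)
        if not path.is_dir():
            results[raw] = "missing"
        elif not os.access(path, os.W_OK | os.X_OK):
            results[raw] = "not writable"
        else:
            results[raw] = "ok"
    return results


if __name__ == "__main__":
    for directory, status in check_directories(DEFAULT_DIRS).items():
        print(f"{directory}: {status}")
```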
 | 
					 | 
				
			||||||
Migrating to Paperless-ngx
 | 
					 | 
				
			||||||
##########################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Migration is possible both from Paperless-ng and directly from the 'original' Paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Migrating from Paperless-ng
 | 
					 | 
				
			||||||
===========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx is meant to be a drop-in replacement for Paperless-ng and thus upgrading should be
 | 
					 | 
				
			||||||
trivial for most users, especially when using docker. However, as with any major change, it is
 | 
					 | 
				
			||||||
recommended to take a full backup first. Once you are ready, simply change the docker image to
 | 
					 | 
				
			||||||
point to the new source. E.g. if using Docker Compose, edit ``docker-compose.yml`` and change:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
  image: jonaswinkler/paperless-ng:latest
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
to
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
  image: ghcr.io/paperless-ngx/paperless-ngx:latest
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
and then run ``docker-compose up -d``, which will pull the new image and recreate the container.
 | 
					 | 
				
			||||||
That's it!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Users who installed with the bare-metal route should also update their Git clone to point to
 | 
					 | 
				
			||||||
``https://github.com/paperless-ngx/paperless-ngx``, e.g. using the command
 | 
					 | 
				
			||||||
``git remote set-url origin https://github.com/paperless-ngx/paperless-ngx`` and then pull the
 | 
					 | 
				
			||||||
latest version.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Migrating from Paperless
 | 
					 | 
				
			||||||
========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
At its core, paperless-ngx is still paperless and fully compatible. However, some
 | 
					 | 
				
			||||||
things have changed under the hood, so you need to adapt your setup depending on
 | 
					 | 
				
			||||||
how you installed paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This section describes how to update an existing paperless Docker installation.
 | 
					 | 
				
			||||||
The important things to keep in mind are as follows:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* Read the :doc:`changelog </changelog>` and take note of breaking changes.
 | 
					 | 
				
			||||||
* You should decide if you want to stick with SQLite or want to migrate your database
 | 
					 | 
				
			||||||
  to PostgreSQL. See :ref:`setup-sqlite_to_psql` for details on how to move your data from
 | 
					 | 
				
			||||||
  SQLite to PostgreSQL. Both work fine with paperless. However, if you already have a
 | 
					 | 
				
			||||||
  database server running for other services, you might as well use it for paperless too.
 | 
					 | 
				
			||||||
* The task scheduler of paperless, which is used to execute periodic tasks
 | 
					 | 
				
			||||||
  such as email checking and maintenance, requires a `redis`_ message broker
 | 
					 | 
				
			||||||
  instance. The docker-compose route takes care of that.
 | 
					 | 
				
			||||||
* The layout of the folder structure for your documents and data remains the
 | 
					 | 
				
			||||||
  same, so you can just plug your old docker volumes into paperless-ngx and
 | 
					 | 
				
			||||||
  expect it to find everything where it should be.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Migration to paperless-ngx is then performed in a few simple steps:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Stop paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ cd /path/to/current/paperless
 | 
					 | 
				
			||||||
        $ docker-compose down
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Make a backup, for two reasons: first, if something goes wrong, you still have your
    data; second, if you don't like paperless-ngx, you can switch back to
    paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  Download the latest release of paperless-ngx. You can either go with the
 | 
					 | 
				
			||||||
    docker-compose files from `here <https://github.com/paperless-ngx/paperless-ngx/tree/master/docker/compose>`__
 | 
					 | 
				
			||||||
    or clone the repository to build the image yourself (see :ref:`above <setup-docker_build>`).
 | 
					 | 
				
			||||||
    You can either replace your current paperless folder or put paperless-ngx
 | 
					 | 
				
			||||||
    in a different location.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        Paperless-ngx includes a ``.env`` file. This will set the
 | 
					 | 
				
			||||||
        project name for docker compose to ``paperless``, which will also define the name
 | 
					 | 
				
			||||||
        of the volumes used by paperless-ngx. However, if you find that paperless-ngx
 | 
					 | 
				
			||||||
        is not using your old paperless volumes, verify the names of your volumes with
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            $ docker volume ls | grep _data
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        and adjust the project name in the ``.env`` file so that it matches the name
 | 
					 | 
				
			||||||
        of the volumes before the ``_data`` part.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Download the ``docker-compose.sqlite.yml`` file to ``docker-compose.yml``.
 | 
					 | 
				
			||||||
    If you want to switch to PostgreSQL, do that after you migrated your existing
 | 
					 | 
				
			||||||
    SQLite database.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
5.  Adjust ``docker-compose.yml`` and ``docker-compose.env`` to your needs.
 | 
					 | 
				
			||||||
    See :ref:`setup-docker_hub` for details on which edits are advised.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
6.  :ref:`Update paperless. <administration-updating>`
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
7.  In order to find your existing documents with the new search feature, you need
 | 
					 | 
				
			||||||
    to invoke a one-time operation that will create the search index:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ docker-compose run --rm webserver document_index reindex
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This will migrate your database and create the search index. After that,
 | 
					 | 
				
			||||||
    paperless will take care of maintaining the index by itself.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
8.  Start paperless-ngx.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ docker-compose up -d
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This will run paperless in the background and automatically start it on system boot.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
9.  Paperless installed a permanent redirect to ``admin/`` in your browser. This
 | 
					 | 
				
			||||||
    redirect is still in place and prevents access to the new UI. Clear your
 | 
					 | 
				
			||||||
    browsing cache in order to fix this.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
10.  Optionally, follow the instructions below to migrate your existing data to PostgreSQL.
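
The caution in step 3 above says that the compose project name in ``.env`` must match the part of the volume name before ``_data``. As a small illustration of that rule, assuming invented volume names and a hypothetical helper called ``project_name_from_volume``:

```python
from typing import Optional


def project_name_from_volume(volume_name: str) -> Optional[str]:
    """Return the prefix before a trailing '_data', or None if there is none."""
    suffix = "_data"
    if volume_name.endswith(suffix) and len(volume_name) > len(suffix):
        return volume_name[: -len(suffix)]
    return None


if __name__ == "__main__":
    # Example names only; use the output of `docker volume ls` on your system.
    for name in ["paperless_data", "docs_data", "paperless_media"]:
        print(name, "->", project_name_from_volume(name))
```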
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Migrating from LinuxServer.io Docker Image
 | 
					 | 
				
			||||||
==========================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
As with any upgrades and large changes, it is highly recommended to create a backup before
 | 
					 | 
				
			||||||
starting.  This assumes the image was running using Docker Compose, but the instructions
 | 
					 | 
				
			||||||
are translatable to Docker commands as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Stop and remove the paperless container
 | 
					 | 
				
			||||||
2.  If using an external database, stop the container
 | 
					 | 
				
			||||||
3.  Update Redis configuration
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a)  If ``REDIS_URL`` is already set, change it to ``PAPERLESS_REDIS`` and continue
 | 
					 | 
				
			||||||
        to step 4.
 | 
					 | 
				
			||||||
    b)  Otherwise, in the ``docker-compose.yml`` add a new service for Redis,
 | 
					 | 
				
			||||||
        following `the example compose files <https://github.com/paperless-ngx/paperless-ngx/tree/main/docker/compose>`_
 | 
					 | 
				
			||||||
    c)  Set the environment variable ``PAPERLESS_REDIS`` so it points to the new Redis container
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Update user mapping
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a)  If set, change the environment variable ``PUID`` to ``USERMAP_UID``
 | 
					 | 
				
			||||||
    b)  If set, change the environment variable ``PGID`` to ``USERMAP_GID``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
5.  Update configuration paths
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a) Set the environment variable ``PAPERLESS_DATA_DIR``
 | 
					 | 
				
			||||||
       to ``/config``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
6.  Update media paths
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a) Set the environment variable ``PAPERLESS_MEDIA_ROOT``
 | 
					 | 
				
			||||||
       to ``/data/media``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
7.  Update timezone
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a) Set the environment variable ``PAPERLESS_TIME_ZONE``
 | 
					 | 
				
			||||||
       to the same value as ``TZ``
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
8.  Modify the ``image:`` to point to ``ghcr.io/paperless-ngx/paperless-ngx:latest`` or
 | 
					 | 
				
			||||||
    a specific version if preferred.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
9.  Start the containers as before, using ``docker-compose``.
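
Steps 3 to 7 above amount to renaming or adding a handful of environment variables. The sketch below applies those changes to a dictionary of settings. It illustrates the mapping only: the function name ``translate_environment`` and the example values are made up, it does not read or write your ``docker-compose.yml``, and it does not add the Redis service described in step 3 b).

```python
# Renames taken from steps 3 and 4 above.
RENAMES = {
    "REDIS_URL": "PAPERLESS_REDIS",
    "PUID": "USERMAP_UID",
    "PGID": "USERMAP_GID",
}


def translate_environment(old_env: dict) -> dict:
    """Return a copy of old_env adjusted as described in steps 3 to 7."""
    new_env = {}
    for key, value in old_env.items():
        new_env[RENAMES.get(key, key)] = value
    # Steps 5 and 6: paths given in the instructions above.
    new_env["PAPERLESS_DATA_DIR"] = "/config"
    new_env["PAPERLESS_MEDIA_ROOT"] = "/data/media"
    # Step 7: mirror TZ into PAPERLESS_TIME_ZONE when TZ is set.
    if "TZ" in old_env:
        new_env["PAPERLESS_TIME_ZONE"] = old_env["TZ"]
    return new_env


if __name__ == "__main__":
    # Invented example values.
    example = {
        "REDIS_URL": "redis://broker:6379",
        "PUID": "1000",
        "PGID": "1000",
        "TZ": "Europe/Berlin",
    }
    for key, value in sorted(translate_environment(example).items()):
        print(f"{key}={value}")
```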
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _setup-sqlite_to_psql:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Moving data from SQLite to PostgreSQL or MySQL/MariaDB
 | 
					 | 
				
			||||||
======================================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Moving your data from SQLite to PostgreSQL or MySQL/MariaDB is done by executing a series of Django
management commands, as shown below. The commands below use PostgreSQL, but are applicable to MySQL/MariaDB.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Make sure that your SQLite database is migrated to the latest version.
 | 
					 | 
				
			||||||
    Starting paperless will make sure that this is the case. If you try to
 | 
					 | 
				
			||||||
    load data from an old database schema in SQLite into a newer database
 | 
					 | 
				
			||||||
    schema in PostgreSQL, you will run into trouble.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    On some database fields, PostgreSQL enforces predefined limits on maximum
 | 
					 | 
				
			||||||
    length, whereas SQLite does not. The fields in question are the title of documents
 | 
					 | 
				
			||||||
    (128 characters), names of document types, tags and correspondents (128 characters),
 | 
					 | 
				
			||||||
    and filenames (1024 characters). If you have data in these fields that surpasses these
 | 
					 | 
				
			||||||
    limits, migration to PostgreSQL is not possible and will fail with an error.
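
Before migrating, it can help to look for values that exceed the limits named in the warning above. The following is a rough sketch rather than a supported tool. The limits come from the warning, but the table and column names are assumptions based on Django's default ``app_model`` naming, and ``find_oversized`` is a hypothetical helper. Verify the names against your own database before relying on the result.

```python
import sqlite3

# (table, column, limit) triples. Table and column names are assumptions.
LIMITS = [
    ("documents_document", "title", 128),
    ("documents_documenttype", "name", 128),
    ("documents_tag", "name", 128),
    ("documents_correspondent", "name", 128),
    ("documents_document", "filename", 1024),
]


def find_oversized(connection, limits=LIMITS):
    """Return (table, column, limit, value) for every value longer than its limit."""
    problems = []
    for table, column, limit in limits:
        query = f'SELECT "{column}" FROM "{table}" WHERE length("{column}") > ?'
        for (value,) in connection.execute(query, (limit,)):
            problems.append((table, column, limit, value))
    return problems


if __name__ == "__main__":
    # Self-contained demo on an in-memory database with one oversized title.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE documents_document (title TEXT, filename TEXT)")
    demo.execute("INSERT INTO documents_document VALUES (?, ?)", ("x" * 200, "a.pdf"))
    demo.execute("INSERT INTO documents_document VALUES (?, ?)", ("short", "b.pdf"))
    for table, column, limit, value in find_oversized(demo, [("documents_document", "title", 128)]):
        print(f"{table}.{column}: {len(value)} characters (limit {limit})")
```

To check a real installation, open a copy of your SQLite file with ``sqlite3.connect(...)`` and pass that connection instead of the demo one.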
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    MySQL is case insensitive by default, treating values like "Name" and "NAME" as identical.
 | 
					 | 
				
			||||||
    See :ref:`advanced-mysql-caveats` for details.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Stop paperless, if it is running.
 | 
					 | 
				
			||||||
2.  Tell paperless to use PostgreSQL:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a)  With docker, copy the provided ``docker-compose.postgres.yml`` file to
 | 
					 | 
				
			||||||
        ``docker-compose.yml``. Remember to adjust the consumption directory,
 | 
					 | 
				
			||||||
        if necessary.
 | 
					 | 
				
			||||||
    b)  Without docker, configure the database in your ``paperless.conf`` file.
 | 
					 | 
				
			||||||
        See :ref:`configuration` for details.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
3.  Open a shell and initialize the database:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    a)  With docker, run the following command to open a shell within the paperless
 | 
					 | 
				
			||||||
        container:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            $ cd /path/to/paperless
 | 
					 | 
				
			||||||
            $ docker-compose run --rm webserver /bin/bash
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This will launch the container and initialize the PostgreSQL database.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    b)  Without docker, remember to activate any virtual environment, switch to
 | 
					 | 
				
			||||||
        the ``src`` directory and create the database schema:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            $ cd /path/to/paperless/src
 | 
					 | 
				
			||||||
            $ python3 manage.py migrate
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        This will not copy any data yet.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
4.  Dump your data from SQLite:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ python3 manage.py dumpdata --database=sqlite --exclude=contenttypes --exclude=auth.Permission > data.json
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
5.  Load your data into PostgreSQL:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ python3 manage.py loaddata data.json
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
6.  If operating inside Docker, you may exit the shell now.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ exit
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
7.  Start paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Moving back to Paperless
 | 
					 | 
				
			||||||
========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Let's say you migrated to Paperless-ngx and used it for a while, but decided that
you don't like it and want to move back. (If you do, send me a mail about what
part you didn't like!) You can do that in a few simple steps.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless-ngx modified the database schema slightly; however, these changes can
 | 
					 | 
				
			||||||
be reverted while keeping your current data, so that your current data will
 | 
					 | 
				
			||||||
be compatible with original Paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Execute this:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ cd /path/to/paperless
 | 
					 | 
				
			||||||
    $ docker-compose run --rm webserver migrate documents 0023
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Or without docker:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    $ cd /path/to/paperless/src
 | 
					 | 
				
			||||||
    $ python3 manage.py migrate documents 0023
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
After that, you need to clear your cookies (Paperless-ngx comes with updated
 | 
					 | 
				
			||||||
dependencies that do cookie-processing differently) and probably your cache
 | 
					 | 
				
			||||||
as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _setup-less_powerful_devices:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Considerations for less powerful devices
 | 
					 | 
				
			||||||
########################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless runs on Raspberry Pi. However, some things are rather slow on the Pi and
 | 
					 | 
				
			||||||
configuring some options in paperless can help improve performance immensely:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Stick with SQLite to save some resources.
 | 
					 | 
				
			||||||
*   Consider setting ``PAPERLESS_OCR_PAGES`` to 1, so that paperless will only OCR
 | 
					 | 
				
			||||||
    the first page of your documents. In most cases, this page contains enough
 | 
					 | 
				
			||||||
    information to be able to find it.
 | 
					 | 
				
			||||||
*   ``PAPERLESS_TASK_WORKERS`` and ``PAPERLESS_THREADS_PER_WORKER`` are configured
 | 
					 | 
				
			||||||
    to use all cores. The Raspberry Pi models 3 and up have 4 cores, meaning that
 | 
					 | 
				
			||||||
    paperless will use 2 workers and 2 threads per worker. This may result in
 | 
					 | 
				
			||||||
    sluggish response times during consumption, so you might want to lower these
 | 
					 | 
				
			||||||
    settings (example: 2 workers and 1 thread to always have some computing power
 | 
					 | 
				
			||||||
    left for other tasks).
 | 
					 | 
				
			||||||
*   Keep ``PAPERLESS_OCR_MODE`` at its default value ``skip`` and consider OCR'ing
 | 
					 | 
				
			||||||
    your documents before feeding them into paperless. Some scanners are able to
 | 
					 | 
				
			||||||
    do this! You might want to even specify ``skip_noarchive`` to skip archive
 | 
					 | 
				
			||||||
    file generation for already OCR'ed documents entirely.
 | 
					 | 
				
			||||||
*   If you want to perform OCR on the device, consider using ``PAPERLESS_OCR_CLEAN=none``.
 | 
					 | 
				
			||||||
    This will speed up OCR times and use less memory at the expense of slightly worse
 | 
					 | 
				
			||||||
    OCR results.
 | 
					 | 
				
			||||||
*   If using docker, consider setting ``PAPERLESS_WEBSERVER_WORKERS`` to 1.
    This will save some memory.
 | 
					 | 
				
			||||||
*   Consider setting ``PAPERLESS_ENABLE_NLTK`` to false, to disable the more
 | 
					 | 
				
			||||||
    advanced language processing, which can take more memory and processing time.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
For details, refer to :ref:`configuration`.
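
The reasoning behind the ``PAPERLESS_TASK_WORKERS`` and ``PAPERLESS_THREADS_PER_WORKER`` advice above is simple arithmetic: during consumption, roughly workers times threads cores can be busy at once. The sketch below only illustrates that multiplication for the 4-core example. ``cores_left_free`` is a hypothetical helper and is not how paperless computes its defaults.

```python
def cores_left_free(cores: int, workers: int, threads_per_worker: int) -> int:
    """Return how many cores stay idle when every worker thread is busy."""
    busy = workers * threads_per_worker
    return max(cores - busy, 0)


if __name__ == "__main__":
    # Raspberry Pi 3 and up: 4 cores.
    print(cores_left_free(4, 2, 2))  # 2 workers x 2 threads: prints 0
    print(cores_left_free(4, 2, 1))  # 2 workers x 1 thread: prints 2
```

With 2 workers and 1 thread each, two cores remain available for the web interface and other tasks, which matches the example setting suggested above.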
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Updating the :ref:`automatic matching algorithm <advanced-automatic_matching>`
 | 
					 | 
				
			||||||
    takes quite a bit of time. However, the update mechanism checks if your
 | 
					 | 
				
			||||||
    data has changed before doing the heavy lifting. If you experience the
 | 
					 | 
				
			||||||
    algorithm taking too much CPU time, consider changing the schedule in the
 | 
					 | 
				
			||||||
    admin interface to daily. You can also manually invoke the task
 | 
					 | 
				
			||||||
    by changing the date and time of the next run to today/now.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The actual matching of the algorithm is fast and works on Raspberry Pi as
 | 
					 | 
				
			||||||
    well as on any other device.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _redis: https://redis.io/
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _setup-nginx:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Using nginx as a reverse proxy
 | 
					 | 
				
			||||||
##############################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you want to expose paperless to the internet, you should hide it behind a
 | 
					 | 
				
			||||||
reverse proxy with SSL enabled.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
In addition to the usual configuration for SSL,
 | 
					 | 
				
			||||||
the following configuration is required for paperless to operate:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: nginx
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    http {
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        # Adjust as required. This is the maximum size for file uploads.
 | 
					 | 
				
			||||||
        # The default value 1M might be a little too small.
 | 
					 | 
				
			||||||
        client_max_body_size 10M;
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        server {
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
            location / {
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                # Adjust host and port as required.
 | 
					 | 
				
			||||||
                proxy_pass http://localhost:8000/;
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                # These configuration options are required for WebSockets to work.
 | 
					 | 
				
			||||||
                proxy_http_version 1.1;
 | 
					 | 
				
			||||||
                proxy_set_header Upgrade $http_upgrade;
 | 
					 | 
				
			||||||
                proxy_set_header Connection "upgrade";
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
                proxy_redirect off;
 | 
					 | 
				
			||||||
                proxy_set_header Host $host;
 | 
					 | 
				
			||||||
                proxy_set_header X-Real-IP $remote_addr;
 | 
					 | 
				
			||||||
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 | 
					 | 
				
			||||||
                proxy_set_header X-Forwarded-Host $server_name;
 | 
					 | 
				
			||||||
            }
 | 
					 | 
				
			||||||
        }
 | 
					 | 
				
			||||||
    }
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The ``PAPERLESS_URL`` configuration variable is also required when using a reverse proxy. Please refer to the :ref:`hosting-and-security` docs.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Also read the end of the nginx section in the `Django Channels deployment documentation <https://channels.readthedocs.io/en/stable/deploying.html#nginx-supervisor-ubuntu>`__.
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
@@ -1,328 +1,12 @@
 | 
				
			|||||||
 | 
					.. _troubleshooting:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
***************
 | 
					***************
 | 
				
			||||||
Troubleshooting
 | 
					Troubleshooting
 | 
				
			||||||
***************
 | 
					***************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
No files are added by the consumer
 | 
					 | 
				
			||||||
##################################
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Check for the following issues:
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
*   Ensure that the directory you're putting your documents in is the folder
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
    paperless is watching. With docker, this setting is performed in the
 | 
					 | 
				
			||||||
    ``docker-compose.yml`` file. Without docker, look at the ``CONSUMPTION_DIR``
 | 
					 | 
				
			||||||
    setting. Don't adjust this setting if you're using docker.
 | 
					 | 
				
			||||||
*   Ensure that redis is up and running. Paperless does its task processing
 | 
					 | 
				
			||||||
    asynchronously, and for documents to arrive at the task processor, it needs
 | 
					 | 
				
			||||||
    redis to run.
 | 
					 | 
				
			||||||
*   Ensure that the task processor is running. Docker does this automatically.
 | 
					 | 
				
			||||||
    Manually invoke the task processor by executing
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
    .. code:: shell-session
 | 
					    You will be redirected shortly...
 | 
				
			||||||
 | 
					 | 
				
			||||||
        $ celery --app paperless worker
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   Look at the output of paperless and inspect it for any errors.
 | 
					 | 
				
			||||||
*   Go to the admin interface, and check if there are failed tasks. If so, the
 | 
					 | 
				
			||||||
    tasks will contain an error message.
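As a rough aid for the Redis check in the list above, the following sketch tests whether anything accepts TCP connections on a given host and port. It is not part of Paperless; the function name ``port_is_open`` and the host and port in the usage comment are assumptions for illustration. A successful connection only shows that something is listening, not that it is a working Redis server.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


# Example call (assumed values; Redis commonly listens on port 6379):
#   port_is_open("localhost", 6379)
```

If this returns ``False`` for the host and port your installation is configured to use, documents cannot reach the task processor.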
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consumer warns ``OCR for XX failed``
 | 
					 | 
				
			||||||
####################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you find the OCR accuracy to be too low, and/or the document consumer warns
 | 
					 | 
				
			||||||
that ``OCR for XX failed, but we're going to stick with what we've got since
 | 
					 | 
				
			||||||
FORGIVING_OCR is enabled``, then you might need to install the
 | 
					 | 
				
			||||||
`Tesseract language files <http://packages.ubuntu.com/search?keywords=tesseract-ocr>`_
 | 
					 | 
				
			||||||
matching your document's languages.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
As an example, if you are running Paperless-ngx from any Ubuntu or Debian
 | 
					 | 
				
			||||||
box, and your documents are written in Spanish, you may need to run::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    apt-get install -y tesseract-ocr-spa
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consumer fails to pick up any new files
#######################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you notice that the consumer will only pick up files in the consumption
 | 
					 | 
				
			||||||
directory at startup, but won't find any other files added later, you will need to
 | 
					 | 
				
			||||||
enable filesystem polling with the configuration option
 | 
					 | 
				
			||||||
``PAPERLESS_CONSUMER_POLLING``, see :ref:`here <configuration-polling>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This will disable listening to filesystem changes with inotify and paperless will
 | 
					 | 
				
			||||||
manually check the consumption directory for changes instead.
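To illustrate what "manually checking the directory" means, here is a minimal polling sketch. It is not the Paperless implementation; the helper names ``snapshot`` and ``new_files`` are hypothetical. The idea is to list the directory at intervals and compare each listing with the previous one, instead of waiting for filesystem notifications.

```python
import os


def snapshot(directory):
    """Map each regular file in the directory to its modification time."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                result[entry.name] = entry.stat().st_mtime
    return result


def new_files(before, after):
    """Return names present in the second snapshot but not in the first."""
    return sorted(name for name in after if name not in before)
```

A polling consumer would call ``snapshot`` every few seconds and hand the result of ``new_files`` to the task processor. This works on file systems where inotify events are not delivered, at the cost of repeated directory scans.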
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless always redirects to /admin
 | 
					 | 
				
			||||||
####################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You probably had the old paperless installed at some point. Paperless installed
 | 
					 | 
				
			||||||
a permanent redirect to /admin in your browser, and you need to clear your
 | 
					 | 
				
			||||||
browsing data / cache to fix that.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Operation not permitted
 | 
					 | 
				
			||||||
#######################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might see errors such as:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    chown: changing ownership of '../export': Operation not permitted
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The container tries to set file ownership on the listed directories. This is
 | 
					 | 
				
			||||||
required so that the user running paperless inside docker has write permissions
 | 
					 | 
				
			||||||
to these folders. The error typically appears when these directories point to NFS shares,
 | 
					 | 
				
			||||||
for example.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Ensure that ``chown`` is possible on these directories.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Classifier error: No training data available
 | 
					 | 
				
			||||||
############################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This indicates that the Auto matching algorithm found no documents to learn from.
 | 
					 | 
				
			||||||
There are two possible reasons:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   You don't use the Auto matching algorithm: The error can be safely ignored in this case.
 | 
					 | 
				
			||||||
*   You are using the Auto matching algorithm: The classifier explicitly excludes documents
 | 
					 | 
				
			||||||
    with Inbox tags. Verify that there are documents in your archive without inbox tags.
 | 
					 | 
				
			||||||
    The algorithm will only learn from documents not in your inbox.
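The selection rule described above can be sketched as follows. This is an illustration only, not the classifier's actual code; the ``Doc`` class and the ``training_documents`` function are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    title: str
    tags: set = field(default_factory=set)


def training_documents(docs, inbox_tags):
    """Keep only documents that carry none of the inbox tags."""
    return [d for d in docs if not (d.tags & inbox_tags)]


docs = [
    Doc("Invoice 2021-01", {"inbox"}),
    Doc("Rental contract", {"housing"}),
    Doc("Bank statement"),
]
usable = training_documents(docs, inbox_tags={"inbox"})
```

If every document still carries an inbox tag, the resulting list is empty, which corresponds to the "No training data available" situation.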
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
UserWarning in sklearn on every single document
 | 
					 | 
				
			||||||
###############################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You may encounter warnings like this:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    /usr/local/lib/python3.7/site-packages/sklearn/base.py:315:
 | 
					 | 
				
			||||||
    UserWarning: Trying to unpickle estimator CountVectorizer from version 0.23.2 when using version 0.24.0.
 | 
					 | 
				
			||||||
    This might lead to breaking code or invalid results. Use at your own risk.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This happens when certain dependencies of paperless that are responsible for the auto matching algorithm are
 | 
					 | 
				
			||||||
updated. After updating these, your current training data *might* not be compatible anymore. This can be ignored
 | 
					 | 
				
			||||||
in most cases. This warning will disappear automatically when paperless updates the training data.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you want to get rid of the warning or actually experience issues with automatic matching, delete
 | 
					 | 
				
			||||||
the file ``classification_model.pickle`` in the data directory and let paperless recreate it.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
504 Server Error: Gateway Timeout when adding Office documents
 | 
					 | 
				
			||||||
##############################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You may experience these errors when using the optional TIKA integration:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: http://gotenberg:3000/forms/libreoffice/convert
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Gotenberg is a server that converts Office documents into PDF documents and has a default timeout of 30 seconds.
 | 
					 | 
				
			||||||
When conversion takes longer, Gotenberg raises this error.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can increase the timeout by configuring a command flag for Gotenberg (see also `here <https://gotenberg.dev/docs/modules/api#properties>`__).
 | 
					 | 
				
			||||||
If using docker-compose, this is achieved by the following configuration change in the ``docker-compose.yml`` file:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: yaml
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    gotenberg:
 | 
					 | 
				
			||||||
        image: gotenberg/gotenberg:7.6
 | 
					 | 
				
			||||||
        restart: unless-stopped
 | 
					 | 
				
			||||||
        command:
 | 
					 | 
				
			||||||
            - "gotenberg"
 | 
					 | 
				
			||||||
            - "--chromium-disable-routes=true"
 | 
					 | 
				
			||||||
            - "--api-timeout=60"
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Permission denied errors in the consumption directory
 | 
					 | 
				
			||||||
#####################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might encounter errors such as:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The following error occured while consuming document.pdf: [Errno 13] Permission denied: '/usr/src/paperless/src/../consume/document.pdf'
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This happens when paperless does not have permission to delete files inside the consumption directory.
 | 
					 | 
				
			||||||
Ensure that ``USERMAP_UID`` and ``USERMAP_GID`` are set to the user id and group id you use on the host operating system, if these are
 | 
					 | 
				
			||||||
different from ``1000``. See :ref:`setup-docker_hub`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Also ensure that you are able to read and write to the consumption directory on the host.
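A small sketch for inspecting the relevant ownership and permission information on a Unix-like host follows. The function name ``describe_access`` is hypothetical, and the script has to be run as the user that Paperless runs as to be meaningful. Deleting a file requires write and search permission on the directory that contains it, which is what the check tests.

```python
import os


def describe_access(path):
    """Report ownership of a directory and what the current process may do in it."""
    st = os.stat(path)
    return {
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        "can_read": os.access(path, os.R_OK),
        # Removing entries needs write and search (execute) permission
        # on the directory itself.
        "can_delete_entries": os.access(path, os.W_OK | os.X_OK),
    }
```

If ``owner_uid`` and ``owner_gid`` differ from the values the container runs with, adjusting ``USERMAP_UID`` and ``USERMAP_GID`` as described above is the likely fix.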
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
OSError: [Errno 19] No such device when consuming files
 | 
					 | 
				
			||||||
#######################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
If you experience errors such as:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code:: shell-session
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    File "/usr/local/lib/python3.7/site-packages/whoosh/codec/base.py", line 570, in open_compound_file
 | 
					 | 
				
			||||||
    return CompoundStorage(dbfile, use_mmap=storage.supports_mmap)
 | 
					 | 
				
			||||||
    File "/usr/local/lib/python3.7/site-packages/whoosh/filedb/compound.py", line 75, in __init__
 | 
					 | 
				
			||||||
    self._source = mmap.mmap(fileno, 0, access=mmap.ACCESS_READ)
 | 
					 | 
				
			||||||
    OSError: [Errno 19] No such device
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    During handling of the above exception, another exception occurred:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Traceback (most recent call last):
 | 
					 | 
				
			||||||
    File "/usr/local/lib/python3.7/site-packages/django_q/cluster.py", line 436, in worker
 | 
					 | 
				
			||||||
    res = f(*task["args"], **task["kwargs"])
 | 
					 | 
				
			||||||
    File "/usr/src/paperless/src/documents/tasks.py", line 73, in consume_file
 | 
					 | 
				
			||||||
    override_tag_ids=override_tag_ids)
 | 
					 | 
				
			||||||
    File "/usr/src/paperless/src/documents/consumer.py", line 271, in try_consume_file
 | 
					 | 
				
			||||||
    raise ConsumerError(e)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless uses a search index to provide better and faster full text searching. This search index is stored inside
 | 
					 | 
				
			||||||
the ``data`` folder. The search index uses memory-mapped files (mmap). The above error indicates that paperless
 | 
					 | 
				
			||||||
was unable to create and open these files.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This happens when you're trying to store the data directory on certain file systems (mostly network shares)
 | 
					 | 
				
			||||||
that don't support memory-mapped files.
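To find out whether a particular directory supports memory-mapped files, a check along these lines may help. It is a minimal sketch, and the function name ``supports_mmap`` is an assumption. It creates a temporary file in the given directory and tries to map it with Python's standard ``mmap`` module, the same module that appears in the traceback above.

```python
import mmap
import tempfile


def supports_mmap(directory):
    """Try to memory-map a small temporary file created inside the directory."""
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        tmp.write(b"paperless")
        tmp.flush()
        try:
            with mmap.mmap(tmp.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                return mapped[:9] == b"paperless"
        except OSError:
            # For example "[Errno 19] No such device" on unsupported shares.
            return False
```

Run it against the directory you intend to use as the data directory. If it returns ``False``, store the data directory on a local file system instead.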
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Web-UI stuck at "Loading..."
 | 
					 | 
				
			||||||
############################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
There are several possible causes.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  If you built the docker image yourself or deployed using the bare metal route,
 | 
					 | 
				
			||||||
    make sure that there are files in ``<paperless-root>/static/frontend/<lang-code>/``.
 | 
					 | 
				
			||||||
    If there are no files, make sure that you executed ``collectstatic`` successfully, either
 | 
					 | 
				
			||||||
    manually or as part of the docker image build.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    If the front end is still missing, make sure that the front end is compiled (files present in
 | 
					 | 
				
			||||||
    ``src/documents/static/frontend``). If it is not, you need to compile the front end yourself
 | 
					 | 
				
			||||||
    or download the release archive instead of cloning the repository.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
2.  Check the output of the web server. You might see errors like this:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        [2021-01-25 10:08:04 +0000] [40] [ERROR] Socket error processing request.
 | 
					 | 
				
			||||||
        Traceback (most recent call last):
 | 
					 | 
				
			||||||
        File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 134, in handle
 | 
					 | 
				
			||||||
            self.handle_request(listener, req, client, addr)
 | 
					 | 
				
			||||||
        File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 190, in handle_request
 | 
					 | 
				
			||||||
            util.reraise(*sys.exc_info())
 | 
					 | 
				
			||||||
        File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 625, in reraise
 | 
					 | 
				
			||||||
            raise value
 | 
					 | 
				
			||||||
        File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 178, in handle_request
 | 
					 | 
				
			||||||
            resp.write_file(respiter)
 | 
					 | 
				
			||||||
        File "/usr/local/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 396, in write_file
 | 
					 | 
				
			||||||
            if not self.sendfile(respiter):
 | 
					 | 
				
			||||||
        File "/usr/local/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 386, in sendfile
 | 
					 | 
				
			||||||
            sent += os.sendfile(sockno, fileno, offset + sent, count)
 | 
					 | 
				
			||||||
        OSError: [Errno 22] Invalid argument
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    To fix this issue, add
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    .. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
        SENDFILE=0
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    to your ``docker-compose.env`` file.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Error while reading metadata
 | 
					 | 
				
			||||||
############################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might find messages like these in your log files:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    [WARNING] [paperless.parsing.tesseract] Error while reading metadata
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This indicates that paperless failed to read PDF metadata from one of your documents. This happens when you
 | 
					 | 
				
			||||||
open the affected documents in paperless for editing. Paperless will continue to work, and will simply not
 | 
					 | 
				
			||||||
show the invalid metadata.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consumer fails with a FileNotFoundError
 | 
					 | 
				
			||||||
#######################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might find messages like these in your log files:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    [ERROR] [paperless.consumer] Error while consuming document SCN_0001.pdf: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.yhk3zbv0/origin.pdf'
 | 
					 | 
				
			||||||
    Traceback (most recent call last):
 | 
					 | 
				
			||||||
      File "/app/paperless/src/paperless_tesseract/parsers.py", line 261, in parse
 | 
					 | 
				
			||||||
        ocrmypdf.ocr(**args)
 | 
					 | 
				
			||||||
      File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/api.py", line 337, in ocr
 | 
					 | 
				
			||||||
        return run_pipeline(options=options, plugin_manager=plugin_manager, api=True)
 | 
					 | 
				
			||||||
      File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_sync.py", line 385, in run_pipeline
 | 
					 | 
				
			||||||
        exec_concurrent(context, executor)
 | 
					 | 
				
			||||||
      File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_sync.py", line 302, in exec_concurrent
 | 
					 | 
				
			||||||
        pdf = post_process(pdf, context, executor)
 | 
					 | 
				
			||||||
      File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_sync.py", line 235, in post_process
 | 
					 | 
				
			||||||
        pdf_out = metadata_fixup(pdf_out, context)
 | 
					 | 
				
			||||||
      File "/usr/local/lib/python3.8/dist-packages/ocrmypdf/_pipeline.py", line 798, in metadata_fixup
 | 
					 | 
				
			||||||
        with pikepdf.open(context.origin) as original, pikepdf.open(working_file) as pdf:
 | 
					 | 
				
			||||||
      File "/usr/local/lib/python3.8/dist-packages/pikepdf/_methods.py", line 923, in open
 | 
					 | 
				
			||||||
        pdf = Pdf._open(
 | 
					 | 
				
			||||||
    FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.yhk3zbv0/origin.pdf'
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This probably indicates paperless tried to consume the same file twice.  This can happen for a number of reasons,
 | 
					 | 
				
			||||||
depending on how documents are placed into the consume folder.  If paperless is using inotify (the default) to
 | 
					 | 
				
			||||||
check for documents, try adjusting the :ref:`inotify configuration <configuration-inotify>`.  If polling is enabled,
 | 
					 | 
				
			||||||
try adjusting the :ref:`polling configuration <configuration-polling>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consumer fails waiting for file to remain unmodified.
 | 
					 | 
				
			||||||
#####################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might find messages like these in your log files:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    [ERROR] [paperless.management.consumer] Timeout while waiting on file /usr/src/paperless/src/../consume/SCN_0001.pdf to remain unmodified.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This indicates paperless timed out while waiting for the file to be completely written to the consume folder.
 | 
					 | 
				
			||||||
Adjusting :ref:`polling configuration <configuration-polling>` values should resolve the issue.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The user will need to manually move the file out of the consume folder and
 | 
					 | 
				
			||||||
    back in, for the initial failing file to be consumed.
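The idea of "waiting for a file to remain unmodified" can be sketched like this. It is not the code Paperless uses; the function name and the parameters ``checks``, ``delay`` and ``timeout`` are hypothetical. The file counts as complete once its size and modification time stay the same for several consecutive polls, and the wait gives up after a timeout.

```python
import os
import time


def wait_until_stable(path, checks=3, delay=0.1, timeout=5.0):
    """Return True once size and mtime are unchanged for `checks` consecutive polls."""
    deadline = time.monotonic() + timeout
    last = None
    stable = 0
    while time.monotonic() < deadline:
        st = os.stat(path)
        current = (st.st_size, st.st_mtime)
        if current == last:
            stable += 1
            if stable >= checks:
                return True
        else:
            # The file changed (or this is the first poll): start counting again.
            stable = 0
            last = current
        time.sleep(delay)
    return False
```

A slow scanner or network copy that keeps writing past the timeout makes the function return ``False``, which mirrors the timeout message shown above.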
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consumer fails reporting "OS reports file as busy still".
 | 
					 | 
				
			||||||
#########################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might find messages like these in your log files:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/src/../consume/SCN_0001.pdf: OS reports file as busy still
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
This indicates paperless was unable to open the file, as the OS reported the file as still being in use.  To prevent a
 | 
					 | 
				
			||||||
crash, paperless did not try to consume the file.  If paperless is using inotify (the default) to
 | 
					 | 
				
			||||||
check for documents, try adjusting the :ref:`inotify configuration <configuration-inotify>`.  If polling is enabled,
 | 
					 | 
				
			||||||
try adjusting the :ref:`polling configuration <configuration-polling>`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The user will need to manually move the file out of the consume folder and
 | 
					 | 
				
			||||||
    back in, for the initial failing file to be consumed.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Log reports "Creating PaperlessTask failed".
 | 
					 | 
				
			||||||
#########################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You might find messages like these in your log files:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    [ERROR] [paperless.management.consumer] Creating PaperlessTask failed: db locked
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You are likely using an SQLite-based installation with an increased number of workers and are running into SQLite's concurrency limitations.
 | 
					 | 
				
			||||||
Uploading or consuming multiple files at once results in many workers attempting to access the database simultaneously.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Consider changing to the PostgreSQL database if you will often process many documents at once.  Otherwise,
 | 
					 | 
				
			||||||
try tweaking the ``PAPERLESS_DB_TIMEOUT`` setting to allow more time for the database to unlock.  This may have
 | 
					 | 
				
			||||||
minor performance implications.
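The locking behaviour can be reproduced with Python's standard ``sqlite3`` module. In the sketch below, one connection holds the write lock while a second one tries to write; the second connection waits up to ``timeout`` seconds and then fails. The function and table names are hypothetical, and how ``PAPERLESS_DB_TIMEOUT`` is passed to the database driver is not shown here.

```python
import sqlite3


def demonstrate_lock(db_path, timeout):
    """Hold a write lock on one connection and try to write from a second one."""
    holder = sqlite3.connect(db_path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS tasks (name TEXT)")
    holder.execute("BEGIN IMMEDIATE")  # take the write lock and keep it
    holder.execute("INSERT INTO tasks VALUES ('first')")

    # The second connection waits at most `timeout` seconds for the lock.
    waiter = sqlite3.connect(db_path, timeout=timeout, isolation_level=None)
    try:
        waiter.execute("INSERT INTO tasks VALUES ('second')")
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        waiter.close()
        holder.execute("ROLLBACK")
        holder.close()
```

A larger timeout gives the first writer more time to finish, but a waiting worker is blocked for that long, which is where the minor performance implications come from.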
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
gunicorn fails to start with "is not a valid port number"
 | 
					 | 
				
			||||||
#########################################################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You are likely running on Kubernetes, which automatically creates an environment variable named ``${serviceName}_PORT``.
 | 
					 | 
				
			||||||
This is the same environment variable which is used by Paperless to optionally change the port gunicorn listens on.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
To fix this, set ``PAPERLESS_PORT`` again to your desired port, or the default of 8000.
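The following sketch shows why the injected value is rejected. Kubernetes sets service variables of this kind to a URL-like string such as ``tcp://10.0.0.11:8000`` rather than a bare number. The address is a made-up example and ``parse_port`` is a hypothetical helper, not the check gunicorn performs.

```python
def parse_port(value):
    """Return the port as an int, or None if the value is not a plain port number."""
    if value.isascii() and value.isdigit() and 0 < int(value) < 65536:
        return int(value)
    return None


plain = parse_port("8000")
injected = parse_port("tcp://10.0.0.11:8000")  # made-up service address
```

Here ``plain`` is a usable port, while ``injected`` is ``None`` because the whole string is not a number.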
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||
@@ -1,420 +1,12 @@
 | 
				
			|||||||
 | 
					.. _usage_overview:
 | 
				
			||||||
 | 
					
 | 
				
			||||||
**************
 | 
					**************
 | 
				
			||||||
Usage Overview
 | 
					Usage Overview
 | 
				
			||||||
**************
 | 
					**************
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless is an application that manages your personal documents. With
 | 
					 | 
				
			||||||
the help of a document scanner (see :ref:`scanners`), paperless transforms
 | 
					 | 
				
			||||||
your unwieldy physical document binders into a searchable archive and
 | 
					 | 
				
			||||||
provides many utilities for finding and managing your documents.
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
 | 
					.. cssclass:: redirect-notice
 | 
				
			||||||
 | 
					
 | 
				
			||||||
Terms and definitions
 | 
					    The Paperless-ngx documentation has permanently moved.
 | 
				
			||||||
#####################
 | 
					 | 
				
			||||||
 | 
					
 | 
				
			||||||
Paperless essentially consists of two different parts for managing your
 | 
					    You will be redirected shortly...
 | 
				
			||||||
documents:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* The *consumer* watches a specified folder and adds all documents in that
 | 
					 | 
				
			||||||
  folder to paperless.
 | 
					 | 
				
			||||||
* The *web server* provides a UI that you use to manage and search for your
 | 
					 | 
				
			||||||
  scanned documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Each document has a number of fields that you can assign to it:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* A *Document* is a piece of paper that sometimes contains valuable
 | 
					 | 
				
			||||||
  information.
 | 
					 | 
				
			||||||
* The *correspondent* of a document is the person, institution or company that
 | 
					 | 
				
			||||||
  a document either originates from, or is sent to.
 | 
					 | 
				
			||||||
* A *tag* is a label that you can assign to documents. Think of tags as more
 | 
					 | 
				
			||||||
  powerful folders: Multiple documents can be grouped together with a single
 | 
					 | 
				
			||||||
  tag, however, a single document can also have multiple tags. This is not
 | 
					 | 
				
			||||||
  possible with folders. The reason folders are not implemented in paperless
 | 
					 | 
				
			||||||
  is simply that tags are much more versatile than folders.
 | 
					 | 
				
			||||||
* A *document type* is used to demarcate the type of a document such as letter,
 | 
					 | 
				
			||||||
  bank statement, invoice, contract, etc. It is used to identify what a document
 | 
					 | 
				
			||||||
  is about.
 | 
					 | 
				
			||||||
* The *date added* of a document is the date the document was scanned into
 | 
					 | 
				
			||||||
  paperless. You cannot and should not change this date.
 | 
					 | 
				
			||||||
* The *date created* of a document is the date the document was initially issued.
 | 
					 | 
				
			||||||
  This can be the date you bought a product, the date you signed a contract, or
 | 
					 | 
				
			||||||
  the date a letter was sent to you.
 | 
					 | 
				
			||||||
* The *archive serial number* (short: ASN) of a document is the identifier of
 | 
					 | 
				
			||||||
  the document in your physical document binders. See
 | 
					 | 
				
			||||||
  :ref:`usage-recommended_workflow` below.
 | 
					 | 
				
			||||||
* The *content* of a document is the text that was OCR'ed from the document.
 | 
					 | 
				
			||||||
  This text is fed into the search engine and is used for matching tags,
 | 
					 | 
				
			||||||
  correspondents and document types.
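As a reading aid, the fields described above can be pictured as a simple record. This is a sketch only, not the data model Paperless uses; the class and attribute names are assumptions derived from the terms in the list.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Document:
    content: str                                   # text obtained by OCR
    date_added: date                               # when it entered the archive
    date_created: Optional[date] = None            # when it was issued
    correspondent: Optional[str] = None
    document_type: Optional[str] = None
    archive_serial_number: Optional[int] = None    # position in the physical binder
    tags: set = field(default_factory=set)


invoice = Document(
    content="Invoice for one document scanner",
    date_added=date(2021, 1, 25),
    correspondent="Example Store",
    document_type="invoice",
    tags={"taxes", "hardware"},
)
```

The ``tags`` set shows the difference from folders: one document can carry several tags at once, and the same tag can appear on any number of documents.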
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Frontend overview
 | 
					 | 
				
			||||||
#################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. warning::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    TBD. Add some fancy screenshots!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Adding documents to paperless
 | 
					 | 
				
			||||||
#############################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Once you've got Paperless set up, you need to start feeding documents into it.
 | 
					 | 
				
			||||||
When adding documents to paperless, it will perform the following operations on
 | 
					 | 
				
			||||||
your documents:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  OCR the document, if it has no text. Digital documents usually have text,
 | 
					 | 
				
			||||||
    and this step will be skipped for those documents.
 | 
					 | 
				
			||||||
2.  Paperless will create an archivable PDF/A document from your document.
 | 
					 | 
				
			||||||
    If this document is coming from your scanner, it will have embedded selectable text.
 | 
					 | 
				
			||||||
3.  Paperless performs automatic matching of tags, correspondents and types on the
 | 
					 | 
				
			||||||
    document before storing it in the database.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    This process can be configured to fit your needs. If you don't want paperless
 | 
					 | 
				
			||||||
    to create archived versions for digital documents, you can configure that by
 | 
					 | 
				
			||||||
    setting ``PAPERLESS_OCR_MODE=skip_noarchive``. Please read the
 | 
					 | 
				
			||||||
    :ref:`relevant section in the documentation <configuration-ocr>`.
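
The decision described in the steps and hint above can be sketched roughly as follows. This is only an illustration, not Paperless's actual code: the function name ``plan_processing`` is hypothetical, and the sketch assumes that ``skip`` is the mode in which text-bearing documents skip OCR but still get an archived version. Other modes are deliberately not covered.

```python
from typing import Tuple


def plan_processing(has_text: bool, ocr_mode: str = "skip") -> Tuple[bool, bool]:
    """Return (run_ocr, create_archive) for a single document.

    Hypothetical helper: only the two modes discussed in this section
    are modelled; "skip" as the baseline mode is an assumption.
    """
    if ocr_mode not in ("skip", "skip_noarchive"):
        raise ValueError(f"mode not covered by this sketch: {ocr_mode}")
    # Step 1: OCR only runs when the document has no text yet.
    run_ocr = not has_text
    # skip_noarchive: digital documents (already containing text) get no archive version.
    create_archive = not (has_text and ocr_mode == "skip_noarchive")
    return run_ocr, create_archive
```

In both modes the original file is kept, as the note below explains; only the creation of the archived version differs.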
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    No matter which options you choose, Paperless will always store the original
 | 
					 | 
				
			||||||
    document that it found in the consumption directory or in the mail and
 | 
					 | 
				
			||||||
    will never overwrite that document. Archived versions are stored alongside the
 | 
					 | 
				
			||||||
    original versions.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The consumption directory
 | 
					 | 
				
			||||||
=========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The primary method of getting documents into your database is by putting them in
 | 
					 | 
				
			||||||
the consumption directory.  The consumer runs in an infinite loop, looking for new
 | 
					 | 
				
			||||||
additions to this directory. When it finds them, the consumer goes about the process
 | 
					 | 
				
			||||||
of parsing them with the OCR, indexing what it finds, and storing it in the media directory.
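
To make the idea of "looking for new additions" concrete, here is a minimal polling sketch. It is not the consumer that ships with Paperless; ``find_new_files`` and ``watch`` are hypothetical names, and the ``handle`` callback stands in for the OCR, indexing and storing steps.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set


def find_new_files(directory: Path, seen: Set[Path]) -> List[Path]:
    """Return files in `directory` that have not been reported before."""
    new_files = sorted(
        p for p in directory.iterdir() if p.is_file() and p not in seen
    )
    seen.update(new_files)
    return new_files


def watch(
    directory: Path,
    handle: Callable[[Path], None],
    interval: float = 1.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll `directory` and call `handle(path)` once per new file.

    With max_polls=None the loop never ends, like a long-running consumer.
    """
    seen: Set[Path] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_files(directory, seen):
            handle(path)
        polls += 1
        time.sleep(interval)
```

Calling ``watch(Path("consume"), print)`` would print each file once as it appears; the directory name here is just an example.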
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Getting stuff into this directory is up to you.  If you're running Paperless
 | 
					 | 
				
			||||||
on your local computer, you might just want to drag and drop files there, but if
 | 
					 | 
				
			||||||
you're running this on a server and want your scanner to automatically push
 | 
					 | 
				
			||||||
files to this directory, you'll need to set up some sort of service to accept the
 | 
					 | 
				
			||||||
files from the scanner.  Typically, you're looking at an FTP server like
 | 
					 | 
				
			||||||
`Proftpd`_ or a Windows folder share with `Samba`_.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _Proftpd: http://www.proftpd.org/
 | 
					 | 
				
			||||||
.. _Samba: http://www.samba.org/
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. TODO: hyperref to configuration of the location of this magic folder.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Web UI Upload
 | 
					 | 
				
			||||||
=============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The dashboard has a file drop field to upload documents to paperless. Simply drag a file
 | 
					 | 
				
			||||||
onto this field or select a file with the file dialog. Multiple files are supported.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can also upload documents on any other page of the web UI by dragging and dropping
 | 
					 | 
				
			||||||
files into your browser window.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _usage-mobile_upload:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Mobile upload
 | 
					 | 
				
			||||||
=============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The mobile app over at `<https://github.com/qcasey/paperless_share>`_ allows Android users
 | 
					 | 
				
			||||||
to share any documents with paperless. This can be combined with any of the mobile
 | 
					 | 
				
			||||||
scanning apps out there, such as Office Lens.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Furthermore, there is the `Paperless App <https://github.com/bauerj/paperless_app>`_,
 | 
					 | 
				
			||||||
which not only has document upload, but also document browsing and download features.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _usage-email:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
IMAP (Email)
 | 
					 | 
				
			||||||
============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can tell paperless-ngx to consume documents from your email accounts.
 | 
					 | 
				
			||||||
This is a very flexible and powerful feature if you regularly receive documents
 | 
					 | 
				
			||||||
via mail that you need to archive. The mail consumer can be configured by using the
 | 
					 | 
				
			||||||
admin interface in the following manner:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Define e-mail accounts.
 | 
					 | 
				
			||||||
2.  Define mail rules for your account.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
These rules perform the following:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Connect to the mail server.
 | 
					 | 
				
			||||||
2.  Fetch all matching mails (as defined by folder, maximum age and the filters).
 | 
					 | 
				
			||||||
3.  Check if there are any consumable attachments.
 | 
					 | 
				
			||||||
4.  If so, instruct paperless to consume the attachments and optionally
 | 
					 | 
				
			||||||
    use the metadata provided in the rule for the new document.
 | 
					 | 
				
			||||||
5.  If documents were consumed from a mail, the rule action is performed
 | 
					 | 
				
			||||||
    on that mail.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless will completely ignore mails that do not match your filters. It will also
 | 
					 | 
				
			||||||
only perform the action on mails that it has consumed documents from.
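
The rule steps above can be sketched in a few lines of plain Python. This is a simplified model for illustration, not the mail consumer's real implementation: the ``Mail`` and ``Rule`` classes, their field names and the list of consumable file suffixes are all assumptions, and the action shown is "mark as read".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class Mail:
    folder: str
    sender: str
    subject: str
    received: datetime
    attachments: List[str] = field(default_factory=list)
    read: bool = False


@dataclass
class Rule:
    folder: str
    maximum_age_days: int
    from_contains: str = ""
    subject_contains: str = ""

    def matches(self, mail: Mail, now: datetime) -> bool:
        """Folder, maximum age and filters must all match; read mails are skipped."""
        return (
            mail.folder == self.folder
            and not mail.read
            and now - mail.received <= timedelta(days=self.maximum_age_days)
            and self.from_contains.lower() in mail.sender.lower()
            and self.subject_contains.lower() in mail.subject.lower()
        )


# Assumption for this sketch: which attachments count as consumable.
CONSUMABLE_SUFFIXES = (".pdf", ".png", ".jpg")


def process(rule: Rule, mails: List[Mail], now: datetime) -> List[str]:
    """Return the attachments to consume and mark their mails as read."""
    consumed: List[str] = []
    for mail in mails:
        if not rule.matches(mail, now):
            continue  # mails that do not match the filters are ignored
        documents = [
            a for a in mail.attachments if a.lower().endswith(CONSUMABLE_SUFFIXES)
        ]
        if documents:
            consumed.extend(documents)
            mail.read = True  # the action only touches mails that yielded documents
    return consumed
```

Running ``process`` twice over the same mails returns nothing the second time, because the "mark as read" action prevents a mail from being consumed again.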
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The actions all ensure that the same mail is not consumed twice by different means.
 | 
					 | 
				
			||||||
These are as follows:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
*   **Delete:** Immediately deletes mail that paperless has consumed documents from.
 | 
					 | 
				
			||||||
    Use with caution.
 | 
					 | 
				
			||||||
*   **Mark as read:** Mark consumed mail as read. Paperless will not consume documents
 | 
					 | 
				
			||||||
    from already read mails. If you read a mail before paperless sees it, it will be
 | 
					 | 
				
			||||||
    ignored.
 | 
					 | 
				
			||||||
*   **Flag:** Sets the 'important' flag on mails with consumed documents. Paperless
 | 
					 | 
				
			||||||
    will not consume flagged mails.
 | 
					 | 
				
			||||||
*   **Move to folder:** Moves consumed mails out of the way so that paperless won't
 | 
					 | 
				
			||||||
    consume them again.
 | 
					 | 
				
			||||||
*   **Add custom Tag:** Adds a custom tag to mails with consumed documents (the IMAP
 | 
					 | 
				
			||||||
    standard calls these "keywords"). Paperless will not consume mails already tagged.
 | 
					 | 
				
			||||||
    Not all mail servers support this feature!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. caution::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    The mail consumer will perform these actions on all mails it has consumed
 | 
					 | 
				
			||||||
    documents from. Keep in mind that the actual consumption process may fail
 | 
					 | 
				
			||||||
    for some reason, leaving you with missing documents in paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    With the correct set of rules, you can completely automate your email documents.
 | 
					 | 
				
			||||||
    Create rules for every correspondent you receive digital documents from and
 | 
					 | 
				
			||||||
    paperless will read them automatically. The default action "mark as read" is
 | 
					 | 
				
			||||||
    pretty tame and will not cause any damage or data loss whatsoever.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You can also set up a special folder in your mail account for paperless and use
    your favorite mail client to move to-be-consumed mails into that folder
 | 
					 | 
				
			||||||
    automatically or manually and tell paperless to move them to yet another folder
 | 
					 | 
				
			||||||
    after consumption. It's up to you.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    When defining a mail rule with a folder, you may need to try different characters to
 | 
					 | 
				
			||||||
    define how the sub-folders are separated.  Common values include ".", "/" or "|", but
 | 
					 | 
				
			||||||
    this varies by the mail server.  Check the documentation for your mail server.  In the
 | 
					 | 
				
			||||||
    event of an error fetching mail from a certain folder, check the Paperless logs.  When
 | 
					 | 
				
			||||||
    a folder is not located, Paperless will attempt to list all folders found in the account
 | 
					 | 
				
			||||||
    to the Paperless logs.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    Paperless will process the rules in the order defined in the admin page.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You can define catch-all rules and have them executed last to consume
 | 
					 | 
				
			||||||
    any documents not matched by previous rules. Such a rule may assign an "Unknown
 | 
					 | 
				
			||||||
    mail document" tag to consumed documents so you can inspect them further.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless is set up to check your mails every 10 minutes. This can be configured on the
 | 
					 | 
				
			||||||
'Scheduled tasks' page in the admin.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
REST API
 | 
					 | 
				
			||||||
========
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
You can also submit a document using the REST API, see :ref:`api-file_uploads` for details.
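
As a rough sketch of what such an upload could look like using only the Python standard library: the endpoint path ``/api/documents/post_document/``, the form field name ``document`` and token authentication are assumptions here, so check the API documentation for your version before relying on them. The helper ``build_upload_request`` is hypothetical and only builds the request; it does not send it.

```python
import uuid
from urllib.request import Request


def build_upload_request(
    base_url: str, token: str, filename: str, content: bytes
) -> Request:
    """Build a multipart/form-data POST request carrying one document."""
    boundary = uuid.uuid4().hex
    body = b"".join(
        [
            f"--{boundary}\r\n".encode(),
            # Assumed form field name: "document".
            f'Content-Disposition: form-data; name="document"; '
            f'filename="{filename}"\r\n'.encode(),
            b"Content-Type: application/octet-stream\r\n\r\n",
            content,
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    return Request(
        # Assumed endpoint path; verify against the API documentation.
        url=f"{base_url.rstrip('/')}/api/documents/post_document/",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to ``urllib.request.urlopen`` would send it to the server.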
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _basic-searching:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Best practices
 | 
					 | 
				
			||||||
##############
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless offers a couple tools that help you organize your document collection. However,
 | 
					 | 
				
			||||||
it is up to you to use them in a way that helps you organize documents and find specific
 | 
					 | 
				
			||||||
documents when you need them. This section offers a couple ideas for managing your collection.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Document types allow you to classify documents according to what they are. You can define
 | 
					 | 
				
			||||||
types such as "Receipt", "Invoice", or "Contract". If you used to collect all your receipts
 | 
					 | 
				
			||||||
in a single binder, you can recreate that system in paperless by defining a document type,
 | 
					 | 
				
			||||||
assigning documents to that type and then filtering by that type to see only receipts.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Not all documents need document types. Sometimes it's hard to determine what the type of a
 | 
					 | 
				
			||||||
document is or it is hard to justify creating a document type that you only need once or twice.
 | 
					 | 
				
			||||||
This is okay. As long as the types you define help you organize your collection in the way
 | 
					 | 
				
			||||||
you want, paperless is doing its job.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Tags can be used in many different ways. Think of tags as more versatile folders or binders.
If you have a binder for documents related to university, your car or health care, you can
recreate these binders in paperless by creating tags and assigning them to relevant documents.
Just as with document types, you can filter the document list by tags and only see documents of
a certain topic.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
With physical documents, you'll often need to decide which folder the document belongs to.
 | 
					 | 
				
			||||||
The advantage of tags over folders and binders is that a single document can have multiple
 | 
					 | 
				
			||||||
tags. A physical document cannot magically appear in two different folders, but with tags,
 | 
					 | 
				
			||||||
this is entirely possible.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
  This can be used in many different ways. One example: Imagine you're working on a particular
 | 
					 | 
				
			||||||
  task, such as signing up for university. Usually you'll need to collect a bunch of different
 | 
					 | 
				
			||||||
  documents that are already sorted into various folders. With the tag system of paperless,
 | 
					 | 
				
			||||||
  you can create a new group of documents that are relevant to this task without destroying
 | 
					 | 
				
			||||||
  the already existing organization. When you're done with the task, you could delete the
 | 
					 | 
				
			||||||
  tag again, which would be equal to sorting documents back into the folder they belong in.
 | 
					 | 
				
			||||||
  Or keep the tag, up to you.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
All of the logic above applies to correspondents as well. Attach them to documents if you
 | 
					 | 
				
			||||||
feel that they help you organize your collection.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When you've started organizing your documents, create a couple saved views for document collections
 | 
					 | 
				
			||||||
you regularly access. This is equal to having labeled physical binders on your desk, except
 | 
					 | 
				
			||||||
that these saved views are dynamic and simply update themselves as you add documents to the system.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Here are a couple examples of tags and types that you could use in your collection.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* An ``inbox`` tag for newly added documents that you haven't manually edited yet.
 | 
					 | 
				
			||||||
* A tag ``car`` for everything car related (repairs, registration, insurance, etc.).
 | 
					 | 
				
			||||||
* A tag ``todo`` for documents that you still need to do something with, such as reply, or
 | 
					 | 
				
			||||||
  perform some task online.
 | 
					 | 
				
			||||||
* A tag ``bank account x`` for all bank statements related to that account.
 | 
					 | 
				
			||||||
* A tag ``mail`` for anything that you added to paperless via its mail processing capabilities.
 | 
					 | 
				
			||||||
* A tag ``missing_metadata`` when you still need to add some metadata to a document, but can't
 | 
					 | 
				
			||||||
  or don't want to do this right now.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _basic-usage_searching:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Searching
 | 
					 | 
				
			||||||
#########
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Paperless offers an extensive searching mechanism that is designed to allow you to quickly
 | 
					 | 
				
			||||||
find a document you're looking for (for example, that thing that just broke and you bought
 | 
					 | 
				
			||||||
a couple months ago, that contract you signed 8 years ago).
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
When you search paperless for a document, it tries to match this query against your documents.
 | 
					 | 
				
			||||||
Paperless will look for matching documents by inspecting their content, title, correspondent,
 | 
					 | 
				
			||||||
type and tags. Paperless returns a scored list of results, so that documents matching your query
 | 
					 | 
				
			||||||
better will appear further up in the search results.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
By default, paperless returns only documents which contain all words typed in the search bar.
 | 
					 | 
				
			||||||
However, paperless also offers advanced search syntax if you want to narrow down the results
 | 
					 | 
				
			||||||
further.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Matching documents with logical expressions:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::

  shopname AND (product1 OR product2)
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Matching specific tags, correspondents or types:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::

  type:invoice tag:unpaid
  correspondent:university certificate
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Matching dates:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::

  created:[2005 to 2009]
  added:yesterday
  modified:today
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Matching inexact words:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. code::

  produ*name
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. note::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
  Inexact terms are hard for search indexes. These queries might take a while to execute. That's why paperless offers
  autocomplete and query correction.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
All of these constructs can be combined as you see fit.
 | 
					 | 
				
			||||||
Paperless uses Whoosh's default query language. To learn more about it, head over to
`Whoosh query language <https://whoosh.readthedocs.io/en/latest/querylang.html>`_.
 | 
					 | 
				
			||||||
For details on what date parsing utilities are available, see
 | 
					 | 
				
			||||||
`Date parsing <https://whoosh.readthedocs.io/en/latest/dates.html#parsing-date-queries>`_.
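
If you assemble such queries in a script, a small helper can keep the syntax consistent. The following is a minimal sketch: ``build_query`` and its parameters are hypothetical names, it only produces the query forms shown in the examples above, and it does not quote or escape values that contain spaces.

```python
from typing import Iterable, Optional, Tuple


def build_query(
    words: Iterable[str] = (),
    tags: Iterable[str] = (),
    doc_type: Optional[str] = None,
    correspondent: Optional[str] = None,
    created_range: Optional[Tuple[int, int]] = None,
) -> str:
    """Combine field filters and free-text words into one query string."""
    parts = []
    if doc_type:
        parts.append(f"type:{doc_type}")
    if correspondent:
        parts.append(f"correspondent:{correspondent}")
    parts.extend(f"tag:{tag}" for tag in tags)
    if created_range:
        start, end = created_range
        parts.append(f"created:[{start} to {end}]")
    # Free-text words go last, as in "correspondent:university certificate".
    parts.extend(words)
    return " ".join(parts)
```

For example, ``build_query(doc_type="invoice", tags=["unpaid"])`` produces the string ``type:invoice tag:unpaid``, which can be pasted into the search bar.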
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. _usage-recommended_workflow:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The recommended workflow
 | 
					 | 
				
			||||||
########################
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Once you have familiarized yourself with paperless and are ready to use it
 | 
					 | 
				
			||||||
for all your documents, the recommended workflow for managing your documents
 | 
					 | 
				
			||||||
is as follows. This workflow also takes into account that some documents
 | 
					 | 
				
			||||||
have to be kept in physical form, but still ensures that you get all the
 | 
					 | 
				
			||||||
advantages for these documents as well.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The following diagram shows how easy it is to manage your documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. image:: _static/recommended_workflow.png
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Preparations in paperless
 | 
					 | 
				
			||||||
=========================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* Create an inbox tag that gets assigned to all new documents.
 | 
					 | 
				
			||||||
* Create a TODO tag.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Processing of the physical documents
 | 
					 | 
				
			||||||
====================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Keep a physical inbox. Whenever you receive a document that you need to
 | 
					 | 
				
			||||||
archive, put it into your inbox. Regularly, do the following for all documents
 | 
					 | 
				
			||||||
in your inbox:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  For each document, decide if you need to keep the document in physical
 | 
					 | 
				
			||||||
    form. This applies to certain important documents, such as contracts and
 | 
					 | 
				
			||||||
    certificates.
 | 
					 | 
				
			||||||
2.  If you need to keep the document, write a running number on the document
 | 
					 | 
				
			||||||
    before scanning, starting at one and counting upwards. This is the archive
 | 
					 | 
				
			||||||
    serial number, or ASN in short.
 | 
					 | 
				
			||||||
3.  Scan the document.
 | 
					 | 
				
			||||||
4.  If the document has an ASN assigned, store it in a *single* binder, sorted
 | 
					 | 
				
			||||||
    by ASN. Don't order this binder in any other way.
 | 
					 | 
				
			||||||
5.  If the document has no ASN, throw it away. Yay!
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Over time, you will notice that your physical binder will fill up. If it is
 | 
					 | 
				
			||||||
full, label the binder with the range of ASNs in this binder (e.g., "Documents
 | 
					 | 
				
			||||||
1 to 343"), store the binder in your cellar or elsewhere, and start a new
 | 
					 | 
				
			||||||
binder.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
The idea behind this process is that you will never have to use the physical
 | 
					 | 
				
			||||||
binders to find a document. If you need a specific physical document, you
 | 
					 | 
				
			||||||
may find this document by:
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  Searching in paperless for the document.
 | 
					 | 
				
			||||||
2.  Identifying the ASN of the document, since it appears on the scan.
 | 
					 | 
				
			||||||
3.  Grabbing the relevant document binder and getting the document. This is easy since
 | 
					 | 
				
			||||||
    they are sorted by ASN.
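
Because each binder is labeled with a contiguous ASN range, finding the right binder is a simple range lookup. The sketch below illustrates this with a binary search; the binder list, the storage locations and the function ``find_binder`` are made up for the example and are not part of paperless.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

# Hypothetical binder labels: (first ASN, last ASN, where the binder is stored).
BINDERS: List[Tuple[int, int, str]] = [
    (1, 343, "cellar, shelf 1"),
    (344, 702, "cellar, shelf 2"),
    (703, 910, "desk"),
]


def find_binder(
    asn: int, binders: List[Tuple[int, int, str]] = BINDERS
) -> Optional[str]:
    """Return the location of the binder holding `asn`, or None if unknown.

    Assumes `binders` is sorted by ASN and the ranges do not overlap.
    """
    last_asns = [last for _, last, _ in binders]
    # First binder whose last ASN is >= the ASN we are looking for.
    i = bisect_left(last_asns, asn)
    if i < len(binders) and binders[i][0] <= asn:
        return binders[i][2]
    return None
```

With only a handful of binders a linear scan would work just as well; the point is that sorting by ASN makes the lookup trivial.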
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Processing of documents in paperless
 | 
					 | 
				
			||||||
====================================
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Once you have scanned in a document, proceed in paperless as follows.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
1.  If the document has an ASN, assign the ASN to the document.
 | 
					 | 
				
			||||||
2.  Assign a correspondent to the document (e.g., your employer, bank, etc.).
 | 
					 | 
				
			||||||
    This isn't strictly necessary but helps in finding a document when you need
 | 
					 | 
				
			||||||
    it.
 | 
					 | 
				
			||||||
3.  Assign a document type (e.g., invoice, bank statement, etc.) to the document.
 | 
					 | 
				
			||||||
    This isn't strictly necessary but helps in finding a document when you need
 | 
					 | 
				
			||||||
    it.
 | 
					 | 
				
			||||||
4.  Assign a proper title to the document (the name of an item you bought, the
 | 
					 | 
				
			||||||
    subject of the letter, etc.).
 | 
					 | 
				
			||||||
5.  Check that the date of the document is correct. Paperless tries to read
 | 
					 | 
				
			||||||
    the date from the content of the document, but this fails sometimes if the
 | 
					 | 
				
			||||||
    OCR is bad or multiple dates appear on the document.
 | 
					 | 
				
			||||||
6.  Remove inbox tags from the documents.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
.. hint::
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
    You can set up manual matching rules for your correspondents and tags and
 | 
					 | 
				
			||||||
    paperless will assign them automatically. After consuming a couple documents,
 | 
					 | 
				
			||||||
    you can even ask paperless to *learn* when to assign tags and correspondents
 | 
					 | 
				
			||||||
    by itself. For details on this feature, see :ref:`advanced-matching`.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Task management
 | 
					 | 
				
			||||||
===============
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
Some documents require attention: you need to act on them. You
 | 
					 | 
				
			||||||
may take two different approaches to handle these documents based on how
 | 
					 | 
				
			||||||
regularly you intend to scan documents and use paperless.
 | 
					 | 
				
			||||||
 | 
					 | 
				
			||||||
* If you scan and process your documents in paperless regularly, assign a
 | 
					 | 
				
			||||||
  TODO tag to all scanned documents that you need to process. Create a saved
 | 
					 | 
				
			||||||
  view on the dashboard that shows all documents with this tag.
 | 
					 | 
				
			||||||
* If you do not scan documents regularly and use paperless solely for archiving,
 | 
					 | 
				
			||||||
  create a physical TODO box next to your physical inbox and put documents you
  need to process in the TODO box. When you have performed the task associated with
 | 
					 | 
				
			||||||
  the document, move it to the inbox.
 | 
					 | 
				
			||||||
 
 | 
				
			|||||||