.. _setup:

Setup
=====

Paperless isn't a very complicated app, but there are a few components, so some
basic documentation is in order.  If you follow along in this document and
still have trouble, please open an `issue on GitHub`_ so I can fill in the gaps.

.. _issue on GitHub: https://github.com/danielquinn/paperless/issues


.. _setup-download:

Download
--------

The source is currently only available via GitHub, so grab it from there, either
by using ``git``:

.. code:: bash

    $ git clone https://github.com/danielquinn/paperless.git
    $ cd paperless

or just download the zip archive and go that route:

.. code:: bash

    $ wget https://github.com/danielquinn/paperless/archive/master.zip
    $ unzip master.zip
    $ cd paperless-master


.. _setup-installation:

Installation & Configuration
----------------------------

You can go multiple routes with setting up and running Paperless.  The `Vagrant
route`_ is quick & easy, but means you're running a VM which comes with memory
consumption etc.  We also `support Docker`_, which you can use natively under
Linux and in a VM with `Docker Machine`_ (this guide was written for native
Docker usage under Linux; you might have to adapt it for Docker Machine).
Alternatively the standard, `bare metal`_ approach is a little more complicated.

.. _Vagrant route: setup-installation-vagrant_
.. _support Docker: setup-installation-docker_
.. _bare metal: setup-installation-standard_

.. _Docker Machine: https://docs.docker.com/machine/

.. _setup-installation-standard:

Standard (Bare Metal)
.....................

1. Install the requirements as per the :ref:`requirements <requirements>` page.
2. Change to the ``src`` directory in this repo.
3. Edit ``paperless/settings.py`` and be sure to set the values for:

   * ``CONSUMPTION_DIR``: this is where your documents will be dumped to be
     consumed by Paperless.
   * ``PASSPHRASE``: this is the passphrase Paperless uses to encrypt/decrypt
     the original document.  The default value attempts to source the
     passphrase from the environment, so if you don't set it to a static value
     here, you must set ``PAPERLESS_PASSPHRASE=some-secret-string`` on the
     command line whenever invoking the consumer or webserver.
   * ``OCR_THREADS``: this is the number of threads the OCR process will spawn
     to process document pages in parallel.  The default value gets sourced from
     the environment variable ``PAPERLESS_OCR_THREADS`` and expects it to be an
     integer.  If the variable is not set, Python determines the core count of
     your CPU and uses that value.

4. Initialise the database with ``./manage.py migrate``.
5. Create a user for your Paperless instance with
   ``./manage.py createsuperuser``.  Follow the prompts to create your user.
6. Start the webserver with ``./manage.py runserver <IP>:<PORT>``.
   If no specific IP or port is given, the default is ``127.0.0.1:8000``.
   You should now be able to visit your (empty) `Paperless webserver`_ at
   ``127.0.0.1:8000`` (or whatever you chose).  You can log in with the
   user/pass you created in step 5.
7. In a separate window, change to the ``src`` directory in this repo again, but
   this time, you should start the consumer script with
   ``./manage.py document_consumer``.
8. Scan something.  Put it in the ``CONSUMPTION_DIR``.
9. Wait a few minutes.
10. Visit the document list on your webserver, and it should be there, indexed
    and downloadable.

.. _Paperless webserver: http://127.0.0.1:8000


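The environment fallback described in step 3 for ``OCR_THREADS`` can be
mirrored in a small shell helper, which is handy when you want to see which
value a consumer would end up with before starting it.  This is only a sketch:
the function name ``ocr_threads`` is made up for illustration, and
``getconf _NPROCESSORS_ONLN`` is assumed to be available on your system (it is
on most Linux distributions).

.. code:: bash

    # Hypothetical helper, not part of Paperless: print the thread count that
    # results from the rule "use PAPERLESS_OCR_THREADS if set, else CPU count".
    ocr_threads() {
        if [ -n "${PAPERLESS_OCR_THREADS:-}" ]; then
            # An explicit value from the environment wins.
            printf '%s\n' "$PAPERLESS_OCR_THREADS"
        else
            # Fall back to the number of online CPU cores.
            getconf _NPROCESSORS_ONLN
        fi
    }

With the helper defined, a consumer could be started with both variables set
explicitly, for example
``PAPERLESS_PASSPHRASE=some-secret-string PAPERLESS_OCR_THREADS="$(ocr_threads)" ./manage.py document_consumer``.
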
.. _setup-installation-vagrant:

Vagrant Method
..............

1. Install `Vagrant`_.  How you do that is really between you and your OS.
2. Run ``vagrant up``.  An instance will start up for you.  When it's ready and
   provisioned...
3. Run ``vagrant ssh`` and once inside your new vagrant box, edit
   ``/opt/paperless/src/paperless/settings.py`` and set the values for:

   * ``CONSUMPTION_DIR``: this is where your documents will be dumped to be
     consumed by Paperless.
   * ``PASSPHRASE``: this is the passphrase Paperless uses to encrypt/decrypt
     the original document.  The default value attempts to source the
     passphrase from the environment, so if you don't set it to a static value
     here, you must set ``PAPERLESS_PASSPHRASE=some-secret-string`` on the
     command line whenever invoking the consumer or webserver.

4. Initialise the database with ``/opt/paperless/src/manage.py migrate``.
5. Still inside your vagrant box, create a user for your Paperless instance with
   ``/opt/paperless/src/manage.py createsuperuser``.  Follow the prompts to
   create your user.
6. Start the webserver with ``/opt/paperless/src/manage.py runserver 0.0.0.0:8000``.
   You should now be able to visit your (empty) `Paperless server`_ at
   ``172.28.128.4:8000``.  You can log in with the user/pass you created in
   step 5.
7. In a separate window, run ``vagrant ssh`` again, but this time once inside
   your vagrant instance, you should start the consumer script with
   ``/opt/paperless/src/manage.py document_consumer``.
8. Scan something.  Put it in the ``CONSUMPTION_DIR``.
9. Wait a few minutes.
10. Visit the document list on your webserver, and it should be there, indexed
    and downloadable.

.. _Vagrant: https://vagrantup.com/
.. _Paperless server: http://172.28.128.4:8000


.. _setup-installation-docker:

Docker Method
.............

1. Install `Docker`_.

   .. caution::

      As mentioned earlier, this guide assumes that you use Docker natively
      under Linux.  If you are using `Docker Machine`_ under Mac OS X or Windows,
      you will have to adapt IP addresses, volume-mounting, command execution
      and maybe more.

2. Install `docker-compose`_. [#compose]_

   .. caution::

      If you want to use the included ``docker-compose.yml.example`` file, you
      need to have at least Docker version **1.10.0** and docker-compose
      version **1.6.0**.

      See the `Docker installation guide`_ on how to install the current
      version of Docker for your operating system or Linux distribution of
      choice.  To get an up-to-date version of docker-compose, follow the
      `docker-compose installation guide`_ if your package repository doesn't
      include it.

      .. _Docker installation guide: https://docs.docker.com/engine/installation/
      .. _docker-compose installation guide: https://docs.docker.com/compose/install/

3. Create a copy of ``docker-compose.yml.example`` as ``docker-compose.yml``.
4. Modify ``docker-compose.env`` and adapt the following environment variables:

   ``PAPERLESS_PASSPHRASE``
      This is the passphrase Paperless uses to encrypt/decrypt the original
      document.

   ``PAPERLESS_OCR_THREADS``
      This is the number of threads the OCR process will spawn to process
      document pages in parallel.  If the variable is not set, Python determines
      the core count of your CPU and uses that value.

   ``PAPERLESS_OCR_LANGUAGES``
      If you want the OCR to recognize other languages in addition to the default
      English, set this parameter to a space-separated list of three-letter
      language codes according to `ISO 639-2/T`_.  For a list of available
      languages -- including their three-letter codes -- see the
      `Debian packagelist`_.

   ``USERMAP_UID`` and ``USERMAP_GID``
      If you want to mount the consumption volume (directory ``/consume`` within
      the containers) to a host directory -- which you probably want to do --
      access rights might be an issue.  The default user and group ``paperless``
      in the containers have an id of 1000.  The containers will enforce that the
      owning group of the consumption directory is ``paperless`` so that they
      are able to delete consumed documents.  If your host system has a group
      with an id of 1000 and you don't want this group to have access rights to
      the consumption directory, you can use ``USERMAP_GID`` to change the id in
      the container and thus the one of the consumption directory.  Furthermore,
      you can change the id of the default user as well using ``USERMAP_UID``.

5. Run ``docker-compose up -d``.  This will create and start the necessary
   containers.
6. To be able to log in, you will need a superuser.  To create it, execute the
   following command:

   .. code-block:: shell-session

      $ docker-compose run --rm webserver createsuperuser

   This will prompt you to set a username (default ``paperless``), an optional
   e-mail address and finally a password.
7. The default ``docker-compose.yml`` exports the webserver on your local
   port 8000.  If you haven't adapted this, you should now be able to visit your
   `Paperless webserver`_ at ``http://127.0.0.1:8000``.  You can log in with the
   user and password you just created.
8. Add files to the consumption directory in whichever way you prefer.  Two
   possible options follow:

   1. Mount the consumption directory to a local host path by modifying your
      ``docker-compose.yml``:

      .. code-block:: diff

         diff --git a/docker-compose.yml b/docker-compose.yml
         --- a/docker-compose.yml
         +++ b/docker-compose.yml
         @@ -17,9 +18,8 @@ services:
              volumes:
                - paperless-data:/usr/src/paperless/data
                - paperless-media:/usr/src/paperless/media
         -      - /consume
         +      - /local/path/you/choose:/consume

      .. danger::

         While the consumption container will ensure at startup that it can
         **delete** a consumed file from a host-mounted directory, it might not
         be able to **read** the document in the first place if the access
         rights to the file are incorrect.

         Make sure that the documents you put into the consumption directory
         will either be readable by everyone (``chmod o+r file.pdf``) or
         readable by the default user or group id 1000 (or the one you have set
         with ``USERMAP_UID`` or ``USERMAP_GID`` respectively).

   2. Use ``docker cp`` to copy your files directly into the container:

      .. code-block:: shell-session

         $ # Identify your containers
         $ docker-compose ps
                 Name                       Command                State     Ports
         -------------------------------------------------------------------------
         paperless_consumer_1    /sbin/docker-entrypoint.sh ...   Exit 0
         paperless_webserver_1   /sbin/docker-entrypoint.sh ...   Exit 0

         $ docker cp /path/to/your/file.pdf paperless_consumer_1:/consume

      ``docker cp`` is a one-shot command, just like ``cp``.  This means that
      every time you want to consume a new document, you will have to execute
      ``docker cp`` again.  You can of course automate this process, but option 1
      is generally the preferred one.

      .. danger::

         ``docker cp`` will change the owning user and group of a copied file
         to the acting user at the destination, which will be ``root``.

         You therefore need to ensure that the documents you want to copy into
         the container are readable by everyone (``chmod o+r file.pdf``) before
         copying them.


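Both warnings in step 8 come down to the same question: can "others" read the
file?  A quick check before handing a document to the consumer might look like
the sketch below.  The function name ``is_world_readable`` is made up for this
example, and it relies on ``find`` accepting a symbolic ``-perm`` mode, which
GNU and BSD ``find`` both do.

.. code:: bash

    # Hypothetical helper, not part of Paperless: succeed only if the given
    # file has the "read" bit set for others (what ``chmod o+r`` grants).
    is_world_readable() {
        # -prune keeps find from descending if a directory is passed;
        # -perm -o=r matches only when the others-read bit is set.
        [ -n "$(find "$1" -prune -perm -o=r)" ]
    }

A possible use is ``is_world_readable file.pdf || chmod o+r file.pdf`` right
before moving the file into the mounted directory or running ``docker cp``.
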
.. _Docker: https://www.docker.com/
.. _docker-compose: https://docs.docker.com/compose/install/
.. _ISO 639-2/T: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
.. _Debian packagelist: https://packages.debian.org/search?suite=jessie&searchon=names&keywords=tesseract-ocr-

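If you decide to remap the container's user and group as described for
``USERMAP_UID`` and ``USERMAP_GID`` in step 4, one common choice is to use the
ids of the host account that owns the consumption directory.  The sketch below
assumes that this is the account you are currently logged in as; the function
name ``usermap_lines`` is made up for illustration.

.. code:: bash

    # Hypothetical helper, not part of Paperless: print the two settings
    # using the numeric user and group id of the current host user.
    usermap_lines() {
        printf 'USERMAP_UID=%s\n' "$(id -u)"
        printf 'USERMAP_GID=%s\n' "$(id -g)"
    }

The two printed lines can then be copied into ``docker-compose.env`` by hand.
Whether your own ids are the right ones depends on who should have access to
the consumption directory on the host.
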
.. [#compose] You of course don't have to use docker-compose, but it
   simplifies deployment immensely.  If you know your way around Docker, feel
   free to tinker around without using compose!


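The minimum versions named in step 2 (Docker **1.10.0**, docker-compose
**1.6.0**) can be checked with a small comparison helper.  This is a sketch
under two assumptions: the function name ``version_ge`` is made up for this
example, and your ``sort`` supports the ``-V`` (version sort) option, as GNU
coreutils does.

.. code:: bash

    # Hypothetical helper: succeed if version $1 is greater than or equal
    # to version $2, comparing dotted version strings numerically.
    version_ge() {
        # After a version sort, the smaller version comes first; $1 >= $2
        # holds exactly when that smallest entry is $2.
        [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
    }

For example, ``version_ge "$(docker-compose version --short)" 1.6.0`` should
succeed on a sufficiently recent installation, and the same pattern works for
the Docker version reported by ``docker version``.
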
.. _making-things-a-little-more-permanent:

Making Things a Little More Permanent
-------------------------------------

Once you've tested things and are happy with the workflow, you can have the
webserver and consumer start automatically.  If you're running on a bare metal
system that's using Systemd, you can use the service unit files in the
``scripts`` directory to set this up.  If you're on another startup system or
are using a Vagrant box, then you're currently on your own.  If you are using
Docker, you can set a restart-policy_ in the ``docker-compose.yml`` to have the
containers automatically start with the Docker daemon.

.. _restart-policy: https://docs.docker.com/engine/reference/commandline/run/#restart-policies-restart