Add Dockerfile for application and documentation

This commit adds a `Dockerfile` to the root of the project, accompanied
by a `docker-compose.yml.example` for simplified deployment. The
`Dockerfile` is agnostic to whether it will be the webserver, the
consumer, or if it is run for a one-off command (i.e. creation of a
superuser, migration of the database, document export, ...).

The containers entrypoint is the `scripts/docker-entrypoint.sh` script.
This script verifies that the required permissions are set, remaps the
default users and/or groups id if required and installs additional
languages if the user wishes to.

After initialization, it analyzes the command the user supplied:

  - If the command starts with a slash, it is expected that the user
    wants to execute a binary file and the command will be executed
    without further intervention. (Using `exec` to effectively replace
    the started shell-script and not have any reaping-issues.)

  - If the command does not start with a slash, the command will be
    passed directly to the `manage.py` script without further
    modification. (Again using `exec`.)

The default command is set to `--help`.

If the user wants to execute a command that is not meant for `manage.py`
but doesn't start with a slash, the Docker `--entrypoint` parameter can
be used to circumvent the mechanics of `docker-entrypoint.sh`.

Further information can be found in `docs/setup.rst` and in
`docs/migrating.rst`.

For additional convenience, a `Dockerfile` has been added to the `docs/`
directory which allows for easy building and serving of the
documentation. This is documented in `docs/requirements.rst`.
This commit is contained in:
Pit Kleyersburg
2016-02-17 18:45:04 +01:00
parent 57bcb883bf
commit 724afa59c7
10 changed files with 474 additions and 6 deletions

View File

@@ -37,11 +37,18 @@ or just download the tarball and go that route:
Installation & Configuration
----------------------------
You can go two routes with setting up and running Paperless. The *Vagrant*
route is quick & easy, but means you're running a VM which comes with memory
consumption etc. Alternatively the standard, "bare metal" approach is a little
more complicated.
You can go multiple routes with setting up and running Paperless. The `Vagrant
route`_ is quick & easy, but means you're running a VM which comes with memory
consumption etc. We also `support Docker`_, which you can use natively under
Linux and in a VM with `Docker Machine`_ (this guide was written for native
Docker usage under Linux, you might have to adapt it for Docker Machine.)
Alternatively the standard, `bare metal`_ approach is a little more complicated.
.. _Vagrant route: setup-installation-vagrant_
.. _support Docker: setup-installation-docker_
.. _bare metal: setup-installation-standard_
.. _Docker Machine: https://docs.docker.com/machine/
.. _setup-installation-standard:
@@ -118,6 +125,150 @@ Vagrant Method
.. _Paperless server: http://172.28.128.4:8000
.. _setup-installation-docker:
Docker Method
.............
1. Install `Docker`_.
.. caution::
As mentioned earlier, this guide assumes that you use Docker natively
under Linux. If you are using `Docker Machine`_ under Mac OS X or Windows,
you will have to adapt IP addresses, volume-mounting, command execution
and maybe more.
2. Install `docker-compose`_. [#compose]_
.. caution::
If you want to use the included ``docker-compose.yml.example`` file, you
need to have at least Docker version **1.10.0** and docker-compose
version **1.6.0**.
See the `Docker installation guide`_ on how to install the current
version of Docker for your operating system or Linux distribution of
choice. To get an up-to-date version of docker-compose, follow the
`docker-compose installation guide`_ if your package repository doesn't
include it.
.. _Docker installation guide: https://docs.docker.com/engine/installation/
.. _docker-compose installation guide: https://docs.docker.com/compose/install/
3. Create a copy of ``docker-compose.yml.example`` as ``docker-compose.yml``.
4. Modify ``docker-compose.env`` and adapt the following environment variables:
``PAPERLESS_PASSPHRASE``
This is the passphrase Paperless uses to encrypt/decrypt the original
document.
``PAPERLESS_OCR_THREADS``
This is the number of threads the OCR process will spawn to process
document pages in parallel. If the variable is not set, Python determines
the core-count of your CPU and uses that value.
``PAPERLESS_OCR_LANGUAGES``
If you want the OCR to recognize other languages in addition to the default
English, set this parameter to a space separated list of three-letter
language-codes after `ISO 639-2/T`_. For a list of available languages --
including their three letter codes -- see the `Debian packagelist`_.
``USERMAP_UID`` and ``USERMAP_GID``
If you want to mount the consumption volume (directory ``/consume`` within
the containers) to a host-directory -- which you probably want to do --
access rights might be an issue. The default user and group ``paperless``
in the containers have an id of 1000. The containers will enforce that the
owning group of the consumption directory will be ``paperless`` to be able
to delete consumed documents. If your host-system has a group with an id of
1000 and you don't want this group to have access rights to the consumption
directory, you can use ``USERMAP_GID`` to change the id in the container
and thus the one of the consumption directory. Furthermore, you can change
the id of the default user as well using ``USERMAP_UID``.
5. Run ``docker-compose up -d``. This will create and start the necessary
containers.
6. To be able to login, you will need a super user. To create it, execute the
following command:
.. code-block:: shell-session
$ docker-compose run --rm webserver createsuperuser
This will prompt you to set a username (default ``paperless``), an optional
e-mail address and finally a password.
7. The default ``docker-compose.yml`` exports the webserver on your local port
8000. If you haven't adapted this, you should now be able to visit your
`Paperless webserver`_ at ``http://127.0.0.1:8000``. You can login with the
user and password you just created.
8. Add files to consumption directory the way you prefer to. Following are two
possible options:
1. Mount the consumption directory to a local host path by modifying your
``docker-compose.yml``:
.. code-block:: diff
diff --git a/docker-compose.yml b/docker-compose.yml
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -17,9 +18,8 @@ services:
volumes:
- paperless-data:/usr/src/paperless/data
- paperless-media:/usr/src/paperless/media
- - /consume
+ - /local/path/you/choose:/consume
.. danger::
While the consumption container will ensure at startup that it can
**delete** a consumed file from a host-mounted directory, it might not
be able to **read** the document in the first place if the access
rights to the file are incorrect.
Make sure that the documents you put into the consumption directory
will either be readable by everyone (``chmod o+r file.pdf``) or
readable by the default user or group id 1000 (or the one you have set
with ``USERMAP_UID`` or ``USERMAP_GID`` respectively).
2. Use ``docker cp`` to copy your files directly into the container:
.. code-block:: shell-session
$ # Identify your containers
$ docker-compose ps
Name Command State Ports
-------------------------------------------------------------------------
paperless_consumer_1 /sbin/docker-entrypoint.sh ... Exit 0
paperless_webserver_1 /sbin/docker-entrypoint.sh ... Exit 0
$ docker cp /path/to/your/file.pdf paperless_consumer_1:/consume
``docker cp`` is a one-shot-command, just like ``cp``. This means that
every time you want to consume a new document, you will have to execute
``docker cp`` again. You can of course automate this process, but option 1
is generally the preferred one.
.. danger::
``docker cp`` will change the owning user and group of a copied file
to the acting user at the destination, which will be ``root``.
You therefore need to ensure that the documents you want to copy into
the container are readable by everyone (``chmod o+r file.pdf``) before
copying them.
.. _Docker: https://www.docker.com/
.. _docker-compose: https://docs.docker.com/compose/install/
.. _ISO 639-2/T: https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes
.. _Debian packagelist: https://packages.debian.org/search?suite=jessie&searchon=names&keywords=tesseract-ocr-
.. [#compose] You of course don't have to use docker-compose, but it
simplifies deployment immensely. If you know your way around Docker, feel
free to tinker around without using compose!
.. _making-things-a-little-more-permanent:
Making Things a Little more Permanent
@@ -126,5 +277,9 @@ Making Things a Little more Permanent
Once you've tested things and are happy with the work flow, you can automate the
process of starting the webserver and consumer automatically. If you're running
on a bare metal system that's using Systemd, you can use the service unit files
in the ``scripts`` directory to set this up. If you're on a SysV or other
startup system (like the Vagrant box), then you're currently on your own.
in the ``scripts`` directory to set this up. If you're on another startup
system or are using a Vagrant box, then you're currently on your own. If you are
using Docker, you can set a restart-policy_ in the ``docker-compose.yml`` to
have the containers automatically start with the Docker daemon.
.. _restart-policy: https://docs.docker.com/engine/reference/commandline/run/#restart-policies-restart