more documentation.

2025-07-24 18:04:39 -05:00 · 2020-11-18 00:00:55 +01:00 · 758d53d816
commit 758d53d816
parent 42a02991e7
3 changed files with 93 additions and 7 deletions
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -66,7 +66,7 @@ Result list object returned by the endpoint:
        "page": 1,
        "page_count": 1,
        "results": [
-            ...
+            
        ]
    }

@@ -83,11 +83,13 @@ Result object:

    {
        "id": 1,
-        "highlights": ...,
+        "highlights": [
+            
+        ],
        "score": 6.34234,
        "rank": 23,
        "document": {
-            ...
+            
        }

 *   ``id``: the primary key of the found document
--- a/docs/setup.rst
+++ b/docs/setup.rst
@@ -23,6 +23,77 @@ There are multiple options available.
    that need to be compiled, and that's already done for you in the release.


+Overview of Paperless-ng
+########################
+
+Compared to paperless, paperless-ng works a little differently under the hood and has
+more moving parts that work together. While this increases the complexity of
+the system, it also brings many benefits. 
+
+Paperless consists of the following components:
+
+*   **The webserver:** This is pretty much the same as in paperless. It serves 
+    the administration pages, the API, and the new frontend. This is the main
+    tool you'll be using to interact with paperless. You may start the webserver
+    with
+
+    .. code:: shell-session
+        
+        $ cd /path/to/paperless/src/
+        $ pipenv run gunicorn -c /usr/src/paperless/gunicorn.conf.py -b 0.0.0.0:8000 paperless.wsgi
+    
+    or by any other means such as Apache ``mod_wsgi``.
+
+*   **The consumer:** This is what watches your consumption folder for documents.
+    However, the consumer itself does not really consume your documents anymore.
+    Instead, it notifies a task processor that a new file is ready for consumption.
+    I suppose it should be named differently.
+    The consumer also used to check your emails, but that job has moved elsewhere as well.
+
+    Start the consumer with the management command ``document_consumer``:
+
+    .. code:: shell-session
+    
+        $ cd /path/to/paperless/src/
+        $ pipenv run python3 manage.py document_consumer
+
+*   **The task processor:** Paperless relies on `Django Q <https://django-q.readthedocs.io/en/latest/>`_
+    for doing much of the heavy lifting. This is a task queue that accepts tasks from
+    multiple sources and processes tasks in parallel. It also comes with a scheduler that executes
+    certain commands periodically.
+    
+    This task processor is responsible for:
+
+    *   Consuming documents. When the consumer finds new documents, it notifies the task processor to
+        start a consumption task.
+    *   Consuming emails. It periodically checks your configured accounts for new mails and
+        produces consumption tasks for any documents it finds.
+    *   The task processor also performs the consumption of any documents you upload through
+        the web interface.
+    *   Maintaining the search index and the automatic matching algorithm. These are things that paperless
+        needs to do from time to time in order to operate properly.
+    
+    This allows paperless to process multiple documents from your consumption folder in parallel! On
+    a modern multicore system, consumption with full OCR is blazing fast.
+
+    The task processor comes with a built-in admin interface that you can use to see whenever any of the
+    tasks fail and inspect the errors.
+
+    You may start the task processor by executing:
+
+    .. code:: shell-session
+    
+        $ cd /path/to/paperless/src/
+        $ pipenv run python3 manage.py qcluster
+
+*   A `redis <https://redis.io/>`_ message broker: This is a really lightweight service that is responsible
+    for getting the tasks from the webserver and consumer to the task processor. Because these run in different
+    processes (maybe even on different machines!), a broker is needed to pass tasks between them.
+
+*   A database server. Paperless supports PostgreSQL and SQLite for storing its data. However, with the
+    added concurrency, it is strongly advised to use PostgreSQL, as SQLite has its limits in that regard.
+
+
 Installation
 ############

@@ -31,10 +102,12 @@ You can go multiple routes with setting up and running Paperless:
 * The `docker route`_
 * The `bare metal route`_

-The `docker route`_ is quick & easy. This is the recommended route.
+The `docker route`_ is quick & easy. This is the recommended route. It configures all of the components
+described above automatically so that everything just works, and it uses sensible defaults for all configuration options.

 The `bare metal route`_ is more complicated to setup but makes it easier
-should you want to contribute some code back.
+should you want to contribute some code back. You need to configure and
+run the above-mentioned components yourself.

 Docker Route
 ============
--- a/scripts/make-release.sh
+++ b/scripts/make-release.sh
@@ -2,6 +2,15 @@

 set -e

+
+VERSION=$1
+
+if [ -z "$VERSION" ]
+then
+	echo "Need a version string."
+	exit 1
+fi
+
 # source root directory of paperless
 PAPERLESS_ROOT=$(git rev-parse --show-toplevel)

@@ -81,10 +90,12 @@ cp "$PAPERLESS_ROOT/docker/supervisord.conf" "$PAPERLESS_DIST_APP/docker/"

 cd "$PAPERLESS_DIST_APP"

-docker-compose build
+docker build . -t "jonaswinkler/paperless-ng:$VERSION"
+
+docker push "jonaswinkler/paperless-ng:$VERSION"

 # works. package the app!

 cd "$PAPERLESS_DIST"

-tar -cJf paperless-ng.tar.xz paperless-ng/
+tar -cJf "paperless-ng-$VERSION.tar.xz" paperless-ng/
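
The change to ``scripts/make-release.sh`` makes the version string a required first argument and uses it in both the Docker image tag and the archive name. A minimal sketch of that naming logic follows, so it can be tried without building or pushing anything. The function name ``release_names`` and the version ``0.9.1`` are hypothetical and not part of the commit. Only the empty-argument guard, the ``jonaswinkler/paperless-ng:$VERSION`` tag format, and the ``paperless-ng-$VERSION.tar.xz`` archive name come from the diff. Unlike the script, this sketch writes the error message to stderr and returns instead of exiting.

```shell
#!/bin/sh
# Sketch only: release_names is a hypothetical helper, not part of
# make-release.sh. It mirrors the guard and the two names the script derives
# from its first argument.

release_names() {
    version=$1

    # Same check as the script: refuse to continue without a version string.
    if [ -z "$version" ]
    then
        echo "Need a version string." >&2
        return 1
    fi

    # Image tag passed to "docker build -t" and "docker push" in the script.
    echo "jonaswinkler/paperless-ng:$version"
    # Archive name passed to "tar -cJf" in the script.
    echo "paperless-ng-$version.tar.xz"
}

# Illustrative version number, chosen for this example only.
release_names "0.9.1"
```

Calling the function with an empty argument prints the error and returns a non-zero status, which is the behaviour that makes the real script stop early under ``set -e``.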