Added some documentation

2025-12-16 01:31:09 -06:00 · 2016-01-24 20:15:50 -05:00
parent 0612cd53b7
commit 286292dbf9
10 changed files with 846 additions and 0 deletions
--- a/docs/utilities.rst
+++ b/docs/utilities.rst
@@ -0,0 +1,107 @@
+.. _utilities:
+
+Utilities
+=========
+
+There's basically three utilities to *Paperless*: the webserver, consumer, and
+if needed, the exporter.  They're all detailed here.
+
+
+.. _utilities-webserver:
+
+The Webserver
+-------------
+
+At the heart of it, *Paperless* is a simple Django webservice, and the entire
+interface is based on Django's standard admin interface.  Once running, visiting
+the URL for your service delivers the admin, through which you can get a
+detailed listing of all available documents, search for specific files, and
+download whatever it is you're looking for.
+
+
+.. _utilities-webserver-howto:
+
+How to Use It
+.............
+
+The webserver is started via the ``manage.py`` script:
+
+.. code:: bash
+
+    $ /path/to/paperless/src/manage.py runserver
+
+By default, the server runs on localhost, port 8000, but you can change this
+with a few arguments, run ``manage.py --help`` for more information.
+
+Note that this command runs continuously, so exiting it will mean your webserver
+disappears.  If you want to run this full-time (which is kind of the point)
+you'll need to have it start in the background -- something you'll need to
+figure out for your own system.  To get you started though, there are Systemd
+service files in the ``scripts`` directory.
+
+
+.. _utilities-consumer:
+
+The Consumer
+------------
+
+The consumer script runs in an infinite loop, constantly looking at a directory
+for PDF files to parse and index.  The process is pretty straightforward:
+
+1. Look in ``CONSUMPTION_DIR`` for a PDF.  If one is found, go to #2.  If not,
+   wait 10 seconds and try again.
+2. Parse the PDF with Tesseract
+3. Create a new record in the database with the OCR'd text
+4. Encrypt the PDF and store it in the ``media`` directory under
+   ``documents/pdf``.
+5. Go to #1.
+
+
+.. _utilities-consumer-howto:
+
+How to Use It
+.............
+
+The consumer is started via the ``manage.py`` script:
+
+.. code:: bash
+
+    $ /path/to/paperless/src/manage.py document_consumer
+
+This starts the service that will run in a loop, consuming PDF files as they
+appear in ``CONSUMPTION_DIR``.
+
+Note that this command runs continuously, so exiting it will mean your webserver
+disappears.  If you want to run this full-time (which is kind of the point)
+you'll need to have it start in the background -- something you'll need to
+figure out for your own system.  To get you started though, there are Systemd
+service files in the ``scripts`` directory.
+
+
+.. _utilities-exporter:
+
+The Exporter
+------------
+
+Tired of fiddling with *Paperless*, or just want to do something stupid and are
+afraid of accidentally damaging your files?  You can export all of your PDFs
+into neatly named, dated, and unencrypted.
+
+
+.. _utilities-exporter-howto:
+
+How to Use It
+.............
+
+This too is done via the ``manage.py`` script:
+
+.. code:: bash
+
+    $ /path/to/paperless/src/manage.py document_exporter /path/to/somewhere
+
+This will dump all of your PDFs into ``/path/to/somewhere`` for you to do with
+as you please.  The naming scheme on export is identical to that used for
+import, so should you can now safely delete the entire project directly,
+database, encrypted PDFs and all, and later create it all again simply by
+running the consumer again and dumping all of these files into
+``CONSUMPTION_DIR``.