Merge pull request #222 from tido-/master

little changes to reflect as much as possible
2025-07-22 17:54:40 -05:00 · 2017-05-10 15:25:35 -07:00 · 2017-05-10 15:25:35 -07:00 · 3477b96d87
commit 3477b96d87
parent c7876dbbe8 ac850b64aa
4 changed files with 57 additions and 46 deletions
--- a/README.rst
+++ b/README.rst
@@ -6,7 +6,7 @@ Paperless
 |Travis|
 |Dependencies|

-Scan, index, and archive all of your paper documents
+Index and archive all of your scanned paper documents

 I hate paper.  Environmental issues aside, it's a tech person's nightmare:

@@ -23,6 +23,8 @@ it... because paper.  I wrote this to make my life easier.
 How it Works
 ============

+Paperless does not control your scanner; it only helps you deal with what your scanner produces.
+
 1. Buy a document scanner like `this one`_ (used by me) or `this other one`_
   recommended by another user.
 2. Set it up to "scan to FTP" or something similar. It should be able to push
@@ -30,7 +32,7 @@ How it Works
   scanner doesn't know how to automatically upload the file somewhere, you can
   always do that manually.  Paperless doesn't care how the documents get into
   its local consumption directory.
-3. Have the target server run the Paperless consumption script to OCR the PDF
+3. Have the target server run the Paperless consumption script to OCR the file
   and index it into a local database.
 4. Use the web frontend to sift through the database and find what you want.
 5. Download the PDF you need/want via the web interface and do whatever you
@@ -48,9 +50,8 @@ Stability
 =========

 Paperless is still under active development (just look at the git commit
-history) so don't expect it to be 100% stable.  I'm using it for my own
-documents, but I'm crazy like that.  If you use this and it breaks something,
-you get to keep all the shiny pieces.
+history) so don't expect it to be 100% stable.  You can back up the SQLite
+database, media directory, and your configuration file to be on the safe side.


 Requirements
@@ -83,22 +84,22 @@ Similar Projects

 There's another project out there called `Mayan EDMS`_ that has a surprising
 amount of technical overlap with Paperless.  Also based on Django and using
-a consumer model with Tesseract and unpaper, Mayan EDMS is *much* more
-featureful and comes with a slick UI as well.  It may be that Paperless is
-better suited for low-resource environments (like a Rasberry Pi), but to be
-honest, this is just a guess as I haven't tested this myself.  One thing's
-for certain though, *Paperless* is a **much** better name.
+a consumer model with Tesseract and Unpaper, Mayan EDMS is *much* more
+featureful and comes with a slick UI as well, but it is still on Python 2.  It
+may be that Paperless consumes fewer resources, but to be honest, this is just a
+guess as I haven't tested this myself.  One thing's for certain though,
+*Paperless* is a **much** better name.


 Important Note
 ==============

 Document scanners are typically used to scan sensitive documents.  Things like
-your social insurance number, tax records, invoices, etc.  While paperless
-encrypts the original PDFs via the consumption script, the OCR'd text is *not*
+your social insurance number, tax records, invoices, etc.  While Paperless
+encrypts the original files via the consumption script, the OCR'd text is *not*
 encrypted and is therefore stored in the clear (it needs to be searchable, so
 if someone has ideas on how to do that on encrypted data, I'm all ears).  This
-means that paperless should never be run on an untrusted host.  Instead, I
+means that Paperless should never be run on an untrusted host.  Instead, I
 recommend that if you do want to use it, run it locally on a server in your own
 home.
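
The Stability change above suggests backing up the SQLite database, the media
directory and the configuration file. A minimal sketch of such a backup follows.
The function name ``backup_paperless`` and all paths are hypothetical
placeholders, not something Paperless ships, so check where your own install
keeps these files. Copying a SQLite file while it is being written can produce
an inconsistent copy, so it is probably safest to stop the consumer and the
webserver first.

```shell
#!/bin/sh
# Hypothetical helper, not part of Paperless.
# Usage: backup_paperless DB_FILE MEDIA_DIR CONF_FILE DEST_DIR
# Prints the directory that now holds the backup.
backup_paperless() {
    db_file=$1
    media_dir=$2
    conf_file=$3
    dest_dir=$4

    # One timestamped directory per backup run.
    stamp=$(date +%Y%m%d-%H%M%S)
    target="$dest_dir/paperless-backup-$stamp"
    mkdir -p "$target" || return 1

    # Copy the database and configuration file, preserving mode and times.
    cp -p "$db_file" "$target/" || return 1
    cp -p "$conf_file" "$target/" || return 1

    # Archive the media directory relative to its parent directory.
    tar -czf "$target/media.tar.gz" \
        -C "$(dirname "$media_dir")" "$(basename "$media_dir")" || return 1

    echo "$target"
}
```

A call might look like
``backup_paperless /path/to/paperless/data/db.sqlite3 /path/to/paperless/media /etc/paperless.conf /path/to/backups``,
where every path is an assumption about your setup.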

--- a/docs/index.rst
+++ b/docs/index.rst
@@ -3,7 +3,11 @@
 Paperless
 =========

-Scan, index, and archive all of your paper documents.  Say goodbye to paper.
+Paperless is a simple Django application running in two parts:
+a :ref:`consumer <utilities-consumer>` (the thing that does the indexing) and
+the :ref:`webserver <utilities-webserver>` (the part that lets you search & download
+already-indexed documents).  If you want to learn more about its functions, keep
+reading after the installation section.


 .. _index-why-this-exists:
@@ -15,10 +19,11 @@ Paper is a nightmare.  Environmental issues aside, there's no excuse for it in
 the 21st century.  It takes up space, collects dust, doesn't support any form of
 a search feature, indexing is tedious, it's heavy and prone to damage & loss.

-I wrote this to make "going paperless" easier.  I wanted to be able to feed
-documents right from the post box into the scanner and then shred them so I
-never have to worry about finding stuff again.  Perhaps you might find it useful
-too.
+I wrote this to make "going paperless" easier.  I feed documents right from the
+post box into the scanner and then shred them, so I never have to worry about
+finding stuff again.  Perhaps you might find it useful too.
+
+


 Contents
--- a/docs/requirements.rst
+++ b/docs/requirements.rst
@@ -4,7 +4,7 @@ Requirements
 ============

 You need a Linux machine or Unix-like setup (theoretically an Apple machine
-should work) that has the following software installed on it:
+should work) that has the following software installed:

 * `Python3`_ (with development libraries, pip and virtualenv)
 * `GNU Privacy Guard`_
@@ -21,14 +21,14 @@ should work) that has the following software installed on it:
 Notably, you should confirm how you access your Python3 installation.  Many
 Linux distributions will install Python3 in parallel to Python2, using the names
 ``python3`` and ``python`` respectively.  The same goes for ``pip3`` and
-``pip``.  Using Python2 will likely break things, so make sure that you're using
-the right version.
+``pip``.  Running Paperless with Python2 will likely break things, so make sure that 
+you're using the right version.

 For the purposes of simplicity, ``python`` and ``pip`` are used everywhere to
-refer to their Python 3 versions.
+refer to their Python3 versions.

 In addition to the above, there are a number of Python requirements, all of
-which are listed in a file called ``requirements.txt`` in the project root.
+which are listed in a file called ``requirements.txt`` in the project root directory.

 If you're not working on a virtual environment (like Vagrant or Docker), you
 should probably be using a virtualenv, but that's your call.  The reasons why
@@ -67,7 +67,7 @@ dependencies is easy:

    $ pip install --user --requirement /path/to/paperless/requirements.txt

-This should download and install all of the requirements into
+This will download and install all of the requirements into
 ``${HOME}/.local``.  Remember that your distribution may be using ``pip3`` as
 mentioned above.

@@ -86,8 +86,8 @@ enter it, and install the requirements using the ``requirements.txt`` file:
    $ . /path/to/arbitrary/directory/bin/activate
    $ pip install  --requirement /path/to/paperless/requirements.txt

-Now you're ready to go.  Just remember to enter your virtualenv whenever you
-want to use Paperless.
+Now you're ready to go.  Just remember to enter (activate) your virtualenv 
+whenever you want to use Paperless.


 .. _requirements-documentation:
@@ -95,7 +95,7 @@ want to use Paperless.
 Documentation
 -------------

-As generation of the documentation is not required for use of Paperless,
+As generation of the documentation is not required for the use of Paperless,
 dependencies for this process are not included in ``requirements.txt``.  If
 you'd like to generate your own docs locally, you'll need to:

--- a/docs/setup.rst
+++ b/docs/setup.rst
@@ -4,9 +4,8 @@ Setup
 =====

 Paperless isn't a very complicated app, but there are a few components, so some
-basic documentation is in order.  If you go follow along in this document and
-still have trouble, please open an `issue on GitHub`_ so I can fill in the
-gaps.
+basic documentation is in order.  If you follow along in this document and still 
+have trouble, please open an `issue on GitHub`_ so I can fill in the gaps.

 .. _issue on GitHub: https://github.com/danielquinn/paperless/issues

@@ -28,6 +27,7 @@ or just download the tarball and go that route:

 .. code:: bash

+   $ cd /path/to/directory/where/you/want/to/run/paperless
    $ wget https://github.com/danielquinn/paperless/archive/master.zip
    $ unzip master.zip
    $ cd paperless-master
@@ -42,8 +42,10 @@ You can go multiple routes with setting up and running Paperless. The `Vagrant
 route`_ is quick & easy, but means you're running a VM which comes with memory
 consumption etc. We also `support Docker`_, which you can use natively under
 Linux and in a VM with `Docker Machine`_ (this guide was written for native
-Docker usage under Linux, you might have to adapt it for Docker Machine.)
-Alternatively the standard, `bare metal`_ approach is a little more
+Docker usage under Linux, you might have to adapt it for Docker Machine.)
+There is also the virtualenv route, which is the same as `bare metal`_ except
+that you have to activate the virtualenv first.
+Finally, the standard `bare metal`_ approach is a little more
 complicated, but worth it because it makes it easier should you want to
 contribute some code back.

@@ -59,9 +61,11 @@ Standard (Bare Metal)
 .....................

 1. Install the requirements as per the :ref:`requirements <requirements>` page.
-2. Change to the ``src`` directory in this repo.
-3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` and open it in
-   your favourite editor.  Set the values for:
+2. In the directory extracted from ``master.zip``, change to the ``src`` directory.
+3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` (a virtualenv
+   setup looks for it there too) and open it in your favourite editor.
+   Because this file contains passwords, it should only be readable by the
+   users root and paperless.  Set the values for:

    * ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
      dumped to be consumed by Paperless.
@@ -70,18 +74,18 @@ Standard (Bare Metal)
    * ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
      will spawn to process document pages in parallel.

-4. Initialise the database with ``./manage.py migrate``.
+4. Initialise the SQLite database with ``./manage.py migrate``.
 5. Create a user for your Paperless instance with
   ``./manage.py createsuperuser``. Follow the prompts to create your user.
 6. Start the webserver with ``./manage.py runserver <IP>:<PORT>``.
-   If no specifc IP or port are given, the default is ``127.0.0.1:8000``.
-   You should now be able to visit your (empty) `Paperless webserver`_ at
-   ``127.0.0.1:8000`` (or whatever you chose).  You can login with the
-   user/pass you created in #5.
+   If no specific IP or port is given, the default is ``127.0.0.1:8000``,
+   also known as http://localhost:8000/.
+   You should now be able to visit your (empty) `Paperless webserver`_ there, or
+   at whatever address you chose.  You can log in with the user/pass you created in #5.
 7. In a separate window, change to the ``src`` directory in this repo again,
   but this time, you should start the consumer script with
   ``./manage.py document_consumer``.
-8. Scan something.  Put it in the ``CONSUMPTION_DIR``.
+8. Scan something or put a file into the ``CONSUMPTION_DIR``.
 9. Wait a few minutes
 10. Visit the document list on your webserver, and it should be there, indexed
    and downloadable.
@@ -299,10 +303,11 @@ Standard (Bare Metal, Systemd)

 If you're running on a bare metal system that's using Systemd, you can use the
 service unit files in the ``scripts`` directory to set this up.  You'll need to
-create a user called ``paperless`` and setup Paperless to be in a place that
-this new user can read and write to. Be sure to edit the service scripts to point
-to the proper location of your paperless install, referencing the appropriate Python
-binary. For example: ``ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer``.
+create a user called ``paperless`` (without login, if you have not already
+created one) and set up Paperless in a place that this new user can read and
+write to.  Be sure to edit the service scripts to point to the proper location
+of your Paperless install, referencing the appropriate Python binary.  For example:
+``ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer``.
 If you don't want to make a new user, you can change the ``Group`` and ``User`` variables
 accordingly.

@@ -344,7 +349,7 @@ after restarting your system:
  If you are using a network interface other than ``eth0``, you will have to
  change ``IFACE=eth0``. For example, if you are connected via WiFi, you will
  likely need to replace ``eth0`` above with ``wlan0``. To see all interfaces,
-  run ``ifconfig``.
+  run ``ifconfig -a``.

  Save the file.
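
Step 3 of the bare metal setup in this diff says that ``/etc/paperless.conf``
should only be readable by root and paperless. One way to get there is sketched
below. The helper name ``restrict_conf`` is hypothetical, and the example call
assumes that a group named ``paperless`` exists on your system; neither comes
from the Paperless documentation itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of Paperless.
# Usage: restrict_conf CONF_FILE [OWNER:GROUP]
restrict_conf() {
    conf_file=$1
    owner_group=${2:-}

    # Changing ownership normally requires root, so it is optional here.
    if [ -n "$owner_group" ]; then
        chown "$owner_group" "$conf_file" || return 1
    fi

    # Owner may read and write, group may read, everyone else gets nothing.
    chmod 640 "$conf_file"
}

# Typical call as root, assuming a "paperless" group exists:
#   restrict_conf /etc/paperless.conf root:paperless
```

With ``root:paperless`` as owner and group, mode ``640`` lets root edit the
file and members of the ``paperless`` group read it, while other users cannot
read the passwords.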