mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-11-03 03:16:10 -06:00)
			
		
		
		
	Merge pull request #222 from tido-/master
little changes to reflect as much as possible
		
							
								
								
									
README.rst (27 lines changed)
									
									
									
									
									
								
@@ -6,7 +6,7 @@ Paperless
 |Travis|
 |Dependencies|
 
-Scan, index, and archive all of your paper documents
+Index and archive all of your scanned paper documents
 
 I hate paper.  Environmental issues aside, it's a tech person's nightmare:
 
@@ -23,6 +23,8 @@ it... because paper.  I wrote this to make my life easier.
 How it Works
 ============
 
+Paperless does not control your scanner, it only helps you deal with what your scanner produces
+
 1. Buy a document scanner like `this one`_ (used by me) or `this other one`_
    recommended by another user.
 2. Set it up to "scan to FTP" or something similar. It should be able to push
@@ -30,7 +32,7 @@ How it Works
    scanner doesn't know how to automatically upload the file somewhere, you can
    always do that manually.  Paperless doesn't care how the documents get into
    its local consumption directory.
-3. Have the target server run the Paperless consumption script to OCR the PDF
+3. Have the target server run the Paperless consumption script to OCR the file
    and index it into a local database.
 4. Use the web frontend to sift through the database and find what you want.
 5. Download the PDF you need/want via the web interface and do whatever you
@@ -48,9 +50,8 @@ Stability
 =========
 
 Paperless is still under active development (just look at the git commit
-history) so don't expect it to be 100% stable.  I'm using it for my own
-documents, but I'm crazy like that.  If you use this and it breaks something,
-you get to keep all the shiny pieces.
+history) so don't expect it to be 100% stable.  You can backup the sqlite3 
+database, media directory and your configuration file to be on the safe side.
 
 
 Requirements
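Note on the Stability hunk above: the new text recommends backing up the SQLite database, the media directory and the configuration file. A minimal sketch of such a backup might look like the following. The function name ``backup_paperless`` and all four arguments are assumptions for illustration, not something this commit or Paperless ships. Stopping the consumer and the webserver first is a sensible precaution so the database file is not copied mid-write.

```shell
#!/bin/sh
# Sketch only: bundle the three things the README says to keep safe.
# backup_paperless is a hypothetical helper; every path is supplied by the caller.
backup_paperless() {
    db_file="$1"    # the SQLite database file (its location depends on your install)
    media_dir="$2"  # the media directory holding the stored documents
    conf_file="$3"  # the configuration file, e.g. /etc/paperless.conf
    out_dir="$4"    # an existing directory where the archive should be written

    stamp=$(date +%Y%m%d-%H%M%S)
    archive="${out_dir}/paperless-backup-${stamp}.tar.gz"

    # Create one compressed archive; print its path only if tar succeeded.
    tar -czf "$archive" "$db_file" "$media_dir" "$conf_file" && echo "$archive"
}
```

It would be called with four paths, for example ``backup_paperless /path/to/db.sqlite3 /path/to/media /etc/paperless.conf /path/to/backups``, where everything except ``/etc/paperless.conf`` is a placeholder.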
@@ -83,22 +84,22 @@ Similar Projects
 
 There's another project out there called `Mayan EDMS`_ that has a surprising
 amount of technical overlap with Paperless.  Also based on Django and using
-a consumer model with Tesseract and unpaper, Mayan EDMS is *much* more
-featureful and comes with a slick UI as well.  It may be that Paperless is
-better suited for low-resource environments (like a Rasberry Pi), but to be
-honest, this is just a guess as I haven't tested this myself.  One thing's
-for certain though, *Paperless* is a **much** better name.
+a consumer model with Tesseract and Unpaper, Mayan EDMS is *much* more
+featureful and comes with a slick UI as well, but still in Python 2. It may be 
+that Paperless consumes fewer resources, but to be honest, this is just a guess 
+as I haven't tested this myself.  One thing's for certain though, *Paperless* 
+is a **much** better name.
 
 
 Important Note
 ==============
 
 Document scanners are typically used to scan sensitive documents.  Things like
-your social insurance number, tax records, invoices, etc.  While paperless
-encrypts the original PDFs via the consumption script, the OCR'd text is *not*
+your social insurance number, tax records, invoices, etc.  While Paperless
+encrypts the original files via the consumption script, the OCR'd text is *not*
 encrypted and is therefore stored in the clear (it needs to be searchable, so
 if someone has ideas on how to do that on encrypted data, I'm all ears).  This
-means that paperless should never be run on an untrusted host.  Instead, I
+means that Paperless should never be run on an untrusted host.  Instead, I
 recommend that if you do want to use it, run it locally on a server in your own
 home.
 
@@ -3,7 +3,11 @@
 Paperless
 =========
 
-Scan, index, and archive all of your paper documents.  Say goodbye to paper.
+Paperless is a simple Django application running in two parts: 
+a :ref:`consumer <utilities-consumer>` (the thing that does the indexing) and 
+the :ref:`webserver <utilities-webserver>` (the part that lets you search & download
+already-indexed documents). If you want to learn more about its functions keep on 
+reading after the installation section.
 
 
 .. _index-why-this-exists:
@@ -15,10 +19,11 @@ Paper is a nightmare.  Environmental issues aside, there's no excuse for it in
 the 21st century.  It takes up space, collects dust, doesn't support any form of
 a search feature, indexing is tedious, it's heavy and prone to damage & loss.
 
-I wrote this to make "going paperless" easier.  I wanted to be able to feed
-documents right from the post box into the scanner and then shred them so I
-never have to worry about finding stuff again.  Perhaps you might find it useful
-too.
+I wrote this to make "going paperless" easier.  I do not have to worry about 
+finding stuff again. I feed documents right from the post box into the scanner and 
+then shred them.  Perhaps you might find it useful too.
+
+
 
 
 Contents
@@ -4,7 +4,7 @@ Requirements
 ============
 
 You need a Linux machine or Unix-like setup (theoretically an Apple machine
-should work) that has the following software installed on it:
+should work) that has the following software installed:
 
 * `Python3`_ (with development libraries, pip and virtualenv)
 * `GNU Privacy Guard`_
@@ -21,14 +21,14 @@ should work) that has the following software installed on it:
 Notably, you should confirm how you access your Python3 installation.  Many
 Linux distributions will install Python3 in parallel to Python2, using the names
 ``python3`` and ``python`` respectively.  The same goes for ``pip3`` and
-``pip``.  Using Python2 will likely break things, so make sure that you're using
-the right version.
+``pip``.  Running Paperless with Python2 will likely break things, so make sure that 
+you're using the right version.
 
 For the purposes of simplicity, ``python`` and ``pip`` is used everywhere to
-refer to their Python 3 versions.
+refer to their Python3 versions.
 
 In addition to the above, there are a number of Python requirements, all of
-which are listed in a file called ``requirements.txt`` in the project root.
+which are listed in a file called ``requirements.txt`` in the project root directory.
 
 If you're not working on a virtual environment (like Vagrant or Docker), you
 should probably be using a virtualenv, but that's your call.  The reasons why
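Note on the hunk above: the docs ask the reader to confirm which command name reaches Python3. One way to check could be a small helper like the sketch below. The name ``find_python3`` is hypothetical and not part of Paperless; it simply tries ``python3`` and then ``python`` and prints the first one that reports major version 3.

```shell
#!/bin/sh
# Sketch only: print the name of the first interpreter that is really Python 3.
# find_python3 is a hypothetical helper, not something the project provides.
find_python3() {
    for candidate in python3 python; do
        # The candidate must exist on PATH and report major version 3.
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" -c 'import sys; sys.exit(0 if sys.version_info[0] == 3 else 1)' 2>/dev/null; then
            echo "$candidate"
            return 0
        fi
    done
    # Neither name points at a Python 3 interpreter.
    return 1
}
```

With the result in hand, running pip as a module of that interpreter (``"$(find_python3)" -m pip --version``) shows which Python the installer belongs to.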
@@ -67,7 +67,7 @@ dependencies is easy:
 
     $ pip install --user --requirement /path/to/paperless/requirements.txt
 
-This should download and install all of the requirements into
+This will download and install all of the requirements into
 ``${HOME}/.local``.  Remember that your distribution may be using ``pip3`` as
 mentioned above.
 
@@ -86,8 +86,8 @@ enter it, and install the requirements using the ``requirements.txt`` file:
     $ . /path/to/arbitrary/directory/bin/activate
     $ pip install  --requirement /path/to/paperless/requirements.txt
 
-Now you're ready to go.  Just remember to enter your virtualenv whenever you
-want to use Paperless.
+Now you're ready to go.  Just remember to enter (activate) your virtualenv 
+whenever you want to use Paperless.
 
 
 .. _requirements-documentation:
@@ -95,7 +95,7 @@ want to use Paperless.
 Documentation
 -------------
 
-As generation of the documentation is not required for use of Paperless,
+As generation of the documentation is not required for the use of Paperless,
 dependencies for this process are not included in ``requirements.txt``.  If
 you'd like to generate your own docs locally, you'll need to:
 
@@ -4,9 +4,8 @@ Setup
 =====
 
 Paperless isn't a very complicated app, but there are a few components, so some
-basic documentation is in order.  If you go follow along in this document and
-still have trouble, please open an `issue on GitHub`_ so I can fill in the
-gaps.
+basic documentation is in order.  If you follow along in this document and still 
+have trouble, please open an `issue on GitHub`_ so I can fill in the gaps.
 
 .. _issue on GitHub: https://github.com/danielquinn/paperless/issues
 
@@ -28,6 +27,7 @@ or just download the tarball and go that route:
 
 .. code:: bash
 
+    $ cd to the directory where you want to run Paperless
     $ wget https://github.com/danielquinn/paperless/archive/master.zip
     $ unzip master.zip
     $ cd paperless-master
@@ -42,8 +42,10 @@ You can go multiple routes with setting up and running Paperless. The `Vagrant
 route`_ is quick & easy, but means you're running a VM which comes with memory
 consumption etc. We also `support Docker`_, which you can use natively under
 Linux and in a VM with `Docker Machine`_ (this guide was written for native
-Docker usage under Linux, you might have to adapt it for Docker Machine.)
-Alternatively the standard, `bare metal`_ approach is a little more
+Docker usage under Linux, you might have to adapt it for Docker Machine.) 
+Not to forget the virtualenv, this is similar to `bare metal`_ with the exception
+that you have to activate the virtualenv first.
+Last but not least, the standard `bare metal`_ approach is a little more
 complicated, but worth it because it makes it easier should you want to
 contribute some code back.
 
@@ -59,9 +61,11 @@ Standard (Bare Metal)
 .....................
 
 1. Install the requirements as per the :ref:`requirements <requirements>` page.
-2. Change to the ``src`` directory in this repo.
-3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` and open it in
-   your favourite editor.  Set the values for:
+2. Within the extract of master.zip go to the ``src`` directory.
+3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` also the virtual 
+   envrionment look there for it and open it in your favourite editor.  
+   Because this file contains passwords it should only be readable by user root
+   and paperless !  Set the values for:
 
     * ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
       dumped to be consumed by Paperless.
@@ -70,18 +74,18 @@ Standard (Bare Metal)
     * ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
       will spawn to process document pages in parallel.
 
-4. Initialise the database with ``./manage.py migrate``.
+4. Initialise the SQLite database with ``./manage.py migrate``.
 5. Create a user for your Paperless instance with
    ``./manage.py createsuperuser``. Follow the prompts to create your user.
 6. Start the webserver with ``./manage.py runserver <IP>:<PORT>``.
-   If no specifc IP or port are given, the default is ``127.0.0.1:8000``.
-   You should now be able to visit your (empty) `Paperless webserver`_ at
-   ``127.0.0.1:8000`` (or whatever you chose).  You can login with the
-   user/pass you created in #5.
+   If no specifc IP or port are given, the default is ``127.0.0.1:8000`` 
+   also known as http://localhost:8000/.
+   You should now be able to visit your (empty) at `Paperless webserver`_ or 
+   whatever you chose before.  You can login with the user/pass you created in #5.
 7. In a separate window, change to the ``src`` directory in this repo again,
    but this time, you should start the consumer script with
    ``./manage.py document_consumer``.
-8. Scan something.  Put it in the ``CONSUMPTION_DIR``.
+8. Scan something or put a file into the  ``CONSUMPTION_DIR``.
 9. Wait a few minutes
 10. Visit the document list on your webserver, and it should be there, indexed
     and downloadable.
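Note on step 3 of the list above: the new wording says the configuration file should only be readable by root and the ``paperless`` user because it contains passwords. A minimal sketch of how that could be enforced follows. The helper name ``lock_down_conf`` is hypothetical, and the ownership change shown in the comment assumes a group named ``paperless`` exists on the system, which the commit does not state.

```shell
#!/bin/sh
# Sketch only: make a config file readable by its owner and group, nobody else.
# lock_down_conf is a hypothetical helper, not part of Paperless.
lock_down_conf() {
    conf_file="$1"
    # 640 = owner may read and write, group may read, others get nothing.
    chmod 640 "$conf_file"
}

# On a real install the ownership would be adjusted as well (needs root, and
# assumes a group called "paperless" exists):
#   chown root:paperless /etc/paperless.conf
#   lock_down_conf /etc/paperless.conf
```

Mode ``640`` with ``root`` as owner and ``paperless`` as group is one way to match the wording "readable by user root and paperless"; other layouts would work too.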
@@ -299,10 +303,11 @@ Standard (Bare Metal, Systemd)
 
 If you're running on a bare metal system that's using Systemd, you can use the
 service unit files in the ``scripts`` directory to set this up.  You'll need to
-create a user called ``paperless`` and setup Paperless to be in a place that
-this new user can read and write to. Be sure to edit the service scripts to point
-to the proper location of your paperless install, referencing the appropriate Python
-binary. For example: ``ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer``.
+create a user called ``paperless`` (without login (if not already done so #5)) and 
+setup Paperless to be in a place that this new user can read and write to. Be sure 
+to edit the service  scripts to point to the proper location of your paperless install, 
+referencing the appropriate Python binary. For example: 
+``ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer``.
 If you don't want to make a new user, you can change the ``Group`` and ``User`` variables
 accordingly.
 
@@ -344,7 +349,7 @@ after restarting your system:
   If you are using a network interface other than ``eth0``, you will have to
   change ``IFACE=eth0``. For example, if you are connected via WiFi, you will
   likely need to replace ``eth0`` above with ``wlan0``. To see all interfaces,
-  run ``ifconfig``.
+  run ``ifconfig -a``.
 
   Save the file.
 