mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-11 10:00:48 -05:00
Merge pull request #222 from tido-/master
little changes to reflect as much as possible
commit
3477b96d87
27
README.rst
@ -6,7 +6,7 @@ Paperless
|
|||||||
|Travis|
|
|Travis|
|
||||||
|Dependencies|
|
|Dependencies|
|
||||||
|
|
||||||
Scan, index, and archive all of your paper documents
|
Index and archive all of your scanned paper documents
|
||||||
|
|
||||||
I hate paper. Environmental issues aside, it's a tech person's nightmare:
|
I hate paper. Environmental issues aside, it's a tech person's nightmare:
|
||||||
|
|
||||||
@ -23,6 +23,8 @@ it... because paper. I wrote this to make my life easier.
|
|||||||
How it Works
|
How it Works
|
||||||
============
|
============
|
||||||
|
|
||||||
|
Paperless does not control your scanner; it only helps you deal with what your scanner produces.
|
||||||
|
|
||||||
1. Buy a document scanner like `this one`_ (used by me) or `this other one`_
|
1. Buy a document scanner like `this one`_ (used by me) or `this other one`_
|
||||||
recommended by another user.
|
recommended by another user.
|
||||||
2. Set it up to "scan to FTP" or something similar. It should be able to push
|
2. Set it up to "scan to FTP" or something similar. It should be able to push
|
||||||
@ -30,7 +32,7 @@ How it Works
|
|||||||
scanner doesn't know how to automatically upload the file somewhere, you can
|
scanner doesn't know how to automatically upload the file somewhere, you can
|
||||||
always do that manually. Paperless doesn't care how the documents get into
|
always do that manually. Paperless doesn't care how the documents get into
|
||||||
its local consumption directory.
|
its local consumption directory.
|
||||||
3. Have the target server run the Paperless consumption script to OCR the PDF
|
3. Have the target server run the Paperless consumption script to OCR the file
|
||||||
and index it into a local database.
|
and index it into a local database.
|
||||||
4. Use the web frontend to sift through the database and find what you want.
|
4. Use the web frontend to sift through the database and find what you want.
|
||||||
5. Download the PDF you need/want via the web interface and do whatever you
|
5. Download the PDF you need/want via the web interface and do whatever you
|
||||||
@ -48,9 +50,8 @@ Stability
|
|||||||
=========
|
=========
|
||||||
|
|
||||||
Paperless is still under active development (just look at the git commit
|
Paperless is still under active development (just look at the git commit
|
||||||
history) so don't expect it to be 100% stable. I'm using it for my own
|
history) so don't expect it to be 100% stable. You can back up the sqlite3
|
||||||
documents, but I'm crazy like that. If you use this and it breaks something,
|
database, media directory and your configuration file to be on the safe side.
|
||||||
you get to keep all the shiny pieces.
|
|
||||||
|
|
||||||
|
|
||||||
Requirements
|
Requirements
|
||||||
@ -83,22 +84,22 @@ Similar Projects
|
|||||||
|
|
||||||
There's another project out there called `Mayan EDMS`_ that has a surprising
|
There's another project out there called `Mayan EDMS`_ that has a surprising
|
||||||
amount of technical overlap with Paperless. Also based on Django and using
|
amount of technical overlap with Paperless. Also based on Django and using
|
||||||
a consumer model with Tesseract and unpaper, Mayan EDMS is *much* more
|
a consumer model with Tesseract and Unpaper, Mayan EDMS is *much* more
|
||||||
featureful and comes with a slick UI as well. It may be that Paperless is
|
featureful and comes with a slick UI as well, but it still runs on Python 2. It may be
|
||||||
better suited for low-resource environments (like a Rasberry Pi), but to be
|
that Paperless consumes fewer resources, but to be honest, this is just a guess
|
||||||
honest, this is just a guess as I haven't tested this myself. One thing's
|
as I haven't tested this myself. One thing's for certain though, *Paperless*
|
||||||
for certain though, *Paperless* is a **much** better name.
|
is a **much** better name.
|
||||||
|
|
||||||
|
|
||||||
Important Note
|
Important Note
|
||||||
==============
|
==============
|
||||||
|
|
||||||
Document scanners are typically used to scan sensitive documents. Things like
|
Document scanners are typically used to scan sensitive documents. Things like
|
||||||
your social insurance number, tax records, invoices, etc. While paperless
|
your social insurance number, tax records, invoices, etc. While Paperless
|
||||||
encrypts the original PDFs via the consumption script, the OCR'd text is *not*
|
encrypts the original files via the consumption script, the OCR'd text is *not*
|
||||||
encrypted and is therefore stored in the clear (it needs to be searchable, so
|
encrypted and is therefore stored in the clear (it needs to be searchable, so
|
||||||
if someone has ideas on how to do that on encrypted data, I'm all ears). This
|
if someone has ideas on how to do that on encrypted data, I'm all ears). This
|
||||||
means that paperless should never be run on an untrusted host. Instead, I
|
means that Paperless should never be run on an untrusted host. Instead, I
|
||||||
recommend that if you do want to use it, run it locally on a server in your own
|
recommend that if you do want to use it, run it locally on a server in your own
|
||||||
home.
|
home.
|
||||||
|
|
||||||
|
@ -3,7 +3,11 @@
|
|||||||
Paperless
|
Paperless
|
||||||
=========
|
=========
|
||||||
|
|
||||||
Scan, index, and archive all of your paper documents. Say goodbye to paper.
|
Paperless is a simple Django application running in two parts:
|
||||||
|
a :ref:`consumer <utilities-consumer>` (the thing that does the indexing) and
|
||||||
|
the :ref:`webserver <utilities-webserver>` (the part that lets you search & download
|
||||||
|
already-indexed documents). If you want to learn more about its functions, keep on
|
||||||
|
reading after the installation section.
|
||||||
|
|
||||||
|
|
||||||
.. _index-why-this-exists:
|
.. _index-why-this-exists:
|
||||||
@ -15,10 +19,11 @@ Paper is a nightmare. Environmental issues aside, there's no excuse for it in
|
|||||||
the 21st century. It takes up space, collects dust, doesn't support any form of
|
the 21st century. It takes up space, collects dust, doesn't support any form of
|
||||||
a search feature, indexing is tedious, it's heavy and prone to damage & loss.
|
a search feature, indexing is tedious, it's heavy and prone to damage & loss.
|
||||||
|
|
||||||
I wrote this to make "going paperless" easier. I wanted to be able to feed
|
I wrote this to make "going paperless" easier. I do not have to worry about
|
||||||
documents right from the post box into the scanner and then shred them so I
|
finding stuff again. I feed documents right from the post box into the scanner and
|
||||||
never have to worry about finding stuff again. Perhaps you might find it useful
|
then shred them. Perhaps you might find it useful too.
|
||||||
too.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Contents
|
Contents
|
||||||
|
@ -4,7 +4,7 @@ Requirements
|
|||||||
============
|
============
|
||||||
|
|
||||||
You need a Linux machine or Unix-like setup (theoretically an Apple machine
|
You need a Linux machine or Unix-like setup (theoretically an Apple machine
|
||||||
should work) that has the following software installed on it:
|
should work) that has the following software installed:
|
||||||
|
|
||||||
* `Python3`_ (with development libraries, pip and virtualenv)
|
* `Python3`_ (with development libraries, pip and virtualenv)
|
||||||
* `GNU Privacy Guard`_
|
* `GNU Privacy Guard`_
|
||||||
@ -21,14 +21,14 @@ should work) that has the following software installed on it:
|
|||||||
Notably, you should confirm how you access your Python3 installation. Many
|
Notably, you should confirm how you access your Python3 installation. Many
|
||||||
Linux distributions will install Python3 in parallel to Python2, using the names
|
Linux distributions will install Python3 in parallel to Python2, using the names
|
||||||
``python3`` and ``python`` respectively. The same goes for ``pip3`` and
|
``python3`` and ``python`` respectively. The same goes for ``pip3`` and
|
||||||
``pip``. Using Python2 will likely break things, so make sure that you're using
|
``pip``. Running Paperless with Python2 will likely break things, so make sure that
|
||||||
the right version.
|
you're using the right version.
|
||||||
|
|
||||||
For the purposes of simplicity, ``python`` and ``pip`` are used everywhere to
|
For the purposes of simplicity, ``python`` and ``pip`` are used everywhere to
|
||||||
refer to their Python 3 versions.
|
refer to their Python3 versions.
|
||||||
|
|
||||||
In addition to the above, there are a number of Python requirements, all of
|
In addition to the above, there are a number of Python requirements, all of
|
||||||
which are listed in a file called ``requirements.txt`` in the project root.
|
which are listed in a file called ``requirements.txt`` in the project root directory.
|
||||||
|
|
||||||
If you're not working on a virtual environment (like Vagrant or Docker), you
|
If you're not working on a virtual environment (like Vagrant or Docker), you
|
||||||
should probably be using a virtualenv, but that's your call. The reasons why
|
should probably be using a virtualenv, but that's your call. The reasons why
|
||||||
@ -67,7 +67,7 @@ dependencies is easy:
|
|||||||
|
|
||||||
$ pip install --user --requirement /path/to/paperless/requirements.txt
|
$ pip install --user --requirement /path/to/paperless/requirements.txt
|
||||||
|
|
||||||
This should download and install all of the requirements into
|
This will download and install all of the requirements into
|
||||||
``${HOME}/.local``. Remember that your distribution may be using ``pip3`` as
|
``${HOME}/.local``. Remember that your distribution may be using ``pip3`` as
|
||||||
mentioned above.
|
mentioned above.
|
||||||
|
|
||||||
@ -86,8 +86,8 @@ enter it, and install the requirements using the ``requirements.txt`` file:
|
|||||||
$ . /path/to/arbitrary/directory/bin/activate
|
$ . /path/to/arbitrary/directory/bin/activate
|
||||||
$ pip install --requirement /path/to/paperless/requirements.txt
|
$ pip install --requirement /path/to/paperless/requirements.txt
|
||||||
|
|
||||||
Now you're ready to go. Just remember to enter your virtualenv whenever you
|
Now you're ready to go. Just remember to enter (activate) your virtualenv
|
||||||
want to use Paperless.
|
whenever you want to use Paperless.
|
||||||
|
|
||||||
|
|
||||||
.. _requirements-documentation:
|
.. _requirements-documentation:
|
||||||
@ -95,7 +95,7 @@ want to use Paperless.
|
|||||||
Documentation
|
Documentation
|
||||||
-------------
|
-------------
|
||||||
|
|
||||||
As generation of the documentation is not required for use of Paperless,
|
As generation of the documentation is not required for the use of Paperless,
|
||||||
dependencies for this process are not included in ``requirements.txt``. If
|
dependencies for this process are not included in ``requirements.txt``. If
|
||||||
you'd like to generate your own docs locally, you'll need to:
|
you'd like to generate your own docs locally, you'll need to:
|
||||||
|
|
||||||
|
@ -4,9 +4,8 @@ Setup
|
|||||||
=====
|
=====
|
||||||
|
|
||||||
Paperless isn't a very complicated app, but there are a few components, so some
|
Paperless isn't a very complicated app, but there are a few components, so some
|
||||||
basic documentation is in order. If you go follow along in this document and
|
basic documentation is in order. If you follow along in this document and still
|
||||||
still have trouble, please open an `issue on GitHub`_ so I can fill in the
|
have trouble, please open an `issue on GitHub`_ so I can fill in the gaps.
|
||||||
gaps.
|
|
||||||
|
|
||||||
.. _issue on GitHub: https://github.com/danielquinn/paperless/issues
|
.. _issue on GitHub: https://github.com/danielquinn/paperless/issues
|
||||||
|
|
||||||
@ -28,6 +27,7 @@ or just download the tarball and go that route:
|
|||||||
|
|
||||||
.. code:: bash
|
.. code:: bash
|
||||||
|
|
||||||
|
$ cd /path/to/directory/where/you/want/to/run/paperless
|
||||||
$ wget https://github.com/danielquinn/paperless/archive/master.zip
|
$ wget https://github.com/danielquinn/paperless/archive/master.zip
|
||||||
$ unzip master.zip
|
$ unzip master.zip
|
||||||
$ cd paperless-master
|
$ cd paperless-master
|
||||||
@ -43,7 +43,9 @@ route`_ is quick & easy, but means you're running a VM which comes with memory
|
|||||||
consumption etc. We also `support Docker`_, which you can use natively under
|
consumption etc. We also `support Docker`_, which you can use natively under
|
||||||
Linux and in a VM with `Docker Machine`_ (this guide was written for native
|
Linux and in a VM with `Docker Machine`_ (this guide was written for native
|
||||||
Docker usage under Linux, you might have to adapt it for Docker Machine.)
|
Docker usage under Linux, you might have to adapt it for Docker Machine.)
|
||||||
Alternatively the standard, `bare metal`_ approach is a little more
|
There is also the virtualenv option, which is similar to `bare metal`_ except
|
||||||
|
that you have to activate the virtualenv first.
|
||||||
|
Last but not least, the standard `bare metal`_ approach is a little more
|
||||||
complicated, but worth it because it makes it easier should you want to
|
complicated, but worth it because it makes it easier should you want to
|
||||||
contribute some code back.
|
contribute some code back.
|
||||||
|
|
||||||
@ -59,9 +61,11 @@ Standard (Bare Metal)
|
|||||||
.....................
|
.....................
|
||||||
|
|
||||||
1. Install the requirements as per the :ref:`requirements <requirements>` page.
|
1. Install the requirements as per the :ref:`requirements <requirements>` page.
|
||||||
2. Change to the ``src`` directory in this repo.
|
2. In the directory extracted from ``master.zip``, change to the ``src`` directory.
|
||||||
3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` and open it in
|
3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` (the virtual
|
||||||
your favourite editor. Set the values for:
|
environment also looks for it there) and open it in your favourite editor.
|
||||||
|
Because this file contains passwords, it should only be readable by the
|
||||||
|
users ``root`` and ``paperless``. Set the values for:
|
||||||
|
|
||||||
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
||||||
dumped to be consumed by Paperless.
|
dumped to be consumed by Paperless.
|
||||||
@ -70,18 +74,18 @@ Standard (Bare Metal)
|
|||||||
* ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
|
* ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
|
||||||
will spawn to process document pages in parallel.
|
will spawn to process document pages in parallel.
|
||||||
|
|
||||||
4. Initialise the database with ``./manage.py migrate``.
|
4. Initialise the SQLite database with ``./manage.py migrate``.
|
||||||
5. Create a user for your Paperless instance with
|
5. Create a user for your Paperless instance with
|
||||||
``./manage.py createsuperuser``. Follow the prompts to create your user.
|
``./manage.py createsuperuser``. Follow the prompts to create your user.
|
||||||
6. Start the webserver with ``./manage.py runserver <IP>:<PORT>``.
|
6. Start the webserver with ``./manage.py runserver <IP>:<PORT>``.
|
||||||
If no specific IP or port is given, the default is ``127.0.0.1:8000``.
|
If no specific IP or port is given, the default is ``127.0.0.1:8000``,
|
||||||
You should now be able to visit your (empty) `Paperless webserver`_ at
|
also known as http://localhost:8000/.
|
||||||
``127.0.0.1:8000`` (or whatever you chose). You can log in with the
|
You should now be able to visit your (empty) `Paperless webserver`_ there, or at
|
||||||
user/pass you created in #5.
|
whatever address you chose. You can log in with the user/pass you created in #5.
|
||||||
7. In a separate window, change to the ``src`` directory in this repo again,
|
7. In a separate window, change to the ``src`` directory in this repo again,
|
||||||
but this time, you should start the consumer script with
|
but this time, you should start the consumer script with
|
||||||
``./manage.py document_consumer``.
|
``./manage.py document_consumer``.
|
||||||
8. Scan something. Put it in the ``CONSUMPTION_DIR``.
|
8. Scan something or put a file into the ``CONSUMPTION_DIR``.
|
||||||
9. Wait a few minutes
|
9. Wait a few minutes
|
||||||
10. Visit the document list on your webserver, and it should be there, indexed
|
10. Visit the document list on your webserver, and it should be there, indexed
|
||||||
and downloadable.
|
and downloadable.
|
||||||
@ -299,10 +303,11 @@ Standard (Bare Metal, Systemd)
|
|||||||
|
|
||||||
If you're running on a bare metal system that's using Systemd, you can use the
|
If you're running on a bare metal system that's using Systemd, you can use the
|
||||||
service unit files in the ``scripts`` directory to set this up. You'll need to
|
service unit files in the ``scripts`` directory to set this up. You'll need to
|
||||||
create a user called ``paperless`` and set up Paperless to be in a place that
|
create a user called ``paperless`` (without login, if you have not already done so in #5) and
|
||||||
this new user can read and write to. Be sure to edit the service scripts to point
|
set up Paperless to be in a place that this new user can read and write to. Be sure
|
||||||
to the proper location of your paperless install, referencing the appropriate Python
|
to edit the service scripts to point to the proper location of your paperless install,
|
||||||
binary. For example: ``ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer``.
|
referencing the appropriate Python binary. For example:
|
||||||
|
``ExecStart=/path/to/python3 /path/to/paperless/src/manage.py document_consumer``.
|
||||||
If you don't want to make a new user, you can change the ``Group`` and ``User`` variables
|
If you don't want to make a new user, you can change the ``Group`` and ``User`` variables
|
||||||
accordingly.
|
accordingly.
|
||||||
|
|
||||||
@ -344,7 +349,7 @@ after restarting your system:
|
|||||||
If you are using a network interface other than ``eth0``, you will have to
|
If you are using a network interface other than ``eth0``, you will have to
|
||||||
change ``IFACE=eth0``. For example, if you are connected via WiFi, you will
|
change ``IFACE=eth0``. For example, if you are connected via WiFi, you will
|
||||||
likely need to replace ``eth0`` above with ``wlan0``. To see all interfaces,
|
likely need to replace ``eth0`` above with ``wlan0``. To see all interfaces,
|
||||||
run ``ifconfig``.
|
run ``ifconfig -a``.
|
||||||
|
|
||||||
Save the file.
|
Save the file.
|
||||||
|
|
||||||
|
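The setup hunk above says that ``/etc/paperless.conf`` contains passwords and should only be readable by root and paperless. The following is a minimal sketch of how one might verify that on a Unix-like system. It is not part of this commit or of Paperless itself: the helper name ``is_world_accessible`` is hypothetical, and the ``0o640`` mode (owner plus one group, nothing for others) is an assumption about how "root and paperless only" would be expressed. The demonstration runs against a throwaway temporary file rather than the real configuration file.

```python
import os
import stat
import tempfile


def is_world_accessible(path):
    """Return True if users outside the owner and group have any access bit set.

    Hypothetical helper, not part of Paperless.
    """
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IRWXO)


# Demonstrate on a throwaway file instead of the real /etc/paperless.conf.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o644)  # readable by everyone: too loose for a file with passwords
loose = is_world_accessible(demo_path)

os.chmod(demo_path, 0o640)  # assumed target: owner read/write, group read, others nothing
tight = is_world_accessible(demo_path)

os.remove(demo_path)
print(loose, tight)
```

To check a real install, one would call ``is_world_accessible("/etc/paperless.conf")`` and tighten the permissions if it returns ``True``. Who owns the file and which group it belongs to is a separate question that this sketch does not check.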