mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
226 lines
9.1 KiB
ReStructuredText
226 lines
9.1 KiB
ReStructuredText
**************
|
|
Usage Overview
|
|
**************
|
|
|
|
Paperless is an application that manages your personal documents. With
|
|
the help of a document scanner (see :ref:`scanners`), paperless transforms
|
|
your wieldy physical document binders into a searchable archive and
|
|
provices many utilities for finding and managing your documents.
|
|
|
|
|
|
Terms and definitions
|
|
#####################
|
|
|
|
Paperless esentially consists of two different parts for managing your
|
|
documents:
|
|
|
|
* The *consumer* watches a specified folder and adds all documents in that
|
|
folder to paperless.
|
|
* The *web server* provides a UI that you use to manage and search for your
|
|
scanned documents.
|
|
|
|
Each document has a couple of fields that you can assign to them:
|
|
|
|
* A *Document* is a piece of paper that sometimes contains valuable
|
|
information.
|
|
* The *correspondent* of a document is the person, institution or company that
|
|
a document either originates form, or is sent to.
|
|
* A *tag* is a label that you can assign to documents. Think of labels as more
|
|
powerful folders: Multiple documents can be grouped together with a single
|
|
tag, however, a single document can also have multiple tags. This is not
|
|
possible with folders. The reason folders are not implemented in paperless
|
|
is simply that tags are much more versatile than folders.
|
|
* A *document type* is used to demarkate the type of a document such as letter,
|
|
bank statement, invoice, contract, etc. It is used to identify what a document
|
|
is about.
|
|
* The *date added* of a document is the date the document was scanned into
|
|
paperless. You cannot and should not change this date.
|
|
* The *date created* of a document is the date the document was intially issued.
|
|
This can be the date you bought a product, the date you signed a contract, or
|
|
the date a letter was sent to you.
|
|
* The *archive serial number* (short: ASN) of a document is the identifier of
|
|
the document in your physical document binders. See
|
|
:ref:`usage-recommended_workflow` below.
|
|
* The *content* of a document is the text that was OCR'ed from the document.
|
|
This text is fed into the search engine and is used for matching tags,
|
|
correspondents and document types.
|
|
|
|
|
|
Frontend overview
|
|
#################
|
|
|
|
.. warning::
|
|
|
|
TBD. Add some fancy screenshots!
|
|
|
|
Adding documents to paperless
|
|
#############################
|
|
|
|
Once you've got Paperless setup, you need to start feeding documents into it.
|
|
Currently, there are three options: the consumption directory, IMAP (email), and
|
|
HTTP POST.
|
|
|
|
|
|
The consumption directory
|
|
=========================
|
|
|
|
The primary method of getting documents into your database is by putting them in
|
|
the consumption directory. The consumer runs in an infinite
|
|
loop looking for new additions to this directory and when it finds them, it goes
|
|
about the process of parsing them with the OCR, indexing what it finds, and storing
|
|
it in the media directory.
|
|
|
|
Getting stuff into this directory is up to you. If you're running Paperless
|
|
on your local computer, you might just want to drag and drop files there, but if
|
|
you're running this on a server and want your scanner to automatically push
|
|
files to this directory, you'll need to setup some sort of service to accept the
|
|
files from the scanner. Typically, you're looking at an FTP server like
|
|
`Proftpd`_ or a Windows folder share with `Samba`_.
|
|
|
|
.. _Proftpd: http://www.proftpd.org/
|
|
.. _Samba: http://www.samba.org/
|
|
|
|
.. TODO: hyperref to configuration of the location of this magic folder.
|
|
|
|
|
|
IMAP (Email)
|
|
============
|
|
|
|
You can tell paperless-ng to consume documents from your email accounts.
|
|
This is a very flexible and powerful feature, if you regularly received documents
|
|
via mail that you need to archive. The mail consumer can be configured by using the
|
|
admin interface in the following manner:
|
|
|
|
1. Define e-mail accounts.
|
|
2. Define mail rules for your account.
|
|
|
|
These rules perform the following:
|
|
|
|
1. Connect to the mail server.
|
|
2. Fetch all matching mails (as defined by folder, maximum age and the filters)
|
|
3. Check if there are any consumable attachments.
|
|
4. If so, instruct paperless to consume the attachments and optionally
|
|
use the metadata provided in the rule for the new document.
|
|
5. If documents were consumed from a mail, the rule action is performed
|
|
on that mail.
|
|
|
|
Paperless will completely ignore mails that do not match your filters. It will also
|
|
only perform the action on mails that it has consumed documents from.
|
|
|
|
The actions all ensure that the same mail is not consumed twice by different means.
|
|
These are as follows:
|
|
|
|
* **Delete:** Immediately deletes mail that paperless has consumed documents from.
|
|
Use with caution.
|
|
* **Mark as read:** Mark consumed mail as read. Paperless will not consume documents
|
|
from already read mails. If you read a mail before paperless sees it, it will be
|
|
ignored.
|
|
* **Flag:** Sets the 'important' flag on mails with consumed documents. Paperless
|
|
will not consume flagged mails.
|
|
* **Move to folder:** Moves consumed mails out of the way so that paperless wont
|
|
consume them again.
|
|
|
|
.. caution::
|
|
|
|
The mail consumer will perform these actions on all mails it has consumed
|
|
documents from. Keep in mind that the actual consumption process may fail
|
|
for some reason, leaving you with missing documents in paperless.
|
|
|
|
Paperless is set up to check your mails every 10 minutes. this can be configured on the
|
|
'Scheduled tasks' page in the admin.
|
|
|
|
|
|
REST API
|
|
========
|
|
|
|
You can also submit a document using the REST API, see :ref:`api-file_uploads` for details.
|
|
|
|
|
|
.. _usage-recommended_workflow:
|
|
|
|
The recommended workflow
|
|
########################
|
|
|
|
Once you have familiarized yourself with paperless and are ready to use it
|
|
for all your documents, the recommended workflow for managing your documents
|
|
is as follows. This workflow also takes into account that some documents
|
|
have to be kept in physical form, but still ensures that you get all the
|
|
advantages for these documents as well.
|
|
|
|
The following diagram shows how easy it is to manage your documents.
|
|
|
|
.. image:: _static/recommended_workflow.png
|
|
|
|
Preparations in paperless
|
|
=========================
|
|
|
|
* Create an inbox tag that gets assigned to all new documents.
|
|
* Create a TODO tag.
|
|
|
|
Processing of the physical documents
|
|
====================================
|
|
|
|
Keep a physical inbox. Whenever you receive a document that you need to
|
|
archive, put it into your inbox. Regulary, do the following for all documents
|
|
in your inbox:
|
|
|
|
1. For each document, decide if you need to keep the document in physical
|
|
form. This applies to certain important documents, such as contracts and
|
|
certificates.
|
|
2. If you need to keep the document, write a running number on the document
|
|
before scanning, starting at one and counting upwards. This is the archive
|
|
serial number, or ASN in short.
|
|
3. Scan the document.
|
|
4. If the document has an ASN assigned, store it in a *single* binder, sorted
|
|
by ASN. Don't order this binder in any other way.
|
|
5. If the document has no ASN, throw it away. Yay!
|
|
|
|
Over time, you will notice that your physical binder will fill up. If it is
|
|
full, label the binder with the range of ASNs in this binder (i.e., "Documents
|
|
1 to 343"), store the binder in your cellar or elsewhere, and start a new
|
|
binder.
|
|
|
|
The idea behind this process is that you will never have to use the physical
|
|
binders to find a document. If you need a specific physical document, you
|
|
may find this document by:
|
|
|
|
1. Searching in paperless for the document.
|
|
2. Identify the ASN of the document, since it appears on the scan.
|
|
3. Grab the relevant document binder and get the document. This is easy since
|
|
they are sorted by ASN.
|
|
|
|
Processing of documents in paperless
|
|
====================================
|
|
|
|
Once you have scanned in a document, proceed in paperless as follows.
|
|
|
|
1. If the document has an ASN, assign the ASN to the document.
|
|
2. Assign a correspondent to the document (i.e., your employer, bank, etc)
|
|
This isnt strictly necessary but helps in finding a document when you need
|
|
it.
|
|
3. Assign a document type (i.e., invoice, bank statement, etc) to the document
|
|
This isnt strictly necessary but helps in finding a document when you need
|
|
it.
|
|
4. Assign a proper title to the document (the name of an item you bought, the
|
|
subject of the letter, etc)
|
|
5. Check that the date of the document is corrent. Paperless tries to read
|
|
the date from the content of the document, but this fails sometimes if the
|
|
OCR is bad or multiple dates appear on the document.
|
|
6. Remove inbox tags from the documents.
|
|
|
|
|
|
Task management
|
|
===============
|
|
|
|
Some documents require attention and require you to act on the document. You
|
|
may take two different approaches to handle these documents based on how
|
|
regularly you intent to use paperless and scan documents.
|
|
|
|
* If you scan and process your documents in paperless regularly, assign a
|
|
TODO tag to all scanned documents that you need to process. Create a saved
|
|
view on the dashboard that shows all documents with this tag.
|
|
* If you do not scan documents regularly and use paperless solely for archiving,
|
|
create a physical todo box next to your physical inbox and put documents you
|
|
need to process in the TODO box. When you performed the task associated with
|
|
the document, move it to the inbox.
|