mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
Merge pull request #40 from davemachado/master
Update documentation for grammar and additional clarity
This commit is contained in:
commit
edd21d1233
@ -18,13 +18,13 @@ that had a ``match`` property of ``bc hydro`` and a ``matching_algorithm`` of
|
||||
your ``Home Utility`` tag so long as the text ``bc hydro`` appears in the body
|
||||
of the document somewhere.
|
||||
|
||||
The matching logic is quite powerful, and supports searching the text of your
|
||||
The matching logic is quite powerful. It supports searching the text of your
|
||||
document with different algorithms, and as such, some experimentation may be
|
||||
necessary to get things right.
|
||||
|
||||
In order to have a tag, correspondent or type assigned automatically to newly
|
||||
In order to have a tag, correspondent, or type assigned automatically to newly
|
||||
consumed documents, assign a match and matching algorithm using the web
|
||||
interface. These settings define when to assign correspondents, tags and types
|
||||
interface. These settings define when to assign correspondents, tags, and types
|
||||
to documents.
|
||||
|
||||
The following algorithms are available:
|
||||
@ -34,16 +34,16 @@ The following algorithms are available:
|
||||
either of these terms.
|
||||
* **All:** Requires that every word provided appears in the PDF, albeit not in the
|
||||
order provided.
|
||||
* **Literal:** Matches only if the match appears exactly as provided in the PDF.
|
||||
* **Literal:** Matches only if the match appears exactly as provided (i.e. preserve ordering) in the PDF.
|
||||
* **Regular expression:** Parses the match as a regular expression and tries to
|
||||
find a match within the document.
|
||||
* **Fuzzy match:** I dont know. Look at the source.
|
||||
* **Auto:** Tries to automatically match new documents. This does not require you
|
||||
to set a match. See the notes below.
|
||||
|
||||
When using the "any" or "all" matching algorithms, you can search for terms
|
||||
When using the *any* or *all* matching algorithms, you can search for terms
|
||||
that consist of multiple words by enclosing them in double quotes. For example,
|
||||
defining a match text of ``"Bank of America" BofA`` using the "any" algorithm,
|
||||
defining a match text of ``"Bank of America" BofA`` using the *any* algorithm,
|
||||
will match documents that contain either "Bank of America" or "BofA", but will
|
||||
not match documents containing "Bank of South America".
|
||||
|
||||
@ -58,8 +58,8 @@ Automatic matching
|
||||
==================
|
||||
|
||||
Paperless-ng comes with a new matching algorithm called *Auto*. This matching
|
||||
algorithm tries to assign tags, correspondents and document types to your
|
||||
documents based on how you have assigned these on existing documents. It
|
||||
algorithm tries to assign tags, correspondents, and document types to your
|
||||
documents based on how you have already assigned these on existing documents. It
|
||||
uses a neural network under the hood.
|
||||
|
||||
If, for example, all your bank statements of your account 123 at the Bank of
|
||||
@ -76,11 +76,11 @@ feature:
|
||||
changes. Paperless periodically (default: once each hour) checks for changes
|
||||
and does this automatically for you.
|
||||
* The Auto matching algorithm only takes documents into account which are NOT
|
||||
placed in your inbox (i.e., have inbox tags assigned to them). This ensures
|
||||
placed in your inbox (i.e. have any inbox tags assigned to them). This ensures
|
||||
that the neural network only learns from documents which you have correctly
|
||||
tagged before.
|
||||
* The matching algorithm can only work if there is a correlation between the
|
||||
tag, correspondent or document type and the document itself. Your bank
|
||||
tag, correspondent, or document type and the document itself. Your bank
|
||||
statements usually contain your bank account number and the name of the bank,
|
||||
so this works reasonably well, However, tags such as "TODO" cannot be
|
||||
automatically assigned.
|
||||
@ -167,7 +167,7 @@ into paperless. It receives the following arguments:
|
||||
* Correspondent
|
||||
* Tags
|
||||
|
||||
The script can be in any language you like, but for a simple shell script
|
||||
The script can be written in any language, but for a simple shell script
|
||||
example, you can take a look at ``post-consumption-example.sh`` in the
|
||||
``scripts`` directory in this project.
|
||||
|
||||
|
@ -86,10 +86,9 @@ The consumption directory
|
||||
=========================
|
||||
|
||||
The primary method of getting documents into your database is by putting them in
|
||||
the consumption directory. The consumer runs in an infinite
|
||||
loop looking for new additions to this directory and when it finds them, it goes
|
||||
about the process of parsing them with the OCR, indexing what it finds, and storing
|
||||
it in the media directory.
|
||||
the consumption directory. The consumer runs in an infinite loop, looking for new
|
||||
additions to this directory. When it finds them, the consumer goes about the process
|
||||
of parsing them with the OCR, indexing what it finds, and storing it in the media directory.
|
||||
|
||||
Getting stuff into this directory is up to you. If you're running Paperless
|
||||
on your local computer, you might just want to drag and drop files there, but if
|
||||
@ -128,7 +127,7 @@ IMAP (Email)
|
||||
============
|
||||
|
||||
You can tell paperless-ng to consume documents from your email accounts.
|
||||
This is a very flexible and powerful feature, if you regularly received documents
|
||||
This is a very flexible and powerful feature if you regularly received documents
|
||||
via mail that you need to archive. The mail consumer can be configured by using the
|
||||
admin interface in the following manner:
|
||||
|
||||
@ -396,7 +395,7 @@ Task management
|
||||
|
||||
Some documents require attention and require you to act on the document. You
|
||||
may take two different approaches to handle these documents based on how
|
||||
regularly you intent to use paperless and scan documents.
|
||||
regularly you intend to scan documents and use paperless.
|
||||
|
||||
* If you scan and process your documents in paperless regularly, assign a
|
||||
TODO tag to all scanned documents that you need to process. Create a saved
|
||||
|
Loading…
x
Reference in New Issue
Block a user