Mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-04-02 13:45:10 -05:00)
Update documentation for grammar and additional clarity
Small tweaks to be consistent in oxford comma usage already at work in the docs. More importantly, adding some clarification here and there to try and make things even more dead simple to read :)
Parent: 7bc8325df9
Commit: e655b40d3a
@@ -18,13 +18,13 @@ that had a ``match`` property of ``bc hydro`` and a ``matching_algorithm`` of
 your ``Home Utility`` tag so long as the text ``bc hydro`` appears in the body
 of the document somewhere.

-The matching logic is quite powerful, and supports searching the text of your
+The matching logic is quite powerful. It supports searching the text of your
 document with different algorithms, and as such, some experimentation may be
 necessary to get things right.

-In order to have a tag, correspondent or type assigned automatically to newly
+In order to have a tag, correspondent, or type assigned automatically to newly
 consumed documents, assign a match and matching algorithm using the web
-interface. These settings define when to assign correspondents, tags and types
+interface. These settings define when to assign correspondents, tags, and types
 to documents.

 The following algorithms are available:
@@ -34,16 +34,16 @@ The following algorithms are available:
 either of these terms.
 * **All:** Requires that every word provided appears in the PDF, albeit not in the
 order provided.
-* **Literal:** Matches only if the match appears exactly as provided in the PDF.
+* **Literal:** Matches only if the match appears exactly as provided (i.e. preserve ordering) in the PDF.
 * **Regular expression:** Parses the match as a regular expression and tries to
 find a match within the document.
 * **Fuzzy match:** I dont know. Look at the source.
 * **Auto:** Tries to automatically match new documents. This does not require you
 to set a match. See the notes below.

-When using the "any" or "all" matching algorithms, you can search for terms
+When using the *any* or *all* matching algorithms, you can search for terms
 that consist of multiple words by enclosing them in double quotes. For example,
-defining a match text of ``"Bank of America" BofA`` using the "any" algorithm,
+defining a match text of ``"Bank of America" BofA`` using the *any* algorithm,
 will match documents that contain either "Bank of America" or "BofA", but will
 not match documents containing "Bank of South America".

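The hunk above describes the *any*, *all*, *literal*, and *regular expression* algorithms only in prose. The following is a minimal sketch of that described behaviour, not the project's actual implementation. The function names (``split_terms``, ``contains``, ``document_matches``), the algorithm keys, the case-insensitive comparison, and the whole-word rule are all assumptions made for illustration. *Fuzzy match* and *Auto* are left out because the documentation text shown here does not describe how they work.

```python
import re


def split_terms(match):
    """Split match text into terms; double-quoted phrases stay together."""
    pairs = re.findall(r'"([^"]+)"|(\S+)', match)
    return [quoted or bare for quoted, bare in pairs]


def contains(term, text):
    """Whole-word, case-insensitive search (both rules are assumptions)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def document_matches(algorithm, match, text):
    """Hypothetical dispatcher over the algorithms described in the docs."""
    if algorithm == "any":
        # At least one term or quoted phrase must appear.
        return any(contains(term, text) for term in split_terms(match))
    if algorithm == "all":
        # Every term must appear, in any order.
        return all(contains(term, text) for term in split_terms(match))
    if algorithm == "literal":
        # The whole match text must appear exactly, order preserved.
        return contains(match, text)
    if algorithm == "regex":
        # The match text is treated as a regular expression.
        return re.search(match, text, re.IGNORECASE) is not None
    raise ValueError("unsupported algorithm: " + algorithm)
```

With the match text ``"Bank of America" BofA`` and the *any* algorithm, this sketch accepts a document containing "Bank of America" or "BofA" and rejects one containing only "Bank of South America", because the quoted phrase is searched for as a single unit.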
@@ -58,8 +58,8 @@ Automatic matching
 ==================

 Paperless-ng comes with a new matching algorithm called *Auto*. This matching
-algorithm tries to assign tags, correspondents and document types to your
-documents based on how you have assigned these on existing documents. It
+algorithm tries to assign tags, correspondents, and document types to your
+documents based on how you have already assigned these on existing documents. It
 uses a neural network under the hood.

 If, for example, all your bank statements of your account 123 at the Bank of
@@ -76,11 +76,11 @@ feature:
 changes. Paperless periodically (default: once each hour) checks for changes
 and does this automatically for you.
 * The Auto matching algorithm only takes documents into account which are NOT
-placed in your inbox (i.e., have inbox tags assigned to them). This ensures
+placed in your inbox (i.e. have any inbox tags assigned to them). This ensures
 that the neural network only learns from documents which you have correctly
 tagged before.
 * The matching algorithm can only work if there is a correlation between the
-tag, correspondent or document type and the document itself. Your bank
+tag, correspondent, or document type and the document itself. Your bank
 statements usually contain your bank account number and the name of the bank,
 so this works reasonably well, However, tags such as "TODO" cannot be
 automatically assigned.
@@ -167,7 +167,7 @@ into paperless. It receives the following arguments:
 * Correspondent
 * Tags

-The script can be in any language you like, but for a simple shell script
+The script can be written in any language, but for a simple shell script
 example, you can take a look at ``post-consumption-example.sh`` in the
 ``scripts`` directory in this project.

@@ -86,10 +86,9 @@ The consumption directory
 =========================

 The primary method of getting documents into your database is by putting them in
-the consumption directory. The consumer runs in an infinite
-loop looking for new additions to this directory and when it finds them, it goes
-about the process of parsing them with the OCR, indexing what it finds, and storing
-it in the media directory.
+the consumption directory. The consumer runs in an infinite loop, looking for new
+additions to this directory. When it finds them, the consumer goes about the process
+of parsing them with the OCR, indexing what it finds, and storing it in the media directory.

 Getting stuff into this directory is up to you. If you're running Paperless
 on your local computer, you might just want to drag and drop files there, but if
@@ -128,7 +127,7 @@ IMAP (Email)
 ============

 You can tell paperless-ng to consume documents from your email accounts.
-This is a very flexible and powerful feature, if you regularly received documents
+This is a very flexible and powerful feature if you regularly received documents
 via mail that you need to archive. The mail consumer can be configured by using the
 admin interface in the following manner:

@@ -396,7 +395,7 @@ Task management

 Some documents require attention and require you to act on the document. You
 may take two different approaches to handle these documents based on how
-regularly you intent to use paperless and scan documents.
+regularly you intend to scan documents and use paperless.

 * If you scan and process your documents in paperless regularly, assign a
 TODO tag to all scanned documents that you need to process. Create a saved