mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
Documentation: Replace 'PDF' with 'document'
There are more supported file formats than just PDF.
parent
0559204be4
commit
8218b1aa51
@@ -49,17 +49,17 @@ The Consumer
 ------------
 
 The consumer script runs in an infinite loop, constantly looking at a directory
-for PDF files to parse and index. The process is pretty straightforward:
+for documents to parse and index. The process is pretty straightforward:
 
-1. Look in ``CONSUMPTION_DIR`` for a PDF. If one is found, go to #2. If not,
-wait 10 seconds and try again.
-2. Parse the PDF with Tesseract
+1. Look in ``CONSUMPTION_DIR`` for a document. If one is found, go to #2.
+If not, wait 10 seconds and try again.
+2. Parse the document with Tesseract
 3. Create a new record in the database with the OCR'd text
 4. Attempt to automatically assign document attributes by doing some guesswork.
 Read up on the :ref:`guesswork documentation<guesswork>` for more
 information about this process.
-5. Encrypt the PDF and store it in the ``media`` directory under
-``documents/pdf``.
+5. Encrypt the document and store it in the ``media`` directory under
+``documents/originals``.
 6. Go to #1.
 
 
@@ -74,7 +74,7 @@ The consumer is started via the ``manage.py`` script:
 
 $ /path/to/paperless/src/manage.py document_consumer
 
-This starts the service that will run in a loop, consuming PDF files as they
+This starts the service that will run in a loop, consuming documents as they
 appear in ``CONSUMPTION_DIR``.
 
 Note that this command runs continuously, so exiting it will mean your webserver
@@ -97,8 +97,8 @@ The Exporter
 ------------
 
 Tired of fiddling with Paperless, or just want to do something stupid and are
-afraid of accidentally damaging your files? You can export all of your PDFs
-into neatly named, dated, and unencrypted.
+afraid of accidentally damaging your files? You can export all of your
+documents into neatly named, dated, and unencrypted files.
 
 
 .. _utilities-exporter-howto:
@@ -112,10 +112,10 @@ This too is done via the ``manage.py`` script:
 
 $ /path/to/paperless/src/manage.py document_exporter /path/to/somewhere/
 
-This will dump all of your unencrypted PDFs into ``/path/to/somewhere`` for you
-to do with as you please. The files are accompanied with a special file,
-``manifest.json`` which can be used to
-:ref:`import the files <utilities-importer>` at a later date if you wish.
+This will dump all of your unencrypted documents into ``/path/to/somewhere``
+for you to do with as you please. The files are accompanied with a special
+file, ``manifest.json`` which can be used to :ref:`import the files
+<utilities-importer>` at a later date if you wish.
 
 
 .. _utilities-exporter-howto-docker:
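
For readers unfamiliar with the consumer loop that the first hunk documents, the polling behaviour in steps 1 and 6 can be sketched roughly as follows. This is an illustrative sketch only, not the Paperless implementation: the names ``find_next_document``, ``poll_once``, ``run_forever`` and the ``handle`` callback are hypothetical, and ``handle`` merely stands in for steps 2 through 5 (OCR, database record, guesswork, encrypted storage).

```python
import time
from pathlib import Path
from typing import Callable, Optional

# The 10-second wait mentioned in step 1 of the documented process.
POLL_INTERVAL_SECONDS = 10


def find_next_document(consumption_dir: Path) -> Optional[Path]:
    """Return one regular file from the directory, or None if there is none."""
    for candidate in sorted(consumption_dir.iterdir()):
        if candidate.is_file():  # skip subdirectories
            return candidate
    return None


def poll_once(consumption_dir: Path, handle: Callable[[Path], None]) -> bool:
    """Run one pass of the loop; return True if a document was handled."""
    document = find_next_document(consumption_dir)
    if document is None:
        return False
    handle(document)  # hypothetical stand-in for steps 2 through 5
    return True


def run_forever(consumption_dir: Path, handle: Callable[[Path], None]) -> None:
    """Loop indefinitely (step 6), sleeping only when nothing was found."""
    while True:
        if not poll_once(consumption_dir, handle):
            time.sleep(POLL_INTERVAL_SECONDS)
```

In this sketch ``handle`` is assumed to move or delete the file once it has been processed; otherwise the same file would be picked up again on the next pass.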