mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-04-02 13:45:10 -05:00)
commit 19bb29d5cd (parent 8cad12b154): documentation
@ -333,6 +333,36 @@ command:
|
|||||||
|
|
||||||
The command takes no arguments and processes all your mail accounts and rules.
|
The command takes no arguments and processes all your mail accounts and rules.
|
||||||
|
|
||||||
|
.. _utilities-archiver:
|
||||||
|
|
||||||
|
Creating archived documents
|
||||||
|
===========================
|
||||||
|
|
||||||
|
Paperless stores archived PDF/A documents alongside your original documents.
|
||||||
|
These archived documents will also contain selectable text for image-only
|
||||||
|
originals.
|
||||||
|
These documents are derived from the originals, which are always stored
|
||||||
|
unmodified. If you are upgrading from an earlier version of paperless, your documents
|
||||||
|
won't have archived versions.
|
||||||
|
|
||||||
|
This command creates PDF/A documents for your documents.
|
||||||
|
|
||||||
|
.. code::
|
||||||
|
|
||||||
|
document_archiver --overwrite
|
||||||
|
|
||||||
|
This command will only attempt to create archived documents when no archived
|
||||||
|
document exists yet, unless ``--overwrite`` is specified.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
This command essentially performs OCR on all your documents again,
|
||||||
|
according to your settings. If you run this with ``PAPERLESS_OCR_MODE=redo``,
|
||||||
|
it will potentially run for a very long time. You can cancel the command
|
||||||
|
at any time, since this command will skip already archived versions the next time
|
||||||
|
it is run.
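The skip-unless-``--overwrite`` behaviour described above can be sketched roughly as follows. This is only an illustration: ``Doc``, ``archive_filename`` and ``select_for_archiving`` are hypothetical names, not the actual paperless models or code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a stored document (not a paperless class)."""
    title: str
    archive_filename: Optional[str] = None  # None: no archived version yet


def select_for_archiving(docs: List[Doc], overwrite: bool = False) -> List[Doc]:
    """Return the documents an archiver run would attempt to process."""
    if overwrite:
        # With --overwrite, every document is processed again.
        return list(docs)
    # Otherwise only documents without an archived version are processed,
    # which is why a cancelled run can simply be started again later.
    return [d for d in docs if d.archive_filename is None]


docs = [Doc("invoice"), Doc("contract", archive_filename="contract.pdf")]
print([d.title for d in select_for_archiving(docs)])
print([d.title for d in select_for_archiving(docs, overwrite=True)])
```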
|
||||||
|
|
||||||
|
|
||||||
.. _utilities-encyption:
|
.. _utilities-encyption:
|
||||||
|
|
||||||
Managing encryption
|
Managing encryption
|
||||||
|
@ -5,6 +5,29 @@
|
|||||||
Changelog
|
Changelog
|
||||||
*********
|
*********
|
||||||
|
|
||||||
|
paperless-ng 0.9.5
|
||||||
|
##################
|
||||||
|
|
||||||
|
* OCR
|
||||||
|
|
||||||
|
* Paperless now uses `OCRmyPDF <https://github.com/jbarlow83/OCRmyPDF>`_ to perform OCR on documents.
|
||||||
|
* OCRmyPDF creates archived PDF/A documents with embedded text that can be selected in the front end.
|
||||||
|
* Paperless stores archived versions of documents alongside the originals. The originals can be
|
||||||
|
accessed on the document edit page, if available.
|
||||||
|
* Many of the configuration options regarding OCR have changed. See :ref:`configuration-ocr` for details.
|
||||||
|
* Paperless no longer guesses the language of your documents. It always uses the language that you
|
||||||
|
specified with ``PAPERLESS_OCR_LANGUAGE``. Be sure to set this to the language the majority of your
|
||||||
|
documents are in.
|
||||||
|
* The management command :ref:`document_archiver <utilities-archiver>` can be used to create archived versions for already
|
||||||
|
existing documents.
|
||||||
|
|
||||||
|
* Tags from consumption folder.
|
||||||
|
|
||||||
|
* Thanks to `jayme-github`_, paperless now consumes files from sub folders in the consumption folder and is able to assign tags
|
||||||
|
based on the sub folders a document was found in. This can be configured with ``PAPERLESS_CONSUMER_RECURSIVE`` and
|
||||||
|
``PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS``.
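As a rough sketch of how tags could be derived from sub folders, assuming each folder between the consumption directory and the file becomes one tag. The function name and the example paths are hypothetical; the actual consumer may differ in details.

```python
from pathlib import PurePosixPath
from typing import List


def tags_from_subdirs(consumption_dir: PurePosixPath, file_path: PurePosixPath) -> List[str]:
    """Return one tag name per sub folder between the consumption dir and the file."""
    relative = file_path.relative_to(consumption_dir)
    # Drop the last part, which is the file name itself.
    return list(relative.parts[:-1])


print(tags_from_subdirs(
    PurePosixPath("/consume"),
    PurePosixPath("/consume/bank/2020/statement.pdf"),
))
```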
|
||||||
|
|
||||||
|
|
||||||
paperless-ng 0.9.4
|
paperless-ng 0.9.4
|
||||||
##################
|
##################
|
||||||
|
|
||||||
@ -750,6 +773,7 @@ bulk of the work on this big change.
|
|||||||
|
|
||||||
* Initial release
|
* Initial release
|
||||||
|
|
||||||
|
.. _jayme-github: http://github.com/jayme-github
|
||||||
.. _Brian Conn: https://github.com/TheConnMan
|
.. _Brian Conn: https://github.com/TheConnMan
|
||||||
.. _Christopher Luu: https://github.com/nuudles
|
.. _Christopher Luu: https://github.com/nuudles
|
||||||
.. _Florian Jung: https://github.com/the01
|
.. _Florian Jung: https://github.com/the01
|
||||||
|
@ -152,6 +152,8 @@ PAPERLESS_AUTO_LOGIN_USERNAME=<username>
|
|||||||
|
|
||||||
Defaults to none, which disables this feature.
|
Defaults to none, which disables this feature.
|
||||||
|
|
||||||
|
.. _configuration-ocr:
|
||||||
|
|
||||||
OCR settings
|
OCR settings
|
||||||
############
|
############
|
||||||
|
|
||||||
@ -184,6 +186,8 @@ PAPERLESS_OCR_MODE=<mode>
|
|||||||
where no text is present. This is the safest and fastest option.
|
where no text is present. This is the safest and fastest option.
|
||||||
* ``skip_noarchive``: In addition to skip, paperless won't create an
|
* ``skip_noarchive``: In addition to skip, paperless won't create an
|
||||||
archived version of your documents when it finds any text in them.
|
archived version of your documents when it finds any text in them.
|
||||||
|
This is useful if you don't want to have two almost-identical versions
|
||||||
|
of your digital documents in the media folder.
|
||||||
* ``redo``: Paperless will OCR all pages of your documents and attempt to
|
* ``redo``: Paperless will OCR all pages of your documents and attempt to
|
||||||
replace any existing text layers with new text. This will be useful for
|
replace any existing text layers with new text. This will be useful for
|
||||||
documents from scanners that already performed OCR with insufficient
|
documents from scanners that already performed OCR with insufficient
|
||||||
@ -197,7 +201,8 @@ PAPERLESS_OCR_MODE=<mode>
|
|||||||
however, the resulting document may be significantly larger and text
|
however, the resulting document may be significantly larger and text
|
||||||
won't appear as sharp when zoomed in.
|
won't appear as sharp when zoomed in.
|
||||||
|
|
||||||
The default is ``skip``, which only performs OCR when necessary.
|
The default is ``skip``, which only performs OCR when necessary and always
|
||||||
|
creates archived documents.
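The three modes named above can be summarised in a small decision sketch. It is a simplification under stated assumptions: it treats "has text" as a per-document flag, whereas ``skip`` is described as working on pages, and ``plan_for`` is a hypothetical helper, not paperless code. Modes not named here are rejected.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    run_ocr: bool
    create_archive: bool


def plan_for(mode: str, has_text: bool) -> Plan:
    """Map an OCR mode and a 'document already has text' flag to a plan."""
    if mode == "skip":
        # OCR only when needed; an archived version is always created.
        return Plan(run_ocr=not has_text, create_archive=True)
    if mode == "skip_noarchive":
        # Like skip, but no archived version for documents that have text.
        return Plan(run_ocr=not has_text, create_archive=not has_text)
    if mode == "redo":
        # OCR everything again and replace existing text layers.
        return Plan(run_ocr=True, create_archive=True)
    raise ValueError(f"mode not covered by this sketch: {mode}")


for mode in ("skip", "skip_noarchive", "redo"):
    print(mode, plan_for(mode, has_text=True))
```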
|
||||||
|
|
||||||
PAPERLESS_OCR_OUTPUT_TYPE=<type>
|
PAPERLESS_OCR_OUTPUT_TYPE=<type>
|
||||||
Specify the type of PDF documents that paperless should produce.
|
Specify the type of PDF documents that paperless should produce.
|
||||||
@ -244,7 +249,7 @@ PAPERLESS_OCR_USER_ARG=<json>
|
|||||||
OCRmyPDF offers many more options. Use this parameter to specify any
|
OCRmyPDF offers many more options. Use this parameter to specify any
|
||||||
additional arguments you wish to pass to OCRmyPDF. Since Paperless uses
|
additional arguments you wish to pass to OCRmyPDF. Since Paperless uses
|
||||||
the API of OCRmyPDF, you have to specify these in a format that can be
|
the API of OCRmyPDF, you have to specify these in a format that can be
|
||||||
passed to the API. See `https://ocrmypdf.readthedocs.io/en/latest/api.html#reference`_
|
passed to the API. See `the API reference of OCRmyPDF <https://ocrmypdf.readthedocs.io/en/latest/api.html#reference>`_
|
||||||
for valid parameters. All command line options are supported, but they
|
for valid parameters. All command line options are supported, but they
|
||||||
use underscores instead of dashes.
|
use underscores instead of dashes.
|
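As an illustration of the naming rule, the following sketch turns command line option names into API-style names and prints a JSON value that could be used for ``PAPERLESS_OCR_USER_ARG``. The helper ``cli_option_to_api_name`` is hypothetical, and the chosen options and values are only examples; check the OCRmyPDF API reference for what is valid.

```python
import json


def cli_option_to_api_name(option: str) -> str:
    """Turn a command line option such as --rotate-pages-threshold into an API name."""
    return option.lstrip("-").replace("-", "_")


user_args = {
    cli_option_to_api_name("--deskew"): True,
    cli_option_to_api_name("--rotate-pages-threshold"): 12.0,
}

# The printed JSON is the kind of value the setting expects.
print(json.dumps(user_args))
```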
||||||
|
|
||||||
|
33
docs/faq.rst
@ -3,6 +3,18 @@
|
|||||||
Frequently asked questions
|
Frequently asked questions
|
||||||
**************************
|
**************************
|
||||||
|
|
||||||
|
**Q:** *What's the general plan for Paperless-ng?*
|
||||||
|
|
||||||
|
**A:** Paperless-ng is already almost feature-complete. This project will remain
|
||||||
|
as simple as it is right now. It will see improvements to features that are already there.
|
||||||
|
If you need advanced features such as document versions,
|
||||||
|
workflows or multi-user with customizable access to individual files, this is
|
||||||
|
not the tool for you.
|
||||||
|
|
||||||
|
Features that *are* planned are some more quality of life extensions for the searching
|
||||||
|
(e.g., search for similar documents, group results by correspondents with "more from this"
|
||||||
|
links, etc.), bulk editing and hierarchical tags.
|
||||||
|
|
||||||
**Q:** *I'm using docker. Where are my documents?*
|
**Q:** *I'm using docker. Where are my documents?*
|
||||||
|
|
||||||
**A:** Your documents are stored inside the docker volume ``paperless_media``.
|
**A:** Your documents are stored inside the docker volume ``paperless_media``.
|
||||||
@ -21,6 +33,18 @@ is
|
|||||||
files around manually. This folder is meant to be entirely managed by docker
|
files around manually. This folder is meant to be entirely managed by docker
|
||||||
and paperless.
|
and paperless.
|
||||||
|
|
||||||
|
**Q:** *Let's say you don't support this project anymore in a year. Can I easily move to other systems?*
|
||||||
|
|
||||||
|
**A:** Your documents are stored as plain files inside the media folder. You can always drag those files
|
||||||
|
out of that folder to use them elsewhere. Here are a couple of notes about that:
|
||||||
|
|
||||||
|
* Paperless never modifies your original documents. It keeps checksums of all documents and uses a
|
||||||
|
scheduled sanity checker to check that they remain the same.
|
||||||
|
* By default, paperless uses the internal ID of each document as its filename. This might not be very
|
||||||
|
convenient for export. However, you can adjust the way files are stored in paperless by
|
||||||
|
:ref:`configuring the filename format <advanced-file_name_handling>`.
|
||||||
|
* :ref:`The exporter <utilities-exporter>` is another easy way to get your files out of paperless with reasonable file names.
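The checksum idea from the first note can be sketched as follows. This is a minimal illustration, not the sanity checker itself: the hash algorithm (SHA-256) and the helper names are assumptions chosen for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Hash a file in chunks so that large documents need not fit in memory."""
    digest = hashlib.sha256()  # algorithm chosen for illustration only
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: Path, recorded_checksum: str) -> bool:
    """Compare the current checksum of a file with the one recorded earlier."""
    return file_checksum(path) == recorded_checksum


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "0000001.pdf"
    original.write_bytes(b"original content")
    recorded = file_checksum(original)         # stored when the file was added
    intact = is_unmodified(original, recorded)
    original.write_bytes(b"tampered content")  # simulate an unwanted change
    changed = not is_unmodified(original, recorded)
    print(intact, changed)
```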
|
||||||
|
|
||||||
**Q:** *What file types does paperless-ng support?*
|
**Q:** *What file types does paperless-ng support?*
|
||||||
|
|
||||||
**A:** Currently, the following files are supported:
|
**A:** Currently, the following files are supported:
|
||||||
@ -53,3 +77,12 @@ in your browser and paperless has to do much less work to serve the data.
|
|||||||
that automatically, I'm all ears. For now, you have to grab the latest release
|
that automatically, I'm all ears. For now, you have to grab the latest release
|
||||||
archive from the project page and build the image yourself. The release comes
|
archive from the project page and build the image yourself. The release comes
|
||||||
with the front end already compiled, so you don't have to do this on the Pi.
|
with the front end already compiled, so you don't have to do this on the Pi.
|
||||||
|
|
||||||
|
**Q:** *How do I run this on my toaster?*
|
||||||
|
|
||||||
|
**A:** I honestly don't know! As for all other devices that might be able
|
||||||
|
to run paperless, you're a bit on your own. If you can't run the docker image,
|
||||||
|
the documentation has instructions for bare metal installs. I'm running
|
||||||
|
paperless on an i3 processor from 2015 or so. This is also what I use to test
|
||||||
|
new releases with. Apart from that, I also have a Raspberry Pi, which I
|
||||||
|
occasionally build the image on and see if it works.
|
||||||
|
@ -42,6 +42,9 @@ resources in the documentation:
|
|||||||
learn about how paperless automates all tagging using machine learning.
|
learn about how paperless automates all tagging using machine learning.
|
||||||
* Paperless now comes with a :ref:`proper email consumer <usage-email>`
|
* Paperless now comes with a :ref:`proper email consumer <usage-email>`
|
||||||
that's fully tested and production ready.
|
that's fully tested and production ready.
|
||||||
|
* Paperless creates searchable PDF/A documents from whatever you put into
|
||||||
|
the consumption directory. This means that you can select text in
|
||||||
|
image-only documents coming from your scanner.
|
||||||
* See :ref:`this note <utilities-encyption>` about GnuPG encryption in
|
* See :ref:`this note <utilities-encyption>` about GnuPG encryption in
|
||||||
paperless-ng.
|
paperless-ng.
|
||||||
* Paperless is now integrated with a
|
* Paperless is now integrated with a
|
||||||
|
@ -60,6 +60,31 @@ Once you've got Paperless setup, you need to start feeding documents into it.
|
|||||||
Currently, there are three options: the consumption directory, IMAP (email), and
|
Currently, there are three options: the consumption directory, IMAP (email), and
|
||||||
HTTP POST.
|
HTTP POST.
|
||||||
|
|
||||||
|
When adding documents to paperless, it will perform the following operations on
|
||||||
|
your documents:
|
||||||
|
|
||||||
|
1. OCR the document, if it has no text. Digital documents usually have text,
|
||||||
|
and this step will be skipped for those documents.
|
||||||
|
2. Paperless will create an archivable PDF/A document from your document.
|
||||||
|
If this document is coming from your scanner, it will have embedded selectable text.
|
||||||
|
3. Paperless performs automatic matching of tags, correspondents and types on the
|
||||||
|
document before storing it in the database.
|
||||||
|
|
||||||
|
.. hint::
|
||||||
|
|
||||||
|
This process can be configured to fit your needs. If you don't want paperless
|
||||||
|
to create archived versions for digital documents, you can disable this by
|
||||||
|
setting ``PAPERLESS_OCR_MODE=skip_noarchive``. Please read the
|
||||||
|
:ref:`relevant section in the documentation <configuration-ocr>`.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
No matter which options you choose, Paperless will always store the original
|
||||||
|
document that it found in the consumption directory or in the mail and
|
||||||
|
will never overwrite that document. Archived versions are stored alongside the
|
||||||
|
digital versions.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
The consumption directory
|
The consumption directory
|
||||||
=========================
|
=========================
|
||||||
|