documentation

This commit is contained in:
jonaswinkler 2021-02-21 13:35:47 +01:00
parent bac4a63cc8
commit ef4009e94f
3 changed files with 14 additions and 2 deletions

View File

@ -10,12 +10,12 @@ paperless-ng 1.2.0
* Changes to the OCRmyPDF integration
* Added support for deskewing and automatic rotation of incorrectly rotated pages.
* Added support for deskewing and automatic rotation of incorrectly rotated pages. This is disabled by default, see :ref:`configuration-ocr`.
* Better support for encrypted files.
* Better support for various other PDF files: Paperless will now attempt to force OCR with safe options when OCR fails with the configured options.
* Added an explicit option to skip cleaning with ``unpaper``.
* Download multiple selected document.
* Download multiple selected documents as a zip archive.
* The document list now remembers the current page.

View File

@ -244,6 +244,8 @@ PAPERLESS_OCR_MODE=<mode>
The default is ``skip``, which only performs OCR when necessary and always
creates archived documents.
Read more about this in the `OCRmyPDF documentation <https://ocrmypdf.readthedocs.io/en/latest/advanced.html#when-ocr-is-skipped>`_.
PAPERLESS_OCR_CLEAN=<mode>
Tells paperless to use ``unpaper`` to clean any input document before
sending it to tesseract. This uses more resources, but generally results
@ -256,12 +258,19 @@ PAPERLESS_OCR_CLEAN=<mode>
Defaults to ``clean``.
.. note::
``clean-final`` is incompatible with ocr mode ``redo``.
PAPERLESS_OCR_DESKEW=<bool>
Tells paperless to correct skewing (slight rotation of input images mainly
due to improper scanning)
Defaults to ``false``, which disables this feature.
.. note::
Deskewing is incompatible with ocr mode ``redo``.
PAPERLESS_OCR_ROTATE_PAGES=<bool>
Tells paperless to correct page rotation (90°, 180° and 270° rotation).

View File

@ -774,6 +774,9 @@ configuring some options in paperless can help improve performance immensely:
your documents before feeding them into paperless. Some scanners are able to
do this! You might want to even specify ``skip_noarchive`` to skip archive
file generation for already ocr'ed documents entirely.
* If you want to perform OCR on the the device, consider using ``PAPERLESS_OCR_CLEAN=none``.
This will speed up OCR times and use less memory at the expense of slightly worse
OCR results.
* Set ``PAPERLESS_OPTIMIZE_THUMBNAILS`` to 'false' if you want faster consumption
times. Thumbnails will be about 20% larger.
* If using docker, consider setting ``PAPERLESS_WEBSERVER_WORKERS`` to