enable deskewing and rotation by default

jonaswinkler 2021-02-21 23:40:26 +01:00
parent 265432f2a5
commit cb10617979
4 changed files with 23 additions and 17 deletions

@@ -10,7 +10,7 @@ paperless-ng 1.2.0
* Changes to the OCRmyPDF integration
* Added support for deskewing and automatic rotation of incorrectly rotated pages. This is disabled by default, see :ref:`configuration-ocr`.
* Added support for deskewing and automatic rotation of incorrectly rotated pages. This is enabled by default, see :ref:`configuration-ocr`.
* Better support for encrypted files.
* Better support for various other PDF files: Paperless will now attempt to force OCR with safe options when OCR fails with the configured options.
* Added an explicit option to skip cleaning with ``unpaper``.

@@ -260,22 +260,28 @@ PAPERLESS_OCR_CLEAN=<mode>
.. note::
``clean-final`` is incompatible with ocr mode ``redo``.
``clean-final`` is incompatible with ocr mode ``redo``. When both
``clean-final`` and the ocr mode ``redo`` are configured, ``clean``
is used instead.
PAPERLESS_OCR_DESKEW=<bool>
Tells paperless to correct skewing (slight rotation of input images mainly
due to improper scanning).
Defaults to ``false``, which disables this feature.
Defaults to ``true``, which enables this feature.
.. note::
Deskewing is incompatible with ocr mode ``redo``.
Deskewing is incompatible with ocr mode ``redo``. Deskewing will get
disabled automatically if ``redo`` is used as the ocr mode.
PAPERLESS_OCR_ROTATE_PAGES=<bool>
Tells paperless to correct page rotation (90°, 180° and 270° rotation).
Defaults to ``false``, which disables this feature.
If you notice that paperless is not rotating incorrectly rotated
pages (or vice versa), try adjusting the threshold up or down (see below).
Defaults to ``true``, which enables this feature.
PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=<num>
@@ -284,7 +290,7 @@ PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=<num>
whereas "2" is a very aggressive option and will often result in correctly rotated pages
being rotated as well.
Defaults to "10".
Defaults to "12".
PAPERLESS_OCR_OUTPUT_TYPE=<type>
Specify the type of PDF documents that paperless should produce.
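
The interaction documented in the two hunks above (``redo`` disables deskewing, ``clean-final`` falls back to ``clean``, and the rotation threshold only matters when rotation is on) can be sketched as a small function. This is a minimal illustration, not the project's actual parser code: `build_ocr_args` and its parameter names are hypothetical, and the dictionary keys are assumed to mirror the keyword names of OCRmyPDF's Python API.

```python
def build_ocr_args(mode, clean="clean", deskew=True,
                   rotate_pages=True, rotate_threshold=12.0):
    """Hypothetical helper: map OCR settings to OCRmyPDF-style keyword arguments.

    The defaults for clean, deskew, rotate_pages and rotate_threshold follow
    the new defaults introduced by this commit.
    """
    args = {}

    if mode == "redo":
        args["redo_ocr"] = True

    if clean == "clean":
        args["clean"] = True
    elif clean == "clean-final":
        # clean-final is incompatible with redo, so fall back to clean.
        if mode == "redo":
            args["clean"] = True
        else:
            args["clean_final"] = True

    # Deskewing is incompatible with redo and is dropped silently in that mode.
    if deskew and mode != "redo":
        args["deskew"] = True

    # The threshold is only passed along when page rotation is enabled.
    if rotate_pages:
        args["rotate_pages"] = True
        args["rotate_pages_threshold"] = rotate_threshold

    return args
```

With these assumptions, `build_ocr_args("redo", clean="clean-final")` contains neither `deskew` nor `clean_final`, which matches the behavior described in the notes.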
@@ -392,7 +398,7 @@ requires are as follows:
PAPERLESS_TIKA_ENABLED: 1
PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
PAPERLESS_TIKA_ENDPOINT: http://tika:9998
# ...
gotenberg:
@@ -622,10 +628,10 @@ USERMAP_UID=<uid>
.. code:: shell-session
$ id -u
Paperless will change ownership on its folders to this user, so you need to get this right
in order to be able to write to the consumption directory.
Defaults to 1000.
USERMAP_GID=<gid>
@@ -635,10 +641,10 @@ USERMAP_GID=<gid>
.. code:: shell-session
$ id -g
Paperless will change ownership on its folders to this group, so you need to get this right
in order to be able to write to the consumption directory.
Defaults to 1000.
PAPERLESS_OCR_LANGUAGES=<list>

@@ -42,9 +42,9 @@
#PAPERLESS_OCR_PAGES=1
#PAPERLESS_OCR_IMAGE_DPI=300
#PAPERLESS_OCR_CLEAN=clean
#PAPERLESS_OCR_DESKEW=false
#PAPERLESS_OCR_ROTATE_PAGES=false
#PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=10
#PAPERLESS_OCR_DESKEW=true
#PAPERLESS_OCR_ROTATE_PAGES=true
#PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=12.0
#PAPERLESS_OCR_USER_ARGS={}
#PAPERLESS_CONVERT_MEMORY_LIMIT=0
#PAPERLESS_CONVERT_TMPDIR=/var/tmp/paperless

@@ -457,11 +457,11 @@ OCR_IMAGE_DPI = os.getenv("PAPERLESS_OCR_IMAGE_DPI")
OCR_CLEAN = os.getenv("PAPERLESS_OCR_CLEAN", "clean")
OCR_DESKEW = __get_boolean("PAPERLESS_OCR_DESKEW")
OCR_DESKEW = __get_boolean("PAPERLESS_OCR_DESKEW", "true")
OCR_ROTATE_PAGES = __get_boolean("PAPERLESS_OCR_ROTATE_PAGES")
OCR_ROTATE_PAGES = __get_boolean("PAPERLESS_OCR_ROTATE_PAGES", "true")
OCR_ROTATE_PAGES_THRESHOLD = float(os.getenv("PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD", 10.0))
OCR_ROTATE_PAGES_THRESHOLD = float(os.getenv("PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD", 12.0))
OCR_USER_ARGS = os.getenv("PAPERLESS_OCR_USER_ARGS", "{}")
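
To illustrate how the new defaults in this hunk take effect, here is a minimal, self-contained sketch of environment-based settings parsing. The body of `__get_boolean` is not shown in this diff, so `get_boolean` below is a hypothetical stand-in, and the set of strings it treats as true is an assumption. `load_ocr_settings` is likewise a name made up for this example.

```python
import os

# Assumption: the strings treated as "true"; the real helper may accept others.
TRUTHY = ("yes", "y", "1", "t", "true")


def get_boolean(key, default="NO", env=os.environ):
    """Hypothetical stand-in for the settings helper: read a boolean from env."""
    return env.get(key, default).lower() in TRUTHY


def load_ocr_settings(env=os.environ):
    """Resolve the three settings touched by this commit, using the new defaults."""
    return {
        "OCR_DESKEW": get_boolean("PAPERLESS_OCR_DESKEW", "true", env),
        "OCR_ROTATE_PAGES": get_boolean("PAPERLESS_OCR_ROTATE_PAGES", "true", env),
        "OCR_ROTATE_PAGES_THRESHOLD": float(
            env.get("PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD", 12.0)
        ),
    }
```

Under these assumptions, an empty environment yields deskewing and rotation enabled with a threshold of 12.0, while setting `PAPERLESS_OCR_DESKEW=false` and `PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=10` restores the previous deskew and threshold behavior.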