new setting: PAPERLESS_OCR_PAGES

Jonas Winkler
2020-11-22 12:54:08 +01:00
parent ea089de3b3
commit fec9e54049
6 changed files with 54 additions and 5 deletions

@@ -184,6 +184,16 @@ PAPERLESS_TIME_ZONE=<timezone>
PAPERLESS_OCR_PAGES=<num>
Tells paperless to OCR only the specified number of pages. Documents
with fewer pages than the specified number are OCR'ed completely.
Specifying 1 here will only use the first page.
Defaults to 0, which disables this feature and always uses all pages.
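The page-limit rule described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the helper name pages_to_ocr and its parameters are hypothetical, and only the behaviour stated in the description (0 disables the limit, short documents are processed completely) is modelled.

```python
def pages_to_ocr(total_pages: int, ocr_pages: int = 0) -> range:
    """Return the zero-based indices of the pages that would be OCR'ed.

    Hypothetical helper: `ocr_pages` stands in for the value of
    PAPERLESS_OCR_PAGES, `total_pages` for the document's page count.
    """
    if total_pages < 0 or ocr_pages < 0:
        raise ValueError("page counts must not be negative")
    # 0 disables the feature; documents at or below the limit are
    # processed completely as well.
    if ocr_pages == 0 or total_pages <= ocr_pages:
        return range(total_pages)
    # Otherwise only the first `ocr_pages` pages are used.
    return range(ocr_pages)
```

With this sketch, a 10-page document and a limit of 1 yields only the first page, while a limit of 0 yields all 10 pages.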
PAPERLESS_OCR_LANGUAGE=<lang>
Customize the default language that tesseract will attempt to use when
parsing documents. The default language is used whenever