mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-04-02 13:45:10 -05:00)
documentation for the new configuration options
parent 6da237dd9e
commit b978994525
@@ -202,7 +202,6 @@ Paperless uses `OCRmyPDF <https://ocrmypdf.readthedocs.io/en/latest/>`_ for
 performing OCR on documents and images. Paperless uses sensible defaults for
 most settings, but all of them can be configured to your needs.
 
-
 PAPERLESS_OCR_LANGUAGE=<lang>
     Customize the language that paperless will attempt to use when
     parsing documents.
@@ -245,6 +244,39 @@ PAPERLESS_OCR_MODE=<mode>
     The default is ``skip``, which only performs OCR when necessary and always
     creates archived documents.
 
+PAPERLESS_OCR_CLEAN=<mode>
+    Tells paperless to use ``unpaper`` to clean any input document before
+    sending it to tesseract. This uses more resources, but generally results
+    in better OCR results. The following modes are available:
+
+    * ``clean``: Apply unpaper.
+    * ``clean-final``: Apply unpaper, and use the cleaned images to build the
+      output file instead of the original images.
+    * ``none``: Do not apply unpaper.
+
+    Defaults to ``clean``.
+
+PAPERLESS_OCR_DESKEW=<bool>
+    Tells paperless to correct skewing (slight rotation of input images, mostly
+    due to improper scanning).
+
+    Defaults to ``false``, which disables this feature.
+
+
+PAPERLESS_OCR_ROTATE_PAGES=<bool>
+    Tells paperless to correct page rotation (90°, 180° and 270° rotation).
+
+    Defaults to ``false``, which disables this feature.
+
+
+PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=<num>
+    Adjust the threshold for automatic page rotation by ``PAPERLESS_OCR_ROTATE_PAGES``.
+    This is an arbitrary value reported by tesseract. "15" is a very conservative value,
+    whereas "2" is a very aggressive option and will often result in correctly rotated pages
+    being rotated as well.
+
+    Defaults to "10".
+
 PAPERLESS_OCR_OUTPUT_TYPE=<type>
     Specify the type of PDF documents that paperless should produce.
 
@@ -271,7 +303,6 @@ PAPERLESS_OCR_PAGES=<num>
 
     Defaults to 0, which disables this feature and always uses all pages.
 
-
 PAPERLESS_OCR_IMAGE_DPI=<num>
     Paperless will OCR any images you put into the system and convert them
     into PDF documents. This is useful if your scanner produces images.
@@ -282,8 +313,8 @@ PAPERLESS_OCR_IMAGE_DPI=<num>
 
     Set this to the DPI your scanner produces images at.
 
-    Default is none, which causes paperless to fail if no DPI information is
-    present in an image.
+    Default is none, which will automatically calculate image DPI so that
+    the produced PDF documents are A4 sized.
 
 
 PAPERLESS_OCR_USER_ARGS=<json>
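
As a rough illustration of how the options documented in this commit might be set together, here is a minimal configuration sketch in environment-variable form. The values are assumptions chosen for illustration, not recommendations from the commit: ``true`` as the spelling of ``<bool>`` is inferred from the documented ``false`` default, ``12`` is simply a value between the "15" and "2" extremes described for the rotation threshold, and ``300`` stands in for whatever DPI your scanner actually produces.

```shell
# Hypothetical configuration sketch; every value here is illustrative only.

# Clean pages with unpaper and build the output file from the cleaned images.
PAPERLESS_OCR_CLEAN=clean-final

# Correct slight skew from scanning (documented default: false).
PAPERLESS_OCR_DESKEW=true

# Correct 90, 180 and 270 degree page rotation (documented default: false).
PAPERLESS_OCR_ROTATE_PAGES=true

# Slightly more conservative than the documented default of 10.
PAPERLESS_OCR_ROTATE_PAGES_THRESHOLD=12

# Uncomment only if your scanner emits images at a known DPI. When unset,
# the DPI is calculated so that the produced PDF documents are A4 sized.
# PAPERLESS_OCR_IMAGE_DPI=300
```

Leaving any of these unset keeps the documented defaults, so a reasonable approach is to enable one option at a time and compare the OCR results on a few representative scans.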