mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-07-28 18:24:38 -05:00
Add PAPERLESS_OCR_SKIP_ARCHIVE_FILE config setting
@@ -415,12 +415,6 @@ modes are available:
 - `skip`: Paperless skips all pages and will perform ocr only on
 pages where no text is present. This is the safest option.
 
-- `skip_noarchive`: In addition to skip, paperless won't create
-an archived version of your documents when it finds any text in
-them. This is useful if you don't want to have two
-almost-identical versions of your digital documents in the media
-folder. This is the fastest option.
-
 - `redo`: Paperless will OCR all pages of your documents and
 attempt to replace any existing text layers with new text. This
 will be useful for documents from scanners that already
@@ -443,6 +437,19 @@ modes are available:
 Read more about this in the [OCRmyPDF
 documentation](https://ocrmypdf.readthedocs.io/en/latest/advanced.html#when-ocr-is-skipped).
 
+`PAPERLESS_OCR_SKIP_ARCHIVE_FILE=<mode>`
+
+: Specify when you would like paperless to skip creating an archived
+version of your documents. This is useful if you don't want to have two
+almost-identical versions of your documents in the media folder.
+
+- `never`: Never skip creating an archived version.
+- `with_text`: Skip creating an archived version for documents
+that already have embedded text.
+- `always`: Always skip creating an archived version.
+
+The default is `never`.
+
 `PAPERLESS_OCR_CLEAN=<mode>`
 
 : Tells paperless to use `unpaper` to clean any input document before
@@ -818,9 +818,10 @@ performance immensely:
 other tasks).
 - Keep `PAPERLESS_OCR_MODE` at its default value `skip` and consider
 OCR'ing your documents before feeding them into paperless. Some
-scanners are able to do this! You might want to even specify
-`skip_noarchive` to skip archive file generation for already ocr'ed
-documents entirely.
+scanners are able to do this!
+- Set `PAPERLESS_OCR_SKIP_ARCHIVE_FILE` to `with_text` to skip archive
+file generation for already ocr'ed documents, or `always` to skip it
+for all documents.
 - If you want to perform OCR on the device, consider using
 `PAPERLESS_OCR_CLEAN=none`. This will speed up OCR times and use
 less memory at the expense of slightly worse OCR results.
@@ -60,8 +60,8 @@ following operations on your documents:
 
 This process can be configured to fit your needs. If you don't want
 paperless to create archived versions for digital documents, you can
-configure that by configuring `PAPERLESS_OCR_MODE=skip_noarchive`.
-Please read the
+configure that by configuring
+`PAPERLESS_OCR_SKIP_ARCHIVE_FILE=with_text`. Please read the
 [relevant section in the documentation](/configuration#ocr).
 
 !!! note
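
The documentation changes above split archive-file skipping out of `PAPERLESS_OCR_MODE` into its own setting. The following is a minimal sketch of the documented semantics, not paperless-ngx's actual implementation: the names `SkipArchiveFile`, `should_skip_archive`, and `migrate_legacy_settings` are hypothetical, and the mapping from `skip_noarchive` to `skip` plus `with_text` is inferred only from the before/after wording in this diff.

```python
from enum import Enum


class SkipArchiveFile(str, Enum):
    """The three values documented for PAPERLESS_OCR_SKIP_ARCHIVE_FILE."""

    NEVER = "never"
    WITH_TEXT = "with_text"
    ALWAYS = "always"


def should_skip_archive(mode: SkipArchiveFile, has_embedded_text: bool) -> bool:
    """Hypothetical helper: True when no archived version would be created."""
    if mode is SkipArchiveFile.ALWAYS:
        return True
    if mode is SkipArchiveFile.WITH_TEXT:
        # Only documents that already carry a text layer are skipped.
        return has_embedded_text
    # NEVER (the documented default): always create the archived version.
    return False


def migrate_legacy_settings(env: dict[str, str]) -> dict[str, str]:
    """Hypothetical helper: rewrite the removed `skip_noarchive` OCR mode.

    Assumption drawn from the diff: `skip_noarchive` behaved like `skip`
    plus skipping the archive file for documents that contain text.
    """
    migrated = dict(env)
    if migrated.get("PAPERLESS_OCR_MODE") == "skip_noarchive":
        migrated["PAPERLESS_OCR_MODE"] = "skip"
        # Keep an explicit user choice if one is already present.
        migrated.setdefault("PAPERLESS_OCR_SKIP_ARCHIVE_FILE", "with_text")
    return migrated
```

For example, `migrate_legacy_settings({"PAPERLESS_OCR_MODE": "skip_noarchive"})` yields `PAPERLESS_OCR_MODE=skip` together with `PAPERLESS_OCR_SKIP_ARCHIVE_FILE=with_text`, which matches the replacement made in the last hunk. Whether the application itself still accepts the old value is not stated in this diff.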