Add PAPERLESS_OCR_SKIP_ARCHIVE_FILE config setting
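
The removed `skip_noarchive` OCR mode bundled two behaviours: skipping OCR on pages that already contain text, and not creating an archive file for documents with text. As a rough sketch only, and assuming `with_text` is intended to match the old archive behaviour (the docs change below suggests this but does not state it), an existing setup might be migrated with an env fragment like this:

```shell
# Hypothetical migration from PAPERLESS_OCR_MODE=skip_noarchive.
# Assumption: `skip` plus `with_text` approximates the old combined mode.
PAPERLESS_OCR_MODE=skip
PAPERLESS_OCR_SKIP_ARCHIVE_FILE=with_text
```

Splitting the setting this way should also allow combinations the old mode could not express, such as `redo` together with `always`.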

2026-02-09 23:49:29 -06:00 · 2023-02-23 22:42:57 -05:00
parent 8a89f5ae27
commit ca412e0184
8 changed files with 185 additions and 14 deletions
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -415,12 +415,6 @@ modes are available:
    -   `skip`: Paperless skips all pages and will perform ocr only on
        pages where no text is present. This is the safest option.

-    -   `skip_noarchive`: In addition to skip, paperless won't create
-        an archived version of your documents when it finds any text in
-        them. This is useful if you don't want to have two
-        almost-identical versions of your digital documents in the media
-        folder. This is the fastest option.
-
    -   `redo`: Paperless will OCR all pages of your documents and
        attempt to replace any existing text layers with new text. This
        will be useful for documents from scanners that already
@@ -443,6 +437,19 @@ modes are available:
    Read more about this in the [OCRmyPDF
    documentation](https://ocrmypdf.readthedocs.io/en/latest/advanced.html#when-ocr-is-skipped).

+`PAPERLESS_OCR_SKIP_ARCHIVE_FILE=<mode>`
+
+: Specify when paperless should skip creating an archived version of
+your documents. This is useful if you don't want two almost-identical
+versions of your documents in the media folder.
+
+    -   `never`: Never skip creating an archived version.
+    -   `with_text`: Skip creating an archived version for documents
+        that already have embedded text.
+    -   `always`: Always skip creating an archived version.
+
+    The default is `never`.
+
 `PAPERLESS_OCR_CLEAN=<mode>`

 : Tells paperless to use `unpaper` to clean any input document before