diff --git a/docs/configuration.rst b/docs/configuration.rst
index 1f9da8b51..2ec34f803 100644
--- a/docs/configuration.rst
+++ b/docs/configuration.rst
@@ -183,11 +183,11 @@ PAPERLESS_OCR_MODE=
     are available:
 
     * ``skip``: Paperless skips all pages and will perform ocr only on pages
-      where no text is present. This is the safest and fastest option.
+      where no text is present. This is the safest option.
     * ``skip_noarchive``: In addition to skip, paperless won't create
       an archived version of your documents when it finds any text in them.
       This is useful if you don't want to have two almost-identical versions
-      of your digital documents in the media folder.
+      of your digital documents in the media folder. This is the fastest option.
     * ``redo``: Paperless will OCR all pages of your documents and attempt to
       replace any existing text layers with new text. This will be useful for
       documents from scanners that already performed OCR with insufficient
diff --git a/docs/setup.rst b/docs/setup.rst
index 746c0aa0d..e2b3d1ab9 100644
--- a/docs/setup.rst
+++ b/docs/setup.rst
@@ -220,16 +220,24 @@ writing. Windows is not and will never be supported.
     * ``python3-dev``
 
     * ``imagemagick`` >= 6 for PDF conversion
-    * ``unpaper`` for cleaning documents before OCR
-    * ``ghostscript``
     * ``optipng`` for optimising thumbnails
-    * ``tesseract-ocr`` >= 4.0.0 for OCR
-    * ``tesseract-ocr`` language packs (``tesseract-ocr-eng``, ``tesseract-ocr-deu``, etc)
     * ``gnupg`` for handling encrypted documents
     * ``libpoppler-cpp-dev`` for PDF to text conversion
     * ``libmagic-dev`` for mime type detection
     * ``libpq-dev`` for PostgreSQL
 
+    These dependencies are required for OCRmyPDF, which is used for text recognition.
+
+    * ``unpaper``
+    * ``ghostscript``
+    * ``icc-profiles-free``
+    * ``liblept5``
+    * ``libxml2``
+    * ``pngquant``
+    * ``zlib1g``
+    * ``tesseract-ocr`` >= 4.0.0 for OCR
+    * ``tesseract-ocr`` language packs (``tesseract-ocr-eng``, ``tesseract-ocr-deu``, etc)
+
     You will also need ``build-essential``, ``python3-setuptools`` and ``python3-wheel``
     for installing some of the python dependencies. You can remove that again after installation.