mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
Documented the new variables and updated the changelog
parent b387be6f25
commit 23aa79f307
@ -3,6 +3,10 @@ Changelog

* 0.2.0

  * `#98`_: Added optional environment variables for ImageMagick so that it
    doesn't explode when handling Very Large Documents or when it's just
    running on a low-memory system.  Thanks to `Florian Harr`_ for his help on
    this one.
  * Added support for guessing the date from the file name along with the
    correspondent, title, and tags.  Thanks to `Tikitu de Jager`_ for his pull
    request that I took forever to merge and to `Pit`_ for his efforts on the
@ -97,6 +101,7 @@ Changelog
.. _zedster: https://github.com/zedster
.. _Martin Honermeyer: https://github.com/djmaze
.. _Tim White: https://github.com/timwhite
.. _Florian Harr: https://github.com/evils

.. _#20: https://github.com/danielquinn/paperless/issues/20
.. _#44: https://github.com/danielquinn/paperless/issues/44
@ -111,3 +116,4 @@ Changelog
.. _#68: https://github.com/danielquinn/paperless/issues/68
.. _#71: https://github.com/danielquinn/paperless/issues/71
.. _#94: https://github.com/danielquinn/paperless/issues/94
.. _#98: https://github.com/danielquinn/paperless/issues/98
@ -3,17 +3,47 @@
Troubleshooting
===============

.. _troubleshooting_ocr_language_files_missing:
.. _troubleshooting-languagemissing:

Consumer warns ``OCR for XX failed``
------------------------------------

If you find the OCR accuracy to be too low, and/or the document consumer warns
that ``OCR for XX failed, but we're going to stick with what we've got since
FORGIVING_OCR is enabled``, then you might need to install the
`Tesseract language files <http://packages.ubuntu.com/search?keywords=tesseract-ocr>`_
matching your documents' languages.

As an example, if you are running Paperless from the Vagrant setup provided
(or from any Ubuntu or Debian box), and your documents are written in Spanish,
you may need to run::

    apt-get install -y tesseract-ocr-spa
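
If you're not sure which language packs are already installed, Tesseract can
list them with ``tesseract --list-langs``.  The following is only a rough
sketch of how you might check for a given language from Python; the function
names are made up for this example, the exact header line of the output varies
between Tesseract versions, and nothing here is part of Paperless itself.

```python
import subprocess


def parse_tesseract_langs(output: str) -> set[str]:
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        langs.add(line)
    return langs


def installed_tesseract_langs() -> set[str]:
    """Ask the local tesseract binary which language packs it can see."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    # Read both streams in case a given release prints the list on stderr.
    return parse_tesseract_langs(result.stdout + "\n" + result.stderr)


# Canned output, so the example runs even without tesseract installed.
sample = "List of available languages (3):\neng\nosd\nspa\n"
print("spa" in parse_tesseract_langs(sample))
```

If ``spa`` is missing from what ``installed_tesseract_langs()`` returns on your
machine, installing the matching ``tesseract-ocr-*`` package is the likely fix.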


.. _troubleshooting-convertpixelcache:

Consumer dies with ``convert: unable to extend pixel cache``
-------------------------------------------------------------

During the consumption process, Paperless invokes ImageMagick's ``convert``
program to translate the source document into something that the OCR engine can
understand, and this can burn a Very Large amount of memory if the original
document is rather long.  Similarly, if your system doesn't have a lot of
memory to begin with (e.g. a Raspberry Pi), then this can happen for even
medium-sized documents.

The solution is to tell ImageMagick *not* to Use All The RAM, as is its
default, and instead tell it to use a fixed amount.  ``convert`` will then
break up the job into hundreds of individual files and use them to slowly
compile the finished image.  Simply set ``PAPERLESS_CONVERT_MEMORY_LIMIT`` in
``/etc/paperless.conf`` to something like ``32000000`` and you'll limit
``convert`` to 32 MB.  Fiddle with this value as you like.
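
To make the mechanism a little more concrete, here is a minimal sketch of how
such a limit could be handed to ``convert``.  ImageMagick documents a
``MAGICK_MEMORY_LIMIT`` environment variable; the assumption here is that the
Paperless setting is simply passed through under that name.  The
``convert_env`` helper is hypothetical and is not Paperless's actual code.

```python
import os


def convert_env(config, base_env=None):
    """Return an environment for `convert` with an optional memory cap.

    `config` is any mapping holding the Paperless settings (assumed to have
    been read from /etc/paperless.conf already).
    """
    env = dict(os.environ if base_env is None else base_env)
    limit = config.get("PAPERLESS_CONVERT_MEMORY_LIMIT")
    if limit:
        # ImageMagick reads its resource limits from MAGICK_* variables.
        env["MAGICK_MEMORY_LIMIT"] = str(int(limit))
    return env


env = convert_env({"PAPERLESS_CONVERT_MEMORY_LIMIT": "32000000"}, base_env={})
print(env)
# The value is in bytes: 32000000 / 1000000 = 32 MB.
print(int(env["MAGICK_MEMORY_LIMIT"]) / 1_000_000, "MB")
```

The resulting dictionary would then be passed as ``env=`` to whatever
``subprocess`` call runs ``convert``.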

**HOWEVER**: Simply setting this value may not be enough on systems where
``/tmp`` is mounted as tmpfs, as this is where ``convert`` will write its
temporary files.  In these cases (most systemd machines), you need to tell
ImageMagick to use a different space for its scratch work.  You do this by
setting ``PAPERLESS_CONVERT_TMPDIR`` in ``/etc/paperless.conf`` to somewhere
that's actually on a physical disk (and writable by the user running
Paperless), like ``/var/tmp/paperless`` or ``/home/my_user/tmp`` in a pinch.
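
If you want to check whether a candidate directory really lives on a physical
disk, one option on Linux is to look it up in ``/proc/mounts``.  The sketch
below is an illustration only: ``filesystem_type`` is a hypothetical helper, it
does not resolve symlinks, and it ignores the escaping that ``/proc/mounts``
uses for mount points containing spaces.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format:
    "device mount_point fs_type options dump pass", one mount per line.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare whole path components so /tmpfoo is not treated as /tmp.
        prefix = mount_point.rstrip("/") + "/"
        inside = (path + "/").startswith(prefix)
        # The longest matching mount point wins; later entries win ties.
        if inside and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Canned mount table; on a real system use open("/proc/mounts").read().
sample = "/dev/sda1 / ext4 rw 0 0\ntmpfs /tmp tmpfs rw,nosuid 0 0\n"
print(filesystem_type("/tmp/magick-1234", sample))
print(filesystem_type("/var/tmp/paperless", sample))
```

A directory that reports ``tmpfs`` is held in memory, so pointing
``PAPERLESS_CONVERT_TMPDIR`` at it would not relieve the memory pressure.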