diff --git a/docs/changelog.rst b/docs/changelog.rst
index d5c48b2dc..116c2e07c 100644
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -8,6 +8,9 @@ Changelog
 paperless-ng 0.9.5
 ##################
 
+This release concludes the big changes I wanted to get rolled into paperless. The next releases before 1.0 will
+focus on fixing issues, primarily.
+
 * OCR
 
   * Paperless now uses `OCRmyPDF `_ to perform OCR on documents.
diff --git a/docs/faq.rst b/docs/faq.rst
index 9a5e73ea5..887946074 100644
--- a/docs/faq.rst
+++ b/docs/faq.rst
@@ -86,3 +86,15 @@ the documentation has instructions for bare metal installs. I'm running
 paperless on an i3 processor from 2015 or so. This is also what I use to test
 new releases with. Apart from that, I also have a Raspberry Pi, which I
 occasionally build the image on and see if it works.
+
+**Q:** *How do I proxy this with NGINX?*
+
+.. code::
+
+    location / {
+        proxy_pass http://localhost:8000/;
+    }
+
+And that's about it. Paperless serves everything, including static files, by itself
+when running the docker image. If you want to do anything fancy, you have to
+install paperless bare metal.
diff --git a/docs/troubleshooting.rst b/docs/troubleshooting.rst
index 9e1c42f4a..dc5bf7f5d 100644
--- a/docs/troubleshooting.rst
+++ b/docs/troubleshooting.rst
@@ -29,75 +29,23 @@ Check for the following issues:
 Consumer fails to pickup any new files
 ######################################
 
-If you notice, that the consumer will only pickup files in the consumption
+If you notice that the consumer will only pick up files in the consumption
 directory at startup, but won't find any other files added later, check out the
 configuration file and enable filesystem polling with the setting
 ``PAPERLESS_CONSUMER_POLLING``.
 
+Operation not permitted
+#######################
 
-Consumer warns ``OCR for XX failed``
-####################################
+You might see errors such as:
 
-If you find the OCR accuracy to be too low, and/or the document consumer warns
-that ``OCR for XX failed, but we're going to stick with what we've got since
-FORGIVING_OCR is enabled``, then you might need to install the
-`Tesseract language files `_
-marching your document's languages.
+.. code::
 
-As an example, if you are running Paperless from any Ubuntu or Debian
-box, and your documents are written in Spanish you may need to run::
+    chown: changing ownership of '../export': Operation not permitted
 
-    apt-get install -y tesseract-ocr-spa
+The container tries to set file ownership on the listed directories. This is
+required so that the user running paperless inside docker has write permissions
+to these folders. The error occurs when pointing these directories to NFS shares,
+for example.
 
-
-
-Consumer dies with ``convert: unable to extent pixel cache``
-############################################################
-
-During the consumption process, Paperless invokes ImageMagick's ``convert``
-program to translate the source document into something that the OCR engine can
-understand and this can burn a Very Large amount of memory if the original
-document is rather long. Similarly, if your system doesn't have a lot of
-memory to begin with (ie. a Raspberry Pi), then this can happen for even
-medium-sized documents.
-
-The solution is to tell ImageMagick *not* to Use All The RAM, as is its
-default, and instead tell it to used a fixed amount. ``convert`` will then
-break up the job into hundreds of individual files and use them to slowly
-compile the finished image. Simply set ``PAPERLESS_CONVERT_MEMORY_LIMIT`` in
-``/etc/paperless.conf`` to something like ``32000000`` and you'll limit
-``convert`` to 32MB. Fiddle with this value as you like.
-
-**HOWEVER**: Simply setting this value may not be enough on system where
-``/tmp`` is mounted as tmpfs, as this is where ``convert`` will write its
-temporary files. In these cases (most Systemd machines), you need to tell
-ImageMagick to use a different space for its scratch work. You do this by
-setting ``PAPERLESS_CONVERT_TMPDIR`` in ``/etc/paperless.conf`` to somewhere
-that's actually on a physical disk (and writable by the user running
-Paperless), like ``/var/tmp/paperless`` or ``/home/my_user/tmp`` in a pinch.
-
-
-DecompressionBombWarning and/or no text in the OCR output
-#########################################################
-
-Some users have had issues using Paperless to consume PDFs that were created
-by merging Very Large Scanned Images into one PDF. If this happens to you,
-it's likely because the PDF you've created contains some very large pages
-(millions of pixels) and the process of converting the PDF to a OCR-friendly
-image is exploding.
-
-Typically, this happens because the scanned images are created with a high
-DPI and then rolled into the PDF with an assumed DPI of 72 (the default).
-The best solution then is to specify the DPI used in the scan in the
-conversion-to-PDF step. So for example, if you scanned the original image
-with a DPI of 300, then merging the images into the single PDF with
-``convert`` should look like this:
-
-.. code:: bash
-
-    $ convert -density 300 *.jpg finished.pdf
-
-For more information on this and situations like it, you should take a look
-at `Issue #118`_ as that's where this tip originated.
-
-.. _Issue #118: https://github.com/the-paperless-project/paperless/issues/118
+Ensure that ``chown`` is possible on these directories.