Addresses #118

2026-01-28 22:59:03 -06:00 · 2016-04-30 13:57:31 +01:00
parent 3c79b55ae6
commit a0d268ebbc
1 changed files with 27 additions and 0 deletions
--- a/docs/troubleshooting.rst
+++ b/docs/troubleshooting.rst
@@ -47,3 +47,30 @@ ImageMagick to use a different space for its scratch work.  You do this by
 setting ``PAPERLESS_CONVERT_TMPDIR`` in ``/etc/paperless.conf`` to somewhere
 that's actually on a physical disk (and writable by the user running
 Paperless), like ``/var/tmp/paperless`` or ``/home/my_user/tmp`` in a pinch.
+
+
+.. _troubleshooting-decompressionbombwarning:
+
+DecompressionBombWarning and/or no text in the OCR output
+---------------------------------------------------------
+Some users have had issues using Paperless to consume PDFs that were created
+by merging very large scanned images into one PDF.  If this happens to you,
+it's likely because the PDF you've created contains some very large pages
+(millions of pixels) and the process of converting the PDF to an OCR-friendly
+image either raises ``DecompressionBombWarning`` or yields no text.
+
+Typically, this happens because the scanned images are created with a high
+DPI and then rolled into the PDF with an assumed DPI of 72 (the default),
+making each page far larger than the paper it was scanned from.  The best
+solution, then, is to specify the scan's DPI in the conversion-to-PDF step.
+For example, if you scanned the original images at 300 DPI, merging them
+into a single PDF with ``convert`` should look like this:
+
+.. code:: bash
+
+    $ convert -density 300 *.jpg finished.pdf
+
+For more information on this and similar situations, take a look at
+`Issue #118`_, where this tip originated.
+
+.. _Issue #118: https://github.com/danielquinn/paperless/issues/118