Add support for using pre-existing text from PDFs

2026-02-03 23:22:42 -06:00 · 2018-01-30 20:13:35 +00:00
parent 31c8cf020e
commit cd92c005e3
7 changed files with 60 additions and 13 deletions
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -3,7 +3,15 @@ Changelog

 * 1.2.0
  * New Docker image, now based on Alpine, thanks to the efforts of `addadi`_
-  and `Pit`_.
+    and `Pit`_.
+  * `BastianPoe`_ has added the long-awaited feature to automatically skip the
+    OCR step when the PDF already contains text. This can be overridden by
+    setting ``PAPERLESS_OCR_ALWAYS=YES`` either in your ``paperless.conf`` or
+    in the environment.  Note that this also means that Paperless now requires
+    ``libpoppler-cpp-dev`` to be installed. **You'll need to run**
+    ``pip install -r requirements.txt`` **after the usual** ``git pull`` **to
+    properly update**.
+
 * 1.1.0
  * Fix for `#283`_, a redirect bug which broke interactions with
    paperless-desktop.  Thanks to `chris-aeviator`_ for reporting it.
@@ -272,6 +280,7 @@ Changelog
 .. _chris-aeviator: https://github.com/chris-aeviator
 .. _Dan Panzarella: https://github.com/pzl
 .. _addadi: https://github.com/addadi
+.. _BastianPoe: https://github.com/BastianPoe

 .. _#20: https://github.com/danielquinn/paperless/issues/20
 .. _#44: https://github.com/danielquinn/paperless/issues/44