paperless-ngx

mirror of https://github.com/paperless-ngx/paperless-ngx.git synced 2025-07-20 17:44:56 -05:00


Pit Kleyersburg f5beda9c56 Enable parallel OCR processing

At the moment, every page in a PDF is processed one at a time using
tesseract. Since the processing of a single page is independent of every
other page, multiple pages can be processed at once on multi-core machines.

This PR introduces a multiprocessing pool to process multiple pages
simultaneously. The number of threads to use can be specified in the
environment variable `PAPERLESS_OCR_THREADS`. It defaults to the
number of cores/hyperthreads Python detects on your system.
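
A minimal sketch of the approach described above, not the code from this
commit: `ocr_page`, `worker_count` and `ocr_pages` are hypothetical names, and
`ocr_page` is a stand-in that returns a string instead of calling tesseract.
Only the `PAPERLESS_OCR_THREADS` variable and the use of a multiprocessing
pool come from the commit message.

```python
import os
from multiprocessing import Pool, cpu_count


def ocr_page(page_path):
    # Hypothetical stand-in for the per-page tesseract call; a real
    # implementation would run OCR on the image at page_path.
    return "text of " + page_path


def worker_count():
    # Read PAPERLESS_OCR_THREADS, falling back to the number of
    # cores/hyperthreads Python detects when the variable is unset.
    return int(os.environ.get("PAPERLESS_OCR_THREADS", cpu_count()))


def ocr_pages(page_paths):
    # Pages are independent, so map them across a pool of workers.
    # Pool.map returns results in the same order as the input pages.
    with Pool(worker_count()) as pool:
        return pool.map(ocr_page, page_paths)


if __name__ == "__main__":
    # Assumed file names, for illustration only.
    print(ocr_pages(["page-0.pnm", "page-1.pnm"]))
```

Note that despite the variable name, a multiprocessing pool starts worker
processes rather than threads, so each page is handled in its own process.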

2016-02-14 15:57:42 +01:00

documents

Enable parallel OCR processing

2016-02-14 15:57:42 +01:00

paperless

Enable parallel OCR processing

2016-02-14 15:57:42 +01:00

manage.py

Changed the name, forgot to change the check.

2016-02-08 11:14:57 +00:00