paperless-ngx

mirror of https://github.com/paperless-ngx/paperless-ngx.git synced 2026-02-24 00:59:35 -06:00

Files

Pit Kleyersburg f5beda9c56 Enable parallel OCR processing

At the moment, every page in a PDF is processed one by one using
tesseract. Since the processing of a single page is independent of every
other page, one can make use of multi-core machines.

This PR introduces a multiprocessing pool to process multiple pages
simultaneously. The number of threads to use can be specified in the
environment variable `PAPERLESS_OCR_THREADS`. If the variable is not set,
it defaults to the number of cores/hyperthreads Python detects for your
system.
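
A minimal sketch of the approach described above, not the code from this commit. It assumes the pool size is read from `PAPERLESS_OCR_THREADS` and falls back to `multiprocessing.cpu_count()`. The names `get_ocr_threads`, `ocr_page`, and `ocr_pages` are hypothetical, and `ocr_page` is only a stand-in for the real per-page tesseract call.

```python
import multiprocessing
import os


def get_ocr_threads(environ=None):
    """Return the pool size (hypothetical helper, not from the commit)."""
    environ = os.environ if environ is None else environ
    value = environ.get("PAPERLESS_OCR_THREADS")
    if value is None:
        # Fall back to the number of cores/hyperthreads Python detects.
        return multiprocessing.cpu_count()
    return int(value)


def ocr_page(page_path):
    """Stand-in for the per-page tesseract call (assumption)."""
    return "text of {}".format(page_path)


def ocr_pages(page_paths, threads=None):
    """Process pages in parallel; each page is independent of the others."""
    threads = threads or get_ocr_threads()
    with multiprocessing.Pool(processes=threads) as pool:
        # Pool.map returns results in the same order as the input pages.
        return pool.map(ocr_page, page_paths)


if __name__ == "__main__":
    print(ocr_pages(["page-0.pnm", "page-1.pnm"], threads=2))
```

Although the variable name says "threads", a `multiprocessing.Pool` runs its workers as separate processes, which is what lets CPU-bound OCR work spread across cores.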

2016-02-14 15:57:42 +01:00

__init__.py

It works!

2015-12-20 19:23:33 +00:00

db.py

Better variable names

2016-01-23 03:05:40 +00:00

settings.py

Enable parallel OCR processing

2016-02-14 15:57:42 +01:00

urls.py

The 'API' is written but untested

2016-02-08 23:46:16 +00:00

version.py

version bump

2016-02-11 12:25:12 +00:00

wsgi.py

It works!

2015-12-20 19:23:33 +00:00