3371 Commits

Author SHA1 Message Date
Pit Kleyersburg
20b2408dbb Ensure OCR_THREADS is integer, add documentation 2016-02-14 16:37:38 +01:00
Pit Kleyersburg
f5beda9c56 Enable parallel OCR processing
At the moment, every page in a PDF will be processed one by one using
tesseract. Since the processing of a single page is independent from every
other page, one can make use of multi-core machines.

This PR introduces a multiprocessing pool to process multiple pages
simultaneously. The amount of threads to use can be specified in the
environment variable `PAPERLESS_OCR_THREADS`. This will default to the
number of cores/hyperthreads Python detects for your system.
2016-02-14 15:57:42 +01:00
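
The approach described in f5beda9c56, together with the integer check from 20b2408dbb, might look roughly like the sketch below. Only the `PAPERLESS_OCR_THREADS` variable and the use of a multiprocessing pool come from the commit messages; the function names `ocr_threads`, `ocr_page`, and `ocr_document` are hypothetical, and `ocr_page` is a placeholder rather than the project's actual tesseract call. Note that a multiprocessing pool starts worker processes, not threads, despite the variable's name.

```python
import os
from multiprocessing import Pool, cpu_count


def ocr_threads(environ=os.environ):
    """Return the worker count as an integer.

    Reads PAPERLESS_OCR_THREADS if it is set and non-empty; otherwise
    falls back to the number of cores/hyperthreads Python detects.
    """
    raw = environ.get("PAPERLESS_OCR_THREADS")
    return int(raw) if raw else cpu_count()


def ocr_page(page_path):
    # Hypothetical placeholder: a real implementation would run tesseract
    # on the page image here and return the recognised text.
    return "text of {}".format(page_path)


def ocr_document(page_paths, threads=None):
    """OCR every page in parallel; results keep the order of page_paths."""
    with Pool(processes=threads or ocr_threads()) as pool:
        return pool.map(ocr_page, page_paths)


if __name__ == "__main__":
    # Assumed file names, for illustration only.
    print(ocr_document(["page-0.pnm", "page-1.pnm"], threads=2))
```

Because each page is independent of the others, `Pool.map` can hand pages to workers in any order while still returning the results in page order.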
Daniel Quinn
6b0a537bff Added support for a shared secret in email 2016-02-14 03:01:24 +00:00
Daniel Quinn
3b5d4cdd39 Added some error handling 2016-02-14 01:32:25 +00:00
Daniel Quinn
fc5d89c6fc Added a default algorithm 2016-02-14 01:30:41 +00:00
Daniel Quinn
d9b7851de9 Added a default algorithm 2016-02-14 01:30:18 +00:00
Daniel Quinn
330dfa544b Fixed a typo in the description. There's no need for a new migration here. 2016-02-14 00:10:37 +00:00
Daniel Quinn
a846b3f7b8 Adding some more debugging 2016-02-13 00:57:05 +00:00
Daniel Quinn
840472071c Added the required verbosity reference 2016-02-12 08:27:28 +00:00
Daniel Quinn
2421f559be Simpler regex 2016-02-12 08:27:09 +00:00
Daniel Quinn
a022fcb8f1 Fixed the auto-naming regexes 2016-02-11 22:05:55 +00:00
Daniel Quinn
7aadab23cc Added the Renderable mixin because DRY 2016-02-11 22:05:38 +00:00
Daniel Quinn
ef1639208c Tests for the consumer 2016-02-11 12:25:23 +00:00
Daniel Quinn
cef4abc01d version bump 2016-02-11 12:25:12 +00:00
Daniel Quinn
c423a13f85 Added a simple re-tagger 2016-02-11 12:24:18 +00:00
Daniel Quinn
39134b517e Cleaned up file_name() 2016-02-10 23:53:48 +00:00
Daniel Quinn
4a078dcfbc Merge branch 'master' into feature/images-as-docs 2016-02-09 17:20:45 +00:00
Daniel Quinn
0eaed36420 The 'API' is written but untested 2016-02-08 23:46:16 +00:00
Daniel Quinn
212752f46e Fixt the tags to be optional 2016-02-08 17:28:59 +00:00
Daniel Quinn
0c729e5675 Changed the name, forgot to change the check.
Closes #17
2016-02-08 11:14:57 +00:00
Daniel Quinn
c4311af263 Cleaned up the tests 2016-02-06 17:41:11 +00:00
Daniel Quinn
febb45af81 Prettied up the interface a little 2016-02-06 17:27:17 +00:00
Daniel Quinn
ce69e37256 Linked tag labels 2016-02-06 17:14:44 +00:00
Daniel Quinn
48761911b3 Image imports and consumption by mail work 2016-02-06 17:05:36 +00:00
Daniel Quinn
71075a691a The mailconsumer isn't a consumer at all. Best fixt that 2016-02-05 20:15:08 +00:00
Daniel Quinn
d8ad6b589b Added pytest and broke up the consumer into file and mail 2016-02-05 00:23:36 +00:00
Daniel Quinn
3bc89d23c8 Sorting the filters 2016-02-03 17:20:12 +00:00
Daniel Quinn
a70b40f618 Broke the consumer script into separate files and started on a mail consumer 2016-01-30 01:18:52 +00:00
Daniel Quinn
84d5f8cc5d Merge branch 'master' into feature/images-as-docs 2016-01-29 23:41:13 +00:00
Daniel Quinn
ace9389e5f #12: Support image documents 2016-01-29 23:18:03 +00:00
Daniel Quinn
10e4f0f5f3 Added some better admin for tags 2016-01-28 18:37:27 +00:00
Daniel Quinn
a7d041a9f5 Prettied-up the admin 2016-01-28 08:16:29 +00:00
Daniel Quinn
3026593d6c Version bump for automated tagging 2016-01-28 07:29:25 +00:00
Daniel Quinn
0ec63ae1f9 #11: automatic tagging support 2016-01-28 07:23:11 +00:00
Daniel Quinn
286292dbf9 Added some documentation 2016-01-24 20:15:50 -05:00
Daniel Quinn
04bcb1cdad Forced python3 for setups not using a virtualenv 2016-01-24 12:31:02 +00:00
Daniel Quinn
669cf1cb70 Add labels (#9) 2016-01-23 04:40:35 +00:00
Daniel Quinn
1219e81e77 Moved changes to where it should be 2016-01-23 03:44:51 +00:00
Daniel Quinn
65074b4375 Smarter check positions 2016-01-23 03:42:39 +00:00
Daniel Quinn
0eb0c88d3d Now the exporter sets the proper dates 2016-01-23 03:22:15 +00:00
Daniel Quinn
796e977894 Django insists on adding every little thing as a migration 2016-01-23 03:14:55 +00:00
Daniel Quinn
4f1bf81d5b Better variable names 2016-01-23 03:05:40 +00:00
Daniel Quinn
9e596953a3 pep8 2016-01-23 02:58:03 +00:00
Daniel Quinn
d69df37fb6 The exporter now re-dates the files 2016-01-23 02:57:29 +00:00
Daniel Quinn
fdb29f739f Added language detection 2016-01-23 02:33:29 +00:00
Daniel Quinn
bcdcfbaee0 Added a manual language lookup based on ISO639 2016-01-23 02:33:04 +00:00
Daniel Quinn
fbbaf9cce0 Organised and documented project settings 2016-01-23 02:28:39 +00:00
Daniel Quinn
15fb83078c Added Changes file. Perhaps proper releases soon? 2016-01-23 02:27:35 +00:00
Daniel Quinn
ec70d05517 Introducing language detection 2016-01-21 12:50:22 -05:00
the01
4c1ff658d2 add language setting for tesseract 2016-01-21 09:24:13 +01:00