Daniel Quinn
cebc44f2c9
API is halfway there
2016-02-16 09:28:34 +00:00
Daniel Quinn
2f0da8ab25
Added download_url to the Document model
2016-02-15 22:38:18 +00:00
Pit Kleyersburg
7b227ffa2f
Fix matching if user supplied an empty value
2016-02-14 19:47:05 +01:00
Daniel Quinn
aea4af5d3b
Version bump and feature update
2016-02-14 17:18:28 +00:00
Daniel Quinn
a0f4f6c5f2
Fixed merge conflict and did some pep8
2016-02-14 17:13:48 +00:00
Pit Kleyersburg
aeab9a0e81
Detect language only on one page of PDF
Currently, the entire document is processed in order to detect its language.
If the detected language differs from the default, the entire document is
processed again with the new language.
This PR analyzes only the middle page to detect the language. If the language
matches the default, the remaining pages are processed with the default
language; otherwise, all pages are processed with the newly guessed language.
The number of pages processed drops from a worst case of `2n` to a worst case
of `n+1`.
2016-02-14 17:55:13 +01:00
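A minimal sketch of the middle-page strategy described in the commit above. This is not the code from the PR: `ocr_document`, `ocr_page`, `guess_language`, and the `"eng"` default are hypothetical names chosen for illustration, and the OCR and language-guessing steps are passed in as plain callables so the control flow can be shown on its own.

```python
from typing import Callable, List


def ocr_document(
    pages: List[str],
    ocr_page: Callable[[str, str], str],
    guess_language: Callable[[str], str],
    default_language: str = "eng",  # assumed default, for illustration only
) -> List[str]:
    """OCR every page while calling ocr_page at most len(pages) + 1 times."""
    if not pages:
        return []

    # OCR only the middle page with the default language and guess from it.
    middle = len(pages) // 2
    middle_text = ocr_page(pages[middle], default_language)
    language = guess_language(middle_text)

    if language == default_language:
        # Guess matches the default: reuse the middle page's text and
        # process only the remaining pages (n OCR calls in total).
        return [
            middle_text if index == middle else ocr_page(page, default_language)
            for index, page in enumerate(pages)
        ]

    # Guess differs: process all pages, including the middle one, again
    # with the guessed language (n + 1 OCR calls in total).
    return [ocr_page(page, language) for page in pages]
```

With `n` pages, the first branch makes `n` OCR calls and the second makes `n+1`, which matches the worst case stated in the commit message.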
Daniel Quinn
7843ea5037
Added and implemented a rudimentary logger
2016-02-14 16:09:52 +00:00
Pit Kleyersburg
20b2408dbb
Ensure OCR_THREADS is integer, add documentation
2016-02-14 16:37:38 +01:00
Pit Kleyersburg
f5beda9c56
Enable parallel OCR processing
At the moment, every page in a PDF is processed one at a time using
tesseract. Since the processing of a single page is independent of every
other page, this work can be spread across the cores of a multi-core machine.
This PR introduces a multiprocessing pool to process multiple pages
simultaneously. The number of threads to use can be specified in the
environment variable `PAPERLESS_OCR_THREADS`. It defaults to the
number of cores/hyperthreads Python detects on your system.
2016-02-14 15:57:42 +01:00
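A rough sketch of the approach described in the commit above, assuming a per-page worker function. Only the `PAPERLESS_OCR_THREADS` variable name comes from the commit message; `ocr_thread_count`, `ocr_pages`, and the `pool_cls` parameter are hypothetical names, and the real per-page tesseract call is left to the caller.

```python
import multiprocessing
import os
from typing import Any, Callable, List, Mapping, Sequence


def ocr_thread_count(environ: Mapping[str, str] = os.environ) -> int:
    """Read PAPERLESS_OCR_THREADS, falling back to the detected CPU count."""
    raw = environ.get("PAPERLESS_OCR_THREADS")
    if raw is None:
        # os.cpu_count() may return None if the count cannot be determined.
        return os.cpu_count() or 1
    # int() raises ValueError if the setting is not an integer.
    return int(raw)


def ocr_pages(
    pages: Sequence[Any],
    ocr_page: Callable[[Any], Any],
    threads: int,
    pool_cls: Callable[..., Any] = multiprocessing.Pool,  # hypothetical hook
) -> List[Any]:
    """Apply ocr_page to every page in parallel, preserving page order."""
    with pool_cls(processes=threads) as pool:
        # map() blocks until all pages are done and returns results in order.
        return pool.map(ocr_page, pages)
```

With the default `multiprocessing.Pool`, `ocr_page` has to be picklable (for example, a module-level function), and on platforms that spawn worker processes the call should sit under an `if __name__ == "__main__":` guard. The `pool_cls` parameter exists only so the sketch can also be run with `multiprocessing.pool.ThreadPool`.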
Daniel Quinn
6b0a537bff
Added support for a shared secret in email
2016-02-14 03:01:24 +00:00
Daniel Quinn
3b5d4cdd39
Added some error handling
2016-02-14 01:32:25 +00:00
Daniel Quinn
fc5d89c6fc
Added a default algorithm
2016-02-14 01:30:41 +00:00
Daniel Quinn
d9b7851de9
Added a default algorithm
2016-02-14 01:30:18 +00:00
Daniel Quinn
330dfa544b
Fixed a typo in the description. There's no need for a new migration here.
2016-02-14 00:10:37 +00:00
Daniel Quinn
a846b3f7b8
Adding some more debugging
2016-02-13 00:57:05 +00:00
Daniel Quinn
840472071c
Added the required verbosity reference
2016-02-12 08:27:28 +00:00
Daniel Quinn
2421f559be
Simpler regex
2016-02-12 08:27:09 +00:00
Daniel Quinn
a022fcb8f1
Fixed the auto-naming regexes
2016-02-11 22:05:55 +00:00
Daniel Quinn
7aadab23cc
Added the Renderable mixin because DRY
2016-02-11 22:05:38 +00:00
Daniel Quinn
ef1639208c
Tests for the consumer
2016-02-11 12:25:23 +00:00
Daniel Quinn
cef4abc01d
version bump
2016-02-11 12:25:12 +00:00
Daniel Quinn
c423a13f85
Added a simple re-tagger
2016-02-11 12:24:18 +00:00
Daniel Quinn
39134b517e
Cleaned up file_name()
2016-02-10 23:53:48 +00:00
Daniel Quinn
4a078dcfbc
Merge branch 'master' into feature/images-as-docs
2016-02-09 17:20:45 +00:00
Daniel Quinn
0eaed36420
The 'API' is written but untested
2016-02-08 23:46:16 +00:00
Daniel Quinn
212752f46e
Fixt the tags to be optional
2016-02-08 17:28:59 +00:00
Daniel Quinn
0c729e5675
Changed the name, forgot to change the check.
Closes #17
2016-02-08 11:14:57 +00:00
Daniel Quinn
c4311af263
Cleaned up the tests
2016-02-06 17:41:11 +00:00
Daniel Quinn
febb45af81
Prettied up the interface a little
2016-02-06 17:27:17 +00:00
Daniel Quinn
ce69e37256
Linked tag labels
2016-02-06 17:14:44 +00:00
Daniel Quinn
48761911b3
Image imports and consumption by mail work
2016-02-06 17:05:36 +00:00
Daniel Quinn
71075a691a
The mailconsumer isn't a consumer at all. Best fixt that
2016-02-05 20:15:08 +00:00
Daniel Quinn
d8ad6b589b
Added pytest and broke up the consumer into file and mail
2016-02-05 00:23:36 +00:00
Daniel Quinn
3bc89d23c8
Sorting the filters
2016-02-03 17:20:12 +00:00
Daniel Quinn
a70b40f618
Broke the consumer script into separate files and started on a mail consumer
2016-01-30 01:18:52 +00:00
Daniel Quinn
84d5f8cc5d
Merge branch 'master' into feature/images-as-docs
2016-01-29 23:41:13 +00:00
Daniel Quinn
ace9389e5f
#12: Support image documents
2016-01-29 23:18:03 +00:00
Daniel Quinn
10e4f0f5f3
Added some better admin for tags
2016-01-28 18:37:27 +00:00
Daniel Quinn
a7d041a9f5
Prettied-up the admin
2016-01-28 08:16:29 +00:00
Daniel Quinn
3026593d6c
Version bump for automated tagging
2016-01-28 07:29:25 +00:00
Daniel Quinn
0ec63ae1f9
#11: automatic tagging support
2016-01-28 07:23:11 +00:00
Daniel Quinn
286292dbf9
Added some documentation
2016-01-24 20:15:50 -05:00
Daniel Quinn
04bcb1cdad
Forced python3 for setups not using a virtualenv
2016-01-24 12:31:02 +00:00
Daniel Quinn
669cf1cb70
Add labels (#9)
2016-01-23 04:40:35 +00:00
Daniel Quinn
1219e81e77
Moved changes to where it should be
2016-01-23 03:44:51 +00:00
Daniel Quinn
65074b4375
Smarter check positions
2016-01-23 03:42:39 +00:00
Daniel Quinn
0eb0c88d3d
Now the exporter sets the proper dates
2016-01-23 03:22:15 +00:00
Daniel Quinn
796e977894
Django insists on adding every little thing as a migration
2016-01-23 03:14:55 +00:00
Daniel Quinn
4f1bf81d5b
Better variable names
2016-01-23 03:05:40 +00:00
Daniel Quinn
9e596953a3
pep8
2016-01-23 02:58:03 +00:00