1011 Commits

Author SHA1 Message Date
Jonas Winkler
a48cc6c627 Merge branch 'master' into dev 2018-09-12 11:47:35 +02:00
Jonas Winkler
2d2ad9156d bugfix 2018-09-11 20:45:36 +02:00
Jonas Winkler
9c51d7d2d1 fixed settings 2018-09-11 17:30:46 +02:00
Jonas Winkler
553caefe1d Merge remote-tracking branch 'upstream/master' 2018-09-11 14:43:59 +02:00
Jonas Winkler
35ea0f2add Merge branch 'machine-learning' into dev 2018-09-11 14:36:21 +02:00
Jonas Winkler
8a16b62773 The classifier works with ids now, not names. Minor changes. 2018-09-11 14:30:18 +02:00
Jonas Winkler
d2929e974a changed classifier 2018-09-11 00:33:07 +02:00
Daniel Quinn
16053bf832 Bump to 2.3.0 2018-09-09 21:51:44 +01:00
Daniel Quinn
375464f471 Merge pull request #401 from ahyear/patch-1
add migrate commande to docker update process
2018-09-09 21:26:56 +01:00
Daniel Quinn
fd4d218b72 Merge branch 'jat255-ENH_config_inline_or_attach' 2018-09-09 21:22:42 +01:00
Daniel Quinn
6b9d6d354f Streamline how we handle boolean values in settings.py 2018-09-09 21:22:07 +01:00
Daniel Quinn
0521506239 Make the example file contain the default value 2018-09-09 21:16:53 +01:00
Daniel Quinn
b05079544f Merge branch 'ENH_config_inline_or_attach' of git://github.com/jat255/paperless into jat255-ENH_config_inline_or_attach 2018-09-09 21:15:14 +01:00
Daniel Quinn
21e53aa55c Merge pull request #399 from jat255/ENH_convert_only_one_page
Speed up thumbnail generation for PDFs
2018-09-09 21:12:42 +01:00
Daniel Quinn
06f9e462e2 Merge pull request #396 from dubit0/postgres_mysql_fix
Fix document checks with PostgreSQL and MySQL backends.
2018-09-09 21:10:36 +01:00
Daniel Quinn
0f21daf47a Merge branch 'jat255-ENH_text_consumer' 2018-09-09 21:03:58 +01:00
Daniel Quinn
81c8e067fe Reorder imports 2018-09-09 21:03:37 +01:00
Daniel Quinn
ef7f98281d Rename parsers to DATE_REGEX
In moving the `parsers` variable into the package-level, it lost the
context, so a more descriptive name was needed.
2018-09-09 21:02:30 +01:00
Daniel Quinn
69fc0d6d80 Fix pycodestyle complaints 2018-09-09 20:55:37 +01:00
Daniel Quinn
a3158eedf9 Merge branch 'ENH_text_consumer' of git://github.com/jat255/paperless into jat255-ENH_text_consumer 2018-09-09 20:52:59 +01:00
Daniel Quinn
058a88f102 Merge pull request #398 from ddddavidmartin/bump_pyocr_version_for_tesseract_4_support
Bump required version for Pyocr to support the latest tesseract 4.
2018-09-09 20:01:51 +01:00
Daniel Quinn
6b63ce9201 Fix pycodestyle complaints
Apparently, pycodestyle updated itself to now check for invalid escape
sequences, which only complain if the regex in use isn't a raw string
(r"").
2018-09-09 20:00:12 +01:00
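The complaint described in the commit above comes from backslash sequences such as `\d` inside ordinary string literals; prefixing the literal with `r` passes the backslashes to the regex engine untouched. A minimal sketch of the fix follows. The pattern and the names `DATE_PATTERN` and `find_dates` are hypothetical and are not taken from the repository.

```python
import re

# Hypothetical pattern for illustration only. The r"" prefix keeps "\b" and
# "\d" as literal backslash sequences for the regex engine, so the style
# checker no longer sees an invalid string escape.
DATE_PATTERN = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def find_dates(text):
    """Return a (day, month, year) tuple for every dd.mm.yyyy match in text."""
    return [
        tuple(int(part) for part in match.groups())
        for match in DATE_PATTERN.finditer(text)
    ]
```

Without the `r` prefix, `"\b"` would also silently become a backspace character, so the raw string changes behaviour as well as satisfying the checker.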
Daniel Quinn
16ba221ad9 Add tox to dev dependencies 2018-09-09 19:59:47 +01:00
ahyear
fb7e264ef8 add migrate commande to docker update process 2018-09-06 15:32:41 +02:00
Jonas Winkler
1c8576cfb9 mode change 2018-09-06 12:00:01 +02:00
Jonas Winkler
62934063a4 fixed merge error 2018-09-06 10:15:15 +02:00
Joshua Taillon
2661af34c3 remove debugging print statement 2018-09-05 23:05:37 -04:00
Joshua Taillon
a8e53846b8 add INLINE_DOC to settings.py 2018-09-05 23:03:30 -04:00
Joshua Taillon
661f1f570b add option for inline vs. attachment for document rendering 2018-09-05 22:58:38 -04:00
Joshua Taillon
5326895334 move date-matching regex pattern to base parser module for use by all subclasses 2018-09-05 21:13:36 -04:00
Jonas Winkler
d725f20505 Merge branch 'dev' into machine-learning 2018-09-06 00:29:41 +02:00
Jonas Winkler
069249cc0a Merge branch 'master' into dev 2018-09-06 00:28:58 +02:00
Jonas Winkler
ad5066a88c Added scikit-learn to requirements 2018-09-06 00:20:44 +02:00
Joshua Taillon
98a437f78a change tesseract parser to only convert first page to save (potentially) massive amounts of work 2018-09-05 15:18:35 -04:00
Jonas Winkler
becda609f1 fixed the api 2018-09-05 15:29:05 +02:00
Jonas Winkler
c701a8f59c Merge branch 'dev' into machine-learning 2018-09-05 15:26:39 +02:00
Jonas Winkler
6e66d39297 fixed the api 2018-09-05 15:25:14 +02:00
Jonas Winkler
0b2dc348cd fixed api 2018-09-05 14:57:37 +02:00
Jonas Winkler
bbba57dd4d implemented automatic classification field functionality 2018-09-05 14:31:02 +02:00
Jonas Winkler
582e9c5cb4 Fixed a few things 2018-09-05 12:43:11 +02:00
Daniel Quinn
d440980fbc Add empty requirements for rtd to reference 2018-09-05 11:16:42 +01:00
Daniel Quinn
71164afe9a Add credits for 2.2.0 that I forgot 2018-09-05 10:59:06 +01:00
Daniel Quinn
a406d2d887 Re-flow text to keep it <80c wide 2018-09-05 10:58:41 +01:00
David Martin
503fe6669f Bump required version for Pyocr to support the latest tesseract 4.
This recently changed in the official tesseract engine [0]. -psm is
not allowed as an option anymore and --psm has to be used instead. The
latest pyocr enables support for this [1].

[0] tesseract-ocr/tesseract@ee201e1
[1] 5abd0a566a
2018-09-05 13:03:42 +10:00
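The flag change described in the commit above can be pictured with a small helper that picks the option spelling from the engine's major version. This is only a sketch: the function name `page_segmentation_args` is hypothetical, it is not how pyocr implements the switch, and it assumes that releases before 4 still expect `-psm`.

```python
def page_segmentation_args(tesseract_major_version, mode):
    """Build the page-segmentation arguments for a tesseract command line.

    Assumption: version 4 and later require "--psm", while older releases
    are taken to expect the single-dash "-psm" spelling.
    """
    flag = "--psm" if tesseract_major_version >= 4 else "-psm"
    return [flag, str(mode)]
```

The returned list could be appended to an argument list passed to `subprocess.run`, for example.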
Thomas Niederprüm
0eb7b0cab5 Catch ProgrammingError in Document checks.
When running PostgreSQL or MariaDB/MySQL backends, a query to a non-existent
table will raise a "ProgrammingError". This patch properly catches this error.
Without this patch all management calls to manage.py will lead to an error when
running PostgreSQL or MariaDB as a backend.
2018-09-04 20:11:48 +02:00
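The failure mode in the commit above is a query that runs before the table exists, for instance before migrations have been applied. The sketch below reproduces the idea with the standard-library `sqlite3` module instead of Django, which is a deliberate substitution to keep it self-contained. The table name `documents` and the function `count_documents` are hypothetical. SQLite reports a missing table as `OperationalError`, while the commit notes that PostgreSQL and MariaDB/MySQL raise a `ProgrammingError`, so the handler catches both.

```python
import sqlite3


def count_documents(connection):
    """Return the row count of the hypothetical documents table.

    Returns None when the table does not exist yet, instead of letting the
    database error propagate to the caller.
    """
    try:
        row = connection.execute("SELECT COUNT(*) FROM documents").fetchone()
    except (sqlite3.OperationalError, sqlite3.ProgrammingError):
        # Table missing: treat the check as "nothing to inspect yet".
        return None
    return row[0]
```

Catching both exception types keeps the check usable regardless of which error class the backend chooses for a missing table.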
Jonas Winkler
9d4155a907 removed matching model fields, automatic classifier reloading, added autmatic_classification field to matching model 2018-09-04 18:40:26 +02:00
Jonas Winkler
5a63125e04 Merge remote-tracking branch 'upstream/master' 2018-09-04 16:02:48 +02:00
Jonas Winkler
804b3d98f9 Fixed documents not being saved after modification 2018-09-04 15:33:51 +02:00
Jonas Winkler
6a35c2e38b Merge branch 'document-type' into dev 2018-09-04 14:55:59 +02:00
Jonas Winkler
8a1a794577 Document Type exporting 2018-09-04 14:55:29 +02:00