9690 Commits

Author SHA1 Message Date
Daniel Quinn
6b63ce9201 Fix pycodestyle complaints
Apparently, pycodestyle now checks for invalid escape sequences, which
it only complains about if the regex in use isn't a raw string (r"").
2018-09-09 20:00:12 +01:00
Daniel Quinn
16ba221ad9 Add tox to dev dependencies 2018-09-09 19:59:47 +01:00
ahyear
fb7e264ef8 add migrate command to docker update process 2018-09-06 15:32:41 +02:00
Jonas Winkler
1c8576cfb9 mode change 2018-09-06 12:00:01 +02:00
Jonas Winkler
62934063a4 fixed merge error 2018-09-06 10:15:15 +02:00
Joshua Taillon
2661af34c3 remove debugging print statement 2018-09-05 23:05:37 -04:00
Joshua Taillon
a8e53846b8 add INLINE_DOC to settings.py 2018-09-05 23:03:30 -04:00
Joshua Taillon
661f1f570b add option for inline vs. attachment for document rendering 2018-09-05 22:58:38 -04:00
Joshua Taillon
5326895334 move date-matching regex pattern to base parser module for use by all subclasses 2018-09-05 21:13:36 -04:00
Jonas Winkler
d725f20505 Merge branch 'dev' into machine-learning 2018-09-06 00:29:41 +02:00
Jonas Winkler
069249cc0a Merge branch 'master' into dev 2018-09-06 00:28:58 +02:00
Jonas Winkler
ad5066a88c Added scikit-learn to requirements 2018-09-06 00:20:44 +02:00
Joshua Taillon
98a437f78a change tesseract parser to only convert first page to save (potentially) massive amounts of work 2018-09-05 15:18:35 -04:00
Jonas Winkler
becda609f1 fixed the api 2018-09-05 15:29:05 +02:00
Jonas Winkler
c701a8f59c Merge branch 'dev' into machine-learning 2018-09-05 15:26:39 +02:00
Jonas Winkler
6e66d39297 fixed the api 2018-09-05 15:25:14 +02:00
Jonas Winkler
0b2dc348cd fixed api 2018-09-05 14:57:37 +02:00
Jonas Winkler
bbba57dd4d implemented automatic classification field functionality 2018-09-05 14:31:02 +02:00
Jonas Winkler
582e9c5cb4 Fixed a few things 2018-09-05 12:43:11 +02:00
Daniel Quinn
d440980fbc Add empty requirements for rtd to reference 2018-09-05 11:16:42 +01:00
Daniel Quinn
71164afe9a Add credits for 2.2.0 that I forgot 2018-09-05 10:59:06 +01:00
Daniel Quinn
a406d2d887 Re-flow text to keep it <80c wide 2018-09-05 10:58:41 +01:00
David Martin
503fe6669f Bump required version for Pyocr to support the latest tesseract 4.
This recently changed in the official tesseract engine [0]. -psm is
not allowed as an option anymore and --psm has to be used instead. The
latest pyocr enables support for this [1].

[0] tesseract-ocr/tesseract@ee201e1
[1] 5abd0a566a
2018-09-05 13:03:42 +10:00
Thomas Niederprüm
0eb7b0cab5 Catch ProgrammingError in Document checks.
When running PostgreSQL or MariaDB/MySQL backends, a query to a non-existent
table will raise a "ProgrammingError". This patch properly catches this error.
Without this patch all management calls to manage.py will lead to an error when
running PostgreSQL or MariaDB as a backend.
2018-09-04 20:11:48 +02:00
Jonas Winkler
9d4155a907 removed matching model fields, automatic classifier reloading, added automatic_classification field to matching model 2018-09-04 18:40:26 +02:00
Jonas Winkler
5a63125e04 Merge remote-tracking branch 'upstream/master' 2018-09-04 16:02:48 +02:00
Jonas Winkler
804b3d98f9 Fixed documents not being saved after modification 2018-09-04 15:33:51 +02:00
Jonas Winkler
6a35c2e38b Merge branch 'document-type' into dev 2018-09-04 14:55:59 +02:00
Jonas Winkler
8a1a794577 Document Type exporting 2018-09-04 14:55:29 +02:00
Jonas Winkler
c50c517928 Implemented the classifier model, including automatic tagging of new documents 2018-09-04 14:39:55 +02:00
Joshua Taillon
3f7a6f3966 Merge branch 'master' into ENH_text_consumer 2018-09-03 23:47:30 -04:00
Joshua Taillon
cc7a341e75 explicitly add txt, md, and csv types for consumer and viewer; fix thumbnail generation 2018-09-03 23:46:13 -04:00
Jonas Winkler
3eecd67fc1 Added code that trains models based on data from the database 2018-09-03 15:55:41 +02:00
Daniel Quinn
282aa0165f Bump for 2.2.1 2018-09-03 00:27:40 +01:00
Daniel Quinn
8569e9d26e Don't try to remove SessionAuthenticationMiddleware
It was removed entirely in Django 2.0
2018-09-03 00:25:10 +01:00
Daniel Quinn
9e5d042d50 Add Tim to the credits for 2.2.0 2018-09-02 21:53:52 +01:00
Daniel Quinn
7fe6b287dc Merge branch 'dadosch-django-v2' 2018-09-02 21:48:59 +01:00
Daniel Quinn
dd3170b8c7 Updates for 2.2.0 2018-09-02 21:48:09 +01:00
Daniel Quinn
6b640fe6d7 Add note about the removal of puritanical language 2018-09-02 21:46:52 +01:00
Daniel Quinn
88b8835617 Switch out field_name= for name=
This appears to be a django-filter version change thing.
2018-09-02 21:26:30 +01:00
Daniel Quinn
af6028fc4e pep8 2018-09-02 21:26:20 +01:00
Daniel Quinn
828f050e82 Remove old Python 2.x style code 2018-09-02 21:26:06 +01:00
Daniel Quinn
40e409e092 Drop django-flat-responsive
It's not necessary for Django 2.0+ as the new system is responsive by
default.
2018-09-02 21:25:30 +01:00
Daniel Quinn
7d2d3901bc Merge @dadosch's changes & fix dependency conflicts 2018-09-02 21:06:40 +01:00
Daniel Quinn
3ec0dc040e Merge pull request #391 from sbrunner/tag-list
Better interface when we have many tags
2018-09-02 20:57:32 +01:00
Daniel Quinn
59e67eb271 Default sort order for tags to use 'name' 2018-09-02 20:56:45 +01:00
Daniel Quinn
284ff69539 Fix #384: duplicate tags due to case insensitivity 2018-09-02 20:48:51 +01:00
Daniel Quinn
ef0b33e72e Clean up some linter complaints 2018-09-02 20:33:49 +01:00
Daniel Quinn
52bfb2edf0 Update dependencies 2018-09-02 20:33:28 +01:00
Jonas Winkler
daa93883ee Added command to create datasets 2018-09-02 12:47:19 +02:00