10 Commits

Author SHA1 Message Date
Jonas Winkler
9bfa088eb5 reworked the interface of the parsers. 2020-11-25 19:36:39 +01:00
Jonas Winkler
eb6805e37e code style fixes 2020-11-12 21:09:45 +01:00
Jonas Winkler
def3a85858 reworked most of the tesseract parser, better logging 2020-11-02 15:40:44 +01:00
Daniel Quinn
bc898c1992 Use optipng to optimise document thumbnails 2018-10-07 14:56:38 +01:00
Daniel Quinn
074609e1fc Consolidate get_date onto the DocumentParser parent class 2018-10-07 14:56:02 +01:00
Daniel Quinn
ef7f98281d Rename parsers to DATE_REGEX 2018-09-09 21:02:30 +01:00
In moving the `parsers` variable into the package-level, it lost the
context, so a more descriptive name was needed.
Daniel Quinn
69fc0d6d80 Fix pycodestyle complaints 2018-09-09 20:55:37 +01:00
Joshua Taillon
5326895334 move date-matching regex pattern to base parser module for use by all subclasses 2018-09-05 21:13:36 -04:00
Joshua Taillon
cc7a341e75 explicitly add txt, md, and csv types for consumer and viewer; fix thumbnail generation 2018-09-03 23:46:13 -04:00
Joshua Taillon
3c074d9e36 first stab at text consumer 2018-08-30 23:32:41 -04:00
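
Commits 5326895334, ef7f98281d and 074609e1fc describe moving the date-matching regex to the base parser module, naming it `DATE_REGEX`, and consolidating `get_date` onto the `DocumentParser` parent class. The sketch below only illustrates that general shape and is not the code from these commits: the pattern is a simplified ISO-date stand-in, and `get_text`, `TextDocumentParser` and its constructor are hypothetical names chosen for the example.

```python
import re
from datetime import date
from typing import Optional

# Assumption: a simplified stand-in pattern that matches only ISO-style
# dates such as 2018-09-05. The real pattern is not shown in this log.
DATE_REGEX = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


class DocumentParser:
    """Parent class: date extraction lives here so subclasses share it."""

    def get_text(self) -> str:
        # Hypothetical hook; each subclass supplies the document text.
        raise NotImplementedError

    def get_date(self) -> Optional[date]:
        # Return the first match that is also a valid calendar date.
        for match in DATE_REGEX.finditer(self.get_text()):
            year, month, day = (int(group) for group in match.groups())
            try:
                return date(year, month, day)
            except ValueError:
                continue  # looks like a date (e.g. 2018-13-40) but is not
        return None


class TextDocumentParser(DocumentParser):
    # Hypothetical subclass standing in for a plain-text parser.
    def __init__(self, text: str) -> None:
        self._text = text

    def get_text(self) -> str:
        return self._text
```

With this layout a subclass inherits `get_date` unchanged, for example `TextDocumentParser("Invoice dated 2018-09-05").get_date()`, and a change to the pattern is made in one place.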