120 Commits

Author SHA1 Message Date
jonaswinkler
ba7bf9b2d2 removed slugs entirely, since their only purpose was purely cosmetic anyway. 2020-12-09 00:04:37 +01:00
jonaswinkler
27893efdc9 a test that "verifies" that the file renaming lock works and no inconsistencies are created. 2020-12-08 21:08:44 +01:00
jonaswinkler
638113864f fixes #90 2020-12-08 13:54:49 +01:00
jonaswinkler
e4eeb29f54 checking file types against parsers in the consumer. 2020-12-01 15:26:05 +01:00
jonaswinkler
f08d494f1b filename handling for archive files. 2020-11-30 21:38:42 +01:00
jonaswinkler
1df64e3129 Merge branch 'dev' into feature-ocrmypdf 2020-11-30 16:48:09 +01:00
jonaswinkler
7658c07b4d added file type checks to the parsers to prevent temporary files from being consumed. Also: parsers announce file types they wish to use as default for each mime type. 2020-11-30 00:40:04 +01:00
jonaswinkler
ddb3ef49f6 Merge branch 'dev' into feature-ocrmypdf 2020-11-29 18:37:38 +01:00
jonaswinkler
0e9d88ef7d test cases for #67 2020-11-29 15:47:56 +01:00
jonaswinkler
744b86bb91 fixes an issue with paperless not assigning metadata when FILENAME_FORMAT is specified and resolves an invalid warning about missing files fixes #67 2020-11-29 14:45:43 +01:00
jonaswinkler
6931e37737 error logging. 2020-11-29 12:37:11 +01:00
jonaswinkler
96dc4c1daa added checksums for archived documents. 2020-11-29 12:31:26 +01:00
jonaswinkler
7bba3065fe Merge branch 'dev' into feature-ocrmypdf 2020-11-27 14:03:19 +01:00
jonaswinkler
72b4f817df moved consumption dir check into the correct spot 2020-11-27 13:12:13 +01:00
Jonas Winkler
c3adcd6b49 Merge branch 'dev' into feature-ocrmypdf 2020-11-25 21:13:02 +01:00
Jonas Winkler
3ba603a2e8 Paperless will continue to operate with encrypted files, however, all new files will be stored unencrypted. 2020-11-25 21:03:06 +01:00
Jonas Winkler
3167a0479a GnuPG for archive file. 2020-11-25 20:16:27 +01:00
Jonas Winkler
c51d292049 codestyle 2020-11-25 19:51:02 +01:00
Jonas Winkler
9bfa088eb5 reworked the interface of the parsers. 2020-11-25 19:36:39 +01:00
Jonas Winkler
17b62b61fa add support for archive files. 2020-11-25 14:47:17 +01:00
Jonas Winkler
28cd246d48 added archive directory. 2020-11-25 14:45:21 +01:00
Jonas Winkler
afc3753e58 code cleanup 2020-11-21 14:03:45 +01:00
Jonas Winkler
fc0ba2098a FileType does not care about the extension anymore. 2020-11-20 16:18:59 +01:00
Jonas Winkler
f976a0b4ba mime type handling 2020-11-20 13:31:03 +01:00
Jonas Winkler
8c40c54421 codestyle 2020-11-18 22:41:14 +01:00
Jonas Winkler
680ab3d56b updated logging, logging for the mail consumer to see what's happening 2020-11-18 13:23:30 +01:00
Jonas Winkler
39ba14aac1 refactor 2020-11-17 11:49:44 +01:00
Jonas Winkler
e30f0b274b added more testing 2020-11-16 23:16:37 +01:00
Jonas Winkler
bd04c966c5 first version of the new consumer. 2020-11-16 18:26:54 +01:00
Jonas Winkler
eb6805e37e code style fixes 2020-11-12 21:09:45 +01:00
Jonas Winkler
8b8a2af053 fixed the file handling implementation. The feature is cool, but the original implementation had so many small flaws it wasn't even funny. 2020-11-11 14:21:33 +01:00
Jonas Winkler
a91e46364a small consumer fixes 2020-11-11 14:14:21 +01:00
Jonas Winkler
3048342de7 added a setting: delete duplicate documents 2020-11-10 01:47:58 +01:00
Jonas Winkler
33f1c82943 updated the classifier. It's now much faster and does not retrain when data hasn't changed. 2020-11-06 14:46:06 +01:00
Jonas Winkler
9757e261f2 A handy script to redo OCR on all documents. 2020-11-03 14:04:11 +01:00
Jonas Winkler
a89773ad71 removed unused code, small fixes 2020-11-02 18:20:04 +01:00
Jonas Winkler
def3a85858 reworked most of the tesseract parser, better logging 2020-11-02 15:40:44 +01:00
Jonas Winkler
6fd73a04b8 updated consumer: now using watchdog 2020-11-01 23:07:54 +01:00
Jonas Winkler
6ce493e3a7 the document classifier is now stateless 2020-10-29 14:33:42 +01:00
Jonas Winkler
dd16b7262e unified document matching, legacy and automatching work alongside now 2020-10-28 11:45:11 +01:00
Jonas Winkler
93d963ed4e added
- document index
- api access for thumbnails/downloads
- more api filters

updated
- pipfile

removed
- filename handling
- legacy thumb/download access
- obsolete admin gui settings (per page items, FY, inline view)
2020-10-25 23:03:02 +01:00
Jonas Winkler
b71049ad16 Merge branch 'master' into dev 2020-10-16 15:02:57 +02:00
JOKer
5f8120add1 Merge pull request #593 from BastianPoe/feature-293
Give stored documents a structured and configurable filename
2020-05-02 08:33:49 +02:00
Johann Bauer
cea6dcce23 Warn if consume directory contains subdirectories
2020-01-04 01:09:54 +01:00
Wolf-Bastian Poettner
d1a54d6576 Allows to configure directory and filename formats for documents stored in paperless
Default configuration is as before (incrementing numbers), but additional fields can be added at will
2019-12-27 14:25:38 +00:00
Jonas Winkler
f711b146e1 Merge branch 'master' into dev 2018-12-11 12:38:15 +01:00
Jonas Winkler
8f0d53c54a Merge remote-tracking branch 'upstream/master' 2018-12-11 12:06:15 +01:00
Daniel Quinn
bc898c1992 Use optipng to optimise document thumbnails 2018-10-07 14:56:38 +01:00
Daniel Quinn
40b9e44bfe Wrap document consumption in a transaction #262 2018-10-07 13:12:22 +01:00
Jonas Winkler
001a80a528 Restored tagging functionality 2018-09-27 20:41:16 +02:00