156 Commits

Author SHA1 Message Date
Jonas Winkler
f4cebda085 A handy script to redo OCR on all documents 2020-11-03 14:04:11 +01:00
Jonas Winkler
68df1cf4ee replaced usages of .id with .pk, fixed filename issue in exporter 2020-11-03 12:37:37 +01:00
Jonas Winkler
7d282a4e4e removed unused code, small fixes 2020-11-02 18:20:04 +01:00
Jonas Winkler
9f29dc2863 updated consumer: now using watchdog 2020-11-01 23:07:54 +01:00
Jonas Winkler
05f20c19c3 the document classifier is now stateless 2020-10-29 14:33:42 +01:00
Jonas Winkler
11af74ba36 unified document matching, legacy and automatching work alongside now 2020-10-28 11:45:11 +01:00
Jonas Winkler
c26962f17f changed a few things 2020-10-27 17:08:18 +01:00
Jonas Winkler
c596fe6782 unified data folders 2020-10-26 00:35:24 +01:00
Jonas Winkler
052c1680f3 added
- document index
- api access for thumbnails/downloads
- more api filters

updated
- pipfile

removed
- filename handling
- legacy thumb/download access
- obsolete admin gui settings (per page items, FY, inline view)
2020-10-25 23:03:02 +01:00
Jonas Winkler
421dab786d Merge branch 'master' into dev 2020-10-16 15:02:57 +02:00
JOKer
8698f92ac9 Merge pull request #593 from BastianPoe/feature-293
Give stored documents a structured and configurable filename
2020-05-02 08:33:49 +02:00
Johann Bauer
22c7f309a7 Warn if consume directory contains subdirectories.
2020-01-04 01:09:54 +01:00
Wolf-Bastian Poettner
a79a0ca302 Added tool to rename all documents according to the latest filename
format
2019-12-27 14:25:38 +00:00
Jonas Winkler
32f3876590 Merge branch 'master' into dev 2019-05-21 13:06:16 +02:00
Dominik von Allmen
e92f736b5b Update change_storage_type.py 2019-04-02 14:12:00 +02:00
domphonallmen
117726ec72 avoid error when decrypting files with non-ascii character 2019-04-02 11:38:00 +02:00
Jonas Winkler
7257cece30 Code style changes 2018-09-26 10:51:42 +02:00
Jonas Winkler
5b9f38d398 Removed the archive tag, as it wasn't really used anyway. 2018-09-25 21:51:38 +02:00
Jonas Winkler
b31d4779bf Code style changes 2018-09-25 21:12:47 +02:00
Jonas Winkler
60618381f8 Code style adjustments 2018-09-25 16:09:33 +02:00
Jonas Winkler
94ede7389d Merge remote-tracking branch 'upstream/master' 2018-09-25 14:47:12 +02:00
Daniel Quinn
090565d84c Tweak the import/export system to handle encryption choices better
Now when you export a document, the `storage_type` value is always
`unencrypted` (since that's what it is when it's exported anyway), and
the flag is set by the importing script instead, based on the existence
of a `PAPERLESS_PASSPHRASE` environment variable, indicating that
encryption is enabled.
2018-09-23 13:58:40 +01:00
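
The behaviour described in commit 090565d84c can be sketched roughly as follows. This is an illustration only, not the project's code: the function names `export_record` and `import_storage_type` are hypothetical, and the value `"gpg"` for encrypted storage is an assumption. Only `storage_type`, `unencrypted` and `PAPERLESS_PASSPHRASE` come from the commit message.

```python
import os

STORAGE_UNENCRYPTED = "unencrypted"  # value named in the commit message
STORAGE_ENCRYPTED = "gpg"            # assumed name for the encrypted type


def export_record(document):
    """Return a copy of the document record as it would be exported.

    Exported files are plain files, so the record always says so.
    """
    record = dict(document)
    record["storage_type"] = STORAGE_UNENCRYPTED
    return record


def import_storage_type(environ=os.environ):
    """Pick the storage type on import from the environment.

    The mere existence of PAPERLESS_PASSPHRASE is taken to mean that
    encryption is enabled.
    """
    if "PAPERLESS_PASSPHRASE" in environ:
        return STORAGE_ENCRYPTED
    return STORAGE_UNENCRYPTED
```

Passing the environment as a parameter keeps the decision easy to check without touching the real process environment.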
Jonas Winkler
909586bf25 Code style changed 2018-09-13 14:15:16 +02:00
Jonas Winkler
7c589f71a4 Fixed a few minor issues. 2018-09-12 16:25:23 +02:00
Jonas Winkler
46a5bc00d7 Merge branch 'machine-learning' into dev 2018-09-11 14:36:21 +02:00
Jonas Winkler
d46ee11143 The classifier works with ids now, not names. Minor changes. 2018-09-11 14:30:18 +02:00
Jonas Winkler
d2534a73e5 changed classifier 2018-09-11 00:33:07 +02:00
Jonas Winkler
11adc94e5e mode change 2018-09-06 12:00:01 +02:00
Jonas Winkler
d26f940a91 Merge branch 'dev' into machine-learning 2018-09-06 00:29:41 +02:00
Jonas Winkler
13725ef8ee Merge branch 'master' into dev 2018-09-06 00:28:58 +02:00
Jonas Winkler
8eeded95c4 Merge branch 'dev' into machine-learning 2018-09-05 15:26:39 +02:00
Jonas Winkler
cea880f245 implemented automatic classification field functionality 2018-09-05 14:31:02 +02:00
Jonas Winkler
82bc0e3368 Fixed a few things 2018-09-05 12:43:11 +02:00
Jonas Winkler
70bd05450a removed matching model fields, automatic classifier reloading, added autmatic_classification field to matching model 2018-09-04 18:40:26 +02:00
Jonas Winkler
68652c8c37 Document Type exporting 2018-09-04 14:55:29 +02:00
Jonas Winkler
c091eba26e Implemented the classifier model, including automatic tagging of new documents 2018-09-04 14:39:55 +02:00
Jonas Winkler
ca315ba76c Added code that trains models based on data from the database 2018-09-03 15:55:41 +02:00
Daniel Quinn
cccc9e1a24 Clean up some linter complaints 2018-09-02 20:33:49 +01:00
Jonas Winkler
350da81081 Added command to create datasets 2018-09-02 12:47:19 +02:00
Jonas Winkler
c3a144f2ca inbox tags, archive tags, archive serial number for documents 2018-07-06 13:25:02 +02:00
Daniel Quinn
d6d8537b69 Remove emoji from storage-type changer 2018-06-17 17:23:50 +01:00
Daniel Quinn
988adf963a Update import & export to handle encryption toggle 2018-06-17 17:06:22 +01:00
Daniel Quinn
c9f35a7da2 Merge branch 'master' into mcronce-disable_encryption 2018-06-17 16:32:51 +01:00
Daniel Quinn
81a8cb45d7 It's exist_ok=, not exists_ok= -- my bad. 2018-05-28 13:08:00 +01:00
Daniel Quinn
27a936f9bf Add script to (de|en)crypt all documents 2018-05-28 12:58:28 +01:00
Erik Arvstedt
bccac5017c fixup: remove helper fn 'make_dirs' 2018-05-21 00:45:00 +02:00
Erik Arvstedt
7e1d59377a Add inotify support 2018-05-11 14:14:50 +02:00
Erik Arvstedt
7357471b9e Consumer loop: make sleep duration dynamic
Make the sleep duration dynamic to account for the time spent in
loop_step.
This improves responsiveness when repeatedly consuming newly
arriving docs.

Use float epoch seconds (time.time()) as the time type for
MailFetcher.last_checked to allow for natural time arithmetic.
2018-05-11 14:14:50 +02:00
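
The idea in commit 7357471b9e, shortening each sleep by the time already spent in `loop_step`, might look like the sketch below. It is not the consumer's actual loop: `run_loop`, `interval` and `iterations` are hypothetical names, and the injectable `sleep` and `clock` parameters exist only so the timing can be checked without waiting.

```python
import time


def run_loop(loop_step, interval, iterations, sleep=time.sleep, clock=time.time):
    """Call loop_step about once every `interval` seconds.

    Times are float epoch seconds (time.time()), so durations are
    plain subtractions.
    """
    for _ in range(iterations):
        started = clock()
        loop_step()
        elapsed = clock() - started
        # Sleep only for what is left of the interval; never a negative time.
        sleep(max(0.0, interval - elapsed))
```

With a fixed sleep, a slow `loop_step` would push every later iteration back by its own running time; subtracting the elapsed time keeps the period close to `interval`.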
Erik Arvstedt
bd75a65866 Refactor: renamings, extract fn 'loop'
Renamings:
loop -> loop_step
delta -> next_mail_time (this variable names a point in time, not a duration)

Extracting the 'loop' fn is a preparation for later commits where a
second type of loop is added.
2018-05-11 14:14:25 +02:00
Erik Arvstedt
61cd050e24 Ensure docs have been unmodified for some time before consuming
Previously, the second mtime check for new files usually happened right
after the first one, which could have caused consumption of docs that
were still being modified.

We're now waiting for at least FILES_MIN_UNMODIFIED_DURATION (0.5s).

This also cleans up the logic by eliminating the consumer.stats attribute
and the weird double call to consumer.run().

Additionally, this fixes a memory leak in consumer.stats where paths could be
added but never removed if the corresponding files disappeared from
the consumer dir before being considered ready.
2018-05-11 14:05:29 +02:00
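
The readiness check from commit 61cd050e24 could be approximated like this. It is a minimal sketch under stated assumptions: `is_ready` is a hypothetical helper, not the consumer's real method, and only the constant name and its 0.5s value are taken from the commit message.

```python
import os
import time

FILES_MIN_UNMODIFIED_DURATION = 0.5  # seconds, as stated in the commit message


def is_ready(path, now=None):
    """Return True if `path` has been left unmodified for long enough.

    A file that disappeared in the meantime is simply not ready; no
    per-path state is kept, so nothing can pile up for vanished files.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        return False
    return now - mtime >= FILES_MIN_UNMODIFIED_DURATION
```

Comparing the modification time against the current time replaces the earlier pair of back-to-back mtime checks described above.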