74 Commits

Author SHA1 Message Date
Trenton Holmes
9e091e333d Attempts to make production consumer more event driven while still allowing unit testing 2022-08-14 17:47:59 -07:00
Trenton Holmes
5d2062a076 Includes the actual OSError string in the log, instead of assuming it's a busy file 2022-07-11 11:57:02 -07:00
Trenton Holmes
836373bc3b Adds configuration variable to the inotify debounce timing 2022-05-15 11:48:12 -07:00
Trenton Holmes
74815a489f Minor improvements for quality of life 2022-05-09 12:05:29 -07:00
Trenton Holmes
dcc2e018ff Adds additional checking for both inotify and polling around document still being busy before consuming it 2022-04-10 12:21:34 -07:00
Trenton Holmes
6635fa5f0d Runs the pre-commit hooks over all the Python files 2022-03-11 11:34:28 -08:00
kpj
c56cb25b5f Format Python code with black 2022-02-27 15:26:41 +01:00
Daniel Albers
bc685e8edb Make ignores configurable
Adds config file setting PAPERLESS_CONSUMER_IGNORE_PATTERNS.
2021-08-18 22:23:18 +02:00
jonaswinkler
3e42ceef38 ignore macOS specific files 2021-05-19 19:56:01 +02:00
jonaswinkler
b957531100 run the polling file change checks on individual threads to speed up queueing of new files 2021-02-21 12:43:55 +01:00
jonaswinkler
692557a364 increased default delay when waiting for file changes with polling 2021-02-21 12:14:54 +01:00
jonaswinkler
555e37958f better exception logging 2021-02-11 22:16:41 +01:00
jonaswinkler
e5a7dc0cc7 rework most of the logging 2021-02-05 01:10:29 +01:00
jonaswinkler
eeff7b3bdb code style 2021-02-02 23:58:25 +01:00
jonaswinkler
ba48e0ca1a revert a change 2021-01-21 22:29:47 +01:00
jonaswinkler
6e29f64a8e revert changes for #351 2021-01-20 11:56:09 +01:00
jonaswinkler
280ba2fcc2 fixes #351 2021-01-19 14:43:55 +01:00
jonaswinkler
9ded48acab fixes #300 2021-01-09 01:54:51 +01:00
jonaswinkler
33ca08f794 tags from folders: case insensitive 2020-12-09 00:07:22 +01:00
jonaswinkler
ba7bf9b2d2 removed slugs entirely, since their only purpose was purely cosmetic anyway. 2020-12-09 00:04:37 +01:00
jonaswinkler
e4eeb29f54 checking file types against parsers in the consumer. 2020-12-01 15:26:05 +01:00
jayme-github
a90b7a647e Create tags from sub directories
The names of sub directories in the consumer directory will be added as
tags for the document to be consumed.
To enable this, set:
PAPERLESS_CONSUMER_RECURSIVE=1
PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=1

Fixes #50
2020-11-30 14:22:35 +01:00
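As an illustration of the behaviour described in a90b7a647e, the sketch below derives tag names from the sub directories between the consumption directory and the file. The function name `tags_from_path` and its arguments are hypothetical and not taken from the project; the actual implementation may differ, for example in how it honours `PAPERLESS_CONSUMER_RECURSIVE` and `PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS`.

```python
from pathlib import PurePath


def tags_from_path(file_path, consume_dir):
    """Return the sub directory names between consume_dir and the file.

    Hypothetical helper for illustration only; not the project's API.
    """
    # Path of the file relative to the consumption directory,
    # e.g. "invoices/2020/doc.pdf".
    relative = PurePath(file_path).relative_to(consume_dir)
    # Every component except the last one (the file name) is a directory
    # and therefore a candidate tag name.
    return list(relative.parts[:-1])
```

Under these assumptions, a file placed directly in the consumption directory yields no tags, while a file in a nested folder yields one tag per folder level.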
jonaswinkler
c6627eac1f fix warnings about unclosed files. 2020-11-27 13:19:58 +01:00
jonaswinkler
29867ba6bd inotify: cleanup descriptor when done 2020-11-27 13:12:34 +01:00
jonaswinkler
72b4f817df moved consumption dir check into the correct spot 2020-11-27 13:12:13 +01:00
jonaswinkler
221c1e76e9 couple changes to the consumer. 2020-11-26 18:55:05 +01:00
jonaswinkler
dac7971cd6 Apparently there was a very good reason to use inotify. fixes #46 complete with test cases for inotify and polling. 2020-11-26 17:57:03 +01:00
Jonas Winkler
9c23207b84 workaround for a bug in django-q: task results with too long names would not show up in the result lists. 2020-11-22 13:53:19 +01:00
Jonas Winkler
afc3753e58 code cleanup 2020-11-21 14:03:45 +01:00
Jonas Winkler
5eb5aa6fb6 removed unused code. 2020-11-18 00:54:51 +01:00
Jonas Winkler
24bb8c71c9 Merge branch 'dev' into mail_rework 2020-11-17 00:23:10 +01:00
Jonas Winkler
e30f0b274b added more testing 2020-11-16 23:16:37 +01:00
Jonas Winkler
2119eb4c15 added option for polling 2020-11-16 18:52:13 +01:00
Jonas Winkler
bd04c966c5 first version of the new consumer. 2020-11-16 18:26:54 +01:00
Jonas Winkler
d99b4623f8 first implementation of the mail rework 2020-11-15 23:56:22 +01:00
Jonas Winkler
eb6805e37e code style fixes 2020-11-12 21:09:45 +01:00
Jonas Winkler
1fa2c54932 on_modified not needed for the consumer. 2020-11-12 10:41:47 +01:00
Jonas Winkler
f53a958bc5 fixes #30 2020-11-12 09:30:04 +01:00
Jonas Winkler
6fd73a04b8 updated consumer: now using watchdog 2020-11-01 23:07:54 +01:00
Jonas Winkler
d3af1e8815 unified data folders 2020-10-26 00:35:24 +01:00
Johann Bauer
cea6dcce23 Warn if consume directory contains subdirectories
.
2020-01-04 01:09:54 +01:00
Daniel Quinn
ef0b33e72e Clean up some linter complaints 2018-09-02 20:33:49 +01:00
Daniel Quinn
d1b6e9329f It's exist_ok=, not exists_ok= -- my bad. 2018-05-28 13:08:00 +01:00
Erik Arvstedt
d132e2b9f5 fixup: remove helper fn 'make_dirs' 2018-05-21 00:45:00 +02:00
Erik Arvstedt
3db175dfe2 Add inotify support 2018-05-11 14:14:50 +02:00
Erik Arvstedt
b74b47423d Consumer loop: make sleep duration dynamic
Make the sleep duration dynamic to account for the time spent in
loop_step.
This improves responsiveness when repeatedly consuming newly
arriving docs.

Use float epoch seconds (time.time()) as the time type for
MailFetcher.last_checked to allow for natural time arithmetic.
2018-05-11 14:14:50 +02:00
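The idea in b74b47423d can be sketched roughly as follows: measure how long one `loop_step` call takes and sleep only for the remainder of the cycle. This is a minimal sketch, not the commit's code; `run_loop`, `cycle_time`, `iterations`, and the injectable `sleep` and `clock` parameters are assumptions introduced here to keep the example testable.

```python
import time


def run_loop(loop_step, cycle_time, iterations, sleep=time.sleep, clock=time.time):
    """Call loop_step about once every cycle_time seconds.

    Hypothetical helper for illustration; names are not from the project.
    """
    for _ in range(iterations):
        started = clock()
        loop_step()
        elapsed = clock() - started
        # Sleep only for what is left of the cycle. If loop_step took longer
        # than the cycle, do not sleep at all (never pass a negative value).
        sleep(max(0.0, cycle_time - elapsed))
```

With a fixed sleep, each cycle would last the sleep time plus the processing time; subtracting the elapsed time keeps the cycle length close to `cycle_time` regardless of how long a step takes.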
Erik Arvstedt
aac17670de Refactor: renamings, extract fn 'loop'
Renamings:
loop -> loop_step
delta -> next_mail_time (this variable names a point in time, not a duration)

Extracting the 'loop' fn is a preparation for later commits where a
second type of loop is added.
2018-05-11 14:14:25 +02:00
Erik Arvstedt
f56ec70aad Ensure docs have been unmodified for some time before consuming
Previously, the second mtime check for new files usually happened right
after the first one, which could have caused consumption of docs that
were still being modified.

We're now waiting for at least FILES_MIN_UNMODIFIED_DURATION (0.5s).

This also cleans up the logic by eliminating the consumer.stats attribute
and the weird double call to consumer.run().

Additionally, this fixes a memory leak in consumer.stats where paths could be
added but never removed if the corresponding files disappeared from
the consumer dir before being considered ready.
2018-05-11 14:05:29 +02:00
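A readiness check of the kind described in f56ec70aad might look like the sketch below: a file counts as ready only once its modification time lies at least `FILES_MIN_UNMODIFIED_DURATION` seconds in the past. The constant name and the 0.5 s value come from the commit message; the function `is_ready` and its parameters are hypothetical, and the real code may track files differently.

```python
import os
import time

# Name and value taken from the commit message above.
FILES_MIN_UNMODIFIED_DURATION = 0.5  # seconds


def is_ready(path, now=None, min_unmodified=FILES_MIN_UNMODIFIED_DURATION):
    """Return True if path exists and has been unmodified long enough.

    Hypothetical helper for illustration; not the project's implementation.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        # The file disappeared before it became ready; nothing to consume
        # and no per-file state is kept that could leak.
        return False
    return now - mtime >= min_unmodified
```

Because this sketch keeps no per-path state between calls, a file that vanishes from the directory leaves nothing behind, which is one way to avoid the kind of leak the commit message mentions.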
Erik Arvstedt
9320230100 Refactor: extract fn 'make_dirs' 2018-05-11 14:04:36 +02:00
Daniel Quinn
19209ba5af Run a --oneshot loop twice
This was necessary since the first loop only ever collects file
statistics so that the second run can be sure about "readiness".
2018-03-03 18:43:20 +00:00