Make the sleep duration dynamic to account for the time spent in
loop_step.
This improves responsiveness when repeatedly consuming newly
arriving docs.
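
A minimal sketch of the dynamic sleep, not the actual implementation:
the time spent in loop_step is subtracted from a target cycle time, so a
slow step shortens the following sleep. LOOP_CYCLE_TIME, remaining_sleep
and the injectable sleep/clock parameters are assumptions made for
illustration; only loop and loop_step are names from this commit.

```python
import time

LOOP_CYCLE_TIME = 10.0  # hypothetical target, in seconds, between cycle starts


def remaining_sleep(cycle_start, now, cycle_time=LOOP_CYCLE_TIME):
    """Return how long to sleep so that cycles start about cycle_time apart."""
    elapsed = now - cycle_start
    # Never return a negative duration if loop_step overran the cycle time.
    return max(0.0, cycle_time - elapsed)


def loop(loop_step, cycles, cycle_time=LOOP_CYCLE_TIME,
         sleep=time.sleep, clock=time.time):
    """Run loop_step `cycles` times, sleeping only for the time left over."""
    for _ in range(cycles):
        start = clock()
        loop_step()
        sleep(remaining_sleep(start, clock(), cycle_time))
```

With a 10 s cycle, a loop_step that takes 2 s is followed by an 8 s
sleep, and one that takes 15 s is followed by no sleep at all.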
Use float epoch seconds (time.time()) as the time type for
MailFetcher.last_checked to allow for natural time arithmetic.
Renamings:
loop -> loop_step
delta -> next_mail_time (this variable names a point in time, not a duration)
Extracting the 'loop' fn is a preparation for later commits where a
second type of loop is added.
Previously, the second mtime check for new files usually happened right
after the first one, which could have caused consumption of docs that
were still being modified.
We're now waiting for at least FILES_MIN_UNMODIFIED_DURATION (0.5s).
This also cleans up the logic by eliminating the consumer.stats attribute
and the weird double call to consumer.run().
Additionally, this fixes a memory leak in consumer.stats where paths could be
added but never removed if the corresponding files disappeared from
the consumer dir before being considered ready.
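
The readiness check described above could look roughly like this. It is
a sketch under assumptions, not the committed code: is_ready and
ready_files are hypothetical helper names, and only
FILES_MIN_UNMODIFIED_DURATION and its value of 0.5s come from this
message. Because nothing is remembered between calls, a file that
disappears leaves no stale entry behind.

```python
import os
import time

FILES_MIN_UNMODIFIED_DURATION = 0.5  # seconds, value stated in the commit message


def is_ready(path, now=None, min_unmodified=FILES_MIN_UNMODIFIED_DURATION):
    """Return True if `path` has been unmodified for at least min_unmodified seconds."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # The file vanished between listing and checking; it is simply not ready.
        return False
    return now - mtime >= min_unmodified


def ready_files(directory, now=None):
    """List regular files in `directory` that are old enough to be consumed."""
    if now is None:
        now = time.time()
    paths = (os.path.join(directory, name) for name in sorted(os.listdir(directory)))
    return [p for p in paths if os.path.isfile(p) and is_ready(p, now)]
```

Comparing the mtime against the current time replaces the need for a
second mtime check and for per-path state.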
Especially when first setting up the configuration for consuming
documents from emails, it makes sense to quickly test the changes. Having
to wait for 10 minutes is not acceptable.
There are two ways around it that come to my mind: the simple approach
is to always fetch the emails when Paperless first starts. This way the
fetching of emails can be tested straight away.
The alternative would be to have a configuration option that allows
setting the interval at which emails are checked. The user could then reduce
it to test the setup and increase it again later on. This seems
needlessly complicated though, so fetching at startup it is.
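
One way to get "fetch at startup" is to initialise last_checked to a
time far in the past, so the first check is always due. The sketch below
assumes that approach; MailFetcherSketch, check_mail and MAIL_INTERVAL
are hypothetical names, and the real fetcher may do this differently.
The 10 minute interval is the one mentioned above.

```python
import time

MAIL_INTERVAL = 600.0  # 10 minutes, in seconds


class MailFetcherSketch:
    """Hypothetical stand-in for a mail fetcher; it only tracks timing."""

    def __init__(self):
        # 0.0 epoch seconds is long ago, so the first check is always due.
        self.last_checked = 0.0
        self.pulls = 0

    def pull(self, now):
        # A real fetcher would talk to the mail server here.
        self.pulls += 1
        self.last_checked = now


def check_mail(fetcher, now=None, interval=MAIL_INTERVAL):
    """Pull mail if the interval has elapsed; return True if a pull happened."""
    if now is None:
        now = time.time()
    next_mail_time = fetcher.last_checked + interval  # a point in time
    if now >= next_mail_time:
        fetcher.pull(now)
        return True
    return False
```

The first call to check_mail after startup pulls immediately; later
calls pull only once the interval has passed since the last pull.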
Rename exporter to export and fix some debugging
Account for files not matching the sender/title pattern
Added a safety note
Fixed wrong regex in the name parser
Renamed the command to something slightly less ambiguous