175 Commits

Author SHA1 Message Date
Jonas Winkler
bbba57dd4d implemented automatic classification field functionality 2018-09-05 14:31:02 +02:00
Jonas Winkler
582e9c5cb4 Fixed a few things 2018-09-05 12:43:11 +02:00
Jonas Winkler
9d4155a907 removed matching model fields, automatic classifier reloading, added autmatic_classification field to matching model 2018-09-04 18:40:26 +02:00
Jonas Winkler
8a1a794577 Document Type exporting 2018-09-04 14:55:29 +02:00
Jonas Winkler
c50c517928 Implemented the classifier model, including automatic tagging of new documents 2018-09-04 14:39:55 +02:00
Jonas Winkler
3eecd67fc1 Added code that trains models based on data from the database 2018-09-03 15:55:41 +02:00
Daniel Quinn
ef0b33e72e Clean up some linter complaints 2018-09-02 20:33:49 +01:00
Jonas Winkler
daa93883ee Added command to create datasets 2018-09-02 12:47:19 +02:00
Jonas Winkler
c03cfb176c inbox tags, archive tags, archive serial number for documents 2018-07-06 13:25:02 +02:00
Daniel Quinn
e7e69d3f6f Remove emoji from storage-type changer 2018-06-17 17:23:50 +01:00
Daniel Quinn
044d707c40 Update import & export to handle encryption toggle 2018-06-17 17:06:22 +01:00
Daniel Quinn
e7fefc40fe Merge branch 'master' into mcronce-disable_encryption 2018-06-17 16:32:51 +01:00
Daniel Quinn
d1b6e9329f It's exist_ok=, not exists_ok= -- my bad. 2018-05-28 13:08:00 +01:00
Daniel Quinn
4576541c28 Add script to (de|en)crypt all documents 2018-05-28 12:58:28 +01:00
Erik Arvstedt
d132e2b9f5 fixup: remove helper fn 'make_dirs' 2018-05-21 00:45:00 +02:00
Erik Arvstedt
3db175dfe2 Add inotify support 2018-05-11 14:14:50 +02:00
Erik Arvstedt
b74b47423d Consumer loop: make sleep duration dynamic
Make the sleep duration dynamic to account for the time spent in
loop_step.
This improves responsiveness when repeatedly consuming newly
arriving docs.

Use float epoch seconds (time.time()) as the time type for
MailFetcher.last_checked to allow for natural time arithmetic.
2018-05-11 14:14:50 +02:00
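
The dynamic sleep described in the commit above could look roughly like the sketch below. This is not the project's actual code: `loop` and `loop_step` are names taken from the commit messages, while `remaining_sleep`, `loop_time`, and `iterations` are hypothetical names introduced for illustration.

```python
import time


def remaining_sleep(loop_time, started, now):
    """Return the seconds left to sleep after the step took (now - started)."""
    # Never return a negative duration if the step overran loop_time.
    return max(0.0, loop_time - (now - started))


def loop(loop_step, loop_time, iterations):
    """Call loop_step repeatedly, aiming for loop_time seconds between starts."""
    for _ in range(iterations):
        started = time.time()  # float epoch seconds allow plain arithmetic
        loop_step()
        time.sleep(remaining_sleep(loop_time, started, time.time()))
```

Subtracting the time spent in the step means a slow step shortens the following sleep, so the next pass starts sooner.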
Erik Arvstedt
aac17670de Refactor: renamings, extract fn 'loop'
Renamings:
loop -> loop_step
delta -> next_mail_time (this variable names a point in time, not a duration)

Extracting the 'loop' fn is a preparation for later commits where a
second type of loop is added.
2018-05-11 14:14:25 +02:00
Erik Arvstedt
f56ec70aad Ensure docs have been unmodified for some time before consuming
Previously, the second mtime check for new files usually happened right
after the first one, which could have caused consumption of docs that
were still being modified.

We're now waiting for at least FILES_MIN_UNMODIFIED_DURATION (0.5s).

This also cleans up the logic by eliminating the consumer.stats attribute
and the weird double call to consumer.run().

Additionally, this fixes a memory leak in consumer.stats where paths could be
added but never removed if the corresponding files disappeared from
the consumer dir before being considered ready.
2018-05-11 14:05:29 +02:00
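
A minimal sketch of the readiness check described in the commit above, assuming a file's mtime is compared against the current time. `FILES_MIN_UNMODIFIED_DURATION` and its 0.5s value come from the commit message; the `is_ready` helper and its parameters are hypothetical and not taken from the project.

```python
import os
import time

# Value stated in the commit message.
FILES_MIN_UNMODIFIED_DURATION = 0.5


def is_ready(path, now=None, min_unmodified=FILES_MIN_UNMODIFIED_DURATION):
    """Return True if path exists and was last modified long enough ago."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # The file disappeared (or is unreadable); there is nothing to consume.
        return False
    if now is None:
        now = time.time()
    return now - mtime >= min_unmodified
```

Because the check derives everything from the file itself, no per-path bookkeeping is needed, which is one way to avoid the kind of leak the commit mentions.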
Erik Arvstedt
9320230100 Refactor: extract fn 'make_dirs' 2018-05-11 14:04:36 +02:00
Daniel Quinn
19209ba5af Run a --oneshot loop twice
This was necessary since the first loop only ever collects file
statistics so that the second run can be sure about "readiness".
2018-03-03 18:43:20 +00:00
Ovv
340855cd87 Help & documentation 2018-03-03 18:43:20 +00:00
Ovv
b10c2c770c style & test 2018-03-03 18:43:20 +00:00
Ovv
d89dbbe537 Configuration cli argument for document_consumer 2018-03-03 18:43:20 +00:00
Daniel Quinn
345bc97c8c Updated for style and to add a --use-first option 2018-02-08 20:03:29 +00:00
Dashie
3df9ea3b26 Fix line length 2018-02-08 19:34:48 +00:00
Dashie
73a9a23860 Add manager command to re-tag documents without correspondent 2018-02-08 19:34:48 +00:00
pzl
7a01005989 small typo in exporter thumbnail filename 2018-01-19 14:28:46 -05:00
David Martin
91cebb5567 Fetch emails right at startup instead of waiting for 10 minutes.
Especially when first setting up the configuration for consuming
documents from emails it makes sense to quickly test the changes. Having
to wait for 10 minutes is not acceptable.

There are two ways around it that come to my mind: the simple approach
is to always fetch the emails when Paperless first starts. This way the
fetching of emails can be tested straight away.
The alternative would be to have a configuration option for setting the
interval at which emails are checked. The user could then reduce
it to test the setup and increase it again later on. This seems
needlessly complicated though, so fetching at startup it is.
2017-05-21 14:23:46 +10:00
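
One way to get the fetch-at-startup behaviour chosen in the commit above is to treat "never checked" as "due now". The sketch below is illustrative only: the `MailSchedule` class and its methods are hypothetical, the `last_checked` attribute name is borrowed from a later commit message in this log, and the 600-second interval reflects the 10 minutes mentioned above.

```python
MAIL_INTERVAL = 600  # seconds; the 10-minute wait mentioned in the commit


class MailSchedule:
    """Track when mail was last fetched (hypothetical helper)."""

    def __init__(self, interval=MAIL_INTERVAL):
        self.interval = interval
        self.last_checked = None  # None means no fetch has happened yet

    def is_due(self, now):
        # The first call is always due, so mail is fetched right at startup.
        if self.last_checked is None:
            return True
        return now - self.last_checked >= self.interval

    def mark_checked(self, now):
        self.last_checked = now
```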
CkuT
cabb9b5096 Use relative paths instead of absolute paths for document export/import 2017-05-08 15:23:35 +02:00
CkuT
a4f389de36 Refactor to get the document time once 2017-05-08 15:02:59 +02:00
CkuT
909fa3579c Use constants for manifest 2017-05-08 14:54:48 +02:00
CkuT
4c4255172f Add thumbnail export 2017-05-06 15:14:36 +02:00
CkuT
0057feefd1 Fix the source file checking 2017-05-06 15:04:47 +02:00
Daniel Quinn
23bd887f16 Consumer loop time is now configurable 2017-01-01 18:41:06 +00:00
Daniel Quinn
30be13ae33 Added system checks to warn people of misconfigurations 2017-01-01 18:39:34 +00:00
Daniel Quinn
8e58406881 pep8 corrections 2016-10-26 09:32:59 +00:00
Daniel Quinn
01919f75d9 pep8 2016-08-20 18:14:33 +01:00
Daniel Quinn
c7dda9de96 A quick & easy way to see the logs 2016-08-20 18:08:28 +01:00
Daniel Quinn
b92e007e15 Removed log components and introduced signals for tags & correspondents 2016-03-28 11:11:15 +01:00
Daniel Quinn
0aa0513004 Modifications for support for dates 2016-03-24 19:18:33 +00:00
Daniel Quinn
3b278c3a24 Added an informational log message for consumer start 2016-03-06 17:26:07 +00:00
Daniel Quinn
5d4587ef8b Accounted for .sender in a few places 2016-03-04 09:14:50 +00:00
Daniel Quinn
ba7878b9aa Added some tests for the importer 2016-03-03 21:25:08 +00:00
Daniel Quinn
070463b85a s/Sender/Correspondent & reworked the (im|ex)porter 2016-03-03 20:52:42 +00:00
Daniel Quinn
439b60ce5c Merged new logging system 2016-02-28 15:01:19 +00:00
Daniel Quinn
631aa99d92 No need to pass verbosity around anymore 2016-02-28 00:39:40 +00:00
Daniel Quinn
51173d80cf License clarification 2016-02-27 20:19:09 +00:00
Daniel Quinn
422ae9303a pep8 2016-02-21 00:14:50 +00:00
Pit Kleyersburg
724afa59c7 Add Dockerfile for application and documentation
This commit adds a `Dockerfile` to the root of the project, accompanied
by a `docker-compose.yml.example` for simplified deployment. The
`Dockerfile` is agnostic to whether it runs the webserver, the
consumer, or a one-off command (e.g. creation of a
superuser, migration of the database, document export, ...).

The container's entrypoint is the `scripts/docker-entrypoint.sh` script.
This script verifies that the required permissions are set, remaps the
default user's and/or group's ID if required, and installs additional
languages if the user wishes.

After initialization, it analyzes the command the user supplied:

  - If the command starts with a slash, it is expected that the user
    wants to execute a binary file and the command will be executed
    without further intervention. (Using `exec` to effectively replace
    the started shell-script and not have any reaping-issues.)

  - If the command does not start with a slash, the command will be
    passed directly to the `manage.py` script without further
    modification. (Again using `exec`.)

The default command is set to `--help`.

If the user wants to execute a command that is not meant for `manage.py`
but doesn't start with a slash, the Docker `--entrypoint` parameter can
be used to circumvent the mechanics of `docker-entrypoint.sh`.

Further information can be found in `docs/setup.rst` and in
`docs/migrating.rst`.

For additional convenience, a `Dockerfile` has been added to the `docs/`
directory which allows for easy building and serving of the
documentation. This is documented in `docs/requirements.rst`.
2016-02-18 22:58:32 +01:00
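
The command dispatch described in the commit above can be modelled as follows. The real entrypoint is a shell script that uses `exec`; this Python sketch only mirrors the decision logic. The `resolve_command` function and the bare `manage.py` path are assumptions made for illustration.

```python
def resolve_command(args, manage_py="manage.py"):
    """Return the argv the entrypoint would run for the given arguments."""
    if not args:
        # The commit states that the default command is `--help`.
        args = ["--help"]
    if args[0].startswith("/"):
        # A leading slash is treated as a binary and run unchanged.
        return list(args)
    # Anything else is handed to manage.py without modification.
    return [manage_py, *args]
```

Under this model, `resolve_command(["migrate"])` would be handed to `manage.py`, while `resolve_command(["/bin/sh"])` would bypass it.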