Jonas Winkler
052c1680f3
added
...
- document index
- api access for thumbnails/downloads
- more api filters
updated
- pipfile
removed
- filename handling
- legacy thumb/download access
- obsolete admin gui settings (per page items, FY, inline view)
2020-10-25 23:03:02 +01:00
Jonas Winkler
421dab786d
Merge branch 'master' into dev
2020-10-16 15:02:57 +02:00
JOKer
8698f92ac9
Merge pull request #593 from BastianPoe/feature-293
...
Give stored documents a structured and configurable filename
2020-05-02 08:33:49 +02:00
Johann Bauer
22c7f309a7
Warn if consume directory contains subdirectories
...
.
2020-01-04 01:09:54 +01:00
Wolf-Bastian Poettner
6813805712
Allow configuring directory and filename formats for documents stored in paperless
...
Default configuration is as before (incrementing numbers), but additional fields can be added at will
2019-12-27 14:25:38 +00:00
Jonas Winkler
ea58c66fd4
Merge branch 'master' into dev
2018-12-11 12:38:15 +01:00
Jonas Winkler
766109ae4e
Merge remote-tracking branch 'upstream/master'
2018-12-11 12:06:15 +01:00
Daniel Quinn
750ab5bf85
Use optipng to optimise document thumbnails
2018-10-07 14:56:38 +01:00
Daniel Quinn
14bb52b6a4
Wrap document consumption in a transaction #262
2018-10-07 13:12:22 +01:00
Jonas Winkler
b347e3347d
Restored tagging functionality
2018-09-27 20:41:16 +02:00
Jonas Winkler
11adc94e5e
mode change
2018-09-06 12:00:01 +02:00
Jonas Winkler
70bd05450a
removed matching model fields, automatic classifier reloading, added automatic_classification field to matching model
2018-09-04 18:40:26 +02:00
Erik Arvstedt
742b01d1f5
Update Consumer class documentation
2018-06-17 20:17:40 +01:00
Daniel Quinn
90cd9f3eb7
Drop lines thanks to @erikarvstedt's eagle-eye
2018-06-17 17:10:45 +01:00
Daniel Quinn
c9f35a7da2
Merge branch 'master' into mcronce-disable_encryption
2018-06-17 16:32:51 +01:00
Daniel Quinn
81a8cb45d7
It's exist_ok=, not exists_ok= -- my bad.
2018-05-28 13:08:00 +01:00
Daniel Quinn
6e1f2b3f03
Drop STORAGE_TYPE in favour of just using PAPERLESS_PASSPHRASE
2018-05-28 12:58:28 +01:00
Daniel Quinn
d8740ee5ca
Make the consumer aware of the different storage types
2018-05-28 12:58:28 +01:00
Erik Arvstedt
bccac5017c
fixup: remove helper fn 'make_dirs'
2018-05-21 00:45:00 +02:00
Erik Arvstedt
e65e27d11f
Consider mtime of ignored files, garbage-collect ignore list
...
1. Store the mtime of ignored files so that we can reconsider them if
they have changed.
2. Regularly reset the ignore list to files that still exist in the
consumption dir. Previously, the list could grow indefinitely.
2018-05-11 14:05:30 +02:00
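The mtime-aware ignore list described in e65e27d11f can be sketched roughly as follows. This is only an illustration of the two points in the commit message, not the code from the commit: the `IgnoreList` class and its method names are hypothetical, and entries are assumed to be `(path, mtime)` pairs.

```python
class IgnoreList:
    """Hypothetical sketch: remember ignored files together with their mtime."""

    def __init__(self):
        # Each entry is a (path, mtime) pair, so a modified file no longer matches.
        self._ignored = set()

    def add(self, path, mtime):
        self._ignored.add((path, mtime))

    def is_ignored(self, path, mtime):
        # A file is only skipped while its mtime is unchanged (point 1).
        return (path, mtime) in self._ignored

    def collect_garbage(self, current_files):
        # Keep only entries that still correspond to a file in the
        # consumption dir, so the list cannot grow indefinitely (point 2).
        self._ignored &= set(current_files)
```

Because the mtime is part of the key, a file that is rewritten after being ignored is reconsidered on the next pass without any extra bookkeeping.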
Erik Arvstedt
12488c9634
Simplify ignoring docs
2018-05-11 14:05:29 +02:00
Erik Arvstedt
61cd050e24
Ensure docs have been unmodified for some time before consuming
...
Previously, the second mtime check for new files usually happened right
after the first one, which could have caused consumption of docs that
were still being modified.
We're now waiting for at least FILES_MIN_UNMODIFIED_DURATION (0.5s).
This also cleans up the logic by eliminating the consumer.stats attribute
and the weird double call to consumer.run().
Additionally, this fixes a memory leak in consumer.stats where paths could be
added but never removed if the corresponding files disappeared from
the consumer dir before being considered ready.
2018-05-11 14:05:29 +02:00
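The readiness check from 61cd050e24 might look something like the sketch below. Only the constant name and its 0.5s value come from the commit message; the `is_ready` helper and its `now` parameter are assumptions made for illustration.

```python
import os
import time

# Name and value taken from the commit message above.
FILES_MIN_UNMODIFIED_DURATION = 0.5  # seconds


def is_ready(path, now=None):
    """Hypothetical helper: True once `path` has been left unmodified
    for at least FILES_MIN_UNMODIFIED_DURATION seconds."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_mtime >= FILES_MIN_UNMODIFIED_DURATION
```

A caller would skip files for which `is_ready` returns False and look at them again on the next pass, rather than keeping per-file state between passes.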
Erik Arvstedt
f018e8e54f
Refactor: extract fn try_consume_file
...
The main purpose of this change is to make the following commits more
readable.
2018-05-11 14:05:28 +02:00
Erik Arvstedt
a56a3eb86d
Use os.scandir instead of os.listdir
...
It's simpler and better suited for use cases introduced in later commits.
2018-05-11 14:05:25 +02:00
Erik Arvstedt
2fe7df8ca0
Consume documents in order of increasing mtime
...
This increases overall usability, especially for multi-page scans.
Previously, the consumption order was undefined (see os.listdir())
2018-05-11 14:04:37 +02:00
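As a rough illustration of 2fe7df8ca0 and a56a3eb86d, ordering by mtime with os.scandir could be done along these lines. The function name `files_by_mtime` is hypothetical and the actual consumer code may differ.

```python
import os


def files_by_mtime(directory):
    """Hypothetical helper: paths of regular files in `directory`,
    oldest modification time first."""
    with os.scandir(directory) as entries:
        # entry.stat() gives the mtime without building the path by hand.
        files = [
            (entry.stat().st_mtime, entry.path)
            for entry in entries
            if entry.is_file()
        ]
    # Tuples sort by mtime first; equal mtimes fall back to path order.
    return [path for _, path in sorted(files)]
```

For a multi-page scan written page by page, this yields the pages in the order they were saved, which os.listdir() alone does not guarantee.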
Erik Arvstedt
873c98dddb
Refactor: extract fn 'make_dirs'
2018-05-11 14:04:36 +02:00
Daniel Quinn
73e62600c2
Clean up docstring to be properly rst
2018-03-03 18:43:20 +00:00
Ovv
8fefafb844
style & test
2018-03-03 18:43:20 +00:00
Ovv
d1a57b5d68
Configuration cli argument for document_consumer
2018-03-03 18:43:20 +00:00
Daniel Quinn
ea6d040809
Monitor return codes of calls to convert and unpaper
...
...and handle the failures nicely. Addresses #303.
2018-02-18 16:02:27 +00:00
Daniel Quinn
fb1da4834c
Style and removal of Python 2.7 stuff
2018-02-18 15:55:55 +00:00
Wolf-Bastian Pöttner
b140935843
Add support for a heuristic that extracts the document date from its text
2018-01-28 19:37:10 +01:00
Daniel Quinn
fa4924d5ba
fix: allow for caps in file name suffixes #206
...
@schinkelg ran aground of this one and I took the opportunity to add a
test to catch this sort of thing for next time.
2017-03-28 21:14:24 +00:00
Daniel Quinn
55e81ca4bb
feat: refactor for pluggable consumers
...
I've broken out the OCR-specific code from the consumers and dumped it
all into its own app, `paperless_tesseract`. This new app should serve
as a sample of how to create one's own consumer for different file
types.
Documentation for how to do this isn't ready yet, but for the impatient:
* Create a new app
* containing a `parsers.py` for your parser modelled after
`paperless_tesseract.parsers.RasterisedDocumentParser`
* containing a `signals.py` with a handler modelled after
`paperless_tesseract.signals.ConsumerDeclaration`
* connect the signal handler to
`documents.signals.document_consumer_declaration` in
`your_app.apps`
* Install the app into Paperless by declaring
`PAPERLESS_INSTALLED_APPS=your_app`. Additional apps should be
separated with commas.
* Restart the consumer
2017-03-25 15:10:25 +00:00
Daniel Quinn
18495ce9da
Fix for #154
...
* Added a test with a faked pyocr and tesseract
* Added a catch for pyocr's *other* TesseractError
2016-11-27 15:06:45 +00:00
Daniel Quinn
ca21929cee
Moved logging logic into the consumer
2016-10-26 09:52:09 +00:00
Daniel Quinn
8e58406881
pep8 corrections
2016-10-26 09:32:59 +00:00
Aleksandr Bogdanov
63de2ca1b0
Collapsing excess whitespace after OCR
2016-10-12 01:46:34 +02:00
Daniel Quinn
1ce76a5486
Actually write the date found in the file name
2016-08-20 18:11:51 +01:00
Lenz Weber
018efc576b
wait until file is completely transmitted
...
negation was missing for feature to be active, see #128
2016-06-26 10:18:58 +02:00
Brian Martin
b6ae129ad1
Sample Config and Bug Fix
...
Update sample config to reflect new setting variable.
Change consumer to handle density setting as str instead of int.
2016-05-13 23:23:58 -04:00
Brian Martin
52c5aafb3f
Convert Density
...
Add settings variable for the convert density setting.
If no variable is set, default to 300.
2016-05-13 22:47:40 -04:00
Daniel Quinn
e96c7448bc
Fix for #107
2016-04-11 23:28:12 +01:00
Daniel Quinn
90939be6af
@Pitkley made a good suggestion in #98
2016-04-10 17:39:49 +01:00
Daniel Quinn
64b72d4337
Added test for duplicates
2016-04-03 18:44:00 +01:00
Daniel Quinn
bbe691f342
Merge pull request #101 from danielquinn/issue/89
...
Closes #89.
2016-03-28 14:25:56 +01:00
Daniel Quinn
b4e648e1e3
Test All The Things
2016-03-28 14:16:26 +01:00
Daniel Quinn
b92e007e15
Removed log components and introduced signals for tags & correspondents
2016-03-28 11:11:15 +01:00
Daniel Quinn
49b56425e8
Merge branch 'master' into issue/81
2016-03-25 20:56:30 +00:00
Daniel Quinn
b387be6f25
I didn't mean to explicitly set -limit
2016-03-25 20:33:00 +00:00