* Feature: collate two single-sided scans
Some automatic document feeders (ADFs) only support single-sided
scans, which makes scanning double-sided documents tedious.
This new feature lets Paperless do most of the work by merging
two separate scans into a single document, collating the odd-
and even-numbered pages.
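A minimal sketch of the collation step, not the actual implementation.
It works on plain lists of page objects; the `collate` function and its
`even_reversed` flag are hypothetical names. The flag reflects an
assumption that the stack is flipped for the second pass, so the back
sides arrive last page first.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def collate(odd_pages: Sequence[T], even_pages: Sequence[T],
            even_reversed: bool = True) -> List[T]:
    """Interleave front-side (odd) and back-side (even) pages.

    Assumption: both scans contain the same number of pages.
    """
    # Undo the reversed order of the second scan if requested.
    evens = list(reversed(even_pages)) if even_reversed else list(even_pages)
    if len(evens) != len(odd_pages):
        raise ValueError("both scans must contain the same number of pages")

    merged: List[T] = []
    for front, back in zip(odd_pages, evens):
        merged.append(front)  # pages 1, 3, 5, ...
        merged.append(back)   # pages 2, 4, 6, ...
    return merged


fronts = ["p1", "p3", "p5"]
backs = ["p6", "p4", "p2"]  # second scan, last page first
print(collate(fronts, backs))
```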
* Documentation: clarify that collation is disabled by default
* Apply suggestions from code review
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
* Address code review remarks
* Grammar fixes
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
* also split documents when an ASN barcode is found
* linter
* fix test case parameters
* avoid pre-python-3.9 features
* simplify dict-creation in tests
* simplify dict-creation in tests for empty dicts
* Add test cases for the splitting by ASN barcode feature
* deleted supporting files for test case construction
I've broken out the OCR-specific code from the consumers and dumped it
all into its own app, `paperless_tesseract`. This new app should serve
as a sample of how to create one's own consumer for different file
types.
Documentation for how to do this isn't ready yet, but for the impatient:
* Create a new app
  * containing a `parsers.py` for your parser modelled after
    `paperless_tesseract.parsers.RasterisedDocumentParser`
  * containing a `signals.py` with a handler modelled after
    `paperless_tesseract.signals.ConsumerDeclaration`
  * connect the signal handler to
    `documents.signals.document_consumer_declaration` in
    `your_app.apps`
* Install the app into Paperless by declaring
`PAPERLESS_INSTALLED_APPS=your_app`. Additional apps should be
separated with commas.
* Restart the consumer
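
The wiring described in the steps above can be sketched roughly as
follows. To stay runnable without Django, the sketch uses a tiny
stand-in `Signal` class in place of
`documents.signals.document_consumer_declaration`. `TextDocumentParser`
and `find_parser` are hypothetical, and the handler contract shown here
(the handler returns a test function, which returns a dict with
`parser` and `weight`) is an assumption; check
`paperless_tesseract.signals.ConsumerDeclaration` for the real shape.

```python
import re


class Signal:
    """Stand-in for django.dispatch.Signal (connect/send only)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Like Django, return a list of (receiver, response) pairs.
        return [(r, r(sender=sender, **kwargs)) for r in self._receivers]


# Stand-in for documents.signals.document_consumer_declaration.
document_consumer_declaration = Signal()


class TextDocumentParser:
    """Hypothetical parser; it would live in your_app/parsers.py."""

    def __init__(self, path):
        self.path = path


class ConsumerDeclaration:
    """Hypothetical handler; it would live in your_app/signals.py."""

    MATCHING_FILES = re.compile(r"^.*\.(txt|md)$")

    @classmethod
    def handle(cls, sender, **kwargs):
        # Assumed contract: hand back a callable that tests a file path.
        return cls.test

    @classmethod
    def test(cls, path):
        if cls.MATCHING_FILES.match(path.lower()):
            return {"parser": TextDocumentParser, "weight": 0}
        return None


# In a real app this call would go in your_app/apps.py,
# presumably inside AppConfig.ready().
document_consumer_declaration.connect(ConsumerDeclaration.handle)


def find_parser(path):
    """Ask every connected handler and keep the highest-weight match."""
    best = None
    for _receiver, test in document_consumer_declaration.send(sender=None):
        result = test(path)
        if result and (best is None or result["weight"] > best["weight"]):
            best = result
    return best["parser"] if best else None


print(find_parser("notes.txt"))
```

With the app listed in `PAPERLESS_INSTALLED_APPS`, the consumer would
then be able to pick this parser for matching files once restarted.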