35 Commits

Author SHA1 Message Date
Dennis Brakhane
8c7554e081
Feature: collate two single-sided multipage scans (#3784)
* Feature: collate two single-sided scans

Some ADFs only support single-sided scans, which makes scanning
double-sided documents a bit annoying.

This new feature enables Paperless to do most of the work
by merging two separate scans into a single one, collating
the even- and odd-numbered pages.

* Documentation: clarify that collation is disabled by default

* Apply suggestions from code review

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>

* Address code review remarks

* Grammar fixes

---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-07-24 00:29:04 -07:00
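The collation described in the commit above can be sketched as follows. This is a minimal illustration on plain Python lists, not the code from the commit. It assumes the second scan holds the even-numbered pages in reverse order (as happens when the stack is flipped and fed again), which the commit message does not state; the function name `collate_pages` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def collate_pages(odd_pages: Sequence[T], even_pages_reversed: Sequence[T]) -> List[T]:
    """Interleave two single-sided scans into one double-sided page order.

    Assumption: `even_pages_reversed` lists the even pages last-to-first.
    """
    if len(odd_pages) != len(even_pages_reversed):
        raise ValueError("both scans must contain the same number of pages")

    # Restore ascending order for the even pages.
    even_pages = list(reversed(even_pages_reversed))

    collated: List[T] = []
    for odd, even in zip(odd_pages, even_pages):
        collated.append(odd)   # front side
        collated.append(even)  # back side
    return collated


if __name__ == "__main__":
    # Pages are represented by their numbers for illustration.
    print(collate_pages([1, 3, 5], [6, 4, 2]))
```

Rejecting scans of unequal length is a design choice for the sketch; a real implementation might instead have to cope with a blank last page that was skipped during the second pass.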
Bastian Machek
931f5f9c27
Feature: support barcode upscaling for better detection of small barcodes (#3655) 2023-06-27 10:18:47 -07:00
Trenton H
07e07fc7e8 Updates handling of barcodes to encapsulate logic, moving it out of tasks and into barcodes 2023-05-22 06:52:31 -07:00
Fabian Ohler
658d372cd2
Feature: split documents on ASN barcode (#2554)
* also split documents when an ASN barcode is found

* linter

* fix test case parameters

* avoid pre-python-3.9 features

* simplify dict-creation in tests

* simplify dict-creation in tests for empty dicts

* Add test cases for the splitting by ASN barcode feature

* deleted supporting files for test case construction
2023-02-01 01:13:30 -08:00
Trenton H
2ab77fbaf7 Removes pikepdf based scanning, fixes up unit testing (+ commenting) 2023-01-27 12:24:47 -08:00
shamoon
c7690c05f5
Merge pull request #2498 from paperless-ngx/fix-2496
Fix: limit asn integer size
2023-01-24 10:37:04 -08:00
Trenton H
4195d5746f Rescales images from PDFs so zbar can better find barcodes in them 2023-01-24 10:30:53 -08:00
Trenton H
8b90b51b1a Adjust the barcode to ASN range check and add test case to cover the check 2023-01-24 10:30:32 -08:00
Peter Kappelt
c2880bcf9a Extended tests for ASN barcode parsing 2023-01-24 09:43:52 -08:00
Trenton H
d52fbbb040 More smoothly handle the case of a password protected PDF for barcodes 2022-10-24 13:16:14 -07:00
Trenton Holmes
4cc2976614 Adds specific handling for CCITT Group 4, which pikepdf decodes, but not correctly 2022-10-11 13:51:14 -07:00
Trenton Holmes
9ae847039b Fixes the separation of files by barcode in the case where 2 barcodes appear back to back 2022-09-14 14:00:37 -07:00
Trenton Holmes
1df517afd3 Removes last vestiges of PNG from the tests, code, docs and samples 2022-06-11 14:20:50 -07:00
Trenton Holmes
8a6aaf4e2d Adds additional testing for both date parsing and consumed document created date 2022-05-08 16:57:35 -07:00
Florian Brandes
ad5188a280
add TIFF barcode support
Signed-off-by: Florian Brandes <florian.brandes@posteo.de>
2022-04-16 21:59:03 +02:00
Florian Brandes
e214f719c9 add more tests
Signed-off-by: Florian Brandes <florian.brandes@posteo.de>
2022-04-07 11:14:29 +02:00
Florian Brandes
10ca515ac5 added tests:
- barcode-39
- barcode-128
- qr barcodes
- test for consumption

Signed-off-by: Florian Brandes <florian.brandes@posteo.de>
2022-04-07 11:14:29 +02:00
florian on nixos (Florian Brandes)
bcce0838dd working split pages
Signed-off-by: florian on nixos (Florian Brandes) <florian.brandes@posteo.de>
2022-04-06 21:16:41 +02:00
florian on nixos (Florian Brandes)
76e43bcb89 add first tests for barcode reader
Signed-off-by: florian on nixos (Florian Brandes) <florian.brandes@posteo.de>
2022-04-06 21:16:41 +02:00
jonaswinkler
1e5a418191 more testing #511 2021-02-09 00:01:11 +01:00
jonaswinkler
0c676b90f2 migration for #511 2021-02-08 20:59:14 +01:00
jonaswinkler
731418349f added a test case that replicates #511 2021-02-07 18:23:54 +01:00
jonaswinkler
0927f9d477 some bug fixes and tests 2021-01-18 14:16:32 +01:00
jonaswinkler
a68b858733 new exporter that updates the export in place, fixes #376 #343 #166 2021-01-18 01:15:39 +01:00
jonaswinkler
d690b34ee0 added invalid PDF document with BOM marker 2020-12-29 21:02:45 +01:00
jonaswinkler
a3143ec512 more tests! 2020-11-29 19:22:49 +01:00
jonaswinkler
a27daaebe9 fixes an issue with paperless not assigning metadata when FILENAME_FORMAT is specified and resolves an invalid warning about missing files, fixes #67 2020-11-29 14:45:43 +01:00
jonaswinkler
db0f7649d1 more tests. 2020-11-26 23:56:57 +01:00
Jonas Winkler
43b473dc53 Test cases for the API 2020-11-26 17:57:00 +01:00
Jonas Winkler
0b1637da62 first implementation of the mail rework 2020-11-15 23:56:22 +01:00
Daniel Quinn
55e81ca4bb feat: refactor for pluggable consumers
I've broken out the OCR-specific code from the consumers and dumped it
all into its own app, `paperless_tesseract`.  This new app should serve
as a sample of how to create one's own consumer for different file
types.

Documentation for how to do this isn't ready yet, but for the impatient:

* Create a new app
    * containing a `parsers.py` for your parser modelled after
      `paperless_tesseract.parsers.RasterisedDocumentParser`
    * containing a `signals.py` with a handler modelled after
      `paperless_tesseract.signals.ConsumerDeclaration`
    * connect the signal handler to
      `documents.signals.document_consumer_declaration` in
      `your_app.apps`
* Install the app into Paperless by declaring
  `PAPERLESS_INSTALLED_APPS=your_app`.  Additional apps should be
  separated with commas.
* Restart the consumer
2017-03-25 15:10:25 +00:00
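The steps listed in the commit above follow a signal-based plug-in pattern, which can be sketched without Django as shown here. Everything in this block is an assumption made for illustration: `Signal` is a tiny stand-in for a Django signal, `TextFileParser` is a hypothetical parser, and the handler's return shape (a dict with `parser` and `weight`) is not confirmed by the commit message. Only the names `ConsumerDeclaration` and `document_consumer_declaration` come from the text.

```python
import re
from typing import Any, Callable, List, Optional, Tuple


class Signal:
    """Minimal stand-in for a Django signal (connect/send only)."""

    def __init__(self) -> None:
        self._receivers: List[Callable[..., Any]] = []

    def connect(self, receiver: Callable[..., Any]) -> None:
        self._receivers.append(receiver)

    def send(self, sender: Any, **kwargs: Any) -> List[Tuple[Callable[..., Any], Any]]:
        return [(r, r(sender=sender, **kwargs)) for r in self._receivers]


# Stand-in for documents.signals.document_consumer_declaration.
document_consumer_declaration = Signal()


class TextFileParser:
    """Hypothetical parser for plain-text files."""

    def __init__(self, path: str) -> None:
        self.path = path


class ConsumerDeclaration:
    """Handler that declares which files this app can parse (assumed shape)."""

    MATCHING_FILES = re.compile(r"^.*\.txt$")

    @classmethod
    def handle(cls, sender: Any, **kwargs: Any) -> Optional[dict]:
        path = kwargs.get("path", "")
        if cls.MATCHING_FILES.match(path.lower()):
            return {"parser": TextFileParser, "weight": 0}
        return None


# In a real app this connection would live in `your_app.apps`.
document_consumer_declaration.connect(ConsumerDeclaration.handle)


def pick_parser(path: str) -> Optional[type]:
    """Ask every connected handler and keep the highest-weight parser."""
    responses = document_consumer_declaration.send(sender=None, path=path)
    candidates = [result for _, result in responses if result is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c["weight"])["parser"]
```

With this sketch, `pick_parser("notes.txt")` returns `TextFileParser`, while a file type that no handler claims yields `None`.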
Daniel Quinn
18495ce9da Fix for #154
* Added a test with a faked pyocr and tesseract
* Added a catch for pyocr's *other* TesseractError
2016-11-27 15:06:45 +00:00
Florian Harr
9ff4b6c6bc UnitTests for inline attachment email 2016-04-14 13:01:03 -04:00
Daniel Quinn
6b0a537bff Added support for a shared secret in email 2016-02-14 03:01:24 +00:00
Daniel Quinn
c4311af263 Cleaned up the tests 2016-02-06 17:41:11 +00:00