19 Commits

Author SHA1 Message Date
Trenton Holmes
8b3d01c49b When splitting via barcodes, cleanup the split documents better 2023-02-12 08:20:12 -08:00
Fabian Ohler
658d372cd2
Feature: split documents on ASN barcode (#2554)
* also split documents when an ASN barcode is found

* linter

* fix test case parameters

* avoid pre-python-3.9 features

* simplify dict-creation in tests

* simplify dict-creation in tests for empty dicts

* Add test cases for the splitting by ASN barcode feature

* deleted supporting files for test case construction
2023-02-01 01:13:30 -08:00
Trenton H
2ab77fbaf7 Removes pikepdf based scanning, fixes up unit testing (+ commenting) 2023-01-27 12:24:47 -08:00
Trenton H
7273a8c7a5 Tweaks the resizing based on testing 2023-01-24 10:30:53 -08:00
Trenton H
4195d5746f Rescales images from PDFs so zbar can better find them 2023-01-24 10:30:53 -08:00
Trenton H
7bc077ac08 Use dataclasses to group data about barcodes in documents 2023-01-24 09:43:52 -08:00
Peter Kappelt
31a03b1d30 Proper code formatting 2023-01-24 09:43:52 -08:00
Peter Kappelt
5004771d79 Unified separator and ASN barcode parsing
so that barcode parsing won't run twice
2023-01-24 09:43:52 -08:00
Peter Kappelt
92b9fc1ba9 Feature: Parse ASN from barcode
ASN-Barcodes are identified by a configurable prefix
2023-01-24 09:43:52 -08:00
Peter Kappelt
585cc24dd5 split function for reading barcode and separating pages 2023-01-24 09:43:52 -08:00
Trenton H
10f6195bac Always use pikepdf, then pdf2image if needed to check for barcodes instead of requiring/allowing configuration 2022-11-09 13:01:39 -08:00
Trenton H
d52fbbb040 More smoothly handle the case of a password protected PDF for barcodes 2022-10-24 13:16:14 -07:00
Trenton H
f8ce6285df Allows using pdf2image instead of pikepdf if desired 2022-10-24 09:58:34 -07:00
Trenton Holmes
4cc2976614 Adds specific handling for CCITT Group 4, which pikepdf decodes, but not correctly 2022-10-11 13:51:14 -07:00
Trenton H
caf4b54bc7 In case pikepdf fails to convert an image to a PIL image, fall back to converting pages to PIL images 2022-10-11 13:51:13 -07:00
Trenton H
355b3fcb3d Fixes grammar in comment
Co-authored-by: Florian <florian.brandes@posteo.de>
2022-09-16 09:08:16 -07:00
Trenton Holmes
7aa0e5650b Updates how barcodes are detected, using pikepdf images, instead of converting each page to an image 2022-09-16 09:08:16 -07:00
Trenton Holmes
9ae847039b Fixes the separation of files by barcode, during the case where 2 barcodes appear back to back 2022-09-14 14:00:37 -07:00
Trenton Holmes
ec045e81f2 Moves the barcode related functionality out of tasks and into its own location. Splits up the testing based on that 2022-07-02 16:19:22 +02:00