31 Commits

Author SHA1 Message Date
Trenton Holmes
650c816a7b Removes support for Python 3.8 and lower from the code base 2023-09-10 11:42:59 -07:00
Dennis Brakhane
8c7554e081
Feature: collate two single-sided multipage scans (#3784)
* Feature: collate two single-sided scans

Some ADF only support single-sided scans, making scanning
double-sided documents a bit annoying.

This new feature enables Paperless to do most of the work,
by merging two seperate scans into a single one, collating
the even and odd numbered pages.

* Documentation: clarify that collation is disabled by default

* Apply suggestions from code review

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>

* Address code review remarks

* Grammar fixes

---------

Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-07-24 00:29:04 -07:00
Trenton H
9f5d47c320 Fixes issues with copy2 or copystat and SELinux see #3665 2023-07-22 06:27:49 -07:00
Bastian Machek
931f5f9c27
Feature: support barcode upscaling for better detection of small barcodes (#3655) 2023-06-27 10:18:47 -07:00
Trenton H
70f3f98363 Let ruff autofix some things from the newest version 2023-06-13 20:15:18 -07:00
Trenton H
883937bfd7 In cases where a temporary file is created or used, copy the original file stats to it 2023-06-07 09:02:19 -07:00
Trenton H
07e07fc7e8 Updates handling of barcodes to encapsulate logic, moving it out of tasks and into barcodes 2023-05-22 06:52:31 -07:00
Trenton H
ce41ac9158 Configures ruff as the one stop linter and resolves warnings it raised 2023-04-01 17:03:52 -07:00
Trenton H
3c2bbf244d Creates a data model for the document consumption, allowing stronger typing of arguments and setting of some information about the file only once 2023-04-01 11:05:34 -07:00
Trenton H
0778c2808b
Instead of using PIL directly to convert TIFF to PDF, use the existing library of img2pdf 2023-03-20 13:48:05 -07:00
Marvin Gaube
e89c0f15dd feature: Add support for zxing as barcode scanning lib 2023-03-19 13:48:35 +01:00
Trenton H
41bcfcaffe Changes out the settings and a decent amount of test code to be pathlib compatible 2023-03-06 09:16:07 -08:00
Trenton Holmes
8b3d01c49b When splitting via barcodes, cleanup the split documents better 2023-02-12 08:20:12 -08:00
Fabian Ohler
658d372cd2
Feature: split documents on ASN barcode (#2554)
* also split documents when an ASN barcode is found

* linter

* fix test case parameters

* avoid pre-python-3.9 features

* simplify dict-creation in tests

* simplify dict-creation in tests for empty dicts

* Add test cases for the splitting by ASN barcode feature

* deleted supporting files for test case construction
2023-02-01 01:13:30 -08:00
Trenton H
2ab77fbaf7 Removes pikepdf based scanning, fixes up unit testing (+ commenting) 2023-01-27 12:24:47 -08:00
Trenton H
7273a8c7a5 Tweaks the resizing based on testing 2023-01-24 10:30:53 -08:00
Trenton H
4195d5746f Rescales images from PDFs so zbar can better find them 2023-01-24 10:30:53 -08:00
Trenton H
7bc077ac08 Use dataclasses to group data about barcodes in documents 2023-01-24 09:43:52 -08:00
Peter Kappelt
31a03b1d30 Proper code formatting 2023-01-24 09:43:52 -08:00
Peter Kappelt
5004771d79 Unified separator ans ASN barcode parsing
so that barcode parsing won't run twice
2023-01-24 09:43:52 -08:00
Peter Kappelt
92b9fc1ba9 Feature: Parse ASN from barcode
ASN-Barcodes are identified by a configurable prefix
2023-01-24 09:43:52 -08:00
Peter Kappelt
585cc24dd5 split function for reading barcode and separating pages 2023-01-24 09:43:52 -08:00
Trenton H
10f6195bac Always use pikepdf, then pdf2image if needed to check for barcodes instead of requiring/allowing configuration 2022-11-09 13:01:39 -08:00
Trenton H
d52fbbb040 More smoothly handle the case of a password protected PDF for barcodes 2022-10-24 13:16:14 -07:00
Trenton H
f8ce6285df Allows using pdf2image instead of pikepdf if desired 2022-10-24 09:58:34 -07:00
Trenton Holmes
4cc2976614 Adds specific handling for CCITT Group 4, which pikepdf decodes, but not correctly 2022-10-11 13:51:14 -07:00
Trenton H
caf4b54bc7 In case pikepdf fails to convert an image to a PIL image, fall back to converting pages to PIL images 2022-10-11 13:51:13 -07:00
Trenton H
355b3fcb3d Fixes grammar in comment
Co-authored-by: Florian <florian.brandes@posteo.de>
2022-09-16 09:08:16 -07:00
Trenton Holmes
7aa0e5650b Updates how barcodes are detected, using pikepdf images, instead of converting each page to an image 2022-09-16 09:08:16 -07:00
Trenton Holmes
9ae847039b Fixes the seperation of files by barcode, during the case where 2 barcodes appear back to back 2022-09-14 14:00:37 -07:00
Trenton Holmes
ec045e81f2 Moves the barcode related functionality out of tasks and into its own location. Splits up the testing based on that 2022-07-02 16:19:22 +02:00