Evgenii
1dc8477a00
Fix: update ASN regex to support Unicode ( #5099 )
2023-12-25 16:33:30 -08:00
Trenton H
e8877c2c0e
Fix: Document metadata is lost during barcode splitting ( #4982 )
...
* Fixes barcode splitting dropping metadata that might be needed for the second round
2023-12-15 09:17:25 -08:00
Sebastian Porombka
62fdc545b9
barcode logic: strip non-numeric characters from detected ASN string ( #4379 )
...
* Legacy barcodes exist that still contain characters after the number. The previous logic did not truncate them and instead called int() on the remaining string, which fails in this case. It is therefore sufficient to process only the numeric characters.
* lint
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-10-17 03:44:22 +00:00
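The ASN fix described in the entry above can be illustrated with a minimal sketch. It is not the project's actual code: the function name `parse_asn`, the default prefix `"ASN"`, and the choice to keep only the leading run of digits are assumptions made for illustration.

```python
import re
from typing import Optional


def parse_asn(barcode_text: str, prefix: str = "ASN") -> Optional[int]:
    """Return the ASN encoded in a barcode string, or None if there is none.

    Hypothetical helper: the name, the default prefix, and the
    "leading digits only" rule are assumptions, not the project's API.
    """
    if not barcode_text.startswith(prefix):
        return None
    # Drop the prefix and any surrounding whitespace.
    remainder = barcode_text[len(prefix):].strip()
    # Keep only the leading digits, so trailing characters on legacy
    # barcodes no longer make int() raise ValueError.
    match = re.match(r"[0-9]+", remainder)
    if match is None:
        return None
    return int(match.group(0))
```

Under these assumptions, `parse_asn("ASN00123abc")` yields `123`, where calling `int("00123abc")` directly would raise `ValueError`.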
Trenton Holmes
34b80a4d8e
Removes support for Python 3.8 and lower from the code base
2023-09-10 11:42:59 -07:00
Dennis Brakhane
ef749f9a29
Feature: collate two single-sided multipage scans ( #3784 )
...
* Feature: collate two single-sided scans
Some ADFs only support single-sided scans, making scanning
double-sided documents a bit annoying.
This new feature enables Paperless to do most of the work,
by merging two separate scans into a single one, collating
the even and odd numbered pages.
* Documentation: clarify that collation is disabled by default
* Apply suggestions from code review
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
* Address code review remarks
* Grammar fixes
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-07-24 00:29:04 -07:00
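The collation described in the entry above can be sketched on plain lists. This is an illustration only, not the project's implementation: the function name `collate` is hypothetical, and it assumes the second scan holds the even pages in reverse order (as happens when the stack is flipped and fed again), which the commit message does not state.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking "no even page for this position"


def collate(odd_pages, even_pages_reversed):
    """Interleave two single-sided scans into one page sequence.

    Hypothetical helper. Assumes odd_pages is in reading order
    (1, 3, 5, ...) and even_pages_reversed is back to front
    (..., 6, 4, 2).
    """
    even_pages = list(reversed(even_pages_reversed))
    if len(even_pages) > len(odd_pages):
        raise ValueError("more even pages than odd pages")
    merged = []
    for odd, even in zip_longest(odd_pages, even_pages, fillvalue=_MISSING):
        merged.append(odd)
        # The last sheet may have a blank back side that was not scanned.
        if even is not _MISSING:
            merged.append(even)
    return merged
```

With `odd_pages = ["p1", "p3", "p5"]` and `even_pages_reversed = ["p6", "p4", "p2"]`, the sketch produces the pages in order from `p1` to `p6`.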
Trenton H
e160580c8b
Fixes issues with copy2 or copystat and SELinux, see #3665
2023-07-22 06:27:49 -07:00
Bastian Machek
324e30bd4b
Feature: support barcode upscaling for better detection of small barcodes ( #3655 )
2023-06-27 10:18:47 -07:00
Trenton H
4504668cb2
Let ruff autofix some things from the newest version
2023-06-13 20:15:18 -07:00
Trenton H
e83be2e540
In cases where a temporary file is created or used, copy the original file stats to it
2023-06-07 09:02:19 -07:00
Trenton H
1396f25419
Updates handling of barcodes to encapsulate logic, moving it out of tasks and into barcodes
2023-05-22 06:52:31 -07:00
Trenton H
d2c02b9102
Configures ruff as the one stop linter and resolves warnings it raised
2023-04-01 17:03:52 -07:00
Trenton H
36a6df0bae
Creates a data model for the document consumption, allowing stronger typing of arguments and setting of some information about the file only once
2023-04-01 11:05:34 -07:00
Trenton H
f124228e86
Instead of using PIL directly to convert TIFF to PDF, use the existing library of img2pdf
2023-03-20 13:48:05 -07:00
Marvin Gaube
c66a0ec82e
feature: Add support for zxing as barcode scanning lib
2023-03-19 13:48:35 +01:00
Trenton H
ec2b0eb308
Changes out the settings and a decent amount of test code to be pathlib compatible
2023-03-06 09:16:07 -08:00
Trenton Holmes
e36d46f0df
When splitting via barcodes, cleanup the split documents better
2023-02-12 08:20:12 -08:00
Fabian Ohler
c08b19c7a9
Feature: split documents on ASN barcode ( #2554 )
...
* also split documents when an ASN barcode is found
* linter
* fix test case parameters
* avoid pre-python-3.9 features
* simplify dict-creation in tests
* simplify dict-creation in tests for empty dicts
* Add test cases for the splitting by ASN barcode feature
* deleted supporting files for test case construction
2023-02-01 01:13:30 -08:00
Trenton H
b19ada7a41
Removes pikepdf based scanning, fixes up unit testing (+ commenting)
2023-01-27 12:24:47 -08:00
Trenton H
f61536f74c
Tweaks the resizing based on testing
2023-01-24 10:30:53 -08:00
Trenton H
68c9f7a614
Rescales images from PDFs so zbar can better find them
2023-01-24 10:30:53 -08:00
Trenton H
1102a18697
Use dataclasses to group data about barcodes in documents
2023-01-24 09:43:52 -08:00
Peter Kappelt
147293a2cc
Proper code formatting
2023-01-24 09:43:52 -08:00
Peter Kappelt
b865890bce
Unified separator and ASN barcode parsing
...
so that barcode parsing won't run twice
2023-01-24 09:43:52 -08:00
Peter Kappelt
099b8b8161
Feature: Parse ASN from barcode
...
ASN-Barcodes are identified by a configurable prefix
2023-01-24 09:43:52 -08:00
Peter Kappelt
f8f8cc7dd0
split function for reading barcode and separating pages
2023-01-24 09:43:52 -08:00
Trenton H
189d02dfe6
Always use pikepdf, then pdf2image if needed to check for barcodes instead of requiring/allowing configuration
2022-11-09 13:01:39 -08:00
Trenton H
1e1f0347fa
More smoothly handle the case of a password protected PDF for barcodes
2022-10-24 13:16:14 -07:00
Trenton H
6d2851c693
Allows using pdf2image instead of pikepdf if desired
2022-10-24 09:58:34 -07:00
Trenton Holmes
ddef90d96e
Adds specific handling for CCITT Group 4, which pikepdf decodes, but not correctly
2022-10-11 13:51:14 -07:00
Trenton H
c888b3dfd3
In case pikepdf fails to convert an image to a PIL image, fall back to converting pages to PIL images
2022-10-11 13:51:13 -07:00
Trenton H
13465fcfda
Fixes grammar in comment
...
Co-authored-by: Florian <florian.brandes@posteo.de>
2022-09-16 09:08:16 -07:00
Trenton Holmes
b21f64de8a
Updates how barcodes are detected, using pikepdf images, instead of converting each page to an image
2022-09-16 09:08:16 -07:00
Trenton Holmes
33a4a273a3
Fixes the separation of files by barcode in the case where 2 barcodes appear back to back
2022-09-14 14:00:37 -07:00
Trenton Holmes
af204426af
Moves the barcode related functionality out of tasks and into its own location. Splits up the testing based on that
2022-07-02 16:19:22 +02:00