Trenton H
e6f59472e4
Chore: Drop Python 3.9 support ( #7774 )
2024-09-26 12:22:24 -07:00
Lukas Metzger
cc25cbc026
Refactor: performance and storage optimization of barcode scanning ( #7646 )
...
---------
Co-authored-by: Lukas Metzger <1814751+loewexy@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2024-09-07 16:11:36 -07:00
Trenton H
622f624132
Chore: Change the code formatter to Ruff ( #6756 )
...
* Changing the formatting to ruff-format
* Replaces references to black to ruff or ruff format, removes black from dependencies
2024-05-18 02:26:50 +00:00
Trenton H
7be7185418
Handcrafts SQL queries a little more to reduce the query count and/or the amount of returned data ( #6489 )
2024-04-30 07:37:09 -07:00
Trenton H
b9636a3def
Feature: Allow user to control PIL image pixel limit ( #5997 )
2024-03-05 00:19:56 +00:00
Trenton H
8d664fad56
Fixes the interaction when both splitting and ASN are enabled ( #5779 )
2024-02-15 17:33:26 +00:00
Trenton H
907b6d1294
Handle the ASN assignment last, after the splitting ( #5745 )
2024-02-13 16:26:13 +00:00
Trenton H
21f96f0679
Fix: Splitting on ASN barcodes even if not enabled ( #5740 )
...
* Fixes the barcodes always splitting on ASNs, even if splitting was disabled
2024-02-12 12:58:37 -08:00
pkrahmer
fb82aa0ee1
Feature: Allow tagging by putting barcode stickers on documents ( #5580 )
2024-02-05 17:38:19 +00:00
Trenton H
2da5e46386
Refactor file consumption task to allow beginnings of a plugin system ( #5367 )
2024-01-13 16:11:14 +00:00
Trenton H
bd35030c59
Fix: Crash in barcode ASN reading when the file type isn't supported ( #5261 )
...
* Fixes a random crash in the barcode ASN reading so it doesn't try to access a temp dir that was never created
* Don't parse the barcodes twice, store the result instead
2024-01-06 05:08:24 +00:00
Evgenii
151d337f6c
Fix: update ASN regex to support Unicode ( #5099 )
2023-12-25 16:33:30 -08:00
Trenton H
122e4141b0
Fix: Document metadata is lost during barcode splitting ( #4982 )
...
* Fixes barcode splitting dropping metadata that might be needed for round 2
2023-12-15 09:17:25 -08:00
Sebastian Porombka
90db397ec6
barcode logic: strip non-numeric characters from detected ASN string ( #4379 )
...
* Legacy barcodes exist that still contain characters after the number. The previous logic did not truncate them and instead called int() on the remaining string, which fails in this case. It is therefore sufficient to keep processing only the numeric characters.
* lint
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-10-17 03:44:22 +00:00
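The ASN fix described in the commit above can be illustrated with a minimal sketch. It is not the project's actual code: the function name `parse_asn` and the default prefix `"ASN"` are assumptions made for illustration. It follows the commit title by stripping every non-numeric character, which also removes trailing legacy characters before `int()` is called.

```python
import re
from typing import Optional


def parse_asn(barcode_text: str, prefix: str = "ASN") -> Optional[int]:
    """Return the ASN encoded in a barcode string, or None if there is none.

    Hypothetical helper: the name and the default prefix are assumptions.
    """
    if not barcode_text.startswith(prefix):
        # Not an ASN barcode at all
        return None
    remainder = barcode_text[len(prefix):]
    # Drop everything that is not a digit so int() cannot fail on
    # leftover legacy characters such as a trailing letter
    digits = re.sub(r"\D", "", remainder)
    if not digits:
        return None
    return int(digits)
```

Note that stripping (rather than truncating at the first non-digit) joins digits separated by other characters, so `"ASN12-34"` would be read as 1234 in this sketch.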
Trenton Holmes
650c816a7b
Removes support for Python 3.8 and lower from the code base
2023-09-10 11:42:59 -07:00
Dennis Brakhane
8c7554e081
Feature: collate two single-sided multipage scans ( #3784 )
...
* Feature: collate two single-sided scans
Some ADFs only support single-sided scans, making scanning
double-sided documents a bit annoying.
This new feature enables Paperless to do most of the work,
by merging two separate scans into a single one, collating
the even and odd numbered pages.
* Documentation: clarify that collation is disabled by default
* Apply suggestions from code review
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
* Address code review remarks
* Grammar fixes
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-07-24 00:29:04 -07:00
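The collation idea from the commit above can be sketched on plain lists of pages. This is an illustrative sketch only, not the project's implementation: the function name `collate` and the `even_reversed` flag are assumptions. The flag reflects the common situation where the stack is flipped for the second pass, so the even pages arrive in reverse order; the commit message itself does not state this.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def collate(
    odd_pages: Sequence[T],
    even_pages: Sequence[T],
    even_reversed: bool = True,
) -> List[T]:
    """Interleave two single-sided scans into one double-sided document.

    Hypothetical helper: `even_reversed` assumes the second scan was made
    after flipping the stack, so its pages come last-to-first.
    """
    evens = list(reversed(even_pages)) if even_reversed else list(even_pages)
    if len(evens) != len(odd_pages):
        raise ValueError("both scans must contain the same number of pages")
    merged: List[T] = []
    for odd, even in zip(odd_pages, evens):
        # Front side first, then its back side
        merged.extend((odd, even))
    return merged
```

For example, `collate(["p1", "p3", "p5"], ["p6", "p4", "p2"])` yields the pages in reading order, `p1` through `p6`.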
Trenton H
9f5d47c320
Fixes issues with copy2 or copystat and SELinux, see #3665
2023-07-22 06:27:49 -07:00
Bastian Machek
931f5f9c27
Feature: support barcode upscaling for better detection of small barcodes ( #3655 )
2023-06-27 10:18:47 -07:00
Trenton H
70f3f98363
Let ruff autofix some things from the newest version
2023-06-13 20:15:18 -07:00
Trenton H
883937bfd7
In cases where a temporary file is created or used, copy the original file stats to it
2023-06-07 09:02:19 -07:00
Trenton H
07e07fc7e8
Updates handling of barcodes to encapsulate logic, moving it out of tasks and into barcodes
2023-05-22 06:52:31 -07:00
Trenton H
ce41ac9158
Configures ruff as the one stop linter and resolves warnings it raised
2023-04-01 17:03:52 -07:00
Trenton H
3c2bbf244d
Creates a data model for the document consumption, allowing stronger typing of arguments and setting of some information about the file only once
2023-04-01 11:05:34 -07:00
Trenton H
0778c2808b
Instead of using PIL directly to convert TIFF to PDF, use the existing library of img2pdf
2023-03-20 13:48:05 -07:00
Marvin Gaube
e89c0f15dd
feature: Add support for zxing as barcode scanning lib
2023-03-19 13:48:35 +01:00
Trenton H
41bcfcaffe
Changes out the settings and a decent amount of test code to be pathlib compatible
2023-03-06 09:16:07 -08:00
Trenton Holmes
8b3d01c49b
When splitting via barcodes, cleanup the split documents better
2023-02-12 08:20:12 -08:00
Fabian Ohler
658d372cd2
Feature: split documents on ASN barcode ( #2554 )
...
* also split documents when an ASN barcode is found
* linter
* fix test case parameters
* avoid pre-python-3.9 features
* simplify dict-creation in tests
* simplify dict-creation in tests for empty dicts
* Add test cases for the splitting by ASN barcode feature
* deleted supporting files for test case construction
2023-02-01 01:13:30 -08:00
Trenton H
2ab77fbaf7
Removes pikepdf based scanning, fixes up unit testing (+ commenting)
2023-01-27 12:24:47 -08:00
Trenton H
7273a8c7a5
Tweaks the resizing based on testing
2023-01-24 10:30:53 -08:00
Trenton H
4195d5746f
Rescales images from PDFs so zbar can better find them
2023-01-24 10:30:53 -08:00
Trenton H
7bc077ac08
Use dataclasses to group data about barcodes in documents
2023-01-24 09:43:52 -08:00
Peter Kappelt
31a03b1d30
Proper code formatting
2023-01-24 09:43:52 -08:00
Peter Kappelt
5004771d79
Unified separator and ASN barcode parsing
...
so that barcode parsing won't run twice
2023-01-24 09:43:52 -08:00
Peter Kappelt
92b9fc1ba9
Feature: Parse ASN from barcode
...
ASN-Barcodes are identified by a configurable prefix
2023-01-24 09:43:52 -08:00
Peter Kappelt
585cc24dd5
split function for reading barcode and separating pages
2023-01-24 09:43:52 -08:00
Trenton H
10f6195bac
Always use pikepdf, then pdf2image if needed to check for barcodes instead of requiring/allowing configuration
2022-11-09 13:01:39 -08:00
Trenton H
d52fbbb040
More smoothly handle the case of a password protected PDF for barcodes
2022-10-24 13:16:14 -07:00
Trenton H
f8ce6285df
Allows using pdf2image instead of pikepdf if desired
2022-10-24 09:58:34 -07:00
Trenton Holmes
4cc2976614
Adds specific handling for CCITT Group 4, which pikepdf decodes, but not correctly
2022-10-11 13:51:14 -07:00
Trenton H
caf4b54bc7
In case pikepdf fails to convert an image to a PIL image, fall back to converting pages to PIL images
2022-10-11 13:51:13 -07:00
Trenton H
355b3fcb3d
Fixes grammar in comment
...
Co-authored-by: Florian <florian.brandes@posteo.de>
2022-09-16 09:08:16 -07:00
Trenton Holmes
7aa0e5650b
Updates how barcodes are detected, using pikepdf images, instead of converting each page to an image
2022-09-16 09:08:16 -07:00
Trenton Holmes
9ae847039b
Fixes the separation of files by barcode, in the case where 2 barcodes appear back to back
2022-09-14 14:00:37 -07:00
Trenton Holmes
ec045e81f2
Moves the barcode related functionality out of tasks and into its own location. Splits up the testing based on that
2022-07-02 16:19:22 +02:00