paperless-ngx

mirror of https://github.com/paperless-ngx/paperless-ngx.git synced 2026-02-16 00:19:32 -06:00

Author	SHA1	Message	Date
shamoon	e14f4c94c2	Fix: ghostscript rendering error doesnt trigger frontend failure message (#4092) * Raise ParseError from gs rendering error * catch all parser errors as generic exception * Differentiate generic vs parse errors during consumption	2023-08-31 19:49:00 -07:00
Dennis Brakhane	93009c1eed	Don't consider better OCR as failing Tesseract 5.3.0 does a better job at OCR, and correctly reads "a webp" instead of "awebp", this is good, so we don't want the test to fail.	2023-07-11 16:44:18 +02:00
Trenton H	111960c530	Adds better handling for files with invalid utf8 content	2023-05-13 09:29:18 -07:00
Trenton H	6f163111ce	Upgrades black to v23, upgrades ruff	2023-04-26 09:35:27 -07:00
Trenton H	3bcbd05252	Fixes ruff not running isort against the codebase	2023-04-26 09:35:27 -07:00
Trenton H	ce41ac9158	Configures ruff as the one stop linter and resolves warnings it raised	2023-04-01 17:03:52 -07:00
Brandon Rothweiler	ca412e0184	Add PAPERLESS_OCR_SKIP_ARCHIVE_FILE config setting	2023-02-23 22:42:57 -05:00
Brandon Rothweiler	8a89f5ae27	Revert "Merge pull request #2732 from bdr99/skip_neverarchive" This reverts commit `77b23d3acb`, reversing changes made to `5d8aa27831`.	2023-02-23 21:26:53 -05:00
Brandon Rothweiler	93a6391f96	Add a setting to disable creating an archive file	2023-02-22 15:27:17 -05:00
Trenton Holmes	0df91c31f1	Creates a mix-in for asserting file system states	2023-02-20 10:25:21 -08:00
Trenton H	bdcba570cb	Adding more test coverage, in particular around Tika and its parser	2023-02-05 11:01:55 -08:00
shamoon	985f298c46	Merge pull request #2302 from paperless-ngx/feature-fix-display-rtl-content	2023-01-10 07:30:52 -08:00
Trenton H	d7939ca958	Fixes some sample test files showing as modified after running tests	2023-01-05 08:39:48 -08:00
Trenton Holmes	7be9ae9c02	Try a new way of extracting text from a given PDF file	2023-01-03 12:43:31 -08:00
Trenton H	0fd51e35e1	Adds testing coverage of multipage TIFF with alpha, without and with alpha/sRGB	2023-01-03 09:56:19 -08:00
Trenton H	a2b7687c3b	In the case of an RTL language being extracted via pdfminer.six, fall back to forced OCR, which handles RTL text better	2022-12-29 16:02:02 -08:00
Trenton Holmes	55ef0d4a1b	Fixes language code checks around two part languages	2022-12-04 12:23:12 -08:00
Trenton H	e96d65f945	Allows parsing of WebP format images	2022-11-28 09:35:54 -08:00
Trenton H	f015556562	Adds a test to cover this edge case	2022-11-22 07:22:41 -08:00
Trenton Holmes	d1aa08850d	Reverts the change around skip_noarchive to align with how it is documented to work	2022-10-20 13:34:41 -07:00
Trenton Holmes	b3b2519bf0	Fixes the creation of an archive file, even if noarchive was specified	2022-08-20 13:47:56 -07:00
Trenton Holmes	49a843dcdd	Changes the simple-alpha parsing test to use a tempdir so the original isn't modified in Git	2022-07-02 16:19:22 +02:00
Trenton Holmes	1771d18a21	Runs the pre-commit hooks over all the Python files	2022-03-11 11:34:28 -08:00
kpj	fc695896dd	Format Python code with black	2022-02-27 15:26:41 +01:00
Martin Müller	73a8569d21	Modify test for PNG image with alpha	2022-02-21 22:38:25 +01:00
jonaswinkler	0e596bd1fc	also apply \0 removal to sidecar contents	2021-03-22 23:08:34 +01:00
jonaswinkler	40ce38254b	fixes #631	2021-03-14 14:42:48 +01:00
jonaswinkler	6ab884a95c	update dependencies	2021-02-28 13:01:26 +01:00
jonaswinkler	99a18516b2	tests	2021-02-22 00:17:16 +01:00
jonaswinkler	50c1978d36	tests	2021-02-21 00:18:34 +01:00
jonaswinkler	9cbb1c5726	add some test files	2021-02-21 00:13:08 +01:00
jonaswinkler	56bd966c02	local import of ocrmypdf so that the webserver does not load that	2021-02-15 12:18:10 +01:00
jonaswinkler	89d6e422f5	fix bugs and test cases	2021-01-02 15:37:27 +01:00
jonaswinkler	1b1b57eb6a	more tests	2020-12-19 15:54:13 +01:00
jonaswinkler	a0631413d6	fixes bauerj/paperless_app#23 and most of all other scanner apps out there.	2020-12-12 18:25:15 +01:00
jonaswinkler	e3ce573fbb	a couple fixes and more supported image files	2020-12-02 17:39:49 +01:00
jonaswinkler	12fa844c7f	testing the new noarchive option.	2020-12-01 14:30:13 +01:00
jonaswinkler	ac1b701000	more tests!	2020-11-29 19:58:48 +01:00
jonaswinkler	06cfc3113a	test case fixes.	2020-11-27 14:06:37 +01:00
Jonas Winkler	e87575240d	more tests of the new parser	2020-11-26 00:08:23 +01:00
Jonas Winkler	f51d2be303	fixed the test cases	2020-11-25 19:51:09 +01:00
Jonas Winkler	56ce267f89	removed obsolete tests.	2020-11-25 14:51:32 +01:00
Jonas Winkler	41650f20f4	mime type handling	2020-11-20 13:31:03 +01:00
Jonas Winkler	1655d85a53	testing the tesseract parser	2020-11-19 20:31:08 +01:00
Jonas Winkler	d2e22e3f27	Changed the way parsers are discovered. This also prepares for upcoming changes regarding content types and file types: parsers should declare what they support, and actual file extensions should not be hardcoded everywhere.	2020-11-16 23:53:12 +01:00
Jonas Winkler	2e04ba1c04	code style fixes	2020-11-12 21:09:45 +01:00
Jonas Winkler	f182709fdd	fixed most of the tests	2020-11-02 19:42:23 +01:00
Jonas Winkler	7d282a4e4e	removed unused code, small fixes	2020-11-02 18:20:04 +01:00
Johannes Wienke	a311cd498c	Handle dateparser ValueErrors When parsing dates from the document text or filenames, correctly handle values errors indicating broken dates. Newly added tests ensure that this handling works properly.	2020-03-08 18:44:15 +01:00
Johannes Wienke	a3aab0cb48	Remove duplicated date parsing test The exact same tests existed twice in the file.	2020-03-08 18:26:29 +01:00

73 Commits