31 Commits

Author SHA1 Message Date
Trenton Holmes
0df91c31f1 Creates a mix-in for asserting file system states 2023-02-20 10:25:21 -08:00
Trenton H
bdcba570cb Adding more test coverage, in particular around Tika and its parser 2023-02-05 11:01:55 -08:00
shamoon
985f298c46
Merge pull request #2302 from paperless-ngx/feature-fix-display-rtl-content 2023-01-10 07:30:52 -08:00
Trenton H
d7939ca958 Fixes some sample test files showing as modified after running tests 2023-01-05 08:39:48 -08:00
Trenton Holmes
7be9ae9c02 Try a new way of extracting text from a given PDF file 2023-01-03 12:43:31 -08:00
Trenton H
0fd51e35e1 Adds testing coverage of multipage TIFF with alpha, without and with alpha/sRGB 2023-01-03 09:56:19 -08:00
Trenton H
a2b7687c3b In the case of an RTL language being extracted via pdfminer.six, fall back to forced OCR, which handles RTL text better 2022-12-29 16:02:02 -08:00
Trenton H
e96d65f945 Allows parsing of WebP format images 2022-11-28 09:35:54 -08:00
Trenton H
f015556562 Adds a test to cover this edge case 2022-11-22 07:22:41 -08:00
Trenton Holmes
d1aa08850d Reverts the change around skip_noarchive to align with how it is documented to work 2022-10-20 13:34:41 -07:00
Trenton Holmes
b3b2519bf0 Fixes the creation of an archive file, even if noarchive was specified 2022-08-20 13:47:56 -07:00
Trenton Holmes
49a843dcdd Changes the simple-alpha parsing test to use a tempdir so the original isn't modified in Git 2022-07-02 16:19:22 +02:00
Trenton Holmes
1771d18a21 Runs the pre-commit hooks over all the Python files 2022-03-11 11:34:28 -08:00
kpj
fc695896dd Format Python code with black 2022-02-27 15:26:41 +01:00
Martin Müller
73a8569d21 Modify test for PNG image with alpha 2022-02-21 22:38:25 +01:00
jonaswinkler
0e596bd1fc also apply \0 removal to sidecar contents 2021-03-22 23:08:34 +01:00
jonaswinkler
40ce38254b fixes #631 2021-03-14 14:42:48 +01:00
jonaswinkler
6ab884a95c update dependencies 2021-02-28 13:01:26 +01:00
jonaswinkler
99a18516b2 tests 2021-02-22 00:17:16 +01:00
jonaswinkler
50c1978d36 tests 2021-02-21 00:18:34 +01:00
jonaswinkler
56bd966c02 local import of ocrmypdf so that the webserver does not load that 2021-02-15 12:18:10 +01:00
jonaswinkler
89d6e422f5 fix bugs and test cases 2021-01-02 15:37:27 +01:00
jonaswinkler
a0631413d6 fixes bauerj/paperless_app#23 and most of all other scanner apps out there. 2020-12-12 18:25:15 +01:00
jonaswinkler
e3ce573fbb a couple fixes and more supported image files 2020-12-02 17:39:49 +01:00
jonaswinkler
12fa844c7f testing the new noarchive option. 2020-12-01 14:30:13 +01:00
jonaswinkler
ac1b701000 more tests! 2020-11-29 19:58:48 +01:00
jonaswinkler
06cfc3113a test case fixes. 2020-11-27 14:06:37 +01:00
Jonas Winkler
e87575240d more tests of the new parser 2020-11-26 00:08:23 +01:00
Jonas Winkler
f51d2be303 fixed the test cases 2020-11-25 19:51:09 +01:00
Jonas Winkler
56ce267f89 removed obsolete tests. 2020-11-25 14:51:32 +01:00
Jonas Winkler
1655d85a53 testing the tesseract parser 2020-11-19 20:31:08 +01:00