21 Commits

Author SHA1 Message Date
Trenton H
30655f1b73 Fixes ruff not running isort against the codebase 2023-04-26 09:35:27 -07:00
Trenton H
d2c02b9102 Configures ruff as the one stop linter and resolves warnings it raised 2023-04-01 17:03:52 -07:00
Trenton H
09ac404148 Adding more test coverage, in particular around Tika and its parser 2023-02-05 11:01:55 -08:00
Trenton H
8504b6f7da Cleans up and improves parser discovery testing, simplifies the determination of supported or not supported extensions and mime types 2023-01-05 08:39:48 -08:00
Trenton Holmes
ef6ebf9888 Entirely removes the optipng, updates ghostscript fall back to also use WebP. Updates the conversion to use a multiprocessing pool 2022-06-11 08:38:49 -07:00
Trenton Holmes
f62193099c Runs pyupgrade to Python 3.8+ and adds a hook for it 2022-05-06 09:04:08 -07:00
Trenton Holmes
6635fa5f0d Runs the pre-commit hooks over all the Python files 2022-03-11 11:34:28 -08:00
kpj
c56cb25b5f Format Python code with black 2022-02-27 15:26:41 +01:00
jonaswinkler
b04d91d68c fix a bug with thumbnail generation when TIKA was enabled 2021-02-09 22:12:43 +01:00
jonaswinkler
95f5c9f3a6 lazy loading for parsers 2021-02-04 13:17:24 +01:00
jonaswinkler
4311266e17 more test 2021-01-20 12:34:01 +01:00
jonaswinkler
75a226f507 test fixes and changelog 2020-12-02 22:44:18 +01:00
jonaswinkler
d7c424fdcd fix some tests. 2020-12-01 23:54:33 +01:00
jonaswinkler
1df64e3129 Merge branch 'dev' into feature-ocrmypdf 2020-11-30 16:48:09 +01:00
jonaswinkler
7658c07b4d added file type checks to the parsers to prevent temporary files from being consumed. Also: parsers announce file types they wish to use as default for each mime type. 2020-11-30 00:40:04 +01:00
jonaswinkler
cb959e296a more tests! 2020-11-29 19:22:49 +01:00
Jonas Winkler
bd0db57604 more test 2020-11-25 21:38:19 +01:00
Jonas Winkler
779157c4d5 code cleanup 2020-11-21 12:12:19 +01:00
Jonas Winkler
f976a0b4ba mime type handling 2020-11-20 13:31:03 +01:00
Jonas Winkler
9a48d6c577 Changed the way parsers are discovered. This also prepares for upcoming changes regarding content types and file types: parsers should declare what they support, and actual file extensions should not be hardcoded everywhere. 2020-11-16 23:53:12 +01:00
Jonas Winkler
30f837d49f fixed most of the test cases 2020-11-08 13:49:15 +01:00