2310 Commits

Author SHA1 Message Date
phail
90cb0836bb Downgrade pdf validation to text only 2022-10-27 23:11:41 +02:00
Paperless-ngx Translation Bot [bot]
c6a484439d New translations django.po (Dutch)
[ci skip]
2022-10-27 02:51:17 -07:00
phail
ef1d4264b5 improve test coverage a little 2022-10-27 00:27:15 +02:00
phail
e1fa59122d Merge remote-tracking branch 'paperless/dev' into feature-consume-eml 2022-10-26 20:59:49 +02:00
phail
5bf26369e2 remove erroring parameter 2022-10-25 21:17:40 +02:00
Trenton H
d52fbbb040 More smoothly handle the case of a password protected PDF for barcodes 2022-10-24 13:16:14 -07:00
phail
36239ba09f rename help text 2022-10-24 22:15:33 +02:00
phail
318c1d2fbd Merge remote-tracking branch 'paperless/dev' into feature-consume-eml 2022-10-24 21:12:35 +02:00
Trenton H
f8ce6285df Allows using pdf2image instead of pikepdf if desired 2022-10-24 09:58:34 -07:00
Trenton H
a72cc5da83 Connects up the celery signals to support pending, started and success/failure, without relying on django-celery-results 2022-10-24 09:10:10 -07:00
phail
0da0b1c062 update variable names 2022-10-23 21:39:15 +02:00
phail
08988e11f8 Merge remote-tracking branch 'paperless/dev' into feature-consume-eml 2022-10-23 20:37:22 +02:00
phail
30372b0e85 add tests for mail_to_html and generate_pdf_from_mail 2022-10-23 17:18:10 +02:00
phail
567e89d1c7 test for broken eml, add test_generate_pdf 2022-10-22 02:25:23 +02:00
phail
f1f5227ccd add unittest for external images 2022-10-22 00:44:32 +02:00
Paperless-ngx Translation Bot [bot]
73845ef968 New translations django.po (Arabic)
[ci skip]
2022-10-20 16:30:32 -07:00
Michael Shamoon
8be6c707de Update django.po
[ci skip]
2022-10-20 15:33:16 -07:00
Michael Shamoon
60f76d3e1f rename backend Arabic translation file
[ci skip]
2022-10-20 15:31:28 -07:00
Paperless-ngx Translation Bot [bot]
70ef6412eb New translations django.po (Arabic)
[ci skip]
2022-10-20 15:28:49 -07:00
Trenton Holmes
d1aa08850d Reverts the change around skip_noarchive to align with how it is documented to work 2022-10-20 13:34:41 -07:00
phail
09b5bd17f2 add unittest for generate_pdf_from_html 2022-10-19 23:19:33 +02:00
phail
e384bd78c5 add unittest for transform_inline_html 2022-10-18 23:48:07 +02:00
Paperless-ngx Translation Bot [bot]
0050a20710 New translations django.po (Czech)
[ci skip]
2022-10-18 13:06:11 -07:00
Paperless-ngx Translation Bot [bot]
097ab55f7a New translations django.po (Belarusian)
[ci skip]
2022-10-16 06:46:10 -07:00
Paperless-ngx Translation Bot [bot]
377c37dfab New translations django.po (Belarusian)
[ci skip]
2022-10-16 05:44:52 -07:00
phail
fda844f64c add unittest for parse 2022-10-15 15:41:43 +02:00
phail
daf90399bd Add unittest for tika_parse() 2022-10-15 13:13:29 +02:00
phail
3d37e49c1a add 2 more tests 2022-10-14 15:43:43 +02:00
phail
261c6fb990 add unittest for get_thumbnail 2022-10-13 01:03:09 +02:00
Trenton Holmes
4cc2976614 Adds specific handling for CCITT Group 4, which pikepdf decodes, but not correctly 2022-10-11 13:51:14 -07:00
Trenton H
caf4b54bc7 In case pikepdf fails to convert an image to a PIL image, fall back to converting pages to PIL images 2022-10-11 13:51:13 -07:00
Trenton H
8025df5fe3 Catch the new error raised by redis when it can't find the broker and stub out the call for testing 2022-10-10 14:21:42 -07:00
Trenton H
5aeb656a48 Fixes usage of a deprecated logger method 2022-10-10 14:20:19 -07:00
Trenton H
d1a17480ea Account for plusses in the OCR language setting 2022-10-10 08:58:23 -07:00
Trenton H
1e891414a3 Allows disabling NLTK, adds it as a consideration for low power devices 2022-10-10 08:58:23 -07:00
Trenton Holmes
c44c914d3d Changes the NLTK language to be based on the Tesseract OCR language, with fallback to the default processing 2022-10-10 08:58:23 -07:00
Trenton H
d10d2f5a54 Allows configuration of the NLTK processing language 2022-10-10 08:58:23 -07:00
Trenton Holmes
6523cf0c4b Fixes the download and usage of the downloaded data 2022-10-10 08:58:23 -07:00
Trenton Holmes
1262c121f0 Missed one mock 2022-10-10 08:58:23 -07:00
Trenton Holmes
f7cd6974c5 Mock out the nltk portions so the data doesn't need to be downloaded 2022-10-10 08:58:23 -07:00
Trenton Holmes
d856e48045 Updates the pre-processing of document content to be much more robust, with tokenization, stemming and stop word removal 2022-10-10 08:58:23 -07:00
shamoon
6f50285f47
Merge pull request #1648 from paperless-ngx/feature-use-celery
Feature: Transition to celery for background tasks
2022-10-10 00:07:55 -07:00
Trenton Holmes
77b3aa5011 Fixes is_relative_to not being available for 3.8 2022-10-09 17:43:58 -07:00
Trenton Holmes
9aefff38e7 If the original file containing a barcode was in the temporary scratch dir, move the split files to consume dir 2022-10-09 17:43:58 -07:00
Trenton H
97ceb1a8a6 Enable some testing against a real email server to hopefully catch things earlier 2022-10-07 18:28:11 -07:00
Trenton H
55089aab32 Fixes handling of gmail label extension to IMAP 2022-10-07 18:28:11 -07:00
Trenton Holmes
9c0c734b34 Enables some basic live testing against a tika server with actual sample documents to catch some more errors mocking won't catch 2022-10-07 18:06:06 -07:00
shamoon
5357775d42
Merge pull request #1692 from paperless-ngx/feature-frontend-update-checking
Feature: frontend update checking settings
2022-10-05 13:46:32 -07:00
Michael Shamoon
c42388f7e2 Use text mime type for csv files for browser preview
Co-Authored-By: Trenton H <797416+stumpylog@users.noreply.github.com>
Co-Authored-By: bin101 <12427722+bin101@users.noreply.github.com>
2022-10-04 13:01:06 -07:00
Trenton H
ff7d4d15cd Fixes migration error if some tasks are defined already 2022-10-04 07:56:40 -07:00