Jens van Almsick 
							
						 
					 
					
						
						
							
						
						ad6ef7314b 
					 
					
						
						
							
							fix: csv recognition by consumer  
						
						
						
						
paperless-ngx detects the file format from the MIME type reported by python-magic, which in turn relies on the response of the file command.
Version 5.39 of the file command (shipped with Debian bullseye and, I think, many other non-rolling distributions) detects a *.csv file as "application/csv" instead of "text/csv", which newer versions report.
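A minimal sketch of the idea behind this fix, assuming a parser declares its supported MIME types as a plain mapping to default file extensions. The names `SUPPORTED_MIME_TYPES` and `default_extension` are hypothetical and are not taken from the paperless-ngx code; the point is only that both MIME type spellings resolve to the same extension.

```python
# Hypothetical mapping for illustration; not the actual paperless-ngx declaration.
SUPPORTED_MIME_TYPES = {
    "text/plain": ".txt",
    "text/csv": ".csv",          # reported by newer versions of the file command
    "application/csv": ".csv",   # reported by file 5.39 (e.g. Debian bullseye)
}


def default_extension(mime_type):
    """Return the default extension for a detected MIME type, or None if unsupported."""
    return SUPPORTED_MIME_TYPES.get(mime_type)


if __name__ == "__main__":
    for detected in ("text/csv", "application/csv", "application/pdf"):
        print(detected, "->", default_extension(detected))
```

With a mapping like this, a CSV file would be accepted by the consumer regardless of which of the two MIME types the installed file version reports.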
						
						
					 
					
						2022-10-02 16:09:07 -07:00 
						 
				 
			
				
					
						
							
							
								Trenton Holmes 
							
						 
					 
					
						
						
							
						
						6844f8f2bf 
					 
					
						
						
							
							Minor tweaks to getting the document thumbnail path.  Adds text thumbnail as webp  
						
						
						
						
					 
					
						2022-06-10 06:56:28 -07:00 
						 
				 
			
				
					
						
							
							
								Trenton Holmes 
							
						 
					 
					
						
						
							
						
						fc26fe0ac0 
					 
					
						
						
							
							Updates to provide the user provided max pixel size to ocrmypdf  
						
						
						
						
					 
					
						2022-05-22 16:56:08 -07:00 
						 
				 
			
				
					
						
							
							
								Trenton Holmes 
							
						 
					 
					
						
						
							
						
						3003bdd507 
					 
					
						
						
							
							Runs pyupgrade to Python 3.8+ and adds a hook for it  
						
						
						
						
					 
					
						2022-05-06 09:04:08 -07:00 
						 
				 
			
				
					
						
							
							
								Henning Häcker 
							
						 
					 
					
						
						
							
						
						3b4da70c85 
					 
					
						
						
							
							extract OCR_MAX_IMAGE_PIXELS into settings.py  
						
						
						
						
					 
					
						2022-03-30 09:23:45 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Häcker 
							
						 
					 
					
						
						
							
						
						95199bd325 
					 
					
						
						
							
							formatting according to black  
						
						
						
						
					 
					
						2022-03-30 09:23:45 +02:00 
						 
				 
			
				
					
						
							
							
								Henning Häcker 
							
						 
					 
					
						
						
							
						
						a8887b211e 
					 
					
						
						
							
							implement PAPERLESS_OCR_MAX_IMAGE_PIXELS  
						
						
						
						
					 
					
						2022-03-30 09:23:45 +02:00 
						 
				 
			
				
					
						
							
							
								Trenton Holmes 
							
						 
					 
					
						
						
							
						
						1771d18a21 
					 
					
						
						
							
							Runs the pre-commit hooks over all the Python files  
						
						
						
						
					 
					
						2022-03-11 11:34:28 -08:00 
						 
				 
			
				
					
						
							
							
								kpj 
							
						 
					 
					
						
						
							
						
						fc695896dd 
					 
					
						
						
							
							Format Python code with black  
						
						
						
						
					 
					
						2022-02-27 15:26:41 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						8d6071e977 
					 
					
						
						
							
							fix a bug with thumbnail generation when TIKA was enabled  
						
						
						
						
					 
					
						2021-02-09 22:12:43 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						431d4fd8e4 
					 
					
						
						
							
							rework most of the logging  
						
						
						
						
					 
					
						2021-02-05 01:10:29 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						44ec3a3d9c 
					 
					
						
						
							
							lazy loading for parsers  
						
						
						
						
					 
					
						2021-02-04 13:17:24 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						40ef375c15 
					 
					
						
						
							
							supply file_name for tika parser  
						
						
						
						
					 
					
						2021-01-01 22:19:43 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						f964dd5935 
					 
					
						
						
							
							added configuration option for the font #197 #207
						
						
						
						
					 
					
						2020-12-29 12:26:41 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						ee31fdc650 
					 
					
						
						
							
							removed unused code  
						
						
						
						
					 
					
						2020-12-20 14:00:24 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						b2e0a8c884 
					 
					
						
						
							
							thumbnail generation  
						
						
						
						
					 
					
						2020-12-16 14:19:11 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						e47b105185 
					 
					
						
						
							
							fixes #7 and some test cases.
						
						
						
						
					 
					
						2020-12-16 14:17:05 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						7e0aa7136a 
					 
					
						
						
							
							more tests  
						
						
						
						
					 
					
						2020-12-15 13:26:01 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						aaa6599283 
					 
					
						
						
							
							Merge branch 'dev' into feature-ocrmypdf  
						
						
						
						
					 
					
						2020-11-30 16:48:09 +01:00 
						 
				 
			
				
					
						
							
							
								jonaswinkler 
							
						 
					 
					
						
						
							
						
						f51207fc32 
					 
					
						
						
							
							added file type checks to the parsers to prevent temporary files from being consumed. Also: parsers announce file types they wish to use as default for each mime type.  
						
						
						
						
					 
					
						2020-11-30 00:40:04 +01:00 
						 
				 
			
				
					
						
							
							
								Jonas Winkler 
							
						 
					 
					
						
						
							
						
						df801d17e1 
					 
					
						
						
							
							reworked the interface of the parsers.  
						
						
						
						
					 
					
						2020-11-25 19:36:39 +01:00 
						 
				 
			
				
					
						
							
							
								Jonas Winkler 
							
						 
					 
					
						
						
							
						
						41650f20f4 
					 
					
						
						
							
							mime type handling  
						
						
						
						
					 
					
						2020-11-20 13:31:03 +01:00 
						 
				 
			
				
					
						
							
							
								Jonas Winkler 
							
						 
					 
					
						
						
							
						
						d2e22e3f27 
					 
					
						
						
							
							Changed the way parsers are discovered. This also prepares for upcoming changes regarding content types and file types: parsers should declare what they support, and actual file extensions should not be hardcoded everywhere.  
						
						
						
						
					 
					
						2020-11-16 23:53:12 +01:00 
						 
				 
			
				
					
						
							
							
								Jonas Winkler 
							
						 
					 
					
						
						
							
						
						2e04ba1c04 
					 
					
						
						
							
							code style fixes  
						
						
						
						
					 
					
						2020-11-12 21:09:45 +01:00 
						 
				 
			
				
					
						
							
							
								Jonas Winkler 
							
						 
					 
					
						
						
							
						
						d15405ef56 
					 
					
						
						
							
							reworked most of the tesseract parser, better logging  
						
						
						
						
					 
					
						2020-11-02 15:40:44 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
							
						
						750ab5bf85 
					 
					
						
						
							
							Use optipng to optimise document thumbnails  
						
						
						
						
					 
					
						2018-10-07 14:56:38 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
							
						
						2a3f766b93 
					 
					
						
						
							
							Consolidate get_date onto the DocumentParser parent class  
						
						
						
						
					 
					
						2018-10-07 14:56:02 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
							
						
						c99f5923d5 
					 
					
						
						
							
							Rename parsers to DATE_REGEX  
						
						
						
						
						When the `parsers` variable was moved to the package level, it lost its
context, so a more descriptive name was needed.
						
						
					 
					
						2018-09-09 21:02:30 +01:00 
						 
				 
			
				
					
						
							
							
								Daniel Quinn 
							
						 
					 
					
						
						
							
						
						ef302abed7 
					 
					
						
						
							
							Fix pycodestyle complaints  
						
						
						
						
					 
					
						2018-09-09 20:55:37 +01:00 
						 
				 
			
				
					
						
							
							
								Joshua Taillon 
							
						 
					 
					
						
						
							
						
						72c828170e 
					 
					
						
						
							
							move date-matching regex pattern to base parser module for use by all subclasses  
						
						
						
						
					 
					
						2018-09-05 21:13:36 -04:00 
						 
				 
			
				
					
						
							
							
								Joshua Taillon 
							
						 
					 
					
						
						
							
						
						4849249d86 
					 
					
						
						
							
							explicitly add txt, md, and csv types for consumer and viewer; fix thumbnail generation  
						
						
						
						
					 
					
						2018-09-03 23:46:13 -04:00 
						 
				 
			
				
					
						
							
							
								Joshua Taillon 
							
						 
					 
					
						
						
							
						
						d6fedbec52 
					 
					
						
						
							
							first stab at text consumer  
						
						
						
						
					 
					
						2018-08-30 23:32:41 -04:00