mirror of https://github.com/paperless-ngx/paperless-ngx.git, synced 2025-10-30 03:56:23 -05:00
	better language code help
@@ -383,21 +383,20 @@ needs.
 : Customize the language that paperless will attempt to use when
 parsing documents.
 
-    It should be a 3-letter language code consistent with ISO 639:
-    https://www.loc.gov/standards/iso639-2/php/code_list.php
+    It should be a 3-letter code, see the list of [languages Tesseract supports](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html).
 
     Set this to the language most of your documents are written in.
 
     This can be a combination of multiple languages such as `deu+eng`,
-    in which case tesseract will use whatever language matches best.
-    Keep in mind that tesseract uses much more cpu time with multiple
+    in which case Tesseract will use whatever language matches best.
+    Keep in mind that Tesseract uses much more CPU time with multiple
     languages enabled.
 
     Defaults to "eng".
 
     !!! note
 
-        If your language contains a '-' such as chi-sim, you must use chi_sim
+        If your language contains a '-' such as chi-sim, you must use `chi_sim`.
 
 `PAPERLESS_OCR_MODE=<mode>`
 
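The note about `chi-sim` versus `chi_sim` and the `deu+eng` combination syntax can be illustrated with a small sketch. The helper below is hypothetical and not part of paperless-ngx; the name `normalize_ocr_language` is an assumption made for illustration. It only rewrites the separator and rejects empty entries. It does not check whether Tesseract actually has the language data installed.

```python
def normalize_ocr_language(spec: str) -> str:
    """Return a language spec in the form the docs describe.

    Hypothetical helper, not part of paperless-ngx:
    - codes are joined with '+', e.g. 'deu+eng'
    - a '-' inside a code becomes '_', e.g. 'chi-sim' -> 'chi_sim'
    """
    # Split the combined spec into individual codes and tidy each one.
    parts = [part.strip().replace("-", "_") for part in spec.split("+")]
    # An empty entry (e.g. a trailing '+') is almost certainly a typo.
    if not all(parts):
        raise ValueError(f"empty language code in {spec!r}")
    return "+".join(parts)


if __name__ == "__main__":
    print(normalize_ocr_language("deu+eng"))
    print(normalize_ocr_language("chi-sim"))
```

The result could then be used as the value of `PAPERLESS_OCR_LANGUAGE`. Keeping the list short matters, since the documentation warns that Tesseract uses much more CPU time with multiple languages enabled.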
	 tooomm