mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
Merge branch 'master' of github.com:danielquinn/paperless
commit
a1a8eb00de
@@ -59,7 +59,7 @@ powerful tools.
 
 * `ImageMagick`_ converts the images between colour and greyscale.
 * `Tesseract`_ does the character recognition.
-* `Unpaper`_ despeckles and and deskews the scanned image.
+* `Unpaper`_ despeckles and deskews the scanned image.
 * `GNU Privacy Guard`_ is used as the encryption backend.
 * `Python 3`_ is the language of the project.
 
@@ -128,7 +128,7 @@ following name/value pairs:
 don't start uploading stuff to your server. The means of generating this
 signature is defined below.
 
-Specify ``enctype="multipart/form-data"``, and then POST your file with:::
+Specify ``enctype="multipart/form-data"``, and then POST your file with::
 
     Content-Disposition: form-data; name="document"; filename="whatever.pdf"
 
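Reviewer note on the hunk above: the documented upload is a plain ``multipart/form-data`` POST. The following is a minimal sketch, using only the Python standard library, of what a body carrying that ``Content-Disposition`` header could look like. The helper name ``build_multipart``, the fixed boundary string, and the sample payload are assumptions made for illustration; the other name/value pairs and the signature mentioned in the docs are deliberately left out because this hunk does not define them.

```python
import uuid


def build_multipart(field_name, filename, payload, boundary=None):
    """Return (content_type, body) for a single-file multipart/form-data POST.

    Hypothetical helper for illustration; not part of Paperless.
    """
    boundary = boundary or uuid.uuid4().hex
    # Each part starts with the boundary, then its headers, then a blank line.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    # The closing boundary carries two trailing dashes.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + payload + tail


# Sample payload and boundary are placeholders, not real document data.
content_type, body = build_multipart(
    "document", "whatever.pdf", b"%PDF-1.4 sample", boundary="XBOUNDARYX"
)
```

The returned ``content_type`` would go in the request's ``Content-Type`` header and ``body`` would be sent as the request body; the endpoint URL is intentionally not shown here.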
@@ -33,4 +33,5 @@ Contents
 api
 utilities
 migrating
+troubleshooting
 changelog
@@ -8,7 +8,7 @@ should work) that has the following software installed on it:
 
 * `Python3`_ (with development libraries, pip and virtualenv)
 * `GNU Privacy Guard`_
-* `Tesseract`_
+* `Tesseract`_, plus its language files matching your document base.
 * `Imagemagick`_
 * `unpaper`_
 
19
docs/troubleshooting.rst
Normal file
@@ -0,0 +1,19 @@
+.. _troubleshooting:
+
+Troubleshooting
+===============
+
+.. _troubleshooting_ocr_language_files_missing:
+
+Consumer warns ``OCR for XX failed``
+------------------------------------
+
+If you find the OCR accuracy to be too low, and/or the document consumer warns that ``OCR for
+XX failed, but we're going to stick with what we've got since FORGIVING_OCR is enabled``, then you
+might need to install the `Tesseract language files
+<http://packages.ubuntu.com/search?keywords=tesseract-ocr>`_ matching your documents' languages.
+
+As an example, if you are running Paperless from the Vagrant setup provided (or from any Ubuntu or Debian
+box), and your documents are written in Spanish, you may need to run::
+
+    apt-get install -y tesseract-ocr-spa
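Reviewer note on the new troubleshooting page: before installing anything, it can help to check which language files Tesseract already has. The sketch below assumes the ``tesseract`` binary is on ``PATH`` and that ``tesseract --list-langs`` prints a header line followed by one language code per line; the function names ``parse_langs`` and ``installed_langs`` are hypothetical helpers for illustration, not part of Paperless.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    # Assumption: the first line is a header such as
    # 'List of available languages (2):' and is not a language code.
    return {line for line in lines if line and not line.lower().startswith("list of")}


def installed_langs():
    """Run tesseract and return the installed codes (needs tesseract on PATH)."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some versions print the list to stderr
        universal_newlines=True,
        check=True,
    )
    return parse_langs(result.stdout)


# Offline example with canned output, so no tesseract install is needed here.
sample = "List of available languages (2):\neng\nosd\n"
missing = {"spa"} - parse_langs(sample)
```

If ``missing`` is non-empty, the corresponding ``tesseract-ocr-<code>`` package (``tesseract-ocr-spa`` in the example above) is the likely candidate to install on Ubuntu or Debian.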
@@ -155,7 +155,7 @@ class Document(models.Model):
     )
     tags = models.ManyToManyField(
         Tag, related_name="documents", blank=True)
-    created = models.DateTimeField(default=timezone.now, editable=False)
+    created = models.DateTimeField(default=timezone.now)
     modified = models.DateTimeField(auto_now=True, editable=False)
 
     class Meta(object):