mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-10-30 03:56:23 -05:00)
	Merge branch 'dev'
		| @@ -15,7 +15,7 @@ services: | |||||||
|       POSTGRES_PASSWORD: paperless |       POSTGRES_PASSWORD: paperless | ||||||
| 
 | 
 | ||||||
|   webserver: |   webserver: | ||||||
|     image: jonaswinkler/paperless-ng:0.9 |     image: jonaswinkler/paperless-ng:0.9.1 | ||||||
|     restart: always |     restart: always | ||||||
|     depends_on: |     depends_on: | ||||||
|       - db |       - db | ||||||
| @@ -5,7 +5,7 @@ services: | |||||||
|     restart: always |     restart: always | ||||||
|  |  | ||||||
|   webserver: |   webserver: | ||||||
|     image: jonaswinkler/paperless-ng:0.9 |     image: jonaswinkler/paperless-ng:0.9.1 | ||||||
|     restart: always |     restart: always | ||||||
|     depends_on: |     depends_on: | ||||||
|       - broker |       - broker | ||||||
|   | |||||||
| @@ -1,7 +1,3 @@ | |||||||
| ############################################################################### |  | ||||||
| ### Back end                                                                ### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| FROM python:3.7-slim | FROM python:3.7-slim | ||||||
|  |  | ||||||
| WORKDIR /usr/src/paperless/ | WORKDIR /usr/src/paperless/ | ||||||
|   | |||||||
| @@ -82,6 +82,13 @@ A.  If you used the docker-compose file, simply download the files of the new re | |||||||
|     If you see everything working, you can start paperless-ng with "-d" to have it |     If you see everything working, you can start paperless-ng with "-d" to have it | ||||||
|     run in the background. |     run in the background. | ||||||
|  |  | ||||||
|  |     .. hint:: | ||||||
|  |  | ||||||
|  |         The released docker-compose files specify exact versions to be pulled from the hub. | ||||||
|  |         This is to ensure that if the docker-compose files should change at some point | ||||||
|  |         (i.e., services updated/configured differently), you won't run into trouble due to | ||||||
|  |         docker pulling the ``latest`` image and running it in an older environment. | ||||||
|  |          | ||||||
| B.  If you built the image yourself, grab the new archive and replace your current | B.  If you built the image yourself, grab the new archive and replace your current | ||||||
|     paperless folder with the new contents. |     paperless folder with the new contents. | ||||||
|  |  | ||||||
| @@ -120,6 +127,7 @@ After grabbing the new release and unpacking the contents, do the following: | |||||||
|         $ pip install --upgrade pipenv |         $ pip install --upgrade pipenv | ||||||
|         $ cd /path/to/paperless |         $ cd /path/to/paperless | ||||||
|         $ pipenv install |         $ pipenv install | ||||||
|  |         $ pipenv clean | ||||||
|  |  | ||||||
|     This creates a new virtual environment (or uses your existing environment) |     This creates a new virtual environment (or uses your existing environment) | ||||||
|     and installs all dependencies into it. |     and installs all dependencies into it. | ||||||
| @@ -143,7 +151,7 @@ Management utilities | |||||||
| #################### | #################### | ||||||
|  |  | ||||||
| Paperless comes with some management commands that perform various maintenance | Paperless comes with some management commands that perform various maintenance | ||||||
| tasks on your paperless instance. You can invoce these commands either by | tasks on your paperless instance. You can invoke these commands either by | ||||||
|  |  | ||||||
| .. code:: bash | .. code:: bash | ||||||
|  |  | ||||||
| @@ -311,6 +319,19 @@ the naming scheme. | |||||||
| The command takes no arguments and processes all your documents at once. | The command takes no arguments and processes all your documents at once. | ||||||
|  |  | ||||||
|  |  | ||||||
|  | Fetching e-mail | ||||||
|  | =============== | ||||||
|  |  | ||||||
|  | Paperless automatically fetches your e-mail every 10 minutes by default. If | ||||||
|  | you want to invoke the email consumer manually, call the following management | ||||||
|  | command: | ||||||
|  |  | ||||||
|  | .. code:: | ||||||
|  |  | ||||||
|  |     mail_fetcher | ||||||
|  |  | ||||||
|  | The command takes no arguments and processes all your mail accounts and rules. | ||||||
|  |  | ||||||
| .. _utilities-encyption: | .. _utilities-encyption: | ||||||
|  |  | ||||||
| Managing encryption | Managing encryption | ||||||
| @@ -320,7 +341,7 @@ Documents can be stored in Paperless using GnuPG encryption. | |||||||
|  |  | ||||||
| .. danger:: | .. danger:: | ||||||
|  |  | ||||||
|     Decryption is deprecated since paperless-ng 0.9 and doesn't really provide any |     Encryption is deprecated since paperless-ng 0.9 and doesn't really provide any | ||||||
|     additional security, since you have to store the passphrase in a configuration |     additional security, since you have to store the passphrase in a configuration | ||||||
|     file on the same system as the encrypted documents for paperless to work. |     file on the same system as the encrypted documents for paperless to work. | ||||||
|     Furthermore, the entire text content of the documents is stored plain in the |     Furthermore, the entire text content of the documents is stored plain in the | ||||||
|   | |||||||
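The "Fetching e-mail" section added above names the ``mail_fetcher`` management command but does not show a full invocation. A minimal sketch, assuming the bare-metal ``pipenv run python3 manage.py <command>`` pattern that the setup docs in this same commit use for ``document_consumer`` and ``qcluster``. The helper name ``management_argv`` is hypothetical, and ``/path/to/paperless/src/`` is the placeholder path from the docs, not a real location:

```python
import shlex

# Placeholder path copied from the docs; adjust it to your installation.
PAPERLESS_SRC = "/path/to/paperless/src/"


def management_argv(command):
    """Build the argv list for a paperless management command.

    Hypothetical helper: it only assembles the command line and does not
    run anything.
    """
    if not command or any(ch.isspace() for ch in command):
        raise ValueError("command must be a single word, e.g. 'mail_fetcher'")
    return ["pipenv", "run", "python3", "manage.py", command]


argv = management_argv("mail_fetcher")
# Print the equivalent shell line instead of executing it.
print(f"cd {shlex.quote(PAPERLESS_SRC)} && {shlex.join(argv)}")
```

To actually run the command, the list could be passed to ``subprocess.run(argv, cwd=PAPERLESS_SRC, check=True)``. Docker-based installations invoke management commands differently, so this sketch does not apply to them as written.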
| @@ -52,6 +52,8 @@ filename as described above. | |||||||
|  |  | ||||||
| .. _dateparser: https://github.com/scrapinghub/dateparser/blob/v0.7.0/docs/usage.rst#settings | .. _dateparser: https://github.com/scrapinghub/dateparser/blob/v0.7.0/docs/usage.rst#settings | ||||||
|  |  | ||||||
|  | .. _advanced-transforming_filenames: | ||||||
|  |  | ||||||
| Transforming filenames for parsing | Transforming filenames for parsing | ||||||
| ================================== | ================================== | ||||||
|  |  | ||||||
| @@ -219,6 +221,7 @@ the consumption process will begin with the newly modified file. | |||||||
|  |  | ||||||
| .. _pdf2pdfocr.py: https://github.com/LeoFCardoso/pdf2pdfocr | .. _pdf2pdfocr.py: https://github.com/LeoFCardoso/pdf2pdfocr | ||||||
|  |  | ||||||
|  | .. _advanced-post_consume_script: | ||||||
|  |  | ||||||
| Post-consumption script | Post-consumption script | ||||||
| ======================= | ======================= | ||||||
|   | |||||||
| @@ -91,6 +91,7 @@ Result object: | |||||||
|         "document": { |         "document": { | ||||||
|              |              | ||||||
|         } |         } | ||||||
|  |     } | ||||||
|  |  | ||||||
| *   ``id``: the primary key of the found document | *   ``id``: the primary key of the found document | ||||||
| *   ``highlights``: an object containing parseable highlights for the result. | *   ``highlights``: an object containing parseable highlights for the result. | ||||||
| @@ -109,7 +110,7 @@ Each fragment contains a list of strings, and some of them are marked as a highl | |||||||
|  |  | ||||||
| .. code:: json | .. code:: json | ||||||
|  |  | ||||||
|     "highlights": [ |     [ | ||||||
|         [ |         [ | ||||||
|             {"text": "This is a sample text with a "}, |             {"text": "This is a sample text with a "}, | ||||||
|             {"text": "highlighted", "term": 0}, |             {"text": "highlighted", "term": 0}, | ||||||
| @@ -120,6 +121,8 @@ Each fragment contains a list of strings, and some of them are marked as a highl | |||||||
|             {"text": " fragment with a highlight."} |             {"text": " fragment with a highlight."} | ||||||
|         ] |         ] | ||||||
|     ] |     ] | ||||||
|  |      | ||||||
|  |  | ||||||
|  |  | ||||||
| When ``term`` is present within a string, the word within ``text`` should be highlighted. | When ``term`` is present within a string, the word within ``text`` should be highlighted. | ||||||
| The term index groups multiple matches together and words with the same index | The term index groups multiple matches together and words with the same index | ||||||
|   | |||||||
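The API docs changed above describe ``highlights`` as a list of fragments, where each fragment is a list of ``{"text": ...}`` objects and a ``term`` key marks a piece that should be highlighted. A minimal sketch of how a client might render one fragment as plain text. The function name ``render_fragment``, the ``**`` marker and the sample data are assumptions for illustration, not part of the paperless API:

```python
def render_fragment(fragment, mark="**"):
    """Join the pieces of one fragment, wrapping highlighted pieces in `mark`."""
    parts = []
    for piece in fragment:
        text = piece["text"]
        # A piece is highlighted when it carries a "term" index (0 counts too).
        if "term" in piece:
            text = f"{mark}{text}{mark}"
        parts.append(text)
    return "".join(parts)


# Illustrative data shaped like the documented structure (not real API output).
highlights = [
    [
        {"text": "This is a sample text with a "},
        {"text": "highlighted", "term": 0},
        {"text": " word."},
    ],
]

for fragment in highlights:
    print(render_fragment(fragment))
```

The check uses ``"term" in piece`` rather than truthiness because the term index can be ``0``. A richer client could use the index itself to give every match of the same term the same color.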
| @@ -66,7 +66,6 @@ paperless-ng 0.9.0 | |||||||
|  |  | ||||||
|   * If ``PAPERLESS_DBHOST`` is specified in the settings, paperless uses postgresql instead of sqlite. |   * If ``PAPERLESS_DBHOST`` is specified in the settings, paperless uses postgresql instead of sqlite. | ||||||
|     Username, database and password all default to ``paperless`` if not specified. |     Username, database and password all default to ``paperless`` if not specified. | ||||||
|   * **docker-compose.yml uses PostgreSQL by default.** |  | ||||||
|  |  | ||||||
| * **Modified [breaking]:** document_retagger management command rework. See | * **Modified [breaking]:** document_retagger management command rework. See | ||||||
|   :ref:`utilities-retagger` for details. Replaces ``document_correspondents`` |   :ref:`utilities-retagger` for details. Replaces ``document_correspondents`` | ||||||
|   | |||||||
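The changelog entry above, together with the configuration reference in this commit, describes the database selection rule: sqlite unless ``PAPERLESS_DBHOST`` is set, in which case PostgreSQL is used with port 5432 and ``paperless`` as the default name, user and password. A minimal sketch of that rule. The function ``database_settings`` and the returned dictionary layout are hypothetical and are not the project's actual ``settings.py``:

```python
def database_settings(env):
    """Pick the database backend from a mapping of environment variables."""
    host = env.get("PAPERLESS_DBHOST")
    if not host:
        # No host configured: fall back to the sqlite default.
        return {"engine": "sqlite"}
    return {
        "engine": "postgresql",
        "host": host,
        "port": int(env.get("PAPERLESS_DBPORT", 5432)),
        "name": env.get("PAPERLESS_DBNAME", "paperless"),
        "user": env.get("PAPERLESS_DBUSER", "paperless"),
        "password": env.get("PAPERLESS_DBPASS", "paperless"),
    }


print(database_settings({}))
print(database_settings({"PAPERLESS_DBHOST": "db"}))
```

Passing ``os.environ`` instead of a literal dictionary would apply the same rule to the real process environment.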
| @@ -1,9 +1,10 @@ | |||||||
|  | .. _configuration: | ||||||
|  |  | ||||||
| ************* | ************* | ||||||
| Configuration | Configuration | ||||||
| ************* | ************* | ||||||
|  |  | ||||||
| Paperless provides a wide range of customizations. | Paperless provides a wide range of customizations. | ||||||
| Have a look at ``paperless.conf.example`` for available configuration options. |  | ||||||
| Depending on how you run paperless, these settings have to be defined in different | Depending on how you run paperless, these settings have to be defined in different | ||||||
| places. | places. | ||||||
|  |  | ||||||
| @@ -18,5 +19,288 @@ places. | |||||||
|         /etc/paperless.conf |         /etc/paperless.conf | ||||||
|         /usr/local/etc/paperless.conf |         /usr/local/etc/paperless.conf | ||||||
|  |  | ||||||
|     Copy ``paperless.conf.example`` to any of these locations and adjust it to your |  | ||||||
|     needs. | Required services | ||||||
|  | ################# | ||||||
|  |  | ||||||
|  | PAPERLESS_REDIS=<url> | ||||||
|  |     This is required for processing scheduled tasks such as email fetching, index | ||||||
|  |     optimization and for training the automatic document matcher. | ||||||
|  |  | ||||||
|  |     Defaults to redis://localhost:6379. | ||||||
|  |  | ||||||
|  | PAPERLESS_DBHOST=<hostname> | ||||||
|  |     By default, sqlite is used as the database backend. This can be changed here. | ||||||
|  |     Set PAPERLESS_DBHOST and PostgreSQL will be used instead of sqlite. | ||||||
|  |  | ||||||
|  | PAPERLESS_DBPORT=<port> | ||||||
|  |     Adjust port if necessary. | ||||||
|  |      | ||||||
|  |     Default is 5432. | ||||||
|  |  | ||||||
|  | PAPERLESS_DBNAME=<name> | ||||||
|  |     Database name in PostgreSQL. | ||||||
|  |      | ||||||
|  |     Defaults to "paperless". | ||||||
|  |  | ||||||
|  | PAPERLESS_DBUSER=<name> | ||||||
|  |     Database user in PostgreSQL. | ||||||
|  |      | ||||||
|  |     Defaults to "paperless". | ||||||
|  |  | ||||||
|  | PAPERLESS_DBPASS=<password> | ||||||
|  |     Database password for PostgreSQL. | ||||||
|  |      | ||||||
|  |     Defaults to "paperless". | ||||||
|  |  | ||||||
|  |  | ||||||
|  | Paths and folders | ||||||
|  | ################# | ||||||
|  |  | ||||||
|  | PAPERLESS_CONSUMPTION_DIR=<path> | ||||||
|  |     This is where your documents should go to be consumed.  Make sure that it exists | ||||||
|  |     and that the user running the paperless service can read/write its contents | ||||||
|  |     before you start Paperless. | ||||||
|  |  | ||||||
|  |     Don't change this when using docker, as it only changes the path within the | ||||||
|  |     container. Change the local consumption directory in the docker-compose.yml | ||||||
|  |     file instead. | ||||||
|  |  | ||||||
|  |     Defaults to "../consume", relative to the "src" directory. | ||||||
|  |  | ||||||
|  | PAPERLESS_DATA_DIR=<path> | ||||||
|  |     This is where paperless stores all its data (search index, sqlite database, | ||||||
|  |     classification model, etc). | ||||||
|  |  | ||||||
|  |     Defaults to "../data", relative to the "src" directory. | ||||||
|  |  | ||||||
|  | PAPERLESS_MEDIA_ROOT=<path> | ||||||
|  |     This is where your documents and thumbnails are stored. | ||||||
|  |  | ||||||
|  |     You can set this and PAPERLESS_DATA_DIR to the same folder to have paperless | ||||||
|  |     store all its data within the same volume. | ||||||
|  |  | ||||||
|  |     Defaults to "../media", relative to the "src" directory. | ||||||
|  |  | ||||||
|  | PAPERLESS_STATICDIR=<path> | ||||||
|  |     Override the default STATIC_ROOT here.  This is where all static files | ||||||
|  |     created using "collectstatic" manager command are stored. | ||||||
|  |  | ||||||
|  |     Unless you're doing something fancy, there is no need to override this. | ||||||
|  |  | ||||||
|  |     Defaults to "../static", relative to the "src" directory. | ||||||
|  |  | ||||||
|  | PAPERLESS_FILENAME_FORMAT=<format> | ||||||
|  |     Changes the filenames paperless uses to store documents in the media directory. | ||||||
|  |     See :ref:`advanced-file_name_handling` for details. | ||||||
|  |  | ||||||
|  |     Default is none, which disables this feature. | ||||||
|  |  | ||||||
|  | Hosting & Security | ||||||
|  | ################## | ||||||
|  |  | ||||||
|  | PAPERLESS_SECRET_KEY=<key> | ||||||
|  |     Paperless uses this to make session tokens. If you expose paperless on the | ||||||
|  |     internet, you need to change this, since the default secret is well known. | ||||||
|  |  | ||||||
|  |     Use any sequence of characters. The more, the better. You don't need to | ||||||
|  |     remember this. Just face-roll your keyboard. | ||||||
|  |  | ||||||
|  |     Default is listed in the file ``src/paperless/settings.py``. | ||||||
|  |  | ||||||
|  | PAPERLESS_ALLOWED_HOSTS=<comma-separated-list> | ||||||
|  |     If you're planning on putting Paperless on the open internet, then you | ||||||
|  |     really should set this value to the domain name you're using.  Failing to do | ||||||
|  |     so leaves you open to HTTP host header attacks: | ||||||
|  |     https://docs.djangoproject.com/en/3.1/topics/security/#host-header-validation | ||||||
|  |      | ||||||
|  |     Just remember that this is a comma-separated list, so "example.com" is fine, | ||||||
|  |     as is "example.com,www.example.com", but NOT " example.com" or "example.com," | ||||||
|  |  | ||||||
|  |     Defaults to "*", which is all hosts. | ||||||
|  |  | ||||||
|  | PAPERLESS_CORS_ALLOWED_HOSTS=<comma-separated-list> | ||||||
|  |     You need to add your servers to the list of allowed hosts that can do CORS | ||||||
|  |     calls. Set this to your public domain name. | ||||||
|  |  | ||||||
|  |     Defaults to "http://localhost:8000". | ||||||
|  |  | ||||||
|  | PAPERLESS_FORCE_SCRIPT_NAME=<path> | ||||||
|  |     To host paperless under a subpath url like example.com/paperless you set | ||||||
|  |     this value to /paperless. No trailing slash! | ||||||
|  |  | ||||||
|  |     .. note:: | ||||||
|  |  | ||||||
|  |         I don't know if this works in paperless-ng. Probably not. | ||||||
|  |      | ||||||
|  |     Defaults to none, which hosts paperless at "/". | ||||||
|  |  | ||||||
|  | PAPERLESS_STATIC_URL=<path> | ||||||
|  |     Override the STATIC_URL here.  Unless you're hosting Paperless off a | ||||||
|  |     subpath like /paperless/, you probably don't need to change this. | ||||||
|  |      | ||||||
|  |     Defaults to "/static/". | ||||||
|  |  | ||||||
|  |  | ||||||
|  | Software tweaks | ||||||
|  | ############### | ||||||
|  |  | ||||||
|  | PAPERLESS_TASK_WORKERS=<num> | ||||||
|  |     Paperless does multiple things in the background: Maintain the search index, | ||||||
|  |     maintain the automatic matching algorithm, check emails, consume documents, | ||||||
|  |     etc. This variable specifies how many things it will do in parallel. | ||||||
|  |  | ||||||
|  | PAPERLESS_THREADS_PER_WORKER=<num> | ||||||
|  |     Furthermore, paperless uses multiple threads when consuming documents to | ||||||
|  |     speed up OCR. This variable specifies how many pages paperless will process | ||||||
|  |     in parallel on a single document. | ||||||
|  |  | ||||||
|  |     .. caution:: | ||||||
|  |          | ||||||
|  |         Ensure that the product | ||||||
|  |          | ||||||
|  |             PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER | ||||||
|  |          | ||||||
|  |         does not exceed your CPU core count or else paperless will be extremely slow. | ||||||
|  |         If you want paperless to process many documents in parallel, choose a high | ||||||
|  |         worker count. If you want paperless to process very large documents faster, | ||||||
|  |         use a higher thread per worker count. | ||||||
|  |  | ||||||
|  |     The default is a balance between the two, according to your CPU core count, | ||||||
|  |     with a slight favor towards threads per worker, and using as many cores as | ||||||
|  |     possible. | ||||||
|  |  | ||||||
|  |     If you only specify PAPERLESS_TASK_WORKERS, paperless will adjust | ||||||
|  |     PAPERLESS_THREADS_PER_WORKER automatically. | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | PAPERLESS_TIME_ZONE=<timezone> | ||||||
|  |     Set the time zone here. | ||||||
|  |     See https://docs.djangoproject.com/en/3.1/ref/settings/#std:setting-TIME_ZONE | ||||||
|  |     for details on how to set it. | ||||||
|  |  | ||||||
|  |     Defaults to UTC. | ||||||
|  |  | ||||||
|  |  | ||||||
|  |  | ||||||
|  | PAPERLESS_OCR_LANGUAGE=<lang> | ||||||
|  |     Customize the default language that tesseract will attempt to use when | ||||||
|  |     parsing documents. The default language is used whenever | ||||||
|  |      | ||||||
|  |     * No language could be detected on a document | ||||||
|  |     * No tesseract data files are available for the detected language | ||||||
|  |      | ||||||
|  |     It should be a 3-letter language code consistent with ISO | ||||||
|  |     639: https://www.loc.gov/standards/iso639-2/php/code_list.php | ||||||
|  |  | ||||||
|  |     Set this to the language most of your documents are written in. | ||||||
|  |  | ||||||
|  |     Defaults to "eng". | ||||||
|  |  | ||||||
|  | PAPERLESS_OCR_ALWAYS=<bool> | ||||||
|  |     By default Paperless does not OCR a document if the text can be retrieved from | ||||||
|  |     the document directly. Set to true to always OCR documents. | ||||||
|  |  | ||||||
|  |     Defaults to false. | ||||||
|  |  | ||||||
|  | PAPERLESS_CONSUMER_POLLING=<num> | ||||||
|  |     If paperless won't find documents added to your consume folder, it might | ||||||
|  |     not be able to automatically detect filesystem changes. In that case, | ||||||
|  |     specify a polling interval in seconds here, which will then cause paperless | ||||||
|  |     to periodically check your consumption directory for changes. | ||||||
|  |  | ||||||
|  |     Defaults to 0, which disables polling and uses filesystem notifications. | ||||||
|  |  | ||||||
|  | PAPERLESS_CONSUMER_DELETE_DUPLICATES=<bool> | ||||||
|  |     When the consumer detects a duplicate document, it will not touch the | ||||||
|  |     original document. This default behavior can be changed here. | ||||||
|  |  | ||||||
|  |     Defaults to false. | ||||||
|  |  | ||||||
|  | PAPERLESS_CONVERT_MEMORY_LIMIT=<num> | ||||||
|  |     On smaller systems, or even in the case of Very Large Documents, the consumer | ||||||
|  |     may explode, complaining about how it's "unable to extend pixel cache".  In | ||||||
|  |     such cases, try setting this to a reasonably low value, like 32.  The | ||||||
|  |     default is to use whatever is necessary to do everything without writing to | ||||||
|  |     disk, and units are in megabytes. | ||||||
|  |      | ||||||
|  |     For more information on how to use this value, you should search | ||||||
|  |     the web for "MAGICK_MEMORY_LIMIT". | ||||||
|  |  | ||||||
|  |     Defaults to 0, which disables the limit. | ||||||
|  |  | ||||||
|  | PAPERLESS_CONVERT_TMPDIR=<path> | ||||||
|  |     Similar to the memory limit, if you've got a small system and your OS mounts | ||||||
|  |     /tmp as tmpfs, you should set this to a path that's on a physical disk, like | ||||||
|  |     /home/your_user/tmp or something.  ImageMagick will use this as scratch space | ||||||
|  |     when crunching through very large documents. | ||||||
|  |      | ||||||
|  |     For more information on how to use this value, you should search | ||||||
|  |     the web for "MAGICK_TMPDIR". | ||||||
|  |  | ||||||
|  |     Default is none, which disables the temporary directory. | ||||||
|  |  | ||||||
|  | PAPERLESS_CONVERT_DENSITY=<num> | ||||||
|  |     This setting has a high impact on the physical size of tmp page files, | ||||||
|  |     the speed of document conversion, and can affect the accuracy of OCR | ||||||
|  |     results. Individual results can vary and this setting should be tested | ||||||
|  |     thoroughly against the documents you are importing to see if it has any | ||||||
|  |     impacts either negative or positive. | ||||||
|  |     Testing on limited document sets has shown a setting of 200 can cut the | ||||||
|  |     size of tmp files by 1/3, and speed up conversion by up to 4x | ||||||
|  |     with little impact to OCR accuracy. | ||||||
|  |  | ||||||
|  |     Default is 300. | ||||||
|  |  | ||||||
|  | PAPERLESS_OPTIMIZE_THUMBNAILS=<bool> | ||||||
|  |     Use optipng to optimize thumbnails. This usually reduces the size of | ||||||
|  |     thumbnails by about 20%, but uses considerable compute time during | ||||||
|  |     consumption. | ||||||
|  |  | ||||||
|  |     Defaults to true. | ||||||
|  |  | ||||||
|  | PAPERLESS_POST_CONSUME_SCRIPT=<filename> | ||||||
|  |     After a document is consumed, Paperless can trigger an arbitrary script if | ||||||
|  |     you like.  This script will be passed a number of arguments for you to work | ||||||
|  |     with. For more information, take a look at :ref:`advanced-post_consume_script`. | ||||||
|  |  | ||||||
|  |     The default is blank, which means nothing will be executed. | ||||||
|  |  | ||||||
|  | PAPERLESS_FILENAME_DATE_ORDER=<format> | ||||||
|  |     Paperless will check the document text for document date information. | ||||||
|  |     Use this setting to enable checking the document filename for date | ||||||
|  |     information. The date order can be set to any option as specified in | ||||||
|  |     https://dateparser.readthedocs.io/en/latest/settings.html#date-order. | ||||||
|  |     The filename will be checked first, and if nothing is found, the document  | ||||||
|  |     text will be checked as normal. | ||||||
|  |  | ||||||
|  |     Defaults to none, which disables this feature. | ||||||
|  |  | ||||||
|  | PAPERLESS_FILENAME_PARSE_TRANSFORMS | ||||||
|  |     Transforms filenames before they are processed by paperless. See | ||||||
|  |     :ref:`advanced-transforming_filenames` for details. | ||||||
|  |  | ||||||
|  |     Defaults to none, which disables this feature. | ||||||
|  |  | ||||||
|  | Binaries | ||||||
|  | ######## | ||||||
|  |  | ||||||
|  | There are a few external software packages that Paperless expects to find on | ||||||
|  | your system when it starts up.  Unless you've done something creative with | ||||||
|  | their installation, you probably won't need to edit any of these.  However, | ||||||
|  | if you've installed these programs somewhere where simply typing the name of | ||||||
|  | the program doesn't automatically execute it (i.e. the program isn't in your | ||||||
|  | $PATH), then you'll need to specify the literal path for that program. | ||||||
|  |  | ||||||
|  | PAPERLESS_CONVERT_BINARY=<path> | ||||||
|  |     Defaults to "/usr/bin/convert". | ||||||
|  |  | ||||||
|  | PAPERLESS_GS_BINARY=<path> | ||||||
|  |     Defaults to "/usr/bin/gs". | ||||||
|  |  | ||||||
|  | PAPERLESS_UNPAPER_BINARY=<path> | ||||||
|  |     Defaults to "/usr/bin/unpaper". | ||||||
|  |  | ||||||
|  | PAPERLESS_OPTIPNG_BINARY=<path> | ||||||
|  |     Defaults to "/usr/bin/optipng". | ||||||
|   | |||||||
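The caution under ``PAPERLESS_THREADS_PER_WORKER`` above asks that ``PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER`` not exceed the CPU core count. A minimal sketch of that check, useful before committing values to a configuration file. The function ``fits_cpu`` is hypothetical, and this is not how paperless computes its own defaults:

```python
import os


def fits_cpu(task_workers, threads_per_worker, cpu_count=None):
    """Return True if workers * threads stays within the available cores."""
    if task_workers < 1 or threads_per_worker < 1:
        raise ValueError("both values must be at least 1")
    # os.cpu_count() may return None on some platforms; assume 1 core then.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return task_workers * threads_per_worker <= cores


# Example: on a 4-core machine, 2 workers x 2 threads fits, 4 x 2 does not.
print(fits_cpu(2, 2, cpu_count=4))
print(fits_cpu(4, 2, cpu_count=4))
```

Following the text above, a higher worker count favors many documents in parallel, while a higher thread count favors very large single documents. The check only confirms that the chosen combination stays within the core budget.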
| @@ -25,14 +25,28 @@ and then shred them.  Perhaps you might find it useful too. | |||||||
| Paperless-ng | Paperless-ng | ||||||
| ============ | ============ | ||||||
|  |  | ||||||
| I wanted to make big changes to the project that will impact the way it is used | Paperless-ng is a fork of the original paperless project. It changes many | ||||||
| by its users greatly. Among the users who currently use paperless in production | things both on the surface and under the hood. Paperless-ng was created | ||||||
| there are probably many that don't want these changes right away. I also wanted | because I feel that these changes are too big to be pushed into the main | ||||||
| to have more control over what goes into the code and what does not. Therefore, | repository right away. | ||||||
| paperless-ng was created. NG stands for both Angular (the framework used for the |  | ||||||
|  | NG stands for both Angular (the framework used for the | ||||||
| Frontend) and next-gen. Publishing this project under a different name also | Frontend) and next-gen. Publishing this project under a different name also | ||||||
| avoids confusion between paperless and paperless-ng. | avoids confusion between paperless and paperless-ng. | ||||||
|  |  | ||||||
|  | If you want to learn about what's different in paperless-ng, check out these | ||||||
|  | resources in the documentation: | ||||||
|  |  | ||||||
|  | *   :ref:`Some screenshots <screenshots>` of the new UI are available. | ||||||
|  | *   Read :ref:`this section <advanced-automatic_matching>` if you want to | ||||||
|  |     learn about how paperless automates all tagging using machine learning. | ||||||
|  | *   Paperless now comes with a :ref:`proper email consumer <usage-email>` | ||||||
|  |     that's fully tested and production ready. | ||||||
|  | *   See :ref:`this note <utilities-encyption>` about GnuPG encryption in | ||||||
|  |     paperless-ng. | ||||||
|  | *   The :ref:`changelog <paperless_changelog>` contains a detailed list of all changes | ||||||
|  |     in paperless-ng. | ||||||
|  |  | ||||||
| It would be great if this project could eventually merge back into the main | It would be great if this project could eventually merge back into the main | ||||||
| repository, but it needs a lot more work before that can happen. | repository, but it needs a lot more work before that can happen. | ||||||
|  |  | ||||||
|   | |||||||
| @@ -1,3 +1,5 @@ | |||||||
|  | .. _screenshots: | ||||||
|  |  | ||||||
| *********** | *********** | ||||||
| Screenshots | Screenshots | ||||||
| *********** | *********** | ||||||
|   | |||||||
| @@ -28,20 +28,20 @@ Overview of Paperless-ng | |||||||
|  |  | ||||||
| Compared to paperless, paperless-ng works a little differently under the hood and has | Compared to paperless, paperless-ng works a little differently under the hood and has | ||||||
| more moving parts that work together. While this increases the complexity of | more moving parts that work together. While this increases the complexity of | ||||||
| the system, it also brings many benefits.  | the system, it also brings many benefits. | ||||||
|  |  | ||||||
| Paperless consists of the following components: | Paperless consists of the following components: | ||||||
|  |  | ||||||
| *   **The webserver:** This is pretty much the same as in paperless. It serves  | *   **The webserver:** This is pretty much the same as in paperless. It serves | ||||||
|     the administration pages, the API, and the new frontend. This is the main |     the administration pages, the API, and the new frontend. This is the main | ||||||
|     tool you'll be using to interact with paperless. You may start the webserver |     tool you'll be using to interact with paperless. You may start the webserver | ||||||
|     with |     with | ||||||
|  |  | ||||||
|     .. code:: shell-session |     .. code:: shell-session | ||||||
|          |  | ||||||
|         $ cd /path/to/paperless/src/ |         $ cd /path/to/paperless/src/ | ||||||
|         $ pipenv run gunicorn -c /usr/src/paperless/gunicorn.conf.py -b 0.0.0.0:8000 paperless.wsgi |         $ pipenv run gunicorn -c /usr/src/paperless/gunicorn.conf.py -b 0.0.0.0:8000 paperless.wsgi | ||||||
|      |  | ||||||
|     or by any other means such as Apache ``mod_wsgi``. |     or by any other means such as Apache ``mod_wsgi``. | ||||||
|  |  | ||||||
| *   **The consumer:** This is what watches your consumption folder for documents. | *   **The consumer:** This is what watches your consumption folder for documents. | ||||||
| @@ -53,7 +53,7 @@ Paperless consists of the following components: | |||||||
|     Start the consumer with the management command ``document_consumer``: |     Start the consumer with the management command ``document_consumer``: | ||||||
|  |  | ||||||
|     .. code:: shell-session |     .. code:: shell-session | ||||||
|      |  | ||||||
|         $ cd /path/to/paperless/src/ |         $ cd /path/to/paperless/src/ | ||||||
|         $ pipenv run python3 manage.py document_consumer |         $ pipenv run python3 manage.py document_consumer | ||||||
|  |  | ||||||
| @@ -61,7 +61,7 @@ Paperless consists of the following components: | |||||||
|     for doing much of the heavy lifting. This is a task queue that accepts tasks from |     for doing much of the heavy lifting. This is a task queue that accepts tasks from | ||||||
|     multiple sources and processes tasks in parallel. It also comes with a scheduler that executes |     multiple sources and processes tasks in parallel. It also comes with a scheduler that executes | ||||||
|     certain commands periodically. |     certain commands periodically. | ||||||
|      |  | ||||||
|     This task processor is responsible for: |     This task processor is responsible for: | ||||||
|  |  | ||||||
|     *   Consuming documents. When the consumer finds new documents, it notifies the task processor to |     *   Consuming documents. When the consumer finds new documents, it notifies the task processor to | ||||||
| @@ -72,7 +72,7 @@ Paperless consists of the following components: | |||||||
|         the web interface. |         the web interface. | ||||||
|     *   Maintain the search index and the automatic matching algorithm. These are things that paperless |     *   Maintain the search index and the automatic matching algorithm. These are things that paperless | ||||||
|         needs to do from time to time in order to operate properly. |         needs to do from time to time in order to operate properly. | ||||||
|      |  | ||||||
|     This allows paperless to process multiple documents from your consumption folder in parallel! On |     This allows paperless to process multiple documents from your consumption folder in parallel! On | ||||||
|     a modern multicore system, consumption with full ocr is blazing fast. |     a modern multicore system, consumption with full ocr is blazing fast. | ||||||
|  |  | ||||||
| @@ -82,7 +82,7 @@ Paperless consists of the following components: | |||||||
|     You may start the task processor by executing: |     You may start the task processor by executing: | ||||||
|  |  | ||||||
|     .. code:: shell-session |     .. code:: shell-session | ||||||
|      |  | ||||||
|         $ cd /path/to/paperless/src/ |         $ cd /path/to/paperless/src/ | ||||||
|         $ pipenv run python3 manage.py qcluster |         $ pipenv run python3 manage.py qcluster | ||||||
|  |  | ||||||
| @@ -116,7 +116,7 @@ Docker Route | |||||||
|  |  | ||||||
|     .. caution:: |     .. caution:: | ||||||
|  |  | ||||||
|         If you want to use the included ``docker-compose.yml.example`` file, you |         If you want to use the included ``docker-compose.*.yml`` file, you | ||||||
|         need to have at least Docker version **17.09.0** and docker-compose |         need to have at least Docker version **17.09.0** and docker-compose | ||||||
|         version **1.17.0**. |         version **1.17.0**. | ||||||
|  |  | ||||||
| @@ -129,20 +129,28 @@ Docker Route | |||||||
|         .. _Docker installation guide: https://docs.docker.com/engine/installation/ |         .. _Docker installation guide: https://docs.docker.com/engine/installation/ | ||||||
|         .. _docker-compose installation guide: https://docs.docker.com/compose/install/ |         .. _docker-compose installation guide: https://docs.docker.com/compose/install/ | ||||||
|  |  | ||||||
|  | 2.  Copy either ``docker-compose.sqlite.yml`` or ``docker-compose.postgres.yml`` to | ||||||
|  |     ``docker-compose.yml``, depending on which database backend you want to use. | ||||||
|  |  | ||||||
|  |     .. hint:: | ||||||
|  |  | ||||||
|  |         For new installations, it is recommended to use postgresql as the database | ||||||
|  |         backend. This is due to the increased amount of concurrency in paperless-ng. | ||||||
|  |  | ||||||
| 2.  Modify ``docker-compose.yml`` to your preferences. You should change the path | 2.  Modify ``docker-compose.yml`` to your preferences. You should change the path | ||||||
|     to the consumption directory in this file. Find the line that specifies where |     to the consumption directory in this file. Find the line that specifies where | ||||||
|     to mount the consumption directory: |     to mount the consumption directory: | ||||||
|  |  | ||||||
|     .. code:: |     .. code:: | ||||||
|      |  | ||||||
|         - ./consume:/usr/src/paperless/consume |         - ./consume:/usr/src/paperless/consume | ||||||
|      |  | ||||||
|     Replace the part BEFORE the colon with a local directory of your choice: |     Replace the part BEFORE the colon with a local directory of your choice: | ||||||
|  |  | ||||||
|     .. code:: |     .. code:: | ||||||
|  |  | ||||||
|         - /home/jonaswinkler/paperless-inbox:/usr/src/paperless/consume |         - /home/jonaswinkler/paperless-inbox:/usr/src/paperless/consume | ||||||
|      |  | ||||||
|     Don't change the part after the colon or paperless wont find your documents. |     Don't change the part after the colon or paperless wont find your documents. | ||||||
|  |  | ||||||
|  |  | ||||||
| @@ -154,6 +162,11 @@ Docker Route | |||||||
|     1000 (the default for the first normal user on most systems), it will |     1000 (the default for the first normal user on most systems), it will | ||||||
|     work out of the box without any modifications. |     work out of the box without any modifications. | ||||||
|  |  | ||||||
|  |     .. note:: | ||||||
|  |  | ||||||
|  |         You can use any settings from the file ``paperless.conf`` in this file. | ||||||
|  |         Have a look at :ref:`configuration` to see whats available. | ||||||
|  |  | ||||||
| 4.  Run ``docker-compose up -d``. This will create and start the necessary | 4.  Run ``docker-compose up -d``. This will create and start the necessary | ||||||
|     containers. This will also build the image of paperless if you grabbed the |     containers. This will also build the image of paperless if you grabbed the | ||||||
|     source archive. |     source archive. | ||||||
| @@ -196,14 +209,9 @@ things have changed under the hood, so you need to adapt your setup depending on | |||||||
| how you installed paperless. The important things to keep in mind are as follows. | how you installed paperless. The important things to keep in mind are as follows. | ||||||
|  |  | ||||||
| * Read the :ref:`changelog <paperless_changelog>` and take note of breaking changes. | * Read the :ref:`changelog <paperless_changelog>` and take note of breaking changes. | ||||||
| * It is recommended to use postgresql as the database now. The docker-compose | * It is recommended to use postgresql as the database now. If you want to continue | ||||||
|   deployment will automatically create a postgresql instance and instruct |   using SQLite, which is the default of paperless, use ``docker-compose.sqlite.yml``. | ||||||
|   paperless to use it. This means that if you use the docker-compose script |   See :ref:`setup-sqlite_to_psql` for details on how to move your data from | ||||||
|   with your current paperless media and data volumes and used the default |  | ||||||
|   sqlite database, **it will not use your sqlite database and it may seem |  | ||||||
|   as if your documents are gone**. You may use the provided |  | ||||||
|   ``docker-compose.sqlite.yml`` script instead, which does not use postgresql. See |  | ||||||
|   :ref:`setup-sqlite_to_psql` for details on how to move your data from |  | ||||||
|   sqlite to postgres. |   sqlite to postgres. | ||||||
| * The task scheduler of paperless, which is used to execute periodic tasks | * The task scheduler of paperless, which is used to execute periodic tasks | ||||||
|   such as email checking and maintenance, requires a `redis`_ message broker |   such as email checking and maintenance, requires a `redis`_ message broker | ||||||
| @@ -228,26 +236,40 @@ Migration to paperless-ng is then performed in a few simple steps: | |||||||
| 3.  Download the latest release of paperless-ng. You can either go with the | 3.  Download the latest release of paperless-ng. You can either go with the | ||||||
|     docker-compose files or use the archive to build the image yourself. |     docker-compose files or use the archive to build the image yourself. | ||||||
|     You can either replace your current paperless folder or put paperless-ng |     You can either replace your current paperless folder or put paperless-ng | ||||||
|     in a different location. Paperless-ng will use the same docker volumes |     in a different location. | ||||||
|     as paperless. |  | ||||||
|  |     .. caution:: | ||||||
|  |  | ||||||
|  |         Make sure you also download the ``.env`` file. This will set the | ||||||
|  |         project name for docker compose to ``paperless`` and then it will | ||||||
|  |         automatically reuse your existing paperless volumes. | ||||||
|  |  | ||||||
| 4.  Adjust ``docker-compose.yml`` and | 4.  Adjust ``docker-compose.yml`` and | ||||||
|     ``docker-compose.env`` to your needs. |     ``docker-compose.env`` to your needs. | ||||||
|     See `docker route`_ for details on which edits are required. |     See `docker route`_ for details on which edits are required. | ||||||
|  |  | ||||||
| 5.  Update paperless. See :ref:`administration-updating` for details. | 5.  Start paperless-ng. | ||||||
|  |  | ||||||
| 6.  Start paperless-ng. |     .. code:: bash | ||||||
|  |  | ||||||
|  |         $ docker-compose up | ||||||
|  |  | ||||||
|  |     If you see everything working (you should see some migrations getting | ||||||
|  |     applied, for instance), you can gracefully stop paperless-ng with Ctrl-C | ||||||
|  |     and then start paperless-ng as usual with | ||||||
|  |  | ||||||
|     .. code:: bash |     .. code:: bash | ||||||
|  |  | ||||||
|         $ docker-compose up -d |         $ docker-compose up -d | ||||||
|  |  | ||||||
| 7.  Paperless installed a permanent redirect to ``admin/`` in your browser. This |     This will run paperless in the background and automatically start it on system boot. | ||||||
|     redirect is still in place and prevents access to the new UI. Clear  |  | ||||||
|  | 6.  Paperless installed a permanent redirect to ``admin/`` in your browser. This | ||||||
|  |     redirect is still in place and prevents access to the new UI. Clear | ||||||
|     everything related to paperless in your browsers data in order to fix |     everything related to paperless in your browsers data in order to fix | ||||||
|     this issue. |     this issue. | ||||||
|  |  | ||||||
|  |  | ||||||
| .. _setup-sqlite_to_psql: | .. _setup-sqlite_to_psql: | ||||||
|  |  | ||||||
| Moving data from sqlite to postgresql | Moving data from sqlite to postgresql | ||||||
|   | |||||||
| @@ -82,6 +82,7 @@ files from the scanner.  Typically, you're looking at an FTP server like | |||||||
|  |  | ||||||
| .. TODO: hyperref to configuration of the location of this magic folder. | .. TODO: hyperref to configuration of the location of this magic folder. | ||||||
|  |  | ||||||
|  | .. _usage-email: | ||||||
|  |  | ||||||
| IMAP (Email) | IMAP (Email) | ||||||
| ============ | ============ | ||||||
| @@ -133,6 +134,11 @@ These are as follows: | |||||||
|     paperless will read them automatically. The default acion "mark as read" is |     paperless will read them automatically. The default acion "mark as read" is | ||||||
|     pretty tame and will not cause any damage or data loss whatsoever. |     pretty tame and will not cause any damage or data loss whatsoever. | ||||||
|  |  | ||||||
|  |     You can also setup a special folder in your mail account for paperless and use | ||||||
|  |     your favorite mail client to move to be consumed mails into that folder | ||||||
|  |     automatically or manually and tell paperless to move them to yet another folder | ||||||
|  |     after consumption. It's up to you. | ||||||
|  |  | ||||||
| .. note:: | .. note:: | ||||||
|  |  | ||||||
|     Paperless will process the rules in the order defined in the admin page. |     Paperless will process the rules in the order defined in the admin page. | ||||||
|   | |||||||
| @@ -1,287 +1,55 @@ | |||||||
| # Sample paperless.conf | # Have a look at the docs for documentation. | ||||||
| # Copy this file to /etc/paperless.conf and modify it to suit your needs. | # https://paperless-ng.readthedocs.io/en/latest/configuration.html | ||||||
| # As this file contains passwords it should only be readable by the user |  | ||||||
| # running paperless. |  | ||||||
|  |  | ||||||
| ############################################################################### | # Debug. Only enable this for development. | ||||||
| ####                           Message Broker                              #### |  | ||||||
| ############################################################################### | #PAPERLESS_DEBUG=false | ||||||
|  |  | ||||||
|  | # Required services | ||||||
|  |  | ||||||
| # This is required for processing scheduled tasks such as email fetching, index |  | ||||||
| # optimization and for training the automatic document matcher. |  | ||||||
| # Defaults to localhost:6379. |  | ||||||
| #PAPERLESS_REDIS=redis://localhost:6379 | #PAPERLESS_REDIS=redis://localhost:6379 | ||||||
|  |  | ||||||
|  |  | ||||||
| ############################################################################### |  | ||||||
| ####                        Database Settings                              #### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| # By default, sqlite is used as the database backend. This can be changed here. |  | ||||||
| # The docker-compose service definition uses a postgresql server. The |  | ||||||
| # configuration for this is already done inside the docker-compose.env file. |  | ||||||
|  |  | ||||||
| #Set PAPERLESS_DBHOST and postgresql will be used instead of mysql. |  | ||||||
| #PAPERLESS_DBHOST=localhost | #PAPERLESS_DBHOST=localhost | ||||||
|  | #PAPERLESS_DBPORT=5432 | ||||||
| #Adjust port if necessary |  | ||||||
| #PAPERLESS_DBPORT= |  | ||||||
|  |  | ||||||
| #name, user and pass all default to "paperless" |  | ||||||
| #PAPERLESS_DBNAME=paperless | #PAPERLESS_DBNAME=paperless | ||||||
| #PAPERLESS_DBUSER=paperless | #PAPERLESS_DBUSER=paperless | ||||||
| #PAPERLESS_DBPASS=paperless | #PAPERLESS_DBPASS=paperless | ||||||
|  |  | ||||||
|  | # Paths and folders | ||||||
|  |  | ||||||
| ############################################################################### | #PAPERLESS_CONSUMPTION_DIR=../consume | ||||||
| ####                         Paths & Folders                               #### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| # This where your documents should go to be consumed.  Make sure that it exists |  | ||||||
| # and that the user running the paperless service can read/write its contents |  | ||||||
| # before you start Paperless. |  | ||||||
| PAPERLESS_CONSUMPTION_DIR=../consume |  | ||||||
|  |  | ||||||
| # This is where paperless stores all its data (search index, sqlite database, |  | ||||||
| # classification model, etc). |  | ||||||
| #PAPERLESS_DATA_DIR=../data | #PAPERLESS_DATA_DIR=../data | ||||||
|  |  | ||||||
| # This is where your documents and thumbnails are stored. |  | ||||||
| #PAPERLESS_MEDIA_ROOT=../media | #PAPERLESS_MEDIA_ROOT=../media | ||||||
|  |  | ||||||
| # Override the default STATIC_ROOT here.  This is where all static files |  | ||||||
| # created using "collectstatic" manager command are stored. |  | ||||||
| #PAPERLESS_STATICDIR=../static | #PAPERLESS_STATICDIR=../static | ||||||
|  |  | ||||||
|  |  | ||||||
| # Override the STATIC_URL here.  Unless you're hosting Paperless off a |  | ||||||
| # subdomain like /paperless/, you probably don't need to change this. |  | ||||||
| #PAPERLESS_STATIC_URL=/static/ |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # Specify a filename format for the document (directories are supported) |  | ||||||
| # Use the following placeholders: |  | ||||||
| # * {correspondent} |  | ||||||
| # * {title} |  | ||||||
| # * {created} |  | ||||||
| # * {added} |  | ||||||
| # * {tags[KEY]} If your tags conform to key_value or key-value |  | ||||||
| # * {tags[INDEX]} If your tags are strings, select the tag by index |  | ||||||
| # Uniqueness of filenames is ensured, as an incrementing counter is attached |  | ||||||
| # to each filename. |  | ||||||
| #PAPERLESS_FILENAME_FORMAT= | #PAPERLESS_FILENAME_FORMAT= | ||||||
|  |  | ||||||
| ############################################################################### | # Security and hosting | ||||||
| ####                              Security                                 #### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| # Controls whether django's debug mode is enabled. Disable this on production |  | ||||||
| # systems. Debug mode is disabled by default. |  | ||||||
| #PAPERLESS_DEBUG=false |  | ||||||
|  |  | ||||||
| # GnuPG encryption is deprecated and will be removed in future versions. |  | ||||||
| # |  | ||||||
| # Dont use it. It does not provide any security at all. |  | ||||||
| # |  | ||||||
| # Paperless can be instructed to attempt to encrypt your PDF files with GPG |  | ||||||
| # using the PAPERLESS_PASSPHRASE specified below.  If however you're not |  | ||||||
| # concerned about encrypting these files (for example if you have disk |  | ||||||
| # encryption locally) then you don't need this and can safely leave this value |  | ||||||
| # un-set. |  | ||||||
| # |  | ||||||
| # One final note about the passphrase.  Once you've consumed a document with |  | ||||||
| # one passphrase, DON'T CHANGE IT.  Paperless assumes this to be a constant and |  | ||||||
| # can't properly export documents that were encrypted with an old passphrase if |  | ||||||
| # you've since changed it to a new one. |  | ||||||
| # |  | ||||||
| # The default is to not use encryption at all. |  | ||||||
| #PAPERLESS_PASSPHRASE=secret |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # The secret key has a default that should be fine so long as you're hosting |  | ||||||
| # Paperless on a closed network.  However, if you're putting this anywhere |  | ||||||
| # public, you should change the key to something unique and verbose. |  | ||||||
| #PAPERLESS_SECRET_KEY=change-me | #PAPERLESS_SECRET_KEY=change-me | ||||||
|  |  | ||||||
|  |  | ||||||
| # If you're planning on putting Paperless on the open internet, then you |  | ||||||
| # really should set this value to the domain name you're using.  Failing to do |  | ||||||
| # so leaves you open to HTTP host header attacks: |  | ||||||
| # https://docs.djangoproject.com/en/1.10/topics/security/#host-headers-virtual-hosting |  | ||||||
| # |  | ||||||
| # Just remember that this is a comma-separated list, so "example.com" is fine, |  | ||||||
| # as is "example.com,www.example.com", but NOT " example.com" or "example.com," |  | ||||||
| #PAPERLESS_ALLOWED_HOSTS=example.com,www.example.com | #PAPERLESS_ALLOWED_HOSTS=example.com,www.example.com | ||||||
|  |  | ||||||
| # If you decide to use the Paperless API in an ajax call, you need to add your |  | ||||||
| # servers to the list of allowed hosts that can do CORS calls. By default |  | ||||||
| # Paperless allows calls from localhost:8080, but you'd like to change that, |  | ||||||
| # you can set this value to a comma-separated list. |  | ||||||
| #PAPERLESS_CORS_ALLOWED_HOSTS=localhost:8080,example.com,localhost:8000 | #PAPERLESS_CORS_ALLOWED_HOSTS=localhost:8080,example.com,localhost:8000 | ||||||
|  |  | ||||||
| # To host paperless under a subpath url like example.com/paperless you set |  | ||||||
| # this value to /paperless. No trailing slash! |  | ||||||
| # |  | ||||||
| # https://docs.djangoproject.com/en/1.11/ref/settings/#force-script-name |  | ||||||
| #PAPERLESS_FORCE_SCRIPT_NAME= | #PAPERLESS_FORCE_SCRIPT_NAME= | ||||||
|  | #PAPERLESS_STATIC_URL=/static/ | ||||||
|  |  | ||||||
| ############################################################################### | # Software tweaks | ||||||
| ####                          Software Tweaks                              #### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| # Paperless does multiple things in the background: Maintain the search index, |  | ||||||
| # maintain the automatic matching algorithm, check emails, consume documents, |  | ||||||
| # etc. This variable specifies how many things it will do in parallel. |  | ||||||
| #PAPERLESS_TASK_WORKERS=1 | #PAPERLESS_TASK_WORKERS=1 | ||||||
|  |  | ||||||
| # Furthermore, paperless uses multiple threads when consuming documents to |  | ||||||
| # speed up OCR. This variable specifies how many pages paperless will process |  | ||||||
| # in parallel on a single document. |  | ||||||
| #PAPERLESS_THREADS_PER_WORKER=1 | #PAPERLESS_THREADS_PER_WORKER=1 | ||||||
|  |  | ||||||
| # Ensure that the product |  | ||||||
| #   PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER |  | ||||||
| # does not exceed your CPU core count or else paperless will be extremely slow. |  | ||||||
| # If you want paperless to process many documents in parallel, choose a high |  | ||||||
| # worker count. If you want paperless to process very large documents faster, |  | ||||||
| # use a higher thread per worker count. |  | ||||||
| # The default is a balance between the two, according to your CPU core count, |  | ||||||
| # with a slight favor towards threads per worker, and using as much cores as |  | ||||||
| # possible. |  | ||||||
| # If you only specify PAPERLESS_TASK_WORKERS, paperless will adjust |  | ||||||
| # PAPERLESS_THREADS_PER_WORKER automatically. |  | ||||||
|  |  | ||||||
| # If paperless won't find documents added to your consume folder, it might |  | ||||||
| # not be able to automatically detect filesystem changes. In that case, |  | ||||||
| # specify a polling interval in seconds below, which will then cause paperless |  | ||||||
| # to periodically check your consumption directory for changes. |  | ||||||
| #PAPERLESS_CONSUMER_POLLING=10 |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # When the consumer detects a duplicate document, it will not touch the |  | ||||||
| # original document. This default behavior can be changed here. |  | ||||||
| #PAPERLESS_CONSUMER_DELETE_DUPLICATES=false |  | ||||||
|  |  | ||||||
| # Use optipng to optimize thumbnails. This usually reduces the sice of |  | ||||||
| # thumbnails by about 20%, but uses considerable compute time during |  | ||||||
| # consumption. |  | ||||||
| #PAPERLESS_OPTIMIZE_THUMBNAILS=true |  | ||||||
|  |  | ||||||
| # After a document is consumed, Paperless can trigger an arbitrary script if |  | ||||||
| # you like.  This script will be passed a number of arguments for you to work |  | ||||||
| # with.  The default is blank, which means nothing will be executed.  For more |  | ||||||
| # information, take a look at the docs: |  | ||||||
| # http://paperless.readthedocs.org/en/latest/consumption.html#hooking-into-the-consumption-process |  | ||||||
| #PAPERLESS_POST_CONSUME_SCRIPT=/path/to/an/arbitrary/script.sh |  | ||||||
|  |  | ||||||
| # By default, paperless will check the document text for document date information. |  | ||||||
| # Uncomment the line below to enable checking the document filename for date |  | ||||||
| # information. The date order can be set to any option as specified in |  | ||||||
| # https://dateparser.readthedocs.io/en/latest/#settings. The filename will be |  | ||||||
| # checked first, and if nothing is found, the document text will be checked |  | ||||||
| # as normal. |  | ||||||
| #PAPERLESS_FILENAME_DATE_ORDER=YMD |  | ||||||
|  |  | ||||||
| # Sometimes devices won't create filenames which can be parsed properly |  | ||||||
| # by the filename parser (see |  | ||||||
| # https://paperless.readthedocs.io/en/latest/guesswork.html). |  | ||||||
| # |  | ||||||
| # This setting allows to specify a list of transformations |  | ||||||
| # in regular expression syntax, which are passed in order to re.sub. |  | ||||||
| # Transformation stops after the first match, so at most one transformation |  | ||||||
| # is applied. |  | ||||||
| # |  | ||||||
| # Syntax is a JSON array of dictionaries containing "pattern" and "repl" |  | ||||||
| # as keys. |  | ||||||
| # |  | ||||||
| # The example below transforms filenames created by a Brother ADS-2400N |  | ||||||
| # document scanner in its standard configuration `Name_Date_Count', so that |  | ||||||
| # count is used as title, name as tag and date can be parsed by paperless. |  | ||||||
| #PAPERLESS_FILENAME_PARSE_TRANSFORMS=[{"pattern":"^([a-z]+)_(\\d{8})_(\\d{6})_([0-9]+)\\.", "repl":"\\2\\3Z - \\4 - \\1."}] |  | ||||||
|  |  | ||||||
| # |  | ||||||
| # The following values use sensible defaults for modern systems, but if you're |  | ||||||
| # running Paperless on a low-resource device (like a Raspberry Pi), modifying |  | ||||||
| # some of these values may be necessary. |  | ||||||
| # |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # Customize the default language that tesseract will attempt to use when |  | ||||||
| # parsing documents. The default language is used whenever |  | ||||||
| #  - No language could be detected on a document |  | ||||||
| #  - No tesseract data files are available for the detected language |  | ||||||
| # It should be a 3-letter language code consistent with ISO |  | ||||||
| # 639: https://www.loc.gov/standards/iso639-2/php/code_list.php |  | ||||||
| #PAPERLESS_OCR_LANGUAGE=eng |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # On smaller systems, or even in the case of Very Large Documents, the consumer |  | ||||||
| # may explode, complaining about how it's "unable to extend pixel cache".  In |  | ||||||
| # such cases, try setting this to a reasonably low value, like 32000000.  The |  | ||||||
| # default is to use whatever is necessary to do everything without writing to |  | ||||||
| # disk, and units are in megabytes. |  | ||||||
| # |  | ||||||
| # For more information on how to use this value, you should probably search |  | ||||||
| # the web for "MAGICK_MEMORY_LIMIT". |  | ||||||
| #PAPERLESS_CONVERT_MEMORY_LIMIT=0 |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # Similar to the memory limit, if you've got a small system and your OS mounts |  | ||||||
| # /tmp as tmpfs, you should set this to a path that's on a physical disk, like |  | ||||||
| # /home/your_user/tmp or something.  ImageMagick will use this as scratch space |  | ||||||
| # when crunching through very large documents. |  | ||||||
| # |  | ||||||
| # For more information on how to use this value, you should probably search |  | ||||||
| # the web for "MAGICK_TMPDIR". |  | ||||||
| #PAPERLESS_CONVERT_TMPDIR=/var/tmp/paperless |  | ||||||
|  |  | ||||||
|  |  | ||||||
| # By default the conversion density setting for documents is 300DPI, in some |  | ||||||
| # cases it has proven useful to configure a lesser value. |  | ||||||
| # This setting has a high impact on the physical size of tmp page files, |  | ||||||
| # the speed of document conversion, and can affect the accuracy of OCR |  | ||||||
| # results. Individual results can vary and this setting should be tested |  | ||||||
| # thoroughly against the documents you are importing to see if it has any |  | ||||||
| # impacts either negative or positive. |  | ||||||
| # Testing on limited document sets has shown a setting of 200 can cut the |  | ||||||
| # size of tmp files by 1/3, and speed up conversion by up to 4x |  | ||||||
| # with little impact to OCR accuracy. |  | ||||||
| #PAPERLESS_CONVERT_DENSITY=300 |  | ||||||
|  |  | ||||||
| # By default Paperless does not OCR a document if the text can be retrieved from |  | ||||||
| # the document directly. Set to true to always OCR documents. |  | ||||||
| #PAPERLESS_OCR_ALWAYS=false |  | ||||||
|  |  | ||||||
|  |  | ||||||
| ############################################################################### |  | ||||||
| ####                            Interface                                  #### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| # Override the default UTC time zone here. |  | ||||||
| # See https://docs.djangoproject.com/en/1.10/ref/settings/#std:setting-TIME_ZONE |  | ||||||
| # for details on how to set it. |  | ||||||
| #PAPERLESS_TIME_ZONE=UTC | #PAPERLESS_TIME_ZONE=UTC | ||||||
|  | #PAPERLESS_OCR_LANGUAGE=eng | ||||||
|  | #PAPERLESS_OCR_ALWAYS=false | ||||||
|  | #PAPERLESS_CONSUMER_POLLING=10 | ||||||
|  | #PAPERLESS_CONSUMER_DELETE_DUPLICATES=false | ||||||
|  | #PAPERLESS_CONVERT_MEMORY_LIMIT=0 | ||||||
|  | #PAPERLESS_CONVERT_TMPDIR=/var/tmp/paperless | ||||||
|  | #PAPERLESS_CONVERT_DENSITY=300 | ||||||
|  | #PAPERLESS_OPTIMIZE_THUMBNAILS=true | ||||||
|  | #PAPERLESS_POST_CONSUME_SCRIPT=/path/to/an/arbitrary/script.sh | ||||||
|  | #PAPERLESS_FILENAME_DATE_ORDER=YMD | ||||||
|  | #PAPERLESS_FILENAME_PARSE_TRANSFORMS=[] | ||||||
|  |  | ||||||
|  | # Binaries | ||||||
|  |  | ||||||
| ############################################################################### |  | ||||||
| ####                     Third-Party Binaries                              #### |  | ||||||
| ############################################################################### |  | ||||||
|  |  | ||||||
| # There are a few external software packages that Paperless expects to find on |  | ||||||
| # your system when it starts up.  Unless you've done something creative with |  | ||||||
| # their installation, you probably won't need to edit any of these.  However, |  | ||||||
| # if you've installed these programs somewhere where simply typing the name of |  | ||||||
| # the program doesn't automatically execute it (ie. the program isn't in your |  | ||||||
| # $PATH), then you'll need to specify the literal path for that program here. |  | ||||||
|  |  | ||||||
| # Convert (part of the ImageMagick suite) |  | ||||||
| #PAPERLESS_CONVERT_BINARY=/usr/bin/convert | #PAPERLESS_CONVERT_BINARY=/usr/bin/convert | ||||||
|  |  | ||||||
| # Ghostscript |  | ||||||
| #PAPERLESS_GS_BINARY=/usr/bin/gs | #PAPERLESS_GS_BINARY=/usr/bin/gs | ||||||
|  |  | ||||||
| # Unpaper |  | ||||||
| #PAPERLESS_UNPAPER_BINARY=/usr/bin/unpaper | #PAPERLESS_UNPAPER_BINARY=/usr/bin/unpaper | ||||||
|  |  | ||||||
| # Optipng (for optimising thumbnail sizes) |  | ||||||
| #PAPERLESS_OPTIPNG_BINARY=/usr/bin/optipng | #PAPERLESS_OPTIPNG_BINARY=/usr/bin/optipng | ||||||
|   | |||||||
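The comments removed from the sample configuration above described `PAPERLESS_FILENAME_PARSE_TRANSFORMS` as a JSON array of `pattern`/`repl` pairs that are passed in order to `re.sub`, stopping after the first match. The sketch below illustrates that description using the example value from the old file. The helper name `apply_transforms` and the sample filename are assumptions made for illustration and are not taken from the paperless source.

```python
import json
import re

# Value copied from the example line removed from the old sample config.
RAW_TRANSFORMS = r'[{"pattern":"^([a-z]+)_(\\d{8})_(\\d{6})_([0-9]+)\\.", "repl":"\\2\\3Z - \\4 - \\1."}]'


def apply_transforms(filename, raw=RAW_TRANSFORMS):
    """Hypothetical helper: apply the first matching transform, then stop."""
    for transform in json.loads(raw):
        pattern, repl = transform["pattern"], transform["repl"]
        if re.search(pattern, filename):
            # At most one transformation is applied, as the old comment states.
            return re.sub(pattern, repl, filename)
    # No pattern matched: the filename is left untouched.
    return filename


print(apply_transforms("scan_20200101_120000_0001.pdf"))
```

Filenames that match no pattern are returned unchanged, so the new example default of `[]` leaves every filename as it is.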
| @@ -79,6 +79,7 @@ cp "$PAPERLESS_ROOT/docker/docker-compose.env" "$PAPERLESS_DIST_APP" | |||||||
|  |  | ||||||
| # docker files for pulling from docker hub | # docker files for pulling from docker hub | ||||||
| cp "$PAPERLESS_ROOT/docker/hub/"* "$PAPERLESS_DIST" | cp "$PAPERLESS_ROOT/docker/hub/"* "$PAPERLESS_DIST" | ||||||
|  | cp "$PAPERLESS_ROOT/.env" "$PAPERLESS_DIST" | ||||||
| cp "$PAPERLESS_ROOT/docker/docker-compose.env" "$PAPERLESS_DIST" | cp "$PAPERLESS_ROOT/docker/docker-compose.env" "$PAPERLESS_DIST" | ||||||
|  |  | ||||||
| # auxiliary files required for the docker image | # auxiliary files required for the docker image | ||||||
|   | |||||||
| @@ -152,11 +152,11 @@ else: | |||||||
|     X_FRAME_OPTIONS = 'SAMEORIGIN' |     X_FRAME_OPTIONS = 'SAMEORIGIN' | ||||||
|  |  | ||||||
| # We allow CORS from localhost:8080 | # We allow CORS from localhost:8080 | ||||||
| CORS_ORIGIN_WHITELIST = tuple(os.getenv("PAPERLESS_CORS_ALLOWED_HOSTS", "http://localhost:8080,https://localhost:8080").split(",")) | CORS_ALLOWED_ORIGINS = tuple(os.getenv("PAPERLESS_CORS_ALLOWED_HOSTS", "http://localhost:8000").split(",")) | ||||||
|  |  | ||||||
| if DEBUG: | if DEBUG: | ||||||
|     # Allow access from the angular development server during debugging |     # Allow access from the angular development server during debugging | ||||||
|     CORS_ORIGIN_WHITELIST += ('http://localhost:4200',) |     CORS_ALLOWED_ORIGINS += ('http://localhost:4200',) | ||||||
|  |  | ||||||
| # The secret key has a default that should be fine so long as you're hosting | # The secret key has a default that should be fine so long as you're hosting | ||||||
| # Paperless on a closed network.  However, if you're putting this anywhere | # Paperless on a closed network.  However, if you're putting this anywhere | ||||||
|   | |||||||
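The settings change above reads `PAPERLESS_CORS_ALLOWED_HOSTS` as a comma-separated string and turns it into a tuple, falling back to `http://localhost:8000`. A minimal sketch of that parsing follows, wrapped in a hypothetical function `cors_allowed_origins` so it can be tried outside Django; the `environ` parameter is likewise an assumption added for testability.

```python
import os


def cors_allowed_origins(debug=False, environ=None):
    """Hypothetical wrapper around the parsing shown in the diff."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PAPERLESS_CORS_ALLOWED_HOSTS", "http://localhost:8000")
    # A plain split: surrounding whitespace and empty entries are kept as-is.
    origins = tuple(raw.split(","))
    if debug:
        # The diff adds the Angular development server only in debug mode.
        origins += ("http://localhost:4200",)
    return origins


print(cors_allowed_origins(environ={}))
print(cors_allowed_origins(debug=True, environ={"PAPERLESS_CORS_ALLOWED_HOSTS": "https://a.example,https://b.example"}))
```

Because the value is split without stripping, an entry such as `" example.com"` or a trailing comma ends up in the tuple verbatim, which is consistent with the warning in the old sample configuration about stray spaces and commas.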