mirror of https://github.com/paperless-ngx/paperless-ngx.git, synced 2025-10-30 03:56:23 -05:00

Further cleanup of docs, including fixing autoconvert issues and general cleanups

Author: Trenton Holmes
Committed by: Trenton H
Parent: 32d546740b
Commit: 7788d93227

			| @@ -9,7 +9,7 @@ Before making backups, make sure that paperless is not running. | ||||
|  | ||||
| Options available to any installation of paperless: | ||||
|  | ||||
| - Use the [document exporter](administration#exporter). The document exporter exports all your documents, | ||||
| - Use the [document exporter](#exporter). The document exporter exports all your documents, | ||||
|   thumbnails and metadata to a specific folder. You may import your | ||||
|   documents into a fresh instance of paperless again or store your | ||||
|   documents in another DMS with this export. | ||||
| @@ -52,7 +52,7 @@ Options available to bare-metal and non-docker installations: | ||||
|  | ||||
| ## Updating Paperless {#updating} | ||||
|  | ||||
| ### Docker Route | ||||
| ### Docker Route {#docker-updating} | ||||
|  | ||||
| If a new release of paperless-ngx is available, upgrading depends on how | ||||
| you installed paperless-ngx in the first place. The releases are | ||||
| @@ -70,7 +70,7 @@ After that, [make a backup](#backup). | ||||
|  | ||||
| A. If you pull the image from the docker hub, all you need to do is: | ||||
|  | ||||
|     ``` shell-session | ||||
|     ```shell-session | ||||
|     $ docker-compose pull | ||||
|     $ docker-compose up | ||||
|     ``` | ||||
| @@ -80,7 +80,7 @@ A. If you pull the image from the docker hub, all you need to do is: | ||||
|  | ||||
| B. If you built the image yourself, do the following: | ||||
|  | ||||
|     ``` shell-session | ||||
|     ```shell-session | ||||
|     $ git pull | ||||
|     $ docker-compose build | ||||
|     $ docker-compose up | ||||
| @@ -131,7 +131,7 @@ the background. | ||||
|     image: ghcr.io/paperless-ngx/paperless-ngx:1.7 | ||||
|     ``` | ||||
|  | ||||
| ### Bare Metal Route | ||||
| ### Bare Metal Route {#bare-metal-updating} | ||||
|  | ||||
| After grabbing the new release and unpacking the contents, do the | ||||
| following: | ||||
| @@ -158,7 +158,7 @@ following: | ||||
|     This might not actually do anything. Not every new paperless version | ||||
|     comes with new database migrations. | ||||
|  | ||||
| ## Downgrading Paperless | ||||
| ## Downgrading Paperless {#downgrade-paperless} | ||||
|  | ||||
| Downgrades are possible. However, some updates also contain database | ||||
| migrations (these change the layout of the database and may move data). | ||||
| @@ -366,7 +366,7 @@ task scheduler. | ||||
| ### Managing filenames {#renamer} | ||||
|  | ||||
| If you use paperless' feature to | ||||
| [assign custom filenames to your documents](/advanced_usage#file_name_handling), you can use this command to move all your files after | ||||
| [assign custom filenames to your documents](/advanced_usage#file-name-handling), you can use this command to move all your files after | ||||
| changing the naming scheme. | ||||
|  | ||||
| !!! warning | ||||
| @@ -430,9 +430,7 @@ rules. | ||||
|     As of October 2022 Microsoft no longer supports IMAP authentication | ||||
|     for Exchange servers, thus Exchange is no longer supported until a | ||||
|     solution is implemented in the Python IMAP library used by Paperless. | ||||
|     See | ||||
|  | ||||
| [learn.microsoft.com](https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online) | ||||
|     See [learn.microsoft.com](https://learn.microsoft.com/en-us/exchange/clients-and-mobile-in-exchange-online/deprecation-of-basic-authentication-exchange-online) | ||||
|  | ||||
| ### Creating archived documents {#archiver} | ||||
|  | ||||
|   | ||||
| @@ -50,7 +50,7 @@ and run another document through the consumer. Once complete, you should | ||||
| see the newly-created document, automatically tagged with the | ||||
| appropriate data. | ||||
|  | ||||
| ### Automatic matching {#automatic_matching} | ||||
| ### Automatic matching {#automatic-matching} | ||||
|  | ||||
| Paperless-ngx comes with a new matching algorithm called _Auto_. This | ||||
| matching algorithm tries to assign tags, correspondents, document types, | ||||
| @@ -59,8 +59,8 @@ assigned these on existing documents. It uses a neural network under the | ||||
| hood. | ||||
|  | ||||
| If, for example, all your bank statements of your account 123 at the | ||||
| Bank of America are tagged with the tag "bofa*123" and the matching | ||||
| algorithm of this tag is set to \_Auto*, this neural network will examine | ||||
| Bank of America are tagged with the tag "bofa123" and the matching | ||||
| algorithm of this tag is set to _Auto_, this neural network will examine | ||||
| your documents and automatically learn when to assign this tag. | ||||
|  | ||||
| Paperless tries to hide much of the involved complexity with this | ||||
| @@ -95,7 +95,7 @@ when using this feature: | ||||
|   of these correspondents to ANY new document, if both are set to | ||||
|   automatic matching. | ||||
|  | ||||
| ## Hooking into the consumption process | ||||
| ## Hooking into the consumption process {#consume-hooks} | ||||
|  | ||||
| Sometimes you may want to do something arbitrary whenever a document is | ||||
| consumed. Rather than try to predict what you may want to do, Paperless | ||||
| @@ -115,7 +115,7 @@ and then put the path to that script in `paperless.conf` or | ||||
|     asynchronously, you'll have to fork the process in your script and | ||||
|     exit. | ||||
|  | ||||
| ### Pre-consumption script | ||||
| ### Pre-consumption script {#pre-consume-script} | ||||
|  | ||||
| Executed after the consumer sees a new document in the consumption | ||||
| folder, but before any processing of the document is performed. This | ||||
| @@ -151,7 +151,7 @@ with the newly modified file. | ||||
| The script's stdout and stderr will be logged line by line to the | ||||
| webserver log, along with the exit code of the script. | ||||
|  | ||||
| ### Post-consumption script {#post_consume_script} | ||||
| ### Post-consumption script {#post-consume-script} | ||||
|  | ||||
| Executed after the consumer has successfully processed a document and | ||||
| has moved it into paperless. It receives the following environment | ||||
| @@ -181,33 +181,34 @@ The post consumption script cannot cancel the consumption process. | ||||
| The script's stdout and stderr will be logged line by line to the | ||||
| webserver log, along with the exit code of the script. | ||||
|  | ||||
| #### Docker | ||||
| ### Docker {#docker-consume-hooks} | ||||
|  | ||||
| Assumed you have | ||||
| `/home/foo/paperless-ngx/scripts/post-consumption-example.sh`. | ||||
| To hook into the consumption process when using Docker, you | ||||
| will need to pass the scripts into the container via a host mount | ||||
| in your `docker-compose.yml`. | ||||
|  | ||||
| You can pass that script into the consumer container via a host mount in | ||||
| your `docker-compose.yml`. | ||||
| Assuming you have | ||||
| `/home/paperless-ngx/scripts/post-consumption-example.sh` as a | ||||
| script which you'd like to run. | ||||
|  | ||||
| ```bash | ||||
| You can pass that script into the consumer container via a host mount: | ||||
|  | ||||
| ```yaml | ||||
| ... | ||||
| consumer: | ||||
| webserver: | ||||
|   ... | ||||
|   volumes: | ||||
|     ... | ||||
|     - /home/paperless-ngx/scripts:/path/in/container/scripts/ | ||||
|     - /home/paperless-ngx/scripts:/path/in/container/scripts/ # (1)! | ||||
|   environment: # (3)! | ||||
|     ... | ||||
|     PAPERLESS_POST_CONSUME_SCRIPT: /path/in/container/scripts/post-consumption-example.sh # (2)! | ||||
| ... | ||||
| ``` | ||||
|  | ||||
| Example (docker-compose.yml): | ||||
| `- /home/foo/paperless-ngx/scripts:/usr/src/paperless/scripts` | ||||
|  | ||||
| which in turn requires the variable `PAPERLESS_POST_CONSUME_SCRIPT` in | ||||
| `docker-compose.env` to point to | ||||
| `/path/in/container/scripts/post-consumption-example.sh`. | ||||
|  | ||||
| Example (docker-compose.env): | ||||
| `PAPERLESS_POST_CONSUME_SCRIPT=/usr/src/paperless/scripts/post-consumption-example.sh` | ||||
| 1. The external scripts directory is mounted to a location inside the container. | ||||
| 2. The internal location of the script is used to set the script to run. | ||||
| 3. This can also be set in `docker-compose.env`. | ||||
|  | ||||
| Troubleshooting: | ||||
|  | ||||
| @@ -218,7 +219,7 @@ Troubleshooting: | ||||
| - Pipe your script's output to a log file e.g. | ||||
|   `echo "${DOCUMENT_ID}" | tee --append /usr/src/paperless/scripts/post-consumption-example.log` | ||||
|  | ||||
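As a rough illustration of the troubleshooting tip above, a post-consumption script could also be written in Python instead of shell. The sketch below assumes only that `DOCUMENT_ID` is present in the environment, as in the `echo` example; the function name and the `HOOK_LOG_FILE` variable are hypothetical and not part of paperless.

```python
#!/usr/bin/env python3
import os
from pathlib import Path


def log_consumed_document(environ, log_path):
    """Append the consumed document's ID to a log file and return the line written."""
    # DOCUMENT_ID is provided by paperless; fall back to a marker if it is missing.
    document_id = environ.get("DOCUMENT_ID", "<unknown>")
    line = f"consumed document {document_id}"
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


if __name__ == "__main__":
    # HOOK_LOG_FILE is a hypothetical variable used only by this sketch.
    log_file = os.environ.get("HOOK_LOG_FILE")
    if log_file:
        # Anything printed here ends up in the webserver log, line by line.
        print(log_consumed_document(os.environ, log_file))
    else:
        print("HOOK_LOG_FILE is not set; nothing logged")
```

The script would need to be executable and mounted into the container like the shell example, with `PAPERLESS_POST_CONSUME_SCRIPT` pointing at its path inside the container.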
| ## File name handling {#file_name_handling} | ||||
| ## File name handling {#file-name-handling} | ||||
|  | ||||
| By default, paperless stores your documents in the media directory and | ||||
| renames them using the identifier which it has assigned to each | ||||
| @@ -316,7 +317,7 @@ value. | ||||
|     Paperless checks the filename of a document whenever it is saved. | ||||
|     Therefore, you need to update the filenames of your documents and move | ||||
|     them after altering this setting by invoking the | ||||
|     [`document renamer <utilities-renamer>`](). | ||||
|     [`document renamer`](administration#renamer). | ||||
|  | ||||
| !!! warning | ||||
|  | ||||
| @@ -344,7 +345,7 @@ When as single storage layout is not sufficient for your use case, | ||||
| storage paths come to the rescue. Storage paths allow you to configure | ||||
| more precisely where each document is stored in the file system. | ||||
|  | ||||
| - Each storage path is a [PAPERLESS_FILENAME_FORMAT]{.title-ref} and | ||||
| - Each storage path is a `PAPERLESS_FILENAME_FORMAT` and | ||||
|   follows the rules described above | ||||
| - Each document is assigned a storage path using the matching | ||||
|   algorithms described above, but can be overwritten at any time | ||||
| @@ -352,7 +353,7 @@ more precisely where each document is stored in the file system. | ||||
| For example, you could define the following two storage paths: | ||||
|  | ||||
| 1.  Normal communications are put into a folder structure sorted by | ||||
|     [year/correspondent]{.title-ref} | ||||
|     `year/correspondent` | ||||
| 2.  Communications with insurance companies are stored in a flat | ||||
|     structure with longer file names, but containing the full date of | ||||
|     the correspondence. | ||||
| @@ -384,7 +385,7 @@ structure as in the previous example above. | ||||
| !!! tip | ||||
|  | ||||
|     Defining a storage path is optional. If no storage path is defined for a | ||||
|     document, the global [PAPERLESS_FILENAME_FORMAT]{.title-ref} is applied. | ||||
|     document, the global `PAPERLESS_FILENAME_FORMAT` is applied. | ||||
|  | ||||
| !!! warning | ||||
|  | ||||
| @@ -403,27 +404,32 @@ queued and completed tasks, timing and more. Flower can also be used | ||||
| with Prometheus, as it exports metrics. For details on its capabilities, | ||||
| refer to the Flower documentation. | ||||
|  | ||||
| To configure Flower further, create a [flowerconfig.py]{.title-ref} and | ||||
| place it into the [src/paperless]{.title-ref} directory. For a Docker | ||||
| To configure Flower further, create a `flowerconfig.py` and | ||||
| place it into the `src/paperless` directory. For a Docker | ||||
| installation, you can use volumes to accomplish this: | ||||
|  | ||||
| ```yaml | ||||
| services: | ||||
|   # ... | ||||
|   webserver: | ||||
|     ports: | ||||
|       - 5555:5555 # (2)! | ||||
|     # ... | ||||
|     volumes: | ||||
|       - /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro | ||||
|       - /path/to/my/flowerconfig.py:/usr/src/paperless/src/paperless/flowerconfig.py:ro # (1)! | ||||
| ``` | ||||
|  | ||||
| 1. Note the `:ro` tag means the file will be mounted as read only. | ||||
| 2. `flower` runs by default on port 5555, but this can be configured | ||||
|  | ||||
| ## Custom Container Initialization | ||||
|  | ||||
| The Docker image includes the ability to run custom user scripts during | ||||
| startup. This could be utilized for installing additional tools or | ||||
| Python packages, for example. | ||||
| Python packages, for example. Scripts are expected to be shell scripts. | ||||
|  | ||||
| To utilize this, mount a folder containing your scripts to the custom | ||||
| initialization directory, [/custom-cont-init.d]{.title-ref} and place | ||||
| initialization directory, `/custom-cont-init.d` and place | ||||
| scripts you wish to run inside. For security, the folder must be owned | ||||
| by `root` and should have permissions of `a=rx`. Additionally, scripts | ||||
| must only be writable by `root`. | ||||
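The ownership and permission rules above can be checked before starting the container. The following is a minimal sketch of one reading of those rules (folder owned by `root` with exactly `a=rx`, scripts owned by `root` and not writable by group or others); the function names are hypothetical, and the checks performed by the image itself may differ.

```python
import stat


def init_folder_ok(owner_uid, mode):
    """True if the folder is owned by root and its permissions are exactly a=rx (0o555)."""
    return owner_uid == 0 and stat.S_IMODE(mode) == 0o555


def init_script_ok(owner_uid, mode):
    """True if the script is owned by root and not writable by group or others."""
    writable_by_non_owner = stat.S_IMODE(mode) & (stat.S_IWGRP | stat.S_IWOTH)
    return owner_uid == 0 and writable_by_non_owner == 0
```

Both functions take the `st_uid` and `st_mode` fields of an `os.stat()` result, for example `init_folder_ok(info.st_uid, info.st_mode)` with `info = os.stat("/path/to/my/scripts")`.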
| @@ -445,9 +451,11 @@ services: | ||||
|   webserver: | ||||
|     # ... | ||||
|     volumes: | ||||
|       - /path/to/my/scripts:/custom-cont-init.d:ro | ||||
|       - /path/to/my/scripts:/custom-cont-init.d:ro # (1)! | ||||
| ``` | ||||
|  | ||||
| 1. Note the `:ro` tag means the folder will be mounted as read only. This is for extra security against changes | ||||
|  | ||||
| ## MySQL Caveats {#mysql-caveats} | ||||
|  | ||||
| ### Case Sensitivity | ||||
|   | ||||
| @@ -225,7 +225,7 @@ Query parameters: | ||||
|  | ||||
| Results returned by the endpoint are ordered by importance of the term | ||||
| in the document index. The first result is the term that has the highest | ||||
| Tf/Idf score in the index. | ||||
| [Tf/Idf](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) score in the index. | ||||
|  | ||||
| ```json | ||||
| ["term1", "term3", "term6", "term4"] | ||||
|   | ||||
| @@ -33,15 +33,15 @@ matcher. | ||||
|     [More information on securing your Redis | ||||
|     Instance](https://redis.io/docs/getting-started/#securing-redis). | ||||
|  | ||||
|     Defaults to <redis://localhost:6379>. | ||||
|     Defaults to `redis://localhost:6379`. | ||||
|  | ||||
| `PAPERLESS_DBENGINE=<engine_name>` | ||||
|  | ||||
| : Optional, gives the ability to choose Postgres or MariaDB for | ||||
| database engine. Available options are [postgresql]{.title-ref} and | ||||
| [mariadb]{.title-ref}. | ||||
| database engine. Available options are `postgresql` and | ||||
| `mariadb`. | ||||
|  | ||||
|     Default is [postgresql]{.title-ref}. | ||||
|     Default is `postgresql`. | ||||
|  | ||||
|     !!! warning | ||||
|  | ||||
| @@ -150,25 +150,25 @@ files created using "collectstatic" manager command are stored. | ||||
| `PAPERLESS_FILENAME_FORMAT=<format>` | ||||
|  | ||||
| : Changes the filenames paperless uses to store documents in the media | ||||
| directory. See [File name handling](advanced_usage#file_name_handling) for details. | ||||
| directory. See [File name handling](advanced_usage#file-name-handling) for details. | ||||
|  | ||||
|     Default is none, which disables this feature. | ||||
|  | ||||
| `PAPERLESS_FILENAME_FORMAT_REMOVE_NONE=<bool>` | ||||
|  | ||||
| : Tells paperless to replace placeholders in | ||||
| [PAPERLESS_FILENAME_FORMAT]{.title-ref} that would resolve to | ||||
| `PAPERLESS_FILENAME_FORMAT` that would resolve to | ||||
| 'none' to be omitted from the resulting filename. This also holds | ||||
| true for directory names. See [File name handling](advanced_usage#file_name_handling) for | ||||
| true for directory names. See [File name handling](advanced_usage#file-name-handling) for | ||||
| details. | ||||
|  | ||||
|     Defaults to [false]{.title-ref} which disables this feature. | ||||
|     Defaults to `false` which disables this feature. | ||||
|  | ||||
| `PAPERLESS_LOGGING_DIR=<path>` | ||||
|  | ||||
| : This is where paperless will store log files. | ||||
|  | ||||
|     Defaults to "`PAPERLESS_DATA_DIR`/log/". | ||||
|     Defaults to `PAPERLESS_DATA_DIR/log/`. | ||||
|  | ||||
| ## Logging | ||||
|  | ||||
| @@ -283,10 +283,10 @@ login with the selected user. | ||||
| : If this environment variable is specified, Paperless automatically | ||||
| creates a superuser with the provided username at start. This is | ||||
| useful in cases where you can not run the | ||||
| [createsuperuser]{.title-ref} command separately, such as Kubernetes | ||||
| `createsuperuser` command separately, such as Kubernetes | ||||
| or AWS ECS. | ||||
|  | ||||
|     Requires [PAPERLESS_ADMIN_PASSWORD]{.title-ref} to be set. | ||||
|     Requires `PAPERLESS_ADMIN_PASSWORD` to be set. | ||||
|  | ||||
|     !!! note | ||||
|  | ||||
| @@ -297,13 +297,13 @@ or AWS ECS. | ||||
| `PAPERLESS_ADMIN_MAIL=<email>` | ||||
|  | ||||
| : (Optional) Specify superuser email address. Only used when | ||||
| [PAPERLESS_ADMIN_USER]{.title-ref} is set. | ||||
| PAPERLESS_ADMIN_USER is set. | ||||
|  | ||||
|     Defaults to `root@localhost`. | ||||
|  | ||||
| `PAPERLESS_ADMIN_PASSWORD=<password>` | ||||
|  | ||||
| : Only used when [PAPERLESS_ADMIN_USER]{.title-ref} is set. This will | ||||
| : Only used when PAPERLESS_ADMIN_USER is set. This will | ||||
| be the password of the automatically created superuser. | ||||
|  | ||||
| `PAPERLESS_COOKIE_PREFIX=<str>` | ||||
| @@ -331,26 +331,25 @@ applications. | ||||
|         If you're exposing paperless to the internet directly, do not use | ||||
|         this. | ||||
|  | ||||
|         Also see the warning [in the official documentation | ||||
|         <https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration>]{.title-ref}. | ||||
|         Also see the warning [in the official documentation](https://docs.djangoproject.com/en/3.1/howto/auth-remote-user/#configuration). | ||||
|  | ||||
|     Defaults to [false]{.title-ref} which disables this feature. | ||||
|     Defaults to "false" which disables this feature. | ||||
|  | ||||
| `PAPERLESS_HTTP_REMOTE_USER_HEADER_NAME=<str>` | ||||
|  | ||||
| : If [PAPERLESS_ENABLE_HTTP_REMOTE_USER]{.title-ref} is enabled, this | ||||
| : If `PAPERLESS_ENABLE_HTTP_REMOTE_USER` is enabled, this | ||||
| property allows customizing the name of the HTTP header from which | ||||
| the authenticated username is extracted. Values are in terms of | ||||
| \[HttpRequest.META\](<https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META>). | ||||
| Thus, the configured value must start with [HTTP\_]{.title-ref} | ||||
| [HttpRequest.META](https://docs.djangoproject.com/en/3.1/ref/request-response/#django.http.HttpRequest.META). | ||||
| Thus, the configured value must start with `HTTP_` | ||||
| followed by the normalized actual header name. | ||||
|  | ||||
|     Defaults to [HTTP_REMOTE_USER]{.title-ref}. | ||||
|     Defaults to "HTTP_REMOTE_USER". | ||||
|  | ||||
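To make "normalized actual header name" concrete: Django builds `HttpRequest.META` keys by upper-casing the header name, replacing hyphens with underscores, and adding the `HTTP_` prefix (`Content-Length` and `Content-Type` are the two exceptions and get no prefix). The helper below is a hypothetical sketch of that rule, not a function provided by Django or paperless.

```python
def meta_key_for_header(header_name):
    """Return the HttpRequest.META key Django uses for an HTTP header name."""
    normalized = header_name.upper().replace("-", "_")
    # Django stores these two headers without the HTTP_ prefix.
    if normalized in ("CONTENT_LENGTH", "CONTENT_TYPE"):
        return normalized
    return "HTTP_" + normalized
```

So a proxy that sends the username in a `Remote-User` header corresponds to the default value `HTTP_REMOTE_USER`.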
| `PAPERLESS_LOGOUT_REDIRECT_URL=<str>` | ||||
|  | ||||
| : URL to redirect the user to after a logout. This can be used | ||||
| together with [PAPERLESS_ENABLE_HTTP_REMOTE_USER]{.title-ref} to | ||||
| together with PAPERLESS_ENABLE_HTTP_REMOTE_USER to | ||||
| redirect the user back to the SSO application's logout page. | ||||
|  | ||||
|     Defaults to None, which disables this feature. | ||||
| @@ -368,7 +367,7 @@ needs. | ||||
| parsing documents. | ||||
|  | ||||
|     It should be a 3-letter language code consistent with ISO 639: | ||||
|     <https://www.loc.gov/standards/iso639-2/php/code_list.php> | ||||
|     https://www.loc.gov/standards/iso639-2/php/code_list.php | ||||
|  | ||||
|     Set this to the language most of your documents are written in. | ||||
|  | ||||
| @@ -624,8 +623,7 @@ Add the configuration variables to the environment of the webserver | ||||
| and add the additional services below the webserver service. Watch out | ||||
| for indentation. | ||||
|  | ||||
| Make sure to use the correct format [PAPERLESS_TIKA_ENABLED = | ||||
| 1]{.title-ref} so python_dotenv can parse the statement correctly. | ||||
| Make sure to use the correct format `PAPERLESS_TIKA_ENABLED = 1` so python_dotenv can parse the statement correctly. | ||||
|  | ||||
| ## Software tweaks {#software_tweaks} | ||||
|  | ||||
| @@ -648,7 +646,7 @@ paperless will process in parallel on a single document. | ||||
|  | ||||
|         Ensure that the product | ||||
|  | ||||
|         `PAPERLESS_TASK_WORKERS \:   PAPERLESS_THREADS_PER_WORKER` | ||||
|         `PAPERLESS_TASK_WORKERS * PAPERLESS_THREADS_PER_WORKER` | ||||
|  | ||||
|         does not exceed your CPU core count or else paperless will be | ||||
|         extremely slow. If you want paperless to process many documents in | ||||
| @@ -752,7 +750,7 @@ consumption directory as well. | ||||
| `PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=<bool>` | ||||
|  | ||||
| : Set the names of subdirectories as tags for consumed files. E.g. | ||||
| <CONSUMPTION_DIR>/foo/bar/file.pdf will add the tags "foo" and | ||||
| `<CONSUMPTION_DIR>/foo/bar/file.pdf` will add the tags "foo" and | ||||
| "bar" to the consumed file. Paperless will create any tags that | ||||
| don't exist yet. | ||||
|  | ||||
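The mapping from subdirectories to tags described above can be sketched as follows. This only illustrates the rule from the example path; it is not the consumer's actual implementation, and the function name is hypothetical.

```python
from pathlib import PurePosixPath


def tags_from_subdirs(consumption_dir, file_path):
    """Return the tag names implied by a file's location below the consumption directory."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(consumption_dir))
    # Every path component except the final file name becomes a tag.
    return list(relative.parts[:-1])
```

A file placed directly in the consumption directory yields an empty list, so no tags are added for it.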
| @@ -827,7 +825,7 @@ documents. | ||||
|  | ||||
| : After a document is consumed, Paperless can trigger an arbitrary | ||||
| script if you like. This script will be passed a number of arguments | ||||
| for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post_consume_script). | ||||
| for you to work with. For more information, take a look at [Post-consumption script](advanced_usage#post-consume-script). | ||||
|  | ||||
|     The default is blank, which means nothing will be executed. | ||||
|  | ||||
| @@ -841,8 +839,7 @@ option as specified in | ||||
| The filename will be checked first, and if nothing is found, the | ||||
| document text will be checked as normal. | ||||
|  | ||||
|     A date in a filename must have some separators ([.]{.title-ref}, | ||||
|     [-]{.title-ref}, [/]{.title-ref}, etc) for it to be parsed. | ||||
|     A date in a filename must have some separators (`.`, `,`, `-`, `/`, etc) for it to be parsed. | ||||
|  | ||||
|     Defaults to none, which disables this feature. | ||||
|  | ||||
| @@ -928,7 +925,7 @@ the literal path for that program. | ||||
|  | ||||
| These options don't have any effect in `paperless.conf`. These options | ||||
| adjust the behavior of the docker container. Configure these in | ||||
| [docker-compose.env]{.title-ref}. | ||||
| `docker-compose.env`. | ||||
|  | ||||
| `PAPERLESS_WEBSERVER_WORKERS=<num>` | ||||
|  | ||||
| @@ -946,7 +943,7 @@ increase RAM usage. | ||||
| There are special setups where you may need to configure this value | ||||
| to restrict the IP address or interface the webserver listens on. | ||||
|  | ||||
|     Defaults to \[::\], meaning all interfaces, including IPv6. | ||||
|     Defaults to `[::]`, meaning all interfaces, including IPv6. | ||||
|  | ||||
| `PAPERLESS_PORT=<port>` | ||||
|  | ||||
|   | ||||
| @@ -39,16 +39,16 @@ guidelines](https://github.com/paperless-ngx/paperless-ngx/blob/main/CONTRIBUTIN | ||||
| ## Code formatting with pre-commit Hooks | ||||
|  | ||||
| To ensure a consistent style and formatting across the project source, | ||||
| the project utilizes a Git [pre-commit]{.title-ref} hook to perform some | ||||
| formatting and linting before a commit is allowed. That way, everyone | ||||
| uses the same style and some common issues can be caught early on. See | ||||
| below for installation instructions. | ||||
| the project utilizes a Git [`pre-commit`](https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks) | ||||
| hook to perform some formatting and linting before a commit is allowed. | ||||
| That way, everyone uses the same style and some common issues can be caught | ||||
| early on. See below for installation instructions. | ||||
|  | ||||
| Once installed, hooks will run when you commit. If the formatting isn't | ||||
| quite right or a linter catches something, the commit will be rejected. | ||||
| You'll need to look at the output and fix the issue. Some hooks, such | ||||
| as the Python formatting tool [black]{.title-ref}, will format failing | ||||
| files, so all you need to do is [git add]{.title-ref} those files again | ||||
| as the Python formatting tool `black`, will format failing | ||||
| files, so all you need to do is `git add` those files again | ||||
| and retry your commit. | ||||
|  | ||||
| ## Initial setup and first start | ||||
| @@ -58,7 +58,7 @@ first-time setup. To do the setup you need to perform the steps from the | ||||
| following chapters in a certain order: | ||||
|  | ||||
| 1.  Install prerequisites + pipenv as mentioned in | ||||
|     `[Bare metal route](/setup#bare_metal) | ||||
|     [Bare metal route](/setup#bare_metal) | ||||
|  | ||||
| 2.  Copy `paperless.conf.example` to `paperless.conf` and enable debug | ||||
|     mode. | ||||
| @@ -69,7 +69,7 @@ following chapters in a certain order: | ||||
|     $ npm install -g @angular/cli | ||||
|     ``` | ||||
|  | ||||
| 4.  Install pre-commit | ||||
| 4.  Install pre-commit hooks | ||||
|  | ||||
|     ```shell-session | ||||
|     pre-commit install | ||||
| @@ -81,7 +81,7 @@ following chapters in a certain order: | ||||
|     mkdir -p consume media | ||||
|     ``` | ||||
|  | ||||
| 6.  You can now either \... | ||||
| 6.  You can now either ... | ||||
|  | ||||
|     - install redis or | ||||
|  | ||||
| @@ -91,9 +91,9 @@ following chapters in a certain order: | ||||
|  | ||||
|     - spin up a bare redis container | ||||
|  | ||||
|       > ```shell-session | ||||
|       > docker run -d -p 6379:6379 --restart unless-stopped redis:latest | ||||
|       > ``` | ||||
|       ```shell-session | ||||
|       docker run -d -p 6379:6379 --restart unless-stopped redis:latest | ||||
|       ``` | ||||
|  | ||||
| 7.  Install the Python dependencies by running the following in the `src/` directory. | ||||
|  | ||||
| @@ -101,10 +101,12 @@ following chapters in a certain order: | ||||
|     pipenv install --dev | ||||
|     ``` | ||||
|  | ||||
| > - Make sure you're using python 3.9.x or lower. Otherwise you might | ||||
| >   get issues with building dependencies. You can use | ||||
| >   [pyenv](https://github.com/pyenv/pyenv) to install a specific | ||||
| >   python version. | ||||
| !!! note | ||||
|  | ||||
|     Make sure you're using python 3.10.x or lower. Otherwise you might | ||||
|     get issues with building dependencies. You can use | ||||
|     [pyenv](https://github.com/pyenv/pyenv) to install a specific | ||||
|     python version. | ||||
|  | ||||
| 8.  Generate the static UI so you can perform a login to get a session | ||||
|     that is required for frontend development (this needs to be done one | ||||
| @@ -126,9 +128,9 @@ following chapters in a certain order: | ||||
|     you're developing for, you need to have some or all of them | ||||
|     running. | ||||
|  | ||||
| > ```shell-session | ||||
| > python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker | ||||
| > ``` | ||||
|     ```shell-session | ||||
|     python3 manage.py runserver & python3 manage.py document_consumer & celery --app paperless worker | ||||
|     ``` | ||||
|  | ||||
| 11. Login with the superuser credentials provided in step 8 at | ||||
|     `http://localhost:8000` to create a session that enables you to use | ||||
| @@ -140,15 +142,15 @@ development go to `/src-ui` and run `ng serve`. From there you can use | ||||
|  | ||||
| ## Back end development | ||||
|  | ||||
| The backend is a django application. PyCharm works well for development, | ||||
| The backend is a [Django](https://www.djangoproject.com/) application. PyCharm works well for development, | ||||
| but you can use whatever you want. | ||||
|  | ||||
| Configure the IDE to use the src/ folder as the base source folder. | ||||
| Configure the following launch configurations in your IDE: | ||||
|  | ||||
| - python3 manage.py runserver | ||||
| - celery \--app paperless worker | ||||
| - python3 manage.py document_consumer | ||||
| - `python3 manage.py runserver` | ||||
| - `celery --app paperless worker` | ||||
| - `python3 manage.py document_consumer` | ||||
|  | ||||
| To start them all: | ||||
|  | ||||
| @@ -158,24 +160,26 @@ python3 manage.py runserver & python3 manage.py document_consumer & celery --app | ||||
|  | ||||
| Testing and code style: | ||||
|  | ||||
| - Run `pytest` in the src/ directory to execute all tests. This also | ||||
| - Run `pytest` in the `src/` directory to execute all tests. This also | ||||
|   generates an HTML coverage report. When running tests, paperless.conf | ||||
|   is loaded as well. However: the tests rely on the default | ||||
|   configuration. This is not ideal. But for now, make sure no settings | ||||
|   except for DEBUG are overridden when testing. | ||||
|  | ||||
| - Coding style is enforced by the Git pre-commit hooks. These will | ||||
|   ensure your code is formatted and do some linting when you do a [git | ||||
|   commit]{.title-ref}. | ||||
|   ensure your code is formatted and do some linting when you do a `git commit`. | ||||
|  | ||||
| - You can also run `black` manually to format your code | ||||
|  | ||||
|   !!! note | ||||
| - The `pre-commit` hooks will modify files and interact with each other. | ||||
|   It may take a couple of `git add`, `git commit` cycles to satisfy them. | ||||
|  | ||||
| !!! note | ||||
|  | ||||
|       The line length rule E501 is generally useful for getting multiple | ||||
|       source files next to each other on the screen. However, in some | ||||
|       cases, it's just not possible to make some lines fit, especially | ||||
|       complicated IF cases. Append `# NOQA: E501` to disable this check | ||||
|       complicated IF cases. Append `# noqa: E501` to disable this check | ||||
|       for certain lines. | ||||
|  | ||||
| ## Front end development | ||||
| @@ -353,7 +357,8 @@ LANGUAGES = [ | ||||
|  | ||||
| ## Building the documentation | ||||
|  | ||||
| The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/). If you want to build the documentation locally, this is how you do it: | ||||
| The documentation is built using material-mkdocs, see their [documentation](https://squidfunk.github.io/mkdocs-material/reference/). | ||||
| If you want to build the documentation locally, this is how you do it: | ||||
|  | ||||
| 1.  Install python dependencies. | ||||
|  | ||||
| @@ -366,7 +371,7 @@ The documentation is built using material-mkdocs, see their [documentation](http | ||||
|  | ||||
|     ```shell-session | ||||
|     $ cd /path/to/paperless | ||||
|     $ pipenv mkdocs build | ||||
|     $ pipenv mkdocs build --config-file mkdocs.yml | ||||
|     ``` | ||||
|  | ||||
| ## Building the Docker image | ||||
| @@ -379,9 +384,9 @@ helper script `build-docker-image.sh`. | ||||
|  | ||||
| Building the docker image from source: | ||||
|  | ||||
| > ```shell-session | ||||
| > ./build-docker-image.sh Dockerfile -t <your-tag> | ||||
| > ``` | ||||
| ```shell-session | ||||
| ./build-docker-image.sh Dockerfile -t <your-tag> | ||||
| ``` | ||||
|  | ||||
| ## Extending Paperless | ||||
|  | ||||
| @@ -428,7 +433,7 @@ class MyCustomParser(DocumentParser): | ||||
|     def get_thumbnail(self, document_path, mime_type): | ||||
|         # This should return the path to a thumbnail you created for this | ||||
|         # document. | ||||
|         return os.path.join(self.tempdir, "thumb.png") | ||||
|         return os.path.join(self.tempdir, "thumb.webp") | ||||
| ``` | ||||
|  | ||||
| If you encounter any issues during parsing, raise a | ||||
|   | ||||
							
								
								
									
										32
									
								
								docs/faq.md
									
									
									
									
									
								
							
							
						
						
									
							| @@ -1,6 +1,6 @@ | ||||
| # Frequently Asked Questions | ||||
|  | ||||
| ### _What's the general plan for Paperless-ngx?_ | ||||
| ## _What's the general plan for Paperless-ngx?_ | ||||
|  | ||||
| **A:** While Paperless-ngx is already considered largely | ||||
| "feature-complete" it is a community-driven project and development | ||||
| @@ -9,7 +9,7 @@ discussions and "up-voted" by the community but this is not a | ||||
| guarantee the feature will be implemented. This project will always be | ||||
| open to collaboration in the form of PRs, ideas etc. | ||||
|  | ||||
| ### _I'm using docker. Where are my documents?_ | ||||
| ## _I'm using docker. Where are my documents?_ | ||||
|  | ||||
| **A:** Your documents are stored inside the docker volume | ||||
| `paperless_media`. Docker manages this volume automatically for you. It | ||||
| @@ -27,9 +27,7 @@ system. On Linux, chances are high that this location is | ||||
|     files around manually. This folder is meant to be entirely managed by | ||||
|     docker and paperless. | ||||
|  | ||||
| ### Let's say I want to switch tools in a year. Can I easily move | ||||
|  | ||||
| to other systems?\* | ||||
| ## Let's say I want to switch tools in a year. Can I easily move to other systems? | ||||
|  | ||||
| **A:** Your documents are stored as plain files inside the media folder. | ||||
| You can always drag those files out of that folder to use them | ||||
| @@ -41,17 +39,17 @@ elsewhere. Here are a couple notes about that. | ||||
| - By default, paperless uses the internal ID of each document as its | ||||
|   filename. This might not be very convenient for export. However, you | ||||
|   can adjust the way files are stored in paperless by | ||||
|   [configuring the filename format](advanced_usage#file_name_handling). | ||||
|   [configuring the filename format](advanced_usage#file-name-handling). | ||||
| - [The exporter](administration#exporter) is | ||||
|   another easy way to get your files out of paperless with reasonable | ||||
|   file names. | ||||
|  | ||||
| ### _What file types does paperless-ngx support?_ | ||||
| ## _What file types does paperless-ngx support?_ | ||||
|  | ||||
| **A:** Currently, the following files are supported: | ||||
|  | ||||
| - PDF documents, PNG images, JPEG images, TIFF images and GIF images | ||||
|   are processed with OCR and converted into PDF documents. | ||||
| - PDF documents, PNG images, JPEG images, TIFF images, GIF images and | ||||
|   WebP images are processed with OCR and converted into PDF documents. | ||||
| - Plain text documents are supported as well and are added verbatim to | ||||
|   paperless. | ||||
| - With the optional Tika integration enabled (see [Tika configuration](configuration#tika), | ||||
| @@ -61,7 +59,7 @@ elsewhere. Here are a couple notes about that. | ||||
| Paperless-ngx determines the type of a file by inspecting its content. | ||||
| The file extensions do not matter. | ||||
|  | ||||
| ### _Will paperless-ngx run on Raspberry Pi?_ | ||||
| ## _Will paperless-ngx run on Raspberry Pi?_ | ||||
|  | ||||
| **A:** The short answer is yes. I've tested it on a Raspberry Pi 3 B. | ||||
| The long answer is that certain parts of Paperless will run very slow, | ||||
| @@ -73,11 +71,11 @@ has to do much less work to serve the data. | ||||
| !!! note | ||||
|  | ||||
|     You can adjust some of the settings so that paperless uses less | ||||
|     processing power. See [setup](setup#less_powerful_devices) for details. | ||||
|     processing power. See [setup](setup#less-powerful-devices) for details. | ||||
|  | ||||
| ### _How do I install paperless-ngx on Raspberry Pi?_ | ||||
| ## _How do I install paperless-ngx on Raspberry Pi?_ | ||||
|  | ||||
| **A:** Docker images are available for arm and arm64 hardware, so just | ||||
| **A:** Docker images are available for armv7 and arm64 hardware, so just | ||||
| follow the docker-compose instructions. Apart from more required disk | ||||
| space compared to a bare metal installation, docker comes with close to | ||||
| zero overhead, even on Raspberry Pi. | ||||
| @@ -87,13 +85,13 @@ the python requirements do not have precompiled packages for ARM / | ||||
| ARM64. Installation of these will require additional development | ||||
| libraries and compilation will take a long time. | ||||
|  | ||||
| ### _How do I run this on Unraid?_ | ||||
| ## _How do I run this on Unraid?_ | ||||
|  | ||||
| **A:** Paperless-ngx is available as [community | ||||
| app](https://unraid.net/community/apps?q=paperless-ngx) in Unraid. [Uli | ||||
| Fahrer](https://github.com/Tooa) created a container template for that. | ||||
|  | ||||
| ### _How do I run this on my toaster?_ | ||||
| ## _How do I run this on my toaster?_ | ||||
|  | ||||
| **A:** I honestly don't know! As for all other devices that might be | ||||
| able to run paperless, you're a bit on your own. If you can't run the | ||||
| @@ -103,11 +101,11 @@ This is also what I use to test new releases with. Apart from that, I | ||||
| also have a Raspberry Pi, which I occasionally build the image on and | ||||
| see if it works. | ||||
|  | ||||
| ### _How do I proxy this with NGINX?_ | ||||
| ## _How do I proxy this with NGINX?_ | ||||
|  | ||||
| **A:** See [here](setup#nginx). | ||||
|  | ||||
| ### _How do I get WebSocket support with Apache mod_wsgi_? | ||||
| ## _How do I get WebSocket support with Apache mod_wsgi_? | ||||
|  | ||||
| **A:** `mod_wsgi` by itself does not support ASGI. Paperless will | ||||
| continue to work with WSGI, but certain features such as status | ||||
|   | ||||
| @@ -50,7 +50,7 @@ If you want to learn about what's different in paperless-ngx from | ||||
| Paperless, check out these resources in the documentation: | ||||
|  | ||||
| - [Some screenshots](#screenshots) of the new UI are available. | ||||
| - Read [this section](/advanced_usage/#advanced-automatic_matching) if you want to learn about how paperless automates all | ||||
| - Read [this section](/advanced_usage/#advanced-automatic-matching) if you want to learn about how paperless automates all | ||||
|   tagging using machine learning. | ||||
| - Paperless now comes with a [proper email consumer](/usage/#usage-email) that's fully tested and production ready. | ||||
| - Paperless creates searchable PDF/A documents from whatever you put into the consumption directory. This means | ||||
|   | ||||
| @@ -767,7 +767,7 @@ After that, you need to clear your cookies (Paperless-ngx comes with | ||||
| updated dependencies that do cookie-processing differently) and probably | ||||
| your cache as well. | ||||
|  | ||||
| # Considerations for less powerful devices {#less_powerful_devices} | ||||
| # Considerations for less powerful devices {#less-powerful-devices} | ||||
|  | ||||
| Paperless runs on Raspberry Pi. However, some things are rather slow on | ||||
| the Pi and configuring some options in paperless can help improve | ||||
| @@ -803,7 +803,7 @@ For details, refer to [configuration](configuration). | ||||
| !!! note | ||||
|  | ||||
|     Updating the | ||||
|     [automatic matching algorithm](/advanced_usage#automatic_matching) takes quite a bit of time. However, the update mechanism | ||||
|     [automatic matching algorithm](/advanced_usage#automatic-matching) takes quite a bit of time. However, the update mechanism | ||||
|     checks if your data has changed before doing the heavy lifting. If you | ||||
|     experience the algorithm taking too much cpu time, consider changing the | ||||
|     schedule in the admin interface to daily. You can also manually invoke | ||||
|   | ||||
| @@ -1,9 +1,9 @@ | ||||
| # Usage Overview | ||||
|  | ||||
| Paperless is an application that manages your personal documents. With | ||||
| the help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)), paperless transforms your wieldy physical document binders | ||||
| into a searchable archive and provides many utilities for finding and | ||||
| managing your documents. | ||||
| the help of a document scanner (see [the scanners wiki](https://github.com/paperless-ngx/paperless-ngx/wiki/Scanner-&-Software-Recommendations)), | ||||
| paperless transforms your unwieldy physical document binders into a searchable archive | ||||
| and provides many utilities for finding and managing your documents. | ||||
|  | ||||
| ## Terms and definitions | ||||
|  | ||||
| @@ -37,7 +37,7 @@ Each document has a couple of fields that you can assign to them: | ||||
|   date you signed a contract, or the date a letter was sent to you. | ||||
| - The _archive serial number_ (short: ASN) of a document is the | ||||
|   identifier of the document in your physical document binders. See | ||||
|   [recommended workflow](#usage-reccomended_workflow) below. | ||||
|   [recommended workflow](#usage-recommended-workflow) below. | ||||
| - The _content_ of a document is the text that was OCR'ed from the | ||||
|   document. This text is fed into the search engine and is used for | ||||
|   matching tags, correspondents and document types. | ||||
| @@ -74,8 +74,8 @@ following operations on your documents: | ||||
| ### The consumption directory | ||||
|  | ||||
| The primary method of getting documents into your database is by putting | ||||
| them in the consumption directory. The consumer runs in an infinite | ||||
| loop, looking for new additions to this directory. When it finds them, | ||||
| them in the consumption directory. The consumer waits patiently, looking | ||||
| for new additions to this directory. When it finds them, | ||||
| the consumer goes about the process of parsing them with the OCR, | ||||
| indexing what it finds, and storing it in the media directory. | ||||
|  | ||||
| @@ -99,7 +99,7 @@ dragging-and-dropping files into your browser window. | ||||
|  | ||||
| ### Mobile upload {#usage-mobile_upload} | ||||
|  | ||||
| The mobile app over at <https://github.com/qcasey/paperless_share> | ||||
| The mobile app over at [https://github.com/qcasey/paperless_share](https://github.com/qcasey/paperless_share) | ||||
| allows Android users to share any documents with paperless. This can be | ||||
| combined with any of the mobile scanning apps out there, such as Office | ||||
| Lens. | ||||
| @@ -325,7 +325,7 @@ language](https://whoosh.readthedocs.io/en/latest/querylang.html). For | ||||
| details on what date parsing utilities are available, see [Date | ||||
| parsing](https://whoosh.readthedocs.io/en/latest/dates.html#parsing-date-queries). | ||||
|  | ||||
| ## The recommended workflow {#usage-recommended_workflow} | ||||
| ## The recommended workflow {#usage-recommended-workflow} | ||||
|  | ||||
| Once you have familiarized yourself with paperless and are ready to use | ||||
| it for all your documents, the recommended workflow for managing your | ||||
|   | ||||
| @@ -23,6 +23,7 @@ theme: | ||||
|     - navigation.tabs | ||||
|     - navigation.top | ||||
|     - toc.integrate | ||||
|     - content.code.annotate | ||||
|   icon: | ||||
|     repo: fontawesome/brands/github | ||||
|   favicon: assets/favicon.png | ||||
| @@ -39,6 +40,8 @@ markdown_extensions: | ||||
|   - pymdownx.highlight: | ||||
|       anchor_linenums: true | ||||
|   - pymdownx.superfences | ||||
|   - pymdownx.inlinehilite | ||||
| strict: true | ||||
| nav: | ||||
|     - index.md | ||||
|     - setup.md | ||||
|   | ||||