mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-07-28 18:24:38 -05:00
Merge branch 'dev'
This commit is contained in:
@@ -68,23 +68,23 @@ $ docker-compose down
|
||||
|
||||
After that, [make a backup](#backup).
|
||||
|
||||
1. If you pull the image from the docker hub, all you need to do is:
|
||||
1. If you pull the image from the docker hub, all you need to do is:
|
||||
|
||||
```shell-session
|
||||
$ docker-compose pull
|
||||
$ docker-compose up
|
||||
```
|
||||
```shell-session
|
||||
$ docker-compose pull
|
||||
$ docker-compose up
|
||||
```
|
||||
|
||||
The docker-compose files refer to the `latest` version, which is
|
||||
always the latest stable release.
|
||||
The docker-compose files refer to the `latest` version, which is
|
||||
always the latest stable release.
|
||||
|
||||
2. If you built the image yourself, do the following:
|
||||
1. If you built the image yourself, do the following:
|
||||
|
||||
```shell-session
|
||||
$ git pull
|
||||
$ docker-compose build
|
||||
$ docker-compose up
|
||||
```
|
||||
```shell-session
|
||||
$ git pull
|
||||
$ docker-compose build
|
||||
$ docker-compose up
|
||||
```
|
||||
|
||||
Running `docker-compose up` will also apply any new database migrations.
|
||||
If you see everything working, press CTRL+C once to gracefully stop
|
||||
@@ -470,7 +470,7 @@ The issues detected by the sanity checker are as follows:
|
||||
- Inaccessible thumbnails due to improper permissions.
|
||||
- Documents without any content (warning).
|
||||
- Orphaned files in the media directory (warning). These are files
|
||||
that are not referenced by any document im paperless.
|
||||
that are not referenced by any document in paperless.
|
||||
|
||||
```
|
||||
document_sanity_checker
|
||||
|
@@ -1,6 +1,6 @@
|
||||
# Advanced Topics
|
||||
|
||||
Paperless offers a couple features that automate certain tasks and make
|
||||
Paperless offers a couple of features that automate certain tasks and make
|
||||
your life easier.
|
||||
|
||||
## Matching tags, correspondents, document types, and storage paths {#matching}
|
||||
@@ -35,9 +35,9 @@ The following algorithms are available:
|
||||
(i.e. preserve ordering) in the PDF.
|
||||
- **Regular expression:** Parses the match as a regular expression and
|
||||
tries to find a match within the document.
|
||||
- **Fuzzy match:** I don't know. Look at the source.
|
||||
- **Fuzzy match:** I don't know. Look at [the source](https://github.com/paperless-ngx/paperless-ngx/blob/main/src/documents/matching.py).
|
||||
- **Auto:** Tries to automatically match new documents. This does not
|
||||
require you to set a match. See the notes below.
|
||||
require you to set a match. See the [notes below](#automatic-matching).
|
||||
|
||||
When using the _any_ or _all_ matching algorithms, you can search for
|
||||
terms that consist of multiple words by enclosing them in double quotes.
|
||||
@@ -92,7 +92,7 @@ when using this feature:
|
||||
decide when not to assign a certain tag, correspondent, document
|
||||
type, or storage path. This will usually be the case as you start
|
||||
filling up paperless with documents. Example: If all your documents
|
||||
are either from "Webshop" and "Bank", paperless will assign one
|
||||
are either from "Webshop" or "Bank", paperless will assign one
|
||||
of these correspondents to ANY new document, if both are set to
|
||||
automatic matching.
|
||||
|
||||
@@ -101,7 +101,7 @@ when using this feature:
|
||||
Sometimes you may want to do something arbitrary whenever a document is
|
||||
consumed. Rather than try to predict what you may want to do, Paperless
|
||||
lets you execute scripts of your own choosing just before or after a
|
||||
document is consumed using a couple simple hooks.
|
||||
document is consumed using a couple of simple hooks.
|
||||
|
||||
Just write a script, put it somewhere that Paperless can read & execute,
|
||||
and then put the path to that script in `paperless.conf` or
|
||||
@@ -197,7 +197,7 @@ The script can be in any language, A simple shell script example:
|
||||
!!! warning
|
||||
|
||||
The post consumption script should not modify the document files
|
||||
directly
|
||||
directly.
|
||||
|
||||
The script's stdout and stderr will be logged line by line to the
|
||||
webserver log, along with the exit code of the script.
|
||||
@@ -311,6 +311,7 @@ Paperless provides the following placeholders within filenames:
|
||||
- `{added_day}`: Day added only (number 01-31).
|
||||
- `{owner_username}`: Username of document owner, if any, or "none"
|
||||
- `{original_name}`: Document original filename, minus the extension, if any, or "none"
|
||||
- `{doc_pk}`: The paperless identifier (primary key) for the document.
|
||||
|
||||
Paperless will try to conserve the information from your database as
|
||||
much as possible. However, some characters that you can use in document
|
||||
@@ -528,7 +529,7 @@ For how to enable barcode usage, see [the configuration](/configuration#barcodes
|
||||
The two settings may be enabled independently, but do have interactions as explained
|
||||
below.
|
||||
|
||||
### Document Splitting
|
||||
### Document Splitting {#document-splitting}
|
||||
|
||||
When enabled, Paperless will look for a barcode with the configured value and create a new document
|
||||
starting from the next page. The page with the barcode on it will _not_ be retained. It
|
||||
@@ -543,3 +544,69 @@ If document splitting via barcode is also enabled, documents will be split when
|
||||
barcode is located. However, differing from the splitting, the page with the
|
||||
barcode _will_ be retained. This allows application of a barcode to any page, including
|
||||
one which holds data to keep in the document.
|
||||
|
||||
## Automatic collation of double-sided documents {#collate}
|
||||
|
||||
!!! note
|
||||
|
||||
If your scanner supports double-sided scanning natively, you do not need this feature.
|
||||
|
||||
This feature is turned off by default, see [configuration](/configuration#collate) on how to turn it on.
|
||||
|
||||
### Summary
|
||||
|
||||
If you have a scanner with an automatic document feeder (ADF) that only scans a single side,
|
||||
this feature makes scanning double-sided documents much more convenient by automatically
|
||||
collating two separate scans into one document, reordering the pages as necessary.
|
||||
|
||||
### Usage example
|
||||
|
||||
Suppose you have a double-sided document with 6 pages (3 sheets of paper). First,
|
||||
put the stack into your ADF as normal, ensuring that page 1 is scanned first. Your ADF
|
||||
will now scan pages 1, 3, and 5. Then you (or your the scanner, if it supports it) upload
|
||||
the scan into the correct sub-directory of the consume folder (`double-sided` by default;
|
||||
keep in mind that Paperless will _not_ automatically create the directory for you.)
|
||||
Paperless will then process the scan and move it into an internal staging area.
|
||||
|
||||
The next step is to turn your stack upside down (without reordering the sheets of paper),
|
||||
and scan it once again, your ADF will now scan pages 6, 4, and 2, in that order. Once this
|
||||
scan is copied into the sub-directory, Paperless will collate the previous scan with the
|
||||
new one, reversing the order of the pages on the second, "even numbered" scan. The
|
||||
resulting document will have the pages 1-6 in the correct order, and this new file will
|
||||
then be processed as normal.
|
||||
|
||||
!!! tip
|
||||
|
||||
When scanning the even numbered pages, you can omit the last empty pages, if there are
|
||||
any. For example, if page 6 is empty, you only need to scan pages 2 and 4. _Do not_ omit
|
||||
empty pages in the middle of the document.
|
||||
|
||||
### Things that could go wrong
|
||||
|
||||
Paperless will notice when the first, "odd numbered" scan has less pages than the second
|
||||
scan (this can happen when e.g. the ADF skipped a few pages in the first pass). In that
|
||||
case, Paperless will remove the staging copy as well as the scan, and give you an error
|
||||
message asking you to restart the process from scratch, by scanning the odd pages again,
|
||||
followed by the even pages.
|
||||
|
||||
Another thing that might happen is that you start a double sided scan, but then forget
|
||||
to upload the second file. To avoid collating the wrong documents if you then come back
|
||||
a day later to scan a new double-sided document, Paperless will only keep an "odd numbered
|
||||
pages" file for up to 30 minutes. If more time passes, it will consider the next incoming
|
||||
scan a completely new "odd numbered pages" one. The old staging file will get discarded.
|
||||
|
||||
### Interaction with "subdirs as tags"
|
||||
|
||||
The collation feature can be used together with the "subdirs as tags" feature (but this is not
|
||||
a requirement). Just create a correctly named double-sided subdir in the hierachy and upload
|
||||
your scans there. For example, both `double-sided/foo/bar` as well as `foo/bar/double-sided` will
|
||||
cause the collated document to be treated as if it were uploaded into `foo/bar` and receive both
|
||||
`foo` and `bar` tags, but not `double-sided`.
|
||||
|
||||
### Interaction with document splitting
|
||||
|
||||
You can use the [document splitting](#document-splitting) feature, but if you use a normal
|
||||
single-sided split marker page, the split document(s) will have an empty page at the front (or
|
||||
whatever else was on the backside of the split marker page.) You can work around that by having
|
||||
a split marker page that has the split barcode on _both_ sides. This way, the extra page will
|
||||
get automatically removed.
|
||||
|
@@ -524,7 +524,7 @@ parsing documents.
|
||||
|
||||
`PAPERLESS_OCR_MODE=<mode>`
|
||||
|
||||
: Tell paperless when and how to perform ocr on your documents. Four
|
||||
: Tell paperless when and how to perform ocr on your documents. Three
|
||||
modes are available:
|
||||
|
||||
- `skip`: Paperless skips all pages and will perform ocr only on
|
||||
@@ -1116,6 +1116,43 @@ combination with PAPERLESS_CONSUMER_BARCODE_UPSCALE bigger than 1.0.
|
||||
|
||||
Defaults to "300"
|
||||
|
||||
## Collate Double-Sided Documents {#collate}
|
||||
|
||||
`PAPERLESS_CONSUMER_ENABLE_COLLATE_DOUBLE_SIDED=<bool>`
|
||||
|
||||
: Enables automatic collation of two single-sided scans into a double-sided
|
||||
document.
|
||||
|
||||
This is useful if you have an automatic document feeder that only supports
|
||||
single-sided scans, but you need to scan a double-sided document. If your
|
||||
ADF supports double-sided scans natively, you do not need this feature.
|
||||
|
||||
`PAPERLESS_CONSUMER_RECURSIVE` must be enabled for this to work.
|
||||
|
||||
For more information, read the [corresponding section in the advanced
|
||||
documentation](/advanced_usage#collate).
|
||||
|
||||
Defaults to false.
|
||||
|
||||
`PAPERLESS_CONSUMER_COLLATE_DOUBLE_SIDED_SUBDIR_NAME=<str>`
|
||||
|
||||
: The name of the subdirectory that the collate feature expects documents to
|
||||
arrive.
|
||||
|
||||
This only has an effect if `PAPERLESS_CONSUMER_ENABLE_COLLATE_DOUBLE_SIDED`
|
||||
has been enabled. Note that Paperless will not automatically create the
|
||||
directory.
|
||||
|
||||
Defaults to "double-sided".
|
||||
|
||||
`PAPERLESS_CONSUMER_COLLATE_DOUBLE_SIDED_TIFF_SUPPORT=<bool>`
|
||||
: Whether TIFF image files should be supported when collating documents.
|
||||
This will automatically convert any TIFF image(s) to pdfs for later
|
||||
processing. This only has an effect if
|
||||
`PAPERLESS_CONSUMER_ENABLE_COLLATE_DOUBLE_SIDED` has been enabled.
|
||||
|
||||
Defaults to false.
|
||||
|
||||
## Binaries
|
||||
|
||||
There are a few external software packages that Paperless expects to
|
||||
@@ -1123,7 +1160,7 @@ find on your system when it starts up. Unless you've done something
|
||||
creative with their installation, you probably won't need to edit any
|
||||
of these. However, if you've installed these programs somewhere where
|
||||
simply typing the name of the program doesn't automatically execute it
|
||||
(ie. the program isn't in your \$PATH), then you'll need to specify
|
||||
(ie. the program isn't in your $PATH), then you'll need to specify
|
||||
the literal path for that program.
|
||||
|
||||
`PAPERLESS_CONVERT_BINARY=<path>`
|
||||
@@ -1207,7 +1244,7 @@ actual group ID on the host system, which you can get by executing
|
||||
with English, German, Italian, Spanish and French. If your language
|
||||
is not in this list, install additional languages with this
|
||||
configuration option. You will need to [find the right LangCodes](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html)
|
||||
but note that (tesseract-ocr-\* package names)[https://packages.debian.org/bullseye/graphics/]
|
||||
but note that [tesseract-ocr-\* package names](https://packages.debian.org/bullseye/graphics/)
|
||||
do not always correspond with the language codes e.g. "chi_tra" should be
|
||||
specified as "chi-tra".
|
||||
|
||||
|
@@ -58,7 +58,7 @@ first-time setup.
|
||||
|
||||
!!! note
|
||||
|
||||
Every command is executed directly from the root folder of the project unless specified otherwise.
|
||||
Every command is executed directly from the root folder of the project unless specified otherwise.
|
||||
|
||||
1. Install prerequisites + pipenv as mentioned in
|
||||
[Bare metal route](/setup#bare_metal).
|
||||
@@ -177,68 +177,69 @@ The front end is built using AngularJS. In order to get started, you need Node.j
|
||||
|
||||
The following commands are all performed in the `src-ui`-directory. You will need a running back end (including an active session) to connect to the back end API. To spin it up refer to the commands under the section [above](#back-end-development).
|
||||
|
||||
1. Install the Angular CLI. You might need sudo privileges
|
||||
to perform this command:
|
||||
1. Install the Angular CLI. You might need sudo privileges to perform this command:
|
||||
|
||||
```bash
|
||||
$ npm install -g @angular/cli
|
||||
```
|
||||
```bash
|
||||
$ npm install -g @angular/cli
|
||||
```
|
||||
|
||||
2. Make sure that it's on your path.
|
||||
2. Make sure that it's on your path.
|
||||
|
||||
3. Install all necessary modules:
|
||||
3. Install all necessary modules:
|
||||
|
||||
```bash
|
||||
$ npm install
|
||||
```
|
||||
```bash
|
||||
$ npm install
|
||||
```
|
||||
|
||||
4. You can launch a development server by running:
|
||||
4. You can launch a development server by running:
|
||||
|
||||
```bash
|
||||
$ ng serve
|
||||
```
|
||||
```bash
|
||||
$ ng serve
|
||||
```
|
||||
|
||||
This will automatically update whenever you save. However, in-place
|
||||
compilation might fail on syntax errors, in which case you need to
|
||||
restart it.
|
||||
This will automatically update whenever you save. However, in-place
|
||||
compilation might fail on syntax errors, in which case you need to
|
||||
restart it.
|
||||
|
||||
By default, the development server is available on `http://localhost:4200/` and is configured to access the API at
|
||||
`http://localhost:8000/api/`, which is the default of the backend. If you enabled `DEBUG` on the back end, several security overrides for allowed hosts, CORS and X-Frame-Options are in place so that the front end behaves exactly as in production.
|
||||
By default, the development server is available on `http://localhost:4200/` and is configured to access the API at
|
||||
`http://localhost:8000/api/`, which is the default of the backend. If you enabled `DEBUG` on the back end, several security overrides for allowed hosts, CORS and X-Frame-Options are in place so that the front end behaves exactly as in production.
|
||||
|
||||
### Testing and code style
|
||||
|
||||
- The front end code (.ts, .html, .scss) use `prettier` for code
|
||||
formatting via the Git `pre-commit` hooks which run automatically on
|
||||
commit. See [above](#code-formatting-with-pre-commit-hooks) for installation instructions. You can also run this via the CLI with a
|
||||
command such as
|
||||
The front end code (.ts, .html, .scss) use `prettier` for code
|
||||
formatting via the Git `pre-commit` hooks which run automatically on
|
||||
commit. See [above](#code-formatting-with-pre-commit-hooks) for installation instructions. You can also run this via the CLI with a
|
||||
command such as
|
||||
|
||||
```bash
|
||||
$ git ls-files -- '*.ts' | xargs pre-commit run prettier --files
|
||||
```
|
||||
```bash
|
||||
$ git ls-files -- '*.ts' | xargs pre-commit run prettier --files
|
||||
```
|
||||
|
||||
- Front end testing uses Jest and Playwright. Unit tests and e2e tests,
|
||||
respectively, can be run non-interactively with:
|
||||
Front end testing uses Jest and Playwright. Unit tests and e2e tests,
|
||||
respectively, can be run non-interactively with:
|
||||
|
||||
```bash
|
||||
$ ng test
|
||||
$ npx playwright test
|
||||
```
|
||||
```bash
|
||||
$ ng test
|
||||
$ npx playwright test
|
||||
```
|
||||
|
||||
- Playwright also includes a UI which can be run with:
|
||||
Playwright also includes a UI which can be run with:
|
||||
|
||||
```bash
|
||||
$ npx playwright test --ui
|
||||
```
|
||||
```bash
|
||||
$ npx playwright test --ui
|
||||
```
|
||||
|
||||
- In order to build the front end and serve it as part of Django, execute:
|
||||
### Building the frontend
|
||||
|
||||
```bash
|
||||
$ ng build --configuration production
|
||||
```
|
||||
In order to build the front end and serve it as part of Django, execute:
|
||||
|
||||
This will build the front end and put it in a location from which the
|
||||
Django server will serve it as static content. This way, you can verify
|
||||
that authentication is working.
|
||||
```bash
|
||||
$ ng build --configuration production
|
||||
```
|
||||
|
||||
This will build the front end and put it in a location from which the
|
||||
Django server will serve it as static content. This way, you can verify
|
||||
that authentication is working.
|
||||
|
||||
## Localization
|
||||
|
||||
|
13
docs/faq.md
13
docs/faq.md
@@ -3,10 +3,11 @@
|
||||
## _What's the general plan for Paperless-ngx?_
|
||||
|
||||
**A:** While Paperless-ngx is already considered largely
|
||||
"feature-complete" it is a community-driven project and development
|
||||
will be guided in this way. New features can be submitted via GitHub
|
||||
discussions and "up-voted" by the community but this is not a
|
||||
guarantee the feature will be implemented. This project will always be
|
||||
"feature-complete", it is a community-driven project and development
|
||||
will be guided in this way. New features can be submitted via
|
||||
[GitHub discussions](https://github.com/paperless-ngx/paperless-ngx/discussions)
|
||||
and "up-voted" by the community, but this is not a
|
||||
guarantee that the feature will be implemented. This project will always be
|
||||
open to collaboration in the form of PRs, ideas etc.
|
||||
|
||||
## _I'm using docker. Where are my documents?_
|
||||
@@ -58,7 +59,7 @@ elsewhere. Here are a couple notes about that.
|
||||
WebP images are processed with OCR and converted into PDF documents.
|
||||
- Plain text documents are supported as well and are added verbatim to
|
||||
paperless.
|
||||
- With the optional Tika integration enabled (see [Tika configuration](/configuration#tika),
|
||||
- With the optional Tika integration enabled (see [Tika configuration](https://docs.paperless-ngx.com/configuration#tika)),
|
||||
Paperless also supports various Office documents (.docx, .doc, odt,
|
||||
.ppt, .pptx, .odp, .xls, .xlsx, .ods).
|
||||
|
||||
@@ -82,7 +83,7 @@ has to do much less work to serve the data.
|
||||
## _How do I install paperless-ngx on Raspberry Pi?_
|
||||
|
||||
**A:** Docker images are available for armv7 and arm64 hardware, so just
|
||||
follow the docker-compose instructions. Apart from more required disk
|
||||
follow the [docker-compose instructions](https://docs.paperless-ngx.com/setup/#installation). Apart from more required disk
|
||||
space compared to a bare metal installation, docker comes with close to
|
||||
zero overhead, even on Raspberry Pi.
|
||||
|
||||
|
Reference in New Issue
Block a user