Pre Consume Script Examples
tooomm edited this page 2025-01-05 22:35:48 +01:00
This wiki page collects example pre-consume scripts contributed by the community. As always, exercise caution with scripts from the internet and make sure you understand the code before running it.
Removing Blank Pages
Warning: This script modifies the original file!
Note: Original source: https://github.com/paperless-ngx/paperless-ngx/discussions/668#discussioncomment-3936343 with a slight update (suppress warnings for Apple PDFs)
#!/bin/bash
# Remove blank pages from a PDF, based on the ink coverage of each page.
#set -x -e -o pipefail
set -e -o pipefail
export LC_ALL=C

#IN="$1"
IN="$DOCUMENT_WORKING_PATH"

# Check for PDF format
TYPE=$(file -b "$IN")
if [ "${TYPE%%,*}" != "PDF document" ]; then
    >&2 echo "Skipping $IN - non PDF [$TYPE]."
    exit 0
fi

# PDF file - proceed
#PAGES=$(pdfinfo "$IN" | grep ^Pages: | tr -dc '0-9')
PAGES=$(pdfinfo "$IN" | awk '/^Pages:/ {print $2}')
>&2 echo "Total pages $PAGES"

# Threshold for HP scanners
# THRESHOLD=1
# Threshold for Lexmark MC2425
THRESHOLD=0.8

# Print the number of every page whose summed CMYK coverage exceeds THRESHOLD.
# Only page numbers go to stdout; progress messages go to stderr.
non_blank() {
    for i in $(seq 1 "$PAGES"); do
        PERCENT=$(gs -o - -dFirstPage="${i}" -dLastPage="${i}" -sDEVICE=ink_cov "${IN}" | grep CMYK | awk 'BEGIN { sum=0; } {sum += $1 + $2 + $3 + $4;} END { printf "%.5f\n", sum } ')
        >&2 echo -n "Color-sum in page $i is $PERCENT: "
        if awk "BEGIN { exit !($PERCENT > $THRESHOLD) }"; then
            echo "$i"
            >&2 echo "Page added to document"
        else
            >&2 echo "Page removed from document"
        fi
    done
}

NON_BLANK=$(non_blank)
# Rewrite the file only if at least one page is kept
if [ -n "$NON_BLANK" ]; then
    # Join the page numbers with commas, as expected by qpdf --pages
    NON_BLANK=$(echo $NON_BLANK | tr ' ' ",")
    qpdf "$IN" --warning-exit-0 --replace-input --pages . "$NON_BLANK" --
fi
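The right THRESHOLD depends on the scanner, so it can help to look at the per-page values before letting the script above rewrite any files. The following is a minimal dry-run sketch, not part of the original script: it uses the same gs ink_cov and pdfinfo calls but only prints a verdict per page. The function names sum_cmyk, above_threshold and report, and the file name ink-report.sh, are hypothetical names chosen for this example.

```shell
#!/bin/bash
# Dry-run sketch: report ink coverage per page without changing the PDF.
# All function names here are illustrative, not part of Paperless-ngx.
set -e -o pipefail
export LC_ALL=C

# Sum the C, M, Y and K columns of the ink_cov output read from stdin.
sum_cmyk() {
    awk '/CMYK/ { sum += $1 + $2 + $3 + $4 } END { printf "%.5f\n", sum }'
}

# Succeed when the coverage ($1) is strictly above the threshold ($2).
above_threshold() {
    awk -v p="$1" -v t="$2" 'BEGIN { exit !(p + 0 > t + 0) }'
}

# Print one line per page: page number, coverage, and the verdict.
report() {
    local in="$1" threshold="$2" pages i percent
    pages=$(pdfinfo "$in" | awk '/^Pages:/ {print $2}')
    for i in $(seq 1 "$pages"); do
        percent=$(gs -o - -dFirstPage="$i" -dLastPage="$i" -sDEVICE=ink_cov "$in" | sum_cmyk)
        if above_threshold "$percent" "$threshold"; then
            echo "$i $percent keep"
        else
            echo "$i $percent blank"
        fi
    done
}

# Run only when a file argument is given, e.g.: ./ink-report.sh scan.pdf 0.8
if [ -n "${1:-}" ]; then
    report "$1" "${2:-0.8}"
fi
```

Running this against a few representative scans, including ones with known blank pages, should show which threshold separates blank pages from pages with content on a given device.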
Cleaning with qpdf
- ⚠️ This script modifies the original file
- Useful for correcting certain structural issues with PDFs
#!/usr/bin/env bash
qpdf --replace-input "$DOCUMENT_WORKING_PATH"
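Paperless-ngx also consumes non-PDF files such as images, and qpdf would fail on those. A possible variant, sketched here under the assumption that checking the first five bytes for the "%PDF-" signature is good enough, skips anything that does not look like a PDF. This is a simpler stand-in for the file -b check used in the blank-page script, and the helper name is_pdf is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run qpdf only when the working file starts with the PDF signature.

# Succeed when the first five bytes of the file ($1) are "%PDF-".
is_pdf() {
    [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}

# DOCUMENT_WORKING_PATH is set by Paperless-ngx; do nothing when it is absent.
if [ -n "${DOCUMENT_WORKING_PATH:-}" ]; then
    if is_pdf "$DOCUMENT_WORKING_PATH"; then
        qpdf --replace-input "$DOCUMENT_WORKING_PATH"
    else
        >&2 echo "Skipping $DOCUMENT_WORKING_PATH - not a PDF."
    fi
fi
```

Some PDFs carry leading bytes before the signature; such files would be skipped by this check, so the file -b approach may be the better choice if they occur in practice.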