mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-02 13:45:10 -05:00
Merge pull request #254 from danielquinn/mcronce-disable_encryption
Allow encryption to be disabled
commit
3b72d38440
4
.gitignore
vendored
@ -59,8 +59,8 @@ target/
|
|||||||
|
|
||||||
# Stored PDFs
|
# Stored PDFs
|
||||||
media/documents/*.gpg
|
media/documents/*.gpg
|
||||||
media/documents/thumbnails/*.gpg
|
media/documents/thumbnails/*
|
||||||
media/documents/originals/*.gpg
|
media/documents/originals/*
|
||||||
|
|
||||||
# Sqlite database
|
# Sqlite database
|
||||||
db.sqlite3
|
db.sqlite3
|
||||||
|
@ -1,8 +1,9 @@
|
|||||||
# Environment variables to set for Paperless
|
# Environment variables to set for Paperless
|
||||||
# Commented out variables will be replaced by a default within Paperless.
|
# Commented out variables will be replaced by a default within Paperless.
|
||||||
|
|
||||||
# Passphrase Paperless uses to encrypt and decrypt your documents
|
# Passphrase Paperless uses to encrypt and decrypt your documents, if you want
|
||||||
PAPERLESS_PASSPHRASE=CHANGE_ME
|
# encryption at all.
|
||||||
|
# PAPERLESS_PASSPHRASE=CHANGE_ME
|
||||||
|
|
||||||
# The amount of threads to use for text recognition
|
# The amount of threads to use for text recognition
|
||||||
# PAPERLESS_OCR_THREADS=4
|
# PAPERLESS_OCR_THREADS=4
|
||||||
|
@ -1,6 +1,35 @@
Changelog
#########

2.0.0
=====

This is a big release as we've changed a core feature of Paperless: we no
longer encrypt files with GPG by default.

The reasons for this are many, but it boils down to this: the encryption wasn't
really all that useful, as files on-disk were still accessible so long as you
had the key, and the key was most typically stored in the config file. In
other words, your files are only as safe as the ``paperless`` user is. In
addition to that, *the contents of the documents were never encrypted*, so
important numbers etc. were always accessible simply by querying the database.
Still, it was better than nothing, but the consensus from users appears to be
that it was more an annoyance than anything else, so this feature is now turned
off unless you explicitly set a passphrase in your config file.

Migrating from 1.x
------------------

Encryption isn't gone, it's just off for new users. So long as you have
``PAPERLESS_PASSPHRASE`` set in your config or your environment, Paperless
should continue to operate as it always has. If, however, you want to drop
encryption too, you only need to do two things:

1. Run ``./manage.py migrate && ./manage.py change_storage_type gpg unencrypted``.
   This will go through your entire database and Decrypt All The Things.
2. Remove ``PAPERLESS_PASSPHRASE`` from your ``paperless.conf`` file, or simply
   stop declaring it in your environment.

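As a rough illustration of what the migration changes on disk, the sketch below mirrors the naming rule this commit adds to ``Document.source_path``: a stored original keeps its ``.gpg`` suffix only while the document's storage type is ``gpg``. The helper names ``storage_type_for`` and ``stored_file_name`` are hypothetical and are not part of Paperless.

```python
STORAGE_TYPE_UNENCRYPTED = "unencrypted"
STORAGE_TYPE_GPG = "gpg"


def storage_type_for(passphrase):
    """Pick the storage type for newly consumed documents (hypothetical helper)."""
    return STORAGE_TYPE_GPG if passphrase else STORAGE_TYPE_UNENCRYPTED


def stored_file_name(pk, file_type, storage_type):
    """Build the on-disk file name of an original from its key and type."""
    file_name = "{:07}.{}".format(pk, file_type)
    if storage_type == STORAGE_TYPE_GPG:
        # Encrypted originals carry an extra .gpg suffix.
        file_name += ".gpg"
    return file_name
```

With no passphrase set, newly consumed documents fall into the unencrypted branch, which is why step 2 only makes sense after step 1 has converted the existing files.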
1.4.0
|
1.4.0
|
||||||
=====
|
=====
|
||||||
|
|
||||||
|
@ -17,7 +17,8 @@ The primary method of getting documents into your database is by putting them in
|
|||||||
the consumption directory. The ``document_consumer`` script runs in an infinite
|
the consumption directory. The ``document_consumer`` script runs in an infinite
|
||||||
loop looking for new additions to this directory and when it finds them, it goes
|
loop looking for new additions to this directory and when it finds them, it goes
|
||||||
about the process of parsing them with the OCR, indexing what it finds, and
|
about the process of parsing them with the OCR, indexing what it finds, and
|
||||||
encrypting the PDF, storing it in the media directory.
|
encrypting the PDF (if ``PAPERLESS_PASSPHRASE`` is set), storing it in the
|
||||||
|
media directory.
|
||||||
|
|
||||||
Getting stuff into this directory is up to you. If you're running Paperless
|
Getting stuff into this directory is up to you. If you're running Paperless
|
||||||
on your local computer, you might just want to drag and drop files there, but if
|
on your local computer, you might just want to drag and drop files there, but if
|
||||||
|
@ -16,7 +16,7 @@ Backing Up
|
|||||||
----------
|
----------
|
||||||
|
|
||||||
So you're bored of this whole project, or you want to make a remote backup of
|
So you're bored of this whole project, or you want to make a remote backup of
|
||||||
the unencrypted files for whatever reason. This is easy to do, simply use the
|
your files for whatever reason. This is easy to do, simply use the
|
||||||
:ref:`exporter <utilities-exporter>` to dump your documents and database out
|
:ref:`exporter <utilities-exporter>` to dump your documents and database out
|
||||||
into an arbitrary directory.
|
into an arbitrary directory.
|
||||||
|
|
||||||
|
@ -63,17 +63,18 @@ Standard (Bare Metal)
|
|||||||
|
|
||||||
1. Install the requirements as per the :ref:`requirements <requirements>` page.
|
1. Install the requirements as per the :ref:`requirements <requirements>` page.
|
||||||
2. Within the extract of master.zip go to the ``src`` directory.
|
2. Within the extract of master.zip go to the ``src`` directory.
|
||||||
3. Copy ``../paperless.conf.example`` to ``/etc/paperless.conf`` also the virtual
|
3. Copy ``../paperless.conf.example`` to ``/etc/paperless.conf`` and open it in
|
||||||
envrionment look there for it and open it in your favourite editor.
|
your favourite editor. Because this file contains passwords it should only
|
||||||
Because this file contains passwords it should only be readable by user root
|
be readable by user root and paperless! Set the values for:
|
||||||
and paperless ! Set the values for:
|
|
||||||
|
|
||||||
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
||||||
dumped to be consumed by Paperless.
|
dumped to be consumed by Paperless.
|
||||||
* ``PAPERLESS_PASSPHRASE``: this is the passphrase Paperless uses to
|
|
||||||
encrypt/decrypt the original document.
|
|
||||||
* ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
|
* ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
|
||||||
will spawn to process document pages in parallel.
|
will spawn to process document pages in parallel.
|
||||||
|
* ``PAPERLESS_PASSPHRASE``: this is only required if you want to use GPG to
|
||||||
|
encrypt your document files. This is the passphrase Paperless uses to
|
||||||
|
encrypt/decrypt the original documents. Don't worry about defining this
|
||||||
|
if you don't want to use encryption (the default).
|
||||||
|
|
||||||
4. Initialise the SQLite database with ``./manage.py migrate``.
|
4. Initialise the SQLite database with ``./manage.py migrate``.
|
||||||
5. Create a user for your Paperless instance with
|
5. Create a user for your Paperless instance with
|
||||||
@ -139,7 +140,8 @@ Docker Method
|
|||||||
|
|
||||||
``PAPERLESS_PASSPHRASE``
|
``PAPERLESS_PASSPHRASE``
|
||||||
This is the passphrase Paperless uses to encrypt/decrypt the original
|
This is the passphrase Paperless uses to encrypt/decrypt the original
|
||||||
document.
|
document. If you aren't planning on using GPG encryption, you can just
|
||||||
|
leave this undefined.
|
||||||
|
|
||||||
``PAPERLESS_OCR_THREADS``
|
``PAPERLESS_OCR_THREADS``
|
||||||
This is the number of threads the OCR process will spawn to process
|
This is the number of threads the OCR process will spawn to process
|
||||||
@ -265,10 +267,11 @@ Vagrant Method
|
|||||||
3. Run ``vagrant ssh`` and once inside your new vagrant box, edit
|
3. Run ``vagrant ssh`` and once inside your new vagrant box, edit
|
||||||
``/etc/paperless.conf`` and set the values for:
|
``/etc/paperless.conf`` and set the values for:
|
||||||
|
|
||||||
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
* ``PAPERLESS_CONSUMPTION_DIR``: This is where your documents will be
|
||||||
dumped to be consumed by Paperless.
|
dumped to be consumed by Paperless.
|
||||||
* ``PAPERLESS_PASSPHRASE``: this is the passphrase Paperless uses to
|
* ``PAPERLESS_PASSPHRASE``: This is the passphrase Paperless uses to
|
||||||
encrypt/decrypt the original document.
|
encrypt/decrypt the original document. It's only required if you want
|
||||||
|
your original files to be encrypted, otherwise, just leave it unset.
|
||||||
* ``PAPERLESS_EMAIL_SECRET``: this is the "magic word" used when consuming
|
* ``PAPERLESS_EMAIL_SECRET``: this is the "magic word" used when consuming
|
||||||
documents from mail or via the API. If you don't use either, leaving it
|
documents from mail or via the API. If you don't use either, leaving it
|
||||||
blank is just fine.
|
blank is just fine.
|
||||||
|
@ -59,8 +59,8 @@ for documents to parse and index. The process is pretty straightforward:
|
|||||||
4. Attempt to automatically assign document attributes by doing some guesswork.
|
4. Attempt to automatically assign document attributes by doing some guesswork.
|
||||||
Read up on the :ref:`guesswork documentation<guesswork>` for more
|
Read up on the :ref:`guesswork documentation<guesswork>` for more
|
||||||
information about this process.
|
information about this process.
|
||||||
5. Encrypt the document and store it in the ``media`` directory under
|
5. Encrypt the document (if you have a passphrase set) and store it in the
|
||||||
``documents/originals``.
|
``media`` directory under ``documents/originals``.
|
||||||
6. Go to #1.
|
6. Go to #1.
|
||||||
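The storage step in the list above can be sketched as follows. This is a minimal stand-alone approximation of the ``_write`` helper added to the consumer in this commit, not the actual implementation: ``write_document`` and its ``encrypt`` parameter are hypothetical, and ``encrypt`` stands in for whatever performs the GPG encryption.

```python
def write_document(source, target, storage_type, encrypt=None):
    """Copy source to target, passing the bytes through encrypt() when needed."""
    encrypted = storage_type != "unencrypted"
    if encrypted and encrypt is None:
        # Fail before touching the target so no empty file is left behind.
        raise ValueError("an encrypt callable is required for encrypted storage")

    with open(source, "rb") as read_file, open(target, "wb") as write_file:
        data = read_file.read()
        write_file.write(encrypt(data) if encrypted else data)
```

Keeping the branch in one helper means the document and its thumbnail go through the same code path, whichever storage type is in use.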
|
|
||||||
|
|
||||||
|
@ -59,19 +59,19 @@ PAPERLESS_EMAIL_SECRET=""
|
|||||||
#### Security ####
|
#### Security ####
|
||||||
###############################################################################
|
###############################################################################
|
||||||
|
|
||||||
# You must have a passphrase in order for Paperless to work at all. If you set
|
# Paperless can be instructed to attempt to encrypt your PDF files with GPG
|
||||||
# this to "", GNUGPG will "encrypt" your PDF by writing it out as a zero-byte
|
# using the PAPERLESS_PASSPHRASE specified below. If however you're not
|
||||||
# file.
|
# concerned about encrypting these files (for example if you have disk
|
||||||
#
|
# encryption locally) then you don't need this and can safely leave this value
|
||||||
# The passphrase you use here will be used when storing your documents in
|
# un-set.
|
||||||
# Paperless, but you can always export them in an unencrypted format by using
|
|
||||||
# document exporter. See the documentation for more information.
|
|
||||||
#
|
#
|
||||||
# One final note about the passphrase. Once you've consumed a document with
|
# One final note about the passphrase. Once you've consumed a document with
|
||||||
# one passphrase, DON'T CHANGE IT. Paperless assumes this to be a constant and
|
# one passphrase, DON'T CHANGE IT. Paperless assumes this to be a constant and
|
||||||
# can't properly export documents that were encrypted with an old passphrase if
|
# can't properly export documents that were encrypted with an old passphrase if
|
||||||
# you've since changed it to a new one.
|
# you've since changed it to a new one.
|
||||||
PAPERLESS_PASSPHRASE="secret"
|
#
|
||||||
|
# The default is to not use encryption at all.
|
||||||
|
#PAPERLESS_PASSPHRASE="secret"
|
||||||
|
|
||||||
|
|
||||||
# The secret key has a default that should be fine so long as you're hosting
|
# The secret key has a default that should be fine so long as you're hosting
|
||||||
|
@ -0,0 +1 @@
|
|||||||
|
from .checks import changed_password_check
|
39
src/documents/checks.py
Normal file
@ -0,0 +1,39 @@
import textwrap

from django.conf import settings
from django.core.checks import Error, register
from django.db.utils import OperationalError


@register()
def changed_password_check(app_configs, **kwargs):

    from documents.models import Document
    from paperless.db import GnuPG

    try:
        encrypted_doc = Document.objects.filter(
            storage_type=Document.STORAGE_TYPE_GPG).first()
    except OperationalError:
        return []  # No documents table yet

    if encrypted_doc:

        if not settings.PASSPHRASE:
            return [Error(
                "The database contains encrypted documents but no password "
                "is set."
            )]

        if not GnuPG.decrypted(encrypted_doc.source_file):
            return [Error(textwrap.dedent(
                """
                The current password doesn't match the password of the
                existing documents.

                If you intend to change your password, you must first export
                all of the old documents, start fresh with the new password
                and then re-import them.
                """))]

    return []
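The decision order of the check above can be illustrated without Django. The sketch below is an assumption-laden stand-in: ``passphrase_problems`` is a hypothetical function, and ``can_decrypt`` is a placeholder for an attempt to decrypt one stored document.

```python
def passphrase_problems(has_encrypted_docs, passphrase, can_decrypt):
    """Return a list of problems; an empty list means the check passes."""
    if not has_encrypted_docs:
        # Nothing is encrypted, so the passphrase is irrelevant.
        return []
    if not passphrase:
        return ["The database contains encrypted documents but no password is set."]
    if not can_decrypt(passphrase):
        return ["The current password doesn't match the password of the existing documents."]
    return []
```

Returning a list rather than raising lets the caller report every problem at once, which matches how the check above hands back a list of errors.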
@ -29,7 +29,7 @@ class Consumer:
|
|||||||
Loop over every file found in CONSUMPTION_DIR and:
|
Loop over every file found in CONSUMPTION_DIR and:
|
||||||
1. Convert it to a greyscale pnm
|
1. Convert it to a greyscale pnm
|
||||||
2. Use tesseract on the pnm
|
2. Use tesseract on the pnm
|
||||||
3. Encrypt and store the document in the MEDIA_ROOT
|
3. Store the document in the MEDIA_ROOT with optional encryption
|
||||||
4. Store the OCR'd text in the database
|
4. Store the OCR'd text in the database
|
||||||
5. Delete the document and image(s)
|
5. Delete the document and image(s)
|
||||||
"""
|
"""
|
||||||
@ -50,6 +50,10 @@ class Consumer:
|
|||||||
|
|
||||||
os.makedirs(self.scratch, exist_ok=True)
|
os.makedirs(self.scratch, exist_ok=True)
|
||||||
|
|
||||||
|
self.storage_type = Document.STORAGE_TYPE_UNENCRYPTED
|
||||||
|
if settings.PASSPHRASE:
|
||||||
|
self.storage_type = Document.STORAGE_TYPE_GPG
|
||||||
|
|
||||||
if not self.consume:
|
if not self.consume:
|
||||||
raise ConsumerError(
|
raise ConsumerError(
|
||||||
"The CONSUMPTION_DIR settings variable does not appear to be "
|
"The CONSUMPTION_DIR settings variable does not appear to be "
|
||||||
@ -213,7 +217,8 @@ class Consumer:
|
|||||||
file_type=file_info.extension,
|
file_type=file_info.extension,
|
||||||
checksum=hashlib.md5(f.read()).hexdigest(),
|
checksum=hashlib.md5(f.read()).hexdigest(),
|
||||||
created=created,
|
created=created,
|
||||||
modified=created
|
modified=created,
|
||||||
|
storage_type=self.storage_type
|
||||||
)
|
)
|
||||||
|
|
||||||
relevant_tags = set(list(Tag.match_all(text)) + list(file_info.tags))
|
relevant_tags = set(list(Tag.match_all(text)) + list(file_info.tags))
|
||||||
@ -222,22 +227,22 @@ class Consumer:
|
|||||||
self.log("debug", "Tagging with {}".format(tag_names))
|
self.log("debug", "Tagging with {}".format(tag_names))
|
||||||
document.tags.add(*relevant_tags)
|
document.tags.add(*relevant_tags)
|
||||||
|
|
||||||
# Encrypt and store the actual document
|
self._write(document, doc, document.source_path)
|
||||||
with open(doc, "rb") as unencrypted:
|
self._write(document, thumbnail, document.thumbnail_path)
|
||||||
with open(document.source_path, "wb") as encrypted:
|
|
||||||
self.log("debug", "Encrypting the document")
|
|
||||||
encrypted.write(GnuPG.encrypted(unencrypted))
|
|
||||||
|
|
||||||
# Encrypt and store the thumbnail
|
|
||||||
with open(thumbnail, "rb") as unencrypted:
|
|
||||||
with open(document.thumbnail_path, "wb") as encrypted:
|
|
||||||
self.log("debug", "Encrypting the thumbnail")
|
|
||||||
encrypted.write(GnuPG.encrypted(unencrypted))
|
|
||||||
|
|
||||||
self.log("info", "Completed")
|
self.log("info", "Completed")
|
||||||
|
|
||||||
return document
|
return document
|
||||||
|
|
||||||
|
def _write(self, document, source, target):
|
||||||
|
with open(source, "rb") as read_file:
|
||||||
|
with open(target, "wb") as write_file:
|
||||||
|
if document.storage_type == Document.STORAGE_TYPE_UNENCRYPTED:
|
||||||
|
write_file.write(read_file.read())
|
||||||
|
return
|
||||||
|
self.log("debug", "Encrypting")
|
||||||
|
write_file.write(GnuPG.encrypted(read_file))
|
||||||
|
|
||||||
def _cleanup_doc(self, doc):
|
def _cleanup_doc(self, doc):
|
||||||
self.log("debug", "Deleting document {}".format(doc))
|
self.log("debug", "Deleting document {}".format(doc))
|
||||||
os.unlink(doc)
|
os.unlink(doc)
|
||||||
|
119
src/documents/management/commands/change_storage_type.py
Normal file
@ -0,0 +1,119 @@
import os

from django.conf import settings
from django.core.management.base import BaseCommand, CommandError
from termcolor import colored as coloured

from documents.models import Document
from paperless.db import GnuPG


class Command(BaseCommand):

    help = (
        "This is how you migrate your stored documents from an encrypted "
        "state to an unencrypted one (or vice-versa)"
    )

    def add_arguments(self, parser):

        parser.add_argument(
            "from",
            choices=("gpg", "unencrypted"),
            help="The state you want to change your documents from"
        )
        parser.add_argument(
            "to",
            choices=("gpg", "unencrypted"),
            help="The state you want to change your documents to"
        )
        parser.add_argument(
            "--passphrase",
            help="If PAPERLESS_PASSPHRASE isn't set already, you need to "
                 "specify it here"
        )

    def handle(self, *args, **options):

        try:
            print(coloured(
                "\n\nWARNING: This script is going to work directly on your "
                "document originals, so\nWARNING: you probably shouldn't run "
                "this unless you've got a recent backup\nWARNING: handy. It "
                "*should* work without a hitch, but be safe and backup your\n"
                "WARNING: stuff first.\n\nHit Ctrl+C to exit now, or Enter to "
                "continue.\n\n",
                "yellow",
                attrs=("bold",)
            ))
            __ = input()
        except KeyboardInterrupt:
            return

        if options["from"] == options["to"]:
            raise CommandError(
                'The "from" and "to" values can\'t be the same.'
            )

        passphrase = options["passphrase"] or settings.PASSPHRASE
        if not passphrase:
            raise CommandError(
                "Passphrase not defined. Please set it with --passphrase or "
                "by declaring it in your environment or your config."
            )

        if options["from"] == "gpg" and options["to"] == "unencrypted":
            self.__gpg_to_unencrypted(passphrase)
        elif options["from"] == "unencrypted" and options["to"] == "gpg":
            self.__unencrypted_to_gpg(passphrase)

    @staticmethod
    def __gpg_to_unencrypted(passphrase):

        encrypted_files = Document.objects.filter(
            storage_type=Document.STORAGE_TYPE_GPG)

        for document in encrypted_files:

            print(coloured("Decrypting {}".format(document), "green"))

            old_paths = [document.source_path, document.thumbnail_path]
            raw_document = GnuPG.decrypted(document.source_file, passphrase)
            raw_thumb = GnuPG.decrypted(document.thumbnail_file, passphrase)

            document.storage_type = Document.STORAGE_TYPE_UNENCRYPTED

            with open(document.source_path, "wb") as f:
                f.write(raw_document)

            with open(document.thumbnail_path, "wb") as f:
                f.write(raw_thumb)

            document.save(update_fields=("storage_type",))

            for path in old_paths:
                os.unlink(path)

    @staticmethod
    def __unencrypted_to_gpg(passphrase):

        unencrypted_files = Document.objects.filter(
            storage_type=Document.STORAGE_TYPE_UNENCRYPTED)

        for document in unencrypted_files:

            print(coloured("Encrypting {}".format(document), "green"))

            old_paths = [document.source_path, document.thumbnail_path]
            with open(document.source_path, "rb") as raw_document:
                with open(document.thumbnail_path, "rb") as raw_thumb:
                    document.storage_type = Document.STORAGE_TYPE_GPG
                    with open(document.source_path, "wb") as f:
                        f.write(GnuPG.encrypted(raw_document, passphrase))
                    with open(document.thumbnail_path, "wb") as f:
                        f.write(GnuPG.encrypted(raw_thumb, passphrase))

            document.save(update_fields=("storage_type",))

            for path in old_paths:
                os.unlink(path)
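The ordering used by the decrypting branch of the command above (read the old file, write the new one, and only then delete the old one) can be sketched in isolation. ``decrypt_in_place`` and its ``decrypt`` parameter are hypothetical names; ``decrypt`` stands in for the real GPG call and is assumed to take and return bytes.

```python
import os


def decrypt_in_place(encrypted_path, decrypt):
    """Write a decrypted copy next to a .gpg file, then remove the .gpg file."""
    suffix = ".gpg"
    if not encrypted_path.endswith(suffix):
        raise ValueError("expected a path ending in .gpg")
    plain_path = encrypted_path[:-len(suffix)]

    with open(encrypted_path, "rb") as f:
        raw = decrypt(f.read())

    with open(plain_path, "wb") as f:
        f.write(raw)

    # Remove the encrypted original only once the plain copy is on disk.
    os.unlink(encrypted_path)
    return plain_path
```

If decryption fails partway, the encrypted original is still present, which is one reason the command's warning about keeping a recent backup is worth heeding.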
@ -1,8 +1,8 @@
|
|||||||
import json
|
import json
|
||||||
import os
|
import os
|
||||||
import time
|
import time
|
||||||
|
import shutil
|
||||||
|
|
||||||
from django.conf import settings
|
|
||||||
from django.core.management.base import BaseCommand, CommandError
|
from django.core.management.base import BaseCommand, CommandError
|
||||||
from django.core import serializers
|
from django.core import serializers
|
||||||
|
|
||||||
@ -45,9 +45,6 @@ class Command(Renderable, BaseCommand):
|
|||||||
if not os.access(self.target, os.W_OK):
|
if not os.access(self.target, os.W_OK):
|
||||||
raise CommandError("That path doesn't appear to be writable")
|
raise CommandError("That path doesn't appear to be writable")
|
||||||
|
|
||||||
if not settings.PASSPHRASE:
|
|
||||||
settings.PASSPHRASE = input("Please enter the passphrase: ")
|
|
||||||
|
|
||||||
if options["legacy"]:
|
if options["legacy"]:
|
||||||
self.dump_legacy()
|
self.dump_legacy()
|
||||||
else:
|
else:
|
||||||
@ -73,13 +70,20 @@ class Command(Renderable, BaseCommand):
|
|||||||
print("Exporting: {}".format(file_target))
|
print("Exporting: {}".format(file_target))
|
||||||
|
|
||||||
t = int(time.mktime(document.created.timetuple()))
|
t = int(time.mktime(document.created.timetuple()))
|
||||||
with open(file_target, "wb") as f:
|
if document.storage_type == Document.STORAGE_TYPE_GPG:
|
||||||
f.write(GnuPG.decrypted(document.source_file))
|
|
||||||
os.utime(file_target, times=(t, t))
|
|
||||||
|
|
||||||
with open(thumbnail_target, "wb") as f:
|
with open(file_target, "wb") as f:
|
||||||
f.write(GnuPG.decrypted(document.thumbnail_file))
|
f.write(GnuPG.decrypted(document.source_file))
|
||||||
os.utime(thumbnail_target, times=(t, t))
|
os.utime(file_target, times=(t, t))
|
||||||
|
|
||||||
|
with open(thumbnail_target, "wb") as f:
|
||||||
|
f.write(GnuPG.decrypted(document.thumbnail_file))
|
||||||
|
os.utime(thumbnail_target, times=(t, t))
|
||||||
|
|
||||||
|
else:
|
||||||
|
|
||||||
|
shutil.copy(document.source_path, file_target)
|
||||||
|
shutil.copy(document.thumbnail_path, thumbnail_target)
|
||||||
|
|
||||||
manifest += json.loads(
|
manifest += json.loads(
|
||||||
serializers.serialize("json", Correspondent.objects.all()))
|
serializers.serialize("json", Correspondent.objects.all()))
|
||||||
|
@ -1,5 +1,6 @@
|
|||||||
import json
|
import json
|
||||||
import os
|
import os
|
||||||
|
import shutil
|
||||||
|
|
||||||
from django.conf import settings
|
from django.conf import settings
|
||||||
from django.core.management.base import BaseCommand, CommandError
|
from django.core.management.base import BaseCommand, CommandError
|
||||||
@ -46,12 +47,6 @@ class Command(Renderable, BaseCommand):
|
|||||||
|
|
||||||
self._check_manifest()
|
self._check_manifest()
|
||||||
|
|
||||||
if not settings.PASSPHRASE:
|
|
||||||
raise CommandError(
|
|
||||||
"You need to define a passphrase before continuing. Please "
|
|
||||||
"consult the documentation for setting up Paperless."
|
|
||||||
)
|
|
||||||
|
|
||||||
# Fill up the database with whatever is in the manifest
|
# Fill up the database with whatever is in the manifest
|
||||||
call_command("loaddata", manifest_path)
|
call_command("loaddata", manifest_path)
|
||||||
|
|
||||||
@ -99,14 +94,21 @@ class Command(Renderable, BaseCommand):
|
|||||||
document_path = os.path.join(self.source, doc_file)
|
document_path = os.path.join(self.source, doc_file)
|
||||||
thumbnail_path = os.path.join(self.source, thumb_file)
|
thumbnail_path = os.path.join(self.source, thumb_file)
|
||||||
|
|
||||||
with open(document_path, "rb") as unencrypted:
|
if document.storage_type == Document.STORAGE_TYPE_GPG:
|
||||||
with open(document.source_path, "wb") as encrypted:
|
|
||||||
print("Encrypting {} and saving it to {}".format(
|
|
||||||
doc_file, document.source_path))
|
|
||||||
encrypted.write(GnuPG.encrypted(unencrypted))
|
|
||||||
|
|
||||||
with open(thumbnail_path, "rb") as unencrypted:
|
with open(document_path, "rb") as unencrypted:
|
||||||
with open(document.thumbnail_path, "wb") as encrypted:
|
with open(document.source_path, "wb") as encrypted:
|
||||||
print("Encrypting {} and saving it to {}".format(
|
print("Encrypting {} and saving it to {}".format(
|
||||||
thumb_file, document.thumbnail_path))
|
doc_file, document.source_path))
|
||||||
encrypted.write(GnuPG.encrypted(unencrypted))
|
encrypted.write(GnuPG.encrypted(unencrypted))
|
||||||
|
|
||||||
|
with open(thumbnail_path, "rb") as unencrypted:
|
||||||
|
with open(document.thumbnail_path, "wb") as encrypted:
|
||||||
|
print("Encrypting {} and saving it to {}".format(
|
||||||
|
thumb_file, document.thumbnail_path))
|
||||||
|
encrypted.write(GnuPG.encrypted(unencrypted))
|
||||||
|
|
||||||
|
else:
|
||||||
|
|
||||||
|
shutil.copy(document_path, document.source_path)
|
||||||
|
shutil.copy(thumbnail_path, document.thumbnail_path)
|
||||||
|
30
src/documents/migrations/0021_document_storage_type.py
Normal file
@ -0,0 +1,30 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11.10 on 2018-02-04 13:07
from __future__ import unicode_literals

from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ('documents', '0020_document_added'),
    ]

    operations = [

        # Add the field with the default GPG-encrypted value
        migrations.AddField(
            model_name='document',
            name='storage_type',
            field=models.CharField(choices=[('unencrypted', 'Unencrypted'), ('gpg', 'Encrypted with GNU Privacy Guard')], default='gpg', editable=False, max_length=11),
        ),

        # Now that the field is added, change the default to unencrypted
        migrations.AlterField(
            model_name='document',
            name='storage_type',
            field=models.CharField(choices=[('unencrypted', 'Unencrypted'), ('gpg', 'Encrypted with GNU Privacy Guard')], default='unencrypted', editable=False, max_length=11),
        ),

    ]
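The two-step shape of the migration above matters: adding the column with ``default='gpg'`` marks every existing row as encrypted, and the later change of default only affects rows created afterwards. The sketch below imitates that effect with plain dictionaries; ``simulate_migration`` is a hypothetical illustration and does not use Django.

```python
def simulate_migration(existing_rows):
    """Imitate AddField(default='gpg') followed by AlterField(default='unencrypted')."""
    # Step 1: the new column is filled in for every row that already exists.
    migrated = [dict(row, storage_type="gpg") for row in existing_rows]
    # Step 2: only the default for future rows changes; old rows keep 'gpg'.
    default_for_new_rows = "unencrypted"
    return migrated, default_for_new_rows


rows, new_default = simulate_migration([{"pk": 1}, {"pk": 2}])
new_row = {"pk": 3, "storage_type": new_default}
```

Here ``rows`` describes documents consumed before the upgrade and ``new_row`` a document consumed after it.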
@ -57,7 +57,7 @@ class MatchingModel(models.Model):
|
|||||||
|
|
||||||
is_insensitive = models.BooleanField(default=True)
|
is_insensitive = models.BooleanField(default=True)
|
||||||
|
|
||||||
class Meta(object):
|
class Meta:
|
||||||
abstract = True
|
abstract = True
|
||||||
|
|
||||||
def __str__(self):
|
def __str__(self):
|
||||||
@ -156,7 +156,7 @@ class Correspondent(MatchingModel):
|
|||||||
# better safe than sorry.
|
# better safe than sorry.
|
||||||
SAFE_REGEX = re.compile(r"^[\w\- ,.']+$")
|
SAFE_REGEX = re.compile(r"^[\w\- ,.']+$")
|
||||||
|
|
||||||
class Meta(object):
|
class Meta:
|
||||||
ordering = ("name",)
|
ordering = ("name",)
|
||||||
|
|
||||||
|
|
||||||
@ -190,6 +190,13 @@ class Document(models.Model):
|
|||||||
TYPE_TIF = "tiff"
|
TYPE_TIF = "tiff"
|
||||||
TYPES = (TYPE_PDF, TYPE_PNG, TYPE_JPG, TYPE_GIF, TYPE_TIF,)
|
TYPES = (TYPE_PDF, TYPE_PNG, TYPE_JPG, TYPE_GIF, TYPE_TIF,)
|
||||||
|
|
||||||
|
STORAGE_TYPE_UNENCRYPTED = "unencrypted"
|
||||||
|
STORAGE_TYPE_GPG = "gpg"
|
||||||
|
STORAGE_TYPES = (
|
||||||
|
(STORAGE_TYPE_UNENCRYPTED, "Unencrypted"),
|
||||||
|
(STORAGE_TYPE_GPG, "Encrypted with GNU Privacy Guard")
|
||||||
|
)
|
||||||
|
|
||||||
correspondent = models.ForeignKey(
|
correspondent = models.ForeignKey(
|
||||||
Correspondent,
|
Correspondent,
|
||||||
blank=True,
|
blank=True,
|
||||||
@ -229,10 +236,18 @@ class Document(models.Model):
|
|||||||
default=timezone.now, db_index=True)
|
default=timezone.now, db_index=True)
|
||||||
modified = models.DateTimeField(
|
modified = models.DateTimeField(
|
||||||
auto_now=True, editable=False, db_index=True)
|
auto_now=True, editable=False, db_index=True)
|
||||||
|
|
||||||
|
storage_type = models.CharField(
|
||||||
|
max_length=11,
|
||||||
|
choices=STORAGE_TYPES,
|
||||||
|
default=STORAGE_TYPE_UNENCRYPTED,
|
||||||
|
editable=False
|
||||||
|
)
|
||||||
|
|
||||||
added = models.DateTimeField(
|
added = models.DateTimeField(
|
||||||
default=timezone.now, editable=False, db_index=True)
|
default=timezone.now, editable=False, db_index=True)
|
||||||
|
|
||||||
class Meta(object):
|
class Meta:
|
||||||
ordering = ("correspondent", "title")
|
ordering = ("correspondent", "title")
|
||||||
|
|
||||||
def __str__(self):
|
def __str__(self):
|
||||||
@ -246,11 +261,16 @@ class Document(models.Model):
|
|||||||
|
|
||||||
@property
|
@property
|
||||||
def source_path(self):
|
def source_path(self):
|
||||||
|
|
||||||
|
file_name = "{:07}.{}".format(self.pk, self.file_type)
|
||||||
|
if self.storage_type == self.STORAGE_TYPE_GPG:
|
||||||
|
file_name += ".gpg"
|
||||||
|
|
||||||
return os.path.join(
|
return os.path.join(
|
||||||
settings.MEDIA_ROOT,
|
settings.MEDIA_ROOT,
|
||||||
"documents",
|
"documents",
|
||||||
"originals",
|
"originals",
|
||||||
"{:07}.{}.gpg".format(self.pk, self.file_type)
|
file_name
|
||||||
)
|
)
|
||||||
|
|
||||||
@property
|
@property
|
||||||
@ -267,11 +287,16 @@ class Document(models.Model):
|
|||||||
|
|
||||||
@property
|
@property
|
||||||
def thumbnail_path(self):
|
def thumbnail_path(self):
|
||||||
|
|
||||||
|
file_name = "{:07}.png".format(self.pk)
|
||||||
|
if self.storage_type == self.STORAGE_TYPE_GPG:
|
||||||
|
file_name += ".gpg"
|
||||||
|
|
||||||
return os.path.join(
|
return os.path.join(
|
||||||
settings.MEDIA_ROOT,
|
settings.MEDIA_ROOT,
|
||||||
"documents",
|
"documents",
|
||||||
"thumbnails",
|
"thumbnails",
|
||||||
"{:07}.png.gpg".format(self.pk)
|
file_name
|
||||||
)
|
)
|
||||||
|
|
||||||
@property
|
@property
|
||||||
@ -301,7 +326,7 @@ class Log(models.Model):
|
|||||||
|
|
||||||
objects = LogManager()
|
objects = LogManager()
|
||||||
|
|
||||||
class Meta(object):
|
class Meta:
|
||||||
ordering = ("-modified",)
|
ordering = ("-modified",)
|
||||||
|
|
||||||
def __str__(self):
|
def __str__(self):
|
||||||
|
25
src/documents/tests/test_checks.py
Normal file
@ -0,0 +1,25 @@
import unittest

from django.test import TestCase

from ..checks import changed_password_check
from ..models import Document
from .factories import DocumentFactory


class ChecksTestCase(TestCase):

    def test_changed_password_check_empty_db(self):
        self.assertEqual(changed_password_check(None), [])

    def test_changed_password_check_no_encryption(self):
        DocumentFactory.create(storage_type=Document.STORAGE_TYPE_UNENCRYPTED)
        self.assertEqual(changed_password_check(None), [])

    @unittest.skip("I don't know how to test this")
    def test_changed_password_check_gpg_encryption_with_good_password(self):
        pass

    @unittest.skip("I don't know how to test this")
    def test_changed_password_check_fail(self):
        pass
@ -52,12 +52,12 @@ class FetchView(SessionOrBasicAuthMixin, DetailView):
|
|||||||
|
|
||||||
if self.kwargs["kind"] == "thumb":
|
if self.kwargs["kind"] == "thumb":
|
||||||
return HttpResponse(
|
return HttpResponse(
|
||||||
GnuPG.decrypted(self.object.thumbnail_file),
|
self._get_raw_data(self.object.thumbnail_file),
|
||||||
content_type=content_types[Document.TYPE_PNG]
|
content_type=content_types[Document.TYPE_PNG]
|
||||||
)
|
)
|
||||||
|
|
||||||
response = HttpResponse(
|
response = HttpResponse(
|
||||||
GnuPG.decrypted(self.object.source_file),
|
self._get_raw_data(self.object.source_file),
|
||||||
content_type=content_types[self.object.file_type]
|
content_type=content_types[self.object.file_type]
|
||||||
)
|
)
|
||||||
response["Content-Disposition"] = 'attachment; filename="{}"'.format(
|
response["Content-Disposition"] = 'attachment; filename="{}"'.format(
|
||||||
@ -65,6 +65,11 @@ class FetchView(SessionOrBasicAuthMixin, DetailView):
|
|||||||
|
|
||||||
return response
|
return response
|
||||||
|
|
||||||
|
def _get_raw_data(self, file_handle):
|
||||||
|
if self.object.storage_type == Document.STORAGE_TYPE_UNENCRYPTED:
|
||||||
|
return file_handle
|
||||||
|
return GnuPG.decrypted(file_handle)
|
||||||
|
|
||||||
|
|
||||||
class PushView(SessionOrBasicAuthMixin, FormView):
|
class PushView(SessionOrBasicAuthMixin, FormView):
|
||||||
"""
|
"""
|
||||||
|
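Note on `FetchView._get_raw_data` above: the view hands back the file handle untouched for unencrypted documents and only routes GPG-stored ones through decryption. The sketch below shows the same dispatch in isolation. The function name `raw_data`, the `encrypted` flag and the injected `decrypt` callable are assumptions made so the example runs without Django or GnuPG; in the diff the decision comes from `self.object.storage_type` and the decryption from `GnuPG.decrypted`.

```python
import io


def raw_data(file_handle, encrypted, decrypt):
    """Return the response payload for a stored file.

    Unencrypted files are passed through as-is; encrypted files are
    run through the supplied `decrypt` callable first.
    """
    if not encrypted:
        return file_handle
    return decrypt(file_handle)


def fake_decrypt(file_handle):
    # Hypothetical stand-in for real decryption: strips a marker prefix.
    return file_handle.read().replace(b"ENC:", b"", 1)


plain = io.BytesIO(b"hello")
secret = io.BytesIO(b"ENC:hello")
plain_result = raw_data(plain, encrypted=False, decrypt=fake_decrypt)
secret_result = raw_data(secret, encrypted=True, decrypt=fake_decrypt)
```

Note that the unencrypted branch returns the handle itself rather than its bytes, which mirrors what the view in the diff passes to `HttpResponse`.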
@ -3,16 +3,9 @@ import os
|
|||||||
import sys
|
import sys
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
|
||||||
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "paperless.settings")
|
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "paperless.settings")
|
||||||
|
|
||||||
from django.conf import settings
|
|
||||||
from django.core.management import execute_from_command_line
|
from django.core.management import execute_from_command_line
|
||||||
|
|
||||||
# The runserver and consumer need to have access to the passphrase, so it
|
|
||||||
# must be entered at start time to keep it safe.
|
|
||||||
if "runserver" in sys.argv or "document_consumer" in sys.argv:
|
|
||||||
if not settings.PASSPHRASE:
|
|
||||||
settings.PASSPHRASE = input(
|
|
||||||
"settings.PASSPHRASE is unset. Input passphrase: ")
|
|
||||||
|
|
||||||
execute_from_command_line(sys.argv)
|
execute_from_command_line(sys.argv)
|
||||||
|
@ -2,7 +2,7 @@ import os
|
|||||||
import shutil
|
import shutil
|
||||||
|
|
||||||
from django.conf import settings
|
from django.conf import settings
|
||||||
from django.core.checks import Error, register, Warning
|
from django.core.checks import Error, Warning, register
|
||||||
|
|
||||||
|
|
||||||
@register()
|
@register()
|
||||||
@ -84,20 +84,3 @@ def binaries_check(app_configs, **kwargs):
|
|||||||
check_messages.append(Warning(error.format(binary), hint))
|
check_messages.append(Warning(error.format(binary), hint))
|
||||||
|
|
||||||
return check_messages
|
return check_messages
|
||||||
|
|
||||||
|
|
||||||
@register()
|
|
||||||
def config_check(app_configs, **kwargs):
|
|
||||||
warning = (
|
|
||||||
"It looks like you have PAPERLESS_SHARED_SECRET defined. Note that "
|
|
||||||
"in the \npast, this variable was used for both API authentication "
|
|
||||||
"and as the mail \nkeyword. As the API no no longer uses it, this "
|
|
||||||
"variable has been renamed to \nPAPERLESS_EMAIL_SECRET, so if you're "
|
|
||||||
"using the mail feature, you'd best update \nyour variable name.\n\n"
|
|
||||||
"The old variable will stop working in a few months."
|
|
||||||
)
|
|
||||||
|
|
||||||
if os.getenv("PAPERLESS_SHARED_SECRET"):
|
|
||||||
return [Warning(warning)]
|
|
||||||
|
|
||||||
return []
|
|
||||||
|
@ -3,7 +3,7 @@ import gnupg
|
|||||||
from django.conf import settings
|
from django.conf import settings
|
||||||
|
|
||||||
|
|
||||||
class GnuPG(object):
|
class GnuPG:
|
||||||
"""
|
"""
|
||||||
A handy singleton to use when handling encrypted files.
|
A handy singleton to use when handling encrypted files.
|
||||||
"""
|
"""
|
||||||
@ -11,15 +11,22 @@ class GnuPG(object):
|
|||||||
gpg = gnupg.GPG(gnupghome=settings.GNUPG_HOME)
|
gpg = gnupg.GPG(gnupghome=settings.GNUPG_HOME)
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def decrypted(cls, file_handle):
|
def decrypted(cls, file_handle, passphrase=None):
|
||||||
return cls.gpg.decrypt_file(
|
|
||||||
file_handle, passphrase=settings.PASSPHRASE).data
|
if not passphrase:
|
||||||
|
passphrase = settings.PASSPHRASE
|
||||||
|
|
||||||
|
return cls.gpg.decrypt_file(file_handle, passphrase=passphrase).data
|
||||||
|
|
||||||
@classmethod
|
@classmethod
|
||||||
def encrypted(cls, file_handle):
|
def encrypted(cls, file_handle, passphrase=None):
|
||||||
|
|
||||||
|
if not passphrase:
|
||||||
|
passphrase = settings.PASSPHRASE
|
||||||
|
|
||||||
return cls.gpg.encrypt_file(
|
return cls.gpg.encrypt_file(
|
||||||
file_handle,
|
file_handle,
|
||||||
recipients=None,
|
recipients=None,
|
||||||
passphrase=settings.PASSPHRASE,
|
passphrase=passphrase,
|
||||||
symmetric=True
|
symmetric=True
|
||||||
).data
|
).data
|
||||||
|
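Note on the new `passphrase=None` parameter above: both `decrypted` and `encrypted` fall back to `settings.PASSPHRASE` when no passphrase is given. A minimal sketch of that fallback, with `DEFAULT_PASSPHRASE` as a hypothetical stand-in for the Django setting and `resolve_passphrase` as an illustrative name:

```python
DEFAULT_PASSPHRASE = "from-settings"  # hypothetical stand-in for settings.PASSPHRASE


def resolve_passphrase(passphrase=None, default=DEFAULT_PASSPHRASE):
    """Return the explicit passphrase if one was given, else the default."""
    # `not passphrase` is true for None and for the empty string alike,
    # so an empty passphrase also falls back to the default.
    if not passphrase:
        passphrase = default
    return passphrase
```

Because the check is `if not passphrase` rather than `if passphrase is None`, a caller cannot force an empty passphrase through this path; whether that matters depends on how callers use the argument.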
@ -221,12 +221,12 @@ OCR_LANGUAGE = os.getenv("PAPERLESS_OCR_LANGUAGE", "eng")
|
|||||||
OCR_THREADS = os.getenv("PAPERLESS_OCR_THREADS")
|
OCR_THREADS = os.getenv("PAPERLESS_OCR_THREADS")
|
||||||
|
|
||||||
# OCR all documents?
|
# OCR all documents?
|
||||||
OCR_ALWAYS = bool(os.getenv("PAPERLESS_OCR_ALWAYS", "NO").lower() in ("yes", "y", "1", "t", "true"))
|
OCR_ALWAYS = bool(os.getenv("PAPERLESS_OCR_ALWAYS", "NO").lower() in ("yes", "y", "1", "t", "true")) # NOQA
|
||||||
|
|
||||||
# If this is true, any failed attempts to OCR a PDF will result in the PDF
|
# If this is true, any failed attempts to OCR a PDF will result in the PDF
|
||||||
# being indexed anyway, with whatever we could get. If it's False, the file
|
# being indexed anyway, with whatever we could get. If it's False, the file
|
||||||
# will simply be left in the CONSUMPTION_DIR.
|
# will simply be left in the CONSUMPTION_DIR.
|
||||||
FORGIVING_OCR = bool(os.getenv("PAPERLESS_FORGIVING_OCR", "YES").lower() in ("yes", "y", "1", "t", "true"))
|
FORGIVING_OCR = bool(os.getenv("PAPERLESS_FORGIVING_OCR", "YES").lower() in ("yes", "y", "1", "t", "true")) # NOQA
|
||||||
|
|
||||||
# GNUPG needs a home directory for some reason
|
# GNUPG needs a home directory for some reason
|
||||||
GNUPG_HOME = os.getenv("HOME", "/tmp")
|
GNUPG_HOME = os.getenv("HOME", "/tmp")
|
||||||
@ -253,13 +253,17 @@ CONSUMPTION_DIR = os.getenv("PAPERLESS_CONSUMPTION_DIR")
|
|||||||
# slowly, you may want to use a higher value than the default.
|
# slowly, you may want to use a higher value than the default.
|
||||||
CONSUMER_LOOP_TIME = int(os.getenv("PAPERLESS_CONSUMER_LOOP_TIME", 10))
|
CONSUMER_LOOP_TIME = int(os.getenv("PAPERLESS_CONSUMER_LOOP_TIME", 10))
|
||||||
|
|
||||||
# This is used to encrypt the original documents and decrypt them later when
|
# Pre-2.x versions of Paperless stored your documents locally with GPG
|
||||||
# you want to download them. Set it and change the permissions on this file to
|
# encryption, but that is no longer the default. This behaviour is still
|
||||||
# 0600, or set it to `None` and you'll be prompted for the passphrase at
|
# available, but it must be explicitly enabled by setting
|
||||||
# runtime. The default looks for an environment variable.
|
# `PAPERLESS_PASSPHRASE` in your environment or config file. The default is to
|
||||||
# DON'T FORGET TO SET THIS as leaving it blank may cause some strange things
|
# store these files unencrypted.
|
||||||
# with GPG, including an interesting case where it may "encrypt" zero-byte
|
#
|
||||||
# files.
|
# Translation:
|
||||||
|
# * If you're a new user, you can safely ignore this setting.
|
||||||
|
# * If you're upgrading from 1.x, this must be set, OR you can run
|
||||||
|
# `./manage.py change_storage_type gpg unencrypted` to decrypt your files,
|
||||||
|
# after which you can unset this value.
|
||||||
PASSPHRASE = os.getenv("PAPERLESS_PASSPHRASE")
|
PASSPHRASE = os.getenv("PAPERLESS_PASSPHRASE")
|
||||||
|
|
||||||
# Trigger a script after every successful document consumption?
|
# Trigger a script after every successful document consumption?
|
||||||
|
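Note on `OCR_ALWAYS` and `FORGIVING_OCR` above: both settings repeat the same truthy-string test and need `# NOQA` because the lines are too long. One possible way to shorten them is a small helper, sketched here. The names `env_bool` and `TRUTHY` and the optional `environ` argument are assumptions for illustration; this commit does not introduce such a helper.

```python
import os

TRUTHY = ("yes", "y", "1", "t", "true")


def env_bool(name, default, environ=None):
    """Read an environment variable and interpret it as a boolean.

    `default` is the string used when the variable is unset, matching the
    "NO" / "YES" defaults in the settings above.
    """
    environ = os.environ if environ is None else environ
    return environ.get(name, default).lower() in TRUTHY


# Example with an explicit mapping instead of the real environment:
ocr_always = env_bool("PAPERLESS_OCR_ALWAYS", "NO", environ={})
forgiving_ocr = env_bool("PAPERLESS_FORGIVING_OCR", "YES", environ={})
```

With such a helper each setting would fit on one line without a lint exemption.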