mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-04-09 09:58:20 -05:00
Merge branch 'master' into issue/81
commit
49b56425e8
@ -24,8 +24,11 @@ How it Works
|
||||
|
||||
1. Buy a document scanner like `this one`_.
|
||||
2. Set it up to "scan to FTP" or something similar. It should be able to push
|
||||
scanned images to a server without you having to do anything.
|
||||
3. Have the target server run the *Paperless* consumption script to OCR the PDF
|
||||
scanned images to a server without you having to do anything. If your
|
||||
scanner doesn't know how to automatically upload the file somewhere, you can
|
||||
always do that manually. Paperless doesn't care how the documents get into
|
||||
its local consumption directory.
|
||||
3. Have the target server run the Paperless consumption script to OCR the PDF
|
||||
and index it into a local database.
|
||||
4. Use the web frontend to sift through the database and find what you want.
|
||||
5. Download the PDF you need/want via the web interface and do whatever you
|
||||
@ -56,7 +59,7 @@ powerful tools.
|
||||
|
||||
* `ImageMagick`_ converts the images between colour and greyscale.
|
||||
* `Tesseract`_ does the character recognition.
|
||||
* `Unpaper`_ despeckles and and deskews the scanned image.
|
||||
* `Unpaper`_ despeckles and deskews the scanned image.
|
||||
* `GNU Privacy Guard`_ is used as the encryption backend.
|
||||
* `Python 3`_ is the language of the project.
|
||||
|
||||
|
@ -11,6 +11,10 @@ services:
|
||||
- data:/usr/src/paperless/data
|
||||
- media:/usr/src/paperless/media
|
||||
env_file: docker-compose.env
|
||||
# The reason this line is here is that the webserver doesn't do any text
# recognition, so it doesn't need to install the languages the user might
# have set in the env-file. Overwriting the value with nothing prevents
# that.
|
||||
environment:
|
||||
- PAPERLESS_OCR_LANGUAGES=
|
||||
command: ["runserver", "0.0.0.0:8000"]
|
||||
|
@ -1,6 +1,15 @@
|
||||
Changelog
|
||||
#########
|
||||
|
||||
* 0.2.0
|
||||
|
||||
* Added support for guessing the date from the file name along with the
|
||||
correspondent, title, and tags. Thanks to `Tikitu de Jager`_ for his pull
|
||||
request that I took forever to merge and to `Pit`_ for his efforts on the
|
||||
regex front.
|
||||
* `#94`_: Restored support for changing the created date in the UI. Thanks
|
||||
to `Martin Honermeyer`_ and `Tim White`_ for working with me on this.
|
||||
|
||||
* 0.1.1
|
||||
|
||||
* Potentially **Breaking Change**: All references to "sender" in the code
|
||||
@ -86,6 +95,8 @@ Changelog
|
||||
.. _Wayne Werner: https://github.com/waynew
|
||||
.. _darkmatter: https://github.com/darkmatter
|
||||
.. _zedster: https://github.com/zedster
|
||||
.. _Martin Honermeyer: https://github.com/djmaze
|
||||
.. _Tim White: https://github.com/timwhite
|
||||
|
||||
.. _#20: https://github.com/danielquinn/paperless/issues/20
|
||||
.. _#44: https://github.com/danielquinn/paperless/issues/44
|
||||
@ -99,3 +110,4 @@ Changelog
|
||||
.. _#67: https://github.com/danielquinn/paperless/issues/67
|
||||
.. _#68: https://github.com/danielquinn/paperless/issues/68
|
||||
.. _#71: https://github.com/danielquinn/paperless/issues/71
|
||||
.. _#94: https://github.com/danielquinn/paperless/issues/94
|
||||
|
@ -45,19 +45,27 @@ you name the file right, it'll automatically set some values in the database
|
||||
for you. This is the logic the consumer follows:
|
||||
|
||||
1. Try to find the correspondent, title, and tags in the file name following
|
||||
the pattern: ``Date - Correspondent - Title - tag,tag,tag.pdf``. Note that
|
||||
the format of the date is **rigidly defined** as ``YYYYMMDDHHMMSSZ`` or
|
||||
``YYYYMMDDZ``. The ``Z`` is for "Zulu time" AKA "UTC".
|
||||
2. If that doesn't work, we skip the date and try
|
||||
the pattern: ``Correspondent - Title - tag,tag,tag.pdf``.
|
||||
2. If that doesn't work, try to find the correspondent and title in the file
|
||||
3. If that doesn't work, we try to find the correspondent and title in the file
|
||||
name following the pattern: ``Correspondent - Title.pdf``.
|
||||
3. If that doesn't work, just assume that the name of the file is the title.
|
||||
4. If that doesn't work, just assume that the name of the file is the title.
|
||||
|
||||
So given the above, the following examples would work as you'd expect:
|
||||
|
||||
* ``20150314000700Z - Some Company Name - Invoice 2016-01-01 - money,invoices.pdf``
|
||||
* ``20150314Z - Some Company Name - Invoice 2016-01-01 - money,invoices.pdf``
|
||||
* ``Some Company Name - Invoice 2016-01-01 - money,invoices.pdf``
|
||||
* ``Another Company - Letter of Reference.jpg``
|
||||
* ``Dad's Recipe for Pancakes.png``
|
||||
|
||||
These however wouldn't work:
|
||||
|
||||
* ``2015-03-14 00:07:00 UTC - Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
|
||||
* ``2015-03-14 - Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
|
||||
* ``Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
|
||||
* ``Another Company- Letter of Reference.jpg``
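The fallback order described above can be sketched in a few lines of Python. This is a simplified illustration, not the project's implementation: the names ``PATTERNS``, ``parse_created`` and ``guess_attributes`` are hypothetical, only four of the patterns are included, and the date is parsed with the standard library instead of ``dateutil``.

```python
import re
from collections import OrderedDict
from datetime import datetime, timezone

# Shared fragments, mirroring the shape of the patterns in this commit.
CREATED = r"^(?P<created>\d{8}(?:\d{6})?Z) - "
EXTENSION = r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$"

# Ordered from most to least specific; the first pattern that matches wins.
PATTERNS = OrderedDict([
    ("created-correspondent-title-tags", re.compile(
        CREATED + r"(?P<correspondent>.*) - (?P<title>.*) - "
        r"(?P<tags>[a-z0-9\-,]*)" + EXTENSION, re.IGNORECASE)),
    ("correspondent-title-tags", re.compile(
        r"^(?P<correspondent>.*) - (?P<title>.*) - "
        r"(?P<tags>[a-z0-9\-,]*)" + EXTENSION, re.IGNORECASE)),
    ("correspondent-title", re.compile(
        r"^(?P<correspondent>.*) - (?P<title>.*)" + EXTENSION,
        re.IGNORECASE)),
    ("title", re.compile(r"^(?P<title>.*)" + EXTENSION, re.IGNORECASE)),
])


def parse_created(stamp):
    """Parse ``YYYYMMDDHHMMSSZ`` or ``YYYYMMDDZ`` into an aware UTC datetime."""
    fmt = "%Y%m%d%H%M%SZ" if len(stamp) == 15 else "%Y%m%dZ"
    return datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)


def guess_attributes(file_name):
    """Return a dict of guessed attributes, or None for unsupported files."""
    for name, pattern in PATTERNS.items():
        match = pattern.match(file_name)
        if match is None:
            continue
        # Groups missing from the matching pattern default to None.
        info = {"created": None, "correspondent": None, "tags": None}
        info.update(match.groupdict())
        if info["created"]:
            info["created"] = parse_created(info["created"])
        info["tags"] = tuple(t for t in (info["tags"] or "").split(",") if t)
        info["extension"] = info["extension"].lower().replace("jpeg", "jpg")
        info["pattern"] = name
        return info
    return None
```

With this sketch, a name such as ``Another Company- Letter of Reference.jpg`` falls all the way through to the ``title`` pattern, because it lacks the space before the hyphen that the separator requires.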
|
||||
|
||||
@ -128,7 +136,7 @@ following name/value pairs:
|
||||
don't start uploading stuff to your server. The means of generating this
|
||||
signature is defined below.
|
||||
|
||||
Specify ``enctype="multipart/form-data"``, and then POST your file with:::
|
||||
Specify ``enctype="multipart/form-data"``, and then POST your file with::
|
||||
|
||||
Content-Disposition: form-data; name="document"; filename="whatever.pdf"
|
||||
|
||||
|
@ -33,4 +33,5 @@ Contents
|
||||
api
|
||||
utilities
|
||||
migrating
|
||||
troubleshooting
|
||||
changelog
|
||||
|
@ -8,7 +8,7 @@ should work) that has the following software installed on it:
|
||||
|
||||
* `Python3`_ (with development libraries, pip and virtualenv)
|
||||
* `GNU Privacy Guard`_
|
||||
* `Tesseract`_
|
||||
* `Tesseract`_, plus its language files matching your document base.
|
||||
* `Imagemagick`_
|
||||
* `unpaper`_
|
||||
|
||||
@ -52,6 +52,7 @@ well as ImageMagick:
|
||||
|
||||
$ brew install ghostscript
|
||||
$ brew install imagemagick
|
||||
$ brew install libmagic
|
||||
|
||||
|
||||
.. _requirements-baremetal:
|
||||
|
207
docs/setup.rst
@ -5,7 +5,8 @@ Setup
|
||||
|
||||
Paperless isn't a very complicated app, but there are a few components, so some
|
||||
basic documentation is in order. If you follow along in this document and
|
||||
still have trouble, please open an `issue on GitHub`_ so I can fill in the gaps.
|
||||
still have trouble, please open an `issue on GitHub`_ so I can fill in the
|
||||
gaps.
|
||||
|
||||
.. _issue on GitHub: https://github.com/danielquinn/paperless/issues
|
||||
|
||||
@ -15,8 +16,8 @@ still have trouble, please open an `issue on GitHub`_ so I can fill in the gaps.
|
||||
Download
|
||||
--------
|
||||
|
||||
The source is currently only available via GitHub, so grab it from there, either
|
||||
by using ``git``:
|
||||
The source is currently only available via GitHub, so grab it from there,
|
||||
either by using ``git``:
|
||||
|
||||
.. code:: bash
|
||||
|
||||
@ -42,15 +43,16 @@ route`_ is quick & easy, but means you're running a VM which comes with memory
|
||||
consumption etc. We also `support Docker`_, which you can use natively under
|
||||
Linux and in a VM with `Docker Machine`_ (this guide was written for native
|
||||
Docker usage under Linux, you might have to adapt it for Docker Machine.)
|
||||
Alternatively the standard, `bare metal`_ approach is a little more complicated,
|
||||
but worth it because it makes it easier to should you want to contribute some
|
||||
code back.
|
||||
Alternatively, the standard `bare metal`_ approach is a little more
complicated, but worth it because it makes things easier should you want to
|
||||
contribute some code back.
|
||||
|
||||
.. _Vagrant route: setup-installation-vagrant_
|
||||
.. _support Docker: setup-installation-docker_
|
||||
.. _bare metal: setup-installation-standard_
|
||||
.. _Docker Machine: https://docs.docker.com/machine/
|
||||
|
||||
|
||||
.. _setup-installation-standard:
|
||||
|
||||
Standard (Bare Metal)
|
||||
@ -58,19 +60,16 @@ Standard (Bare Metal)
|
||||
|
||||
1. Install the requirements as per the :ref:`requirements <requirements>` page.
|
||||
2. Change to the ``src`` directory in this repo.
|
||||
3. Edit ``paperless/settings.py`` and be sure to set the values for:
|
||||
* ``CONSUMPTION_DIR``: this is where your documents will be dumped to be
|
||||
consumed by Paperless.
|
||||
* ``PASSPHRASE``: this is the passphrase Paperless uses to encrypt/decrypt
|
||||
the original document. The default value attempts to source the
|
||||
passphrase from the environment, so if you don't set it to a static value
|
||||
here, you must set ``PAPERLESS_PASSPHRASE=some-secret-string`` on the
|
||||
command line whenever invoking the consumer or webserver.
|
||||
* ``OCR_THREADS``: this is the number of threads the OCR process will spawn
|
||||
to process document pages in parallel. The default value gets sourced from
|
||||
the environment-variable ``PAPERLESS_OCR_THREADS`` and expects it to be an
|
||||
integer. If the variable is not set, Python determines the core-count of
|
||||
your CPU and uses that value.
|
||||
3. Copy ``paperless.conf.example`` to ``/etc/paperless.conf`` and open it in
|
||||
your favourite editor. Set the values for:
|
||||
|
||||
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
||||
dumped to be consumed by Paperless.
|
||||
* ``PAPERLESS_PASSPHRASE``: this is the passphrase Paperless uses to
|
||||
encrypt/decrypt the original document.
|
||||
* ``PAPERLESS_OCR_THREADS``: this is the number of threads the OCR process
|
||||
will spawn to process document pages in parallel.
|
||||
|
||||
4. Initialise the database with ``./manage.py migrate``.
|
||||
5. Create a user for your Paperless instance with
|
||||
``./manage.py createsuperuser``. Follow the prompts to create your user.
|
||||
@ -79,8 +78,8 @@ Standard (Bare Metal)
|
||||
You should now be able to visit your (empty) `Paperless webserver`_ at
|
||||
``127.0.0.1:8000`` (or whatever you chose). You can login with the
|
||||
user/pass you created in #5.
|
||||
7. In a separate window, change to the ``src`` directory in this repo again, but
|
||||
this time, you should start the consumer script with
|
||||
7. In a separate window, change to the ``src`` directory in this repo again,
|
||||
but this time, you should start the consumer script with
|
||||
``./manage.py document_consumer``.
|
||||
8. Scan something. Put it in the ``CONSUMPTION_DIR``.
|
||||
9. Wait a few minutes
|
||||
@ -100,6 +99,7 @@ Vagrant Method
|
||||
provisioned...
|
||||
3. Run ``vagrant ssh`` and once inside your new vagrant box, edit
|
||||
``/etc/paperless.conf`` and set the values for:
|
||||
|
||||
* ``PAPERLESS_CONSUMPTION_DIR``: this is where your documents will be
|
||||
dumped to be consumed by Paperless.
|
||||
* ``PAPERLESS_PASSPHRASE``: this is the passphrase Paperless uses to
|
||||
@ -107,6 +107,7 @@ Vagrant Method
|
||||
* ``PAPERLESS_SHARED_SECRET``: this is the "magic word" used when consuming
|
||||
documents from mail or via the API. If you don't use either, leaving it
|
||||
blank is just fine.
|
||||
|
||||
4. Exit the vagrant box and re-enter it with ``vagrant ssh`` again. This
|
||||
updates the environment to make use of the changes you made to the config
|
||||
file.
|
||||
@ -140,9 +141,9 @@ Docker Method
|
||||
.. caution::
|
||||
|
||||
As mentioned earlier, this guide assumes that you use Docker natively
|
||||
under Linux. If you are using `Docker Machine`_ under Mac OS X or Windows,
|
||||
you will have to adapt IP addresses, volume-mounting, command execution
|
||||
and maybe more.
|
||||
under Linux. If you are using `Docker Machine`_ under Mac OS X or
|
||||
Windows, you will have to adapt IP addresses, volume-mounting, command
|
||||
execution and maybe more.
|
||||
|
||||
2. Install `docker-compose`_. [#compose]_
|
||||
|
||||
@ -161,14 +162,14 @@ Docker Method
|
||||
.. _Docker installation guide: https://docs.docker.com/engine/installation/
|
||||
.. _docker-compose installation guide: https://docs.docker.com/compose/install/
|
||||
|
||||
3. Create a copy of ``docker-compose.yml.example`` as ``docker-compose.yml`` and
|
||||
a copy of ``docker-compose.env.example`` as ``docker-compose.env``. You'll be
|
||||
editing both these files: taking a copy ensures that you can ``git pull`` to
|
||||
receive updates without risking merge conflicts with your modified versions
|
||||
of the configuration files.
|
||||
4. Modify ``docker-compose.yml`` to your preferences, following the instructions
|
||||
in comments in the file. The only change that is a hard requirement is to
|
||||
specify where the consumption directory should mount.
|
||||
3. Create a copy of ``docker-compose.yml.example`` as ``docker-compose.yml``
|
||||
and a copy of ``docker-compose.env.example`` as ``docker-compose.env``.
|
||||
You'll be editing both these files: taking a copy ensures that you can
|
||||
``git pull`` to receive updates without risking merge conflicts with your
|
||||
modified versions of the configuration files.
|
||||
4. Modify ``docker-compose.yml`` to your preferences, following the
|
||||
instructions in comments in the file. The only change that is a hard
|
||||
requirement is to specify where the consumption directory should mount.
|
||||
5. Modify ``docker-compose.env`` and adapt the following environment variables:
|
||||
|
||||
``PAPERLESS_PASSPHRASE``
|
||||
@ -181,10 +182,11 @@ Docker Method
|
||||
the core-count of your CPU and uses that value.
|
||||
|
||||
``PAPERLESS_OCR_LANGUAGES``
|
||||
If you want the OCR to recognize other languages in addition to the default
|
||||
English, set this parameter to a space separated list of three-letter
|
||||
language-codes after `ISO 639-2/T`_. For a list of available languages --
|
||||
including their three letter codes -- see the `Debian packagelist`_.
|
||||
If you want the OCR to recognize other languages in addition to the
|
||||
default English, set this parameter to a space-separated list of
three-letter language codes as defined by `ISO 639-2/T`_. For a list of available
|
||||
languages -- including their three letter codes -- see the
|
||||
`Debian packagelist`_.
|
||||
|
||||
``USERMAP_UID`` and ``USERMAP_GID``
|
||||
If you want to mount the consumption volume (directory ``/consume`` within
|
||||
@ -192,11 +194,11 @@ Docker Method
|
||||
access rights might be an issue. The default user and group ``paperless``
|
||||
in the containers have an id of 1000. The containers will enforce that the
|
||||
owning group of the consumption directory will be ``paperless`` to be able
|
||||
to delete consumed documents. If your host-system has a group with an id of
|
||||
1000 and you don't want this group to have access rights to the consumption
|
||||
directory, you can use ``USERMAP_GID`` to change the id in the container
|
||||
and thus the one of the consumption directory. Furthermore, you can change
|
||||
the id of the default user as well using ``USERMAP_UID``.
|
||||
to delete consumed documents. If your host-system has a group with an ID
|
||||
of 1000 and you don't want this group to have access rights to the
|
||||
consumption directory, you can use ``USERMAP_GID`` to change the id in the
|
||||
container and thus the one of the consumption directory. Furthermore, you
|
||||
can change the id of the default user as well using ``USERMAP_UID``.
|
||||
|
||||
6. Run ``docker-compose up -d``. This will create and start the necessary
|
||||
containers.
|
||||
@ -234,14 +236,14 @@ Docker Method
|
||||
.. danger::
|
||||
|
||||
While the consumption container will ensure at startup that it can
|
||||
**delete** a consumed file from a host-mounted directory, it might not
|
||||
be able to **read** the document in the first place if the access
|
||||
**delete** a consumed file from a host-mounted directory, it might
|
||||
not be able to **read** the document in the first place if the access
|
||||
rights to the file are incorrect.
|
||||
|
||||
Make sure that the documents you put into the consumption directory
|
||||
will either be readable by everyone (``chmod o+r file.pdf``) or
|
||||
readable by the default user or group id 1000 (or the one you have set
|
||||
with ``USERMAP_UID`` or ``USERMAP_GID`` respectively).
|
||||
readable by the default user or group id 1000 (or the one you have
|
||||
set with ``USERMAP_UID`` or ``USERMAP_GID`` respectively).
|
||||
|
||||
2. Use ``docker cp`` to copy your files directly into the container:
|
||||
|
||||
@ -258,8 +260,8 @@ Docker Method
|
||||
|
||||
``docker cp`` is a one-shot-command, just like ``cp``. This means that
|
||||
every time you want to consume a new document, you will have to execute
|
||||
``docker cp`` again. You can of course automate this process, but option 1
|
||||
is generally the preferred one.
|
||||
``docker cp`` again. You can of course automate this process, but option
|
||||
1 is generally the preferred one.
|
||||
|
||||
.. danger::
|
||||
|
||||
@ -267,8 +269,8 @@ Docker Method
|
||||
to the acting user at the destination, which will be ``root``.
|
||||
|
||||
You therefore need to ensure that the documents you want to copy into
|
||||
the container are readable by everyone (``chmod o+r file.pdf``) before
|
||||
copying them.
|
||||
the container are readable by everyone (``chmod o+r file.pdf``)
|
||||
before copying them.
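Both warnings above come down to whether the user inside the container can read the file. A rough pre-flight check can be sketched as follows. It assumes the default id of 1000 mentioned earlier, ignores supplementary groups and ACLs, and the function name ``readable_by_consumer`` is purely illustrative and not part of Paperless.

```python
import os
import stat


def readable_by_consumer(path, uid=1000, gid=1000):
    """Approximate the POSIX read check for the given numeric uid/gid.

    Only one permission class applies: owner if the uid matches, else
    group if the gid matches, else "other".
    """
    st = os.stat(path)
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IRUSR)
    if st.st_gid == gid:
        return bool(st.st_mode & stat.S_IRGRP)
    return bool(st.st_mode & stat.S_IROTH)
```

If the check returns ``False`` for a document, ``chmod o+r file.pdf`` (as suggested above) is usually the simplest fix.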
|
||||
|
||||
|
||||
.. _Docker: https://www.docker.com/
|
||||
@ -281,17 +283,108 @@ Docker Method
|
||||
free to tinker around without using compose!
|
||||
|
||||
|
||||
.. _making-things-a-little-more-permanent:
|
||||
.. _setup-permanent:
|
||||
|
||||
Making Things a Little more Permanent
|
||||
-------------------------------------
|
||||
|
||||
Once you've tested things and are happy with the work flow, you can automate the
|
||||
process of starting the webserver and consumer automatically. If you're running
|
||||
on a bare metal system that's using Systemd, you can use the service unit files
|
||||
in the ``scripts`` directory to set this up. If you're on another startup
|
||||
system or are using a Vagrant box, then you're currently on your own. If you are
|
||||
using Docker, you can set a restart-policy_ in the ``docker-compose.yml`` to
|
||||
have the containers automatically start with the Docker daemon.
|
||||
Once you've tested things and are happy with the work flow, you can automate
|
||||
the process of starting the webserver and consumer.
|
||||
|
||||
|
||||
.. _setup-permanent-standard-systemd:
|
||||
|
||||
Standard (Bare Metal, Systemd)
|
||||
..............................
|
||||
|
||||
If you're running on a bare metal system that's using Systemd, you can use the
|
||||
service unit files in the ``scripts`` directory to set this up. You'll need to
|
||||
create a user called ``paperless`` and set up Paperless in a place that
|
||||
this new user can read and write to. Then, you can just tell Systemd to enable
|
||||
the two ``.service`` files::
|
||||
|
||||
# systemctl enable /path/to/paperless/scripts/paperless-consumer.service
|
||||
# systemctl enable /path/to/paperless/scripts/paperless-webserver.service
|
||||
# systemctl start paperless-consumer.service
# systemctl start paperless-webserver.service
|
||||
|
||||
|
||||
.. _setup-permanent-standard-ubuntu14:
|
||||
|
||||
Ubuntu 14.04 (Bare Metal, Upstart)
|
||||
..................................
|
||||
|
||||
Ubuntu 14.04 and earlier use the `Upstart`_ init system to start services
|
||||
during the boot process. To configure Upstart to run Paperless automatically
|
||||
after restarting your system:
|
||||
|
||||
1. Change to the directory where Upstart's configuration files are kept:
|
||||
``cd /etc/init``
|
||||
2. Create a new file: ``sudo nano paperless-server.conf``
|
||||
3. In the newly-created file enter::
|
||||
|
||||
start on (local-filesystems and net-device-up IFACE=eth0)
|
||||
stop on shutdown
|
||||
|
||||
respawn
|
||||
respawn limit 10 5
|
||||
|
||||
script
|
||||
exec /srv/paperless/src/manage.py runserver 0.0.0.0:80
|
||||
end script
|
||||
|
||||
Note that you'll need to replace ``/srv/paperless/src/manage.py`` with the
|
||||
path to the ``manage.py`` script in your installation directory.
|
||||
|
||||
If you are using a network interface other than ``eth0``, you will have to
|
||||
change ``IFACE=eth0``. For example, if you are connected via WiFi, you will
|
||||
likely need to replace ``eth0`` above with ``wlan0``. To see all interfaces,
|
||||
run ``ifconfig``.
|
||||
|
||||
Save the file.
|
||||
|
||||
4. Create a new file: ``sudo nano paperless-consumer.conf``
|
||||
|
||||
5. In the newly-created file enter::
|
||||
|
||||
start on (local-filesystems and net-device-up IFACE=eth0)
|
||||
stop on shutdown
|
||||
|
||||
respawn
|
||||
respawn limit 10 5
|
||||
|
||||
script
|
||||
exec /srv/paperless/src/manage.py document_consumer
|
||||
end script
|
||||
|
||||
Replace ``/srv/paperless/src/manage.py`` with the same values as in step 3
|
||||
above and replace ``eth0`` with the appropriate value, if necessary. Save the
|
||||
file.
|
||||
|
||||
These two configuration files together will start both the Paperless webserver
|
||||
and document consumer processes when the file system and network interface
|
||||
specified are available after boot. Furthermore, if either process ever exits
|
||||
unexpectedly, Upstart will try to restart it a maximum of 10 times within a 5
|
||||
second period.
|
||||
|
||||
.. _Upstart: http://upstart.ubuntu.com/
|
||||
|
||||
|
||||
.. _setup-permanent-vagrant:
|
||||
|
||||
Vagrant
|
||||
.......
|
||||
|
||||
You're currently on your own, but the Ubuntu explanation above may be enough.
|
||||
|
||||
|
||||
.. _setup-permanent-docker:
|
||||
|
||||
Docker
|
||||
......
|
||||
|
||||
If you're using Docker, you can set a restart-policy_ in the
|
||||
``docker-compose.yml`` to have the containers automatically start with the
|
||||
Docker daemon.
|
||||
|
||||
.. _restart-policy: https://docs.docker.com/engine/reference/commandline/run/#restart-policies-restart
|
||||
|
19
docs/troubleshooting.rst
Normal file
@ -0,0 +1,19 @@
|
||||
.. _troubleshooting:
|
||||
|
||||
Troubleshooting
|
||||
===============
|
||||
|
||||
.. _troubleshooting_ocr_language_files_missing:
|
||||
|
||||
Consumer warns ``OCR for XX failed``
|
||||
------------------------------------
|
||||
|
||||
If you find the OCR accuracy to be too low, and/or the document consumer warns that ``OCR for
|
||||
XX failed, but we're going to stick with what we've got since FORGIVING_OCR is enabled``, then you
|
||||
might need to install the `Tesseract language files
|
||||
<http://packages.ubuntu.com/search?keywords=tesseract-ocr>`_ matching your documents' languages.
|
||||
|
||||
As an example, if you are running Paperless from the Vagrant setup provided (or from any Ubuntu or Debian
|
||||
box), and your documents are written in Spanish you may need to run::
|
||||
|
||||
apt-get install -y tesseract-ocr-spa
|
@ -20,7 +20,7 @@ PAPERLESS_CONSUME_MAIL_PASS=""
|
||||
#
|
||||
# The passphrase you use here will be used when storing your documents in
|
||||
# Paperless, but you can always export them in an unencrypted format by using
|
||||
# document exporter. See the documentaiton for more information.
|
||||
# the document exporter. See the documentation for more information.
|
||||
#
|
||||
# One final note about the passphrase. Once you've consumed a document with
|
||||
# one passphrase, DON'T CHANGE IT. Paperless assumes this to be a constant and
|
||||
@ -31,3 +31,8 @@ PAPERLESS_PASSPHRASE="secret"
|
||||
# If you intend to consume documents either via HTTP POST or by email, you must
|
||||
# have a shared secret here.
|
||||
PAPERLESS_SHARED_SECRET=""
|
||||
|
||||
# By default, Paperless will attempt to use all available CPU cores to process
|
||||
# a document, but if you would like to limit that, you can set this value to
|
||||
# an integer:
|
||||
#PAPERLESS_OCR_THREADS=1
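The documented behaviour for this setting (an integer from the environment, falling back to the CPU core count) could look roughly like the sketch below. The settings code itself is not part of this commit, so the function name ``ocr_threads`` and the handling of an empty value are assumptions.

```python
import multiprocessing
import os


def ocr_threads(environ=os.environ):
    """Return the configured OCR thread count, or the CPU core count."""
    raw = environ.get("PAPERLESS_OCR_THREADS")
    # Assumption: an unset or empty variable means "use every core".
    if raw is None or raw.strip() == "":
        return multiprocessing.cpu_count()
    # A non-integer value raises ValueError, which surfaces typos early.
    return int(raw)
```

Passing the environment as a parameter keeps the function easy to try out with a plain dict, for example ``ocr_threads({"PAPERLESS_OCR_THREADS": "2"})``.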
|
||||
|
BIN
presentation/img/kitten.jpg
Normal file
Binary file not shown.
After Width: | Height: | Size: 92 KiB |
@ -148,12 +148,12 @@
|
||||
|
||||
<section data-background="img/pony.png">
|
||||
<h2>Demo!</h2>
|
||||
<p>(Time to sacrifice a kitten)</p>
|
||||
<img src="img/kitten.jpg" style="width: 50%;" />
|
||||
</section>
|
||||
|
||||
<section>
|
||||
<h2>TODO</h2>
|
||||
<p>It works, but it could use polish</p>
|
||||
<p>It works, but it needs polish</p>
|
||||
<ul>
|
||||
<li>The UI is the Django admin</li>
|
||||
<li>Mail consumption is really raw</li>
|
||||
@ -163,11 +163,11 @@
|
||||
<aside class="notes">
|
||||
<ul>
|
||||
<li>
|
||||
<strong>Plugin architecture</strong>: there've been requests for
|
||||
some overly custom stuff to happen before and after consumption,
|
||||
but in the UNIX spirit of "do one job well", I think this sort
|
||||
of thing is better written as a plugin -- which means I need to
|
||||
figure out a best practise for that.
|
||||
<strong>Plugin architecture</strong>: there've been requests
|
||||
for some overly custom stuff to happen before and after
|
||||
consumption, but in the UNIX spirit of "do one job well", I
|
||||
think this sort of thing is better written as a plugin -- which
|
||||
means I need to figure out a best practise for that.
|
||||
</li>
|
||||
</ul>
|
||||
</aside>
|
||||
|
@ -1,4 +1,4 @@
|
||||
Django==1.9.2
|
||||
Django==1.9.4
|
||||
Pillow==3.1.1
|
||||
django-crispy-forms==1.6.0
|
||||
django-extensions==1.6.1
|
||||
|
@ -19,12 +19,11 @@ from PIL import Image
|
||||
|
||||
from django.conf import settings
|
||||
from django.utils import timezone
|
||||
from django.template.defaultfilters import slugify
|
||||
from pyocr.tesseract import TesseractError
|
||||
|
||||
from paperless.db import GnuPG
|
||||
|
||||
from .models import Correspondent, Tag, Document, Log
|
||||
from .models import Tag, Document, Log, FileInfo
|
||||
from .languages import ISO639
|
||||
from .signals import (
|
||||
document_consumption_started, document_consumption_finished)
|
||||
@ -56,19 +55,6 @@ class Consumer(object):
|
||||
|
||||
DEFAULT_OCR_LANGUAGE = settings.OCR_LANGUAGE
|
||||
|
||||
REGEX_TITLE = re.compile(
|
||||
r"^.*/(.*)\.(pdf|jpe?g|png|gif|tiff)$",
|
||||
flags=re.IGNORECASE
|
||||
)
|
||||
REGEX_CORRESPONDENT_TITLE = re.compile(
|
||||
r"^.*/(.+) - (.*)\.(pdf|jpe?g|png|gif|tiff)$",
|
||||
flags=re.IGNORECASE
|
||||
)
|
||||
REGEX_CORRESPONDENT_TITLE_TAGS = re.compile(
|
||||
r"^.*/(.*) - (.*) - ([a-z0-9\-,]*)\.(pdf|jpe?g|png|gif|tiff)$",
|
||||
flags=re.IGNORECASE
|
||||
)
|
||||
|
||||
def __init__(self):
|
||||
|
||||
self.logger = logging.getLogger(__name__)
|
||||
@ -107,7 +93,7 @@ class Consumer(object):
|
||||
if not os.path.isfile(doc):
|
||||
continue
|
||||
|
||||
if not re.match(self.REGEX_TITLE, doc):
|
||||
if not re.match(FileInfo.REGEXES["title"], doc):
|
||||
continue
|
||||
|
||||
if doc in self._ignore:
|
||||
@ -282,72 +268,20 @@ class Consumer(object):
|
||||
# Strip out excess white space to allow matching to go smoother
|
||||
return re.sub(r"\s+", " ", r)
|
||||
|
||||
def _guess_attributes_from_name(self, parseable):
|
||||
"""
|
||||
We use a crude naming convention to make handling the correspondent,
|
||||
title, and tags easier:
|
||||
"<correspondent> - <title> - <tags>.<suffix>"
|
||||
"<correspondent> - <title>.<suffix>"
|
||||
"<title>.<suffix>"
|
||||
"""
|
||||
|
||||
def get_correspondent(correspondent_name):
|
||||
return Correspondent.objects.get_or_create(
|
||||
name=correspondent_name,
|
||||
defaults={"slug": slugify(correspondent_name)}
|
||||
)[0]
|
||||
|
||||
def get_tags(tags):
|
||||
r = []
|
||||
for t in tags.split(","):
|
||||
r.append(
|
||||
Tag.objects.get_or_create(slug=t, defaults={"name": t})[0])
|
||||
return tuple(r)
|
||||
|
||||
def get_suffix(suffix):
|
||||
suffix = suffix.lower()
|
||||
if suffix == "jpeg":
|
||||
return "jpg"
|
||||
return suffix
|
||||
|
||||
# First attempt: "<correspondent> - <title> - <tags>.<suffix>"
|
||||
m = re.match(self.REGEX_CORRESPONDENT_TITLE_TAGS, parseable)
|
||||
if m:
|
||||
return (
|
||||
get_correspondent(m.group(1)),
|
||||
m.group(2),
|
||||
get_tags(m.group(3)),
|
||||
get_suffix(m.group(4))
|
||||
)
|
||||
|
||||
# Second attempt: "<correspondent> - <title>.<suffix>"
|
||||
m = re.match(self.REGEX_CORRESPONDENT_TITLE, parseable)
|
||||
if m:
|
||||
return (
|
||||
get_correspondent(m.group(1)),
|
||||
m.group(2),
|
||||
(),
|
||||
get_suffix(m.group(3))
|
||||
)
|
||||
|
||||
# That didn't work, so we assume correspondent and tags are None
|
||||
m = re.match(self.REGEX_TITLE, parseable)
|
||||
return None, m.group(1), (), get_suffix(m.group(2))
|
||||
|
||||
def _store(self, text, doc, thumbnail):
|
||||
|
||||
sender, title, tags, file_type = self._guess_attributes_from_name(doc)
|
||||
relevant_tags = set(list(Tag.match_all(text)) + list(tags))
|
||||
file_info = FileInfo.from_path(doc)
|
||||
relevant_tags = set(list(Tag.match_all(text)) + list(file_info.tags))
|
||||
|
||||
stats = os.stat(doc)
|
||||
|
||||
self.log("debug", "Saving record to database")
|
||||
|
||||
document = Document.objects.create(
|
||||
correspondent=sender,
|
||||
title=title,
|
||||
correspondent=file_info.correspondent,
|
||||
title=file_info.title,
|
||||
content=text,
|
||||
file_type=file_type,
|
||||
file_type=file_info.extension,
|
||||
created=timezone.make_aware(
|
||||
datetime.datetime.fromtimestamp(stats.st_mtime)),
|
||||
modified=timezone.make_aware(
|
||||
|
@ -96,11 +96,16 @@ class Command(Renderable, BaseCommand):
|
||||
|
||||
@staticmethod
|
||||
def _get_legacy_file_name(doc):
|
||||
if doc.correspondent and doc.title:
|
||||
tags = ",".join([t.slug for t in doc.tags.all()])
|
||||
if tags:
|
||||
return "{} - {} - {}.{}".format(
|
||||
doc.correspondent, doc.title, tags, doc.file_type)
|
||||
return "{} - {}.{}".format(
|
||||
doc.correspondent, doc.title, doc.file_type)
|
||||
return os.path.basename(doc.source_path)
|
||||
|
||||
if not doc.correspondent and not doc.title:
|
||||
return os.path.basename(doc.source_path)
|
||||
|
||||
created = doc.created.strftime("%Y%m%d%H%M%SZ")
|
||||
tags = ",".join([t.slug for t in doc.tags.all()])
|
||||
|
||||
if tags:
|
||||
return "{} - {} - {} - {}.{}".format(
|
||||
created, doc.correspondent, doc.title, tags, doc.file_type)
|
||||
|
||||
return "{} - {} - {}.{}".format(
|
||||
created, doc.correspondent, doc.title, doc.file_type)
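The naming scheme produced by the new export code can be tried outside Django with a small stand-alone sketch. It takes plain values instead of a ``Document`` instance, and the function name ``export_file_name`` and the explicit UTC conversion are assumptions made for illustration, not part of this commit.

```python
from datetime import datetime, timezone


def export_file_name(created, correspondent, title, tags, file_type):
    """Build "<created> - <correspondent> - <title>[ - <tags>].<type>"."""
    if created.tzinfo is not None:
        # Assumption: the trailing "Z" should really mean UTC.
        created = created.astimezone(timezone.utc)
    parts = [created.strftime("%Y%m%d%H%M%SZ"), correspondent, title]
    if tags:
        parts.append(",".join(tags))
    return "{}.{}".format(" - ".join(parts), file_type)
```

A name built this way follows the ``Date - Correspondent - Title - tag,tag,tag.pdf`` convention that the consumer documentation describes, so an exported file should be re-importable with its metadata intact.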
|
||||
|
@ -1,8 +1,11 @@
|
||||
import dateutil.parser
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import uuid
|
||||
|
||||
from collections import OrderedDict
|
||||
|
||||
from django.conf import settings
|
||||
from django.core.urlresolvers import reverse
|
||||
from django.db import models
|
||||
@ -152,7 +155,7 @@ class Document(models.Model):
|
||||
)
|
||||
tags = models.ManyToManyField(
|
||||
Tag, related_name="documents", blank=True)
|
||||
created = models.DateTimeField(default=timezone.now, editable=False)
|
||||
created = models.DateTimeField(default=timezone.now)
|
||||
modified = models.DateTimeField(auto_now=True, editable=False)
|
||||
|
||||
class Meta(object):
|
||||
@ -250,3 +253,136 @@ class Log(models.Model):
|
||||
self.group = uuid.uuid4()
|
||||
|
||||
models.Model.save(self, *args, **kwargs)
|
||||
|
||||
|
||||
class FileInfo(object):

    # This epic regex *almost* worked for our needs, so I'm keeping it here for
    # posterity, in the hopes that we might find a way to make it work one day.
    ALMOST_REGEX = re.compile(
        r"^((?P<date>\d\d\d\d\d\d\d\d\d\d\d\d\d\dZ){separator})?"
        r"((?P<correspondent>{non_separated_word}+){separator})??"
        r"(?P<title>{non_separated_word}+)"
        r"({separator}(?P<tags>[a-z,0-9-]+))?"
        r"\.(?P<extension>[a-zA-Z.-]+)$".format(
            separator=r"\s+-\s+",
            non_separated_word=r"([\w,. ]|([^\s]-))"
        )
    )

    REGEXES = OrderedDict([
        ("created-correspondent-title-tags", re.compile(
            r"^(?P<created>\d\d\d\d\d\d\d\d(\d\d\d\d\d\d)?Z) - "
            r"(?P<correspondent>.*) - "
            r"(?P<title>.*) - "
            r"(?P<tags>[a-z0-9\-,]*)"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        )),
        ("created-title-tags", re.compile(
            r"^(?P<created>\d\d\d\d\d\d\d\d(\d\d\d\d\d\d)?Z) - "
            r"(?P<title>.*) - "
            r"(?P<tags>[a-z0-9\-,]*)"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        )),
        ("created-correspondent-title", re.compile(
            r"^(?P<created>\d\d\d\d\d\d\d\d(\d\d\d\d\d\d)?Z) - "
            r"(?P<correspondent>.*) - "
            r"(?P<title>.*)"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        )),
        ("created-title", re.compile(
            r"^(?P<created>\d\d\d\d\d\d\d\d(\d\d\d\d\d\d)?Z) - "
            r"(?P<title>.*)"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        )),
        ("correspondent-title-tags", re.compile(
            r"(?P<correspondent>.*) - "
            r"(?P<title>.*) - "
            r"(?P<tags>[a-z0-9\-,]*)"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        )),
        ("correspondent-title", re.compile(
            r"(?P<correspondent>.*) - "
            r"(?P<title>.*)?"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        )),
        ("title", re.compile(
            r"(?P<title>.*)"
            r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$",
            flags=re.IGNORECASE
        ))
    ])

    def __init__(self, created=None, correspondent=None, title=None, tags=(),
                 extension=None):

        self.created = created
        self.title = title
        self.extension = extension
        self.correspondent = correspondent
        self.tags = tags

    @classmethod
    def _get_created(cls, created):
        return dateutil.parser.parse("{:0<14}Z".format(created[:-1]))

    @classmethod
    def _get_correspondent(cls, name):
        if not name:
            return None
        return Correspondent.objects.get_or_create(name=name, defaults={
            "slug": slugify(name)
        })[0]

    @classmethod
    def _get_title(cls, title):
        return title

    @classmethod
    def _get_tags(cls, tags):
        r = []
        for t in tags.split(","):
            r.append(
                Tag.objects.get_or_create(slug=t, defaults={"name": t})[0])
        return tuple(r)

    @classmethod
    def _get_extension(cls, extension):
        r = extension.lower()
        if r == "jpeg":
            return "jpg"
        return r

    @classmethod
    def _mangle_property(cls, properties, name):
        if name in properties:
            properties[name] = getattr(cls, "_get_{}".format(name))(
                properties[name]
            )

    @classmethod
    def from_path(cls, path):
        """
        We use a crude naming convention to make handling the correspondent,
        title, and tags easier:
          "<correspondent> - <title> - <tags>.<suffix>"
          "<correspondent> - <title>.<suffix>"
          "<title>.<suffix>"
        """

        for regex in cls.REGEXES.values():
            m = regex.match(os.path.basename(path))
            if m:
                properties = m.groupdict()
                cls._mangle_property(properties, "created")
                cls._mangle_property(properties, "correspondent")
                cls._mangle_property(properties, "title")
                cls._mangle_property(properties, "tags")
                cls._mangle_property(properties, "extension")
                return cls(**properties)
|
||||
|
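The first-match-wins lookup in `FileInfo.from_path` can be sketched without Django or a database. This is an illustration under stated assumptions, not the project's code: it copies three of the seven patterns from `REGEXES`, the names `PATTERNS`, `parse_created`, and `guess` are hypothetical, and `datetime.strptime` stands in for the `dateutil.parser.parse` call so that the example only needs the standard library.

```python
import re
from collections import OrderedDict
from datetime import datetime

CREATED = r"^(?P<created>\d\d\d\d\d\d\d\d(\d\d\d\d\d\d)?Z) - "
TAGS = r"(?P<tags>[a-z0-9\-,]*)"
EXT = r"\.(?P<extension>pdf|jpe?g|png|gif|tiff)$"

# Most specific pattern first: the loop below stops at the first match.
PATTERNS = OrderedDict([
    ("created-correspondent-title-tags", re.compile(
        CREATED + r"(?P<correspondent>.*) - (?P<title>.*) - " + TAGS + EXT,
        flags=re.IGNORECASE)),
    ("correspondent-title", re.compile(
        r"(?P<correspondent>.*) - (?P<title>.*)?" + EXT,
        flags=re.IGNORECASE)),
    ("title", re.compile(
        r"(?P<title>.*)" + EXT,
        flags=re.IGNORECASE)),
])


def parse_created(created):
    # Drop the trailing "Z", then right-pad a date-only value with zeros so
    # "20150102" becomes "20150102000000" (midnight).
    return datetime.strptime("{:0<14}".format(created[:-1]), "%Y%m%d%H%M%S")


def guess(name):
    """Return (pattern label, properties) for the first matching pattern."""
    for label, regex in PATTERNS.items():
        m = regex.match(name)
        if m:
            props = m.groupdict()
            if "created" in props:
                props["created"] = parse_created(props["created"])
            ext = props["extension"].lower()
            props["extension"] = "jpg" if ext == "jpeg" else ext
            return label, props
    return None, {}


print(guess("20150102Z - timmy - Invoice - tax,2015.pdf"))
print(guess("timmy - Invoice.pdf"))
print(guess("Scan.JPEG"))
```

Because every pattern uses greedy `.*` groups, the order of the dictionary decides how an ambiguous name is split, which is why the patterns with more fields come first.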
@ -1,29 +1,36 @@
|
||||
from django.test import TestCase
|
||||
|
||||
from ..consumer import Consumer
|
||||
from ..models import Document, FileInfo
|
||||
|
||||
|
||||
class TestAttachment(TestCase):
|
||||
|
||||
TAGS = ("tag1", "tag2", "tag3")
|
||||
CONSUMER = Consumer()
|
||||
SUFFIXES = (
|
||||
EXTENSIONS = (
|
||||
"pdf", "png", "jpg", "jpeg", "gif",
|
||||
"PDF", "PNG", "JPG", "JPEG", "GIF",
|
||||
"PdF", "PnG", "JpG", "JPeG", "GiF",
|
||||
)
|
||||
|
||||
def _test_guess_attributes_from_name(self, path, sender, title, tags):
|
||||
for suffix in self.SUFFIXES:
|
||||
f = path.format(suffix)
|
||||
results = self.CONSUMER._guess_attributes_from_name(f)
|
||||
self.assertEqual(results[0].name, sender, f)
|
||||
self.assertEqual(results[1], title, f)
|
||||
self.assertEqual(tuple([t.slug for t in results[2]]), tags, f)
|
||||
if suffix.lower() == "jpeg":
|
||||
self.assertEqual(results[3], "jpg", f)
|
||||
|
||||
for extension in self.EXTENSIONS:
|
||||
|
||||
f = path.format(extension)
|
||||
file_info = FileInfo.from_path(f)
|
||||
|
||||
if sender:
|
||||
self.assertEqual(file_info.correspondent.name, sender, f)
|
||||
else:
|
||||
self.assertEqual(results[3], suffix.lower(), f)
|
||||
self.assertIsNone(file_info.correspondent, f)
|
||||
|
||||
self.assertEqual(file_info.title, title, f)
|
||||
|
||||
self.assertEqual(tuple([t.slug for t in file_info.tags]), tags, f)
|
||||
if extension.lower() == "jpeg":
|
||||
self.assertEqual(file_info.extension, "jpg", f)
|
||||
else:
|
||||
self.assertEqual(file_info.extension, extension.lower(), f)
|
||||
|
||||
def test_guess_attributes_from_name0(self):
|
||||
self._test_guess_attributes_from_name(
|
||||
@ -92,3 +99,206 @@ class TestAttachment(TestCase):
|
||||
"Τιτλε",
|
||||
self.TAGS
|
||||
)
|
||||
|
||||
    def test_guess_attributes_from_name_when_correspondent_empty(self):
        self._test_guess_attributes_from_name(
            '/path/to/ - weird empty correspondent but should not break.{}',
            None,
            'weird empty correspondent but should not break',
            ()
        )

    def test_guess_attributes_from_name_when_title_starts_with_dash(self):
        self._test_guess_attributes_from_name(
            '/path/to/- weird but should not break.{}',
            None,
            '- weird but should not break',
            ()
        )

    def test_guess_attributes_from_name_when_title_ends_with_dash(self):
        self._test_guess_attributes_from_name(
            '/path/to/weird but should not break -.{}',
            None,
            'weird but should not break -',
            ()
        )

    def test_guess_attributes_from_name_when_title_is_empty(self):
        self._test_guess_attributes_from_name(
            '/path/to/weird correspondent but should not break - .{}',
            'weird correspondent but should not break',
            '',
            ()
        )
|
||||
|
||||
|
||||
class Permutations(TestCase):

    valid_dates = (
        "20150102030405Z",
        "20150102Z",
    )
    valid_correspondents = [
        "timmy",
        "Dr. McWheelie",
        "Dash Gor-don",
        "ο Θερμαστής",
        ""
    ]
    valid_titles = ["title", "Title w Spaces", "Title a-dash", "Τίτλος", ""]
    valid_tags = ["tag", "tig,tag", "tag1,tag2,tag-3"]
    valid_extensions = ["pdf", "png", "jpg", "jpeg", "gif"]

    def _test_guessed_attributes(self, filename, created=None,
                                 correspondent=None, title=None,
                                 extension=None, tags=None):

        # print(filename)
        info = FileInfo.from_path(filename)

        # Created
        if created is None:
            self.assertIsNone(info.created, filename)
        else:
            self.assertEqual(info.created.year, int(created[:4]), filename)
            self.assertEqual(info.created.month, int(created[4:6]), filename)
            self.assertEqual(info.created.day, int(created[6:8]), filename)

        # Correspondent
        if correspondent:
            self.assertEqual(info.correspondent.name, correspondent, filename)
        else:
            self.assertEqual(info.correspondent, None, filename)

        # Title
        self.assertEqual(info.title, title, filename)

        # Tags
        if tags is None:
            self.assertEqual(info.tags, (), filename)
        else:
            self.assertEqual(
                [t.slug for t in info.tags], tags.split(','),
                filename
            )

        # Extension
        if extension == 'jpeg':
            extension = 'jpg'
        self.assertEqual(info.extension, extension, filename)

    def test_just_title(self):
        template = '/path/to/{title}.{extension}'
        for title in self.valid_titles:
            for extension in self.valid_extensions:
                spec = dict(title=title, extension=extension)
                filename = template.format(**spec)
                self._test_guessed_attributes(filename, **spec)

    def test_title_and_correspondent(self):
        template = '/path/to/{correspondent} - {title}.{extension}'
        for correspondent in self.valid_correspondents:
            for title in self.valid_titles:
                for extension in self.valid_extensions:
                    spec = dict(correspondent=correspondent, title=title,
                                extension=extension)
                    filename = template.format(**spec)
                    self._test_guessed_attributes(filename, **spec)

    def test_title_and_correspondent_and_tags(self):
        template = '/path/to/{correspondent} - {title} - {tags}.{extension}'
        for correspondent in self.valid_correspondents:
            for title in self.valid_titles:
                for tags in self.valid_tags:
                    for extension in self.valid_extensions:
                        spec = dict(correspondent=correspondent, title=title,
                                    tags=tags, extension=extension)
                        filename = template.format(**spec)
                        self._test_guessed_attributes(filename, **spec)

    def test_created_and_correspondent_and_title_and_tags(self):

        template = ("/path/to/{created} - "
                    "{correspondent} - "
                    "{title} - "
                    "{tags}"
                    ".{extension}")

        for created in self.valid_dates:
            for correspondent in self.valid_correspondents:
                for title in self.valid_titles:
                    for tags in self.valid_tags:
                        for extension in self.valid_extensions:
                            spec = {
                                "created": created,
                                "correspondent": correspondent,
                                "title": title,
                                "tags": tags,
                                "extension": extension
                            }
                            self._test_guessed_attributes(
                                template.format(**spec), **spec)

    def test_created_and_correspondent_and_title(self):

        template = ("/path/to/{created} - "
                    "{correspondent} - "
                    "{title}"
                    ".{extension}")

        for created in self.valid_dates:
            for correspondent in self.valid_correspondents:
                for title in self.valid_titles:

                    # Skip cases where title looks like a tag as we can't
                    # accommodate such cases.
                    if title.lower() == title:
                        continue

                    for extension in self.valid_extensions:
                        spec = {
                            "created": created,
                            "correspondent": correspondent,
                            "title": title,
                            "extension": extension
                        }
                        self._test_guessed_attributes(
                            template.format(**spec), **spec)

    def test_created_and_title(self):

        template = ("/path/to/{created} - "
                    "{title}"
                    ".{extension}")

        for created in self.valid_dates:
            for title in self.valid_titles:
                for extension in self.valid_extensions:
                    spec = {
                        "created": created,
                        "title": title,
                        "extension": extension
                    }
                    self._test_guessed_attributes(
                        template.format(**spec), **spec)

    def test_created_and_title_and_tags(self):

        template = ("/path/to/{created} - "
                    "{title} - "
                    "{tags}"
                    ".{extension}")

        for created in self.valid_dates:
            for title in self.valid_titles:
                for tags in self.valid_tags:
                    for extension in self.valid_extensions:
                        spec = {
                            "created": created,
                            "title": title,
                            "tags": tags,
                            "extension": extension
                        }
                        self._test_guessed_attributes(
                            template.format(**spec), **spec)
|
||||
|
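The nested loops in the `Permutations` tests enumerate a Cartesian product of the `valid_*` lists. As a sketch only, and not something the commit itself does, the same filenames can be generated with `itertools.product`. The value lists are copied from the test class; `TEMPLATE` and `iter_specs` are hypothetical names introduced here.

```python
from itertools import product

valid_dates = ("20150102030405Z", "20150102Z")
valid_correspondents = ["timmy", "Dr. McWheelie", "Dash Gor-don",
                        "ο Θερμαστής", ""]
valid_titles = ["title", "Title w Spaces", "Title a-dash", "Τίτλος", ""]
valid_tags = ["tag", "tig,tag", "tag1,tag2,tag-3"]
valid_extensions = ["pdf", "png", "jpg", "jpeg", "gif"]

TEMPLATE = ("/path/to/{created} - {correspondent} - {title} - "
            "{tags}.{extension}")


def iter_specs():
    """Yield one keyword dict per combination, like the nested loops do."""
    keys = ("created", "correspondent", "title", "tags", "extension")
    pools = (valid_dates, valid_correspondents, valid_titles,
             valid_tags, valid_extensions)
    for combo in product(*pools):
        yield dict(zip(keys, combo))


filenames = [TEMPLATE.format(**spec) for spec in iter_specs()]
print(len(filenames))
print(filenames[0])
```

With two dates, five correspondents, five titles, three tag strings, and five extensions, this covers 2 × 5 × 5 × 3 × 5 = 750 filenames for the fully specified template.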