mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2025-07-28 18:24:38 -05:00
Merge branch 'master' of github.com:danielquinn/paperless into ENH_filename_date_parsing
@@ -1,6 +1,83 @@
|
||||
Changelog
|
||||
#########
|
||||
|
||||
2.6.0
|
||||
=====
|
||||
|
||||
* Allow an infinite number of logs to be deleted. Thanks to `Ulli`_ for noting
|
||||
the problem in `#433`_.
|
||||
* Fix the ``RecentCorrespondentsFilter`` correspondents filter that was added
|
||||
in 2.4 to play nice with the defaults. Thanks to `tsia`_ and `Sblop`_ who
|
||||
pointed this out. `#423`_.
|
||||
* Updated dependencies to include (among other things) a security patch to
|
||||
requests.
|
||||
|
||||
|
||||
2.5.0
|
||||
=====
|
||||
|
||||
* **New dependency**: Paperless now optimises thumbnail generation with
|
||||
`optipng`_, so you'll need to install that somewhere in your PATH or declare
|
||||
its location in ``PAPERLESS_OPTIPNG_BINARY``. The Docker image has already
|
||||
been updated on the Docker Hub, so you just need to pull the latest one from
|
||||
there if you're a Docker user.
|
||||
|
||||
* "Login free" instances of Paperless were breaking whenever you tried to edit
|
||||
objects in the admin: adding/deleting tags or correspondents, or even fixing
|
||||
spelling. This was due to the "user hack" we were applying to sessions that
|
||||
weren't using a login, as that hack user didn't have a valid id. The fix was
|
||||
to attribute the first user id in the system to this hack user. `#394`_
|
||||
|
||||
* A problem in how we handle slug values on Tags and Correspondents required a
|
||||
few changes to how we handle this field `#393`_:
|
||||
|
||||
1. Slugs are no longer editable. They're derived from the name of the tag or
|
||||
correspondent at save time, so if you wanna change the slug, you have to
|
||||
change the name, and even then you're restricted to the rules of the
|
||||
``slugify()`` function. The slug value is still visible in the admin
|
||||
though.
|
||||
2. I've added a migration to go over all existing tags & correspondents and
|
||||
rewrite the ``.slug`` values to ones conforming to the ``slugify()``
|
||||
rules.
|
||||
3. The consumption process now uses the same rules as ``.save()`` in
|
||||
determining a slug and using that to check for an existing
|
||||
tag/correspondent.
|
||||
|
||||
* An annoying bug in the date capture code was causing some bogus dates to be
|
||||
attached to documents, which in turn busted the UI. Thanks to `Andrew Peng`_
|
||||
for reporting this. `#414`_.
|
||||
|
||||
* A bug in the Dockerfile meant that Tesseract language files weren't being
|
||||
installed correctly. `euri10`_ was quick to provide a fix: `#406`_, `#413`_.
|
||||
|
||||
* Document consumption is now wrapped in a transaction as per an old ticket
|
||||
`#262`_.
|
||||
|
||||
* The ``get_date()`` functionality of the parsers has been consolidated onto
|
||||
the ``DocumentParser`` class since much of that code was redundant anyway.
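The slug changes listed under `#393`_ come down to one idea: the slug is always derived from the name, and the consumer derives it the same way before looking for an existing tag or correspondent. The sketch below illustrates that idea using only the standard library. ``approximate_slug`` and ``find_existing`` are hypothetical names, and the normalisation is only a rough approximation of what a ``slugify()`` function does; it is not the code Paperless calls.

```python
import re
import unicodedata


def approximate_slug(name):
    """Derive a slug from a name (rough stand-in for ``slugify()``)."""
    # Drop accents and any other non-ASCII characters.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lower-case, then remove everything except word characters,
    # whitespace and hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_name.lower())
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", cleaned).strip("-_")


def find_existing(name, known_slugs):
    """Return the matching slug if one already exists, else None."""
    slug = approximate_slug(name)
    return slug if slug in known_slugs else None
```

Because both saving and consumption go through the same derivation, a name such as ``Bank of América`` and a file naming ``bank of america`` resolve to the same record instead of creating a near-duplicate.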
|
||||
|
||||
|
||||
2.4.0
|
||||
=====
|
||||
|
||||
* A new set of actions are now available thanks to `jonaswinkler`_'s very first
|
||||
pull request! You can now do nifty things like tag documents in bulk, or set
|
||||
correspondents in bulk. `#405`_
|
||||
* The import/export system is now a little smarter. By default, documents are
|
||||
tagged as ``unencrypted``, since exports are by their nature unencrypted.
|
||||
It's now in the import step that we decide the storage type. This allows you
|
||||
to export from an encrypted system and import into an unencrypted one, or
|
||||
vice-versa.
|
||||
* The migration history has been slightly modified to accommodate PostgreSQL
|
||||
users. Additionally, you can now tell paperless to use PostgreSQL simply by
|
||||
declaring ``PAPERLESS_DBUSER`` in your environment. This will attempt to
|
||||
connect to your Postgres database without a password unless you also set
|
||||
``PAPERLESS_DBPASS``.
|
||||
* A bug was found in the REST API filter system that was the result of an
|
||||
update of django-filter some time ago. This has now been patched in `#412`_.
|
||||
Thanks to `thepill`_ for spotting it!
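The PostgreSQL behaviour described above (use Postgres when ``PAPERLESS_DBUSER`` is set, and only send a password when ``PAPERLESS_DBPASS`` is also set) can be pictured roughly like this. It is a minimal sketch, not the project's settings module: ``database_settings`` is a hypothetical helper, and the database names and the SQLite fallback are assumptions made for illustration.

```python
import os


def database_settings(env=os.environ):
    """Pick a Django-style database config from environment variables."""
    user = env.get("PAPERLESS_DBUSER")
    if not user:
        # Assumed fallback when no database user is declared.
        return {
            "ENGINE": "django.db.backends.sqlite3",
            "NAME": "db.sqlite3",  # hypothetical file name
        }

    settings = {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "paperless",  # hypothetical database name
        "USER": user,
    }
    # The password is optional: without it, the connection is
    # attempted without one.
    password = env.get("PAPERLESS_DBPASS")
    if password:
        settings["PASSWORD"] = password
    return settings
```

Passing the environment in as an argument keeps the decision easy to check with a plain dictionary, without touching the real environment.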
|
||||
|
||||
|
||||
2.3.0
|
||||
=====
|
||||
|
||||
@@ -15,7 +92,8 @@ Changelog
|
||||
* As his last bit of effort on this release, Joshua also added some code to
|
||||
allow you to view the documents inline rather than download them as an
|
||||
attachment. `#400`_
|
||||
* Finally, `ahyear`_ found a slip in the Docker documentation and patched it. `#401`_
|
||||
* Finally, `ahyear`_ found a slip in the Docker documentation and patched it.
|
||||
`#401`_
|
||||
|
||||
|
||||
2.2.1
|
||||
@@ -32,14 +110,14 @@ Changelog
|
||||
version of Paperless that supports Django 2.0! As a result of their hard
|
||||
work, you can now also run Paperless on Python 3.7 as well: `#386`_ &
|
||||
`#390`_.
|
||||
* `Stéphane Brunner`_ added a few lines of code that made tagging interface a lot
|
||||
easier on those of us with lots of different tags: `#391`_.
|
||||
* `Stéphane Brunner`_ added a few lines of code that made tagging interface a
|
||||
lot easier on those of us with lots of different tags: `#391`_.
|
||||
* `Kilian Koeltzsch`_ noticed a bug in how we capture & automatically create
|
||||
tags, so that's fixed now too: `#384`_.
|
||||
* `erikarvstedt`_ tweaked the behaviour of the test suite to be better behaved
|
||||
for packaging environments: `#383`_.
|
||||
* `Lukasz Soluch`_ added CORS support to make building a new Javascript-based front-end
|
||||
cleaner & easier: `#387`_.
|
||||
* `Lukasz Soluch`_ added CORS support to make building a new Javascript-based
|
||||
front-end cleaner & easier: `#387`_.
|
||||
|
||||
|
||||
2.1.0
|
||||
@@ -499,8 +577,15 @@ bulk of the work on this big change.
|
||||
.. _Kilian Koeltzsch: https://github.com/kiliankoe
|
||||
.. _Lukasz Soluch: https://github.com/LukaszSolo
|
||||
.. _Joshua Taillon: https://github.com/jat255
|
||||
.. _dubit0: https://github.com/dubit0
|
||||
.. _ahyear: https://github.com/ahyear
|
||||
.. _dubit0: https://github.com/dubit0
|
||||
.. _ahyear: https://github.com/ahyear
|
||||
.. _jonaswinkler: https://github.com/jonaswinkler
|
||||
.. _thepill: https://github.com/thepill
|
||||
.. _Andrew Peng: https://github.com/pengc99
|
||||
.. _euri10: https://github.com/euri10
|
||||
.. _Ulli: https://github.com/Ulli2k
|
||||
.. _tsia: https://github.com/tsia
|
||||
.. _Sblop: https://github.com/Sblop
|
||||
|
||||
.. _#20: https://github.com/danielquinn/paperless/issues/20
|
||||
.. _#44: https://github.com/danielquinn/paperless/issues/44
|
||||
@@ -566,6 +651,7 @@ bulk of the work on this big change.
|
||||
.. _#322: https://github.com/danielquinn/paperless/pull/322
|
||||
.. _#328: https://github.com/danielquinn/paperless/pull/328
|
||||
.. _#253: https://github.com/danielquinn/paperless/issues/253
|
||||
.. _#262: https://github.com/danielquinn/paperless/issues/262
|
||||
.. _#323: https://github.com/danielquinn/paperless/issues/323
|
||||
.. _#344: https://github.com/danielquinn/paperless/pull/344
|
||||
.. _#351: https://github.com/danielquinn/paperless/pull/351
|
||||
@@ -582,11 +668,21 @@ bulk of the work on this big change.
|
||||
.. _#391: https://github.com/danielquinn/paperless/pull/391
|
||||
.. _#390: https://github.com/danielquinn/paperless/pull/390
|
||||
.. _#392: https://github.com/danielquinn/paperless/issues/392
|
||||
.. _#393: https://github.com/danielquinn/paperless/issues/393
|
||||
.. _#395: https://github.com/danielquinn/paperless/pull/395
|
||||
.. _#394: https://github.com/danielquinn/paperless/issues/394
|
||||
.. _#396: https://github.com/danielquinn/paperless/pull/396
|
||||
.. _#399: https://github.com/danielquinn/paperless/pull/399
|
||||
.. _#400: https://github.com/danielquinn/paperless/pull/400
|
||||
.. _#401: https://github.com/danielquinn/paperless/pull/401
|
||||
.. _#405: https://github.com/danielquinn/paperless/pull/405
|
||||
.. _#406: https://github.com/danielquinn/paperless/issues/406
|
||||
.. _#412: https://github.com/danielquinn/paperless/issues/412
|
||||
.. _#413: https://github.com/danielquinn/paperless/pull/413
|
||||
.. _#414: https://github.com/danielquinn/paperless/issues/414
|
||||
.. _#423: https://github.com/danielquinn/paperless/issues/423
|
||||
.. _#433: https://github.com/danielquinn/paperless/issues/433
|
||||
|
||||
.. _pipenv: https://docs.pipenv.org/
|
||||
.. _a new home on Docker Hub: https://hub.docker.com/r/danielquinn/paperless/
|
||||
.. _optipng: http://optipng.sourceforge.net/
|
||||
|
@@ -76,6 +76,31 @@ Pre-consumption script
|
||||
|
||||
* Document file name
|
||||
|
||||
A simple but common example for this would be creating a simple script like
|
||||
this:
|
||||
|
||||
``/usr/local/bin/ocr-pdf``
|
||||
|
||||
.. code:: bash
|
||||
|
||||
#!/usr/bin/env bash
|
||||
pdf2pdfocr.py -i "${1}"
|
||||
|
||||
``/etc/paperless.conf``
|
||||
|
||||
.. code:: bash
|
||||
|
||||
...
|
||||
PAPERLESS_PRE_CONSUME_SCRIPT="/usr/local/bin/ocr-pdf"
|
||||
...
|
||||
|
||||
This will pass the path to the document about to be consumed to ``/usr/local/bin/ocr-pdf``,
|
||||
which will in turn call `pdf2pdfocr.py`_ on your document, which will then
|
||||
overwrite the file with an OCR'd version of the file and exit. At which point,
|
||||
the consumption process will begin with the newly modified file.
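If you would rather write the hook in Python, the same idea might look like the sketch below. It assumes only what the example above shows: the document path arrives as the first argument and ``pdf2pdfocr.py`` accepts ``-i``. The function names are hypothetical.

```python
import subprocess
import sys


def build_ocr_command(document_path):
    """Build the command as a list so odd file names need no quoting."""
    return ["pdf2pdfocr.py", "-i", document_path]


def main(argv):
    """Run the OCR tool on the document path given as the first argument."""
    if len(argv) != 2:
        print("usage: ocr-pdf <document-path>", file=sys.stderr)
        return 2
    # Hand the tool's exit status back to the caller.
    return subprocess.run(build_ocr_command(argv[1])).returncode

# When saved as an executable script, finish the file with:
#     sys.exit(main(sys.argv))
```

Passing the command as a list avoids the shell entirely, so a file name containing spaces is handed over as a single argument.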
|
||||
|
||||
.. _pdf2pdfocr.py: https://github.com/LeoFCardoso/pdf2pdfocr
|
||||
|
||||
|
||||
.. _consumption-director-hook-variables-post:
|
||||
|
||||
|
141
docs/contributing.rst
Normal file
@@ -0,0 +1,141 @@
|
||||
.. _contributing:
|
||||
|
||||
Contributing to Paperless
|
||||
#########################
|
||||
|
||||
Maybe you've been using Paperless for a while and want to add a feature or two,
|
||||
or maybe you've come across a bug that you have some ideas how to solve. The
|
||||
beauty of Free software is that you can see what's wrong and help to get it
|
||||
fixed for everyone!
|
||||
|
||||
|
||||
How to Get Your Changes Rolled Into Paperless
|
||||
=============================================
|
||||
|
||||
If you've found a bug, but don't know how to fix it, you can always post an
|
||||
issue on `GitHub`_ in the hopes that someone will have the time to fix it for
|
||||
you. If, however, you're the one with the time, pull requests are always
|
||||
welcome; you just have to make sure that your code conforms to a few standards:
|
||||
|
||||
Pep8
|
||||
----
|
||||
|
||||
It's the standard for all Python development, so it's `very well documented`_.
|
||||
The short version is:
|
||||
|
||||
* Lines should wrap at 79 characters
|
||||
* Use ``snake_case`` for variables, ``CamelCase`` for classes, and ``ALL_CAPS``
|
||||
for constants.
|
||||
* Space out your operators: ``stuff + 7`` instead of ``stuff+7``
|
||||
* Two empty lines between classes and functions, but one empty line between
|
||||
class methods.
|
||||
|
||||
There's more to it than that, but if you follow those, you'll probably be
|
||||
alright. When you submit your pull request, there's a pep8 checker that'll
|
||||
look at your code to see if anything is off. If it finds anything, it'll
|
||||
complain at you until you fix it.
|
||||
|
||||
|
||||
Additional Style Guides
|
||||
-----------------------
|
||||
|
||||
Where pep8 is ambiguous, I've tried to be a little more specific. These rules
|
||||
aren't hard-and-fast, but if you can conform to them, I'll appreciate it and
|
||||
spend less time trying to conform your PR before merging:
|
||||
|
||||
|
||||
Function calls
|
||||
..............
|
||||
|
||||
If you're calling a function and that necessitates more than one line of code,
|
||||
please format it like this:
|
||||
|
||||
.. code:: python
|
||||
|
||||
my_function(
|
||||
argument1,
|
||||
kwarg1="x",
|
||||
kwarg2="y",
|
||||
another_really_long_kwarg="some big value",
|
||||
a_kwarg_calling_another_long_function=another_function(
|
||||
another_arg,
|
||||
another_kwarg="kwarg!"
|
||||
)
|
||||
)
|
||||
|
||||
This is all in the interest of code uniformity rather than anything else. If
|
||||
we stick to a style, everything is understandable in the same way.
|
||||
|
||||
|
||||
Quoting Strings
|
||||
...............
|
||||
|
||||
pep8 is a little too open-minded on this for my liking. Python strings should
|
||||
be quoted with double quotes (``"``) except in cases where the resulting string
|
||||
would require too much escaping of a double quote, in which case, a single
|
||||
quoted, or triple-quoted string will do:
|
||||
|
||||
.. code:: python
|
||||
|
||||
my_string = "This is my string"
|
||||
problematic_string = 'This is a "string" with "quotes" in it'
|
||||
|
||||
In HTML templates, please use double-quotes for tag attributes, and single
|
||||
quotes for arguments passed to Django template tags:
|
||||
|
||||
.. code:: html
|
||||
|
||||
<div class="stuff">
|
||||
<a href="{% url 'some-url-name' pk='w00t' %}">link this</a>
|
||||
</div>
|
||||
|
||||
This is to keep linters happy when they look at an HTML file and see an attribute
|
||||
closing the ``"`` before it should have been.
|
||||
|
||||
--
|
||||
|
||||
That's all there is in terms of guidelines, so I hope it's not too daunting.
|
||||
|
||||
|
||||
Indentation & Spacing
|
||||
.....................
|
||||
|
||||
When it comes to indentation:
|
||||
|
||||
* For Python, the rule is: follow pep8 and use 4 spaces.
|
||||
* For Javascript, CSS, and HTML, please use 1 tab.
|
||||
|
||||
Additionally, Django templates making use of block elements like ``{% if %}``,
|
||||
``{% for %}``, and ``{% block %}`` etc. should be indented:
|
||||
|
||||
Good:
|
||||
|
||||
.. code:: html
|
||||
|
||||
{% block stuff %}
|
||||
<h1>This is the stuff</h1>
|
||||
{% endblock %}
|
||||
|
||||
Bad:
|
||||
|
||||
.. code:: html
|
||||
|
||||
{% block stuff %}
|
||||
<h1>This is the stuff</h1>
|
||||
{% endblock %}
|
||||
|
||||
|
||||
The Code of Conduct
|
||||
===================
|
||||
|
||||
Paperless has a `code of conduct`_. It's a lot like the other ones you see out
|
||||
there, with a few small changes, but basically it boils down to:
|
||||
|
||||
> Don't be an ass, or you might get banned.
|
||||
|
||||
I'm proud to say that the CoC has never had to be enforced because everyone has
|
||||
been awesome, friendly, and professional.
|
||||
|
||||
.. _GitHub: https://github.com/danielquinn/paperless/issues
|
||||
.. _very well documented: https://www.python.org/dev/peps/pep-0008/
|
||||
.. _code of conduct: https://github.com/danielquinn/paperless/blob/master/CODE_OF_CONDUCT.md
|
@@ -43,6 +43,16 @@ These however wouldn't work:
|
||||
* ``Some Company Name, Invoice 2016-01-01, money, invoices.pdf``
|
||||
* ``Another Company- Letter of Reference.jpg``
|
||||
|
||||
Do I have to be so strict about naming?
|
||||
---------------------------------------
|
||||
Rather than using the strict document naming rules, one can also set the option
|
||||
``PAPERLESS_FILENAME_DATE_ORDER`` in ``paperless.conf`` to any date order
|
||||
that is accepted by dateparser_. Doing so will cause ``paperless`` to prefer
|
||||
any date that is found in the title over a date pulled from
|
||||
the document's text, without requiring the strict formatting of the document
|
||||
filename as described above.
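To see why the date order matters, consider a file name containing ``01-02-2018``: it is the 1st of February under ``DMY`` but the 2nd of January under ``MDY``. The sketch below illustrates that ambiguity with the standard library only. It is not how Paperless or dateparser parse dates, and ``parse_with_order`` is a hypothetical helper.

```python
import re
from datetime import datetime

# Map each letter of a date order such as "DMY" to a strptime directive.
_DIRECTIVES = {"D": "%d", "M": "%m", "Y": "%Y"}
# Three numeric groups separated by "-", "." or "/".
_DATE_PATTERN = re.compile(r"(\d{1,4})[-./](\d{1,2})[-./](\d{1,4})")


def parse_with_order(text, order):
    """Return the first date in ``text`` read in the given order, or None."""
    match = _DATE_PATTERN.search(text)
    if match is None:
        return None
    fmt = "-".join(_DIRECTIVES[letter] for letter in order)
    try:
        return datetime.strptime("-".join(match.groups()), fmt).date()
    except ValueError:
        # The digits do not make a valid date in this order.
        return None


print(parse_with_order("Invoice 01-02-2018.pdf", "DMY"))  # 2018-02-01
print(parse_with_order("Invoice 01-02-2018.pdf", "MDY"))  # 2018-01-02
```

The same file name yields two different dates, which is why the order has to be stated explicitly rather than guessed.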
|
||||
|
||||
.. _dateparser: https://github.com/scrapinghub/dateparser/blob/v0.7.0/docs/usage.rst#settings
|
||||
|
||||
.. _guesswork-content:
|
||||
|
||||
|
@@ -43,5 +43,6 @@ Contents
|
||||
customising
|
||||
extending
|
||||
troubleshooting
|
||||
contributing
|
||||
scanners
|
||||
changelog
|
||||
|
@@ -82,6 +82,7 @@ rolled in as part of the update:
|
||||
|
||||
$ cd /path/to/project
|
||||
$ git pull
|
||||
$ pip install -r requirements.txt
|
||||
$ cd src
|
||||
$ ./manage.py migrate
|
||||
|
||||
|
@@ -33,7 +33,7 @@ In addition to the above, there are a number of Python requirements, all of
|
||||
which are listed in a file called ``requirements.txt`` in the project root
|
||||
directory.
|
||||
|
||||
If you're not working on a virtual environment (like Vagrant or Docker), you
|
||||
If you're not working on a virtual environment (like Docker), you
|
||||
should probably be using a virtualenv, but that's your call. The reasons why
|
||||
you might choose a virtualenv or not aren't really within the scope of this
|
||||
document. Needless to say if you don't know what a virtualenv is, you should
|
||||
|
@@ -42,18 +42,14 @@ Installation & Configuration
|
||||
You can go multiple routes with setting up and running Paperless:
|
||||
|
||||
* The `bare metal route`_
|
||||
* The `vagrant route`_
|
||||
* The `docker route`_
|
||||
|
||||
|
||||
The `Vagrant route`_ is quick & easy, but means you're running a VM which comes
|
||||
with memory consumption, cpu overhead etc. The `docker route`_ offers the same
|
||||
simplicity as Vagrant with lower resource consumption.
|
||||
The `docker route`_ is quick & easy.
|
||||
|
||||
The `bare metal route`_ is a bit more complicated to setup but makes it easier
|
||||
should you want to contribute some code back.
|
||||
|
||||
.. _Vagrant route: setup-installation-vagrant_
|
||||
.. _docker route: setup-installation-docker_
|
||||
.. _bare metal route: setup-installation-bare-metal_
|
||||
.. _Docker Machine: https://docs.docker.com/machine/
|
||||
@@ -267,54 +263,6 @@ Docker Method
|
||||
newer ``docker-compose.yml.example`` file
|
||||
|
||||
|
||||
.. _setup-installation-vagrant:
|
||||
|
||||
Vagrant Method
|
||||
++++++++++++++
|
||||
|
||||
1. Install `Vagrant`_. How you do that is really between you and your OS.
|
||||
2. Run ``vagrant up``. An instance will start up for you. When it's ready and
|
||||
provisioned...
|
||||
3. Run ``vagrant ssh`` and once inside your new vagrant box, edit
|
||||
``/etc/paperless.conf`` and set the values for:
|
||||
|
||||
* ``PAPERLESS_CONSUMPTION_DIR``: This is where your documents will be
|
||||
dumped to be consumed by Paperless.
|
||||
* ``PAPERLESS_PASSPHRASE``: This is the passphrase Paperless uses to
|
||||
encrypt/decrypt the original document. It's only required if you want
|
||||
your original files to be encrypted, otherwise, just leave it unset.
|
||||
* ``PAPERLESS_EMAIL_SECRET``: this is the "magic word" used when consuming
|
||||
documents from mail or via the API. If you don't use either, leaving it
|
||||
blank is just fine.
|
||||
|
||||
4. Exit the vagrant box and re-enter it with ``vagrant ssh`` again. This
|
||||
updates the environment to make use of the changes you made to the config
|
||||
file.
|
||||
5. Initialise the database with ``/opt/paperless/src/manage.py migrate``.
|
||||
6. Still inside your vagrant box, create a user for your Paperless instance
|
||||
with ``/opt/paperless/src/manage.py createsuperuser``. Follow the prompts to
|
||||
create your user.
|
||||
7. Start the webserver with
|
||||
``/opt/paperless/src/manage.py runserver 0.0.0.0:8000``. You should now be
|
||||
able to visit your (empty) `Paperless webserver`_ at ``172.28.128.4:8000``.
|
||||
You can login with the user/pass you created in #6.
|
||||
8. In a separate window, run ``vagrant ssh`` again, but this time once inside
|
||||
your vagrant instance, you should start the consumer script with
|
||||
``/opt/paperless/src/manage.py document_consumer``.
|
||||
9. Scan something. Put it in the ``CONSUMPTION_DIR``.
|
||||
10. Wait a few minutes
|
||||
11. Visit the document list on your webserver, and it should be there, indexed
|
||||
and downloadable.
|
||||
|
||||
.. caution::
|
||||
|
||||
This installation is not secure. Once everything is working head up to
|
||||
`Making things more permanent`_
|
||||
|
||||
.. _Vagrant: https://vagrantup.com/
|
||||
.. _Paperless server: http://172.28.128.4:8000
|
||||
|
||||
|
||||
.. _setup-permanent:
|
||||
|
||||
Making Things a Little more Permanent
|
||||
@@ -398,7 +346,7 @@ instance listening on localhost port 8000.
|
||||
location /static {
|
||||
|
||||
autoindex on;
|
||||
alias <path-to-paperless-static-directory>
|
||||
alias <path-to-paperless-static-directory>;
|
||||
|
||||
}
|
||||
|
||||
@@ -409,7 +357,7 @@ instance listening on localhost port 8000.
|
||||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
||||
proxy_set_header X-Forwarded-Proto $scheme;
|
||||
|
||||
proxy_pass http://127.0.0.1:8000
|
||||
proxy_pass http://127.0.0.1:8000;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -513,13 +461,6 @@ second period.
|
||||
.. _Upstart: http://upstart.ubuntu.com/
|
||||
|
||||
|
||||
Vagrant
|
||||
~~~~~~~
|
||||
|
||||
You may use the Ubuntu explanation above. Replace
|
||||
``(local-filesystems and net-device-up IFACE=eth0)`` with ``vagrant-mounted``.
|
||||
|
||||
|
||||
.. _setup-permanent-docker:
|
||||
|
||||
Docker
|
||||
|
@@ -14,9 +14,8 @@ FORGIVING_OCR is enabled``, then you might need to install the
|
||||
`Tesseract language files <http://packages.ubuntu.com/search?keywords=tesseract-ocr>`_
|
||||
matching your document's languages.
|
||||
|
||||
As an example, if you are running Paperless from the Vagrant setup provided
|
||||
(or from any Ubuntu or Debian box), and your documents are written in Spanish
|
||||
you may need to run::
|
||||
As an example, if you are running Paperless from any Ubuntu or Debian
|
||||
box, and your documents are written in Spanish you may need to run::
|
||||
|
||||
apt-get install -y tesseract-ocr-spa
|
||||
|
||||
|