Fix text formatting

Daniel Quinn 2018-01-30 20:27:40 +00:00 committed by Wolf-Bastian Pöttner
parent 87e466c47c
commit e9fff764cb


@ -1,254 +1,296 @@
Changelog
#########
1.2.0
=====
* New Docker image, now based on Alpine, thanks to the efforts of `addadi`_
and `Pit`_.
* `BastianPoe`_ has added the long-awaited feature to automatically skip the
OCR step when the PDF already contains text. This can be overridden by
setting ``PAPERLESS_OCR_ALWAYS=YES`` either in your ``paperless.conf`` or
in the environment. Note that this also means that Paperless now requires
``libpoppler-cpp-dev`` to be installed. **Important**: You'll need to run
``pip install -r requirements.txt`` after the usual ``git pull`` to
properly update.
1.1.0
=====
* Fix for `#283`_, a redirect bug which broke interactions with
paperless-desktop. Thanks to `chris-aeviator`_ for reporting it.
* Addition of an optional new financial year filter, courtesy of
`David Martin`_ `#256`_
* Fixed a typo in how thumbnails were named in exports `#285`_, courtesy of
`Dan Panzarella`_
1.0.0
=====
* Upgrade to Django 1.11. **You'll need to run
``pip install -r requirements.txt`` after the usual ``git pull`` to
properly update**.
* Replace the templatetag-based hack we had for document listing in favour of
a slightly less ugly solution in the form of another template tag with less
copypasta.
* Support for multi-word-matches for auto-tagging thanks to an excellent
patch from `ishirav`_ `#277`_.
* Fixed a CSS bug reported by `Stefan Hagen`_ that caused an overlapping of
the text and checkboxes under some resolutions `#272`_.
* Patched the Docker config to force the serving of static files. Credit for
this one goes to `dev-rke`_ via `#248`_.
* Fix file permissions during Docker start up thanks to `Pit`_ on `#268`_.
* Date fields in the admin are now expressed as HTML5 date fields thanks to
`Lukas Winkler`_'s issue `#278`_
0.8.0
=====
* Paperless can now run in a subdirectory on a host (``/paperless``), rather
than always running in the root (``/``) thanks to `maphy-psd`_'s work on
`#255`_.
0.7.0
=====
* **Potentially breaking change**: As per `#235`_, Paperless will no longer
automatically delete documents attached to correspondents when those
correspondents are themselves deleted. This was Django's default
behaviour, but didn't make much sense in Paperless' case. Thanks to
`Thomas Brueggemann`_ and `David Martin`_ for their input on this one.
* Fix for `#232`_ wherein Paperless wasn't recognising ``.tif`` files
properly. Thanks to `ayounggun`_ for reporting this one and to
`Kusti Skytén`_ for posting the correct solution in the GitHub issue.
0.6.0
=====
* Abandon the shared-secret trick we were using for the POST API in favour
of BasicAuth or Django session.
* Fix the POST API so it actually works. `#236`_
* **Breaking change**: We've dropped the use of ``PAPERLESS_SHARED_SECRET``
as it was being used both for the API (now replaced with a normal auth)
and for email polling. Now that we're only using it for email, this
variable has been renamed to ``PAPERLESS_EMAIL_SECRET``. The old value
will still work for a while, but you should change your config if you've
been using the email polling feature. Thanks to `Joshua Gilman`_ for all
the help with this feature.
0.5.0
=====
* Support for fuzzy matching in the auto-tagger & auto-correspondent systems
thanks to `Jake Gysland`_'s patch `#220`_.
* Modified the Dockerfile to prepare an export directory (`#212`_). Thanks
to combined efforts from `Pit`_ and `Strubbl`_ in working out the kinks on
this one.
* Updated the import/export scripts to include support for thumbnails. Big
thanks to `CkuT`_ for finding this shortcoming and doing the work to get
it fixed in `#224`_.
* All of the following changes are thanks to `David Martin`_:
* Bumped the dependency on pyocr to 0.4.7 so new users can make use of
Tesseract 4 if they so prefer (`#226`_).
* Fixed a number of issues with the automated mail handler (`#227`_, `#228`_)
* Amended the documentation for better handling of systemd service files (`#229`_)
* Amended the Django Admin configuration to have nice headers (`#230`_)
0.4.1
=====
* Fix for `#206`_ wherein the pluggable parser didn't recognise files with
all-caps suffixes like ``.PDF``
0.4.0
=====
* Introducing reminders. See `#199`_ for more information, but the short
explanation is that you can now attach simple notes & times to documents
which are made available via the API. Currently, the default API
(basically just the Django admin) doesn't really make use of this, but
`Thomas Brueggemann`_ over at `Paperless Desktop`_ has said that he would
like to make use of this feature in his project.
0.3.6
=====
* Fix for `#200`_ (!!) where the API wasn't configured to allow updating the
correspondent or the tags for a document.
* The ``content`` field is now optional, to allow for the edge case of a
purely graphical document.
* You can no longer add documents via the admin. This never worked in the
first place, so all I've done here is remove the link to the broken form.
* The consumer code has been heavily refactored to support a pluggable
interface. Install a paperless consumer via pip and tell paperless about
it with an environment variable, and you're good to go. Proper
documentation is on its way.
0.3.5
=====
* A serious facelift for the documents listing page wherein we drop the
tabular layout in favour of a tiled interface.
* Users can now configure the number of items per page.
* Fix for `#171`_: Allow users to specify their own ``SECRET_KEY`` value.
* Moved the dotenv loading to the top of settings.py
* Fix for `#112`_: Added checks for binaries required for document
consumption.
0.3.4
=====
* Removal of django-suit due to a licensing conflict I bumped into in 0.3.3.
Note that you *can* use Django Suit with Paperless, but only in a
non-profit situation as their free license prohibits for-profit use. As a
result, I can't bundle Suit with Paperless without conflicting with the
GPL. Further development will be done against the stock Django admin.
* I shrunk the thumbnails a little 'cause they were too big for me, even on
my high-DPI monitor.
* BasicAuth support for document and thumbnail downloads, as well as the Push
API thanks to @thomasbrueggemann. See `#179`_.
0.3.3
=====
* Thumbnails in the UI and a Django-suit-based face-lift courtesy of @ekw!
* Timezone, items per page, and default language are now all configurable,
also thanks to @ekw.
0.3.2
=====
* Fix for `#172`_: defaulting ALLOWED_HOSTS to ``["*"]`` and allowing the
user to set her own value via ``PAPERLESS_ALLOWED_HOSTS`` should the need
arise.
0.3.1
=====
* Added a default value for ``CONVERT_BINARY``
0.3.0
=====
* Updated to use django-filter 1.x
* Added some system checks so new users aren't confused by misconfigurations.
* Consumer loop time is now configurable for systems with slow writes. Just
set ``PAPERLESS_CONSUMER_LOOP_TIME`` to a number of seconds. The default
is 10.
* As per `#44`_, we've removed support for ``PAPERLESS_CONVERT``,
``PAPERLESS_CONSUME``, and ``PAPERLESS_SECRET``. Please use
``PAPERLESS_CONVERT_BINARY``, ``PAPERLESS_CONSUMPTION_DIR``, and
``PAPERLESS_SHARED_SECRET`` respectively instead.
0.2.0
=====
* `#150`_: The media root is now a variable you can set in
``paperless.conf``.
* `#148`_: The database location (sqlite) is now a variable you can set in
``paperless.conf``.
* `#146`_: Fixed a bug that allowed unauthorised access to the ``/fetch``
URL.
* `#131`_: Document files are now automatically removed from disk when
they're deleted in Paperless.
* `#121`_: Fixed a bug where Paperless wasn't setting document creation time
based on the file naming scheme.
* `#81`_: Added a hook to run an arbitrary script after every document is
consumed.
* `#98`_: Added optional environment variables for ImageMagick so that it
doesn't explode when handling Very Large Documents or when it's just
running on a low-memory system. Thanks to `Florian Harr`_ for his help on
this one.
* `#89`_ Ported the auto-tagging code to correspondents as well. Thanks to
`Justin Snyman`_ for the pointers in the issue queue.
* Added support for guessing the date from the file name along with the
correspondent, title, and tags. Thanks to `Tikitu de Jager`_ for his pull
request that I took forever to merge and to `Pit`_ for his efforts on the
regex front.
* `#94`_: Restored support for changing the created date in the UI. Thanks
to `Martin Honermeyer`_ and `Tim White`_ for working with me on this.
0.1.1
=====
* Potentially **Breaking Change**: All references to "sender" in the code
have been renamed to "correspondent" to better reflect the nature of the
property (one could quite reasonably scan a document before sending it to
someone.)
* `#67`_: Rewrote the document exporter and added a new importer that allows
for full metadata retention without depending on the file name and
modification time. A big thanks to `Tikitu de Jager`_, `Pit`_,
`Florian Jung`_, and `Christopher Luu`_ for their code snippets and
contributing conversation that led to this change.
* `#20`_: Added *unpaper* support to help in cleaning up the scanned image
before it's OCR'd. Thanks to `Pit`_ for this one.
* `#71`_ Added (encrypted) thumbnails in anticipation of a proper UI.
* `#68`_: Added support for using a proper config file at
``/etc/paperless.conf`` and modified the systemd unit files to use it.
* Refactored the Vagrant installation process to use environment variables
rather than asking the user to modify ``settings.py``.
* `#44`_: Harmonise environment variable names with constant names.
* `#60`_: Set up logging to actually use the Python native logging framework.
* `#53`_: Fixed an annoying bug that caused ``.jpeg`` and ``.JPG`` images
to be imported but made unavailable.
0.1.0
=====
* Docker support! Big thanks to `Wayne Werner`_, `Brian Conn`_, and
`Tikitu de Jager`_ for this one, and especially to `Pit`_
who spearheaded this effort.
* A simple REST API is in place, but it should be considered unstable.
* Cleaned up the consumer to use temporary directories instead of a single
scratch space. (Thanks `Pit`_)
* Improved the efficiency of the consumer by parsing pages more intelligently
and introducing a threaded OCR process (thanks again `Pit`_).
* `#45`_: Cleaned up the logic for tag matching. Reported by `darkmatter`_.
* `#47`_: Auto-rotate landscape documents. Reported by `Paul`_ and fixed by
`Pit`_.
* `#48`_: Matching algorithms should do so on a word boundary (`darkmatter`_)
* `#54`_: Documented the re-tagger (`zedster`_)
* `#57`_: Make sure file is preserved on import failure (`darkmatter`_)
* Added tox with pep8 checking
0.0.6
=====
* Added support for parallel OCR (significant work from `Pit`_)
* Sped up the language detection (significant work from `Pit`_)
* Added simple logging
0.0.5
=====
* Added support for image files as documents (png, jpg, gif, tiff)
* Added a crude means of HTTP POST for document imports
* Added IMAP mail support
* Added a re-tagging utility
* Documentation for the above as well as data migration
0.0.4
=====
* Added automated tagging based on keyword matching
* Cleaned up the document listing page
* Removed ``User`` and ``Group`` from the admin
* Added ``pytz`` to the list of requirements
0.0.3
=====
* Added basic tagging
0.0.2
=====
* Added language detection
* Added datestamps to ``document_exporter``.
* Changed ``settings.TESSERACT_LANGUAGE`` to ``settings.OCR_LANGUAGE``.
0.0.1
=====
* Initial release
.. _Brian Conn: https://github.com/TheConnMan
.. _Christopher Luu: https://github.com/nuudles