shamoon
2d52226732
Enhancement: system status report sanity check, simpler classifier check, styling updates (#9106)
2025-02-26 22:12:20 +00:00
Trenton H
827fcba277
Chore: Reduce imports for a slight memory improvement (#9217)
2025-02-24 15:06:14 -08:00
shamoon
3314c59828
Tweak: more accurate classifier last trained time (#9004)
2025-02-06 10:54:31 -08:00
dependabot[bot]
20ec8cb57b
Chore(deps-dev): Bump the development group with 2 updates (#8841)
* Chore(deps-dev): Bump the development group with 2 updates
Bumps the development group with 2 updates: [ruff](https://github.com/astral-sh/ruff) and [mkdocs-material](https://github.com/squidfunk/mkdocs-material).
Updates `ruff` from 0.8.6 to 0.9.2
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/0.8.6...0.9.2)
Updates `mkdocs-material` from 9.5.49 to 9.5.50
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.5.49...9.5.50)
---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: development
- dependency-name: mkdocs-material
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
...
Signed-off-by: dependabot[bot] <support@github.com>
* Update .pre-commit-config.yaml
* Run new ruff format
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2025-01-21 19:22:25 +00:00
Trenton H
fd425aa618
Fix: Enforce classifier training ordering to prevent extra training (#8822)
2025-01-19 20:52:03 +00:00
shamoon
cd50f20a20
Fix: its paths not pathes
2025-01-18 07:43:02 -08:00
Sebastian Steinbeißer
935d077836
Chore: Switch from os.path to pathlib.Path (#8325)
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2025-01-06 12:12:27 -08:00
Trenton H
e6f59472e4
Chore: Drop Python 3.9 support (#7774)
2024-09-26 12:22:24 -07:00
Trenton H
7be7185418
Handcrafts SQL queries a little more to reduce the query count and/or the amount of returned data (#6489)
2024-04-30 07:37:09 -07:00
dependabot[bot]
a196c14a58
Chore(deps-dev): Bump the development group with 3 updates (#6079)
* Chore(deps-dev): Bump the development group with 3 updates
Bumps the development group with 3 updates: [ruff](https://github.com/astral-sh/ruff), [pytest](https://github.com/pytest-dev/pytest) and [mkdocs-material](https://github.com/squidfunk/mkdocs-material).
Updates `ruff` from 0.3.0 to 0.3.2
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](https://github.com/astral-sh/ruff/compare/v0.3.0...v0.3.2)
Updates `pytest` from 8.0.2 to 8.1.1
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/8.0.2...8.1.1)
Updates `mkdocs-material` from 9.5.12 to 9.5.13
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.5.12...9.5.13)
---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
- dependency-name: pytest
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: development
- dependency-name: mkdocs-material
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
...
Signed-off-by: dependabot[bot] <support@github.com>
* Updates pre-commit hook versions and runs it against all files
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Trenton H <797416+stumpylog@users.noreply.github.com>
2024-03-12 07:56:01 -07:00
Trenton H
4813a7bc70
Chore: Adds additional rules for Ruff linter (#5660)
2024-02-05 21:46:59 +00:00
Trenton H
25542c56b9
Feature: Cache metadata and suggestions in Redis (#5638)
2024-02-04 10:42:21 -08:00
Trenton H
e16645b146
Feature: Add additional caching support to suggestions and metadata (#5414)
* Adds ETag and Last-Modified headers to suggestions, metadata and previews
* Slight update to the suggestions etag
* Small user message for why classifier didn't train again
2024-01-16 17:01:07 +00:00
Trenton H
41a3c7c89b
Fix: Catch new warning when loading the classifier (#5395)
2024-01-14 13:21:17 -08:00
shamoon
f525ac0af6
Chore: add pre-commit hook for codespell (#5324)
2024-01-08 13:03:05 -08:00
Trenton H
061f33fb05
Feature: Allow setting backend configuration settings via the UI (#5126)
* Saving some start on this
* At least partially working for the tesseract parser
* Problems with migration testing need to figure out
* Work around that error
* Fixes max m_pixels
* Moving the settings to main paperless application
* Starting some consumer options
* More fixes and work
* Fixes these last tests
* Fix max_length on OcrSettings.mode field
* Fix all fields on Common & Ocr settings serializers
* Umbrellla config view
* Revert "Umbrellla config view"
This reverts commit fbaf9f4be30f89afeb509099180158a3406416a5.
* Updates to use a single configuration object for all settings
* Squashed commit of the following:
commit 8a0a49dd5766094f60462fbfbe62e9921fbd2373
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 23:02:47 2023 -0800

    Fix formatting

commit 66b2d90c507b8afd9507813ff555e46198ea33b9
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 22:36:35 2023 -0800

    Refactor frontend data models

commit 5723bd8dd823ee855625e250df39393e26709d48
Author: Adam Bogdał <adam@bogdal.pl>
Date: Wed Dec 20 01:17:43 2023 +0100

    Fix: speed up admin panel for installs with a large number of documents (#5052)

commit 9b08ce176199bf9011a6634bb88f616846150d2b
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 15:18:51 2023 -0800

    Update PULL_REQUEST_TEMPLATE.md

commit a6248bec2d793b7690feed95fcaf5eb34a75bfb6
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 15:02:05 2023 -0800

    Chore: Update Angular to v17 (#4980)

commit b1f6f52486d5ba5c04af99b41315eb6428fd1fa8
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 13:53:56 2023 -0800

    Fix: Dont allow null custom_fields property via API (#5063)

commit 638d9970fd468d8c02c91d19bd28f8b0796bdcb1
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 13:43:50 2023 -0800

    Enhancement: symmetric document links (#4907)

commit 5e8de4c1da6eb4eb8f738b20962595c7536b30ec
Author: shamoon <4887959+shamoon@users.noreply.github.com>
Date: Tue Dec 19 12:45:04 2023 -0800

    Enhancement: shared icon & shared by me filter (#4859)

commit 088bad90306025d3f6b139cbd0ad264a1cbecfe5
Author: Trenton H <797416+stumpylog@users.noreply.github.com>
Date: Tue Dec 19 12:04:03 2023 -0800

    Bulk updates all the backend libraries (#5061)

* Saving some work on frontend config
* Very basic but dynamically-generated config form
* Saving work on slightly less ugly frontend config
* JSON validation for user_args field
* Fully dynamic config form
* Adds in some additional validators for a nicer error message
* Cleaning up the testing and coverage more
* Reverts unintentional change
* Adds documentation about the settings and the precedence
* Couple more commenting and style fixes
---------
Co-authored-by: shamoon <4887959+shamoon@users.noreply.github.com>
2023-12-29 15:42:56 -08:00
Trenton H
facb7226fe
Chore: Backend bulk updates (#4509)
2023-11-13 17:09:56 +00:00
Trenton Holmes
650c816a7b
Removes support for Python 3.8 and lower from the code base
2023-09-10 11:42:59 -07:00
Trenton Holmes
d376f9e7a3
Adding more typing around the classification and matching
2023-07-26 07:03:43 -07:00
Trenton H
8aa5ecde62
Updates some Python dependencies and the hooks
2023-07-20 18:30:11 -07:00
Trenton H
c1641f6fb8
Just in case, catch a sometimes nltk error and return the basic processed content instead
2023-05-24 19:34:49 -07:00
Trenton H
6f163111ce
Upgrades black to v23, upgrades ruff
2023-04-26 09:35:27 -07:00
Trenton H
3bcbd05252
Fixes ruff not running isort against the codebase
2023-04-26 09:35:27 -07:00
Trenton H
ce41ac9158
Configures ruff as the one stop linter and resolves warnings it raised
2023-04-01 17:03:52 -07:00
Trenton H
41bcfcaffe
Changes out the settings and a decent amount of test code to be pathlib compatible
2023-03-06 09:16:07 -08:00
Trenton Holmes
6b939f7567
Returns to using hashing against primary keys, at least for fields. Improves testing coverage
2023-02-28 08:13:10 -08:00
Trenton Holmes
c958a7c593
Changes from a hash based system to a time based system to prevent extra retrains
2023-02-28 08:13:10 -08:00
Trenton H
8709ea4df0
Changes classifier training to hold less data in memory at the same time
2023-02-28 08:13:10 -08:00
Trenton H
1e891414a3
Allows disabling NLTK, adds it as a consideration for low power devices
2022-10-10 08:58:23 -07:00
Trenton Holmes
c44c914d3d
Changes the NLTK language to be based on the Tesseract OCR language, with fallback to the default processing
2022-10-10 08:58:23 -07:00
Trenton H
d10d2f5a54
Allows configuration of the NLTK processing language
2022-10-10 08:58:23 -07:00
Trenton Holmes
6523cf0c4b
Fixes the download and usage of the downloaded data
2022-10-10 08:58:23 -07:00
Trenton Holmes
d856e48045
Updates the pre-processing of document content to be much more robust, with tokenization, stemming and stop word removal
2022-10-10 08:58:23 -07:00
Trenton Holmes
b70e21a6d5
When raising an exception during exception handling, chain them together for slightly cleaner logs
2022-08-03 09:00:56 -07:00
Trenton Holmes
55dadea98e
No need for a branch here, the loop takes care of it
2022-07-05 08:20:35 +02:00
Trenton Holmes
77fbbe95ff
Updates the classifier to catch warnings from scikit-learn and rebuild the model file when this happens
2022-07-05 08:20:35 +02:00
Markus
69ef26dab0
Feature: Dynamic document storage pathes (#916)
* Added devcontainer
* Add feature storage pathes
* Exclude tests and add versioning
* Check escaping
* Check escaping
* Check quoting
* Echo
* Escape
* Escape :
* Double escape \
* Escaping
* Remove if
* Escape colon
* Missing \
* Esacpe :
* Escape all
* test
* Remove sed
* Fix exclude
* Remove SED command
* Add LD_LIBRARY_PATH
* Adjusted to v1.7
* Updated test-cases
* Remove devcontainer
* Removed internal build-file
* Run pre-commit
* Corrected flak8 error
* Adjusted to v1.7
* Updated test-cases
* Corrected flak8 error
* Adjusted to new plural translations
* Small adjustments due to code-review backend
* Adjusted line-break
* Removed PAPERLESS prefix from settings variables
* Corrected style change due to search+replace
* First documentation draft
* Revert changes to Pipfile
* Add sphinx-autobuild with keep-outdated
* Revert merge error that results in wrong storage path is evaluated
* Adjust styles of generated files ...
* Adds additional testing to cover dynamic storage path functionality
* Remove unnecessary condition
* Add hint to edit storage path dialog
* Correct spelling of pathes to paths
* Minor documentation tweaks
* Minor typo
* improving wrapping of filter editor buttons with new storage path button
* Update .gitignore
* Fix select border radius in non input-groups
* Better storage path edit hint
* Add note to edit storage path dialog re document_renamer
* Add note to bulk edit storage path re document_renamer
* Rename FILTER_STORAGE_DIRECTORY to PATH
* Fix broken filter rule parsing
* Show default storage if unspecified
* Remove note re storage path on bulk edit
* Add basic validation of filename variables
Co-authored-by: Markus Kling <markus@markus-kling.net>
Co-authored-by: Trenton Holmes <holmes.trenton@gmail.com>
Co-authored-by: Michael Shamoon <4887959+shamoon@users.noreply.github.com>
Co-authored-by: Quinn Casey <quinn@quinncasey.com>
2022-05-19 14:42:25 -07:00
Trenton Holmes
3003bdd507
Runs pyupgrade to Python 3.8+ and adds a hook for it
2022-05-06 09:04:08 -07:00
Trenton Holmes
9bb5568d8e
Un-pickle and re-pickle the test models to resolve the version difference warning
2022-03-22 09:37:17 +01:00
Johann Bauer
cffdaefe2f
Fix model test
2022-03-21 18:53:53 +01:00
Johann Bauer
9de4ca61e8
Increase FORMAT_VERSION to force model re-creation
2022-03-21 18:11:18 +01:00
Trenton Holmes
1771d18a21
Runs the pre-commit hooks over all the Python files
2022-03-11 11:34:28 -08:00
kpj
fc695896dd
Format Python code with black
2022-02-27 15:26:41 +01:00
jonaswinkler
a3dae02cfb
write classifier model to temporary file before copying to final location
2021-06-13 12:03:20 +02:00
jonaswinkler
635c96accf
better exception handling
2021-05-19 23:11:24 +02:00
jonaswinkler
ca1e838c52
catch another exception regarding classifier loading
2021-05-19 22:57:52 +02:00
Jonas Winkler
61b47e358f
correct file mode
2021-05-16 01:22:51 +02:00
jonaswinkler
12235cc853
fixes #689
2021-03-03 23:35:26 +01:00
jonaswinkler
7e88085377
load sklearn modules only when training data has changed
2021-02-15 11:25:25 +01:00
jonaswinkler
b48e67d714
revert a faulty change that caused memory usage to explode #537
2021-02-13 19:51:04 +01:00