Mirror of https://github.com/paperless-ngx/paperless-ngx.git (synced 2025-10-30 03:56:23 -05:00)

	Updated v3 Ideas List (markdown)
| @@ -9,7 +9,7 @@ | ||||
| ### Settings Updates | ||||
|  | ||||
| - Remove all but Django settings from the environment | ||||
| - Separate OCR vs other settings | ||||
| - Separate OCR vs other settings (call them site settings?) | ||||
| - Create multiple levels of OCR settings: | ||||
|   - A default system configuration, controlled by staff/superusers | ||||
|   - A user specific settings set | ||||
| @@ -29,18 +29,24 @@ | ||||
| - An initial task takes the file, waits for it to be unmodified, then determines the next task to start. | ||||
| - Or alternatively, the initial task builds a pipeline and starts that. | ||||
| - Handles deciding if the file can be consumed, rather than when a new file is seen (see plugin ideas) | ||||
| - Make each step along the way a well-defined status update, sent over the websocket, but also allow configuring something like apprise/ntfy | ||||
| - TODO: If something fails along the chain, the DB shouldn't be updated.  Maybe 1 task, multiple steps, wrapped in a transaction? | ||||
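A minimal sketch of the "1 task, multiple steps" idea from the TODO above, using only the standard library. The names `Package`, `run_pipeline`, `notify`, and `commit` are hypothetical and not part of paperless-ngx; `notify` stands in for a websocket or apprise/ntfy sender, and `commit` stands in for the database write.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Package:
    """Hypothetical container that accumulates data as it moves through the steps."""

    path: str
    data: Dict[str, object] = field(default_factory=dict)


Step = Callable[[Package], None]


def run_pipeline(
    package: Package,
    steps: List[Tuple[str, Step]],
    notify: Callable[[str, str], None],
    commit: Callable[[Package], None],
) -> bool:
    """Run every step in order and commit only if all of them succeed."""
    for name, step in steps:
        notify(name, "started")
        try:
            step(package)
        except Exception as exc:
            # A failure anywhere stops the chain before anything is saved.
            notify(name, f"failed: {exc}")
            return False
        notify(name, "finished")
    # Assumption: this is the single place where the database is touched.
    commit(package)
    return True
```

In a Django project, the `commit` callable could do its writes inside `transaction.atomic()`, so a failure during saving would also roll back; that part is not shown here.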
|  | ||||
| ### Actual Plugins | ||||
|  | ||||
| - Design a system to allow plugins, while splitting apart the current code into plugins | ||||
| - I can see the following being plugins: | ||||
|   - Parsers (obviously.  Includes things like AI/cloud OCR to get the content or even could talk to a remote, but local network API) | ||||
|   - Parsers (obviously.  Includes things like AI/cloud OCR to get the content or even could talk to a remote API) | ||||
|   - Archive generation (example, use Gotenberg to convert a PDF to PDF/A instead of ocrmypdf) | ||||
|   - Thumbnail generation (maybe you want to handle PDFs differently than JPEGs?) | ||||
|   - Date parsing (handling non-latin dates, for example) | ||||
|   - Machine learning (provides an interface which returns the proposed tags, type, etc) | ||||
| - Ideally, plugins should be registered when installed, declaring what mime types they support | ||||
| - Ideally, plugins should be registered when installed, declaring what mime types they support, with some sort of conflict resolution | ||||
| - With the settings updates above, a workflow could also be used to set the parser based on matching certain values | ||||
| - Provide "paperless", a core set of functionality, including models | ||||
| - Provide the existing parsers, re-configured to match the new format | ||||
| - Rework the other parts to conform to the plugin API spec | ||||
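One way the registration and conflict resolution described above could look, as a rough sketch. `ParserPlugin`, `PluginRegistry`, and the `priority` field are assumptions made for illustration; the list does not say how conflicts should actually be resolved.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class ParserPlugin:
    """Hypothetical plugin record: a name, the mime types it handles, a priority."""

    name: str
    mime_types: FrozenSet[str]
    priority: int = 0


class PluginRegistry:
    """Hypothetical registry mapping mime types to the plugins that claim them."""

    def __init__(self) -> None:
        self._by_mime: Dict[str, List[ParserPlugin]] = {}

    def register(self, plugin: ParserPlugin) -> None:
        # A plugin declares its mime types once, at registration time.
        for mime in plugin.mime_types:
            self._by_mime.setdefault(mime, []).append(plugin)

    def resolve(self, mime_type: str) -> Optional[ParserPlugin]:
        """Pick the highest-priority plugin; ties go to the name that sorts first."""
        candidates = self._by_mime.get(mime_type, [])
        if not candidates:
            return None
        return min(candidates, key=lambda p: (-p.priority, p.name))


registry = PluginRegistry()
registry.register(ParserPlugin("tesseract", frozenset({"application/pdf", "image/png"}), 10))
registry.register(ParserPlugin("cloud-ocr", frozenset({"application/pdf"}), 20))
```

Here both plugins claim `application/pdf`, and `registry.resolve("application/pdf")` returns the `cloud-ocr` entry because of its higher priority. A workflow-driven override, as suggested in the list, would sit in front of this lookup.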
|  | ||||
|  | ||||
| ### Simpler consumer | ||||
|  | ||||
| @@ -85,23 +91,6 @@ | ||||
| - Getting the content of an image or PDF document should be separated from generating an archive file | ||||
| - Just too many interactions between them, leading to odd combinations | ||||
|  | ||||
| ## Break apart consumer | ||||
|  | ||||
| - The consumer does so much stuff, break it apart into smaller, more discrete steps | ||||
| - Make each step well defined with possible status/states to report over the websocket and/or notifications | ||||
| - Make it a chain of tasks, passing a package through which accumulates data, etc, before being saved | ||||
|  | ||||
| ## Settings Manager | ||||
|  | ||||
| - Allow multiple levels of settings to be defined | ||||
|   - From matching, apply certain settings | ||||
|   - From the user (if known), apply their settings | ||||
|   - From the system wide settings | ||||
|   - From environment variable settings | ||||
|   - Then defaults | ||||
| - Settings at lower levels have less priority, so a matched setting is never changed | ||||
| - Settings travel through the new consumer with the document | ||||
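The priority order listed above maps fairly directly onto `collections.ChainMap`, where the first mapping containing a key wins. This is only a sketch: `resolve_settings` and the setting names (`ocr_language`, `ocr_mode`, `pages`, `dpi`) are made up for the example.

```python
from collections import ChainMap
from typing import Any, Mapping


def resolve_settings(
    matched: Mapping[str, Any],
    user: Mapping[str, Any],
    system: Mapping[str, Any],
    environment: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> ChainMap:
    """Layer the settings so earlier levels shadow later ones."""
    # Lookup order: matched -> user -> system -> environment -> defaults.
    return ChainMap(dict(matched), dict(user), dict(system), dict(environment), dict(defaults))


settings = resolve_settings(
    matched={"ocr_language": "deu"},
    user={"ocr_language": "fra", "ocr_mode": "redo"},
    system={"ocr_mode": "skip", "pages": 0},
    environment={},
    defaults={"ocr_language": "eng", "ocr_mode": "skip", "pages": 1, "dpi": 300},
)
```

With these inputs, `settings["ocr_language"]` comes from the matched level, `settings["ocr_mode"]` from the user, `settings["pages"]` from the system level, and `settings["dpi"]` falls through to the defaults. The resulting object is what could travel through the consumer with the document.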
|  | ||||
| ## Django Ninja | ||||
|  | ||||
| - Really like the OpenAPI spec it generates | ||||
| @@ -114,6 +103,7 @@ | ||||
|   - Could track, with some resolution, when a token was last used.  Might be nice to display and allow removing old tokens which haven't been used | ||||
|   - Could implement expiration too | ||||
| - Async pagination isn't working quite yet | ||||
| - No idea about allauth/oidc integration | ||||
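The token ideas above (last-used tracking "with some resolution", plus expiration) could be sketched as follows. `ApiToken`, `touch`, `is_expired`, and the one-hour `LAST_USED_RESOLUTION` are all assumptions for illustration, not an existing Django Ninja or paperless-ngx API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed resolution: record a new "last used" value at most once per hour.
LAST_USED_RESOLUTION = timedelta(hours=1)


@dataclass
class ApiToken:
    """Hypothetical token record."""

    key: str
    last_used: Optional[datetime] = None
    expires_at: Optional[datetime] = None


def touch(token: ApiToken, now: datetime) -> bool:
    """Update last_used; return True only when a write is actually needed."""
    if token.last_used is None or now - token.last_used >= LAST_USED_RESOLUTION:
        token.last_used = now
        return True
    return False


def is_expired(token: ApiToken, now: datetime) -> bool:
    """A token without an expiry never expires."""
    return token.expires_at is not None and now >= token.expires_at
```

Limiting the resolution keeps a busy token from causing a database write on every request, while still being precise enough to spot tokens that have not been used in weeks.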
|  | ||||
| ## Vector Embeddings | ||||
|  | ||||
|   | ||||
	 Trenton H