Updated Backend Ideas List (markdown)
parent
5a6d1680e4
commit
58095100a6
@ -6,34 +6,27 @@
- Provides no benefit
- Does still linger in the code base here and there
## Formatting Language
- Combine and standardize the formatting used for titles and filenames
- Add some basic operations?
- Ensure dates are formatted using the proper locale
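
One way to picture a shared formatting language for titles and filenames is a small formatter that both code paths call. The sketch below is only an illustration built on the standard library's `string.Formatter`; the names `TemplateFormatter`, `OPERATIONS`, and `render`, and the `upper`/`lower`/`slug` operations, are hypothetical and not part of the existing code base. Locale-aware date output is not handled here.

```python
import string
from datetime import date


class TemplateFormatter(string.Formatter):
    """Hypothetical shared formatter for titles and filenames."""

    # Hypothetical "basic operations", selected through the format spec,
    # e.g. "{title:slug}".
    OPERATIONS = {
        "upper": str.upper,
        "lower": str.lower,
        "slug": lambda value: "-".join(value.lower().split()),
    }

    def format_field(self, value, format_spec):
        operation = self.OPERATIONS.get(format_spec)
        if operation is not None:
            return operation(str(value))
        # Anything else (such as "%Y" for dates) uses normal formatting.
        return super().format_field(value, format_spec)


def render(template: str, **fields) -> str:
    """Render a title or filename template with the same rules."""
    return TemplateFormatter().format(template, **fields)


filename = render(
    "{created:%Y}/{title:slug}",
    created=date(2023, 1, 5),
    title="Tax Return",
)
print(filename)  # 2023/tax-return
```

Routing both titles and filenames through one function like `render` would keep the supported placeholders and operations in a single place.
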
## Context Managers
Update the consumer, and maybe the parsers, to be context managers.
- Allows them to be used in `with` statements and to persist some values until exit
- Single location to clean up temporary directories.
- Allows a connection to a server to be maintained for the lifetime of the object, which would slightly reduce the connection overhead to Tika
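
The points above could look roughly like the following. This is a minimal sketch, not the actual consumer: the `Consumer` class here, its `scratch_dir` attribute, and its `run` method are hypothetical stand-ins, and only the standard library is used.

```python
import shutil
import tempfile
from pathlib import Path


class Consumer:
    """Hypothetical stand-in for the real consumer."""

    def __init__(self, source: Path):
        self.source = source
        self.scratch_dir = None

    def __enter__(self):
        # Values created here persist until the `with` block exits.
        self.scratch_dir = Path(tempfile.mkdtemp(prefix="paperless-"))
        return self

    def __exit__(self, exc_type, exc, tb):
        # Single location to clean up temporary directories,
        # reached on success and on error alike.
        if self.scratch_dir is not None:
            shutil.rmtree(self.scratch_dir, ignore_errors=True)
            self.scratch_dir = None
        return False  # never swallow exceptions

    def run(self) -> Path:
        # Placeholder for the real work; writes into the scratch directory.
        working_copy = self.scratch_dir / self.source.name
        working_copy.write_text("placeholder")
        return working_copy


with Consumer(Path("document.pdf")) as consumer:
    produced = consumer.run()
    scratch = consumer.scratch_dir
```

A long-lived connection, for example to Tika, could be opened in `__enter__` and closed in `__exit__` in the same way.
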
## Migration to s6-overlay
- supervisord isn't meant to run as PID 1, s6 is
- s6 startup can be separated into independent units, with dependencies between them, which could slightly improve startup time
- Initial work done in https://github.com/paperless-ngx/paperless-ngx/tree/feature-s6-overlay
## Integrate `apprise`
- All-in-one library for notifications across multiple services, from email to self-hosted instances
- Need to standardize what is notified and how it is tagged (i.e. always include `paperless-ngx`, and maybe a level like `warning`, `error`, etc.)
- The user would probably provide a file path to the config
- As much as possible, would likely want to persist the client through a consumption, to prevent extra work
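
A rough sketch of how these points might fit together is shown below. The `Notifier` class, `build_notification`, and `LEVEL_TO_TYPE` are hypothetical names, and the sketch assumes one possible reading of "tagged": that the URLs in the user's Apprise config carry a `paperless-ngx` tag, which is then used to select them when notifying. The third-party `apprise` package is imported lazily, so the payload helper works without it.

```python
from pathlib import Path

# Hypothetical mapping from levels to attribute names on apprise.NotifyType.
LEVEL_TO_TYPE = {
    "info": "INFO",
    "success": "SUCCESS",
    "warning": "WARNING",
    "error": "FAILURE",
}


def build_notification(level: str, message: str) -> dict:
    """Standardize what is sent: a fixed title prefix plus the level."""
    if level not in LEVEL_TO_TYPE:
        raise ValueError(f"unknown level: {level}")
    return {
        "title": f"paperless-ngx [{level}]",
        "body": message,
        "type_name": LEVEL_TO_TYPE[level],
    }


class Notifier:
    """Hypothetical wrapper that keeps one Apprise client alive."""

    def __init__(self, config_path: Path):
        import apprise  # third-party, only needed when actually sending

        config = apprise.AppriseConfig()
        config.add(str(config_path))  # user-provided config file
        self._apprise = apprise
        self._client = apprise.Apprise()
        self._client.add(config)

    def send(self, level: str, message: str) -> bool:
        note = build_notification(level, message)
        result = self._client.notify(
            body=note["body"],
            title=note["title"],
            notify_type=getattr(self._apprise.NotifyType, note["type_name"]),
            tag="paperless-ngx",  # assumption: config URLs carry this tag
        )
        return bool(result)
```

Creating one `Notifier` at the start of a consumption and reusing it for every message would avoid re-reading the config each time.
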
## External Services
- External OCR services, using an API, could provide more recent tesseract and ghostscript versions, potentially fixing issues faster than Debian updates (thinking Alpine)
- Is time consuming, so might need celery there? And a database?
- fastapi could easily set this up
### External OCR
- External OCR services, using an API, could provide more recent tesseract and ghostscript versions, potentially fixing issues faster than Debian updates (thinking Alpine based image)
- The service would be streamed the document and eventually return the content and an optional archive file
- Is time consuming, so might need celery/huey/task queue there? And a database?
- fastapi could easily set this up, if there is no need for a database.
## Separate OCR from Archive
- Extracting the content of an image or PDF document should be separated from generating the archive file
- There are too many interactions between the two, leading to odd combinations
## Break apart consumer
- The consumer does too much; break it apart into smaller, more discrete steps
- Make each step well defined with possible status/states to report over the websocket and/or notifications
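
As an illustration of discrete steps with reportable states, the sketch below runs a list of named steps and reports each transition through a callback. Everything here (`StepStatus`, `Step`, `run_pipeline`, and the `report` callback standing in for a websocket or notification sender) is a hypothetical design, not the existing consumer.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class StepStatus(Enum):
    STARTED = "started"
    SUCCESS = "success"
    FAILED = "failed"


@dataclass
class Step:
    name: str
    action: Callable[[dict], None]  # mutates the shared context


def run_pipeline(steps, context, report):
    """Run each step in order, reporting every state change."""
    for step in steps:
        report(step.name, StepStatus.STARTED)
        try:
            step.action(context)
        except Exception:
            report(step.name, StepStatus.FAILED)
            raise
        report(step.name, StepStatus.SUCCESS)
    return context


events = []
result = run_pipeline(
    [
        Step("parse", lambda ctx: ctx.update(text="hello")),
        Step("store", lambda ctx: ctx.update(stored=True)),
    ],
    {},
    lambda name, status: events.append((name, status.value)),
)
```

Because each step only sees the shared context, steps can be tested on their own and reordered or replaced without touching the others.
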