mirror of
https://github.com/paperless-ngx/paperless-ngx.git
synced 2026-01-14 21:54:22 -06:00
Feature: Paperless AI (#10319)
This commit is contained in:
@@ -1824,3 +1824,67 @@ password. All of these options come from their similarly-named [Django settings]
|
||||
: The endpoint to use for the remote OCR engine. This is required for Azure AI.
|
||||
|
||||
Defaults to None.
|
||||
|
||||
## AI {#ai}
|
||||
|
||||
#### [`PAPERLESS_AI_ENABLED=<bool>`](#PAPERLESS_AI_ENABLED) {#PAPERLESS_AI_ENABLED}
|
||||
|
||||
: Enables the AI features in Paperless. This includes the AI-based
|
||||
suggestions. This setting is required to be set to true in order to use the AI features.
|
||||
|
||||
Defaults to false.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_EMBEDDING_BACKEND=<str>`](#PAPERLESS_AI_LLM_EMBEDDING_BACKEND) {#PAPERLESS_AI_LLM_EMBEDDING_BACKEND}
|
||||
|
||||
: The embedding backend to use for RAG. This can be either "openai" or "huggingface".
|
||||
|
||||
Defaults to None.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_EMBEDDING_MODEL=<str>`](#PAPERLESS_AI_LLM_EMBEDDING_MODEL) {#PAPERLESS_AI_LLM_EMBEDDING_MODEL}
|
||||
|
||||
: The model to use for the embedding backend for RAG. This can be set to any of the embedding models supported by the current embedding backend. If not supplied, defaults to "text-embedding-3-small" for OpenAI and "sentence-transformers/all-MiniLM-L6-v2" for Huggingface.
|
||||
|
||||
Defaults to None.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_BACKEND=<str>`](#PAPERLESS_AI_LLM_BACKEND) {#PAPERLESS_AI_LLM_BACKEND}
|
||||
|
||||
: The AI backend to use. This can be either "openai" or "ollama". If set to "ollama", the AI
|
||||
features will be run locally on your machine. If set to "openai", the AI features will be run
|
||||
using the OpenAI API. This setting is required to be set to use the AI features.
|
||||
|
||||
Defaults to None.
|
||||
|
||||
!!! note
|
||||
|
||||
The OpenAI API is a paid service. You will need to set up an OpenAI account and
|
||||
will be charged for usage incurred by Paperless-ngx features and your document data
|
||||
will (of course) be sent to the OpenAI API. Paperless-ngx does not endorse the use of the
|
||||
OpenAI API in any way.
|
||||
|
||||
Refer to the OpenAI terms of service, and use at your own risk.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_MODEL=<str>`](#PAPERLESS_AI_LLM_MODEL) {#PAPERLESS_AI_LLM_MODEL}
|
||||
|
||||
: The model to use for the AI backend, i.e. "gpt-3.5-turbo", "gpt-4" or any of the models supported by the
|
||||
current backend. If not supplied, defaults to "gpt-3.5-turbo" for OpenAI and "llama3" for Ollama.
|
||||
|
||||
Defaults to None.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_API_KEY=<str>`](#PAPERLESS_AI_LLM_API_KEY) {#PAPERLESS_AI_LLM_API_KEY}
|
||||
|
||||
: The API key to use for the AI backend. This is required for the OpenAI backend (optional for others).
|
||||
|
||||
Defaults to None.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_ENDPOINT=<str>`](#PAPERLESS_AI_LLM_ENDPOINT) {#PAPERLESS_AI_LLM_ENDPOINT}
|
||||
|
||||
: The endpoint / url to use for the AI backend. This is required for the Ollama backend (optional for others).
|
||||
|
||||
Defaults to None.
|
||||
|
||||
#### [`PAPERLESS_AI_LLM_INDEX_TASK_CRON=<cron expression>`](#PAPERLESS_AI_LLM_INDEX_TASK_CRON) {#PAPERLESS_AI_LLM_INDEX_TASK_CRON}
|
||||
|
||||
: Configures the schedule to update the AI embeddings of text content and metadata for all documents. Only performed if
|
||||
AI is enabled and the LLM embedding backend is set.
|
||||
|
||||
Defaults to `10 2 * * *`, once per day.
|
||||
|
||||
@@ -31,6 +31,7 @@ physical documents into a searchable online archive so you can keep, well, _less
|
||||
- _New!_ Supports remote OCR with Azure AI (opt-in).
|
||||
- Documents are saved as PDF/A format which is designed for long term storage, alongside the unaltered originals.
|
||||
- Uses machine-learning to automatically add tags, correspondents and document types to your documents.
|
||||
- **New**: Paperless-ngx can now leverage AI (Large Language Models or LLMs) for document suggestions. This is an optional feature that can be enabled (and is disabled by default).
|
||||
- Supports PDF documents, images, plain text files, Office documents (Word, Excel, PowerPoint, and LibreOffice equivalents)[^1] and more.
|
||||
- Paperless stores your documents plain on disk. Filenames and folders are managed by paperless and their format can be configured freely with different configurations assigned to different documents.
|
||||
- **Beautiful, modern web application** that features:
|
||||
|
||||
@@ -278,6 +278,28 @@ Once setup, navigating to the email settings page in Paperless-ngx will allow yo
|
||||
You can also submit a document using the REST API, see [POSTing documents](api.md#file-uploads)
|
||||
for details.
|
||||
|
||||
## Document Suggestions
|
||||
|
||||
Paperless-ngx can suggest tags, correspondents, document types and storage paths for documents based on the content of the document. This is done using a (non-LLM) machine learning model that is trained on the documents in your database. The suggestions are shown in the document detail page and can be accepted or rejected by the user.
|
||||
|
||||
## AI Features
|
||||
|
||||
Paperless-ngx includes several features that use AI to enhance the document management experience. These features are optional and can be enabled or disabled in the settings. If you are using the AI features, you may want to also enable the "LLM index" feature, which supports Retrieval-Augmented Generation (RAG) designed to improve the quality of AI responses. The LLM index feature is not enabled by default and requires additional configuration.
|
||||
|
||||
!!! warning
|
||||
|
||||
Remember that Paperless-ngx will send document content to the AI provider you have configured, so consider the privacy implications of using these features, especially if using a remote model (e.g. OpenAI), instead of the default local model.
|
||||
|
||||
The AI features work by creating an embedding of the text content and metadata of documents, which is then used for various tasks such as similarity search and question answering. This uses the FAISS vector store.
|
||||
|
||||
### AI-Enhanced Suggestions
|
||||
|
||||
If enabled, Paperless-ngx can use an AI LLM model to suggest document titles, dates, tags, correspondents and document types for documents. This feature will always be "opt-in" and does not disable the existing classifier-based suggestion system. Currently, both remote (via the OpenAI API) and local (via Ollama) models are supported, see [configuration](configuration.md#ai) for details.
|
||||
|
||||
### Document Chat
|
||||
|
||||
Paperless-ngx can use an AI LLM model to answer questions about a document or across multiple documents. Again, this feature works best when RAG is enabled. The chat feature is available in the upper app toolbar and will switch between chatting across multiple documents or a single document based on the current view.
|
||||
|
||||
## Sharing documents from Paperless-ngx
|
||||
|
||||
Paperless-ngx supports sharing documents with other users by assigning them [permissions](#object-permissions)
|
||||
|
||||
Reference in New Issue
Block a user