Compare commits

...

11 Commits

205 changed files with 4768 additions and 14176 deletions

View File

@@ -4,8 +4,7 @@
set -eu
for command in decrypt_documents \
document_archiver \
for command in document_archiver \
document_exporter \
document_importer \
mail_fetcher \

View File

@@ -1,14 +0,0 @@
#!/command/with-contenv /usr/bin/bash
# shellcheck shell=bash
set -e
cd "${PAPERLESS_SRC_DIR}"
if [[ $(id -u) == 0 ]]; then
s6-setuidgid paperless python3 manage.py decrypt_documents "$@"
elif [[ $(id -un) == "paperless" ]]; then
python3 manage.py decrypt_documents "$@"
else
echo "Unknown user."
fi

View File

@@ -580,36 +580,6 @@ document.
documents, such as encrypted PDF documents. The archiver will skip over
these documents each time it sees them.
### Managing encryption {#encryption}
!!! warning
Encryption was removed in [paperless-ng 0.9](changelog.md#paperless-ng-090)
because it did not really provide any additional security, the passphrase
was stored in a configuration file on the same system as the documents.
Furthermore, the entire text content of the documents is stored plain in
the database, even if your documents are encrypted. Filenames are not
encrypted as well. Finally, the web server provides transparent access to
your encrypted documents.
Consider running paperless on an encrypted filesystem instead, which
will then at least provide security against physical hardware theft.
#### Enabling encryption
Enabling encryption is no longer supported.
#### Disabling encryption
Basic usage to disable encryption of your document store:
(Note: If `PAPERLESS_PASSPHRASE` isn't set already, you need to specify
it here)
```
decrypt_documents [--passphrase SECR3TP4SSPHRA$E]
```
### Detecting duplicates {#fuzzy_duplicate}
Paperless already catches and prevents upload of exactly matching documents,

View File

@@ -501,7 +501,7 @@ The `datetime` filter formats a datetime string or datetime object using Python'
See the [strftime format code documentation](https://docs.python.org/3.13/library/datetime.html#strftime-and-strptime-format-codes)
for the possible codes and their meanings.
##### Date Localization
##### Date Localization {#date-localization}
The `localize_date` filter formats a date or datetime object into a localized string using Babel internationalization.
This takes into account the provided locale for translation. Since this must be used on a date or datetime object,
@@ -851,8 +851,8 @@ followed by the even pages.
It's important that the scan files get consumed in the correct order, and one at a time.
You therefore need to make sure that Paperless is running while you upload the files into
the directory; and if you're using [polling](configuration.md#polling), make sure that
`CONSUMER_POLLING` is set to a value lower than it takes for the second scan to appear,
the directory; and if you're using polling, make sure that
`CONSUMER_POLLING_INTERVAL` is set to a value lower than it takes for the second scan to appear,
like 5-10 or even lower.
Another thing that might happen is that you start a double sided scan, but then forget

View File

@@ -1175,21 +1175,45 @@ don't exist yet.
#### [`PAPERLESS_CONSUMER_IGNORE_PATTERNS=<json>`](#PAPERLESS_CONSUMER_IGNORE_PATTERNS) {#PAPERLESS_CONSUMER_IGNORE_PATTERNS}
: By default, paperless ignores certain files and folders in the
consumption directory, such as system files created by the Mac OS
or hidden folders some tools use to store data.
: Additional regex patterns for files to ignore in the consumption directory. Patterns are matched against filenames only (not full paths)
using Python's `re.match()`, which anchors at the start of the filename.
This can be adjusted by configuring a custom json array with
patterns to exclude.
See the [watchfiles documentation](https://watchfiles.helpmanual.io/api/filters/#watchfiles.BaseFilter.ignore_entity_patterns)
For example, `.DS_STORE/*` will ignore any files found in a folder
named `.DS_STORE`, including `.DS_STORE/bar.pdf` and `foo/.DS_STORE/bar.pdf`
This setting is for additional patterns beyond the built-in defaults. Common system files and directories are already ignored automatically.
The patterns will be compiled via Python's standard `re` module.
A pattern like `._*` will ignore anything starting with `._`, including:
`._foo.pdf` and `._bar/foo.pdf`
Example custom patterns:
Defaults to
`[".DS_Store", ".DS_STORE", "._*", ".stfolder/*", ".stversions/*", ".localized/*", "desktop.ini", "@eaDir/*", "Thumbs.db"]`.
```json
["^temp_", "\\.bak$", "^~"]
```
This would ignore:
- Files starting with `temp_` (e.g., `temp_scan.pdf`)
- Files ending with `.bak` (e.g., `document.pdf.bak`)
- Files starting with `~` (e.g., `~$document.docx`)
Defaults to `[]` (empty list, uses only built-in defaults).
The default ignores are `[.DS_Store, .DS_STORE, ._*, desktop.ini, Thumbs.db]` and cannot be overridden.
#### [`PAPERLESS_CONSUMER_IGNORE_DIRS=<json>`](#PAPERLESS_CONSUMER_IGNORE_DIRS) {#PAPERLESS_CONSUMER_IGNORE_DIRS}
: Additional directory names to ignore in the consumption directory. Directories matching these names (and all their contents) will be skipped.
This setting is for additional directories beyond the built-in defaults. Matching is done by directory name only, not full path.
Example:
```json
["temp", "incoming", ".hidden"]
```
Defaults to `[]` (empty list, uses only built-in defaults).
The default ignores are `[.stfolder, .stversions, .localized, @eaDir, .Spotlight-V100, .Trashes, __MACOSX]` and cannot be overridden.
#### [`PAPERLESS_CONSUMER_BARCODE_SCANNER=<string>`](#PAPERLESS_CONSUMER_BARCODE_SCANNER) {#PAPERLESS_CONSUMER_BARCODE_SCANNER}
@@ -1288,48 +1312,24 @@ within your documents.
Defaults to false.
### Polling {#polling}
#### [`PAPERLESS_CONSUMER_POLLING_INTERVAL=<num>`](#PAPERLESS_CONSUMER_POLLING_INTERVAL) {#PAPERLESS_CONSUMER_POLLING_INTERVAL}
#### [`PAPERLESS_CONSUMER_POLLING=<num>`](#PAPERLESS_CONSUMER_POLLING) {#PAPERLESS_CONSUMER_POLLING}
: Configures how the consumer detects new files in the consumption directory.
: If paperless won't find documents added to your consume folder, it
might not be able to automatically detect filesystem changes. In
that case, specify a polling interval in seconds here, which will
then cause paperless to periodically check your consumption
directory for changes. This will also disable listening for file
system changes with `inotify`.
When set to `0` (default), paperless uses native filesystem notifications for efficient, immediate detection of new files.
Defaults to 0, which disables polling and uses filesystem
notifications.
When set to a positive number, paperless polls the consumption directory at that interval in seconds. Use polling for network filesystems (NFS, SMB/CIFS) where native notifications may not work reliably.
#### [`PAPERLESS_CONSUMER_POLLING_RETRY_COUNT=<num>`](#PAPERLESS_CONSUMER_POLLING_RETRY_COUNT) {#PAPERLESS_CONSUMER_POLLING_RETRY_COUNT}
Defaults to 0.
: If consumer polling is enabled, sets the maximum number of times
paperless will check for a file to remain unmodified. If a file's
modification time and size are identical for two consecutive checks, it
will be consumed.
#### [`PAPERLESS_CONSUMER_STABILITY_DELAY=<num>`](#PAPERLESS_CONSUMER_STABILITY_DELAY) {#PAPERLESS_CONSUMER_STABILITY_DELAY}
Defaults to 5.
: Sets the time in seconds that a file must remain unchanged (same size and modification time) before paperless will begin consuming it.
#### [`PAPERLESS_CONSUMER_POLLING_DELAY=<num>`](#PAPERLESS_CONSUMER_POLLING_DELAY) {#PAPERLESS_CONSUMER_POLLING_DELAY}
Increase this value if you experience issues with files being consumed before they are fully written, particularly on slower network storage or
with certain scanner quirks
: If consumer polling is enabled, sets the delay in seconds between
each check (above) paperless will do while waiting for a file to
remain unmodified.
Defaults to 5.
### iNotify {#inotify}
#### [`PAPERLESS_CONSUMER_INOTIFY_DELAY=<num>`](#PAPERLESS_CONSUMER_INOTIFY_DELAY) {#PAPERLESS_CONSUMER_INOTIFY_DELAY}
: Sets the time in seconds the consumer will wait for additional
events from inotify before the consumer will consider a file ready
and begin consumption. Certain scanners or network setups may
generate multiple events for a single file, leading to multiple
consumers working on the same file. Configure this to prevent that.
Defaults to 0.5 seconds.
Defaults to 5.0 seconds.
## Workflow webhooks

25
docs/migration.md Normal file
View File

@@ -0,0 +1,25 @@
# v3 Migration Guide
## Consumer Settings Changes
The v3 consumer command uses a [different library](https://watchfiles.helpmanual.io/) to unify
the watching for new files in the consume directory. For the user, this removes several configuration options related to delays and retries
and replaces with a single unified setting. It also adjusts how the consumer ignore filtering happens, replaced `fnmatch` with `regex` and
separating the directory ignore from the file ignore.
### Summary
| Old Setting | New Setting | Notes |
| ------------------------------ | ----------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------ |
| `CONSUMER_POLLING` | [`CONSUMER_POLLING_INTERVAL`](configuration.md#PAPERLESS_CONSUMER_POLLING_INTERVAL) | Renamed for clarity |
| `CONSUMER_INOTIFY_DELAY` | [`CONSUMER_STABILITY_DELAY`](configuration.md#PAPERLESS_CONSUMER_STABILITY_DELAY) | Unified for all modes |
| `CONSUMER_POLLING_DELAY` | _Removed_ | Use `CONSUMER_STABILITY_DELAY` |
| `CONSUMER_POLLING_RETRY_COUNT` | _Removed_ | Automatic with stability tracking |
| `CONSUMER_IGNORE_PATTERNS` | [`CONSUMER_IGNORE_PATTERNS`](configuration.md#PAPERLESS_CONSUMER_IGNORE_PATTERNS) | **Now regex, not fnmatch**; user patterns are added to (not replacing) default ones |
| _New_ | [`CONSUMER_IGNORE_DIRS`](configuration.md#PAPERLESS_CONSUMER_IGNORE_DIRS) | Additional directories to ignore; user entries are added to (not replacing) defaults |
## Encryption Support
Document and thumbnail encryption is no longer supported. This was previously deprecated in [paperless-ng 0.9.3](https://github.com/paperless-ngx/paperless-ngx/blob/dev/docs/changelog.md#paperless-ng-093)
Users must decrypt their document using the `decrypt_documents` command before upgrading.

View File

@@ -124,8 +124,7 @@ account. The script essentially automatically performs the steps described in [D
system notifications with `inotify`. When storing the consumption
directory on such a file system, paperless will not pick up new
files with the default configuration. You will need to use
[`PAPERLESS_CONSUMER_POLLING`](configuration.md#PAPERLESS_CONSUMER_POLLING), which will disable inotify. See
[here](configuration.md#polling).
[`PAPERLESS_CONSUMER_POLLING_INTERVAL`](configuration.md#PAPERLESS_CONSUMER_POLLING_INTERVAL), which will disable inotify.
5. Run `docker compose pull`. This will pull the image from the GitHub container registry
by default but you can change the image to pull from Docker Hub by changing the `image`

View File

@@ -46,9 +46,9 @@ run:
If you notice that the consumer will only pickup files in the
consumption directory at startup, but won't find any other files added
later, you will need to enable filesystem polling with the configuration
option [`PAPERLESS_CONSUMER_POLLING`](configuration.md#PAPERLESS_CONSUMER_POLLING).
option [`PAPERLESS_CONSUMER_POLLING_INTERVAL`](configuration.md#PAPERLESS_CONSUMER_POLLING_INTERVAL).
This will disable listening to filesystem changes with inotify and
This will disable automatic listening for filesystem changes and
paperless will manually check the consumption directory for changes
instead.
@@ -234,47 +234,9 @@ FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.yhk3zb
This probably indicates paperless tried to consume the same file twice.
This can happen for a number of reasons, depending on how documents are
placed into the consume folder. If paperless is using inotify (the
default) to check for documents, try adjusting the
[inotify configuration](configuration.md#inotify). If polling is enabled, try adjusting the
[polling configuration](configuration.md#polling).
## Consumer fails waiting for file to remain unmodified.
You might find messages like these in your log files:
```
[ERROR] [paperless.management.consumer] Timeout while waiting on file /usr/src/paperless/src/../consume/SCN_0001.pdf to remain unmodified.
```
This indicates paperless timed out while waiting for the file to be
completely written to the consume folder. Adjusting
[polling configuration](configuration.md#polling) values should resolve the issue.
!!! note
The user will need to manually move the file out of the consume folder
and back in, for the initial failing file to be consumed.
## Consumer fails reporting "OS reports file as busy still".
You might find messages like these in your log files:
```
[WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/src/../consume/SCN_0001.pdf: OS reports file as busy still
```
This indicates paperless was unable to open the file, as the OS reported
the file as still being in use. To prevent a crash, paperless did not
try to consume the file. If paperless is using inotify (the default) to
check for documents, try adjusting the
[inotify configuration](configuration.md#inotify). If polling is enabled, try adjusting the
[polling configuration](configuration.md#polling).
!!! note
The user will need to manually move the file out of the consume folder
and back in, for the initial failing file to be consumed.
placed into the consume folder, such as how a scanner may modify a file multiple times as it scans.
Try adjusting the
[file stability delay](configuration.md#PAPERLESS_CONSUMER_STABILITY_DELAY) to a larger value.
## Log reports "Creating PaperlessTask failed".

View File

@@ -565,7 +565,7 @@ This allows for complex logic to be used to generate the title, including [logic
and [filters](https://jinja.palletsprojects.com/en/3.1.x/templates/#id11).
The template is provided as a string.
Using Jinja2 Templates is also useful for [Date localization](advanced_usage.md#Date-Localization) in the title.
Using Jinja2 Templates is also useful for [Date localization](advanced_usage.md#date-localization) in the title.
The available inputs differ depending on the type of workflow trigger.
This is because at the time of consumption (when the text is to be set), no automatic tags etc. have been
@@ -597,6 +597,7 @@ The following placeholders are only available for "added" or "updated" triggers
- `{{created_day}}`: created day
- `{{created_time}}`: created time in HH:MM format
- `{{doc_url}}`: URL to the document in the web UI. Requires the `PAPERLESS_URL` setting to be set.
- `{{doc_id}}`: Document ID
##### Examples

View File

@@ -69,8 +69,9 @@ nav:
- development.md
- 'FAQs': faq.md
- troubleshooting.md
- 'Migration to v3': migration.md
- changelog.md
copyright: Copyright &copy; 2016 - 2023 Daniel Quinn, Jonas Winkler, and the Paperless-ngx team
copyright: Copyright &copy; 2016 - 2026 Daniel Quinn, Jonas Winkler, and the Paperless-ngx team
extra:
social:
- icon: fontawesome/brands/github

View File

@@ -55,10 +55,10 @@
#PAPERLESS_TASK_WORKERS=1
#PAPERLESS_THREADS_PER_WORKER=1
#PAPERLESS_TIME_ZONE=UTC
#PAPERLESS_CONSUMER_POLLING=10
#PAPERLESS_CONSUMER_POLLING_INTERVAL=10
#PAPERLESS_CONSUMER_DELETE_DUPLICATES=false
#PAPERLESS_CONSUMER_RECURSIVE=false
#PAPERLESS_CONSUMER_IGNORE_PATTERNS=[".DS_STORE/*", "._*", ".stfolder/*", ".stversions/*", ".localized/*", "desktop.ini"]
#PAPERLESS_CONSUMER_IGNORE_PATTERNS=[] # Defaults are built in; add filename regexes, e.g. ["^\\.DS_Store$", "^desktop\\.ini$"]
#PAPERLESS_CONSUMER_SUBDIRS_AS_TAGS=false
#PAPERLESS_CONSUMER_ENABLE_BARCODES=false
#PAPERLESS_CONSUMER_BARCODE_STRING=PATCHT

View File

@@ -27,7 +27,7 @@ dependencies = [
# WARNING: django does not use semver.
# Only patch versions are guaranteed to not introduce breaking changes.
"django~=5.2.5",
"django-allauth[mfa,socialaccount]~=65.12.1",
"django-allauth[mfa,socialaccount]~=65.13.1",
"django-auditlog~=3.4.1",
"django-cachalot~=2.8.0",
"django-celery-results~=2.6.0",
@@ -50,7 +50,6 @@ dependencies = [
"gotenberg-client~=0.13.1",
"httpx-oauth~=0.16",
"imap-tools~=1.11.0",
"inotifyrecursive~=0.3",
"jinja2~=3.1.5",
"langdetect~=1.0.9",
"llama-index-core>=0.14.12",
@@ -79,7 +78,7 @@ dependencies = [
"tika-client~=0.10.0",
"torch~=2.9.1",
"tqdm~=4.67.1",
"watchdog~=6.0",
"watchfiles>=1.1.1",
"whitenoise~=6.9",
"whoosh-reloaded>=2.7.5",
"zxing-cpp~=2.3.0",

View File

@@ -412,6 +412,9 @@ describe('WorkflowEditDialogComponent', () => {
return newFilter
}
const correspondentAny = addFilterOfType(TriggerFilterType.CorrespondentAny)
correspondentAny.get('values').setValue([11])
const correspondentIs = addFilterOfType(TriggerFilterType.CorrespondentIs)
correspondentIs.get('values').setValue(1)
@@ -421,12 +424,18 @@ describe('WorkflowEditDialogComponent', () => {
const documentTypeIs = addFilterOfType(TriggerFilterType.DocumentTypeIs)
documentTypeIs.get('values').setValue(1)
const documentTypeAny = addFilterOfType(TriggerFilterType.DocumentTypeAny)
documentTypeAny.get('values').setValue([12])
const documentTypeNot = addFilterOfType(TriggerFilterType.DocumentTypeNot)
documentTypeNot.get('values').setValue([1])
const storagePathIs = addFilterOfType(TriggerFilterType.StoragePathIs)
storagePathIs.get('values').setValue(1)
const storagePathAny = addFilterOfType(TriggerFilterType.StoragePathAny)
storagePathAny.get('values').setValue([13])
const storagePathNot = addFilterOfType(TriggerFilterType.StoragePathNot)
storagePathNot.get('values').setValue([1])
@@ -441,10 +450,13 @@ describe('WorkflowEditDialogComponent', () => {
expect(formValues.triggers[0].filter_has_tags).toEqual([1])
expect(formValues.triggers[0].filter_has_all_tags).toEqual([2, 3])
expect(formValues.triggers[0].filter_has_not_tags).toEqual([4])
expect(formValues.triggers[0].filter_has_any_correspondents).toEqual([11])
expect(formValues.triggers[0].filter_has_correspondent).toEqual(1)
expect(formValues.triggers[0].filter_has_not_correspondents).toEqual([1])
expect(formValues.triggers[0].filter_has_any_document_types).toEqual([12])
expect(formValues.triggers[0].filter_has_document_type).toEqual(1)
expect(formValues.triggers[0].filter_has_not_document_types).toEqual([1])
expect(formValues.triggers[0].filter_has_any_storage_paths).toEqual([13])
expect(formValues.triggers[0].filter_has_storage_path).toEqual(1)
expect(formValues.triggers[0].filter_has_not_storage_paths).toEqual([1])
expect(formValues.triggers[0].filter_custom_field_query).toEqual(
@@ -507,16 +519,22 @@ describe('WorkflowEditDialogComponent', () => {
setFilter(TriggerFilterType.TagsAll, 11)
setFilter(TriggerFilterType.TagsNone, 12)
setFilter(TriggerFilterType.CorrespondentAny, 16)
setFilter(TriggerFilterType.CorrespondentNot, 13)
setFilter(TriggerFilterType.DocumentTypeAny, 17)
setFilter(TriggerFilterType.DocumentTypeNot, 14)
setFilter(TriggerFilterType.StoragePathAny, 18)
setFilter(TriggerFilterType.StoragePathNot, 15)
const formValues = component['getFormValues']()
expect(formValues.triggers[0].filter_has_all_tags).toEqual([11])
expect(formValues.triggers[0].filter_has_not_tags).toEqual([12])
expect(formValues.triggers[0].filter_has_any_correspondents).toEqual([16])
expect(formValues.triggers[0].filter_has_not_correspondents).toEqual([13])
expect(formValues.triggers[0].filter_has_any_document_types).toEqual([17])
expect(formValues.triggers[0].filter_has_not_document_types).toEqual([14])
expect(formValues.triggers[0].filter_has_any_storage_paths).toEqual([18])
expect(formValues.triggers[0].filter_has_not_storage_paths).toEqual([15])
})
@@ -640,8 +658,11 @@ describe('WorkflowEditDialogComponent', () => {
filter_has_tags: [],
filter_has_all_tags: [],
filter_has_not_tags: [],
filter_has_any_correspondents: [],
filter_has_not_correspondents: [],
filter_has_any_document_types: [],
filter_has_not_document_types: [],
filter_has_any_storage_paths: [],
filter_has_not_storage_paths: [],
filter_has_correspondent: null,
filter_has_document_type: null,
@@ -699,11 +720,14 @@ describe('WorkflowEditDialogComponent', () => {
trigger.filter_has_tags = [1]
trigger.filter_has_all_tags = [2, 3]
trigger.filter_has_not_tags = [4]
trigger.filter_has_any_correspondents = [10] as any
trigger.filter_has_correspondent = 5 as any
trigger.filter_has_not_correspondents = [6] as any
trigger.filter_has_document_type = 7 as any
trigger.filter_has_any_document_types = [11] as any
trigger.filter_has_not_document_types = [8] as any
trigger.filter_has_storage_path = 9 as any
trigger.filter_has_any_storage_paths = [12] as any
trigger.filter_has_not_storage_paths = [10] as any
trigger.filter_custom_field_query = JSON.stringify([
'AND',
@@ -714,8 +738,8 @@ describe('WorkflowEditDialogComponent', () => {
component.ngOnInit()
const triggerGroup = component.triggerFields.at(0) as FormGroup
const filters = component.getFiltersFormArray(triggerGroup)
expect(filters.length).toBe(10)
const customFieldFilter = filters.at(9) as FormGroup
expect(filters.length).toBe(13)
const customFieldFilter = filters.at(12) as FormGroup
expect(customFieldFilter.get('type').value).toBe(
TriggerFilterType.CustomFieldQuery
)
@@ -724,12 +748,27 @@ describe('WorkflowEditDialogComponent', () => {
})
it('should expose select metadata helpers', () => {
expect(component.isSelectMultiple(TriggerFilterType.CorrespondentAny)).toBe(
true
)
expect(component.isSelectMultiple(TriggerFilterType.CorrespondentNot)).toBe(
true
)
expect(component.isSelectMultiple(TriggerFilterType.CorrespondentIs)).toBe(
false
)
expect(component.isSelectMultiple(TriggerFilterType.DocumentTypeAny)).toBe(
true
)
expect(component.isSelectMultiple(TriggerFilterType.DocumentTypeIs)).toBe(
false
)
expect(component.isSelectMultiple(TriggerFilterType.StoragePathAny)).toBe(
true
)
expect(component.isSelectMultiple(TriggerFilterType.StoragePathIs)).toBe(
false
)
component.correspondents = [{ id: 1, name: 'C1' } as any]
component.documentTypes = [{ id: 2, name: 'DT' } as any]
@@ -741,9 +780,15 @@ describe('WorkflowEditDialogComponent', () => {
expect(
component.getFilterSelectItems(TriggerFilterType.DocumentTypeIs)
).toEqual(component.documentTypes)
expect(
component.getFilterSelectItems(TriggerFilterType.DocumentTypeAny)
).toEqual(component.documentTypes)
expect(
component.getFilterSelectItems(TriggerFilterType.StoragePathIs)
).toEqual(component.storagePaths)
expect(
component.getFilterSelectItems(TriggerFilterType.StoragePathAny)
).toEqual(component.storagePaths)
expect(component.getFilterSelectItems(TriggerFilterType.TagsAll)).toEqual(
[]
)

View File

@@ -145,10 +145,13 @@ export enum TriggerFilterType {
TagsAny = 'tags_any',
TagsAll = 'tags_all',
TagsNone = 'tags_none',
CorrespondentAny = 'correspondent_any',
CorrespondentIs = 'correspondent_is',
CorrespondentNot = 'correspondent_not',
DocumentTypeAny = 'document_type_any',
DocumentTypeIs = 'document_type_is',
DocumentTypeNot = 'document_type_not',
StoragePathAny = 'storage_path_any',
StoragePathIs = 'storage_path_is',
StoragePathNot = 'storage_path_not',
CustomFieldQuery = 'custom_field_query',
@@ -172,8 +175,11 @@ type TriggerFilterAggregate = {
filter_has_tags: number[]
filter_has_all_tags: number[]
filter_has_not_tags: number[]
filter_has_any_correspondents: number[]
filter_has_not_correspondents: number[]
filter_has_any_document_types: number[]
filter_has_not_document_types: number[]
filter_has_any_storage_paths: number[]
filter_has_not_storage_paths: number[]
filter_has_correspondent: number | null
filter_has_document_type: number | null
@@ -219,6 +225,14 @@ const TRIGGER_FILTER_DEFINITIONS: TriggerFilterDefinition[] = [
allowMultipleEntries: false,
allowMultipleValues: true,
},
{
id: TriggerFilterType.CorrespondentAny,
name: $localize`Has any of these correspondents`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: true,
selectItems: 'correspondents',
},
{
id: TriggerFilterType.CorrespondentIs,
name: $localize`Has correspondent`,
@@ -243,6 +257,14 @@ const TRIGGER_FILTER_DEFINITIONS: TriggerFilterDefinition[] = [
allowMultipleValues: false,
selectItems: 'documentTypes',
},
{
id: TriggerFilterType.DocumentTypeAny,
name: $localize`Has any of these document types`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: true,
selectItems: 'documentTypes',
},
{
id: TriggerFilterType.DocumentTypeNot,
name: $localize`Does not have document types`,
@@ -259,6 +281,14 @@ const TRIGGER_FILTER_DEFINITIONS: TriggerFilterDefinition[] = [
allowMultipleValues: false,
selectItems: 'storagePaths',
},
{
id: TriggerFilterType.StoragePathAny,
name: $localize`Has any of these storage paths`,
inputType: 'select',
allowMultipleEntries: false,
allowMultipleValues: true,
selectItems: 'storagePaths',
},
{
id: TriggerFilterType.StoragePathNot,
name: $localize`Does not have storage paths`,
@@ -306,6 +336,15 @@ const FILTER_HANDLERS: Record<TriggerFilterType, FilterHandler> = {
extract: (trigger) => trigger.filter_has_not_tags,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.CorrespondentAny]: {
apply: (aggregate, values) => {
aggregate.filter_has_any_correspondents = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_any_correspondents,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.CorrespondentIs]: {
apply: (aggregate, values) => {
aggregate.filter_has_correspondent = Array.isArray(values)
@@ -333,6 +372,15 @@ const FILTER_HANDLERS: Record<TriggerFilterType, FilterHandler> = {
extract: (trigger) => trigger.filter_has_document_type,
hasValue: (value) => value !== null && value !== undefined,
},
[TriggerFilterType.DocumentTypeAny]: {
apply: (aggregate, values) => {
aggregate.filter_has_any_document_types = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_any_document_types,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.DocumentTypeNot]: {
apply: (aggregate, values) => {
aggregate.filter_has_not_document_types = Array.isArray(values)
@@ -351,6 +399,15 @@ const FILTER_HANDLERS: Record<TriggerFilterType, FilterHandler> = {
extract: (trigger) => trigger.filter_has_storage_path,
hasValue: (value) => value !== null && value !== undefined,
},
[TriggerFilterType.StoragePathAny]: {
apply: (aggregate, values) => {
aggregate.filter_has_any_storage_paths = Array.isArray(values)
? [...values]
: [values]
},
extract: (trigger) => trigger.filter_has_any_storage_paths,
hasValue: (value) => Array.isArray(value) && value.length > 0,
},
[TriggerFilterType.StoragePathNot]: {
apply: (aggregate, values) => {
aggregate.filter_has_not_storage_paths = Array.isArray(values)
@@ -642,8 +699,11 @@ export class WorkflowEditDialogComponent
filter_has_tags: [],
filter_has_all_tags: [],
filter_has_not_tags: [],
filter_has_any_correspondents: [],
filter_has_not_correspondents: [],
filter_has_any_document_types: [],
filter_has_not_document_types: [],
filter_has_any_storage_paths: [],
filter_has_not_storage_paths: [],
filter_has_correspondent: null,
filter_has_document_type: null,
@@ -670,10 +730,16 @@ export class WorkflowEditDialogComponent
trigger.filter_has_tags = aggregate.filter_has_tags
trigger.filter_has_all_tags = aggregate.filter_has_all_tags
trigger.filter_has_not_tags = aggregate.filter_has_not_tags
trigger.filter_has_any_correspondents =
aggregate.filter_has_any_correspondents
trigger.filter_has_not_correspondents =
aggregate.filter_has_not_correspondents
trigger.filter_has_any_document_types =
aggregate.filter_has_any_document_types
trigger.filter_has_not_document_types =
aggregate.filter_has_not_document_types
trigger.filter_has_any_storage_paths =
aggregate.filter_has_any_storage_paths
trigger.filter_has_not_storage_paths =
aggregate.filter_has_not_storage_paths
trigger.filter_has_correspondent =
@@ -856,8 +922,11 @@ export class WorkflowEditDialogComponent
case TriggerFilterType.TagsAny:
case TriggerFilterType.TagsAll:
case TriggerFilterType.TagsNone:
case TriggerFilterType.CorrespondentAny:
case TriggerFilterType.CorrespondentNot:
case TriggerFilterType.DocumentTypeAny:
case TriggerFilterType.DocumentTypeNot:
case TriggerFilterType.StoragePathAny:
case TriggerFilterType.StoragePathNot:
return true
default:
@@ -1179,8 +1248,11 @@ export class WorkflowEditDialogComponent
filter_has_tags: [],
filter_has_all_tags: [],
filter_has_not_tags: [],
filter_has_any_correspondents: [],
filter_has_not_correspondents: [],
filter_has_any_document_types: [],
filter_has_not_document_types: [],
filter_has_any_storage_paths: [],
filter_has_not_storage_paths: [],
filter_custom_field_query: null,
filter_has_correspondent: null,

View File

@@ -44,10 +44,16 @@ export interface WorkflowTrigger extends ObjectWithId {
filter_has_not_tags?: number[] // Tag.id[]
filter_has_any_correspondents?: number[] // Correspondent.id[]
filter_has_not_correspondents?: number[] // Correspondent.id[]
filter_has_any_document_types?: number[] // DocumentType.id[]
filter_has_not_document_types?: number[] // DocumentType.id[]
filter_has_any_storage_paths?: number[] // StoragePath.id[]
filter_has_not_storage_paths?: number[] // StoragePath.id[]
filter_custom_field_query?: string

View File

@@ -73,6 +73,7 @@ $form-check-radio-checked-bg-image-dark: url("data:image/svg+xml,<svg xmlns='htt
}
@mixin dark-mode {
color-scheme: dark;
--pngx-body-color-accent: #{$text-color-dark-bg-accent};
--pngx-bg-alt: #242529;
--pngx-bg-alt2: #232323;

View File

@@ -1,5 +1,4 @@
# this is here so that django finds the checks.
from documents.checks import changed_password_check
from documents.checks import parser_check
__all__ = ["changed_password_check", "parser_check"]
__all__ = ["parser_check"]

View File

@@ -60,7 +60,6 @@ class DocumentAdmin(GuardedModelAdmin):
"added",
"modified",
"mime_type",
"storage_type",
"filename",
"checksum",
"archive_filename",

View File

@@ -1,60 +1,12 @@
import textwrap
from django.conf import settings
from django.core.checks import Error
from django.core.checks import Warning
from django.core.checks import register
from django.core.exceptions import FieldError
from django.db.utils import OperationalError
from django.db.utils import ProgrammingError
from documents.signals import document_consumer_declaration
from documents.templating.utils import convert_format_str_to_template_format
@register()
def changed_password_check(app_configs, **kwargs):
from documents.models import Document
from paperless.db import GnuPG
try:
encrypted_doc = (
Document.objects.filter(
storage_type=Document.STORAGE_TYPE_GPG,
)
.only("pk", "storage_type")
.first()
)
except (OperationalError, ProgrammingError, FieldError):
return [] # No documents table yet
if encrypted_doc:
if not settings.PASSPHRASE:
return [
Error(
"The database contains encrypted documents but no password is set.",
),
]
if not GnuPG.decrypted(encrypted_doc.source_file):
return [
Error(
textwrap.dedent(
"""
The current password doesn't match the password of the
existing documents.
If you intend to change your password, you must first export
all of the old documents, start fresh with the new password
and then re-import them."
""",
),
),
]
return []
@register()
def parser_check(app_configs, **kwargs):
parsers = []

View File

@@ -128,7 +128,7 @@ def thumbnail_last_modified(request, pk: int) -> datetime | None:
Cache should be (slightly?) faster than filesystem
"""
try:
doc = Document.objects.only("storage_type").get(pk=pk)
doc = Document.objects.only("pk").get(pk=pk)
if not doc.thumbnail_path.exists():
return None
doc_key = get_thumbnail_modified_key(pk)

View File

@@ -497,7 +497,6 @@ class ConsumerPlugin(
create_source_path_directory(document.source_path)
self._write(
document.storage_type,
self.unmodified_original
if self.unmodified_original is not None
else self.working_copy,
@@ -505,7 +504,6 @@ class ConsumerPlugin(
)
self._write(
document.storage_type,
thumbnail,
document.thumbnail_path,
)
@@ -517,7 +515,6 @@ class ConsumerPlugin(
)
create_source_path_directory(document.archive_path)
self._write(
document.storage_type,
archive_path,
document.archive_path,
)
@@ -637,8 +634,6 @@ class ConsumerPlugin(
)
self.log.debug(f"Creation date from st_mtime: {create_date}")
storage_type = Document.STORAGE_TYPE_UNENCRYPTED
if self.metadata.filename:
title = Path(self.metadata.filename).stem
else:
@@ -665,7 +660,6 @@ class ConsumerPlugin(
checksum=hashlib.md5(file_for_checksum.read_bytes()).hexdigest(),
created=create_date,
modified=create_date,
storage_type=storage_type,
page_count=page_count,
original_filename=self.filename,
)
@@ -736,7 +730,7 @@ class ConsumerPlugin(
}
CustomFieldInstance.objects.create(**args) # adds to document
def _write(self, storage_type, source, target):
def _write(self, source, target):
with (
Path(source).open("rb") as read_file,
Path(target).open("wb") as write_file,

View File

@@ -126,7 +126,6 @@ def generate_filename(
doc: Document,
*,
counter=0,
append_gpg=True,
archive_filename=False,
) -> Path:
base_path: Path | None = None
@@ -170,8 +169,4 @@ def generate_filename(
final_filename = f"{doc.pk:07}{counter_str}{filetype_str}"
full_path = Path(final_filename)
# Add GPG extension if needed
if append_gpg and doc.storage_type == doc.STORAGE_TYPE_GPG:
full_path = full_path.with_suffix(full_path.suffix + ".gpg")
return full_path

View File

@@ -1,93 +0,0 @@
from pathlib import Path
from django.conf import settings
from django.core.management.base import BaseCommand
from django.core.management.base import CommandError
from documents.models import Document
from paperless.db import GnuPG
class Command(BaseCommand):
help = (
"This is how you migrate your stored documents from an encrypted "
"state to an unencrypted one (or vice-versa)"
)
def add_arguments(self, parser) -> None:
parser.add_argument(
"--passphrase",
help=(
"If PAPERLESS_PASSPHRASE isn't set already, you need to specify it here"
),
)
def handle(self, *args, **options) -> None:
try:
self.stdout.write(
self.style.WARNING(
"\n\n"
"WARNING: This script is going to work directly on your "
"document originals, so\n"
"WARNING: you probably shouldn't run "
"this unless you've got a recent backup\n"
"WARNING: handy. It "
"*should* work without a hitch, but be safe and backup your\n"
"WARNING: stuff first.\n\n"
"Hit Ctrl+C to exit now, or Enter to "
"continue.\n\n",
),
)
_ = input()
except KeyboardInterrupt:
return
passphrase = options["passphrase"] or settings.PASSPHRASE
if not passphrase:
raise CommandError(
"Passphrase not defined. Please set it with --passphrase or "
"by declaring it in your environment or your config.",
)
self.__gpg_to_unencrypted(passphrase)
def __gpg_to_unencrypted(self, passphrase: str) -> None:
encrypted_files = Document.objects.filter(
storage_type=Document.STORAGE_TYPE_GPG,
)
for document in encrypted_files:
self.stdout.write(f"Decrypting {document}")
old_paths = [document.source_path, document.thumbnail_path]
with document.source_file as file_handle:
raw_document = GnuPG.decrypted(file_handle, passphrase)
with document.thumbnail_file as file_handle:
raw_thumb = GnuPG.decrypted(file_handle, passphrase)
document.storage_type = Document.STORAGE_TYPE_UNENCRYPTED
ext: str = Path(document.filename).suffix
if not ext == ".gpg":
raise CommandError(
f"Abort: encrypted file {document.source_path} does not "
f"end with .gpg",
)
document.filename = Path(document.filename).stem
with document.source_path.open("wb") as f:
f.write(raw_document)
with document.thumbnail_path.open("wb") as f:
f.write(raw_thumb)
Document.objects.filter(id=document.id).update(
storage_type=document.storage_type,
filename=document.filename,
)
for path in old_paths:
path.unlink()

View File

@@ -1,135 +1,343 @@
"""
Document consumer management command.
Watches a consumption directory for new documents and queues them for processing.
Uses watchfiles for efficient file system monitoring with support for both
native OS notifications and polling fallback.
"""
from __future__ import annotations
import logging
import os
from concurrent.futures import ThreadPoolExecutor
from fnmatch import filter
from dataclasses import dataclass
from pathlib import Path
from pathlib import PurePath
from threading import Event
from time import monotonic
from time import sleep
from typing import TYPE_CHECKING
from typing import Final
from django import db
from django.conf import settings
from django.core.management.base import BaseCommand
from django.core.management.base import CommandError
from watchdog.events import FileSystemEventHandler
from watchdog.observers.polling import PollingObserver
from watchfiles import Change
from watchfiles import DefaultFilter
from watchfiles import watch
from documents.data_models import ConsumableDocument
from documents.data_models import DocumentMetadataOverrides
from documents.data_models import DocumentSource
from documents.models import Tag
from documents.parsers import is_file_ext_supported
from documents.parsers import get_supported_file_extensions
from documents.tasks import consume_file
try:
from inotifyrecursive import INotify
from inotifyrecursive import flags
except ImportError: # pragma: no cover
INotify = flags = None
if TYPE_CHECKING:
from collections.abc import Iterator
logger = logging.getLogger("paperless.management.consumer")
def _tags_from_path(filepath: Path) -> list[int]:
@dataclass
class TrackedFile:
"""Represents a file being tracked for stability."""
path: Path
last_event_time: float
last_mtime: float | None = None
last_size: int | None = None
def update_stats(self) -> bool:
"""
Walk up the directory tree from filepath to CONSUMPTION_DIR
Update file stats. Returns True if file exists and stats were updated.
"""
try:
stat = self.path.stat()
self.last_mtime = stat.st_mtime
self.last_size = stat.st_size
return True
except OSError:
return False
def is_unchanged(self) -> bool:
"""
Check if file stats match the previously recorded values.
Returns False if file doesn't exist or stats changed.
"""
try:
stat = self.path.stat()
return stat.st_mtime == self.last_mtime and stat.st_size == self.last_size
except OSError:
return False
class FileStabilityTracker:
"""
Tracks file events and determines when files are stable for consumption.
A file is considered stable when:
1. No new events have been received for it within the stability delay
2. Its size and modification time haven't changed
3. It still exists as a regular file
This handles various edge cases:
- Network copies that write in chunks
- Scanners that open/close files multiple times
- Temporary files that get renamed
- Files that are deleted before becoming stable
"""
def __init__(self, stability_delay: float = 1.0) -> None:
"""
Initialize the tracker.
Args:
stability_delay: Time in seconds a file must remain unchanged
before being considered stable.
"""
self.stability_delay = stability_delay
self._tracked: dict[Path, TrackedFile] = {}
def track(self, path: Path, change: Change) -> None:
"""
Register a file event.
Args:
path: The file path that changed.
change: The type of change (added, modified, deleted).
"""
path = path.resolve()
match change:
case Change.deleted:
self._tracked.pop(path, None)
logger.debug(f"Stopped tracking deleted file: {path}")
case Change.added | Change.modified:
current_time = monotonic()
if path in self._tracked:
tracked = self._tracked[path]
tracked.last_event_time = current_time
tracked.update_stats()
logger.debug(f"Updated tracking for: {path}")
else:
tracked = TrackedFile(path=path, last_event_time=current_time)
if tracked.update_stats():
self._tracked[path] = tracked
logger.debug(f"Started tracking: {path}")
else:
logger.debug(f"Could not stat file, not tracking: {path}")
def get_stable_files(self) -> Iterator[Path]:
"""
Yield files that have been stable for the configured delay.
Files are removed from tracking once yielded or determined to be invalid.
"""
current_time = monotonic()
to_remove: list[Path] = []
to_yield: list[Path] = []
for path, tracked in self._tracked.items():
time_since_event = current_time - tracked.last_event_time
if time_since_event < self.stability_delay:
continue
# File has waited long enough, verify it's unchanged
if not tracked.is_unchanged():
# Stats changed or file gone - update and wait again
if tracked.update_stats():
tracked.last_event_time = current_time
logger.debug(f"File changed during stability check: {path}")
else:
# File no longer exists, remove from tracking
to_remove.append(path)
logger.debug(f"File disappeared during stability check: {path}")
continue
# File is stable, we can return it
to_yield.append(path)
logger.info(f"File is stable: {path}")
# Remove files that are no longer valid
for path in to_remove:
self._tracked.pop(path, None)
# Remove and yield stable files
for path in to_yield:
self._tracked.pop(path, None)
yield path
def has_pending_files(self) -> bool:
"""Check if there are files waiting for stability check."""
return len(self._tracked) > 0
@property
def pending_count(self) -> int:
"""Number of files being tracked."""
return len(self._tracked)
class ConsumerFilter(DefaultFilter):
"""
Filter for watchfiles that accepts only supported document types
and ignores system files/directories.
Extends DefaultFilter leveraging its built-in filtering:
- `ignore_dirs`: Directory names to ignore (and all their contents)
- `ignore_entity_patterns`: Regex patterns matched against filename/dirname only
We add custom logic for file extension filtering (only accept supported
document types), which the library doesn't provide.
"""
# Regex patterns for files to always ignore (matched against filename only)
# These are passed to DefaultFilter.ignore_entity_patterns
DEFAULT_IGNORE_PATTERNS: Final[tuple[str, ...]] = (
r"^\.DS_Store$",
r"^\.DS_STORE$",
r"^\._.*",
r"^desktop\.ini$",
r"^Thumbs\.db$",
)
# Directories to always ignore (passed to DefaultFilter.ignore_dirs)
# These are matched by directory name, not full path
DEFAULT_IGNORE_DIRS: Final[tuple[str, ...]] = (
".stfolder", # Syncthing
".stversions", # Syncthing
".localized", # macOS
"@eaDir", # Synology NAS
".Spotlight-V100", # macOS
".Trashes", # macOS
"__MACOSX", # macOS archive artifacts
)
def __init__(
self,
*,
supported_extensions: frozenset[str] | None = None,
ignore_patterns: list[str] | None = None,
ignore_dirs: list[str] | None = None,
) -> None:
"""
Initialize the consumer filter.
Args:
supported_extensions: Set of file extensions to accept (e.g., {".pdf", ".png"}).
If None, uses get_supported_file_extensions().
ignore_patterns: Additional regex patterns to ignore (matched against filename).
ignore_dirs: Additional directory names to ignore (merged with defaults).
"""
# Get supported extensions
if supported_extensions is None:
supported_extensions = frozenset(get_supported_file_extensions())
self._supported_extensions = supported_extensions
# Combine default and user patterns
all_patterns: list[str] = list(self.DEFAULT_IGNORE_PATTERNS)
if ignore_patterns:
all_patterns.extend(ignore_patterns)
# Combine default and user ignore_dirs
all_ignore_dirs: list[str] = list(self.DEFAULT_IGNORE_DIRS)
if ignore_dirs:
all_ignore_dirs.extend(ignore_dirs)
# Let DefaultFilter handle all the pattern and directory filtering
super().__init__(
ignore_dirs=tuple(all_ignore_dirs),
ignore_entity_patterns=tuple(all_patterns),
ignore_paths=(),
)
def __call__(self, change: Change, path: str) -> bool:
"""
Filter function for watchfiles.
Returns True if the path should be watched, False to ignore.
The parent DefaultFilter handles:
- Hidden files/directories (starting with .)
- Directories in ignore_dirs
- Files/directories matching ignore_entity_patterns
We additionally filter files by extension.
"""
# Let parent filter handle directory ignoring and pattern matching
if not super().__call__(change, path):
return False
path_obj = Path(path)
# For directories, parent filter already handled everything
if path_obj.is_dir():
return True
# For files, check extension
return self._has_supported_extension(path_obj)
def _has_supported_extension(self, path: Path) -> bool:
"""Check if the file has a supported extension."""
suffix = path.suffix.lower()
return suffix in self._supported_extensions
def _tags_from_path(filepath: Path, consumption_dir: Path) -> list[int]:
"""
Walk up the directory tree from filepath to consumption_dir
and get or create Tag IDs for every directory.
Returns set of Tag models
Returns list of Tag primary keys.
"""
db.close_old_connections()
tag_ids = set()
path_parts = filepath.relative_to(settings.CONSUMPTION_DIR).parent.parts
tag_ids: set[int] = set()
path_parts = filepath.relative_to(consumption_dir).parent.parts
for part in path_parts:
tag_ids.add(
Tag.objects.get_or_create(name__iexact=part, defaults={"name": part})[0].pk,
tag, _ = Tag.objects.get_or_create(
name__iexact=part,
defaults={"name": part},
)
tag_ids.add(tag.pk)
return list(tag_ids)
def _is_ignored(filepath: Path) -> bool:
def _consume_file(
filepath: Path,
consumption_dir: Path,
*,
subdirs_as_tags: bool,
) -> None:
"""
Checks if the given file should be ignored, based on configured
patterns.
Queue a file for consumption.
Returns True if the file is ignored, False otherwise
Args:
filepath: Path to the file to consume.
consumption_dir: Base consumption directory.
subdirs_as_tags: Whether to create tags from subdirectory names.
"""
# Trim out the consume directory, leaving only filename and it's
# path relative to the consume directory
filepath_relative = PurePath(filepath).relative_to(settings.CONSUMPTION_DIR)
# March through the components of the path, including directories and the filename
# looking for anything matching
# foo/bar/baz/file.pdf -> (foo, bar, baz, file.pdf)
parts = []
for part in filepath_relative.parts:
# If the part is not the name (ie, it's a dir)
# Need to append the trailing slash or fnmatch doesn't match
# fnmatch("dir", "dir/*") == False
# fnmatch("dir/", "dir/*") == True
if part != filepath_relative.name:
part = part + "/"
parts.append(part)
for pattern in settings.CONSUMER_IGNORE_PATTERNS:
if len(filter(parts, pattern)):
return True
return False
def _consume(filepath: Path) -> None:
# Check permissions early
# Verify file still exists and is accessible
try:
filepath.stat()
except (PermissionError, OSError):
logger.warning(f"Not consuming file {filepath}: Permission denied.")
return
if filepath.is_dir() or _is_ignored(filepath):
return
if not filepath.is_file():
logger.debug(f"Not consuming file {filepath}: File has moved.")
logger.debug(f"Not consuming {filepath}: not a file or doesn't exist")
return
if not is_file_ext_supported(filepath.suffix):
logger.warning(f"Not consuming file {filepath}: Unknown file extension.")
return
# Total wait time: up to 500ms
os_error_retry_count: Final[int] = 50
os_error_retry_wait: Final[float] = 0.01
read_try_count = 0
file_open_ok = False
os_error_str = None
while (read_try_count < os_error_retry_count) and not file_open_ok:
try:
with filepath.open("rb"):
file_open_ok = True
except OSError as e:
read_try_count += 1
os_error_str = str(e)
sleep(os_error_retry_wait)
if read_try_count >= os_error_retry_count:
logger.warning(f"Not consuming file {filepath}: OS reports {os_error_str}")
logger.warning(f"Not consuming {filepath}: {e}")
return
tag_ids = None
# Get tags from path if configured
tag_ids: list[int] | None = None
if subdirs_as_tags:
try:
if settings.CONSUMER_SUBDIRS_AS_TAGS:
tag_ids = _tags_from_path(filepath)
tag_ids = _tags_from_path(filepath, consumption_dir)
except Exception:
logger.exception("Error creating tags from path")
logger.exception(f"Error creating tags from path for {filepath}")
# Queue for consumption
try:
logger.info(f"Adding {filepath} to the task queue.")
logger.info(f"Adding {filepath} to the task queue")
consume_file.delay(
ConsumableDocument(
source=DocumentSource.ConsumeFolder,
@@ -138,228 +346,209 @@ def _consume(filepath: Path) -> None:
DocumentMetadataOverrides(tag_ids=tag_ids),
)
except Exception:
# Catch all so that the consumer won't crash.
# This is also what the test case is listening for to check for
# errors.
logger.exception("Error while consuming document")
def _consume_wait_unmodified(file: Path) -> None:
"""
Waits for the given file to appear unmodified based on file size
and modification time. Will wait a configured number of seconds
and retry a configured number of times before either consuming or
giving up
"""
if _is_ignored(file):
return
logger.debug(f"Waiting for file {file} to remain unmodified")
mtime = -1
size = -1
current_try = 0
while current_try < settings.CONSUMER_POLLING_RETRY_COUNT:
try:
stat_data = file.stat()
new_mtime = stat_data.st_mtime
new_size = stat_data.st_size
except FileNotFoundError:
logger.debug(
f"File {file} moved while waiting for it to remain unmodified.",
)
return
if new_mtime == mtime and new_size == size:
_consume(file)
return
mtime = new_mtime
size = new_size
sleep(settings.CONSUMER_POLLING_DELAY)
current_try += 1
logger.error(f"Timeout while waiting on file {file} to remain unmodified.")
class Handler(FileSystemEventHandler):
def __init__(self, pool: ThreadPoolExecutor) -> None:
super().__init__()
self._pool = pool
def on_created(self, event):
self._pool.submit(_consume_wait_unmodified, Path(event.src_path))
def on_moved(self, event):
self._pool.submit(_consume_wait_unmodified, Path(event.dest_path))
logger.exception(f"Error while queuing document {filepath}")
class Command(BaseCommand):
"""
On every iteration of an infinite loop, consume what we can from the
consumption directory.
Watch a consumption directory and queue new documents for processing.
Uses watchfiles for efficient file system monitoring. Supports both
native OS notifications (inotify on Linux, FSEvents on macOS) and
polling for network filesystems.
"""
# This is here primarily for the tests and is irrelevant in production.
stop_flag = Event()
# Also only for testing, configures in one place the timeout used before checking
# the stop flag
testing_timeout_s: Final[float] = 0.5
testing_timeout_ms: Final[float] = testing_timeout_s * 1000.0
help = "Watch the consumption directory for new documents"
def add_arguments(self, parser):
# For testing - allows tests to stop the consumer
stop_flag: Event = Event()
# Testing timeout in seconds
testing_timeout_s: Final[float] = 0.5
def add_arguments(self, parser) -> None:
parser.add_argument(
"directory",
default=settings.CONSUMPTION_DIR,
default=None,
nargs="?",
help="The consumption directory.",
help="The consumption directory (defaults to CONSUMPTION_DIR setting)",
)
parser.add_argument(
"--oneshot",
action="store_true",
help="Process existing files and exit without watching",
)
parser.add_argument("--oneshot", action="store_true", help="Run only once.")
# Only use during unit testing, will configure a timeout
# Leaving it unset or false and the consumer will exit when it
# receives SIGINT
parser.add_argument(
"--testing",
action="store_true",
help="Flag used only for unit testing",
help="Enable testing mode with shorter timeouts",
default=False,
)
def handle(self, *args, **options):
directory = options["directory"]
recursive = settings.CONSUMER_RECURSIVE
def handle(self, *args, **options) -> None:
# Resolve consumption directory
directory = options.get("directory")
if not directory:
raise CommandError("CONSUMPTION_DIR does not appear to be set.")
directory = getattr(settings, "CONSUMPTION_DIR", None)
if not directory:
raise CommandError("CONSUMPTION_DIR is not configured")
directory = Path(directory).resolve()
if not directory.is_dir():
raise CommandError(f"Consumption directory {directory} does not exist")
if not directory.exists():
raise CommandError(f"Consumption directory does not exist: {directory}")
# Consumer will need this
if not directory.is_dir():
raise CommandError(f"Consumption path is not a directory: {directory}")
# Ensure scratch directory exists
settings.SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
if recursive:
for dirpath, _, filenames in os.walk(directory):
for filename in filenames:
filepath = Path(dirpath) / filename
_consume(filepath)
else:
for filepath in directory.iterdir():
_consume(filepath)
# Get settings
recursive: bool = settings.CONSUMER_RECURSIVE
subdirs_as_tags: bool = settings.CONSUMER_SUBDIRS_AS_TAGS
polling_interval: float = settings.CONSUMER_POLLING_INTERVAL
stability_delay: float = settings.CONSUMER_STABILITY_DELAY
ignore_patterns: list[str] = settings.CONSUMER_IGNORE_PATTERNS
ignore_dirs: list[str] = settings.CONSUMER_IGNORE_DIRS
is_testing: bool = options.get("testing", False)
is_oneshot: bool = options.get("oneshot", False)
if options["oneshot"]:
# Create filter
consumer_filter = ConsumerFilter(
ignore_patterns=ignore_patterns,
ignore_dirs=ignore_dirs,
)
# Process existing files
self._process_existing_files(
directory=directory,
recursive=recursive,
subdirs_as_tags=subdirs_as_tags,
consumer_filter=consumer_filter,
)
if is_oneshot:
logger.info("Oneshot mode: processed existing files, exiting")
return
if settings.CONSUMER_POLLING == 0 and INotify:
self.handle_inotify(directory, recursive, is_testing=options["testing"])
else:
if INotify is None and settings.CONSUMER_POLLING == 0: # pragma: no cover
logger.warning("Using polling as INotify import failed")
self.handle_polling(directory, recursive, is_testing=options["testing"])
# Start watching
self._watch_directory(
directory=directory,
recursive=recursive,
subdirs_as_tags=subdirs_as_tags,
consumer_filter=consumer_filter,
polling_interval=polling_interval,
stability_delay=stability_delay,
is_testing=is_testing,
)
logger.debug("Consumer exiting.")
logger.debug("Consumer exiting")
def handle_polling(self, directory, recursive, *, is_testing: bool):
logger.info(f"Polling directory for changes: {directory}")
def _process_existing_files(
self,
*,
directory: Path,
recursive: bool,
subdirs_as_tags: bool,
consumer_filter: ConsumerFilter,
) -> None:
"""Process any existing files in the consumption directory."""
logger.info(f"Processing existing files in {directory}")
timeout = None
if is_testing:
timeout = self.testing_timeout_s
logger.debug(f"Configuring timeout to {timeout}s")
glob_pattern = "**/*" if recursive else "*"
polling_interval = settings.CONSUMER_POLLING
if polling_interval == 0: # pragma: no cover
# Only happens if INotify failed to import
logger.warning("Using polling of 10s, consider setting this")
polling_interval = 10
with ThreadPoolExecutor(max_workers=4) as pool:
observer = PollingObserver(timeout=polling_interval)
observer.schedule(Handler(pool), directory, recursive=recursive)
observer.start()
try:
while observer.is_alive():
observer.join(timeout)
if self.stop_flag.is_set():
observer.stop()
except KeyboardInterrupt:
observer.stop()
observer.join()
def handle_inotify(self, directory, recursive, *, is_testing: bool):
logger.info(f"Using inotify to watch directory for changes: {directory}")
timeout_ms = None
if is_testing:
timeout_ms = self.testing_timeout_ms
logger.debug(f"Configuring timeout to {timeout_ms}ms")
inotify = INotify()
inotify_flags = flags.CLOSE_WRITE | flags.MOVED_TO | flags.MODIFY
if recursive:
inotify.add_watch_recursive(directory, inotify_flags)
else:
inotify.add_watch(directory, inotify_flags)
inotify_debounce_secs: Final[float] = settings.CONSUMER_INOTIFY_DELAY
inotify_debounce_ms: Final[int] = inotify_debounce_secs * 1000
finished = False
notified_files = {}
try:
while not finished:
try:
for event in inotify.read(timeout=timeout_ms):
path = inotify.get_path(event.wd) if recursive else directory
filepath = Path(path) / event.name
if flags.MODIFY in flags.from_mask(event.mask):
notified_files.pop(filepath, None)
else:
notified_files[filepath] = monotonic()
# Check the files against the timeout
still_waiting = {}
# last_event_time is time of the last inotify event for this file
for filepath, last_event_time in notified_files.items():
# Current time - last time over the configured timeout
waited_long_enough = (
monotonic() - last_event_time
) > inotify_debounce_secs
# Also make sure the file exists still, some scanners might write a
# temporary file first
try:
file_still_exists = filepath.exists() and filepath.is_file()
except (PermissionError, OSError): # pragma: no cover
# If we can't check, let it fail in the _consume function
file_still_exists = True
for filepath in directory.glob(glob_pattern):
# Use filter to check if file should be processed
if not filepath.is_file():
continue
if waited_long_enough and file_still_exists:
_consume(filepath)
elif file_still_exists:
still_waiting[filepath] = last_event_time
if not consumer_filter(Change.added, str(filepath)):
continue
# These files are still waiting to hit the timeout
notified_files = still_waiting
_consume_file(
filepath=filepath,
consumption_dir=directory,
subdirs_as_tags=subdirs_as_tags,
)
# If files are waiting, need to exit read() to check them
# Otherwise, go back to infinite sleep time, but only if not testing
if len(notified_files) > 0:
timeout_ms = inotify_debounce_ms
elif is_testing:
timeout_ms = self.testing_timeout_ms
def _watch_directory(
self,
*,
directory: Path,
recursive: bool,
subdirs_as_tags: bool,
consumer_filter: ConsumerFilter,
polling_interval: float,
stability_delay: float,
is_testing: bool,
) -> None:
"""Watch directory for changes and process stable files."""
use_polling = polling_interval > 0
poll_delay_ms = int(polling_interval * 1000) if use_polling else 0
if use_polling:
logger.info(
f"Watching {directory} using polling (interval: {polling_interval}s)",
)
else:
timeout_ms = None
logger.info(f"Watching {directory} using native file system events")
if self.stop_flag.is_set():
logger.debug("Finishing because event is set")
finished = True
# Create stability tracker
tracker = FileStabilityTracker(stability_delay=stability_delay)
except KeyboardInterrupt:
logger.info("Received SIGINT, stopping inotify")
finished = True
finally:
inotify.close()
# Calculate timeouts
stability_timeout_ms = int(stability_delay * 1000)
testing_timeout_ms = int(self.testing_timeout_s * 1000)
# Start with no timeout (wait indefinitely for first event)
# unless in testing mode
timeout_ms = testing_timeout_ms if is_testing else 0
self.stop_flag.clear()
while not self.stop_flag.is_set():
try:
for changes in watch(
directory,
watch_filter=consumer_filter,
rust_timeout=timeout_ms,
yield_on_timeout=True,
force_polling=use_polling,
poll_delay_ms=poll_delay_ms,
recursive=recursive,
stop_event=self.stop_flag,
):
# Process each change
for change_type, path in changes:
path = Path(path).resolve()
if not path.is_file():
continue
logger.debug(f"Event: {change_type.name} for {path}")
tracker.track(path, change_type)
# Check for stable files
for stable_path in tracker.get_stable_files():
_consume_file(
filepath=stable_path,
consumption_dir=directory,
subdirs_as_tags=subdirs_as_tags,
)
# Exit watch loop to reconfigure timeout
break
# Determine next timeout
if tracker.has_pending_files():
# Check pending files at stability interval
timeout_ms = stability_timeout_ms
elif is_testing:
# In testing, use short timeout to check stop flag
timeout_ms = testing_timeout_ms
else: # pragma: nocover
# No pending files, wait indefinitely
timeout_ms = 0
except KeyboardInterrupt: # pragma: nocover
logger.info("Received interrupt, stopping consumer")
self.stop_flag.set()

View File

@@ -3,7 +3,6 @@ import json
import os
import shutil
import tempfile
import time
from pathlib import Path
from typing import TYPE_CHECKING
@@ -56,7 +55,6 @@ from documents.settings import EXPORTER_FILE_NAME
from documents.settings import EXPORTER_THUMBNAIL_NAME
from documents.utils import copy_file_with_basic_stats
from paperless import version
from paperless.db import GnuPG
from paperless.models import ApplicationConfiguration
from paperless_mail.models import MailAccount
from paperless_mail.models import MailRule
@@ -316,20 +314,17 @@ class Command(CryptMixin, BaseCommand):
total=len(document_manifest),
disable=self.no_progress_bar,
):
# 3.1. store files unencrypted
document_dict["fields"]["storage_type"] = Document.STORAGE_TYPE_UNENCRYPTED
document = document_map[document_dict["pk"]]
# 3.2. generate a unique filename
# 3.1. generate a unique filename
base_name = self.generate_base_name(document)
# 3.3. write filenames into manifest
# 3.2. write filenames into manifest
original_target, thumbnail_target, archive_target = (
self.generate_document_targets(document, base_name, document_dict)
)
# 3.4. write files to target folder
# 3.3. write files to target folder
if not self.data_only:
self.copy_document_files(
document,
@@ -423,7 +418,6 @@ class Command(CryptMixin, BaseCommand):
base_name = generate_filename(
document,
counter=filename_counter,
append_gpg=False,
)
else:
base_name = document.get_public_filename(counter=filename_counter)
@@ -482,28 +476,6 @@ class Command(CryptMixin, BaseCommand):
If the document is encrypted, the files are decrypted before copying them to the target location.
"""
if document.storage_type == Document.STORAGE_TYPE_GPG:
t = int(time.mktime(document.created.timetuple()))
original_target.parent.mkdir(parents=True, exist_ok=True)
with document.source_file as out_file:
original_target.write_bytes(GnuPG.decrypted(out_file))
os.utime(original_target, times=(t, t))
if thumbnail_target:
thumbnail_target.parent.mkdir(parents=True, exist_ok=True)
with document.thumbnail_file as out_file:
thumbnail_target.write_bytes(GnuPG.decrypted(out_file))
os.utime(thumbnail_target, times=(t, t))
if archive_target:
archive_target.parent.mkdir(parents=True, exist_ok=True)
if TYPE_CHECKING:
assert isinstance(document.archive_path, Path)
with document.archive_path as out_file:
archive_target.write_bytes(GnuPG.decrypted(out_file))
os.utime(archive_target, times=(t, t))
else:
self.check_and_copy(
document.source_path,
document.checksum,

View File

@@ -383,8 +383,6 @@ class Command(CryptMixin, BaseCommand):
else:
archive_path = None
document.storage_type = Document.STORAGE_TYPE_UNENCRYPTED
with FileLock(settings.MEDIA_LOCK):
if Path(document.source_path).is_file():
raise FileExistsError(document.source_path)

View File

@@ -403,6 +403,18 @@ def existing_document_matches_workflow(
f"Document tags {list(document.tags.all())} include excluded tags {list(trigger_has_not_tags_qs)}",
)
allowed_correspondent_ids = set(
trigger.filter_has_any_correspondents.values_list("id", flat=True),
)
if (
allowed_correspondent_ids
and document.correspondent_id not in allowed_correspondent_ids
):
return (
False,
f"Document correspondent {document.correspondent} is not one of {list(trigger.filter_has_any_correspondents.all())}",
)
# Document correspondent vs trigger has_correspondent
if (
trigger.filter_has_correspondent_id is not None
@@ -424,6 +436,17 @@ def existing_document_matches_workflow(
f"Document correspondent {document.correspondent} is excluded by {list(trigger.filter_has_not_correspondents.all())}",
)
allowed_document_type_ids = set(
trigger.filter_has_any_document_types.values_list("id", flat=True),
)
if allowed_document_type_ids and (
document.document_type_id not in allowed_document_type_ids
):
return (
False,
f"Document doc type {document.document_type} is not one of {list(trigger.filter_has_any_document_types.all())}",
)
# Document document_type vs trigger has_document_type
if (
trigger.filter_has_document_type_id is not None
@@ -445,6 +468,17 @@ def existing_document_matches_workflow(
f"Document doc type {document.document_type} is excluded by {list(trigger.filter_has_not_document_types.all())}",
)
allowed_storage_path_ids = set(
trigger.filter_has_any_storage_paths.values_list("id", flat=True),
)
if allowed_storage_path_ids and (
document.storage_path_id not in allowed_storage_path_ids
):
return (
False,
f"Document storage path {document.storage_path} is not one of {list(trigger.filter_has_any_storage_paths.all())}",
)
# Document storage_path vs trigger has_storage_path
if (
trigger.filter_has_storage_path_id is not None
@@ -532,6 +566,10 @@ def prefilter_documents_by_workflowtrigger(
# Correspondent, DocumentType, etc. filtering
if trigger.filter_has_any_correspondents.exists():
documents = documents.filter(
correspondent__in=trigger.filter_has_any_correspondents.all(),
)
if trigger.filter_has_correspondent is not None:
documents = documents.filter(
correspondent=trigger.filter_has_correspondent,
@@ -541,6 +579,10 @@ def prefilter_documents_by_workflowtrigger(
correspondent__in=trigger.filter_has_not_correspondents.all(),
)
if trigger.filter_has_any_document_types.exists():
documents = documents.filter(
document_type__in=trigger.filter_has_any_document_types.all(),
)
if trigger.filter_has_document_type is not None:
documents = documents.filter(
document_type=trigger.filter_has_document_type,
@@ -550,6 +592,10 @@ def prefilter_documents_by_workflowtrigger(
document_type__in=trigger.filter_has_not_document_types.all(),
)
if trigger.filter_has_any_storage_paths.exists():
documents = documents.filter(
storage_path__in=trigger.filter_has_any_storage_paths.all(),
)
if trigger.filter_has_storage_path is not None:
documents = documents.filter(
storage_path=trigger.filter_has_storage_path,
@@ -604,8 +650,11 @@ def document_matches_workflow(
"filter_has_tags",
"filter_has_all_tags",
"filter_has_not_tags",
"filter_has_any_document_types",
"filter_has_not_document_types",
"filter_has_any_correspondents",
"filter_has_not_correspondents",
"filter_has_any_storage_paths",
"filter_has_not_storage_paths",
)
)

File diff suppressed because it is too large Load Diff

View File

@@ -1,26 +0,0 @@
# Generated by Django 1.9 on 2015-12-26 13:16
import django.utils.timezone
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0001_initial"),
]
operations = [
migrations.AlterModelOptions(
name="document",
options={"ordering": ("sender", "title")},
),
migrations.AlterField(
model_name="document",
name="created",
field=models.DateTimeField(
default=django.utils.timezone.now,
editable=False,
),
),
]

View File

@@ -1,50 +1,49 @@
# Generated by Django 4.1.5 on 2023-03-04 22:33
# Generated by Django 5.2.9 on 2026-01-20 18:46
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
initial = True
dependencies = [
("documents", "1032_alter_correspondent_matching_algorithm_and_more"),
("documents", "0001_initial"),
("paperless_mail", "0001_initial"),
]
operations = [
migrations.AlterModelOptions(
name="documenttype",
options={
"ordering": ("name",),
"verbose_name": "document type",
"verbose_name_plural": "document types",
},
migrations.AddField(
model_name="workflowtrigger",
name="filter_mailrule",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to="paperless_mail.mailrule",
verbose_name="filter documents from this mail rule",
),
migrations.AlterModelOptions(
name="tag",
options={
"ordering": ("name",),
"verbose_name": "tag",
"verbose_name_plural": "tags",
},
),
migrations.AlterField(
model_name="correspondent",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
migrations.AddField(
model_name="workflowtrigger",
name="schedule_date_custom_field",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to="documents.customfield",
verbose_name="schedule date custom field",
),
migrations.AlterField(
model_name="documenttype",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AlterField(
model_name="storagepath",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
migrations.AddField(
model_name="workflow",
name="triggers",
field=models.ManyToManyField(
related_name="workflows",
to="documents.workflowtrigger",
verbose_name="triggers",
),
migrations.AlterField(
model_name="tag",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AddConstraint(
model_name="correspondent",
@@ -61,6 +60,13 @@ class Migration(migrations.Migration):
name="documents_correspondent_name_uniq",
),
),
migrations.AddConstraint(
model_name="customfieldinstance",
constraint=models.UniqueConstraint(
fields=("document", "field"),
name="documents_customfieldinstance_unique_document_field",
),
),
migrations.AddConstraint(
model_name="documenttype",
constraint=models.UniqueConstraint(

View File

@@ -1,70 +0,0 @@
# Generated by Django 1.9 on 2016-01-11 12:21
import django.db.models.deletion
from django.db import migrations
from django.db import models
from django.template.defaultfilters import slugify
DOCUMENT_SENDER_MAP = {}
def move_sender_strings_to_sender_model(apps, schema_editor):
sender_model = apps.get_model("documents", "Sender")
document_model = apps.get_model("documents", "Document")
# Create the sender and log the relationship with the document
for document in document_model.objects.all():
if document.sender:
(
DOCUMENT_SENDER_MAP[document.pk],
_,
) = sender_model.objects.get_or_create(
name=document.sender,
defaults={"slug": slugify(document.sender)},
)
def realign_senders(apps, schema_editor):
document_model = apps.get_model("documents", "Document")
for pk, sender in DOCUMENT_SENDER_MAP.items():
document_model.objects.filter(pk=pk).update(sender=sender)
class Migration(migrations.Migration):
dependencies = [
("documents", "0002_auto_20151226_1316"),
]
operations = [
migrations.CreateModel(
name="Sender",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("name", models.CharField(max_length=128, unique=True)),
("slug", models.SlugField()),
],
),
migrations.RunPython(move_sender_strings_to_sender_model),
migrations.RemoveField(
model_name="document",
name="sender",
),
migrations.AddField(
model_name="document",
name="sender",
field=models.ForeignKey(
blank=True,
on_delete=django.db.models.deletion.CASCADE,
to="documents.Sender",
),
),
migrations.RunPython(realign_senders),
]

View File

@@ -1,4 +1,4 @@
# Generated by Django 3.1.3 on 2020-11-21 21:51
# Generated by Django 5.2.9 on 2026-01-20 20:06
from django.db import migrations
from django.db import models
@@ -6,13 +6,13 @@ from django.db import models
class Migration(migrations.Migration):
dependencies = [
("paperless_mail", "0003_auto_20201118_1940"),
("documents", "0002_initial"),
]
operations = [
migrations.AddField(
model_name="mailrule",
model_name="workflowaction",
name="order",
field=models.IntegerField(default=0),
field=models.PositiveIntegerField(default=0, verbose_name="order"),
),
]

View File

@@ -1,25 +0,0 @@
# Generated by Django 1.9 on 2016-01-14 18:44
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0003_sender"),
]
operations = [
migrations.AlterField(
model_name="document",
name="sender",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="documents",
to="documents.Sender",
),
),
]

View File

@@ -1,178 +0,0 @@
# Generated by Django 4.2.13 on 2024-06-28 17:52
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
replaces = [
("documents", "0004_auto_20160114_1844"),
("documents", "0005_auto_20160123_0313"),
("documents", "0006_auto_20160123_0430"),
("documents", "0007_auto_20160126_2114"),
("documents", "0008_document_file_type"),
("documents", "0009_auto_20160214_0040"),
("documents", "0010_log"),
("documents", "0011_auto_20160303_1929"),
]
dependencies = [
("documents", "0003_sender"),
]
operations = [
migrations.AlterField(
model_name="document",
name="sender",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="documents",
to="documents.sender",
),
),
migrations.AlterModelOptions(
name="sender",
options={"ordering": ("name",)},
),
migrations.CreateModel(
name="Tag",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("name", models.CharField(max_length=128, unique=True)),
("slug", models.SlugField(blank=True)),
(
"colour",
models.PositiveIntegerField(
choices=[
(1, "#a6cee3"),
(2, "#1f78b4"),
(3, "#b2df8a"),
(4, "#33a02c"),
(5, "#fb9a99"),
(6, "#e31a1c"),
(7, "#fdbf6f"),
(8, "#ff7f00"),
(9, "#cab2d6"),
(10, "#6a3d9a"),
(11, "#b15928"),
(12, "#000000"),
(13, "#cccccc"),
],
default=1,
),
),
("match", models.CharField(blank=True, max_length=256)),
(
"matching_algorithm",
models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. If you don\'t know what a regex is, you probably don\'t want this option.',
),
),
],
options={
"abstract": False,
},
),
migrations.AlterField(
model_name="sender",
name="slug",
field=models.SlugField(blank=True),
),
migrations.AddField(
model_name="document",
name="file_type",
field=models.CharField(
choices=[
("pdf", "PDF"),
("png", "PNG"),
("jpg", "JPG"),
("gif", "GIF"),
("tiff", "TIFF"),
],
default="pdf",
editable=False,
max_length=4,
),
preserve_default=False,
),
migrations.AddField(
model_name="document",
name="tags",
field=models.ManyToManyField(
blank=True,
related_name="documents",
to="documents.tag",
),
),
migrations.CreateModel(
name="Log",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("group", models.UUIDField(blank=True)),
("message", models.TextField()),
(
"level",
models.PositiveIntegerField(
choices=[
(10, "Debugging"),
(20, "Informational"),
(30, "Warning"),
(40, "Error"),
(50, "Critical"),
],
default=20,
),
),
(
"component",
models.PositiveIntegerField(
choices=[(1, "Consumer"), (2, "Mail Fetcher")],
),
),
("created", models.DateTimeField(auto_now_add=True)),
("modified", models.DateTimeField(auto_now=True)),
],
options={
"ordering": ("-modified",),
},
),
migrations.RenameModel(
old_name="Sender",
new_name="Correspondent",
),
migrations.AlterModelOptions(
name="document",
options={"ordering": ("correspondent", "title")},
),
migrations.RenameField(
model_name="document",
old_name="sender",
new_name="correspondent",
),
]

View File

@@ -1,16 +1,16 @@
# Generated by Django 3.2.12 on 2022-03-11 15:18
# Generated by Django 5.2.9 on 2026-01-24 23:05
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("paperless_mail", "0010_auto_20220311_1602"),
("documents", "0003_workflowaction_order"),
]
operations = [
migrations.RemoveField(
model_name="mailrule",
name="assign_tag",
model_name="document",
name="storage_type",
),
]

View File

@@ -1,16 +0,0 @@
# Generated by Django 1.9 on 2016-01-23 03:13
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "0004_auto_20160114_1844"),
]
operations = [
migrations.AlterModelOptions(
name="sender",
options={"ordering": ("name",)},
),
]

View File

@@ -0,0 +1,43 @@
# Generated by Django 5.2.7 on 2025-12-17 22:25
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0004_remove_document_storage_type"),
]
operations = [
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_any_correspondents",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_any_correspondent",
to="documents.correspondent",
verbose_name="has one of these correspondents",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_any_document_types",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_any_document_type",
to="documents.documenttype",
verbose_name="has one of these document types",
),
),
migrations.AddField(
model_name="workflowtrigger",
name="filter_has_any_storage_paths",
field=models.ManyToManyField(
blank=True,
related_name="workflowtriggers_has_any_storage_path",
to="documents.storagepath",
verbose_name="has one of these storage paths",
),
),
]

View File

@@ -1,64 +0,0 @@
# Generated by Django 1.9 on 2016-01-23 04:30
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0005_auto_20160123_0313"),
]
operations = [
migrations.CreateModel(
name="Tag",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("name", models.CharField(max_length=128, unique=True)),
("slug", models.SlugField(blank=True)),
(
"colour",
models.PositiveIntegerField(
choices=[
(1, "#a6cee3"),
(2, "#1f78b4"),
(3, "#b2df8a"),
(4, "#33a02c"),
(5, "#fb9a99"),
(6, "#e31a1c"),
(7, "#fdbf6f"),
(8, "#ff7f00"),
(9, "#cab2d6"),
(10, "#6a3d9a"),
(11, "#ffff99"),
(12, "#b15928"),
(13, "#000000"),
(14, "#cccccc"),
],
default=1,
),
),
],
options={
"abstract": False,
},
),
migrations.AlterField(
model_name="sender",
name="slug",
field=models.SlugField(blank=True),
),
migrations.AddField(
model_name="document",
name="tags",
field=models.ManyToManyField(related_name="documents", to="documents.Tag"),
),
]

View File

@@ -1,55 +0,0 @@
# Generated by Django 1.9 on 2016-01-26 21:14
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0006_auto_20160123_0430"),
]
operations = [
migrations.AddField(
model_name="tag",
name="match",
field=models.CharField(blank=True, max_length=256),
),
migrations.AddField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
blank=True,
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
],
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. If you don\'t know what a regex is, you probably don\'t want this option.',
null=True,
),
),
migrations.AlterField(
model_name="tag",
name="colour",
field=models.PositiveIntegerField(
choices=[
(1, "#a6cee3"),
(2, "#1f78b4"),
(3, "#b2df8a"),
(4, "#33a02c"),
(5, "#fb9a99"),
(6, "#e31a1c"),
(7, "#fdbf6f"),
(8, "#ff7f00"),
(9, "#cab2d6"),
(10, "#6a3d9a"),
(11, "#b15928"),
(12, "#000000"),
(13, "#cccccc"),
],
default=1,
),
),
]

View File

@@ -1,39 +0,0 @@
# Generated by Django 1.9 on 2016-01-29 22:58
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0007_auto_20160126_2114"),
]
operations = [
migrations.AddField(
model_name="document",
name="file_type",
field=models.CharField(
choices=[
("pdf", "PDF"),
("png", "PNG"),
("jpg", "JPG"),
("gif", "GIF"),
("tiff", "TIFF"),
],
default="pdf",
editable=False,
max_length=4,
),
preserve_default=False,
),
migrations.AlterField(
model_name="document",
name="tags",
field=models.ManyToManyField(
blank=True,
related_name="documents",
to="documents.Tag",
),
),
]

View File

@@ -1,27 +0,0 @@
# Generated by Django 1.9 on 2016-02-14 00:40
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0008_document_file_type"),
]
operations = [
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. If you don\'t know what a regex is, you probably don\'t want this option.',
),
),
]

View File

@@ -1,53 +0,0 @@
# Generated by Django 1.9 on 2016-02-27 17:54
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0009_auto_20160214_0040"),
]
operations = [
migrations.CreateModel(
name="Log",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("group", models.UUIDField(blank=True)),
("message", models.TextField()),
(
"level",
models.PositiveIntegerField(
choices=[
(10, "Debugging"),
(20, "Informational"),
(30, "Warning"),
(40, "Error"),
(50, "Critical"),
],
default=20,
),
),
(
"component",
models.PositiveIntegerField(
choices=[(1, "Consumer"), (2, "Mail Fetcher")],
),
),
("created", models.DateTimeField(auto_now_add=True)),
("modified", models.DateTimeField(auto_now=True)),
],
options={
"ordering": ("-modified",),
},
),
]

View File

@@ -1,26 +0,0 @@
# Generated by Django 1.9.2 on 2016-03-03 19:29
from django.db import migrations
class Migration(migrations.Migration):
atomic = False
dependencies = [
("documents", "0010_log"),
]
operations = [
migrations.RenameModel(
old_name="Sender",
new_name="Correspondent",
),
migrations.AlterModelOptions(
name="document",
options={"ordering": ("correspondent", "title")},
),
migrations.RenameField(
model_name="document",
old_name="sender",
new_name="correspondent",
),
]

View File

@@ -1,128 +0,0 @@
# Generated by Django 1.9.2 on 2016-03-05 00:40
import os
import re
import shutil
import subprocess
import tempfile
from pathlib import Path
import gnupg
from django.conf import settings
from django.db import migrations
from django.utils.termcolors import colorize as colourise # Spelling hurts me
class GnuPG:
"""
A handy singleton to use when handling encrypted files.
"""
gpg = gnupg.GPG(gnupghome=settings.GNUPG_HOME)
@classmethod
def decrypted(cls, file_handle):
return cls.gpg.decrypt_file(file_handle, passphrase=settings.PASSPHRASE).data
@classmethod
def encrypted(cls, file_handle):
return cls.gpg.encrypt_file(
file_handle,
recipients=None,
passphrase=settings.PASSPHRASE,
symmetric=True,
).data
def move_documents_and_create_thumbnails(apps, schema_editor):
(Path(settings.MEDIA_ROOT) / "documents" / "originals").mkdir(
parents=True,
exist_ok=True,
)
(Path(settings.MEDIA_ROOT) / "documents" / "thumbnails").mkdir(
parents=True,
exist_ok=True,
)
documents: list[str] = os.listdir(Path(settings.MEDIA_ROOT) / "documents") # noqa: PTH208
if set(documents) == {"originals", "thumbnails"}:
return
print(
colourise(
"\n\n"
" This is a one-time only migration to generate thumbnails for all of your\n"
" documents so that future UIs will have something to work with. If you have\n"
" a lot of documents though, this may take a while, so a coffee break may be\n"
" in order."
"\n",
opts=("bold",),
),
)
Path(settings.SCRATCH_DIR).mkdir(parents=True, exist_ok=True)
for f in sorted(documents):
if not f.endswith("gpg"):
continue
print(
" {} {} {}".format(
colourise("*", fg="green"),
colourise("Generating a thumbnail for", fg="white"),
colourise(f, fg="cyan"),
),
)
thumb_temp: str = tempfile.mkdtemp(prefix="paperless", dir=settings.SCRATCH_DIR)
orig_temp: str = tempfile.mkdtemp(prefix="paperless", dir=settings.SCRATCH_DIR)
orig_source: Path = Path(settings.MEDIA_ROOT) / "documents" / f
orig_target: Path = Path(orig_temp) / f.replace(".gpg", "")
with orig_source.open("rb") as encrypted, orig_target.open("wb") as unencrypted:
unencrypted.write(GnuPG.decrypted(encrypted))
subprocess.Popen(
(
settings.CONVERT_BINARY,
"-scale",
"500x5000",
"-alpha",
"remove",
orig_target,
Path(thumb_temp) / "convert-%04d.png",
),
).wait()
thumb_source: Path = Path(thumb_temp) / "convert-0000.png"
thumb_target: Path = (
Path(settings.MEDIA_ROOT)
/ "documents"
/ "thumbnails"
/ re.sub(r"(\d+)\.\w+(\.gpg)", "\\1.png\\2", f)
)
with (
thumb_source.open("rb") as unencrypted,
thumb_target.open("wb") as encrypted,
):
encrypted.write(GnuPG.encrypted(unencrypted))
shutil.rmtree(thumb_temp)
shutil.rmtree(orig_temp)
shutil.move(
Path(settings.MEDIA_ROOT) / "documents" / f,
Path(settings.MEDIA_ROOT) / "documents" / "originals" / f,
)
class Migration(migrations.Migration):
dependencies = [
("documents", "0011_auto_20160303_1929"),
]
operations = [
migrations.RunPython(move_documents_and_create_thumbnails),
]

View File

@@ -1,42 +0,0 @@
# Generated by Django 1.9.4 on 2016-03-25 21:11
import django.utils.timezone
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0012_auto_20160305_0040"),
]
operations = [
migrations.AddField(
model_name="correspondent",
name="match",
field=models.CharField(blank=True, max_length=256),
),
migrations.AddField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. If you don\'t know what a regex is, you probably don\'t want this option.',
),
),
migrations.AlterField(
model_name="document",
name="created",
field=models.DateTimeField(default=django.utils.timezone.now),
),
migrations.RemoveField(
model_name="log",
name="component",
),
]

View File

@@ -1,182 +0,0 @@
# Generated by Django 1.9.4 on 2016-03-28 19:09
import hashlib
from pathlib import Path
import django.utils.timezone
import gnupg
from django.conf import settings
from django.db import migrations
from django.db import models
from django.template.defaultfilters import slugify
from django.utils.termcolors import colorize as colourise # Spelling hurts me
class GnuPG:
"""
A handy singleton to use when handling encrypted files.
"""
gpg = gnupg.GPG(gnupghome=settings.GNUPG_HOME)
@classmethod
def decrypted(cls, file_handle):
return cls.gpg.decrypt_file(file_handle, passphrase=settings.PASSPHRASE).data
@classmethod
def encrypted(cls, file_handle):
return cls.gpg.encrypt_file(
file_handle,
recipients=None,
passphrase=settings.PASSPHRASE,
symmetric=True,
).data
class Document:
"""
Django's migrations restrict access to model methods, so this is a snapshot
of the methods that existed at the time this migration was written, since
we need to make use of a lot of these shortcuts here.
"""
def __init__(self, doc):
self.pk = doc.pk
self.correspondent = doc.correspondent
self.title = doc.title
self.file_type = doc.file_type
self.tags = doc.tags
self.created = doc.created
def __str__(self):
created = self.created.strftime("%Y%m%d%H%M%S")
if self.correspondent and self.title:
return f"{created}: {self.correspondent} - {self.title}"
if self.correspondent or self.title:
return f"{created}: {self.correspondent or self.title}"
return str(created)
@property
def source_path(self):
return (
Path(settings.MEDIA_ROOT)
/ "documents"
/ "originals"
/ f"{self.pk:07}.{self.file_type}.gpg"
)
@property
def source_file(self):
return self.source_path.open("rb")
@property
def file_name(self):
return slugify(str(self)) + "." + self.file_type
def set_checksums(apps, schema_editor):
document_model = apps.get_model("documents", "Document")
if not document_model.objects.all().exists():
return
print(
colourise(
"\n\n"
" This is a one-time only migration to generate checksums for all\n"
" of your existing documents. If you have a lot of documents\n"
" though, this may take a while, so a coffee break may be in\n"
" order."
"\n",
opts=("bold",),
),
)
sums = {}
for d in document_model.objects.all():
document = Document(d)
print(
" {} {} {}".format(
colourise("*", fg="green"),
colourise("Generating a checksum for", fg="white"),
colourise(document.file_name, fg="cyan"),
),
)
with document.source_file as encrypted:
checksum = hashlib.md5(GnuPG.decrypted(encrypted)).hexdigest()
if checksum in sums:
error = "\n{line}{p1}\n\n{doc1}\n{doc2}\n\n{p2}\n\n{code}\n\n{p3}{line}".format(
p1=colourise(
"It appears that you have two identical documents in your collection and \nPaperless no longer supports this (see issue #97). The documents in question\nare:",
fg="yellow",
),
p2=colourise(
"To fix this problem, you'll have to remove one of them from the database, a task\nmost easily done by running the following command in the same\ndirectory as manage.py:",
fg="yellow",
),
p3=colourise(
"When that's finished, re-run the migrate, and provided that there aren't any\nother duplicates, you should be good to go.",
fg="yellow",
),
doc1=colourise(
f" * {sums[checksum][1]} (id: {sums[checksum][0]})",
fg="red",
),
doc2=colourise(
f" * {document.file_name} (id: {document.pk})",
fg="red",
),
code=colourise(
f" $ echo 'DELETE FROM documents_document WHERE id = {document.pk};' | ./manage.py dbshell",
fg="green",
),
line=colourise("\n{}\n".format("=" * 80), fg="white", opts=("bold",)),
)
raise RuntimeError(error)
sums[checksum] = (document.pk, document.file_name)
document_model.objects.filter(pk=document.pk).update(checksum=checksum)
def do_nothing(apps, schema_editor):
pass
class Migration(migrations.Migration):
dependencies = [
("documents", "0013_auto_20160325_2111"),
]
operations = [
migrations.AddField(
model_name="document",
name="checksum",
field=models.CharField(
default="-",
db_index=True,
editable=False,
max_length=32,
help_text="The checksum of the original document (before it "
"was encrypted). We use this to prevent duplicate "
"document imports.",
),
preserve_default=False,
),
migrations.RunPython(set_checksums, do_nothing),
migrations.AlterField(
model_name="document",
name="created",
field=models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
),
),
migrations.AlterField(
model_name="document",
name="modified",
field=models.DateTimeField(auto_now=True, db_index=True),
),
]

View File

@@ -1,33 +0,0 @@
# Generated by Django 1.10.2 on 2016-10-05 21:38
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0014_document_checksum"),
]
operations = [
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
help_text="The checksum of the original document (before it was encrypted). We use this to prevent duplicate document imports.",
max_length=32,
unique=True,
),
),
migrations.AddField(
model_name="correspondent",
name="is_insensitive",
field=models.BooleanField(default=True),
),
migrations.AddField(
model_name="tag",
name="is_insensitive",
field=models.BooleanField(default=True),
),
]

View File

@@ -1,92 +0,0 @@
# Generated by Django 4.2.13 on 2024-06-28 17:57
import django.db.models.deletion
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
replaces = [
("documents", "0015_add_insensitive_to_match"),
("documents", "0016_auto_20170325_1558"),
("documents", "0017_auto_20170512_0507"),
("documents", "0018_auto_20170715_1712"),
]
dependencies = [
("documents", "0014_document_checksum"),
]
operations = [
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
help_text="The checksum of the original document (before it was encrypted). We use this to prevent duplicate document imports.",
max_length=32,
unique=True,
),
),
migrations.AddField(
model_name="correspondent",
name="is_insensitive",
field=models.BooleanField(default=True),
),
migrations.AddField(
model_name="tag",
name="is_insensitive",
field=models.BooleanField(default=True),
),
migrations.AlterField(
model_name="document",
name="content",
field=models.TextField(
blank=True,
db_index=("mysql" not in settings.DATABASES["default"]["ENGINE"]),
help_text="The raw, text-only data of the document. This field is primarily used for searching.",
),
),
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
migrations.AlterField(
model_name="document",
name="correspondent",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.correspondent",
),
),
]

View File

@@ -1,23 +0,0 @@
# Generated by Django 1.10.5 on 2017-03-25 15:58
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0015_add_insensitive_to_match"),
]
operations = [
migrations.AlterField(
model_name="document",
name="content",
field=models.TextField(
blank=True,
db_index=("mysql" not in settings.DATABASES["default"]["ENGINE"]),
help_text="The raw, text-only data of the document. This field is primarily used for searching.",
),
),
]

View File

@@ -1,43 +0,0 @@
# Generated by Django 1.10.5 on 2017-05-12 05:07
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0016_auto_20170325_1558"),
]
operations = [
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
]

View File

@@ -1,25 +0,0 @@
# Generated by Django 1.10.5 on 2017-07-15 17:12
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0017_auto_20170512_0507"),
]
operations = [
migrations.AlterField(
model_name="document",
name="correspondent",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.Correspondent",
),
),
]

View File

@@ -1,22 +0,0 @@
# Generated by Django 1.10.5 on 2017-07-15 17:12
from django.contrib.auth.models import User
from django.db import migrations
def forwards_func(apps, schema_editor):
User.objects.create(username="consumer")
def reverse_func(apps, schema_editor):
User.objects.get(username="consumer").delete()
class Migration(migrations.Migration):
dependencies = [
("documents", "0018_auto_20170715_1712"),
]
operations = [
migrations.RunPython(forwards_func, reverse_func),
]

View File

@@ -1,29 +0,0 @@
import django.utils.timezone
from django.db import migrations
from django.db import models
def set_added_time_to_created_time(apps, schema_editor):
Document = apps.get_model("documents", "Document")
for doc in Document.objects.all():
doc.added = doc.created
doc.save()
class Migration(migrations.Migration):
dependencies = [
("documents", "0019_add_consumer_user"),
]
operations = [
migrations.AddField(
model_name="document",
name="added",
field=models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
editable=False,
),
),
migrations.RunPython(set_added_time_to_created_time),
]

View File

@@ -1,41 +0,0 @@
# Generated by Django 1.11.10 on 2018-02-04 13:07
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "0020_document_added"),
]
operations = [
# Add the field with the default GPG-encrypted value
migrations.AddField(
model_name="document",
name="storage_type",
field=models.CharField(
choices=[
("unencrypted", "Unencrypted"),
("gpg", "Encrypted with GNU Privacy Guard"),
],
default="gpg",
editable=False,
max_length=11,
),
),
# Now that the field is added, change the default to unencrypted
migrations.AlterField(
model_name="document",
name="storage_type",
field=models.CharField(
choices=[
("unencrypted", "Unencrypted"),
("gpg", "Encrypted with GNU Privacy Guard"),
],
default="unencrypted",
editable=False,
max_length=11,
),
),
]

View File

@@ -1,61 +0,0 @@
# Generated by Django 2.0.8 on 2018-10-07 14:20
from django.db import migrations
from django.db import models
from django.utils.text import slugify
def re_slug_all_the_things(apps, schema_editor):
"""
Rewrite all slug values to make sure they're actually slugs before we brand
them as uneditable.
"""
Tag = apps.get_model("documents", "Tag")
Correspondent = apps.get_model("documents", "Correspondent")
for klass in (Tag, Correspondent):
for instance in klass.objects.all():
klass.objects.filter(pk=instance.pk).update(slug=slugify(instance.slug))
class Migration(migrations.Migration):
dependencies = [
("documents", "0021_document_storage_type"),
]
operations = [
migrations.AlterModelOptions(
name="tag",
options={"ordering": ("name",)},
),
migrations.AlterField(
model_name="correspondent",
name="slug",
field=models.SlugField(blank=True, editable=False),
),
migrations.AlterField(
model_name="document",
name="file_type",
field=models.CharField(
choices=[
("pdf", "PDF"),
("png", "PNG"),
("jpg", "JPG"),
("gif", "GIF"),
("tiff", "TIFF"),
("txt", "TXT"),
("csv", "CSV"),
("md", "MD"),
],
editable=False,
max_length=4,
),
),
migrations.AlterField(
model_name="tag",
name="slug",
field=models.SlugField(blank=True, editable=False),
),
migrations.RunPython(re_slug_all_the_things, migrations.RunPython.noop),
]

View File

@@ -1,39 +0,0 @@
# Generated by Django 2.0.10 on 2019-04-26 18:57
from django.db import migrations
from django.db import models
def set_filename(apps, schema_editor):
Document = apps.get_model("documents", "Document")
for doc in Document.objects.all():
file_name = f"{doc.pk:07}.{doc.file_type}"
if doc.storage_type == "gpg":
file_name += ".gpg"
# Set filename
doc.filename = file_name
# Save document
doc.save()
class Migration(migrations.Migration):
dependencies = [
("documents", "0022_auto_20181007_1420"),
]
operations = [
migrations.AddField(
model_name="document",
name="filename",
field=models.FilePathField(
default=None,
null=True,
editable=False,
help_text="Current filename in storage",
max_length=256,
),
),
migrations.RunPython(set_filename),
]

View File

@@ -1,147 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-07 12:35
import uuid
import django.db.models.deletion
from django.db import migrations
from django.db import models
def logs_set_default_group(apps, schema_editor):
Log = apps.get_model("documents", "Log")
for log in Log.objects.all():
if log.group is None:
log.group = uuid.uuid4()
log.save()
class Migration(migrations.Migration):
dependencies = [
("documents", "0023_document_current_filename"),
]
operations = [
migrations.AddField(
model_name="document",
name="archive_serial_number",
field=models.IntegerField(
blank=True,
db_index=True,
help_text="The position of this document in your physical document archive.",
null=True,
unique=True,
),
),
migrations.AddField(
model_name="tag",
name="is_inbox_tag",
field=models.BooleanField(
default=False,
help_text="Marks this tag as an inbox tag: All newly consumed documents will be tagged with inbox tags.",
),
),
migrations.CreateModel(
name="DocumentType",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("name", models.CharField(max_length=128, unique=True)),
("slug", models.SlugField(blank=True, editable=False)),
("match", models.CharField(blank=True, max_length=256)),
(
"matching_algorithm",
models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
(6, "Automatic Classification"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
("is_insensitive", models.BooleanField(default=True)),
],
options={
"abstract": False,
"ordering": ("name",),
},
),
migrations.AddField(
model_name="document",
name="document_type",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.documenttype",
),
),
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
(6, "Automatic Classification"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any"),
(2, "All"),
(3, "Literal"),
(4, "Regular Expression"),
(5, "Fuzzy Match"),
(6, "Automatic Classification"),
],
default=1,
help_text='Which algorithm you want to use when matching text to the OCR\'d PDF. Here, "any" looks for any occurrence of any word provided in the PDF, while "all" requires that every word provided appear in the PDF, albeit not in the order provided. A "literal" match means that the text you enter must appear in the PDF exactly as you\'ve entered it, and "regular expression" uses a regex to match the PDF. (If you don\'t know what a regex is, you probably don\'t want this option.) Finally, a "fuzzy match" looks for words or phrases that are mostly—but not exactly—the same, which can be useful for matching against documents containing imperfections that foil accurate OCR.',
),
),
migrations.AlterField(
model_name="document",
name="content",
field=models.TextField(
blank=True,
help_text="The raw, text-only data of the document. This field is primarily used for searching.",
),
),
migrations.AlterModelOptions(
name="log",
options={"ordering": ("-created",)},
),
migrations.RemoveField(
model_name="log",
name="modified",
),
migrations.AlterField(
model_name="log",
name="group",
field=models.UUIDField(blank=True, null=True),
),
migrations.RunPython(
code=django.db.migrations.operations.special.RunPython.noop,
reverse_code=logs_set_default_group,
),
]

View File

@@ -1,13 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-09 16:36
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "1000_update_paperless_all"),
]
operations = [
migrations.RunPython(migrations.RunPython.noop, migrations.RunPython.noop),
]

View File

@@ -1,24 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-11 11:05
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1001_auto_20201109_1636"),
]
operations = [
migrations.AlterField(
model_name="document",
name="filename",
field=models.FilePathField(
default=None,
editable=False,
help_text="Current filename in storage",
max_length=1024,
null=True,
),
),
]

View File

@@ -1,92 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-20 11:21
from pathlib import Path
import magic
from django.conf import settings
from django.db import migrations
from django.db import models
from paperless.db import GnuPG
STORAGE_TYPE_UNENCRYPTED = "unencrypted"
STORAGE_TYPE_GPG = "gpg"
def source_path(self) -> Path:
if self.filename:
fname: str = str(self.filename)
else:
fname = f"{self.pk:07}.{self.file_type}"
if self.storage_type == STORAGE_TYPE_GPG:
fname += ".gpg"
return Path(settings.ORIGINALS_DIR) / fname
def add_mime_types(apps, schema_editor):
Document = apps.get_model("documents", "Document")
documents = Document.objects.all()
for d in documents:
with Path(source_path(d)).open("rb") as f:
if d.storage_type == STORAGE_TYPE_GPG:
data = GnuPG.decrypted(f)
else:
data = f.read(1024)
d.mime_type = magic.from_buffer(data, mime=True)
d.save()
def add_file_extensions(apps, schema_editor):
Document = apps.get_model("documents", "Document")
documents = Document.objects.all()
for d in documents:
d.file_type = Path(d.filename).suffix.lstrip(".")
d.save()
class Migration(migrations.Migration):
dependencies = [
("documents", "1002_auto_20201111_1105"),
]
operations = [
migrations.AddField(
model_name="document",
name="mime_type",
field=models.CharField(default="-", editable=False, max_length=256),
preserve_default=False,
),
migrations.RunPython(add_mime_types, migrations.RunPython.noop),
# This operation is here so that we can revert the entire migration:
# By allowing this field to be blank and null, we can revert the
# remove operation further down and the database won't complain about
# NOT NULL violations.
migrations.AlterField(
model_name="document",
name="file_type",
field=models.CharField(
choices=[
("pdf", "PDF"),
("png", "PNG"),
("jpg", "JPG"),
("gif", "GIF"),
("tiff", "TIFF"),
("txt", "TXT"),
("csv", "CSV"),
("md", "MD"),
],
editable=False,
max_length=4,
null=True,
blank=True,
),
),
migrations.RunPython(migrations.RunPython.noop, add_file_extensions),
migrations.RemoveField(
model_name="document",
name="file_type",
),
]

View File

@@ -1,12 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-25 14:53
from django.db import migrations
from django.db.migrations import RunPython
class Migration(migrations.Migration):
dependencies = [
("documents", "1003_mime_types"),
]
operations = [RunPython(migrations.RunPython.noop, migrations.RunPython.noop)]

View File

@@ -1,34 +0,0 @@
# Generated by Django 3.1.3 on 2020-11-29 00:48
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1004_sanity_check_schedule"),
]
operations = [
migrations.AddField(
model_name="document",
name="archive_checksum",
field=models.CharField(
blank=True,
editable=False,
help_text="The checksum of the archived document.",
max_length=32,
null=True,
),
),
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
help_text="The checksum of the original document.",
max_length=32,
unique=True,
),
),
]

View File

@@ -1,24 +0,0 @@
# Generated by Django 3.1.4 on 2020-12-08 22:09
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "1005_checksums"),
]
operations = [
migrations.RemoveField(
model_name="correspondent",
name="slug",
),
migrations.RemoveField(
model_name="documenttype",
name="slug",
),
migrations.RemoveField(
model_name="tag",
name="slug",
),
]

View File

@@ -1,485 +0,0 @@
# Generated by Django 4.2.13 on 2024-06-28 18:01
import django.db.models.deletion
import django.utils.timezone
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
replaces = [
("documents", "1006_auto_20201208_2209"),
("documents", "1007_savedview_savedviewfilterrule"),
("documents", "1008_auto_20201216_1736"),
("documents", "1009_auto_20201216_2005"),
("documents", "1010_auto_20210101_2159"),
("documents", "1011_auto_20210101_2340"),
]
dependencies = [
("documents", "1005_checksums"),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
operations = [
migrations.RemoveField(
model_name="correspondent",
name="slug",
),
migrations.RemoveField(
model_name="documenttype",
name="slug",
),
migrations.RemoveField(
model_name="tag",
name="slug",
),
migrations.CreateModel(
name="SavedView",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("name", models.CharField(max_length=128, verbose_name="name")),
(
"show_on_dashboard",
models.BooleanField(verbose_name="show on dashboard"),
),
(
"show_in_sidebar",
models.BooleanField(verbose_name="show in sidebar"),
),
(
"sort_field",
models.CharField(max_length=128, verbose_name="sort field"),
),
(
"sort_reverse",
models.BooleanField(default=False, verbose_name="sort reverse"),
),
(
"user",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
to=settings.AUTH_USER_MODEL,
verbose_name="user",
),
),
],
options={
"ordering": ("name",),
"verbose_name": "saved view",
"verbose_name_plural": "saved views",
},
),
migrations.AlterModelOptions(
name="correspondent",
options={
"ordering": ("name",),
"verbose_name": "correspondent",
"verbose_name_plural": "correspondents",
},
),
migrations.AlterModelOptions(
name="document",
options={
"ordering": ("-created",),
"verbose_name": "document",
"verbose_name_plural": "documents",
},
),
migrations.AlterModelOptions(
name="documenttype",
options={
"verbose_name": "document type",
"verbose_name_plural": "document types",
},
),
migrations.AlterModelOptions(
name="log",
options={
"ordering": ("-created",),
"verbose_name": "log",
"verbose_name_plural": "logs",
},
),
migrations.AlterModelOptions(
name="tag",
options={"verbose_name": "tag", "verbose_name_plural": "tags"},
),
migrations.AlterField(
model_name="correspondent",
name="is_insensitive",
field=models.BooleanField(default=True, verbose_name="is insensitive"),
),
migrations.AlterField(
model_name="correspondent",
name="match",
field=models.CharField(blank=True, max_length=256, verbose_name="match"),
),
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="correspondent",
name="name",
field=models.CharField(max_length=128, unique=True, verbose_name="name"),
),
migrations.AlterField(
model_name="document",
name="added",
field=models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
editable=False,
verbose_name="added",
),
),
migrations.AlterField(
model_name="document",
name="archive_checksum",
field=models.CharField(
blank=True,
editable=False,
help_text="The checksum of the archived document.",
max_length=32,
null=True,
verbose_name="archive checksum",
),
),
migrations.AlterField(
model_name="document",
name="archive_serial_number",
field=models.IntegerField(
blank=True,
db_index=True,
help_text="The position of this document in your physical document archive.",
null=True,
unique=True,
verbose_name="archive serial number",
),
),
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
help_text="The checksum of the original document.",
max_length=32,
unique=True,
verbose_name="checksum",
),
),
migrations.AlterField(
model_name="document",
name="content",
field=models.TextField(
blank=True,
help_text="The raw, text-only data of the document. This field is primarily used for searching.",
verbose_name="content",
),
),
migrations.AlterField(
model_name="document",
name="correspondent",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.correspondent",
verbose_name="correspondent",
),
),
migrations.AlterField(
model_name="document",
name="created",
field=models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
verbose_name="created",
),
),
migrations.AlterField(
model_name="document",
name="document_type",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.documenttype",
verbose_name="document type",
),
),
migrations.AlterField(
model_name="document",
name="filename",
field=models.FilePathField(
default=None,
editable=False,
help_text="Current filename in storage",
max_length=1024,
null=True,
verbose_name="filename",
),
),
migrations.AlterField(
model_name="document",
name="mime_type",
field=models.CharField(
editable=False,
max_length=256,
verbose_name="mime type",
),
),
migrations.AlterField(
model_name="document",
name="modified",
field=models.DateTimeField(
auto_now=True,
db_index=True,
verbose_name="modified",
),
),
migrations.AlterField(
model_name="document",
name="storage_type",
field=models.CharField(
choices=[
("unencrypted", "Unencrypted"),
("gpg", "Encrypted with GNU Privacy Guard"),
],
default="unencrypted",
editable=False,
max_length=11,
verbose_name="storage type",
),
),
migrations.AlterField(
model_name="document",
name="tags",
field=models.ManyToManyField(
blank=True,
related_name="documents",
to="documents.tag",
verbose_name="tags",
),
),
migrations.AlterField(
model_name="document",
name="title",
field=models.CharField(
blank=True,
db_index=True,
max_length=128,
verbose_name="title",
),
),
migrations.AlterField(
model_name="documenttype",
name="is_insensitive",
field=models.BooleanField(default=True, verbose_name="is insensitive"),
),
migrations.AlterField(
model_name="documenttype",
name="match",
field=models.CharField(blank=True, max_length=256, verbose_name="match"),
),
migrations.AlterField(
model_name="documenttype",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="documenttype",
name="name",
field=models.CharField(max_length=128, unique=True, verbose_name="name"),
),
migrations.AlterField(
model_name="log",
name="created",
field=models.DateTimeField(auto_now_add=True, verbose_name="created"),
),
migrations.AlterField(
model_name="log",
name="group",
field=models.UUIDField(blank=True, null=True, verbose_name="group"),
),
migrations.AlterField(
model_name="log",
name="level",
field=models.PositiveIntegerField(
choices=[
(10, "debug"),
(20, "information"),
(30, "warning"),
(40, "error"),
(50, "critical"),
],
default=20,
verbose_name="level",
),
),
migrations.AlterField(
model_name="log",
name="message",
field=models.TextField(verbose_name="message"),
),
migrations.CreateModel(
name="SavedViewFilterRule",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"rule_type",
models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
],
verbose_name="rule type",
),
),
(
"value",
models.CharField(
blank=True,
max_length=128,
null=True,
verbose_name="value",
),
),
(
"saved_view",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="filter_rules",
to="documents.savedview",
verbose_name="saved view",
),
),
],
options={
"verbose_name": "filter rule",
"verbose_name_plural": "filter rules",
},
),
migrations.AlterField(
model_name="tag",
name="colour",
field=models.PositiveIntegerField(
choices=[
(1, "#a6cee3"),
(2, "#1f78b4"),
(3, "#b2df8a"),
(4, "#33a02c"),
(5, "#fb9a99"),
(6, "#e31a1c"),
(7, "#fdbf6f"),
(8, "#ff7f00"),
(9, "#cab2d6"),
(10, "#6a3d9a"),
(11, "#b15928"),
(12, "#000000"),
(13, "#cccccc"),
],
default=1,
verbose_name="color",
),
),
migrations.AlterField(
model_name="tag",
name="is_inbox_tag",
field=models.BooleanField(
default=False,
help_text="Marks this tag as an inbox tag: All newly consumed documents will be tagged with inbox tags.",
verbose_name="is inbox tag",
),
),
migrations.AlterField(
model_name="tag",
name="is_insensitive",
field=models.BooleanField(default=True, verbose_name="is insensitive"),
),
migrations.AlterField(
model_name="tag",
name="match",
field=models.CharField(blank=True, max_length=256, verbose_name="match"),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="tag",
name="name",
field=models.CharField(max_length=128, unique=True, verbose_name="name"),
),
]

View File

@@ -1,90 +0,0 @@
# Generated by Django 3.1.4 on 2020-12-12 14:41
import django.db.models.deletion
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1006_auto_20201208_2209"),
]
operations = [
migrations.CreateModel(
name="SavedView",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("name", models.CharField(max_length=128)),
("show_on_dashboard", models.BooleanField()),
("show_in_sidebar", models.BooleanField()),
("sort_field", models.CharField(max_length=128)),
("sort_reverse", models.BooleanField(default=False)),
(
"user",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
to=settings.AUTH_USER_MODEL,
),
),
],
),
migrations.CreateModel(
name="SavedViewFilterRule",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"rule_type",
models.PositiveIntegerField(
choices=[
(0, "Title contains"),
(1, "Content contains"),
(2, "ASN is"),
(3, "Correspondent is"),
(4, "Document type is"),
(5, "Is in inbox"),
(6, "Has tag"),
(7, "Has any tag"),
(8, "Created before"),
(9, "Created after"),
(10, "Created year is"),
(11, "Created month is"),
(12, "Created day is"),
(13, "Added before"),
(14, "Added after"),
(15, "Modified before"),
(16, "Modified after"),
(17, "Does not have tag"),
],
),
),
("value", models.CharField(max_length=128)),
(
"saved_view",
models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="filter_rules",
to="documents.savedview",
),
),
],
),
]

View File

@@ -1,33 +0,0 @@
# Generated by Django 3.1.4 on 2020-12-16 17:36
import django.db.models.functions.text
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "1007_savedview_savedviewfilterrule"),
]
operations = [
migrations.AlterModelOptions(
name="correspondent",
options={"ordering": (django.db.models.functions.text.Lower("name"),)},
),
migrations.AlterModelOptions(
name="document",
options={"ordering": ("-created",)},
),
migrations.AlterModelOptions(
name="documenttype",
options={"ordering": (django.db.models.functions.text.Lower("name"),)},
),
migrations.AlterModelOptions(
name="savedview",
options={"ordering": (django.db.models.functions.text.Lower("name"),)},
),
migrations.AlterModelOptions(
name="tag",
options={"ordering": (django.db.models.functions.text.Lower("name"),)},
),
]

View File

@@ -1,28 +0,0 @@
# Generated by Django 3.1.4 on 2020-12-16 20:05
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "1008_auto_20201216_1736"),
]
operations = [
migrations.AlterModelOptions(
name="correspondent",
options={"ordering": ("name",)},
),
migrations.AlterModelOptions(
name="documenttype",
options={"ordering": ("name",)},
),
migrations.AlterModelOptions(
name="savedview",
options={"ordering": ("name",)},
),
migrations.AlterModelOptions(
name="tag",
options={"ordering": ("name",)},
),
]

View File

@@ -1,18 +0,0 @@
# Generated by Django 3.1.4 on 2021-01-01 21:59
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1009_auto_20201216_2005"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="value",
field=models.CharField(blank=True, max_length=128, null=True),
),
]

View File

@@ -1,454 +0,0 @@
# Generated by Django 3.1.4 on 2021-01-01 23:40
import django.db.models.deletion
import django.utils.timezone
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1010_auto_20210101_2159"),
]
operations = [
migrations.AlterModelOptions(
name="correspondent",
options={
"ordering": ("name",),
"verbose_name": "correspondent",
"verbose_name_plural": "correspondents",
},
),
migrations.AlterModelOptions(
name="document",
options={
"ordering": ("-created",),
"verbose_name": "document",
"verbose_name_plural": "documents",
},
),
migrations.AlterModelOptions(
name="documenttype",
options={
"verbose_name": "document type",
"verbose_name_plural": "document types",
},
),
migrations.AlterModelOptions(
name="log",
options={
"ordering": ("-created",),
"verbose_name": "log",
"verbose_name_plural": "logs",
},
),
migrations.AlterModelOptions(
name="savedview",
options={
"ordering": ("name",),
"verbose_name": "saved view",
"verbose_name_plural": "saved views",
},
),
migrations.AlterModelOptions(
name="savedviewfilterrule",
options={
"verbose_name": "filter rule",
"verbose_name_plural": "filter rules",
},
),
migrations.AlterModelOptions(
name="tag",
options={"verbose_name": "tag", "verbose_name_plural": "tags"},
),
migrations.AlterField(
model_name="correspondent",
name="is_insensitive",
field=models.BooleanField(default=True, verbose_name="is insensitive"),
),
migrations.AlterField(
model_name="correspondent",
name="match",
field=models.CharField(blank=True, max_length=256, verbose_name="match"),
),
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="correspondent",
name="name",
field=models.CharField(max_length=128, unique=True, verbose_name="name"),
),
migrations.AlterField(
model_name="document",
name="added",
field=models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
editable=False,
verbose_name="added",
),
),
migrations.AlterField(
model_name="document",
name="archive_checksum",
field=models.CharField(
blank=True,
editable=False,
help_text="The checksum of the archived document.",
max_length=32,
null=True,
verbose_name="archive checksum",
),
),
migrations.AlterField(
model_name="document",
name="archive_serial_number",
field=models.IntegerField(
blank=True,
db_index=True,
help_text="The position of this document in your physical document archive.",
null=True,
unique=True,
verbose_name="archive serial number",
),
),
migrations.AlterField(
model_name="document",
name="checksum",
field=models.CharField(
editable=False,
help_text="The checksum of the original document.",
max_length=32,
unique=True,
verbose_name="checksum",
),
),
migrations.AlterField(
model_name="document",
name="content",
field=models.TextField(
blank=True,
help_text="The raw, text-only data of the document. This field is primarily used for searching.",
verbose_name="content",
),
),
migrations.AlterField(
model_name="document",
name="correspondent",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.correspondent",
verbose_name="correspondent",
),
),
migrations.AlterField(
model_name="document",
name="created",
field=models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
verbose_name="created",
),
),
migrations.AlterField(
model_name="document",
name="document_type",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.documenttype",
verbose_name="document type",
),
),
migrations.AlterField(
model_name="document",
name="filename",
field=models.FilePathField(
default=None,
editable=False,
help_text="Current filename in storage",
max_length=1024,
null=True,
verbose_name="filename",
),
),
migrations.AlterField(
model_name="document",
name="mime_type",
field=models.CharField(
editable=False,
max_length=256,
verbose_name="mime type",
),
),
migrations.AlterField(
model_name="document",
name="modified",
field=models.DateTimeField(
auto_now=True,
db_index=True,
verbose_name="modified",
),
),
migrations.AlterField(
model_name="document",
name="storage_type",
field=models.CharField(
choices=[
("unencrypted", "Unencrypted"),
("gpg", "Encrypted with GNU Privacy Guard"),
],
default="unencrypted",
editable=False,
max_length=11,
verbose_name="storage type",
),
),
migrations.AlterField(
model_name="document",
name="tags",
field=models.ManyToManyField(
blank=True,
related_name="documents",
to="documents.Tag",
verbose_name="tags",
),
),
migrations.AlterField(
model_name="document",
name="title",
field=models.CharField(
blank=True,
db_index=True,
max_length=128,
verbose_name="title",
),
),
migrations.AlterField(
model_name="documenttype",
name="is_insensitive",
field=models.BooleanField(default=True, verbose_name="is insensitive"),
),
migrations.AlterField(
model_name="documenttype",
name="match",
field=models.CharField(blank=True, max_length=256, verbose_name="match"),
),
migrations.AlterField(
model_name="documenttype",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="documenttype",
name="name",
field=models.CharField(max_length=128, unique=True, verbose_name="name"),
),
migrations.AlterField(
model_name="log",
name="created",
field=models.DateTimeField(auto_now_add=True, verbose_name="created"),
),
migrations.AlterField(
model_name="log",
name="group",
field=models.UUIDField(blank=True, null=True, verbose_name="group"),
),
migrations.AlterField(
model_name="log",
name="level",
field=models.PositiveIntegerField(
choices=[
(10, "debug"),
(20, "information"),
(30, "warning"),
(40, "error"),
(50, "critical"),
],
default=20,
verbose_name="level",
),
),
migrations.AlterField(
model_name="log",
name="message",
field=models.TextField(verbose_name="message"),
),
migrations.AlterField(
model_name="savedview",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AlterField(
model_name="savedview",
name="show_in_sidebar",
field=models.BooleanField(verbose_name="show in sidebar"),
),
migrations.AlterField(
model_name="savedview",
name="show_on_dashboard",
field=models.BooleanField(verbose_name="show on dashboard"),
),
migrations.AlterField(
model_name="savedview",
name="sort_field",
field=models.CharField(max_length=128, verbose_name="sort field"),
),
migrations.AlterField(
model_name="savedview",
name="sort_reverse",
field=models.BooleanField(default=False, verbose_name="sort reverse"),
),
migrations.AlterField(
model_name="savedview",
name="user",
field=models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
to=settings.AUTH_USER_MODEL,
verbose_name="user",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
],
verbose_name="rule type",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="saved_view",
field=models.ForeignKey(
on_delete=django.db.models.deletion.CASCADE,
related_name="filter_rules",
to="documents.savedview",
verbose_name="saved view",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="value",
field=models.CharField(
blank=True,
max_length=128,
null=True,
verbose_name="value",
),
),
migrations.AlterField(
model_name="tag",
name="colour",
field=models.PositiveIntegerField(
choices=[
(1, "#a6cee3"),
(2, "#1f78b4"),
(3, "#b2df8a"),
(4, "#33a02c"),
(5, "#fb9a99"),
(6, "#e31a1c"),
(7, "#fdbf6f"),
(8, "#ff7f00"),
(9, "#cab2d6"),
(10, "#6a3d9a"),
(11, "#b15928"),
(12, "#000000"),
(13, "#cccccc"),
],
default=1,
verbose_name="color",
),
),
migrations.AlterField(
model_name="tag",
name="is_inbox_tag",
field=models.BooleanField(
default=False,
help_text="Marks this tag as an inbox tag: All newly consumed documents will be tagged with inbox tags.",
verbose_name="is inbox tag",
),
),
migrations.AlterField(
model_name="tag",
name="is_insensitive",
field=models.BooleanField(default=True, verbose_name="is insensitive"),
),
migrations.AlterField(
model_name="tag",
name="match",
field=models.CharField(blank=True, max_length=256, verbose_name="match"),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="tag",
name="name",
field=models.CharField(max_length=128, unique=True, verbose_name="name"),
),
]

View File

@@ -1,367 +0,0 @@
# Generated by Django 3.1.6 on 2021-02-07 22:26
import datetime
import hashlib
import logging
import os
import shutil
from collections import defaultdict
from pathlib import Path
from time import sleep
import pathvalidate
from django.conf import settings
from django.db import migrations
from django.db import models
from django.template.defaultfilters import slugify
logger = logging.getLogger("paperless.migrations")
###############################################################################
# This is code copied straight paperless before the change.
###############################################################################
class defaultdictNoStr(defaultdict):
def __str__(self): # pragma: no cover
raise ValueError("Don't use {tags} directly.")
def many_to_dictionary(field): # pragma: no cover
# Converts ManyToManyField to dictionary by assuming, that field
# entries contain an _ or - which will be used as a delimiter
mydictionary = dict()
for index, t in enumerate(field.all()):
# Populate tag names by index
mydictionary[index] = slugify(t.name)
# Find delimiter
delimiter = t.name.find("_")
if delimiter == -1:
delimiter = t.name.find("-")
if delimiter == -1:
continue
key = t.name[:delimiter]
value = t.name[delimiter + 1 :]
mydictionary[slugify(key)] = slugify(value)
return mydictionary
def archive_name_from_filename(filename: Path) -> Path:
return Path(filename.stem + ".pdf")
def archive_path_old(doc) -> Path:
if doc.filename:
fname = archive_name_from_filename(Path(doc.filename))
else:
fname = Path(f"{doc.pk:07}.pdf")
return settings.ARCHIVE_DIR / fname
STORAGE_TYPE_GPG = "gpg"
def archive_path_new(doc) -> Path | None:
if doc.archive_filename is not None:
return settings.ARCHIVE_DIR / doc.archive_filename
else:
return None
def source_path(doc) -> Path:
if doc.filename:
fname = doc.filename
else:
fname = f"{doc.pk:07}{doc.file_type}"
if doc.storage_type == STORAGE_TYPE_GPG:
fname = Path(str(fname) + ".gpg") # pragma: no cover
return settings.ORIGINALS_DIR / fname
def generate_unique_filename(doc, *, archive_filename=False):
if archive_filename:
old_filename = doc.archive_filename
root = settings.ARCHIVE_DIR
else:
old_filename = doc.filename
root = settings.ORIGINALS_DIR
counter = 0
while True:
new_filename = generate_filename(
doc,
counter=counter,
archive_filename=archive_filename,
)
if new_filename == old_filename:
# still the same as before.
return new_filename
if (root / new_filename).exists():
counter += 1
else:
return new_filename
def generate_filename(doc, *, counter=0, append_gpg=True, archive_filename=False):
path = ""
try:
if settings.FILENAME_FORMAT is not None:
tags = defaultdictNoStr(lambda: slugify(None), many_to_dictionary(doc.tags))
tag_list = pathvalidate.sanitize_filename(
",".join(sorted([tag.name for tag in doc.tags.all()])),
replacement_text="-",
)
if doc.correspondent:
correspondent = pathvalidate.sanitize_filename(
doc.correspondent.name,
replacement_text="-",
)
else:
correspondent = "none"
if doc.document_type:
document_type = pathvalidate.sanitize_filename(
doc.document_type.name,
replacement_text="-",
)
else:
document_type = "none"
path = settings.FILENAME_FORMAT.format(
title=pathvalidate.sanitize_filename(doc.title, replacement_text="-"),
correspondent=correspondent,
document_type=document_type,
created=datetime.date.isoformat(doc.created),
created_year=doc.created.year if doc.created else "none",
created_month=f"{doc.created.month:02}" if doc.created else "none",
created_day=f"{doc.created.day:02}" if doc.created else "none",
added=datetime.date.isoformat(doc.added),
added_year=doc.added.year if doc.added else "none",
added_month=f"{doc.added.month:02}" if doc.added else "none",
added_day=f"{doc.added.day:02}" if doc.added else "none",
tags=tags,
tag_list=tag_list,
).strip()
path = path.strip(os.sep)
except (ValueError, KeyError, IndexError):
logger.warning(
f"Invalid PAPERLESS_FILENAME_FORMAT: "
f"{settings.FILENAME_FORMAT}, falling back to default",
)
counter_str = f"_{counter:02}" if counter else ""
filetype_str = ".pdf" if archive_filename else doc.file_type
if len(path) > 0:
filename = f"{path}{counter_str}{filetype_str}"
else:
filename = f"{doc.pk:07}{counter_str}{filetype_str}"
# Append .gpg for encrypted files
if append_gpg and doc.storage_type == STORAGE_TYPE_GPG:
filename += ".gpg"
return filename
###############################################################################
# This code performs bidirection archive file transformation.
###############################################################################
def parse_wrapper(parser, path, mime_type, file_name):
# this is here so that I can mock this out for testing.
parser.parse(path, mime_type, file_name)
def create_archive_version(doc, retry_count=3):
from documents.parsers import DocumentParser
from documents.parsers import ParseError
from documents.parsers import get_parser_class_for_mime_type
logger.info(f"Regenerating archive document for document ID:{doc.id}")
parser_class = get_parser_class_for_mime_type(doc.mime_type)
for try_num in range(retry_count):
parser: DocumentParser = parser_class(None, None)
try:
parse_wrapper(
parser,
source_path(doc),
doc.mime_type,
Path(doc.filename).name,
)
doc.content = parser.get_text()
if parser.get_archive_path() and Path(parser.get_archive_path()).is_file():
doc.archive_filename = generate_unique_filename(
doc,
archive_filename=True,
)
with Path(parser.get_archive_path()).open("rb") as f:
doc.archive_checksum = hashlib.md5(f.read()).hexdigest()
archive_path_new(doc).parent.mkdir(parents=True, exist_ok=True)
shutil.copy2(parser.get_archive_path(), archive_path_new(doc))
else:
doc.archive_checksum = None
logger.error(
f"Parser did not return an archive document for document "
f"ID:{doc.id}. Removing archive document.",
)
doc.save()
return
except ParseError:
if try_num + 1 == retry_count:
logger.exception(
f"Unable to regenerate archive document for ID:{doc.id}. You "
f"need to invoke the document_archiver management command "
f"manually for that document.",
)
doc.archive_checksum = None
doc.save()
return
else:
# This is mostly here for the tika parser in docker
# environments. The servers for parsing need to come up first,
# and the docker setup doesn't ensure that tika is running
# before attempting migrations.
logger.error("Parse error, will try again in 5 seconds...")
sleep(5)
finally:
parser.cleanup()
def move_old_to_new_locations(apps, schema_editor):
Document = apps.get_model("documents", "Document")
affected_document_ids = set()
old_archive_path_to_id = {}
# check for documents that have incorrect archive versions
for doc in Document.objects.filter(archive_checksum__isnull=False):
old_path = archive_path_old(doc)
if old_path in old_archive_path_to_id:
affected_document_ids.add(doc.id)
affected_document_ids.add(old_archive_path_to_id[old_path])
else:
old_archive_path_to_id[old_path] = doc.id
# check that archive files of all unaffected documents are in place
for doc in Document.objects.filter(archive_checksum__isnull=False):
old_path = archive_path_old(doc)
if doc.id not in affected_document_ids and not old_path.is_file():
raise ValueError(
f"Archived document ID:{doc.id} does not exist at: {old_path}",
)
# check that we can regenerate affected archive versions
for doc_id in affected_document_ids:
from documents.parsers import get_parser_class_for_mime_type
doc = Document.objects.get(id=doc_id)
parser_class = get_parser_class_for_mime_type(doc.mime_type)
if not parser_class:
raise ValueError(
f"Document ID:{doc.id} has an invalid archived document, "
f"but no parsers are available. Cannot migrate.",
)
for doc in Document.objects.filter(archive_checksum__isnull=False):
if doc.id in affected_document_ids:
old_path = archive_path_old(doc)
# remove affected archive versions
if old_path.is_file():
logger.debug(f"Removing {old_path}")
old_path.unlink()
else:
# Set archive path for unaffected files
doc.archive_filename = archive_name_from_filename(Path(doc.filename))
Document.objects.filter(id=doc.id).update(
archive_filename=doc.archive_filename,
)
# regenerate archive documents
for doc_id in affected_document_ids:
doc = Document.objects.get(id=doc_id)
create_archive_version(doc)
def move_new_to_old_locations(apps, schema_editor):
Document = apps.get_model("documents", "Document")
old_archive_paths = set()
for doc in Document.objects.filter(archive_checksum__isnull=False):
new_archive_path = archive_path_new(doc)
old_archive_path = archive_path_old(doc)
if old_archive_path in old_archive_paths:
raise ValueError(
f"Cannot migrate: Archive file name {old_archive_path} of "
f"document {doc.filename} would clash with another archive "
f"filename.",
)
old_archive_paths.add(old_archive_path)
if new_archive_path != old_archive_path and old_archive_path.is_file():
raise ValueError(
f"Cannot migrate: Cannot move {new_archive_path} to "
f"{old_archive_path}: file already exists.",
)
for doc in Document.objects.filter(archive_checksum__isnull=False):
new_archive_path = archive_path_new(doc)
old_archive_path = archive_path_old(doc)
if new_archive_path != old_archive_path:
logger.debug(f"Moving {new_archive_path} to {old_archive_path}")
shutil.move(new_archive_path, old_archive_path)
class Migration(migrations.Migration):
dependencies = [
("documents", "1011_auto_20210101_2340"),
]
operations = [
migrations.AddField(
model_name="document",
name="archive_filename",
field=models.FilePathField(
default=None,
editable=False,
help_text="Current archive filename in storage",
max_length=1024,
null=True,
unique=True,
verbose_name="archive filename",
),
),
migrations.AlterField(
model_name="document",
name="filename",
field=models.FilePathField(
default=None,
editable=False,
help_text="Current filename in storage",
max_length=1024,
null=True,
unique=True,
verbose_name="filename",
),
),
migrations.RunPython(move_old_to_new_locations, move_new_to_old_locations),
]

View File

@@ -1,74 +0,0 @@
# Generated by Django 3.1.4 on 2020-12-02 21:43
from django.db import migrations
from django.db import models
COLOURS_OLD = {
1: "#a6cee3",
2: "#1f78b4",
3: "#b2df8a",
4: "#33a02c",
5: "#fb9a99",
6: "#e31a1c",
7: "#fdbf6f",
8: "#ff7f00",
9: "#cab2d6",
10: "#6a3d9a",
11: "#b15928",
12: "#000000",
13: "#cccccc",
}
def forward(apps, schema_editor):
Tag = apps.get_model("documents", "Tag")
for tag in Tag.objects.all():
colour_old_id = tag.colour_old
rgb = COLOURS_OLD[colour_old_id]
tag.color = rgb
tag.save()
def reverse(apps, schema_editor):
Tag = apps.get_model("documents", "Tag")
def _get_colour_id(rdb):
for idx, rdbx in COLOURS_OLD.items():
if rdbx == rdb:
return idx
# Return colour 1 if we can't match anything
return 1
for tag in Tag.objects.all():
colour_id = _get_colour_id(tag.color)
tag.colour_old = colour_id
tag.save()
class Migration(migrations.Migration):
dependencies = [
("documents", "1012_fix_archive_files"),
]
operations = [
migrations.RenameField(
model_name="tag",
old_name="colour",
new_name="colour_old",
),
migrations.AddField(
model_name="tag",
name="color",
field=models.CharField(
default="#a6cee3",
max_length=7,
verbose_name="color",
),
),
migrations.RunPython(forward, reverse),
migrations.RemoveField(
model_name="tag",
name="colour_old",
),
]

View File

@@ -1,42 +0,0 @@
# Generated by Django 3.1.7 on 2021-02-28 15:14
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1013_migrate_tag_colour"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,27 +0,0 @@
# Generated by Django 3.1.7 on 2021-04-04 18:28
import logging
from django.db import migrations
logger = logging.getLogger("paperless.migrations")
def remove_null_characters(apps, schema_editor):
Document = apps.get_model("documents", "Document")
for doc in Document.objects.all():
content: str = doc.content
if "\0" in content:
logger.info(f"Removing null characters from document {doc}...")
doc.content = content.replace("\0", " ")
doc.save()
class Migration(migrations.Migration):
dependencies = [
("documents", "1014_auto_20210228_1614"),
]
operations = [
migrations.RunPython(remove_null_characters, migrations.RunPython.noop),
]

View File

@@ -1,54 +0,0 @@
# Generated by Django 3.1.7 on 2021-03-17 12:51
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1015_remove_null_characters"),
]
operations = [
migrations.AlterField(
model_name="savedview",
name="sort_field",
field=models.CharField(
blank=True,
max_length=128,
null=True,
verbose_name="sort field",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,190 +0,0 @@
# Generated by Django 4.2.13 on 2024-06-28 18:09
import django.db.models.deletion
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
replaces = [
("documents", "1016_auto_20210317_1351"),
("documents", "1017_alter_savedviewfilterrule_rule_type"),
("documents", "1018_alter_savedviewfilterrule_value"),
("documents", "1019_uisettings"),
("documents", "1019_storagepath_document_storage_path"),
("documents", "1020_merge_20220518_1839"),
]
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1015_remove_null_characters"),
]
operations = [
migrations.AlterField(
model_name="savedview",
name="sort_field",
field=models.CharField(
blank=True,
max_length=128,
null=True,
verbose_name="sort field",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
],
verbose_name="rule type",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
],
verbose_name="rule type",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="value",
field=models.CharField(
blank=True,
max_length=255,
null=True,
verbose_name="value",
),
),
migrations.CreateModel(
name="UiSettings",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("settings", models.JSONField(null=True)),
(
"user",
models.OneToOneField(
on_delete=django.db.models.deletion.CASCADE,
related_name="ui_settings",
to=settings.AUTH_USER_MODEL,
),
),
],
),
migrations.CreateModel(
name="StoragePath",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"name",
models.CharField(max_length=128, unique=True, verbose_name="name"),
),
(
"match",
models.CharField(blank=True, max_length=256, verbose_name="match"),
),
(
"matching_algorithm",
models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
(
"is_insensitive",
models.BooleanField(default=True, verbose_name="is insensitive"),
),
("path", models.CharField(max_length=512, verbose_name="path")),
],
options={
"verbose_name": "storage path",
"verbose_name_plural": "storage paths",
"ordering": ("name",),
},
),
migrations.AddField(
model_name="document",
name="storage_path",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.storagepath",
verbose_name="storage path",
),
),
]

View File

@@ -1,45 +0,0 @@
# Generated by Django 3.2.12 on 2022-03-17 11:59
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1016_auto_20210317_1351"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,23 +0,0 @@
# Generated by Django 4.0.3 on 2022-04-01 22:50
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1017_alter_savedviewfilterrule_rule_type"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="value",
field=models.CharField(
blank=True,
max_length=255,
null=True,
verbose_name="value",
),
),
]

View File

@@ -1,73 +0,0 @@
# Generated by Django 4.0.4 on 2022-05-02 15:56
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1018_alter_savedviewfilterrule_value"),
]
operations = [
migrations.CreateModel(
name="StoragePath",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"name",
models.CharField(max_length=128, unique=True, verbose_name="name"),
),
(
"match",
models.CharField(blank=True, max_length=256, verbose_name="match"),
),
(
"matching_algorithm",
models.PositiveIntegerField(
choices=[
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
(
"is_insensitive",
models.BooleanField(default=True, verbose_name="is insensitive"),
),
("path", models.CharField(max_length=512, verbose_name="path")),
],
options={
"verbose_name": "storage path",
"verbose_name_plural": "storage paths",
"ordering": ("name",),
},
),
migrations.AddField(
model_name="document",
name="storage_path",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="documents",
to="documents.storagepath",
verbose_name="storage path",
),
),
]

View File

@@ -1,39 +0,0 @@
# Generated by Django 4.0.4 on 2022-05-07 05:10
import django.db.models.deletion
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1018_alter_savedviewfilterrule_value"),
]
operations = [
migrations.CreateModel(
name="UiSettings",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("settings", models.JSONField(null=True)),
(
"user",
models.OneToOneField(
on_delete=django.db.models.deletion.CASCADE,
related_name="ui_settings",
to=settings.AUTH_USER_MODEL,
),
),
],
),
]

View File

@@ -1,12 +0,0 @@
# Generated by Django 4.0.4 on 2022-05-18 18:39
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "1019_storagepath_document_storage_path"),
("documents", "1019_uisettings"),
]
operations = []

View File

@@ -1,104 +0,0 @@
# Generated by Django 4.0.5 on 2022-06-11 15:40
import logging
import multiprocessing.pool
import shutil
import tempfile
import time
from pathlib import Path
from django.conf import settings
from django.db import migrations
from documents.parsers import run_convert
logger = logging.getLogger("paperless.migrations")
def _do_convert(work_package):
existing_thumbnail, converted_thumbnail = work_package
try:
logger.info(f"Converting thumbnail: {existing_thumbnail}")
# Run actual conversion
run_convert(
density=300,
scale="500x5000>",
alpha="remove",
strip=True,
trim=False,
auto_orient=True,
input_file=f"{existing_thumbnail}[0]",
output_file=str(converted_thumbnail),
)
# Copy newly created thumbnail to thumbnail directory
shutil.copy(converted_thumbnail, existing_thumbnail.parent)
# Remove the PNG version
existing_thumbnail.unlink()
logger.info(
"Conversion to WebP completed, "
f"replaced {existing_thumbnail.name} with {converted_thumbnail.name}",
)
except Exception as e:
logger.error(f"Error converting thumbnail (existing file unchanged): {e}")
def _convert_thumbnails_to_webp(apps, schema_editor):
start = time.time()
with tempfile.TemporaryDirectory() as tempdir:
work_packages = []
for file in Path(settings.THUMBNAIL_DIR).glob("*.png"):
existing_thumbnail = file.resolve()
# Change the existing filename suffix from png to webp
converted_thumbnail_name = existing_thumbnail.with_suffix(
".webp",
).name
# Create the expected output filename in the tempdir
converted_thumbnail = (
Path(tempdir) / Path(converted_thumbnail_name)
).resolve()
# Package up the necessary info
work_packages.append(
(existing_thumbnail, converted_thumbnail),
)
if work_packages:
logger.info(
"\n\n"
" This is a one-time only migration to convert thumbnails for all of your\n"
" documents into WebP format. If you have a lot of documents though, \n"
" this may take a while, so a coffee break may be in order."
"\n",
)
with multiprocessing.pool.Pool(
processes=min(multiprocessing.cpu_count(), 4),
maxtasksperchild=4,
) as pool:
pool.map(_do_convert, work_packages)
end = time.time()
duration = end - start
logger.info(f"Conversion completed in {duration:.3f}s")
class Migration(migrations.Migration):
dependencies = [
("documents", "1020_merge_20220518_1839"),
]
operations = [
migrations.RunPython(
code=_convert_thumbnails_to_webp,
reverse_code=migrations.RunPython.noop,
),
]

View File

@@ -1,52 +0,0 @@
# Generated by Django 4.0.4 on 2022-05-23 07:14
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1021_webp_thumbnail_conversion"),
]
operations = [
migrations.CreateModel(
name="PaperlessTask",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("task_id", models.CharField(max_length=128)),
("name", models.CharField(max_length=256, null=True)),
(
"created",
models.DateTimeField(auto_now=True, verbose_name="created"),
),
(
"started",
models.DateTimeField(null=True, verbose_name="started"),
),
("acknowledged", models.BooleanField(default=False)),
(
"attempted_task",
models.OneToOneField(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="attempted_task",
# This is a dummy field, 1026 will fix up the column
# This manual change is required, as django doesn't django doesn't really support
# removing an app which has migration deps like this
to="documents.document",
),
),
],
),
]

View File

@@ -1,668 +0,0 @@
# Generated by Django 4.2.13 on 2024-06-28 18:10
import django.core.validators
import django.db.models.deletion
import django.utils.timezone
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
replaces = [
("documents", "1022_paperlesstask"),
("documents", "1023_add_comments"),
("documents", "1024_document_original_filename"),
("documents", "1025_alter_savedviewfilterrule_rule_type"),
("documents", "1026_transition_to_celery"),
("documents", "1027_remove_paperlesstask_attempted_task_and_more"),
("documents", "1028_remove_paperlesstask_task_args_and_more"),
("documents", "1029_alter_document_archive_serial_number"),
("documents", "1030_alter_paperlesstask_task_file_name"),
("documents", "1031_remove_savedview_user_correspondent_owner_and_more"),
("documents", "1032_alter_correspondent_matching_algorithm_and_more"),
("documents", "1033_alter_documenttype_options_alter_tag_options_and_more"),
("documents", "1034_alter_savedviewfilterrule_rule_type"),
("documents", "1035_rename_comment_note"),
("documents", "1036_alter_savedviewfilterrule_rule_type"),
]
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("django_celery_results", "0011_taskresult_periodic_task_name"),
("documents", "1021_webp_thumbnail_conversion"),
]
operations = [
migrations.CreateModel(
name="Comment",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"comment",
models.TextField(
blank=True,
help_text="Comment for the document",
verbose_name="content",
),
),
(
"created",
models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
verbose_name="created",
),
),
(
"document",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="documents",
to="documents.document",
verbose_name="document",
),
),
(
"user",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="users",
to=settings.AUTH_USER_MODEL,
verbose_name="user",
),
),
],
options={
"verbose_name": "comment",
"verbose_name_plural": "comments",
"ordering": ("created",),
},
),
migrations.AddField(
model_name="document",
name="original_filename",
field=models.CharField(
default=None,
editable=False,
help_text="The original name of the file when it was uploaded",
max_length=1024,
null=True,
verbose_name="original filename",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
(23, "ASN greater than"),
(24, "ASN less than"),
(25, "storage path is"),
],
verbose_name="rule type",
),
),
migrations.CreateModel(
name="PaperlessTask",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
("task_id", models.CharField(max_length=128)),
("acknowledged", models.BooleanField(default=False)),
(
"attempted_task",
models.OneToOneField(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="attempted_task",
to="django_celery_results.taskresult",
),
),
],
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_q_ormq",
reverse_sql="",
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_q_schedule",
reverse_sql="",
),
migrations.RunSQL(
sql="DROP TABLE IF EXISTS django_q_task",
reverse_sql="",
),
migrations.RemoveField(
model_name="paperlesstask",
name="attempted_task",
),
migrations.AddField(
model_name="paperlesstask",
name="date_created",
field=models.DateTimeField(
default=django.utils.timezone.now,
help_text="Datetime field when the task result was created in UTC",
null=True,
verbose_name="Created DateTime",
),
),
migrations.AddField(
model_name="paperlesstask",
name="date_done",
field=models.DateTimeField(
default=None,
help_text="Datetime field when the task was completed in UTC",
null=True,
verbose_name="Completed DateTime",
),
),
migrations.AddField(
model_name="paperlesstask",
name="date_started",
field=models.DateTimeField(
default=None,
help_text="Datetime field when the task was started in UTC",
null=True,
verbose_name="Started DateTime",
),
),
migrations.AddField(
model_name="paperlesstask",
name="result",
field=models.TextField(
default=None,
help_text="The data returned by the task",
null=True,
verbose_name="Result Data",
),
),
migrations.AddField(
model_name="paperlesstask",
name="status",
field=models.CharField(
choices=[
("FAILURE", "FAILURE"),
("PENDING", "PENDING"),
("RECEIVED", "RECEIVED"),
("RETRY", "RETRY"),
("REVOKED", "REVOKED"),
("STARTED", "STARTED"),
("SUCCESS", "SUCCESS"),
],
default="PENDING",
help_text="Current state of the task being run",
max_length=30,
verbose_name="Task State",
),
),
migrations.AddField(
model_name="paperlesstask",
name="task_name",
field=models.CharField(
help_text="Name of the Task which was run",
max_length=255,
null=True,
verbose_name="Task Name",
),
),
migrations.AlterField(
model_name="paperlesstask",
name="acknowledged",
field=models.BooleanField(
default=False,
help_text="If the task is acknowledged via the frontend or API",
verbose_name="Acknowledged",
),
),
migrations.AlterField(
model_name="paperlesstask",
name="task_id",
field=models.CharField(
help_text="Celery ID for the Task that was run",
max_length=255,
unique=True,
verbose_name="Task ID",
),
),
migrations.AlterField(
model_name="document",
name="archive_serial_number",
field=models.PositiveIntegerField(
blank=True,
db_index=True,
help_text="The position of this document in your physical document archive.",
null=True,
unique=True,
validators=[
django.core.validators.MaxValueValidator(4294967295),
django.core.validators.MinValueValidator(0),
],
verbose_name="archive serial number",
),
),
migrations.AddField(
model_name="paperlesstask",
name="task_file_name",
field=models.CharField(
help_text="Name of the file which the Task was run for",
max_length=255,
null=True,
verbose_name="Task Filename",
),
),
migrations.RenameField(
model_name="savedview",
old_name="user",
new_name="owner",
),
migrations.AlterField(
model_name="savedview",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="correspondent",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="document",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="documenttype",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="storagepath",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="tag",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="documenttype",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="storagepath",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterModelOptions(
name="documenttype",
options={
"ordering": ("name",),
"verbose_name": "document type",
"verbose_name_plural": "document types",
},
),
migrations.AlterModelOptions(
name="tag",
options={
"ordering": ("name",),
"verbose_name": "tag",
"verbose_name_plural": "tags",
},
),
migrations.AlterField(
model_name="correspondent",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AlterField(
model_name="documenttype",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AlterField(
model_name="storagepath",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AlterField(
model_name="tag",
name="name",
field=models.CharField(max_length=128, verbose_name="name"),
),
migrations.AddConstraint(
model_name="correspondent",
constraint=models.UniqueConstraint(
fields=("name", "owner"),
name="documents_correspondent_unique_name_owner",
),
),
migrations.AddConstraint(
model_name="correspondent",
constraint=models.UniqueConstraint(
condition=models.Q(("owner__isnull", True)),
fields=("name",),
name="documents_correspondent_name_uniq",
),
),
migrations.AddConstraint(
model_name="documenttype",
constraint=models.UniqueConstraint(
fields=("name", "owner"),
name="documents_documenttype_unique_name_owner",
),
),
migrations.AddConstraint(
model_name="documenttype",
constraint=models.UniqueConstraint(
condition=models.Q(("owner__isnull", True)),
fields=("name",),
name="documents_documenttype_name_uniq",
),
),
migrations.AddConstraint(
model_name="storagepath",
constraint=models.UniqueConstraint(
fields=("name", "owner"),
name="documents_storagepath_unique_name_owner",
),
),
migrations.AddConstraint(
model_name="storagepath",
constraint=models.UniqueConstraint(
condition=models.Q(("owner__isnull", True)),
fields=("name",),
name="documents_storagepath_name_uniq",
),
),
migrations.AddConstraint(
model_name="tag",
constraint=models.UniqueConstraint(
fields=("name", "owner"),
name="documents_tag_unique_name_owner",
),
),
migrations.AddConstraint(
model_name="tag",
constraint=models.UniqueConstraint(
condition=models.Q(("owner__isnull", True)),
fields=("name",),
name="documents_tag_name_uniq",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
(23, "ASN greater than"),
(24, "ASN less than"),
(25, "storage path is"),
(26, "has correspondent in"),
(27, "does not have correspondent in"),
(28, "has document type in"),
(29, "does not have document type in"),
(30, "has storage path in"),
(31, "does not have storage path in"),
],
verbose_name="rule type",
),
),
migrations.RenameModel(
old_name="Comment",
new_name="Note",
),
migrations.RenameField(
model_name="note",
old_name="comment",
new_name="note",
),
migrations.AlterModelOptions(
name="note",
options={
"ordering": ("created",),
"verbose_name": "note",
"verbose_name_plural": "notes",
},
),
migrations.AlterField(
model_name="note",
name="document",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="notes",
to="documents.document",
verbose_name="document",
),
),
migrations.AlterField(
model_name="note",
name="note",
field=models.TextField(
blank=True,
help_text="Note for the document",
verbose_name="content",
),
),
migrations.AlterField(
model_name="note",
name="user",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="notes",
to=settings.AUTH_USER_MODEL,
verbose_name="user",
),
),
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
(23, "ASN greater than"),
(24, "ASN less than"),
(25, "storage path is"),
(26, "has correspondent in"),
(27, "does not have correspondent in"),
(28, "has document type in"),
(29, "does not have document type in"),
(30, "has storage path in"),
(31, "does not have storage path in"),
(32, "owner is"),
(33, "has owner in"),
(34, "does not have owner"),
(35, "does not have owner in"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,70 +0,0 @@
import django.utils.timezone
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1022_paperlesstask"),
]
operations = [
migrations.CreateModel(
name="Comment",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"comment",
models.TextField(
blank=True,
help_text="Comment for the document",
verbose_name="content",
),
),
(
"created",
models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
verbose_name="created",
),
),
(
"document",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="documents",
to="documents.document",
verbose_name="document",
),
),
(
"user",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="users",
to=settings.AUTH_USER_MODEL,
verbose_name="user",
),
),
],
options={
"verbose_name": "comment",
"verbose_name_plural": "comments",
"ordering": ("created",),
},
),
]

View File

@@ -1,25 +0,0 @@
# Generated by Django 4.0.6 on 2022-07-25 06:34
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1023_add_comments"),
]
operations = [
migrations.AddField(
model_name="document",
name="original_filename",
field=models.CharField(
default=None,
editable=False,
help_text="The original name of the file when it was uploaded",
max_length=1024,
null=True,
verbose_name="original filename",
),
),
]

View File

@@ -1,48 +0,0 @@
# Generated by Django 4.0.5 on 2022-08-26 16:49
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1024_document_original_filename"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
(23, "ASN greater than"),
(24, "ASN less than"),
(25, "storage path is"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,60 +0,0 @@
# Generated by Django 4.1.1 on 2022-09-27 19:31
import django.db.models.deletion
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("django_celery_results", "0011_taskresult_periodic_task_name"),
("documents", "1025_alter_savedviewfilterrule_rule_type"),
]
operations = [
migrations.RemoveField(
model_name="paperlesstask",
name="created",
),
migrations.RemoveField(
model_name="paperlesstask",
name="name",
),
migrations.RemoveField(
model_name="paperlesstask",
name="started",
),
# Remove the field from the model
migrations.RemoveField(
model_name="paperlesstask",
name="attempted_task",
),
# Add the field back, pointing to the correct model
# This resolves a problem where the temporary change in 1022
# results in a type mismatch
migrations.AddField(
model_name="paperlesstask",
name="attempted_task",
field=models.OneToOneField(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="attempted_task",
to="django_celery_results.taskresult",
),
),
# Drop the django-q tables entirely
# Must be done last or there could be references here
migrations.RunSQL(
"DROP TABLE IF EXISTS django_q_ormq",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
"DROP TABLE IF EXISTS django_q_schedule",
reverse_sql=migrations.RunSQL.noop,
),
migrations.RunSQL(
"DROP TABLE IF EXISTS django_q_task",
reverse_sql=migrations.RunSQL.noop,
),
]

View File

@@ -1,134 +0,0 @@
# Generated by Django 4.1.2 on 2022-10-17 16:31
import django.utils.timezone
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1026_transition_to_celery"),
]
operations = [
migrations.RemoveField(
model_name="paperlesstask",
name="attempted_task",
),
migrations.AddField(
model_name="paperlesstask",
name="date_created",
field=models.DateTimeField(
default=django.utils.timezone.now,
help_text="Datetime field when the task result was created in UTC",
null=True,
verbose_name="Created DateTime",
),
),
migrations.AddField(
model_name="paperlesstask",
name="date_done",
field=models.DateTimeField(
default=None,
help_text="Datetime field when the task was completed in UTC",
null=True,
verbose_name="Completed DateTime",
),
),
migrations.AddField(
model_name="paperlesstask",
name="date_started",
field=models.DateTimeField(
default=None,
help_text="Datetime field when the task was started in UTC",
null=True,
verbose_name="Started DateTime",
),
),
migrations.AddField(
model_name="paperlesstask",
name="result",
field=models.TextField(
default=None,
help_text="The data returned by the task",
null=True,
verbose_name="Result Data",
),
),
migrations.AddField(
model_name="paperlesstask",
name="status",
field=models.CharField(
choices=[
("FAILURE", "FAILURE"),
("PENDING", "PENDING"),
("RECEIVED", "RECEIVED"),
("RETRY", "RETRY"),
("REVOKED", "REVOKED"),
("STARTED", "STARTED"),
("SUCCESS", "SUCCESS"),
],
default="PENDING",
help_text="Current state of the task being run",
max_length=30,
verbose_name="Task State",
),
),
migrations.AddField(
model_name="paperlesstask",
name="task_args",
field=models.JSONField(
help_text="JSON representation of the positional arguments used with the task",
null=True,
verbose_name="Task Positional Arguments",
),
),
migrations.AddField(
model_name="paperlesstask",
name="task_file_name",
field=models.CharField(
help_text="Name of the file which the Task was run for",
max_length=255,
null=True,
verbose_name="Task Name",
),
),
migrations.AddField(
model_name="paperlesstask",
name="task_kwargs",
field=models.JSONField(
help_text="JSON representation of the named arguments used with the task",
null=True,
verbose_name="Task Named Arguments",
),
),
migrations.AddField(
model_name="paperlesstask",
name="task_name",
field=models.CharField(
help_text="Name of the Task which was run",
max_length=255,
null=True,
verbose_name="Task Name",
),
),
migrations.AlterField(
model_name="paperlesstask",
name="acknowledged",
field=models.BooleanField(
default=False,
help_text="If the task is acknowledged via the frontend or API",
verbose_name="Acknowledged",
),
),
migrations.AlterField(
model_name="paperlesstask",
name="task_id",
field=models.CharField(
help_text="Celery ID for the Task that was run",
max_length=255,
unique=True,
verbose_name="Task ID",
),
),
]

View File

@@ -1,20 +0,0 @@
# Generated by Django 4.1.3 on 2022-11-22 17:50
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("documents", "1027_remove_paperlesstask_attempted_task_and_more"),
]
operations = [
migrations.RemoveField(
model_name="paperlesstask",
name="task_args",
),
migrations.RemoveField(
model_name="paperlesstask",
name="task_kwargs",
),
]

View File

@@ -1,30 +0,0 @@
# Generated by Django 4.1.4 on 2023-01-24 17:56
import django.core.validators
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1028_remove_paperlesstask_task_args_and_more"),
]
operations = [
migrations.AlterField(
model_name="document",
name="archive_serial_number",
field=models.PositiveIntegerField(
blank=True,
db_index=True,
help_text="The position of this document in your physical document archive.",
null=True,
unique=True,
validators=[
django.core.validators.MaxValueValidator(4294967295),
django.core.validators.MinValueValidator(0),
],
verbose_name="archive serial number",
),
),
]

View File

@@ -1,23 +0,0 @@
# Generated by Django 4.1.5 on 2023-02-03 21:53
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1029_alter_document_archive_serial_number"),
]
operations = [
migrations.AlterField(
model_name="paperlesstask",
name="task_file_name",
field=models.CharField(
help_text="Name of the file which the Task was run for",
max_length=255,
null=True,
verbose_name="Task Filename",
),
),
]

View File

@@ -1,87 +0,0 @@
# Generated by Django 4.1.4 on 2022-02-03 04:24
import django.db.models.deletion
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1030_alter_paperlesstask_task_file_name"),
]
operations = [
migrations.RenameField(
model_name="savedview",
old_name="user",
new_name="owner",
),
migrations.AlterField(
model_name="savedview",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="correspondent",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="document",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="documenttype",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="storagepath",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
migrations.AddField(
model_name="tag",
name="owner",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
]

View File

@@ -1,81 +0,0 @@
# Generated by Django 4.1.7 on 2023-02-22 00:45
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1031_remove_savedview_user_correspondent_owner_and_more"),
]
operations = [
migrations.AlterField(
model_name="correspondent",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="documenttype",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="storagepath",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
migrations.AlterField(
model_name="tag",
name="matching_algorithm",
field=models.PositiveIntegerField(
choices=[
(0, "None"),
(1, "Any word"),
(2, "All words"),
(3, "Exact match"),
(4, "Regular expression"),
(5, "Fuzzy word"),
(6, "Automatic"),
],
default=1,
verbose_name="matching algorithm",
),
),
]

View File

@@ -1,54 +0,0 @@
# Generated by Django 4.1.5 on 2023-03-15 07:10
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1033_alter_documenttype_options_alter_tag_options_and_more"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
(23, "ASN greater than"),
(24, "ASN less than"),
(25, "storage path is"),
(26, "has correspondent in"),
(27, "does not have correspondent in"),
(28, "has document type in"),
(29, "does not have document type in"),
(30, "has storage path in"),
(31, "does not have storage path in"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,62 +0,0 @@
# Generated by Django 4.1.5 on 2023-03-17 22:15
import django.db.models.deletion
from django.conf import settings
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1034_alter_savedviewfilterrule_rule_type"),
]
operations = [
migrations.RenameModel(
old_name="Comment",
new_name="Note",
),
migrations.RenameField(model_name="note", old_name="comment", new_name="note"),
migrations.AlterModelOptions(
name="note",
options={
"ordering": ("created",),
"verbose_name": "note",
"verbose_name_plural": "notes",
},
),
migrations.AlterField(
model_name="note",
name="document",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="notes",
to="documents.document",
verbose_name="document",
),
),
migrations.AlterField(
model_name="note",
name="note",
field=models.TextField(
blank=True,
help_text="Note for the document",
verbose_name="content",
),
),
migrations.AlterField(
model_name="note",
name="user",
field=models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="notes",
to=settings.AUTH_USER_MODEL,
verbose_name="user",
),
),
]

View File

@@ -1,58 +0,0 @@
# Generated by Django 4.1.7 on 2023-05-04 04:11
from django.db import migrations
from django.db import models
class Migration(migrations.Migration):
dependencies = [
("documents", "1035_rename_comment_note"),
]
operations = [
migrations.AlterField(
model_name="savedviewfilterrule",
name="rule_type",
field=models.PositiveIntegerField(
choices=[
(0, "title contains"),
(1, "content contains"),
(2, "ASN is"),
(3, "correspondent is"),
(4, "document type is"),
(5, "is in inbox"),
(6, "has tag"),
(7, "has any tag"),
(8, "created before"),
(9, "created after"),
(10, "created year is"),
(11, "created month is"),
(12, "created day is"),
(13, "added before"),
(14, "added after"),
(15, "modified before"),
(16, "modified after"),
(17, "does not have tag"),
(18, "does not have ASN"),
(19, "title or content contains"),
(20, "fulltext query"),
(21, "more like this"),
(22, "has tags in"),
(23, "ASN greater than"),
(24, "ASN less than"),
(25, "storage path is"),
(26, "has correspondent in"),
(27, "does not have correspondent in"),
(28, "has document type in"),
(29, "does not have document type in"),
(30, "has storage path in"),
(31, "does not have storage path in"),
(32, "owner is"),
(33, "has owner in"),
(34, "does not have owner"),
(35, "does not have owner in"),
],
verbose_name="rule type",
),
),
]

View File

@@ -1,164 +0,0 @@
# Generated by Django 4.1.9 on 2023-06-29 19:29
import logging
import multiprocessing.pool
import shutil
import tempfile
import time
from pathlib import Path
import gnupg
from django.conf import settings
from django.db import migrations
from documents.parsers import run_convert
logger = logging.getLogger("paperless.migrations")
def _do_convert(work_package) -> None:
(
existing_encrypted_thumbnail,
converted_encrypted_thumbnail,
passphrase,
) = work_package
try:
gpg = gnupg.GPG(gnupghome=settings.GNUPG_HOME)
logger.info(f"Decrypting thumbnail: {existing_encrypted_thumbnail}")
# Decrypt png
decrypted_thumbnail = existing_encrypted_thumbnail.with_suffix("").resolve()
with existing_encrypted_thumbnail.open("rb") as existing_encrypted_file:
raw_thumb = gpg.decrypt_file(
existing_encrypted_file,
passphrase=passphrase,
always_trust=True,
).data
with Path(decrypted_thumbnail).open("wb") as decrypted_file:
decrypted_file.write(raw_thumb)
converted_decrypted_thumbnail = Path(
str(converted_encrypted_thumbnail).replace("webp.gpg", "webp"),
).resolve()
logger.info(f"Converting decrypted thumbnail: {decrypted_thumbnail}")
# Convert to webp
run_convert(
density=300,
scale="500x5000>",
alpha="remove",
strip=True,
trim=False,
auto_orient=True,
input_file=f"{decrypted_thumbnail}[0]",
output_file=str(converted_decrypted_thumbnail),
)
logger.info(
f"Encrypting converted thumbnail: {converted_decrypted_thumbnail}",
)
# Encrypt webp
with Path(converted_decrypted_thumbnail).open("rb") as converted_decrypted_file:
encrypted = gpg.encrypt_file(
fileobj_or_path=converted_decrypted_file,
recipients=None,
passphrase=passphrase,
symmetric=True,
always_trust=True,
).data
with Path(converted_encrypted_thumbnail).open(
"wb",
) as converted_encrypted_file:
converted_encrypted_file.write(encrypted)
# Copy newly created thumbnail to thumbnail directory
shutil.copy(converted_encrypted_thumbnail, existing_encrypted_thumbnail.parent)
# Remove the existing encrypted PNG version
existing_encrypted_thumbnail.unlink()
# Remove the decrypted PNG version
decrypted_thumbnail.unlink()
# Remove the decrypted WebP version
converted_decrypted_thumbnail.unlink()
logger.info(
"Conversion to WebP completed, "
f"replaced {existing_encrypted_thumbnail.name} with {converted_encrypted_thumbnail.name}",
)
except Exception as e:
logger.error(f"Error converting thumbnail (existing file unchanged): {e}")
def _convert_encrypted_thumbnails_to_webp(apps, schema_editor) -> None:
start: float = time.time()
with tempfile.TemporaryDirectory() as tempdir:
work_packages = []
if len(list(Path(settings.THUMBNAIL_DIR).glob("*.png.gpg"))) > 0:
passphrase = settings.PASSPHRASE
if not passphrase:
raise Exception(
"Passphrase not defined, encrypted thumbnails cannot be migrated"
"without this",
)
for file in Path(settings.THUMBNAIL_DIR).glob("*.png.gpg"):
existing_thumbnail: Path = file.resolve()
# Change the existing filename suffix from png to webp
converted_thumbnail_name: str = Path(
str(existing_thumbnail).replace(".png.gpg", ".webp.gpg"),
).name
# Create the expected output filename in the tempdir
converted_thumbnail: Path = (
Path(tempdir) / Path(converted_thumbnail_name)
).resolve()
# Package up the necessary info
work_packages.append(
(existing_thumbnail, converted_thumbnail, passphrase),
)
if work_packages:
logger.info(
"\n\n"
" This is a one-time only migration to convert thumbnails for all of your\n"
" *encrypted* documents into WebP format. If you have a lot of encrypted documents, \n"
" this may take a while, so a coffee break may be in order."
"\n",
)
with multiprocessing.pool.Pool(
processes=min(multiprocessing.cpu_count(), 4),
maxtasksperchild=4,
) as pool:
pool.map(_do_convert, work_packages)
end: float = time.time()
duration: float = end - start
logger.info(f"Conversion completed in {duration:.3f}s")
class Migration(migrations.Migration):
dependencies = [
("documents", "1036_alter_savedviewfilterrule_rule_type"),
]
operations = [
migrations.RunPython(
code=_convert_encrypted_thumbnails_to_webp,
reverse_code=migrations.RunPython.noop,
),
]

View File

@@ -1,126 +0,0 @@
# Generated by Django 4.1.10 on 2023-08-14 14:51
import django.db.models.deletion
import django.utils.timezone
from django.conf import settings
from django.contrib.auth.management import create_permissions
from django.contrib.auth.models import Group
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.db import migrations
from django.db import models
from django.db.models import Q
def add_sharelink_permissions(apps, schema_editor):
# create permissions without waiting for post_migrate signal
for app_config in apps.get_app_configs():
app_config.models_module = True
create_permissions(app_config, apps=apps, verbosity=0)
app_config.models_module = None
add_permission = Permission.objects.get(codename="add_document")
sharelink_permissions = Permission.objects.filter(codename__contains="sharelink")
for user in User.objects.filter(Q(user_permissions=add_permission)).distinct():
user.user_permissions.add(*sharelink_permissions)
for group in Group.objects.filter(Q(permissions=add_permission)).distinct():
group.permissions.add(*sharelink_permissions)
def remove_sharelink_permissions(apps, schema_editor):
sharelink_permissions = Permission.objects.filter(codename__contains="sharelink")
for user in User.objects.all():
user.user_permissions.remove(*sharelink_permissions)
for group in Group.objects.all():
group.permissions.remove(*sharelink_permissions)
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("documents", "1037_webp_encrypted_thumbnail_conversion"),
]
operations = [
migrations.CreateModel(
name="ShareLink",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"created",
models.DateTimeField(
blank=True,
db_index=True,
default=django.utils.timezone.now,
editable=False,
verbose_name="created",
),
),
(
"expiration",
models.DateTimeField(
blank=True,
db_index=True,
null=True,
verbose_name="expiration",
),
),
(
"slug",
models.SlugField(
blank=True,
editable=False,
unique=True,
verbose_name="slug",
),
),
(
"file_version",
models.CharField(
choices=[("archive", "Archive"), ("original", "Original")],
default="archive",
max_length=50,
),
),
(
"document",
models.ForeignKey(
blank=True,
on_delete=django.db.models.deletion.CASCADE,
related_name="share_links",
to="documents.document",
verbose_name="document",
),
),
(
"owner",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="share_links",
to=settings.AUTH_USER_MODEL,
verbose_name="owner",
),
),
],
options={
"verbose_name": "share link",
"verbose_name_plural": "share links",
"ordering": ("created",),
},
),
migrations.RunPython(add_sharelink_permissions, remove_sharelink_permissions),
]

View File

@@ -1,219 +0,0 @@
# Generated by Django 4.1.11 on 2023-09-16 18:04
import django.db.models.deletion
import multiselectfield.db.fields
from django.conf import settings
from django.contrib.auth.management import create_permissions
from django.contrib.auth.models import Group
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.db import migrations
from django.db import models
from django.db.models import Q
def add_consumptiontemplate_permissions(apps, schema_editor):
# create permissions without waiting for post_migrate signal
for app_config in apps.get_app_configs():
app_config.models_module = True
create_permissions(app_config, apps=apps, verbosity=0)
app_config.models_module = None
add_permission = Permission.objects.get(codename="add_document")
consumptiontemplate_permissions = Permission.objects.filter(
codename__contains="consumptiontemplate",
)
for user in User.objects.filter(Q(user_permissions=add_permission)).distinct():
user.user_permissions.add(*consumptiontemplate_permissions)
for group in Group.objects.filter(Q(permissions=add_permission)).distinct():
group.permissions.add(*consumptiontemplate_permissions)
def remove_consumptiontemplate_permissions(apps, schema_editor):
consumptiontemplate_permissions = Permission.objects.filter(
codename__contains="consumptiontemplate",
)
for user in User.objects.all():
user.user_permissions.remove(*consumptiontemplate_permissions)
for group in Group.objects.all():
group.permissions.remove(*consumptiontemplate_permissions)
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
("auth", "0012_alter_user_first_name_max_length"),
("documents", "1038_sharelink"),
("paperless_mail", "0021_alter_mailaccount_password"),
]
operations = [
migrations.CreateModel(
name="ConsumptionTemplate",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"name",
models.CharField(max_length=256, unique=True, verbose_name="name"),
),
("order", models.IntegerField(default=0, verbose_name="order")),
(
"sources",
multiselectfield.db.fields.MultiSelectField(
choices=[
(1, "Consume Folder"),
(2, "Api Upload"),
(3, "Mail Fetch"),
],
default="1,2,3",
max_length=3,
),
),
(
"filter_path",
models.CharField(
blank=True,
help_text="Only consume documents with a path that matches this if specified. Wildcards specified as * are allowed. Case insensitive.",
max_length=256,
null=True,
verbose_name="filter path",
),
),
(
"filter_filename",
models.CharField(
blank=True,
help_text="Only consume documents which entirely match this filename if specified. Wildcards such as *.pdf or *invoice* are allowed. Case insensitive.",
max_length=256,
null=True,
verbose_name="filter filename",
),
),
(
"filter_mailrule",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to="paperless_mail.mailrule",
verbose_name="filter documents from this mail rule",
),
),
(
"assign_change_groups",
models.ManyToManyField(
blank=True,
related_name="+",
to="auth.group",
verbose_name="grant change permissions to these groups",
),
),
(
"assign_change_users",
models.ManyToManyField(
blank=True,
related_name="+",
to=settings.AUTH_USER_MODEL,
verbose_name="grant change permissions to these users",
),
),
(
"assign_correspondent",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to="documents.correspondent",
verbose_name="assign this correspondent",
),
),
(
"assign_document_type",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to="documents.documenttype",
verbose_name="assign this document type",
),
),
(
"assign_owner",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
related_name="+",
to=settings.AUTH_USER_MODEL,
verbose_name="assign this owner",
),
),
(
"assign_storage_path",
models.ForeignKey(
blank=True,
null=True,
on_delete=django.db.models.deletion.SET_NULL,
to="documents.storagepath",
verbose_name="assign this storage path",
),
),
(
"assign_tags",
models.ManyToManyField(
blank=True,
to="documents.tag",
verbose_name="assign this tag",
),
),
(
"assign_title",
models.CharField(
blank=True,
help_text="Assign a document title, can include some placeholders, see documentation.",
max_length=256,
null=True,
verbose_name="assign title",
),
),
(
"assign_view_groups",
models.ManyToManyField(
blank=True,
related_name="+",
to="auth.group",
verbose_name="grant view permissions to these groups",
),
),
(
"assign_view_users",
models.ManyToManyField(
blank=True,
related_name="+",
to=settings.AUTH_USER_MODEL,
verbose_name="grant view permissions to these users",
),
),
],
options={
"verbose_name": "consumption template",
"verbose_name_plural": "consumption templates",
},
),
migrations.RunPython(
add_consumptiontemplate_permissions,
remove_consumptiontemplate_permissions,
),
]

View File

@@ -1,171 +0,0 @@
# Generated by Django 4.2.6 on 2023-11-02 17:38
import django.db.models.deletion
import django.utils.timezone
from django.contrib.auth.management import create_permissions
from django.contrib.auth.models import Group
from django.contrib.auth.models import Permission
from django.contrib.auth.models import User
from django.db import migrations
from django.db import models
from django.db.models import Q
def add_customfield_permissions(apps, schema_editor):
# create permissions without waiting for post_migrate signal
for app_config in apps.get_app_configs():
app_config.models_module = True
create_permissions(app_config, apps=apps, verbosity=0)
app_config.models_module = None
add_permission = Permission.objects.get(codename="add_document")
customfield_permissions = Permission.objects.filter(
codename__contains="customfield",
)
for user in User.objects.filter(Q(user_permissions=add_permission)).distinct():
user.user_permissions.add(*customfield_permissions)
for group in Group.objects.filter(Q(permissions=add_permission)).distinct():
group.permissions.add(*customfield_permissions)
def remove_customfield_permissions(apps, schema_editor):
customfield_permissions = Permission.objects.filter(
codename__contains="customfield",
)
for user in User.objects.all():
user.user_permissions.remove(*customfield_permissions)
for group in Group.objects.all():
group.permissions.remove(*customfield_permissions)
class Migration(migrations.Migration):
dependencies = [
("documents", "1039_consumptiontemplate"),
]
operations = [
migrations.CreateModel(
name="CustomField",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"created",
models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
editable=False,
verbose_name="created",
),
),
("name", models.CharField(max_length=128)),
(
"data_type",
models.CharField(
choices=[
("string", "String"),
("url", "URL"),
("date", "Date"),
("boolean", "Boolean"),
("integer", "Integer"),
("float", "Float"),
("monetary", "Monetary"),
],
editable=False,
max_length=50,
verbose_name="data type",
),
),
],
options={
"verbose_name": "custom field",
"verbose_name_plural": "custom fields",
"ordering": ("created",),
},
),
migrations.CreateModel(
name="CustomFieldInstance",
fields=[
(
"id",
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name="ID",
),
),
(
"created",
models.DateTimeField(
db_index=True,
default=django.utils.timezone.now,
editable=False,
verbose_name="created",
),
),
("value_text", models.CharField(max_length=128, null=True)),
("value_bool", models.BooleanField(null=True)),
("value_url", models.URLField(null=True)),
("value_date", models.DateField(null=True)),
("value_int", models.IntegerField(null=True)),
("value_float", models.FloatField(null=True)),
(
"value_monetary",
models.DecimalField(decimal_places=2, max_digits=12, null=True),
),
(
"document",
models.ForeignKey(
editable=False,
on_delete=django.db.models.deletion.CASCADE,
related_name="custom_fields",
to="documents.document",
),
),
(
"field",
models.ForeignKey(
editable=False,
on_delete=django.db.models.deletion.CASCADE,
related_name="fields",
to="documents.customfield",
),
),
],
options={
"verbose_name": "custom field instance",
"verbose_name_plural": "custom field instances",
"ordering": ("created",),
},
),
migrations.AddConstraint(
model_name="customfield",
constraint=models.UniqueConstraint(
fields=("name",),
name="documents_customfield_unique_name",
),
),
migrations.AddConstraint(
model_name="customfieldinstance",
constraint=models.UniqueConstraint(
fields=("document", "field"),
name="documents_customfieldinstance_unique_document_field",
),
),
migrations.RunPython(
add_customfield_permissions,
remove_customfield_permissions,
),
]

Some files were not shown because too many files have changed in this diff Show More