Feature: Enhanced backend custom field search API (#7589)

commit 910dae8413028f647e6295f30207cb5d4fc6605d
Author: Yichi Yang <yiy067@ucsd.edu>
Date:   Wed Sep 4 12:47:19 2024 -0700

    Fix: correctly handle the case where custom_field_lookup refers to multiple fields

commit e43f70d708b7d6b445f3ca8c8bf9dbdf5ee26085
Author: Yichi Yang <yiy067@ucsd.edu>
Date:   Sat Aug 31 14:06:45 2024 -0700

Co-Authored-By: Yichi Yang <yichiyan@usc.edu>
This commit is contained in:
shamoon
2024-09-23 11:28:31 -07:00
parent f06ff85b7d
commit d7ba6d98d3
7 changed files with 1270 additions and 38 deletions

@@ -235,12 +235,6 @@ results:
Pagination works exactly the same as it does for normal requests on this
endpoint.
Certain limitations apply to full text queries:
- Results are always sorted by search score. The results matching the
query best will show up first.
- Only a small subset of filtering parameters are supported.
Furthermore, each returned document has an additional `__search_hit__`
attribute with various information about the search results:
@@ -280,6 +274,67 @@ attribute with various information about the search results:
- `rank` is the index of the search result. The first result has
rank 0.

### Filtering by custom fields

You can filter documents by their custom field values by specifying the
`custom_field_lookup` query parameter. Here are some recipes for common
use cases:

1. Documents with a custom field "due" (date) between Aug 1, 2024 and
   Sep 1, 2024 (inclusive):

   `?custom_field_lookup=["due", "range", ["2024-08-01", "2024-09-01"]]`

2. Documents with a custom field "customer" (text) that equals "bob"
   (case sensitive):

   `?custom_field_lookup=["customer", "exact", "bob"]`

3. Documents with a custom field "answered" (boolean) set to `true`:

   `?custom_field_lookup=["answered", "exact", true]`

4. Documents with a custom field "favorite animal" (select) set to either
   "cat" or "dog":

   `?custom_field_lookup=["favorite animal", "in", ["cat", "dog"]]`

5. Documents with a custom field "address" (text) that is empty:

   `?custom_field_lookup=["OR", ["address", "isnull", true], ["address", "exact", ""]]`

6. Documents that don't have a field called "foo":

   `?custom_field_lookup=["foo", "exists", false]`

7. Documents whose document link field "references" links to both
   document 3 and document 7:

   `?custom_field_lookup=["references", "contains", [3, 7]]`

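
The recipes above are JSON values placed in a query string, so they need to be percent-encoded before being sent. The following is a minimal sketch of one way to build such a URL from Python. The base URL and the `/api/documents/` path are assumptions made for illustration (they are not stated in this section), `build_lookup_url` is a hypothetical helper, and authentication is left out entirely.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: placeholder instance URL and endpoint path; adjust to your setup.
BASE_URL = "http://localhost:8000/api/documents/"


def build_lookup_url(expr, base_url=BASE_URL):
    """Return a URL carrying `expr` as JSON in `custom_field_lookup`.

    Hypothetical helper, not part of any official client.
    """
    # Compact separators keep the query string short; urlencode then
    # percent-encodes brackets, quotes, commas and spaces.
    payload = json.dumps(expr, separators=(",", ":"))
    return f"{base_url}?{urlencode({'custom_field_lookup': payload})}"


# Recipe 1: "due" between two dates (inclusive).
due_url = build_lookup_url(["due", "range", ["2024-08-01", "2024-09-01"]])

# Recipe 5: "address" is null or an empty string (Python True becomes JSON true).
empty_address_url = build_lookup_url(
    ["OR", ["address", "isnull", True], ["address", "exact", ""]]
)

# Round trip: decoding the query string yields the original expression.
decoded = json.loads(parse_qs(urlsplit(due_url).query)["custom_field_lookup"][0])
```

Building the expression as a Python list and serializing it with `json.dumps` avoids hand-escaping quotes and ensures booleans are written as `true`/`false` rather than Python's `True`/`False`.
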

All field types support basic operations including `exact`, `in`, `isnull`,
and `exists`. String, URL, and monetary fields support case-insensitive
substring matching operations including `icontains`, `istartswith`, and
`iendswith`. Integer, float, and date fields support arithmetic comparisons
including `gt` (>), `gte` (>=), `lt` (<), `lte` (<=), and `range`.
Lastly, document link fields support a `contains` operator that behaves
like an "is superset of" check.
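
To make the operator semantics described above concrete, here is a small local model that evaluates a lookup expression against a plain dictionary of field values. It is only an illustrative sketch, not the server's implementation: `matches` is a hypothetical function, dates are assumed to be ISO 8601 strings (which compare correctly as text), and treating a missing field as "no match" for every operator except `exists` is an assumption.

```python
def matches(fields, expr):
    """Evaluate a lookup expression against one document's custom fields.

    `fields` maps field name -> value; a missing key means the document
    does not have that field at all. Illustrative model only.
    """
    if expr[0] == "OR":
        # Logical OR over the sub-expressions that follow the keyword.
        return any(matches(fields, sub) for sub in expr[1:])

    name, op, arg = expr
    if op == "exists":
        return (name in fields) == arg
    if name not in fields:
        return False  # assumption: other operators never match a missing field
    value = fields[name]
    if op == "isnull":
        return (value is None) == arg
    if value is None:
        return False
    if op == "exact":
        return value == arg
    if op == "in":
        return value in arg
    if op == "icontains":
        return arg.lower() in value.lower()
    if op == "istartswith":
        return value.lower().startswith(arg.lower())
    if op == "iendswith":
        return value.lower().endswith(arg.lower())
    if op == "gt":
        return value > arg
    if op == "gte":
        return value >= arg
    if op == "lt":
        return value < arg
    if op == "lte":
        return value <= arg
    if op == "range":
        low, high = arg
        return low <= value <= high  # both bounds inclusive
    if op == "contains":
        return set(arg) <= set(value)  # field value is a superset of arg
    raise ValueError(f"unsupported operator: {op}")


doc = {"due": "2024-08-15", "customer": "bob", "address": "", "references": [3, 7, 9]}
in_range = matches(doc, ["due", "range", ["2024-08-01", "2024-09-01"]])
links_both = matches(doc, ["references", "contains", [3, 7]])
```

With the sample `doc`, `["customer", "exact", "Bob"]` does not match because `exact` is case sensitive, while `["customer", "icontains", "BO"]` does.
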

!!! warning

    It is possible to perform case-insensitive exact matches (i.e., `iexact`)
    and case-sensitive substring matches (i.e., `contains`, `startswith`,
    `endswith`) on string, URL, and monetary fields, but
    [they may not work as expected on some database backends](https://docs.djangoproject.com/en/5.1/ref/databases/#substring-matching-and-case-sensitivity).

    It is also possible to use regular expressions to match string, URL, and
    monetary fields, but the syntax is database-dependent, and accepting
    regular expressions from untrusted sources could make your instance
    vulnerable to regular expression denial of service attacks.

    For these reasons, the above operators are disabled by default.
    If you understand the implications, you may enable them by uncommenting
    `PAPERLESS_CUSTOM_FIELD_LOOKUP_OPT_IN` in your configuration file.

### `/api/search/autocomplete/`
Get auto completions for a partial search term.