diff --git a/docs/api.rst b/docs/api.rst
index c2120b20f..3a9d244c5 100644
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -147,93 +147,57 @@ The REST api provides three different forms of authentication.
 Searching for documents
 #######################
 
-Paperless-ng offers API endpoints for full text search. These are as follows:
+Full text searching is available on the ``/api/documents/`` endpoint. Two specific
+query parameters cause the API to return full text search results:
 
-``/api/search/``
-================
+* ``/api/documents/?query=your%20search%20query``: Search for a document using a full text query.
+  For details on the syntax, see :ref:`basic-usage_searching`.
 
-Get search results based on a query.
+* ``/api/documents/?more_like=1234``: Search for documents similar to the document with id 1234.
 
-Query parameters:
+Pagination works exactly the same as it does for normal requests on this endpoint.
 
-* ``query``: The query string. See
-  `here `_
-  for details on the syntax.
-* ``page``: Specify the page you want to retrieve. Each page
-  contains 10 search results and the first page is ``page=1``, which
-  is the default if this is omitted.
+Certain limitations apply to full text queries:
 
-Result list object returned by the endpoint:
+* Results are always sorted by search score. The results matching the query best will show up first.
 
-.. code:: json
+* Only a small subset of filtering parameters are supported.
+
+Furthermore, each returned document has an additional ``__search_hit__`` attribute with various information
+about the search results:
+
+.. code::
 
     {
-        "count": 1,
-        "page": 1,
-        "page_count": 1,
-        "corrected_query": "",
+        "count": 31,
+        "next": "http://localhost:8000/api/documents/?page=2&query=test",
+        "previous": null,
         "results": [
 
+            ...
+
+            {
+                "id": 123,
+                "title": "title",
+                "content": "content",
+
+                ...
+
+                "__search_hit__": {
+                    "score": 0.343,
+                    "highlights": "text Test text",
+                    "rank": 23
+                }
+            },
+
+            ...
+
         ]
     }
 
-* ``count``: The approximate total number of results.
-* ``page``: The page returned to you. This might be different from
-  the page you requested, if you requested a page that is behind
-  the last page. In that case, the last page is returned.
-* ``page_count``: The total number of pages.
-* ``corrected_query``: Corrected version of the query string. Can be null.
-  If not null, can be used verbatim to start a new query.
-* ``results``: A list of result objects on the current page.
-
-Result object:
-
-.. code:: json
-
-    {
-        "id": 1,
-        "highlights": [
-
-        ],
-        "score": 6.34234,
-        "rank": 23,
-        "document": {
-
-        }
-    }
-
-* ``id``: the primary key of the found document
-* ``highlights``: an object containing parsable highlights for the result.
-  See below.
-* ``score``: The score assigned to the document. A higher score indicates a
-  better match with the query. Search results are sorted descending by score.
-* ``rank``: the position of the document within the entire search results list.
-* ``document``: The full json of the document, as returned by
-  ``/api/documents//``.
-
-Highlights object:
-
-Highlights are provided as a list of fragments. A fragment is a longer section of
-text from the original document.
-Each fragment contains a list of strings, and some of them are marked as a highlight.
-
-.. code:: json
-
-    [
-        [
-            {"text": "This is a sample text with a ", "highlight": false},
-            {"text": "highlighted", "highlight": true},
-            {"text": " word.", "highlight": false}
-        ],
-        [
-            {"text": "Another", "highlight": true},
-            {"text": " fragment with a highlight.", "highlight": false}
-        ]
-    ]
-
-A client may use this example to produce the following output:
-
-... This is a sample text with a **highlighted** word. ... **Another** fragment with a highlight. ...
+* ``score`` is an indication how well this document matches the query relative to the other search results.
+* ``highlights`` is an excerpt from the document content and highlights the search terms with ``<span>`` tags as shown above.
+* ``rank`` is the index of the search results. The first result will have rank 0.
 
 ``/api/search/autocomplete/``
 =============================
diff --git a/docs/usage_overview.rst b/docs/usage_overview.rst
index 2c7093b99..7283db02f 100644
--- a/docs/usage_overview.rst
+++ b/docs/usage_overview.rst
@@ -255,6 +255,8 @@ Here are a couple examples of tags and types that you could use in your collecti
 * A tag ``missing_metadata`` when you still need to add some metadata to a document, but can't
   or don't want to do this right now.
 
+.. _basic-usage_searching:
+
 Searching
 #########
 