.. _extending:

Paperless development
#####################

This section describes the steps you need to take to start development on paperless-ng.

1. Check out the source from GitHub. The repository is organized in the following way:

   * ``master`` always represents the latest release and will only see changes
     when a new release is made.
   * ``dev`` contains the code that will be in the next release.
   * ``feature-X`` branches contain bigger changes that will be in some release, but not
     necessarily the next one.

   Apart from that, the folder structure is as follows:

   * ``docs/`` - Documentation.
   * ``src-ui/`` - Code of the front end.
   * ``src/`` - Code of the back end.
   * ``scripts/`` - Various scripts that help with different parts of development.
   * ``docker/`` - Files required to build the docker image.

2. Install some dependencies.

   * Python 3.6.
   * All dependencies listed in the :ref:`Bare metal route <setup-bare_metal>`.
   * redis. You can either install redis or use the included ``scripts/start-redis.sh``
     to use docker to fire up a redis instance.

Back end development
====================

The back end is a Django application. I use PyCharm for development, but you can use whatever
you want.

Install the Python dependencies by running ``pipenv install --dev`` in the ``src/`` directory.
This will also create a virtual environment, which you can enter with ``pipenv shell`` or
execute one-shot commands in with ``pipenv run``.

In ``src/paperless.conf``, enable debug mode.

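
As a quick sanity check, the configuration file can be read back with a few lines of
Python. This is only a sketch: it assumes that ``paperless.conf`` consists of plain
``KEY=value`` lines with ``#`` comments and that the debug switch is called
``PAPERLESS_DEBUG``. The helper names are made up for this example.

.. code:: python

    def read_conf(text):
        """Parse KEY=value lines, skipping blank lines and # comments."""
        settings = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
        return settings


    def debug_enabled(settings):
        """Return True if the (assumed) PAPERLESS_DEBUG key is switched on."""
        value = settings.get("PAPERLESS_DEBUG", "false")
        return value.lower() in ("1", "true", "yes")


    def overridden_settings(settings):
        """List every key other than PAPERLESS_DEBUG that the file sets."""
        return sorted(key for key in settings if key != "PAPERLESS_DEBUG")

Passing the contents of ``src/paperless.conf`` to ``read_conf()`` and then calling
``debug_enabled()`` tells you whether debug mode is on. ``overridden_settings()`` shows
what else the file changes.
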
Configure the IDE to use the ``src/`` folder as the base source folder. Configure the following
launch configurations in your IDE:

* ``python3 manage.py runserver``
* ``python3 manage.py qcluster``
* ``python3 manage.py consumer``

Depending on which part of paperless you're developing for, you need to have some or all of
them running.

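
If you are not using an IDE, the same three processes could be started from a small
helper script. The following is a sketch and not part of the repository:
``launch_commands`` and ``start_all`` are hypothetical names, and the script assumes it is
run from the ``src/`` directory inside the pipenv environment.

.. code:: python

    import subprocess
    import sys

    # The three management commands listed above.
    COMMANDS = ("runserver", "qcluster", "consumer")


    def launch_commands(selected=COMMANDS, python=sys.executable):
        """Build one argument list per requested management command."""
        unknown = set(selected) - set(COMMANDS)
        if unknown:
            raise ValueError(f"unknown command(s): {sorted(unknown)}")
        return [[python, "manage.py", name] for name in selected]


    def start_all(selected=COMMANDS, cwd="."):
        """Start each command as a child process and return the handles."""
        return [subprocess.Popen(cmd, cwd=cwd) for cmd in launch_commands(selected)]

Calling ``start_all(["runserver", "consumer"])`` would start only the web server and the
consumer, which matches the advice to run just the parts you are working on.
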
Testing and code style:

* Run ``pytest`` in the ``src/`` directory to execute all tests. This also generates an HTML coverage
  report. When running tests, ``paperless.conf`` is loaded as well. However, the tests rely on the default
  configuration. This is not ideal, but for now, make sure no settings except for DEBUG are overridden when testing.
* Run ``pycodestyle`` to test your code for issues with the configured code style settings.

  .. note::

     The line length rule E501 is generally useful for getting multiple source files
     next to each other on the screen. However, in some cases, it's just not possible
     to make some lines fit, especially complicated ``if`` conditions. Append ``# NOQA: E501``
     to disable this check for certain lines.

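
For illustration, a line that is awkward to wrap might be marked like this. The function
and its parameters are invented for the example and do not exist in paperless.

.. code:: python

    def should_consume(path, is_temporary, is_hidden, size, max_size):
        """Decide whether a (hypothetical) file is worth consuming."""
        # The condition below is deliberately longer than 79 characters.
        if path.endswith((".pdf", ".jpg", ".tiff")) and not is_temporary and not is_hidden and 0 < size <= max_size:  # NOQA: E501
            return True
        return False

The marker only silences the check on the line that carries it, so the rest of the file
is still held to the configured limit.
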
Front end development
=====================

The front end is built using Angular. I use the ``Code - OSS`` IDE for development.

In order to get started, you need ``npm``. Install the Angular CLI with

.. code:: shell-session

    $ npm install -g @angular/cli

and make sure that it's on your path. Next, in the ``src-ui/`` directory, install the
required dependencies of the project.

.. code:: shell-session

    $ npm install

You can launch a development server by running

.. code:: shell-session

    $ ng serve

This will automatically update whenever you save. However, in-place compilation might fail
on syntax errors, in which case you need to restart it.

By default, the development server is available on ``http://localhost:4200/`` and is configured
to access the API at ``http://localhost:8000/api/``, which is the default of the back end.
If you enabled DEBUG on the back end, several security overrides for allowed hosts, CORS and
X-Frame-Options are in place so that the front end behaves exactly as in production. This also
relies on you being logged into the back end. Without a valid session, the front end will simply
not work.

In order to build the front end and serve it as part of Django, execute

.. code:: shell-session

    $ ng build --prod --output-path ../src/documents/static/frontend/

This will build the front end and put it in a location from which the Django server will serve
it as static content. This way, you can verify that authentication is working.

Making a release
================

Execute the ``make-release.sh <ver>`` script.

This will test and assemble everything and also build and tag a docker image.

Extending Paperless
===================

.. warning::

    This section is not updated to paperless-ng yet.

For the most part, Paperless is monolithic, so extending it is often best
managed by way of modifying the code directly and issuing a pull request on
`GitHub`_. However, over time the project has been evolving to be a little
more "pluggable" so that users can write their own stuff that talks to it.

.. _GitHub: https://github.com/the-paperless-project/paperless

.. _extending-parsers:

Parsers
-------

You can leverage Paperless' consumption model to have it consume files *other*
than ones handled by default like ``.pdf``, ``.jpg``, and ``.tiff``. To do so,
you simply follow Django's convention of creating a new app, with a few key
requirements.

.. _extending-parsers-parserspy:

parsers.py
..........

In this file, you create a class that extends
``documents.parsers.DocumentParser`` and go about implementing the three
required methods:

* ``get_thumbnail()``: Returns the path to a file we can use as a thumbnail for
  this document.
* ``get_text()``: Returns the text from the document and only the text.
* ``get_date()``: If possible, this returns the date of the document, otherwise
  it should return ``None``.

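
To make the shape of such a class concrete, here is a minimal sketch of a parser for
plain-text files. So that it runs on its own, it uses a stand-in ``DocumentParser`` base
class instead of importing ``documents.parsers.DocumentParser``. The constructor argument,
the ``TextDocumentParser`` name, the placeholder thumbnail and the date-detection rule are
all assumptions made for the example, not the real API.

.. code:: python

    import datetime
    import os
    import re
    import tempfile
    from pathlib import Path


    class DocumentParser:
        """Stand-in for documents.parsers.DocumentParser (assumed interface)."""

        def __init__(self, path):
            self.document_path = path

        def get_thumbnail(self):
            raise NotImplementedError

        def get_text(self):
            raise NotImplementedError

        def get_date(self):
            raise NotImplementedError


    class TextDocumentParser(DocumentParser):
        """Hypothetical parser for .txt files."""

        DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

        def get_thumbnail(self):
            # A real parser would render an image. This sketch only creates
            # an empty placeholder file and returns its path.
            handle, name = tempfile.mkstemp(suffix=".png")
            os.close(handle)
            return name

        def get_text(self):
            return Path(self.document_path).read_text(encoding="utf-8")

        def get_date(self):
            # Use the first ISO-formatted date in the text, or None if absent.
            match = self.DATE_PATTERN.search(self.get_text())
            if match is None:
                return None
            try:
                return datetime.date(*map(int, match.groups()))
            except ValueError:
                return None

Note that ``get_date()`` falls back to ``None`` both when no date is found and when the
matched digits do not form a valid calendar date.
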
.. _extending-parsers-signalspy:

signals.py
..........

At consumption time, Paperless emits a ``document_consumer_declaration``
signal which your module has to react to in order to let the consumer know
whether or not it's capable of handling a particular file. Think of it like
this:

1. Consumer finds a file in the consumption directory.
2. It asks all the available parsers: *"Hey, can you handle this file?"*
3. Each parser responds with either ``None`` meaning they can't handle the
   file, or a dictionary in the following format:

.. code:: python

    {
        "parser": <the class name>,
        "weight": <an integer>
    }

The consumer compares the ``weight`` values from all respondents and uses the
class with the highest value to consume the document. The default parser,
``RasterisedDocumentParser``, has a weight of ``0``.

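
The exchange described in this section can be sketched in plain Python. The handler
signature and the ``file_name`` keyword argument are assumptions for illustration, and
the two parser classes are empty placeholders. Only the ``None``-or-dictionary reply and
the highest-weight rule come from the description in this section.

.. code:: python

    class TextDocumentParser:
        """Placeholder parser class, used only as a dictionary value."""


    class RasterisedDocumentParser:
        """Placeholder for the default parser, which has weight 0."""


    def text_declaration(sender, **kwargs):
        """Hypothetical handler: claim .txt files, decline everything else."""
        if kwargs.get("file_name", "").lower().endswith(".txt"):
            return {"parser": TextDocumentParser, "weight": 10}
        return None


    def default_declaration(sender, **kwargs):
        """Stand-in for the default parser's reply."""
        return {"parser": RasterisedDocumentParser, "weight": 0}


    def pick_parser(replies):
        """Return the parser class with the highest weight, if any replied."""
        candidates = [reply for reply in replies if reply is not None]
        if not candidates:
            return None
        return max(candidates, key=lambda reply: reply["weight"])["parser"]

Because the default parser answers with weight ``0``, any positive weight is enough for
a custom parser to take precedence for the files it claims.
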
.. _extending-parsers-appspy:

apps.py
.......

This is a standard Django file, but you'll need to add some code to it to
connect your parser to the ``document_consumer_declaration`` signal.

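
The wiring can be illustrated without a Django project. The sketch below uses a tiny
stand-in ``MiniSignal`` that imitates only the ``connect()`` and ``send()`` methods of
``django.dispatch.Signal``, and a plain class in place of ``django.apps.AppConfig``. In a
real app you would presumably subclass ``AppConfig`` and connect to the project's own
``document_consumer_declaration`` signal inside ``ready()``. All names other than
``document_consumer_declaration`` are hypothetical.

.. code:: python

    class MiniSignal:
        """Stand-in imitating connect()/send() of django.dispatch.Signal."""

        def __init__(self):
            self._receivers = []

        def connect(self, receiver):
            # Connecting the same receiver twice has no additional effect.
            if receiver not in self._receivers:
                self._receivers.append(receiver)

        def send(self, sender, **named):
            # Like Django, return a list of (receiver, response) pairs.
            return [(r, r(sender, **named)) for r in self._receivers]


    # Stand-in for the signal that Paperless emits at consumption time.
    document_consumer_declaration = MiniSignal()


    class MyParser:
        """Placeholder for the parser class defined in my_module/parsers.py."""


    def consumer_declaration(sender, **kwargs):
        """Hypothetical handler that would live in my_module/signals.py."""
        return {"parser": MyParser, "weight": 5}


    class MyModuleConfig:
        """Plays the role of an AppConfig subclass in my_module/apps.py."""

        name = "my_module"

        def ready(self):
            # Django calls ready() once the app registry is loaded, which
            # makes it the usual place to connect signal receivers.
            document_consumer_declaration.connect(consumer_declaration)

Connecting inside ``ready()`` rather than at import time keeps the registration tied to
the app actually being installed.
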
.. _extending-parsers-finally:

Finally
.......

The last step is to update ``settings.py`` to include your new module.
Eventually, this will be dynamic, but at the moment, you have to edit the
``INSTALLED_APPS`` section manually. Simply add the path to your AppConfig to
the list like this:

.. code:: python

    INSTALLED_APPS = [
        ...
        "my_module.apps.MyModuleConfig",
        ...
    ]

Order doesn't matter, but generally it's a good idea to place your module lower
in the list so that you don't end up accidentally overriding project defaults
somewhere.

.. _extending-parsers-example:

An Example
..........

The core Paperless functionality is based on this design, so if you want to see
what a parser module should look like, have a look at `parsers.py`_,
`signals.py`_, and `apps.py`_ in the `paperless_tesseract`_ module.

.. _parsers.py: https://github.com/the-paperless-project/paperless/blob/master/src/paperless_tesseract/parsers.py
.. _signals.py: https://github.com/the-paperless-project/paperless/blob/master/src/paperless_tesseract/signals.py
.. _apps.py: https://github.com/the-paperless-project/paperless/blob/master/src/paperless_tesseract/apps.py
.. _paperless_tesseract: https://github.com/the-paperless-project/paperless/blob/master/src/paperless_tesseract/