

Paperless-ngx

Paperless-ngx is a document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.

Paperless-ngx forked from paperless-ng to continue the great work and distribute responsibility of supporting and advancing the project among a team of people. Consider joining us! Discussion of this transition can be found in issues #1599 and #1632.

A demo is available at demo.paperless-ngx.com using login demo / demo. Note: demo content is reset frequently and confidential information should not be uploaded.

Features

  • Organize and index your scanned documents with tags, correspondents, types, and more.
  • Performs OCR on your documents, adds selectable text to image-only documents, and adds tags, correspondents, and document types to them.
  • Supports PDF documents, images, plain text files, and Office documents (Word, Excel, PowerPoint, and LibreOffice equivalents).
    • Office document support is optional and provided by Apache Tika (see configuration)
  • Paperless stores your documents as plain files on disk. Filenames and folders are managed by Paperless, and their format can be configured freely.
  • Single page application front end.
    • Includes a dashboard that shows basic statistics and has document upload.
    • Filtering by tags, correspondents, types, and more.
    • Customizable views can be saved and displayed on the dashboard.
  • Full text search helps you find what you need.
    • Auto completion suggests relevant words from your documents.
    • Results are sorted by relevance to your search query.
    • Highlighting shows you which parts of the document matched the query.
    • Searching for similar documents ("More like this")
  • Email processing: Paperless adds documents from your email accounts.
    • Configure multiple accounts and filters for each account.
    • When adding documents from mail, Paperless can move the messages to a new folder, mark them as read, flag them as important, or delete them.
  • Machine learning powered document matching.
    • Paperless-ngx learns from your documents and will be able to automatically assign tags, correspondents and types to documents once you've stored a few documents in paperless.
  • Optimized for multi-core systems: Paperless-ngx consumes multiple documents in parallel.
  • The integrated sanity checker makes sure that your document archive is in good health.
  • More screenshots are available in the documentation.
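
As a small illustration of the configurable filename format mentioned in the feature list, the line below sketches one possible setting. It is an assumption drawn from the Paperless-ngx configuration documentation rather than from this README: the variable name PAPERLESS_FILENAME_FORMAT and the placeholders shown should be checked against the documentation for your version before use.

```shell
# Hypothetical example setting, e.g. for the environment file used by docker-compose.
# Assumed layout: one folder per year, then per correspondent, then the document title.
PAPERLESS_FILENAME_FORMAT='{created_year}/{correspondent}/{title}'
```

With a setting along these lines, documents would be grouped on disk by year and correspondent instead of being stored under generated names.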

Getting started

The easiest way to deploy Paperless is with docker-compose. The files in the /docker/compose directory are configured to pull the image from GitHub Packages.

If you'd like to jump right in, you can configure a docker-compose environment with our install script:

bash -c "$(curl -L https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh)"
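
If you would rather read the install script before running it, a minimal sketch is to download it to a file first. The URL is the same one used in the command above; the helper name fetch_installer and the local file name are illustrative choices, not part of the project.

```shell
# Same script URL as in the one-line install command.
INSTALL_URL="https://raw.githubusercontent.com/paperless-ngx/paperless-ngx/main/install-paperless-ngx.sh"

# Hypothetical helper: save the installer to a local file instead of
# handing it straight to bash.
# -f fails on HTTP errors, -L follows redirects, -o writes to the named file.
fetch_installer() {
    curl -fL -o install-paperless-ngx.sh "$INSTALL_URL"
}

# Typical use (left commented out so nothing runs by accident):
#   fetch_installer
#   less install-paperless-ngx.sh    # review what the script will do
#   bash install-paperless-ngx.sh
```

This takes one more step than the one-liner, but it lets you see what the script changes on your system before it runs.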

Alternatively, you can install the dependencies and set up Apache and a database server yourself. The documentation has a step-by-step guide on how to do it.

Migrating from Paperless-ng is easy: just drop in the new Docker image! See the documentation on migrating for more details.

Documentation

The documentation for Paperless-ngx is available at https://docs.paperless-ngx.com.

Contributing

If you feel like contributing to the project, please do! Bug fixes, enhancements, visual fixes, etc. are always welcome. If you want to implement something big, please start a discussion about it first. The documentation has some basic information on how to get started.

Community Support

People interested in continuing the work on Paperless-ngx are encouraged to reach out here on GitHub and in the Matrix Room. If you would like to contribute to the project on an ongoing basis, there are multiple teams (frontend, CI/CD, etc.) that could use your help, so please reach out!

Translation

Paperless-ngx is available in many languages, and translations are coordinated on Crowdin. If you want to help out by translating Paperless-ngx into your language, please head over to https://crwd.in/paperless-ngx, and thank you! More details can be found in CONTRIBUTING.md.

Feature Requests

Feature requests can be submitted via GitHub Discussions, where you can search for existing ideas, add your own, and vote for the ones you care about.

Bugs

For bugs, please open an issue. If you have questions, start a discussion.

Affiliated Projects

Paperless has been around a while now, and people are starting to build stuff on top of it. If you're one of those people, we can add your project to this list:

  • Paperless App: An Android/iOS app for Paperless-ngx. Also works with the original Paperless and Paperless-ng.
  • Paperless Share: Share any files from your Android application with Paperless. Very simple, but works with all of the mobile scanning apps out there that allow you to share scanned documents.
  • Scan to Paperless: Scan and prepare (crop, deskew, OCR, ...) your documents for Paperless.
  • Paperless Mobile: A modern, feature rich mobile application for Paperless.

Important Note

Document scanners are typically used to scan sensitive documents, such as those containing your social insurance number, tax records, and invoices. Everything is stored in the clear, without encryption, so Paperless should never be run on an untrusted host. Instead, I recommend that if you do want to use it, you run it locally on a server in your own home.
