mirror of https://github.com/paperless-ngx/paperless-ngx.git, synced 2025-10-26 03:36:08 -05:00

	Merge branch 'dev' into feature-ocrmypdf
		| @@ -1,13 +1,26 @@ | ||||
| # Contributing | ||||
|  | ||||
| If you feel that something is not working, please submit an issue. You can also ask questions on the issue tracker by tagging your question with the question tag. | ||||
| There are still lots of things to be done, just have a look at the issue log. If you feel like contributing to the project, please do! Bug fixes and improvements to the front end (I just can't seem to get some of these CSS things right) are always welcome. | ||||
|  | ||||
| Pull requests are welcome, however, I will be a little bit more strict about what goes into the code and what does not. If you want to make a big change, please ask me about it first. | ||||
| If you want to implement something big: Please start a discussion about that in the issues! Maybe I've already had something similar in mind and we can make it happen together. However, keep in mind that the general roadmap is to make the existing features stable and get them tested. See the roadmap in the readme. | ||||
|  | ||||
| * When making additions to the project, consider if the majority of users will benefit from your change. If not, you're probably better off forking the project. | ||||
| * Also consider if your change will get in the way of other users. A good change is a change that enhances the experience of some users who want that change and does not affect users who do not care about the change. | ||||
|  | ||||
| However: | ||||
| ## Python | ||||
|  | ||||
| * Bug fixes are always welcome. Docker makes things easier, however, I alone cannot ensure that this runs on all platforms. | ||||
| * Improvements to the styling of the front-end are always welcome. I'm no expert in things UX, and simply copied one of the Bootstrap examples. I think it turned out rather good, but I just can't seem to get some things working properly. | ||||
| Use Python 3.6 for development. Paperless supports Python 3.6, 3.7 and 3.8. | ||||
|  | ||||
| ## Branches | ||||
|  | ||||
| master always reflects the latest release. | ||||
|  | ||||
| dev contains all changes that will be part of the next release. Use this branch to start making your changes. | ||||
|  | ||||
| feature-X branches are for experimental stuff that will eventually be merged into dev, and then released as part of the next release. | ||||
|  | ||||
| ## Testing: | ||||
|  | ||||
| I'm trying to get most of paperless tested, so please do the same for your code! I know it's a hassle, but it makes sure that your code works now and will allow us to detect regressions easily. | ||||
|  | ||||
| To test your code, execute `pytest` in the src/ directory. Executing it in the project root will not work. This also generates an HTML coverage report, which you can use to see if you missed anything important during testing. | ||||
|   | ||||
							
								
								
									
										34
									
								
								README.md
									
									
									
									
									
								
							
							
						
						
									
										34
									
								
								README.md
									
									
									
									
									
								
							| @@ -35,43 +35,61 @@ The gist of the changes is the following: | ||||
| * New full text search. | ||||
| * New email processing. | ||||
| * Machine learning powered document matching. | ||||
| * Code cleanup in many, MANY areas. | ||||
| * A task processor that processes documents in parallel and also tells you when something goes wrong. | ||||
| * Code cleanup in many, MANY areas. Some of the code was just overly complicated. | ||||
| * More tests, more stability. | ||||
|  | ||||
| If you want to see some screenshots of paperless-ng in action, [some are available in the documentation](https://paperless-ng.readthedocs.io/en/latest/screenshots.html). | ||||
|  | ||||
| For a complete list of changes, check out the [changelog](https://paperless-ng.readthedocs.io/en/latest/changelog.html) | ||||
|  | ||||
| ## Planned | ||||
| # Roadmap for 1.0 | ||||
|  | ||||
| These features will make it into the application at some point, sorted by priority. | ||||
| - Test coverage at 90%. | ||||
| - Store archived documents with an embedded OCR text layer, while keeping originals available. Making good progress in the `feature-ocrmypdf` branch. | ||||
| - Fix whatever bugs you and I find. | ||||
|  | ||||
| ## Roadmap for versions beyond 1.0 | ||||
|  | ||||
| - **More search.** The search backend is incredibly versatile and customizable. Searching is the most important feature of this project and thus, I want to implement things like: | ||||
|   - Group and limit search results by correspondent, show “more from this” links in the results. | ||||
|   - Ability to search for “Similar documents” in the search results | ||||
|   - Provide corrections for misspelled queries | ||||
| - **More robust consumer** that shows its progress on the web page. | ||||
| - **An interactive consumer** that shows its progress for documents it processes on the web page. | ||||
| 	- With live updates and websockets. This already works on a dev branch, but requires a lot of new dependencies, which I'm not particularly happy about. | ||||
| 	- Notifications when a document was added with buttons to open the new document right away. | ||||
| - **Arbitrary tag colors**. Allow the selection of any color with a color picker. | ||||
|  | ||||
| ## On the chopping block. | ||||
|  | ||||
| - **GnuPG encryption.** Since it's disabled by default and the website allows transparent access to encrypted documents anyway, this doesn’t really provide any benefit over having the application stored on an encrypted file system. | ||||
| - **GnuPG encryption.** [Here's a note about encryption in paperless](https://paperless-ng.readthedocs.io/en/latest/administration.html#managing-encryption). The gist of it is that I don't see which attacks this implementation protects against. It gives a false sense of security to users who don't care about how it works. | ||||
|  | ||||
| # Getting started | ||||
|  | ||||
| The recommended way to deploy paperless is docker-compose. Grab the latest release to get started. The dockerfiles archive contains just the docker files which will pull the image from Docker Hub. The source archive contains everything you need to build the docker image yourself. | ||||
| The recommended way to deploy paperless is docker-compose. Don't clone the repository; grab the latest release to get started instead. The dockerfiles archive contains just the docker files which will pull the image from Docker Hub. The source archive contains everything you need to build the docker image yourself (e.g. if you want to run on Raspberry Pi). | ||||
|  | ||||
| Read the [documentation](https://paperless-ng.readthedocs.io/en/latest/setup.html#installation) on how to get started. | ||||
|  | ||||
| Alternatively, you can install the dependencies and set up Apache and a database server yourself. Details for that will be available in the documentation at some point. | ||||
| Alternatively, you can install the dependencies and set up Apache and a database server yourself. The documentation has information about the individual components of paperless that you need to take care of. | ||||
|  | ||||
| # Migrating to paperless-ng | ||||
|  | ||||
| Read the section about [migration](https://paperless-ng.readthedocs.io/en/latest/setup.html#migration-to-paperless-ng) in the documentation. | ||||
| Read the section about [migration](https://paperless-ng.readthedocs.io/en/latest/setup.html#migration-to-paperless-ng) in the documentation. It's also entirely possible to go back to paperless by reverting the database migrations. | ||||
|  | ||||
| # Documentation | ||||
|  | ||||
| The documentation for Paperless-ng is available on [ReadTheDocs](https://paperless-ng.readthedocs.io/). | ||||
|  | ||||
| # Suggestions? Questions? Something not working? | ||||
|  | ||||
| Please open an issue and start a discussion about it! | ||||
|  | ||||
| ## Feel like helping out? | ||||
|  | ||||
| There are still lots of things to be done, just have a look at the issue log. If you feel like contributing to the project, please do! Bug fixes and improvements to the front end (I just can't seem to get some of these CSS things right) are always welcome. | ||||
|  | ||||
| If you want to implement something big: Please start a discussion about that in the issues! Maybe I've already had something similar in mind and we can make it happen together. However, keep in mind that the general roadmap is to make the existing features stable and get them tested. See the roadmap above. | ||||
|  | ||||
| # Affiliated Projects | ||||
|  | ||||
| Paperless has been around a while now, and people are starting to build stuff on top of it.  If you're one of those people, we can add your project to this list: | ||||
|   | ||||
| @@ -12,7 +12,9 @@ from documents.sanity_checker import SanityFailedError | ||||
|  | ||||
|  | ||||
| def index_optimize(): | ||||
|     index.open_index().optimize() | ||||
|     ix = index.open_index() | ||||
|     with AsyncWriter(ix) as writer: | ||||
|         writer.commit(optimize=True) | ||||
|  | ||||
|  | ||||
| def index_reindex(): | ||||
|   | ||||
| @@ -5,6 +5,7 @@ from unittest import mock | ||||
| from django.contrib.auth.models import User | ||||
| from pathvalidate import ValidationError | ||||
| from rest_framework.test import APITestCase | ||||
| from whoosh.writing import AsyncWriter | ||||
|  | ||||
| from documents import index | ||||
| from documents.models import Document, Correspondent, DocumentType, Tag | ||||
| @@ -173,7 +174,7 @@ class DocumentApiTest(DirectoriesMixin, APITestCase): | ||||
|         d1=Document.objects.create(title="invoice", content="the thing i bought at a shop and paid with bank account", checksum="A", pk=1) | ||||
|         d2=Document.objects.create(title="bank statement 1", content="things i paid for in august", pk=2, checksum="B") | ||||
|         d3=Document.objects.create(title="bank statement 3", content="things i paid for in september", pk=3, checksum="C") | ||||
|         with index.open_index(False).writer() as writer: | ||||
|         with AsyncWriter(index.open_index()) as writer: | ||||
|             # Note to future self: there is a reason we don't use a model signal handler to update the index: some operations edit many documents at once | ||||
|             # (retagger, renamer) and we don't want to open a writer for each of these, but rather perform the entire operation with one writer. | ||||
|             # That's why we can't open the writer in a model on_save handler or something. | ||||
| @@ -209,7 +210,7 @@ class DocumentApiTest(DirectoriesMixin, APITestCase): | ||||
|         self.assertEqual(len(results), 0) | ||||
|  | ||||
|     def test_search_multi_page(self): | ||||
|         with index.open_index(False).writer() as writer: | ||||
|         with AsyncWriter(index.open_index()) as writer: | ||||
|             for i in range(55): | ||||
|                 doc = Document.objects.create(checksum=str(i), pk=i+1, title=f"Document {i+1}", content="content") | ||||
|                 index.update_document(writer, doc) | ||||
| @@ -248,7 +249,7 @@ class DocumentApiTest(DirectoriesMixin, APITestCase): | ||||
|         self.assertEqual(len(results), 5) | ||||
|  | ||||
|     def test_search_invalid_page(self): | ||||
|         with index.open_index(False).writer() as writer: | ||||
|         with AsyncWriter(index.open_index()) as writer: | ||||
|             for i in range(15): | ||||
|                 doc = Document.objects.create(checksum=str(i), pk=i+1, title=f"Document {i+1}", content="content") | ||||
|                 index.update_document(writer, doc) | ||||
|   | ||||
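
The "note to future self" comment in the test above argues for running a whole bulk operation under one writer instead of opening a writer per document. The following is a minimal sketch of that idea using only the standard library. `FakeIndex`, `update_each_separately` and `update_in_one_batch` are hypothetical names made up for illustration; they are not part of paperless or Whoosh, and the real index behaves differently in detail.

```python
from contextlib import contextmanager


class FakeIndex:
    """Hypothetical stand-in for a search index that counts its commits."""

    def __init__(self):
        self.documents = {}
        self.commits = 0

    @contextmanager
    def writer(self):
        pending = {}
        yield pending
        # Reached only when the with-block exits without an exception:
        # apply all pending changes and record a single commit.
        self.documents.update(pending)
        self.commits += 1


def update_each_separately(ix, docs):
    # What a per-document save handler would effectively do:
    # one writer (and therefore one commit) per document.
    for doc_id, title in docs:
        with ix.writer() as writer:
            writer[doc_id] = title


def update_in_one_batch(ix, docs):
    # What the comment recommends for bulk operations:
    # one writer for the entire operation, committed once.
    with ix.writer() as writer:
        for doc_id, title in docs:
            writer[doc_id] = title


docs = [(i + 1, f"Document {i + 1}") for i in range(55)]

per_doc = FakeIndex()
update_each_separately(per_doc, docs)

batched = FakeIndex()
update_in_one_batch(batched, docs)

print(per_doc.commits, batched.commits)
```

Both approaches end with the same indexed documents; they differ only in the number of commits. That difference is the cost a per-document signal handler would add to operations that edit many documents at once.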