paperless-ngx

mirror of https://github.com/paperless-ngx/paperless-ngx.git synced 2026-02-22 00:49:35 -06:00

Author	SHA1	Message	Date
Daniel Quinn	5f0962bc3e	Travis integration: take 6	2016-02-21 01:58:09 +00:00
Daniel Quinn	300dc97e83	Travis integration: take 5	2016-02-21 01:53:10 +00:00
Daniel Quinn	e0b2d27e01	Travis integration: take 4	2016-02-21 01:50:04 +00:00
Daniel Quinn	6f7169d2d6	Travis integration: take 3	2016-02-21 01:46:49 +00:00
Daniel Quinn	55a7dc2444	pep8	2016-02-21 01:43:48 +00:00
Daniel Quinn	c7787bc076	Let's see if I can get Travis CI working on the first try	2016-02-21 01:37:57 +00:00
Daniel Quinn	0d46643026	Version bump	2016-02-21 01:24:30 +00:00
Daniel Quinn	17d3a44952	A crude API is in place	2016-02-21 00:55:38 +00:00
Daniel Quinn	809fb8fa1f	Moved the default GNUPG home to /tmp for tox-friendliness	2016-02-21 00:29:59 +00:00
Daniel Quinn	440614eddc	Got tox working	2016-02-21 00:29:21 +00:00
Daniel Quinn	422ae9303a	pep8	2016-02-21 00:14:50 +00:00
Daniel Quinn	a5124cade6	Merge branch 'master' into feature/api	2016-02-20 22:55:42 +00:00
Daniel Quinn	224f4acdc3	Merge branch 'master' of github.com:danielquinn/paperless	2016-02-20 22:50:58 +00:00
Daniel Quinn	51b19f4c19	Issue #57	2016-02-20 22:30:01 +00:00
Daniel Quinn	5c6aa201be	Merge pull request #50 from tikitu/docker-tweaks Some small tweaks to the Docker setup and documentation	2016-02-20 00:06:12 +00:00
Tikitu de Jager	438b161a25	Move `docker-compose.env` to `docker-compose.env.example` & adjust docs This file, like `docker-compose.yml`, should be edited by the user. To avoid merge conflicts when pulling updates, the edited version should not be committed to the repository.	2016-02-19 22:51:49 +02:00
Tikitu de Jager	147f8f72a2	Simplify instructions for exporting with docker The export workflow reusing the `/consume` volume is complex and error-prone, and not at all necessary if the `docker-compose.yml` file has a volume for `/export` from the beginning.	2016-02-19 22:27:48 +02:00
Daniel Quinn	3a8755e4c8	Document the retagger Fixes #54	2016-02-19 17:26:40 +00:00
Daniel Quinn	d9602312b1	Merge pull request #52 from pitkley/fix/detect-orientation-errors Ignore error if orientation detection fails	2016-02-19 09:13:14 +00:00
Pit Kleyersburg	c45f951ca0	Ignore error if orientation detection fails Fixes an additional issue that came up in #48.	2016-02-19 09:52:32 +01:00
Daniel Quinn	ec88ea73f6	#48 : make the tag matching smarter	2016-02-19 00:45:02 +00:00
Daniel Quinn	99be40a433	Merge pull request #39 from pitkley/feature/dockerfile Add Dockerfile for application and documentation	2016-02-18 22:01:54 +00:00
Pit Kleyersburg	724afa59c7	Add Dockerfile for application and documentation This commit adds a `Dockerfile` to the root of the project, accompanied by a `docker-compose.yml.example` for simplified deployment. The `Dockerfile` is agnostic to whether it will be the webserver, the consumer, or if it is run for a one-off command (e.g. creation of a superuser, migration of the database, document export, ...). The container's entrypoint is the `scripts/docker-entrypoint.sh` script. This script verifies that the required permissions are set, remaps the default user's and/or group's ID if required and installs additional languages if the user wishes to. After initialization, it analyzes the command the user supplied: - If the command starts with a slash, it is expected that the user wants to execute a binary file and the command will be executed without further intervention. (Using `exec` to effectively replace the started shell-script and not have any reaping-issues.) - If the command does not start with a slash, the command will be passed directly to the `manage.py` script without further modification. (Again using `exec`.) The default command is set to `--help`. If the user wants to execute a command that is not meant for `manage.py` but doesn't start with a slash, the Docker `--entrypoint` parameter can be used to circumvent the mechanics of `docker-entrypoint.sh`. Further information can be found in `docs/setup.rst` and in `docs/migrating.rst`. For additional convenience, a `Dockerfile` has been added to the `docs/` directory which allows for easy building and serving of the documentation. This is documented in `docs/requirements.rst`.	2016-02-18 22:58:32 +01:00
Daniel Quinn	57bcb883bf	Merge pull request #49 from pitkley/feature/detect-orientation Detect image orientation if the OCR supports it. Fixes #47	2016-02-18 11:36:08 +00:00
Pit Kleyersburg	c34d57a872	Detect image orientation if the OCR supports it Fixes issue #47.	2016-02-18 09:37:13 +01:00
Daniel Quinn	1e7ece81ee	Fixes #45	2016-02-17 23:07:54 +00:00
Daniel Quinn	eb01bcf98b	The Log class needed a __str__() method	2016-02-17 23:06:35 +00:00
Daniel Quinn	1c45ca10d4	Patched sorting	2016-02-17 00:11:57 +00:00
Daniel Quinn	550184cbae	Patched sorting	2016-02-17 00:11:46 +00:00
Daniel Quinn	52f242574f	Merge branch 'pitkley-fix/secure-temporary-files'	2016-02-17 00:10:54 +00:00
Daniel Quinn	6f95b05287	Support appropriate sorting for long documents	2016-02-17 00:10:05 +00:00
Pit Kleyersburg	46f8f492f5	Safely and non-randomly create scratch directory Creating the scratch-files in `_get_grayscale` using a random integer is inherently unsafe and can cause a collision. It should also be unnecessary, given that the files will be cleaned up after the OCR run. Since we don't know if OCR runs might be parallel in the future, this commit implements thread-safe and deterministic directory-creation. Additionally it fixes the call to `_cleanup` by `consume`. In the current implementation `_cleanup` will not be called if the last consumed document failed with an `OCRError`; this commit fixes this.	2016-02-16 12:15:57 +01:00
Daniel Quinn	cebc44f2c9	API is halfway there	2016-02-16 09:28:34 +00:00
Daniel Quinn	bbe7a02b4d	Added a screenshot and cleaned things up a bit.	2016-02-16 09:22:51 +00:00
Daniel Quinn	5de4951a46	Added a screenshot, now I have to figure out how to put it in the readme.	2016-02-16 09:08:35 +00:00
Daniel Quinn	8a5d4b1cc8	Merge branch 'master' of github.com:danielquinn/paperless	2016-02-15 22:38:25 +00:00
Daniel Quinn	2f0da8ab25	Added download_url to the Document model	2016-02-15 22:38:18 +00:00
Daniel Quinn	a256d5ee2f	Merge pull request #37 from jat255/DOCFIX_documentation_badge Make docs badge in readme redirect to documentation, not image	2016-02-15 16:59:30 +00:00
Joshua Taillon	d2757707b3	Make docs badge in readme redirect to documentation, not image	2016-02-15 11:58:07 -05:00
Daniel Quinn	9a437dc9f6	Merge pull request #35 from pitkley/fix/matching-logic Fix matching if user supplied an empty value	2016-02-14 19:21:50 +00:00
Pit Kleyersburg	7b227ffa2f	Fix matching if user supplied an empty value	2016-02-14 19:47:05 +01:00
Daniel Quinn	aea4af5d3b	Version bump and feature update	2016-02-14 17:18:28 +00:00
Daniel Quinn	a0f4f6c5f2	Fixed merge conflict and did some pep8	2016-02-14 17:13:48 +00:00
Daniel Quinn	4689e2b975	Merge pull request #32 from pitkley/feature/single-page-langdetect Detect language only on first page of PDF	2016-02-14 16:56:30 +00:00
Pit Kleyersburg	aeab9a0e81	Detect language only on one page of PDF To detect the language currently the entire document gets processed. If a different language has been detected than the default one, the entire document will be processed again for the new language. This PR analyzes the middle page for its language and either processes the remaining pages with the default language if it didn't differ, or processes all pages for the new guessed language. The amount of processed pages comes down from the worst case `2n` to worst case `n+1`.	2016-02-14 17:55:13 +01:00
Daniel Quinn	7843ea5037	Added and implemented a rudimentary logger	2016-02-14 16:09:52 +00:00
Daniel Quinn	9162e41507	Merge pull request #33 from pitkley/fix/parallelism Ensure `OCR_THREADS` is integer, add documentation	2016-02-14 15:40:20 +00:00
Pit Kleyersburg	20b2408dbb	Ensure `OCR_THREADS` is integer, add documentation	2016-02-14 16:37:38 +01:00
Daniel Quinn	88acf50fe0	Merge pull request #31 from pitkley/feature/paralellism This is great. It seriously sped up the OCR time.	2016-02-14 15:29:05 +00:00
Pit Kleyersburg	f5beda9c56	Enable parallel OCR processing At the moment, every page in a PDF will be processed one by one using tesseract. Since the processing of a single page is independent from every other page, one can make use of multi-core machines. This PR introduces a multiprocessing pool to process multiple pages simultaneously. The amount of threads to use can be specified in the environment variable `PAPERLESS_OCR_THREADS`. This will default to the number of cores/hyperthreads Python detects for your system.	2016-02-14 15:57:42 +01:00
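
The pool-based page processing described in commits f5beda9c56 and 20b2408dbb can be sketched roughly as follows. This is an illustration only, not the code from those commits: the function names `ocr_thread_count` and `ocr_pages` are hypothetical, the per-page OCR call is passed in as a plain callable rather than invoking tesseract, and `ThreadPool` stands in for the multiprocessing pool mentioned in the commit message so the sketch stays self-contained. Clamping the value to at least 1 is also an assumption.

```python
import os
from multiprocessing import cpu_count
from multiprocessing.pool import ThreadPool


def ocr_thread_count(environ=os.environ):
    """Read PAPERLESS_OCR_THREADS as an integer, else use the detected core count."""
    raw = environ.get("PAPERLESS_OCR_THREADS")
    if raw is None or raw.strip() == "":
        return cpu_count()
    # int() raises ValueError for non-numeric input; clamp to at least one worker.
    return max(1, int(raw))


def ocr_pages(pages, ocr_one_page, threads=None):
    """Apply ocr_one_page to every page concurrently, preserving page order."""
    threads = threads or ocr_thread_count()
    # ThreadPool is a stand-in here; the commit describes a multiprocessing pool.
    with ThreadPool(processes=threads) as pool:
        return pool.map(ocr_one_page, pages)
```

Because each page is independent of the others, `pool.map` can hand pages to workers in any order and still return the results in page order.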


166 Commits
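
The page-count argument in commit aeab9a0e81 (worst case `n+1` OCR passes instead of `2n`) can be illustrated with a small sketch. It is not the project's implementation: `ocr_with_language_guess`, `ocr_page`, and `guess_language` are hypothetical names, and the default language code `"eng"` is an assumption.

```python
def ocr_with_language_guess(pages, ocr_page, guess_language, default_lang="eng"):
    """OCR the middle page first, guess the language, then OCR the rest.

    ocr_page(page, lang) and guess_language(text) are caller-supplied
    callables (hypothetical stand-ins for the real OCR and detection steps).
    """
    if not pages:
        return default_lang, []

    middle = len(pages) // 2
    middle_text = ocr_page(pages[middle], default_lang)
    guessed = guess_language(middle_text) or default_lang

    if guessed == default_lang:
        # Language matched: reuse the middle page, OCR only the others (n calls).
        texts = [
            middle_text if i == middle else ocr_page(page, default_lang)
            for i, page in enumerate(pages)
        ]
    else:
        # Language differed: redo every page in the guessed language (n + 1 calls).
        texts = [ocr_page(page, guessed) for page in pages]
    return guessed, texts
```

With three pages, a matching language costs three OCR calls and a differing one costs four, compared with six when the whole document is processed twice.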