Troubleshooting

No files are added by the consumer

Check for the following issues:

Ensure that the directory you're putting your documents in is the folder paperless is watching. With docker, this setting is performed in the docker-compose.yml file. Without docker, look at the CONSUMPTION_DIR setting. Don't adjust this setting if you're using docker.
Ensure that redis is up and running. Paperless does its task processing asynchronously, and for documents to arrive at the task processor, it needs redis to run.
Ensure that the task processor is running. Docker does this automatically. Without Docker, invoke the task processor manually by executing:
```
$ celery --app paperless worker
```
Look at the output of paperless and inspect it for any errors.
Go to the admin interface, and check if there are failed tasks. If so, the tasks will contain an error message.
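
The first two checks in this list can be roughly automated. The following is a minimal sketch, not part of Paperless itself: the function names, the default host `localhost`, and the default port `6379` are assumptions, and a Redis server that requires a password will be reported as down because it does not answer `+PONG` to an unauthenticated `PING`.

```python
import os
import socket


def redis_is_up(host="localhost", port=6379, timeout=2.0):
    """Return True if a Redis server answers PING on host:port.

    The host and port defaults are assumptions; use the values from your setup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"PING\r\n")  # inline command form of the Redis protocol
            return conn.recv(16).startswith(b"+PONG")
    except OSError:
        # Connection refused, timeout, unreachable host, and similar failures.
        return False


def consumption_dir_ok(path):
    """Return True if path is an existing directory the current user can read and write."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK | os.X_OK)
```

Call `redis_is_up()` and `consumption_dir_ok("/path/to/consume")` as the user that runs Paperless, replacing the path with the directory you actually configured.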

Consumer warns `OCR for XX failed`

If you find the OCR accuracy to be too low, and/or the document consumer warns that `OCR for XX failed, but we're going to stick with what we've got since FORGIVING_OCR is enabled`, then you might need to install the Tesseract language files matching your document's languages.

As an example, if you are running Paperless-ngx from any Ubuntu or Debian box, and your documents are written in Spanish you may need to run:

apt-get install -y tesseract-ocr-spa

Consumer fails to pick up any new files

If you notice that the consumer only picks up files in the consumption directory at startup but does not find files added later, enable filesystem polling with the configuration option PAPERLESS_CONSUMER_POLLING (see the configuration documentation).

This will disable listening to filesystem changes with inotify and paperless will manually check the consumption directory for changes instead.
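
To illustrate what "manually checking the directory" means, here is a minimal sketch of polling. It is not Paperless's actual consumer code. The helper names `snapshot` and `new_or_changed` are hypothetical. The idea is to list the directory at a fixed interval and compare each listing with the previous one, which also works on file systems that deliver no inotify events.

```python
import os


def snapshot(directory):
    """Map each regular file in directory to its (size, modification time)."""
    files = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                files[entry.name] = (info.st_size, info.st_mtime)
    return files


def new_or_changed(previous, current):
    """Return the sorted names that are new in current or whose state differs from previous."""
    return sorted(
        name for name, state in current.items() if previous.get(name) != state
    )
```

A polling loop would take a snapshot, sleep for the configured number of seconds, take another snapshot, and hand every name returned by `new_or_changed` to the consumer.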

Paperless always redirects to /admin

You probably had the old paperless installed at some point. Paperless installed a permanent redirect to /admin in your browser, and you need to clear your browsing data / cache to fix that.

Operation not permitted

You might see errors such as:

chown: changing ownership of '../export': Operation not permitted

The container tries to set file ownership on the listed directories. This is required so that the user running paperless inside docker has write permissions to these folders. The error occurs when changing ownership is not allowed on a directory, for example when it points to an NFS share.

Ensure that chown is possible on these directories.
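
One rough way to test this is to re-apply the owner and group a directory already has, which changes nothing if it succeeds. The sketch below assumes a Unix-like host. The function name `chown_possible` is hypothetical. A success here only shows that the calling user may issue chown on that path. It does not guarantee that the container, which may change ownership to a different user, will succeed as well.

```python
import os


def chown_possible(path):
    """Return True if re-applying the current owner and group to path succeeds."""
    info = os.stat(path)
    try:
        # A no-op ownership change: same uid and gid as before.
        os.chown(path, info.st_uid, info.st_gid)
    except OSError:
        # Typically "Operation not permitted" on shares that forbid chown.
        return False
    return True
```

Run it as the same user the container uses (often root) against each directory named in the error message.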

Classifier error: No training data available

This indicates that the Auto matching algorithm found no documents to learn from. There are two possible causes:

You don't use the Auto matching algorithm: The error can be safely ignored in this case.
You are using the Auto matching algorithm: The classifier explicitly excludes documents with Inbox tags. Verify that there are documents in your archive without inbox tags. The algorithm will only learn from documents not in your inbox.

UserWarning in sklearn on every single document

You may encounter warnings like this:

/usr/local/lib/python3.7/site-packages/sklearn/base.py:315:
UserWarning: Trying to unpickle estimator CountVectorizer from version 0.23.2 when using version 0.24.0.
This might lead to breaking code or invalid results. Use at your own risk.

This happens when certain dependencies of paperless that are responsible for the auto matching algorithm are updated. After updating these, your current training data might not be compatible anymore. This can be ignored in most cases. This warning will disappear automatically when paperless updates the training data.

If you want to get rid of the warning or actually experience issues with automatic matching, delete the file classification_model.pickle in the data directory and let paperless recreate it.
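
As a small sketch of that step: the file name is taken from the text above, while the function name and the `data_dir` argument are assumptions that you need to point at your own data directory.

```python
from pathlib import Path


def remove_classifier_model(data_dir):
    """Delete classification_model.pickle from data_dir; return True if a file was removed."""
    model = Path(data_dir) / "classification_model.pickle"
    if model.is_file():
        model.unlink()
        return True
    return False
```

Paperless then recreates the file the next time it trains the classifier.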

504 Server Error: Gateway Timeout when adding Office documents

You may experience these errors when using the optional TIKA integration:

requests.exceptions.HTTPError: 504 Server Error: Gateway Timeout for url: http://gotenberg:3000/forms/libreoffice/convert

Gotenberg is a server that converts Office documents into PDF documents and has a default timeout of 30 seconds. When conversion takes longer, Gotenberg raises this error.

You can increase the timeout by configuring a command flag for Gotenberg (see also the Gotenberg documentation). If using docker-compose, this is achieved by the following configuration change in the docker-compose.yml file:

# The gotenberg chromium route is used to convert .eml files. We do not
# want to allow external content like tracking pixels or even javascript.
command:
  - 'gotenberg'
  - '--chromium-disable-javascript=true'
  - '--chromium-allow-list=file:///tmp/.*'
  - '--api-timeout=60'

Permission denied errors in the consumption directory

You might encounter errors such as:

The following error occured while consuming document.pdf: [Errno 13] Permission denied: '/usr/src/paperless/src/../consume/document.pdf'

This happens when paperless does not have permission to delete files inside the consumption directory. Ensure that USERMAP_UID and USERMAP_GID are set to the user id and group id you use on the host operating system, if these are different from 1000. See Docker setup.

Also ensure that you are able to read and write to the consumption directory on the host.
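
To find the values to compare, a minimal sketch like the following can be run on the host (Unix-like systems only). The function name and the dictionary keys are hypothetical. It reports who owns the directory, who you are, and whether you can read, write, and delete entries in it.

```python
import os


def ownership_report(path):
    """Collect the IDs and access checks relevant to a consumption-directory permission error."""
    info = os.stat(path)
    return {
        "dir_uid": info.st_uid,  # owner of the directory
        "dir_gid": info.st_gid,  # group of the directory
        "my_uid": os.getuid(),   # candidate value for USERMAP_UID
        "my_gid": os.getgid(),   # candidate value for USERMAP_GID
        "can_read": os.access(path, os.R_OK),
        "can_write": os.access(path, os.W_OK),
        # Deleting a file needs write and search permission on its directory.
        "can_delete_entries": os.access(path, os.W_OK | os.X_OK),
    }
```

If `my_uid` or `my_gid` differ from 1000, those are the values to consider for USERMAP_UID and USERMAP_GID.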

OSError: [Errno 19] No such device when consuming files

If you experience errors such as:

File "/usr/local/lib/python3.7/site-packages/whoosh/codec/base.py", line 570, in open_compound_file
    return CompoundStorage(dbfile, use_mmap=storage.supports_mmap)
File "/usr/local/lib/python3.7/site-packages/whoosh/filedb/compound.py", line 75, in __init__
    self._source = mmap.mmap(fileno, 0, access=mmap.ACCESS_READ)
OSError: [Errno 19] No such device

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/django_q/cluster.py", line 436, in worker
    res = f(*task["args"], **task["kwargs"])
File "/usr/src/paperless/src/documents/tasks.py", line 73, in consume_file
    override_tag_ids=override_tag_ids)
File "/usr/src/paperless/src/documents/consumer.py", line 271, in try_consume_file
    raise ConsumerError(e)

Paperless uses a search index to provide better and faster full text searching. This search index is stored inside the data folder. The search index uses memory-mapped files (mmap). The above error indicates that paperless was unable to create and open these files.

This happens when you're trying to store the data directory on certain file systems (mostly network shares) that don't support memory-mapped files.
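
Whether a given location supports memory-mapped files can be probed before moving the data directory there. The sketch below is an assumption-laden illustration, not Paperless or Whoosh code. The function name `supports_mmap` is hypothetical. It creates a small temporary file in the directory and attempts the same kind of read-only mapping that appears in the traceback above.

```python
import mmap
import os
import tempfile


def supports_mmap(directory):
    """Return True if a file created in directory can be memory-mapped read-only."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"probe")  # an empty file cannot be mapped
        try:
            with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mapped:
                return mapped[:5] == b"probe"
        except OSError:
            # For example "[Errno 19] No such device" on unsupported file systems.
            return False
    finally:
        # Always clean up the probe file.
        os.close(fd)
        os.remove(path)
```

If the function returns False for your intended data directory, keep the data directory on a local file system instead.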

Web-UI stuck at "Loading..."

There are several possible causes.

If you built the docker image yourself or deployed using the bare metal route, make sure that there are files in <paperless-root>/static/frontend/<lang-code>/. If there are no files, make sure that you executed collectstatic successfully, either manually or as part of the docker image build.

If the front end is still missing, make sure that the front end is compiled (files present in src/documents/static/frontend). If it is not, you need to compile the front end yourself or download the release archive instead of cloning the repository.

Check the output of the web server. You might see errors like this:

[2021-01-25 10:08:04 +0000] [40] [ERROR] Socket error processing request.
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 134, in handle
    self.handle_request(listener, req, client, addr)
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 190, in handle_request
    util.reraise(*sys.exc_info())
File "/usr/local/lib/python3.7/site-packages/gunicorn/util.py", line 625, in reraise
    raise value
File "/usr/local/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 178, in handle_request
    resp.write_file(respiter)
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 396, in write_file
    if not self.sendfile(respiter):
File "/usr/local/lib/python3.7/site-packages/gunicorn/http/wsgi.py", line 386, in sendfile
    sent += os.sendfile(sockno, fileno, offset + sent, count)
OSError: [Errno 22] Invalid argument

To fix this issue, add

SENDFILE=0

to your docker-compose.env file.