Feature/paperless-ngx connector #4609

Open: wants to merge 20 commits into base: main

@cbrown350 cbrown350 commented Apr 25, 2025

Description

This adds code for a Paperless-ngx connector: https://docs.paperless-ngx.com

How Has This Been Tested?

This includes mock-based tests in the same style as the other connectors. It has also been manually tested extensively. Anyone can test it against a clean Paperless-ngx Docker image by running the included script at backend/tests/daily/connectors/paperless_ngx/run_manual_test_paperless_server.sh. The script sets up a server and then provides credentials you can use to create a connector in a local Onyx instance.
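Before creating the connector, it can help to confirm that the credentials from the script work. The following is a minimal sketch, not code from this PR: it only builds (and does not send) a request to the Paperless-ngx document list endpoint using token authentication. The base URL, port, and token value are placeholders, and the helper name `build_documents_request` is hypothetical.

```python
from urllib.parse import urljoin
from urllib.request import Request


def build_documents_request(base_url: str, api_token: str) -> Request:
    """Build (but do not send) a GET request for the Paperless-ngx document list."""
    # Normalize the base URL so that urljoin appends the path instead of replacing it.
    url = urljoin(base_url.rstrip("/") + "/", "api/documents/")
    headers = {
        # Paperless-ngx accepts token auth in the form "Token <token>".
        "Authorization": f"Token {api_token}",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


# Placeholder values: substitute the URL and token reported by the test script.
req = build_documents_request("http://localhost:8000", "example-token")
print(req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` against a running server should return a JSON page of documents if the token is valid.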

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

#159
#776
#3107

onyx-dot-app/documentation#196

@cbrown350 cbrown350 requested a review from a team as a code owner April 25, 2025 05:39

vercel bot commented Apr 25, 2025

@cbrown350 is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.


@greptile-apps greptile-apps bot left a comment

PR Summary

This PR adds a new Paperless-ngx connector implementation with comprehensive testing and documentation. Here are the key points:

  • Implements LoadConnector, PollConnector, and SlimConnector interfaces with proper date filtering, tag/user filtering, and error handling in /backend/onyx/connectors/paperless_ngx/connector.py

  • Includes a robust testing setup with both unit tests and a convenient Docker-based manual testing script (run_manual_test_paperless_server.sh) for local validation

  • Adds proper UI configuration in web/src/lib/connectors/connectors.tsx with fields for tag filtering, username filtering, and no-owner document inclusion

  • Bug in date field validation: the code raises an exception when ui_date_field matches a valid field instead of setting it, and it does not properly handle master vs local date fields

  • _parse_document() does not validate the API response; it should check that required fields are present and that their values have the expected data types
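The last two points can be illustrated with a short sketch. This is not the PR's code: the mapping of UI labels to API fields, the required-field table, and the helper names `resolve_date_field` and `validate_document` are assumptions made for illustration, and the sketch does not attempt the master vs local date field distinction.

```python
from typing import Any, Optional

# Hypothetical mapping from UI labels to API field names; the PR's actual names may differ.
DATE_FIELDS = {
    "Created Date": "created",
    "Added Date": "added",
    "Modified Date": "modified",
}

# Assumed minimal set of fields a parsed document needs, with expected types.
REQUIRED_FIELDS = {"id": int, "title": str, "content": str}


def resolve_date_field(ui_date_field: Optional[str], default: str = "modified") -> str:
    """Return the API date field for a UI label; raise only when the label is unknown."""
    if ui_date_field is None:
        return default
    if ui_date_field in DATE_FIELDS:
        # A valid label is accepted and mapped, not rejected.
        return DATE_FIELDS[ui_date_field]
    raise ValueError(f"Unknown date field: {ui_date_field!r}")


def validate_document(raw: dict[str, Any]) -> dict[str, Any]:
    """Check that required fields are present and have the expected types."""
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in raw:
            raise ValueError(f"Document is missing required field: {name}")
        if not isinstance(raw[name], expected_type):
            raise ValueError(
                f"Field {name!r} should be {expected_type.__name__}, "
                f"got {type(raw[name]).__name__}"
            )
    return raw


print(resolve_date_field("Created Date"))
print(validate_document({"id": 1, "title": "Invoice", "content": "text"})["title"])
```

Raising on the unknown label rather than on the valid one is the inversion the review describes; failing early on malformed responses keeps bad records out of the index.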

12 file(s) reviewed, 10 comment(s)

@cbrown350 cbrown350 force-pushed the Feature/Paperless-ngx_Connector branch from 68bfd4a to 4a8b530 Compare April 25, 2025 06:39