feat: AutoRAG - text extraction - enable table structure detection #111
Changes from all commits
53a0aff
0bb3b60
c30f70c
5ed7a75
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -13,6 +13,7 @@ requires-python = ">=3.11" | |
| dependencies = [ | ||
| "kfp>=2.16.1", | ||
| "kfp-kubernetes>=2.16.1", | ||
| "opencv-python-headless", | ||
🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Verify available versions and recent vulnerabilities before choosing exact bounds.
curl -s https://pypi.org/pypi/opencv-python-headless/json | jq -r '.info.version, (.releases | keys[])' | tail -n 20
# OSV vulnerability lookup for the package
curl -s https://api.osv.dev/v1/query -H 'Content-Type: application/json' \
  -d '{"package":{"name":"opencv-python-headless","ecosystem":"PyPI"}}' | jq

Repository: opendatahub-io/pipelines-components
Length of output: 50391

Pin opencv-python-headless to explicit version bounds.

Along with that, I recommend running the installation in a fresh Linux environment and generating a new requirements file using pip freeze. This is required for the downstream hermetic build, which pre-fetches all dependencies for the offline image build step.
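The fresh-environment step could be scripted roughly as follows. This is a minimal sketch, not the project's actual tooling: the function names, the `.venv-freeze` directory, and the `requirements.txt` output path are all assumptions made for illustration, and the `bin/python` path assumes a Linux virtual-environment layout.

```python
import subprocess
import venv
from pathlib import Path


def freeze_commands(venv_dir: Path, project_dir: Path) -> list[list[str]]:
    """Build the pip commands to run inside the fresh environment.

    Hypothetical helper: the names and layout are illustrative only.
    """
    python = str(venv_dir / "bin" / "python")  # Linux venv layout (assumption)
    return [
        [python, "-m", "pip", "install", str(project_dir)],
        [python, "-m", "pip", "freeze"],
    ]


def regenerate_requirements(venv_dir: Path, project_dir: Path, output: Path) -> None:
    """Install the project into a clean venv and write the frozen package list."""
    # clear=True wipes any previous environment so stale packages cannot leak in.
    venv.create(venv_dir, with_pip=True, clear=True)
    install_cmd, freeze_cmd = freeze_commands(venv_dir, project_dir)
    subprocess.run(install_cmd, check=True)
    frozen = subprocess.run(freeze_cmd, check=True, capture_output=True, text=True)
    output.write_text(frozen.stdout)


if __name__ == "__main__":
    # Paths are placeholders; adjust them to the repository being built.
    regenerate_requirements(Path(".venv-freeze"), Path("."), Path("requirements.txt"))
```

The frozen output may also list the project itself, so that entry would likely need to be removed before the file is used for dependency pre-fetching.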
| ] | ||
| [project.optional-dependencies] | ||
Update downstream unit test expectation for do_table_structure.

Line 168 flips the PDF pipeline contract to do_table_structure=True, but components/data_processing/autorag/text_extraction/tests/test_component_unit.py (around Lines 333-404 in the provided snippet) still asserts False. This will leave the test suite validating stale behavior.
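The shape of the updated assertion might look like the sketch below. It does not reproduce the real test file: `PdfPipelineOptionsStub` and `build_pdf_pipeline_options` are hypothetical stand-ins for whatever the component actually constructs, used only to show the expectation flipping from False to True.

```python
from dataclasses import dataclass


@dataclass
class PdfPipelineOptionsStub:
    """Hypothetical stand-in for the component's PDF pipeline options."""

    do_table_structure: bool = False


def build_pdf_pipeline_options() -> PdfPipelineOptionsStub:
    """Hypothetical factory mirroring the new contract (table structure enabled)."""
    return PdfPipelineOptionsStub(do_table_structure=True)


def test_pdf_pipeline_enables_table_structure() -> None:
    options = build_pdf_pipeline_options()
    # The old expectation was False; the new contract requires True.
    assert options.do_table_structure is True
```

Asserting with `is True` rather than plain truthiness keeps the test strict about the flag being a real boolean.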