Provide a library interface for vectara-ingest #190

@anand2312

As far as I can tell, the only out-of-the-box way to run the crawlers in this repo is to run them in a Docker container, which installs all dependencies.
In my opinion, it would be more useful if you also provided a "library" interface, so that I could import a crawler into my own Python project and call it as needed.
(I understand that this is technically already possible, but it's not as straightforward as it could be.)
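
To make the request concrete, here is a rough sketch of the kind of interface I have in mind. Every name in it (`Document`, `Crawler`, `StaticListCrawler`, `run`) is hypothetical and made up for illustration; none of it reflects the current vectara-ingest code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List


@dataclass
class Document:
    """A crawled document (hypothetical shape, not the project's real type)."""
    url: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class Crawler:
    """Hypothetical base class: subclasses yield documents from crawl()."""

    def crawl(self) -> Iterator[Document]:
        raise NotImplementedError


class StaticListCrawler(Crawler):
    """Toy crawler that 'crawls' an in-memory mapping of URL -> text."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages

    def crawl(self) -> Iterator[Document]:
        for url, text in self.pages.items():
            yield Document(url=url, text=text, metadata={"source": "static"})


def run(crawler: Crawler, sink: Callable[[Document], None]) -> int:
    """Feed every crawled document to `sink`; return how many were processed."""
    count = 0
    for doc in crawler.crawl():
        sink(doc)
        count += 1
    return count


# Usage from a caller's own project: collect documents into a list.
collected: List[Document] = []
pages = {"https://example.com/a": "hello", "https://example.com/b": "world"}
processed = run(StaticListCrawler(pages), collected.append)
```

The point is only that the caller decides where documents go (the `sink` callable), so the crawler can be embedded in another program without Docker or a config file.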

With that approach, you could split up the dependencies so that users only need to install what the crawlers they plan to use require, rather than everything (which seems to be the case right now).
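
One way the split could work on the Python side is to import each crawler's heavy dependencies only when that crawler is requested, and to point the user at the matching extra when they are missing. The sketch below assumes a hypothetical registry, a hypothetical package name (`example-package`), and a hypothetical module `hypothetical_website_deps`; the stdlib `json` module stands in for a dependency that is always available.

```python
import importlib
from types import ModuleType
from typing import Dict, Tuple

# Hypothetical registry: crawler name -> (module to import, extra providing it).
_CRAWLER_MODULES: Dict[str, Tuple[str, str]] = {
    "json": ("json", "core"),  # stdlib stand-in so the sketch runs anywhere
    "website": ("hypothetical_website_deps", "website"),  # not a real module
}


def load_crawler_module(name: str) -> ModuleType:
    """Import the backing module for `name` only when it is requested."""
    try:
        module_name, extra = _CRAWLER_MODULES[name]
    except KeyError:
        raise ValueError(f"unknown crawler: {name!r}") from None
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Tell the user which extra to install instead of failing obscurely.
        raise ImportError(
            f"crawler {name!r} needs optional dependencies; "
            f"install them with: pip install 'example-package[{extra}]'"
        ) from exc
```

With something like this, installing the base package stays light, and a missing extra produces an actionable error rather than a bare import failure.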

I see you are using uv to install dependencies in the Dockerfile. Why not use it to manage the project's dependencies as well? That would simplify maintaining the dependency split I described above.

Just wanted to start a discussion.
