How can I configure docsearch-scraper to run against a private internal documentation site that requires authentication via OAuth2? #51

Open
@liberty-wollerman-kr

Description

This is a general question rather than an issue per se. I'd first like to understand whether the Typesense docsearch-scraper has any support for scraping content that requires OAuth2 authentication. From what I have read (https://typesense.org/docs/guide/docsearch.html#tips-for-common-challenges-or-more-complex-use-cases), some other authentication mechanisms are supported, but OAuth2 does not appear to be mentioned.

I've begun researching possible options (e.g. https://docsearch.algolia.com/docs/legacy/run-your-own#run-the-crawl-from-the-docker-image). This "run your own" option is interesting, since I have chosen a self-hosted Typesense installation anyway. I'd like some help understanding how to set up the configuration and environment to handle this scenario.
Does this require me to fork the docsearch code and add an auth module to the Docker image?
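
To make the scenario concrete, here is a minimal sketch of the OAuth2 step such an auth module would probably need, assuming the identity provider allows the client-credentials grant (RFC 6749, section 4.4). The token URL, client ID, secret, and both function names are hypothetical placeholders. Whether docsearch-scraper can attach the resulting header to its requests is exactly what I don't know.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope=None):
    """Build (but do not send) an OAuth2 client-credentials token request.

    Hypothetical helper: it assumes the provider accepts HTTP Basic client
    authentication and that the credentials contain no characters that
    would need form-encoding first.
    """
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    body = urllib.parse.urlencode(form).encode("ascii")

    # Client credentials go in the Authorization header as "id:secret", base64-encoded.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")

    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def bearer_header(access_token):
    """Header a crawler would have to send with every page request."""
    return {"Authorization": f"Bearer {access_token}"}


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network here.
    req = build_token_request(
        "https://idp.example.internal/oauth2/token", "ci-agent", "s3cret"
    )
    print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` should return a JSON body containing an `access_token` field, which would then be passed to `bearer_header`. A site that only supports an interactive browser login (authorization-code flow) would need a different approach.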

My UI is deployed in a Rancher-managed Kubernetes cluster, where I will also host the Typesense server. Would it be possible to create an ingress rule that lets, for example, a pipeline build agent through without auth so that it can run the scrape/indexing process?

Can you help me clarify what my options are here, and provide me with some guidance on how to implement a solution?