fix(integration): Enable scrape again on spider #620

Draft
wants to merge 1 commit into
base: master

timonv (Member) commented Feb 14, 2025

Waiting on spider-rs/spider#269

This does not work yet. Log output from one of the tests:

[INFO  swiftide_integrations::scraping::loader] Subscribed to spider
[INFO  swiftide_integrations::scraping::loader] [Spider] Starting scrape loop
[DEBUG reqwest::connect] starting new connection: http://127.0.0.1:51487/
[DEBUG hyper_util::client::legacy::connect::http] connecting to 127.0.0.1:51487
[DEBUG hyper_util::client::legacy::connect::http] connected to 127.0.0.1:51487
[DEBUG wiremock::mock_set] Handling request.
[DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("http", 127.0.0.1:51487)
[INFO  spider::utils] fetch http://127.0.0.1:51487
[INFO  spider::utils] fetch http://127.0.0.1:51487/other
[DEBUG hyper_util::client::legacy::pool] reuse idle connection for ("http", 127.0.0.1:51487)
[DEBUG wiremock::mock_set] Handling request.
[DEBUG swiftide_integrations::scraping::loader] [Spider] Received node from spider node=Ok(Node { id: 4bbb2370-2b46-3423-a5be-572a9a8b25c8, path: "http://127.0.0.1:51487", chunk: "<html><body><h1>Test Page</h1><a href=\"/other\">link</a></body></html>", metadata: {}, vectors: "", sparse_vectors: "", embed_mode: SingleWithMetadata })
[DEBUG hyper_util::client::legacy::pool] pooling idle connection for ("http", 127.0.0.1:51487)
[DEBUG swiftide_integrations::scraping::loader] [Spider] Received node from spider node=Ok(Node { id: 9d3aecea-d8f1-32f3-8479-0b4b50214d02, path: "http://127.0.0.1:51487/other", chunk: "<html><body><h1>Test Page 2</h1></body></html>", metadata: {}, vectors: "", sparse_vectors: "", embed_mode: SingleWithMetadata })
