Add documentation on how to use TestingBot #268
Open. jochen-testingbot wants to merge 3 commits into val-town:main from jochen-testingbot:main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| --- | ||
| title: TestingBot | ||
| description: How to use TestingBot to scrape websites with Val Town | ||
| --- | ||
|
|
||
| Some websites are partially (or entirely) rendered on the client (aka your web | ||
| browser). If you try to search the initial HTML for elements that haven't | ||
| finished rendering, you won't find them. | ||
|
|
||
| One solution is to use a headless browser: a web browser that runs in the | ||
| background, fetches the page, renders it, and _then_ lets you search | ||
| the final document. | ||
|
|
||
| [TestingBot](https://testingbot.com/) | ||
| provides an API to interact with a remote headless browser. You can use a Function to [scrape a website](https://testingbot.com/support/functions/scrape) and fetch its contents. | ||
|
|
||
| ## Sign up to TestingBot and retrieve your credentials | ||
|
|
||
| Copy the API key and secret from the | ||
| [TestingBot dashboard](https://testingbot.com/members/user/security) | ||
| and save them as the Val Town environment variables `testingbot_key` and `testingbot_secret`. | ||
|
|
||
| ## Make an API call to the [/scrape API](https://testingbot.com/support/functions/scrape) | ||
|
|
||
| Check the documentation for the | ||
| [/scrape API](https://testingbot.com/support/functions/scrape) and prepare your request. | ||
|
|
||
| For example, here's how you scrape the introduction paragraph of OpenAI's | ||
| Wikipedia page. | ||
|
|
||
| ```ts title="Scrape API example" val | ||
| import { fetchJSON } from "https://esm.town/v/stevekrouse/fetchJSON?v=41"; | ||
|
|
||
| export default async function ScrapeWebsite(url: string, selector: string) { | ||
| const res = await fetchJSON( | ||
| `https://cloud.testingbot.com/scrape?key=${Deno.env.get("testingbot_key")}&secret=${ | ||
| Deno.env.get("testingbot_secret") | ||
| }&browserName=chrome&version=latest&platform=WIN10`, | ||
| { | ||
| method: "POST", | ||
| body: JSON.stringify({ | ||
| url: url, | ||
| elements: [ | ||
| { | ||
| // CSS selector of the element to extract | ||
| selector: selector, | ||
| }, | ||
| ], | ||
| }), | ||
| }, | ||
| ); | ||
| // For this request, TestingBot will return one data item | ||
| const data = res[0]; | ||
| // That contains a single element | ||
| const elements = data.results; | ||
| // Which we want to turn into its innerText value | ||
| const intro = elements[0].text; | ||
| return intro; | ||
| } | ||
| console.log("Scrape result:", await ScrapeWebsite("https://en.wikipedia.org/wiki/OpenAI", "p:nth-of-type(2)")); | ||
| ``` | ||
|
|
||
| There are other functions available, such as [taking screenshots](https://testingbot.com/support/functions/screenshot), [generating PDFs](https://testingbot.com/support/functions/pdf) and more. | ||
|
|
||
| ## Use Puppeteer to instrument a remote browser | ||
|
|
||
| You can use the [Puppeteer](https://pptr.dev/) library to connect to a remote browser running on TestingBot. | ||
|
|
||
| Once you've navigated to a URL, you can run arbitrary JavaScript with | ||
| `page.evaluate`, for example to get the text of a paragraph. | ||
|
|
||
| ```ts title="Puppeteer example" val | ||
| import { PuppeteerDeno } from "https://deno.land/x/[email protected]/src/deno/Puppeteer.ts"; | ||
|
|
||
| const puppeteer = new PuppeteerDeno({ | ||
| productName: "chrome", | ||
| }); | ||
| const capabilities = { | ||
| 'tb:options': { | ||
| key: Deno.env.get("testingbot_key"), | ||
| secret: Deno.env.get("testingbot_secret") | ||
| }, | ||
| browserName: 'chrome', | ||
| browserVersion: 'latest' | ||
| }; | ||
| const browser = await puppeteer.connect({ | ||
| browserWSEndpoint: `wss://cloud.testingbot.com/puppeteer?capabilities=${encodeURIComponent(JSON.stringify(capabilities))}`, | ||
| }); | ||
| const page = await browser.newPage(); | ||
| await page.goto("https://en.wikipedia.org/wiki/OpenAI"); | ||
| const intro = await page.evaluate( | ||
| `document.querySelector('p:nth-of-type(2)').innerText` | ||
| ); | ||
| await browser.close(); | ||
| console.log(intro); | ||
| ``` | ||
|
|
||
| ```txt title="Logs" | ||
| "OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI Incorporated and its for-profit subsidiary corporation OpenAI Limited Partnership. OpenAI conducts AI research with the declared intention of promoting and developing friendly AI." | ||
| ``` | ||
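One possible hardening of the scrape example in this diff, offered as a sketch rather than a tested change: the example interpolates the key and secret straight into the query string and indexes into the response without checking its shape. The helpers below URL-encode the credentials and read the first result defensively. The names `buildScrapeUrl`, `buildScrapeBody`, and `firstText` are hypothetical, and the response shape (`[{ results: [{ text }] }]`) is assumed from the example above, not confirmed against the TestingBot documentation.

```typescript
// Hypothetical helpers; the endpoint and query parameters are copied from the
// scrape example in this diff, everything else is an illustrative assumption.
interface ScrapeCredentials {
  key: string;
  secret: string;
}

// Build the /scrape URL with every query value URL-encoded.
function buildScrapeUrl(creds: ScrapeCredentials): string {
  if (!creds.key || !creds.secret) {
    throw new Error("Missing TestingBot key or secret");
  }
  const params: [string, string][] = [
    ["key", creds.key],
    ["secret", creds.secret],
    ["browserName", "chrome"],
    ["version", "latest"],
    ["platform", "WIN10"],
  ];
  const query = params
    .map(([name, value]) => `${encodeURIComponent(name)}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://cloud.testingbot.com/scrape?${query}`;
}

// Build the JSON request body for a single selector, as in the example.
function buildScrapeBody(url: string, selector: string): string {
  return JSON.stringify({ url, elements: [{ selector }] });
}

// Read the text of the first result, or undefined if the response does not
// have the assumed shape (for instance, an error object instead of an array).
function firstText(response: unknown): string | undefined {
  if (!Array.isArray(response)) return undefined;
  const text = response[0]?.results?.[0]?.text;
  return typeof text === "string" ? text : undefined;
}
```

In a val, the credentials would come from `Deno.env.get("testingbot_key")` and `Deno.env.get("testingbot_secret")`, and the parsed JSON response would be passed to `firstText` in place of `res[0].results[0].text`.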
Can you link a val here that readers can remix under a TestingBot account? Want to also make sure this runs!
Hi @charmaine - we added a val here: https://www.val.town/x/testingbot/scrape-website