Skip to content

Commit a70e870

Browse files
committed
chore: release v0.24.0 — Gen 21 + 26b + 28 + docs
1 parent f8a0f53 commit a70e870

3 files changed

Lines changed: 51 additions & 26 deletions

File tree

.evolve/current.json

Lines changed: 17 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -1,32 +1,24 @@
11
{
22
"mode": "pursue",
33
"goal": "#1 on WebVoyager",
4-
"status": "ready-for-full-run",
4+
"status": "gen21-shipped",
55
"generation": 27,
6+
"subGeneration": "21-parallel-tabs",
67
"branch": "main",
7-
"changes": [
8-
"Cost cap 200k→300k (40 tasks were at old turn/cost limits)",
9-
"DuckDuckGo search fallback (Google/Bing block headless browsers)",
10-
"CAPTCHA checkbox solver (reCAPTCHA bypass, Cambridge Dict flipped)",
11-
"Form reset detection + keyboard auto-retry (Google Flights fix)",
12-
"Block-level snapshot dedup (93% compression on card-heavy pages)",
13-
"Progressive snapshot budget (4k→2.5k after 8+ same-page turns)",
14-
"Vision model cascade to gpt-4.1-mini (cost reduction)",
15-
"Form stall injection with origin+pathname matching",
16-
"Supervisor suggests DDG fallback on form stalls",
17-
"Batch fill 150ms settle delay between fields"
8+
"npmVersion": "0.23.0",
9+
"shipped": {
10+
"gen27": "stealth, anti-bot, form intelligence, CAPTCHA, card dedup",
11+
"gen21": "parallel tab execution (GoalDecomposer + ParallelRunner + EvidenceMerger)"
12+
},
13+
"heldOutResults": {
14+
"competitive": "10/10 (100%)",
15+
"webbench50": "44/50 (88% raw), 95.7% excl DataDome",
16+
"systemChromeUnblocked": "9/13 previously-blocked sites"
17+
},
18+
"nextActions": [
19+
"Gen 29-30 audit (bad-app production readiness)",
20+
"Full WebVoyager 590 run with Gen 27",
21+
"Gen 28: multi-model orchestrator (half day)"
1822
],
19-
"validatedFlips": [
20-
"booking-16: PASS 8t/$0.06 (was 19t/FAIL)",
21-
"booking-20: PASS 25t/$0.76 (was cost_cap)",
22-
"cambridge-dictionary-19: PASS 5t/$0.03 (was reCAPTCHA blocked)",
23-
"3/8 Google Flights via batch fill variance + cost cap"
24-
],
25-
"remainingBlockers": [
26-
"Google Flights form reset (keyboard retry shipped but untested at scale)",
27-
"Anti-bot on DDG/Bing/Skyscanner (headless browser detection)",
28-
"Google sorry page CAPTCHA (checkbox solver works but sorry page may not have reCAPTCHA)"
29-
],
30-
"expectedRange": "93-96% (549-566/590)",
31-
"updatedAt": "2026-04-11T09:25:00Z"
23+
"updatedAt": "2026-04-11T15:10:00Z"
3224
}

CHANGELOG.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,38 @@
11
# @tangle-network/browser-agent-driver
22

3+
## 0.24.0
4+
5+
### Minor Changes
6+
7+
- Gen 21 + 26b + 28: parallel tabs, site pattern learning, multi-model orchestration
8+
9+
**Gen 21 — Parallel Tab Execution:**
10+
11+
- GoalDecomposer classifies goals as simple vs compound (1 cheap LLM call)
12+
- ParallelRunner creates N tabs, runs sub-goals via Promise.all
13+
- EvidenceMerger combines results into one coherent answer
14+
- Opt-in via `parallelTabs: { enabled: true, maxTabs: 3 }`
15+
16+
**Gen 26b — Site Pattern Learning:**
17+
18+
- Mechanical pattern extraction after successful runs (no LLM call)
19+
- Learns: cookie banner dismissal, page load timing, search URL patterns, form field sequences
20+
- Confidence-scored facts: repeated observation boosts, contradiction decays, <0.1 auto-prunes
21+
- `knowledge.clearPatterns()` to wipe learned facts, `knowledge.reset()` for full reset
22+
- Stored in `.agent-memory/knowledge/<domain>.json` — commit to repo or cache in CI
23+
24+
**Gen 28 — Multi-Model Orchestration:**
25+
26+
- `models.planner/executor/verifier/supervisor` per-role config
27+
- Each role falls back to main model when not set
28+
- Use expensive models for planning, cheap models for execution
29+
30+
**Docs:**
31+
32+
- Comprehensive README rewrite with organized ToC
33+
- All Gen 21-28 features documented with examples
34+
- Benchmark results, competitive leaderboard, SDK surface
35+
336
## 0.23.0
437

538
### Minor Changes

package.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
{
22
"name": "@tangle-network/browser-agent-driver",
3-
"version": "0.23.0",
3+
"version": "0.24.0",
44
"description": "LLM-driven browser agent for UI automation, testing, and evaluation",
55
"publishConfig": {
66
"access": "public"

0 commit comments

Comments
 (0)