Fix: auto-detect login completion, completion_check LLM judgment, prompt rule improvements by SeoYeongBaek · Pull Request #88 · gdsc-ssu/surfy

SeoYeongBaek · 2026-03-20T02:55:32Z

Summary

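Auto-detect login completion: changed human_gateway_node to async so that, after authentication (Human-in-the-Loop), it actually compares the page state (URL, DOM) before and after. If nothing has changed, it raises a re-confirmation interrupt, which removes the manual "Are you done?" question.
Added LLM judgment to completion_check: changed completion_check_node to async so that it first asks the LLM whether the anchor has been achieved. If it has not, the node reroutes to the planner to prevent false-positive early completion.
Makefile doppler fallback: branches on a _RUN variable so the project can run from a .env file even where doppler is not installed.
planner prompt: states explicitly that for ticketing/reservation/purchase tasks, lookup results alone do not achieve the anchor. Tasks must be generated up to the point just before the payment button is clicked.
report prompt: added a rule that when the user requests a specific output format such as a 'table' or 'list', the output must use that format.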

Test plan

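On a site that requires login, trigger Human-in-the-Loop → verify that the state before and after login is detected automatically
Verify that a re-confirmation interrupt is raised while login is still incomplete
Verify that a booking task is not auto-completed after the lookup and proceeds to the step of clicking the reserve button
Verify that make serve works in an environment without doppler installed
Verify that a "summarize it as a table" request makes report output a Markdown table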


rover0811 · 2026-03-20T04:25:01Z

            return {"done": True}
        if is_auth:
+            page_before = state.get("last_page_state")
+            page_after = await actor._browser.get_page_state()


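The graph node reaches into a private member directly through actor._browser. Under the hexagonal architecture rules, the graph should use only port interfaces. I think the right fix is to add a browser: BrowserPort parameter to compile_graph and change the call to await browser.get_page_state().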

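Regarding "under the hexagonal architecture rules, the graph should use only port interfaces": I implemented this incorrectly because my understanding of hexagonal architecture is lacking.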

I put this rule in AGENTS.md in case claude got it wrong, but it may have read claude.md first and never seen it.

surfy/AGENTS.md

Lines 65 to 91 in 7e6597d

### Dependency Rules (CRITICAL)

These are absolute rules. Violations break the architecture.

1. **domain/ must NOT import from adapters/**. Ever. Domain is pure.

2. **domain/services/ must depend on ports (ABC) only**. Constructor params must be port types.

3. **adapters/ may import from domain/ports and domain/models only**. Not from other adapters.

4. **graph.py imports services, not adapters**. Wiring happens in main.py.

```python
# ✅ CORRECT — service depends on port
class ActorService:
    def __init__(self, browser: BrowserPort, llm: LLMPort): ...

# ❌ WRONG — service depends on concrete adapter
class ScoutService:
    def __init__(self, agent: BrowserUseAgentAdapter): ...
```

### Adding New External Dependencies

When you need a new external system (API, database, etc.):

1. Define a port in `domain/ports/` (ABC with abstract methods)

2. Create adapter in `adapters/` implementing that port

3. Use the port type in services — never the adapter directly

4. Wire adapter → port in `main.py`
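
The review suggestion above (pass a browser: BrowserPort into compile_graph instead of reading actor._browser) could look roughly like the sketch below. Only the names BrowserPort, compile_graph, human_gateway_node, get_page_state and dom_text come from this PR; the PageState fields, the FakeBrowser adapter, the plain-dict state and the returned keys are assumptions made for illustration, and the real graph wiring is not shown.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageState:
    # Assumed shape: the PR shows `dom_text` and mentions comparing the URL.
    url: str
    dom_text: str


class BrowserPort(ABC):
    """Port the graph depends on, instead of a concrete browser adapter."""

    @abstractmethod
    async def get_page_state(self) -> PageState: ...


class FakeBrowser(BrowserPort):
    """Hypothetical adapter that exists only for this sketch."""

    async def get_page_state(self) -> PageState:
        return PageState(url="https://example.com/home", dom_text="Welcome")


def compile_graph(browser: Optional[BrowserPort] = None):
    """Return a node closure that reads page state through the port only."""

    async def human_gateway_node(state: dict) -> dict:
        if browser is None:
            # No port was wired in, so there is nothing to compare against.
            return {"page_after": None}
        return {"page_after": await browser.get_page_state()}

    return human_gateway_node


# Wiring (adapter -> port) happens at the call site, as main.py would do.
node = compile_graph(browser=FakeBrowser())
result = asyncio.run(node({}))
```

With this shape the node never touches a private attribute of the actor, and a test can pass a fake port rather than patching actor._browser.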

rover0811 · 2026-03-20T04:25:01Z

+        if llm is not None and last_page_state is not None:
+            eval_result = await llm.evaluate(
+                SuccessCriteria(description=anchor),
+                last_page_state,


If the anchor text is passed only as SuccessCriteria(description=...), all structural checks (url_contains, text_visible) are skipped and an LLM call is made every time. Please confirm whether adding one Gemini call (4-6 seconds) per completion is intended.

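At first I thought an LLM check was needed to confirm completion accurately, but the calls turned out to be more frequent than expected, and the planner already plays the same role, so the check was redundant. I will remove the LLM call and handle this with the user interrupt only.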
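
To illustrate the reviewer's point, here is a minimal sketch of why a criteria object that carries only a description leaves nothing for structural checks to decide. The field names description, url_contains and text_visible come from the review comment; the dataclass shapes and the structural_check helper are hypothetical and are not the project's actual evaluate logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageState:
    # Assumed shape for illustration.
    url: str
    dom_text: str


@dataclass
class SuccessCriteria:
    # Field names follow the review comment; the class shape is assumed.
    description: str
    url_contains: Optional[str] = None
    text_visible: Optional[str] = None


def structural_check(criteria: SuccessCriteria, page: PageState) -> Optional[bool]:
    """Return True/False when structural fields decide the outcome,
    or None when no structural field is set and only an LLM could judge."""
    checks = []
    if criteria.url_contains is not None:
        checks.append(criteria.url_contains in page.url)
    if criteria.text_visible is not None:
        checks.append(criteria.text_visible in page.dom_text)
    if not checks:
        return None
    return all(checks)


page = PageState(url="https://example.com/booking/confirm", dom_text="Reservation summary")

# Description only: nothing structural to test, so an LLM call would be needed.
only_description = SuccessCriteria(description="reach the booking confirmation page")
needs_llm = structural_check(only_description, page) is None

# With a structural field set, the outcome is decided without any LLM call.
with_structure = SuccessCriteria(description="reach the booking confirmation page",
                                 url_contains="/booking/confirm")
decided = structural_check(with_structure, page)
```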

rover0811 · 2026-03-20T04:25:01Z

+            if not login_completed:
+                result2 = interrupt(
+                    {
+                        "type": "auth_verify_failed",


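This uses a new interrupt type, auth_verify_failed, but the Extension's InterruptBlock.tsx has no UI that handles it. Its behavior under auto-approve is not defined either. If this is merged without a matching change on the Extension side, it may fall through to the fallback at runtime.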

Okay, I will fix it.

rover0811 · 2026-03-20T04:25:01Z

 	@echo "Starting surfy server..."
 	@curl -sf http://localhost:3000/api/public/health >/dev/null 2>&1 || echo "⚠️  Langfuse not running. Run 'make langfuse' first for tracing."
-	@doppler run -- uv run python main.py --serve --port 8765 &
+	@$(if $(shell command -v doppler 2>/dev/null),doppler run -- uv run python main.py --serve --port 8765,uv run --env-file .env python main.py --serve --port 8765) &


serve uses the _RUN variable, but restart handles the branch inline, so the same logic exists in two forms. Unifying restart on the _RUN variable would make maintenance easier.

I found that restart wrote the $(if ...) condition inline instead of using $(_RUN), and that its fallback command also differs from the one in serve, so I will fix it.

1. graph: removed the private actor._browser access → added a browser: BrowserPort parameter to compile_graph and now use await browser.get_page_state()
2. messages: added the auth_verify_failed type to InterruptMessageData and added UI handling for that type in the Extension's InterruptBlock.tsx
3. graph: removed the anchor-based LLM evaluation from completion_check (avoids an unnecessary Gemini call on every completion)
4. Makefile: unified restart on the _RUN variable

rover0811

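Confirmed that all 4 review feedback items have been addressed.

Changes:

browser: BrowserPort | None, with a warning log added when it is None
dom_text[:300] → [:500], widening the comparison range (to handle SPA logins)

→ Fix commit pushed; ready to merge.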

rover0811

Confirmed that the 4 review feedback items are addressed. Merge after fixing the 2 items below.

rover0811 · 2026-03-20T08:22:41Z

@@ -349,11 +351,30 @@ def human_gateway_node(state: AgentState) -> dict[str, object]:
        )


If browser is None, login detection is skipped without any warning. Add a warning log or make the parameter required.

if is_auth and browser is None: logger.warning("browser port not provided — skipping login detection")

rover0811 · 2026-03-20T08:22:41Z

+            page_before = state.get("last_page_state")
+            page_after = await browser.get_page_state()
+
+            login_completed = (


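For SPA logins (the URL stays the same and only the DOM changes), an identical 300-character header produces a false negative. This is fine for now; if issues come up, the comparison range will need to be widened.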

Suggested change

login_completed = (

or page_before.dom_text[:500] != page_after.dom_text[:500]

Fix commit (9861d58) pushed.
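
The before/after comparison discussed in this thread can be sketched as follows. The full login_completed expression is not visible in the diff excerpt, so this is an assumed reconstruction: it treats login as completed when either the URL or the leading slice of dom_text has changed. The PageState shape, the login_completed function name, the prefix_len parameter and the handling of a missing earlier snapshot are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageState:
    # Assumed shape for illustration.
    url: str
    dom_text: str


def login_completed(before: Optional[PageState], after: PageState,
                    prefix_len: int = 500) -> bool:
    """Treat login as completed if the URL or the leading DOM text changed."""
    if before is None:
        # Assumption: with no earlier snapshot there is nothing to compare,
        # so report "no change" and let the caller re-confirm with the user.
        return False
    return (
        before.url != after.url
        or before.dom_text[:prefix_len] != after.dom_text[:prefix_len]
    )


# SPA-style login: same URL, and the first 600 characters are an identical header.
header = "NAV " * 150
before = PageState("https://example.com/app", header + "Sign in")
after = PageState("https://example.com/app", header + "Hello, user")

spa_missed = login_completed(before, after)                  # change lies past 500 chars
spa_caught = login_completed(before, after, prefix_len=700)  # wider window sees it
```

The example shows the trade-off the reviewer raised: a fixed prefix is cheap to compare, but any change that sits beyond it goes undetected, so widening the window only moves the threshold rather than removing the false negative.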

fix: auto-detect login completion, completion_check LLM judgment, prompt rule improvements

21d0a0b

SeoYeongBaek requested a review from rover0811 as a code owner March 20, 2026 02:55

SeoYeongBaek changed the title ~~fix: auto-detect login completion, completion_check LLM judgment, prompt rule improvements~~ Fix: auto-detect login completion, completion_check LLM judgment, prompt rule improvements Mar 20, 2026

SeoYeongBaek added 2 commits March 20, 2026 11:57

fix: pyright type error - narrow PageState | None before passing to e…

17da942

…valuate

fix: add AsyncMock for actor._browser.get_page_state in auth test

9540da8

rover0811 reviewed Mar 20, 2026


rover0811 previously requested changes Mar 20, 2026


fix: add browser None warning, widen dom_text comparison range to 500 characters

9861d58

rover0811 merged commit 26442bf into main Mar 20, 2026
1 check passed

