… script, devtools, options, popup, and new tab functionalities - Added background script to handle messages from popup - Implemented content script with basic logging - Created devtools page with a link to the repository - Developed options page to sync and display count from popup - Built popup with increment and decrement functionality for count - Designed new tab page displaying the current time with a background image - Established side panel to show synchronized count from popup - Configured manifest for Chrome extension with necessary permissions and pages - Set up Vite configuration for building the extension - Included TypeScript configuration for type safety - Added CSS styles for various components to enhance UI
…live logs Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
…e extension Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
…ference Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
- Implemented a WebSocket server to facilitate communication between the Chrome extension and Browser.AI using the Chrome DevTools Protocol (CDP). - Created CDPBrowserContextManager to manage browser contexts and commands for each tab. - Added event handling for attaching and detaching tabs, sending CDP commands, and starting tasks. - Defined WebSocket events for client-server communication, including connection acknowledgment and task status updates. - Included detailed logging and error handling for better debugging and monitoring. - Documented the WebSocket server architecture, usage examples, and command formats in the README.
… handling - Implemented notification HTML structure and CSS styles. - Created Notification component to display messages based on URL parameters or background script messages. - Added loading spinner and dynamic notification types (user interaction, task complete, error). - Developed index file to render Notification component. - Introduced ChatInput component for user message input with send functionality. - Created ControlButtons component for task control (start, pause, resume, stop). - Implemented ExecutionLog component to display logs with different levels and metadata. - Added TaskStatus component to show current task status with visual indicators.
- Created LOG_STREAMING_FIX.md to document the problem and solution for log streaming issues between the Browser.AI server and Chrome extension. - Unified LogEvent dataclass definitions in event_adapter.py and protocol.py to ensure compatibility and consistency in data types. - Updated websocket_server.py to simplify event broadcasting and remove redundant serialization methods. - Enhanced PROTOCOL.md with detailed WebSocket communication protocol, including data structures and event definitions. - Summarized protocol implementation changes in PROTOCOL_IMPLEMENTATION_SUMMARY.md, highlighting updates to TypeScript and Python protocol files. - Added type-safe protocol definitions in protocol.ts for TypeScript and protocol.py for Python, ensuring type safety and consistency. - Developed verify_log_streaming_fix.py to validate the compatibility of LogEvent objects with the protocol and ensure proper serialization.
…Chrome APIs Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
Co-authored-by: Sathursan-S <84266926+Sathursan-S@users.noreply.github.com>
…actor WebSocket server error handling and testing
…enhance event emission and error handling in websocket server
…rowser.AI Chrome extension - Added detailed documentation on state management and persistence in STATE_MANAGEMENT.md - Summarized state management changes and benefits in STATE_MANAGEMENT_SUMMARY.md - Introduced modern UI features and components in UI_FEATURES.md - Created a visual guide for the extension's UI in UI_GUIDE.md - Enhanced user experience with a professional settings page in UX_IMPROVEMENTS.md - Improved chat interface and execution log formatting for better clarity - Integrated Chrome API for settings management and real-time updates - Optimized rendering and memory management for better performance
…d-b0a6-3dd7274a776d Add Chrome Extension with Side Panel UI, Settings Page, User-Friendly Messages, Global State Persistence, and WebSocket Control for Browser Automation
- Added VoiceConversationService to manage continuous voice interactions. - Integrated voice recognition and text-to-speech functionalities. - Implemented automatic turn-taking, silence detection, and message handling. - Enhanced UI with Live Voice Mode, including status indicators and live transcript display. - Updated ConversationMode component to support live voice interactions. - Removed deprecated VoiceInput and VoiceSettings components. - Updated CSS styles for new live voice features. - Modified agent configuration to use the latest planner model and added planner interval setting.
…b.com/Sathursan-S/Browser.AI into feat/extension-with-chatbot-and-voice
- Created comprehensive documentation for Live Voice Mode including guides, quick start, and implementation details. - Added state diagrams and user journey visualizations to enhance understanding of the Live Voice Mode workflow. - Introduced a new service `VoiceConversation.ts` to manage the voice interaction flow. - Enhanced UI components to support Live Voice Mode with real-time transcript display and animated status indicators. - Implemented error handling and configuration options for a better user experience.
…composing and sending
…b.com/Sathursan-S/Browser.AI into feat/extension-with-chatbot-and-voice
…with-chatbot-and-voice
- Removed deprecated styles and components related to conversation header and buttons. - Introduced new input field structure with integrated voice and send buttons. - Enhanced voice button functionality to handle live and manual voice input modes. - Updated button states and titles for better user experience. - Improved styling for input fields, buttons, and live voice status to ensure consistency and clarity.
…b.com/Sathursan-S/Browser.AI into feat/extension-with-chatbot-and-voice
… ChatInput components; remove unused ConversationView files
- Introduced new PNG icons in various sizes (16x16, 32x32, 48x48, 128x128) for the browser AI extension. - Icons are added to the public icons and img directories to enhance the visual representation of the extension.
…b.com/Sathursan-S/Browser.AI into feat/extension-with-chatbot-and-voice
…nd visual guides - Added Voice Input (Speech-to-Text) and Speech Output (Text-to-Speech) services - Integrated UI components for microphone and speech toggle functionality - Enhanced styling for a professional appearance and smooth user experience - Created comprehensive documentation including implementation details, usage examples, and error handling - Developed visual guides and workflow diagrams for user interactions and troubleshooting - Established a memory tracking pattern for intelligent website selection workflows
- Refactor `Controller` by moving actions to `browser_ai/actions/` with categorized modules (navigation, interaction, extraction, utility). - Refactor `Agent` by extracting media generation logic to `browser_ai/agent/media.py`. - Refactor `browser_ai_gui` by moving `TaskManager` to `browser_ai_gui/services/task_manager.py`. - Fix `EventType` usage in `test_extension_server.py`. - General cleanup of imports and structure.
- Unify Chat and Agent modes into a single seamless interface. - Implement "Smart Steps" visualization to group technical logs into high-level task steps. - Add Light/Dark mode theming with a clean "Copilot-like" default. - Inject a floating status overlay into the active web page. - Update Tailwind configuration and React components to support the new design.
… cross-platform compatibility
- Unify Chat and Agent modes into a single seamless interface. - Implement "Smart Steps" visualization to group technical logs into high-level task steps. - Add Light/Dark mode theming with a clean "Copilot-like" default. - Inject a floating status overlay into the active web page. - Add "Concentration Mode" with Jarvis-style voice visualization. - Add Sticky Task Status header that expands on completion. - Update Tailwind configuration and React components to support the new design.
Refactor: Clean code and restructure folders
- Removed unused ChatMessage component. - Updated ConversationMode.css for improved styling and added visual effects. - Enhanced ConversationMode.tsx to include audio visualization logic using Web Audio API. - Modified TaskStatusHeader to remove auto-expand functionality and improve user control. - Introduced VoiceVisualizer component for dynamic audio level visualization during voice interactions. - Added task and chat history management functions in state.ts for persistent storage.
Jules ux redesign 2
Walkthrough
This PR introduces significant enhancements to Browser.AI, including intelligent website selection, a comprehensive event bus system, location detection services, and a full Chrome extension with real-time chat and voice capabilities. Additionally, documentation is restructured and environment/build configurations are expanded.
Changes
Sequence Diagrams
sequenceDiagram
participant User
participant Extension as Chrome Extension<br/>(SidePanel)
participant Server as Browser.AI<br/>Server
participant Agent as Agent Service
participant LLM as LLM (Gemini)
participant Browser as Browser<br/>Controller
User->>Extension: Enter task
Extension->>Server: emit start_task (task, CDP endpoint)
Server->>Agent: Initialize with task
Agent->>LLM: Get initial planning
LLM-->>Agent: Plan steps
loop Agent Execution Loop
Agent->>Agent: Get state (screenshots, DOM)
Agent->>LLM: Decide next action
LLM-->>Agent: Return action (type, params)
Agent->>Browser: Execute action
Browser-->>Agent: Action result
Agent->>Server: emit log_event (step completed)
Server->>Extension: Broadcast log_event
Extension->>Extension: Update UI (step status)
end
Agent->>LLM: Final result assessment
LLM-->>Agent: Task complete
Agent->>Server: emit task_completed
Server->>Extension: Broadcast status (completed)
Extension->>User: Display result & offer GIF download
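The execution loop in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Action`, `run_agent_loop`, and the `decide`, `execute`, and `emit` callables are hypothetical names standing in for the LLM call, the browser controller, and the server's event emitter. Only the event names `log_event` and `task_completed` come from the diagram.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Action:
    """Hypothetical action record: a type plus free-form parameters."""
    type: str
    params: dict = field(default_factory=dict)


def run_agent_loop(
    task: str,
    decide: Callable[[str, int], Action],
    execute: Callable[[Action], Any],
    emit: Callable[[str, dict], None],
    max_steps: int = 10,
) -> bool:
    """Repeat decide -> execute -> emit until `decide` returns a 'done' action.

    Returns True if the task finished, False if max_steps ran out first.
    """
    for step in range(1, max_steps + 1):
        action = decide(task, step)
        if action.type == "done":
            emit("task_completed", {"task": task, "steps": step - 1})
            return True
        result = execute(action)
        emit("log_event", {"step": step, "action": action.type, "result": result})
    return False


# Usage with stubbed collaborators (no LLM or browser involved).
events = []
script = [Action("click", {"index": 3}), Action("done")]
finished = run_agent_loop(
    task="open the pricing page",
    decide=lambda task, step: script[step - 1],
    execute=lambda action: "ok",
    emit=lambda name, payload: events.append((name, payload)),
)
```

Passing the collaborators in as callables keeps the loop testable without a live model or browser, which may also help when reviewing the event order the extension depends on.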
sequenceDiagram
participant User
participant Extension as Extension:<br/>ConversationMode
participant VoiceRec as Voice<br/>Recognition
participant VoiceConv as Voice<br/>Conversation Service
participant TTS as Text-to-<br/>Speech
participant Server as Server
participant Chatbot as Chatbot<br/>Service
User->>Extension: Click "Start Live Voice"
Extension->>VoiceRec: Initialize speech recognition
VoiceRec->>VoiceRec: Start listening
User->>VoiceRec: Speak task intent
VoiceRec->>VoiceConv: Transcript received (interim/final)
VoiceConv->>Server: Send user message
Server->>Chatbot: Process message
Chatbot->>Chatbot: Generate response + detect intent
Chatbot-->>Server: Response + intent
Server->>Extension: Emit chat_response
Extension->>TTS: Synthesize bot response
TTS->>User: Play audio
alt If intent detected (task ready)
Chatbot-->>Server: Task detected
Server->>Extension: Emit intent (task_description, is_ready)
Extension->>Extension: Show "Ready to automate" prompt
User->>Extension: Confirm automation
Extension->>Server: emit start_clarified_task
Server->>Agent: Start task execution
else
VoiceConv->>VoiceRec: Resume listening
User->>VoiceRec: Continue conversation
end
sequenceDiagram
participant User as User/<br/>Website
participant Agent as Agent
participant Controller as Controller
participant LocationDet as Location<br/>Detector
participant WebsiteFinder as Find Best<br/>Website Action
participant Navigator as Search<br/>Navigation
participant Ecommerce as Search<br/>Ecommerce Action
Agent->>Agent: Detect shopping task
Agent->>Controller: Execute find_best_website
Controller->>LocationDet: Detect user location
LocationDet->>LocationDet: Query geolocation API
LocationDet-->>Controller: Location (country, currency, region)
Controller->>WebsiteFinder: find_best_website(purpose, category)
WebsiteFinder->>Navigator: search_google_with_ai (location-aware query)
Navigator->>User: Open Google + AI mode
User-->>Navigator: Render AI search results
Navigator-->>WebsiteFinder: Extract top websites
WebsiteFinder-->>Controller: Recommended website + context
Controller->>Ecommerce: search_ecommerce(query, site=recommended)
Ecommerce->>Navigator: Navigate to website + currency context
Navigator->>User: Load ecommerce site
User-->>Navigator: Render products
Navigator-->>Ecommerce: Product list + page content
Ecommerce-->>Agent: Search completed with results
Agent->>Agent: Continue with product selection/purchase
Estimated code review effort
🎯 5 (Critical) | ⏱️ ~120 minutes
Complexity factors:
Areas requiring extra attention:
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 5
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
browser_ai/agent/prompts.py (1)
125-265: Fix inconsistent rule numbering in `important_rules()`.
The rule numbers are inconsistent and duplicated:
- Line 125: "6. SEARCH STRATEGIES"
- Line 149: "7. LOCATION-AWARE SHOPPING"
- Line 166: "7. INTELLIGENT WEBSITE SELECTION" (duplicate 7)
- Line 186: "8. FAST PRODUCT RESULTS"
- Line 197: "9. MULTI-SITE SEARCH STRATEGY"
- Line 206: "10. TASK COMPLETION"
- Line 228: "9. VISUAL CONTEXT" (should be 11)
- Line 236: "10. Form filling" (should be 12)
- Line 239: "11. ACTION SEQUENCING" (should be 13)
- Line 254: "9. Long tasks" (should be 14)
- Line 258: "10. SCROLLING BEHAVIOR" (should be 15)
- Line 266: "11. Extraction" (should be 16)
- Line 269: "12. DOCUMENT DOWNLOADING" (should be 17)
- Line 283: "13. EMAIL SENDING" (should be 18)
This could confuse the LLM when referencing specific rules.
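A small check along these lines could catch this kind of drift before it reaches the prompt. It is a minimal sketch, not part of the PR: `find_numbering_problems` is a hypothetical helper, and it assumes every rule heading is a line of the form `N. TITLE`. Numbered sub-items inside a rule body would also match and would need filtering in real use.

```python
import re

# Matches a heading such as "7. LOCATION-AWARE SHOPPING" at the start of a line.
RULE_HEADING = re.compile(r"^\s*(\d+)\.\s+\S")


def find_numbering_problems(prompt_text):
    """Return (line_no, found, expected) for headings that break a consecutive sequence."""
    problems = []
    expected = None
    for line_no, line in enumerate(prompt_text.splitlines(), start=1):
        match = RULE_HEADING.match(line)
        if match is None:
            continue
        number = int(match.group(1))
        if expected is None:
            expected = number  # the first heading sets the starting number
        if number != expected:
            problems.append((line_no, number, expected))
        expected += 1
    return problems


sample = "6. SEARCH STRATEGIES\nsome rule text\n7. LOCATION-AWARE SHOPPING\n7. INTELLIGENT WEBSITE SELECTION\n8. FAST PRODUCT RESULTS\n"
problems = find_numbering_problems(sample)
```

Because the expected number advances by one per heading, a single duplicate shifts every later heading, so the report lists each heading that needs renumbering rather than only the first offender.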
🟠 Major comments (27)
browser_ai_extension/browse_ai/src/sidepanel/components/Visuals/VoiceVisualizer.tsx-17-142 (1)
17-142: Stale closure: `audioLevel` changes won't update the visualization.
The `draw()` function captures `audioLevel` from the closure when `useEffect` runs. Since `audioLevel` is not in the dependency array (line 142), the animation always uses the initial, stale value and won't respond to real-time audio level changes, which defeats the visualizer's purpose.
Use a ref to hold the current `audioLevel` value so the animation loop always reads the latest value without restarting:
 export const VoiceVisualizer: React.FC<VoiceVisualizerProps> = ({
   isListening,
   isSpeaking,
   audioLevel = 0,
 }) => {
   const canvasRef = useRef<HTMLCanvasElement>(null)
   const animationRef = useRef<number>()
+  const audioLevelRef = useRef(audioLevel)
+
+  // Keep ref in sync with prop
+  useEffect(() => {
+    audioLevelRef.current = audioLevel
+  }, [audioLevel])

   useEffect(() => {
     const canvas = canvasRef.current
     ...
     const draw = () => {
+      const currentAudioLevel = audioLevelRef.current
       ...
-      const dynamicScale = isListening && audioLevel > 0.01 ? 1 + audioLevel * 0.5 : 1
+      const dynamicScale = isListening && currentAudioLevel > 0.01 ? 1 + currentAudioLevel * 0.5 : 1
       // Apply the same pattern to all other audioLevel usages in draw()
       ...
     }
     ...
-  }, [isListening, isSpeaking])
+  }, [isListening])
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/components/CustomScroll.tsx-16-58 (1)
16-58: Fix global style tag lifecycle to support multiple mounted instances
When multiple `CustomScroll` instances are mounted concurrently, each creates its own `styleSheet` reference but all share the same `#custom-scroll-styles` element. The cleanup logic uses instance-specific reference comparison (`existingStyle === styleSheet`), which causes the first instance to remove the style when unmounting, breaking scrollbar styling for the remaining instances.
Since scrollbar CSS is global and should exist once per page, replace the per-instance lifecycle with idempotent creation:
-  React.useEffect(() => {
-    const scrollElement = scrollRef.current
-    if (!scrollElement) return
-
-    // Add custom scrollbar styles via JavaScript
-    const styleSheet = document.createElement('style')
-    styleSheet.textContent = `
-      .custom-scroll::-webkit-scrollbar {
-        width: 8px;
-      }
-
-      .custom-scroll::-webkit-scrollbar-track {
-        background: rgba(255, 255, 255, 0.05);
-        border-radius: 4px;
-      }
-
-      .custom-scroll::-webkit-scrollbar-thumb {
-        background: rgba(255, 255, 255, 0.2);
-        border-radius: 4px;
-        transition: background-color 0.2s ease;
-      }
-
-      .custom-scroll::-webkit-scrollbar-thumb:hover {
-        background: rgba(255, 255, 255, 0.3);
-      }
-
-      .custom-scroll::-webkit-scrollbar-corner {
-        background: transparent;
-      }
-    `
-
-    if (!document.head.querySelector('#custom-scroll-styles')) {
-      styleSheet.id = 'custom-scroll-styles'
-      document.head.appendChild(styleSheet)
-    }
-
-    return () => {
-      const existingStyle = document.head.querySelector('#custom-scroll-styles')
-      if (existingStyle && existingStyle === styleSheet) {
-        document.head.removeChild(styleSheet)
-      }
-    }
-  }, [])
+  React.useEffect(() => {
+    if (typeof document === 'undefined') return
+
+    if (!document.head.querySelector('#custom-scroll-styles')) {
+      const styleSheet = document.createElement('style')
+      styleSheet.id = 'custom-scroll-styles'
+      styleSheet.textContent = `
+        .custom-scroll::-webkit-scrollbar {
+          width: 8px;
+        }
+
+        .custom-scroll::-webkit-scrollbar-track {
+          background: rgba(255, 255, 255, 0.05);
+          border-radius: 4px;
+        }
+
+        .custom-scroll::-webkit-scrollbar-thumb {
+          background: rgba(255, 255, 255, 0.2);
+          border-radius: 4px;
+          transition: background-color 0.2s ease;
+        }
+
+        .custom-scroll::-webkit-scrollbar-thumb:hover {
+          background: rgba(255, 255, 255, 0.3);
+        }
+
+        .custom-scroll::-webkit-scrollbar-corner {
+          background: transparent;
+        }
+      `
+      document.head.appendChild(styleSheet)
+    }
+  }, [])
browser_ai_extension/browse_ai/LICENSE-1-21 (1)
1-21: Replace the `**` placeholder with the actual copyright holder
The MIT text is correct, but line 3 still has `**` as a placeholder. Replace this with the actual copyright owner name (individual or organization) so the license is legally clear and unambiguous.
browser_ai_extension/browse_ai/src/popup/Popup.tsx-4-58 (1)
4-58: Replace boilerplate popup with actual Browser.AI functionality.
This appears to be template/boilerplate code from `create-chrome-ext` with a counter demo. The popup should reflect actual Browser.AI features (e.g., status display, quick actions, settings) rather than a counter interface with a generator watermark.
Do you want me to suggest a popup design aligned with the Browser.AI agent functionality described in the PR?
browser_ai_extension/browse_ai/src/devtools/DevTools.css-90-94 (1)
90-94: Remove duplicate `main` selector.
The `main` element is already styled at lines 8-17 with different properties. This duplicate definition will override previous styles and cause conflicts (e.g., `padding: 1em` vs `padding: 32px 20px`).
-main {
-  text-align: center;
-  padding: 1em;
-  margin: 0 auto;
-}
If center alignment is needed, add it to the first `main` block (lines 8-17).
browser_ai_extension/browse_ai/src/sidepanel/components/ExecutionLog.tsx-152-152 (1)
152-152: Remove `console.log` statement.
Debug logging should be removed from production code to avoid console clutter and potential information leakage.
Apply this diff:
 const formatUserMessage = (log: LogEvent): string => {
   // In dev mode, return original message
   if (devMode) return log.message

-  console.log('Formatting user message:', log)
-
   const msg = log.message.toLowerCase()
browser_ai/agent/prompts.py-430-442 (1)
430-442: Avoid mutable default argument.
Using `[]` as a default argument is a Python anti-pattern because the same list instance is shared across all calls.
 def __init__(
     self,
     state: BrowserState,
     result: Optional[List[ActionResult]] = None,
-    include_attributes: list[str] = [],
+    include_attributes: Optional[list[str]] = None,
     max_error_length: int = 400,
     step_info: Optional[AgentStepInfo] = None,
 ):
     self.state = state
     self.result = result
     self.max_error_length = max_error_length
-    self.include_attributes = include_attributes
+    self.include_attributes = include_attributes if include_attributes is not None else []
     self.step_info = step_info
browser_ai_extension/browse_ai/src/notification/index.tsx-6-7 (1)
6-7: Add runtime check for missing notification-root element.
Unlike `devtools/index.tsx`, this code uses a non-null assertion without first verifying the element exists. If `notification-root` is missing from `notification.html`, this will throw a cryptic error at runtime.
Apply this diff to add proper error handling:
-const container = document.getElementById('notification-root')
-const root = createRoot(container!)
+const container = document.getElementById('notification-root')
+if (!container) {
+  throw new Error(
+    'Notification root element #notification-root not found. Ensure notification.html includes this element.',
+  )
+}
+const root = createRoot(container)
browser_ai_extension/browse_ai/notification.html-6-6 (1)
6-6: Fix broken icon path that references non-existent file.
The notification.html references `/img/logo-48.png`, which does not exist in the repository. Devtools.html and options.html use `/icons/logo.ico`. The actual icon files are located at `public/icons/logo.ico` and `public/icons/logo.svg`. Update notification.html to use the correct icon path, consistent with the other files.
browser_ai_extension/browse_ai/src/sidepanel/components/ExecutionLog.css-1-36 (1)
1-36: Remove duplicate style definitions.
Lines 1-36 define a complete set of styles that are immediately overridden by lines 37-318. All class names (`.execution-log-container`, `.execution-log-header`, `.log-badge`, etc.) are defined twice with different values, making the first 36 lines dead code that will never be applied.
This appears to be an incomplete refactoring from dark to light theme. Consider:
- Option 1 (recommended): Remove lines 1-36 entirely and use CSS custom properties with theme classes for dark/light mode support (similar to the pattern in `browser_ai_extension/browse_ai/src/sidepanel/index.css`).
- Option 2: Keep only lines 37-318 if light theme is the final design direction.
Apply this diff to remove the duplicate definitions:
-.execution-log-container { display:flex; flex-direction:column; height:100%; background: rgba(255,255,255,0.02); border-radius:12px; overflow:hidden; }
-.execution-log-header { display:flex; align-items:center; justify-content:space-between; padding:12px 16px; border-bottom:1px solid rgba(255,255,255,0.02); }
-.log-header-title { display:flex; align-items:center; gap:10px; color: rgba(255,255,255,0.9); }
-.log-header-title svg { color: #8aa0ff; filter: drop-shadow(0 6px 18px rgba(2,6,23,0.6)); }
-.log-header-title h3 { font-size:14px; font-weight:600; margin:0; }
-.log-count { display:inline-flex; align-items:center; justify-content:center; min-width:24px; height:20px; padding:0 6px; background:#667eea; color:white; font-size:11px; font-weight:600; border-radius:10px; }
-.log-clear-btn { display:flex; align-items:center; gap:4px; padding:6px 12px; background: transparent; border:1px solid rgba(255,255,255,0.04); border-radius:6px; font-size:12px; font-weight:500; color: rgba(255,255,255,0.8); cursor:pointer; transition: all 0.2s ease; }
-.log-clear-btn:hover { background: rgba(255,255,255,0.03); border-color: rgba(255,255,255,0.06); color: white; }
-.execution-log-content { flex:1; overflow-y:auto; background: transparent; }
-.log-empty-state { display:flex; flex-direction:column; align-items:center; justify-content:center; height:100%; padding:40px 20px; color: rgba(255,255,255,0.6); text-align:center; }
-.log-empty-state svg { margin-bottom:16px; opacity:0.5; }
-.log-empty-state p { font-size:15px; font-weight:600; color: rgba(255,255,255,0.9); margin:0 0 6px 0; }
-.log-empty-state span { font-size:13px; color: rgba(255,255,255,0.6); }
-.log-entries { padding:12px; }
-.log-entry { position:relative; background: rgba(255,255,255,0.02); border:1px solid rgba(255,255,255,0.03); border-radius:8px; padding:12px 14px; margin-bottom:8px; transition: all 0.2s ease; opacity:0; animation: slideIn 0.3s ease forwards; }
-.log-entry:hover { box-shadow: 0 6px 18px rgba(2,6,23,0.6); border-color: rgba(255,255,255,0.06); }
-.log-entry-header { display:flex; align-items:center; gap:8px; margin-bottom:8px; }
-.log-icon { font-size:16px; line-height:1; }
-.log-timestamp { font-size:11px; color: rgba(255,255,255,0.6); font-weight:500; font-family: 'Courier New', monospace; }
-.log-badge { font-size:10px; font-weight:600; padding:2px 8px; border-radius:4px; text-transform:uppercase; letter-spacing:0.5px; }
-.log-badge.log-info { background: rgba(138,160,255,0.12); color: #8aa0ff; }
-.log-badge.log-error { background: rgba(255,120,120,0.12); color: #ff9a9a; }
-.log-badge.log-warning { background: rgba(255,210,120,0.08); color: #fbbf24; }
-.log-badge.log-result { background: rgba(34,197,94,0.08); color: #86efac; }
-.log-badge.log-debug { background: rgba(255,255,255,0.02); color: rgba(255,255,255,0.8); }
-.log-message { font-size:13px; line-height:1.6; color: rgba(255,255,255,0.9); word-wrap: break-word; white-space: pre-wrap; }
-.log-entry.log-error { border-left: 3px solid rgba(255,120,120,0.9); }
-.log-entry.log-warning { border-left: 3px solid rgba(255,210,120,0.9); }
-.log-entry.log-result { border-left: 3px solid rgba(16,185,129,0.9); }
-.log-entry.log-step-entry { border-left: 3px solid #667eea; }
-.log-metadata { margin-top:10px; padding:8px 10px; background: rgba(255,255,255,0.02); border-radius:4px; border:1px solid rgba(255,255,255,0.02); font-size:11px; }
-.metadata-item { display:flex; gap:6px; padding:2px 0; font-family: 'Courier New', monospace; }
-.metadata-key { color: rgba(255,255,255,0.6); font-weight:600; }
-.metadata-value { color: rgba(255,255,255,0.9); word-break:break-all; }
-
-@keyframes slideIn { from { opacity:0; transform: translateY(10px);} to { opacity:1; transform: translateY(0);} }
 .execution-log-container {
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/manifest.ts-4-59 (1)
4-59: Update `@crxjs/vite-plugin` to the latest stable version to resolve type mismatches.
The manifest contains 15 `@ts-ignore` directives suppressing type checking. The project currently uses version `2.0.0-beta.26`; upgrading to the latest stable release `2.2.1` (Oct 2025) should resolve these type mismatches, as the stable version has improved TypeScript support.
After updating the dependency, remove the `@ts-ignore` comments and address any remaining type errors with explicit type assertions if needed.
browser_ai_extension/browse_ai/src/options/Options.tsx-94-120 (1)
94-120: Network requests lack timeout handling.
The `fetch` calls at lines 99 and 136 have no timeout. If the server is unresponsive, the UI will hang indefinitely in the "loading" or "saving" state.
Add a timeout using `AbortController`:
 const loadServerConfig = async () => {
   setConfigStatus('loading')
   setConnectionStatus('connecting')

+  const controller = new AbortController()
+  const timeoutId = setTimeout(() => controller.abort(), 10000) // 10s timeout
   try {
-    const response = await fetch(`${settings.serverUrl}/api/config`)
+    const response = await fetch(`${settings.serverUrl}/api/config`, {
+      signal: controller.signal,
+    })
+    clearTimeout(timeoutId)
     // ...
   } catch (error) {
+    clearTimeout(timeoutId)
     console.error('Failed to load server config:', error)
     // ...
   }
 }
Also applies to: 122-161
browser_ai_extension/browse_ai/src/sidepanel/components/ConversationMode.tsx-319-336 (1)
319-336: Stale closure: `messages` array captured at callback registration time.
The `setMessages([...messages, userMessage])` inside the callback uses the `messages` value from when `toggleLiveVoiceMode` was called, not the current state. Use a functional update to avoid stale state.
 // Message ready callback - auto send
 (message: string) => {
   console.log('🎙️ Auto-sending message:', message)

   // Add user message to chat
   const userMessage: Message = {
     role: 'user',
     content: message,
     timestamp: new Date().toISOString(),
   }
-  setMessages([...messages, userMessage])
+  setMessages((prevMessages) => [...prevMessages, userMessage])
   setIsProcessing(true)
   isWaitingForResponseRef.current = true
   setLiveTranscript('')

   // Send to backend
   socket.emit('chat_message', { message })
 },
browser_ai/location_service.py-269-318 (1)
269-318: Location detection navigates away from user's current page.
`detect_location_from_browser` uses `page.goto()` to navigate to ipapi.co, which replaces the user's current page content. Consider opening a new tab, detecting location, then closing it to avoid disrupting the user's browsing session.
 async def detect_location_from_browser(self, browser_context) -> Optional[LocationInfo]:
     try:
         page = await browser_context.get_current_page()
+        original_url = page.url

         # Use a geolocation detection service
         await page.goto("https://ipapi.co/json/", wait_until="networkidle")
-        await page.wait_for_load_state("networkidle")

         # ... extraction logic ...

+        # Navigate back to original page if it wasn't blank
+        if original_url and original_url != "about:blank":
+            await page.goto(original_url, wait_until="load")
+
         return location_info
Alternatively, use a new tab pattern similar to `search_google_with_ai`.
Committable suggestion skipped: line range outside the PR's diff.
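For the new-tab alternative, a rough sketch might look like the following. It assumes a hypothetical `open_tab` coroutine that returns a page-like object exposing `goto`, `inner_text`, and `close` (Playwright's `Page` offers methods with these names); how Browser.AI's own context wrapper opens a tab is not shown in this review, so that part is a stand-in. `FakePage` exists only to make the example runnable without a browser.

```python
import asyncio
import json


async def detect_location_in_new_tab(open_tab, url="https://ipapi.co/json/"):
    """Load the geolocation JSON in a temporary tab and always close it afterwards.

    `open_tab` is a hypothetical async factory for a page-like object; the
    user's current page is never navigated.
    """
    page = await open_tab()
    try:
        await page.goto(url)
        raw = await page.inner_text("body")
        return json.loads(raw)
    except json.JSONDecodeError:
        return None  # the service returned something that is not JSON
    finally:
        await page.close()  # runs on success and on failure


class FakePage:
    """Stand-in for a browser tab so the sketch runs without a browser."""

    def __init__(self, body):
        self.body = body
        self.visited = []
        self.closed = False

    async def goto(self, url):
        self.visited.append(url)

    async def inner_text(self, selector):
        return self.body

    async def close(self):
        self.closed = True


demo_page = FakePage('{"country_name": "Exampleland", "currency": "EXC"}')


async def open_demo_tab():
    return demo_page


location = asyncio.run(detect_location_in_new_tab(open_demo_tab))
```

Putting `close()` in `finally` means the temporary tab is released even when the request fails, which avoids the extra round trip of navigating the user's page away and back.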
browser_ai/actions/navigation.py-184-190 (1)
184-190: Use browser.navigate_to() instead of direct page.goto() for consistent URL validation.
The go_to_url function bypasses the allowlist security checks implemented in BrowserContext.navigate_to(). Replace line 186 with await browser.navigate_to(params.url) to enforce URL validation if allowed_domains are configured.
browser_ai_extension/browse_ai/src/sidepanel/components/ConversationMode.tsx-514-521 (1)
514-521: Incomplete cleanup on unmount: MediaStream tracks not stopped.
The unmount cleanup only closes the AudioContext but doesn't stop the MediaStream tracks, which keeps the microphone indicator active.
  // Cleanup audio on unmount
  useEffect(() => {
    return () => {
+     if (streamRef.current) {
+       streamRef.current.getTracks().forEach(track => track.stop())
+     }
      if (audioContextRef.current) {
        audioContextRef.current.close()
      }
    }
  }, [])
browser_ai_extension/browse_ai/src/sidepanel/components/ConversationMode.tsx-436-512 (1)
436-512: Audio stream not stopped on cleanup; potential resource leak.
The MediaStream obtained from getUserMedia is never stopped. When cleaning up, the stream's tracks should be stopped to release the microphone.
+ const streamRef = useRef<MediaStream | null>(null)
+
  // Audio analysis for visualization
  useEffect(() => {
    if ((isListening || isLiveVoiceMode) && !audioContextRef.current) {
      const initAudio = async () => {
        try {
          const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
+         streamRef.current = stream
          // ... rest of setup ...
        } catch (err) {
          console.error('Error initializing audio visualization:', err)
        }
      }
      initAudio()
    } else if (!isListening && !isLiveVoiceMode && audioContextRef.current) {
      // Cleanup
      if (animationFrameRef.current) {
        cancelAnimationFrame(animationFrameRef.current)
      }
+     if (streamRef.current) {
+       streamRef.current.getTracks().forEach(track => track.stop())
+       streamRef.current = null
+     }
      if (sourceRef.current) {
        sourceRef.current.disconnect()
      }
browser_ai/agent/media.py-154-156 (1)
154-156: regular_font.path may not exist on default fonts.
When fonts fail to load (lines 67-71), ImageFont.load_default() is used, which doesn't have a .path attribute. Line 154 then attempts regular_font.path, which will raise an AttributeError.
Add a fallback or skip the larger font creation when using default fonts:
+ # Check if we can create a larger font
+ if hasattr(regular_font, 'path'):
      larger_font = ImageFont.truetype(
          regular_font.path, regular_font.size + 16
      )
+ else:
+     larger_font = regular_font  # Use default font as-is
browser_ai/agent/media.py-67-71 (1)
67-71: goal_font is assigned but never used.
The variable goal_font is loaded from the font file but never referenced anywhere in the code. This is dead code.
Remove the unused variable or use it where intended (perhaps in _add_overlay_to_image?):
  regular_font = ImageFont.truetype(font_name, font_size)
  title_font = ImageFont.truetype(font_name, title_font_size)
- goal_font = ImageFont.truetype(font_name, goal_font_size)
  font_loaded = True
  break
Based on Ruff hint (F841).
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/sidepanel/SidePanel.tsx-240-271 (1)
240-271: task_started handler saves potentially stale state to history.
The handler captures taskStatus, taskResult, logs, mode, and messages at socket creation time due to stale closures. When task_started fires later, these values may be outdated, causing incorrect history entries.
Use refs or functional state updates to capture current values:
  newSocket.on('task_started', (data: { message: string }) => {
+   // Use functional updates to access current state
-   if (taskStatus.current_task || taskResult || logs.length > 0) {
-     setTaskHistory((prev) => [
-       ...prev,
-       {
-         task: taskStatus.current_task || 'Unknown Task',
-         result: taskResult,
-         logs: [...logs],
+   setTaskHistory((prev) => {
+     // Access current values via refs if needed
+     return [...prev, /* ... */]
+   })
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/sidepanel/SidePanel.tsx-175-332 (1)
175-332: Socket effect has missing dependencies and potential stale closure issues.
The socket connection useEffect at line 176-332 has several concerns:
- Missing dependencies: The effect uses mode, messages, and taskStatus but doesn't include them in the dependency array. This causes stale closure issues where event handlers capture outdated values.
- Stale mode and messages: The connect handler (line 192) and task_started handler (lines 240-265) reference mode and messages, which won't update when these values change.
- Stale taskStatus: The task_started handler references taskStatus (line 242) but it's not in deps.
Consider using refs for values needed in socket callbacks, or restructuring the effect:
+ const modeRef = useRef(mode)
+ const messagesRef = useRef(messages)
+ const taskStatusRef = useRef(taskStatus)
+
+ useEffect(() => { modeRef.current = mode }, [mode])
+ useEffect(() => { messagesRef.current = messages }, [messages])
+ useEffect(() => { taskStatusRef.current = taskStatus }, [taskStatus])
  // Then in socket handlers, use modeRef.current instead of mode
Committable suggestion skipped: line range outside the PR's diff.
browser_ai/agent/media.py-326-444 (1)
326-444: _create_frame function appears to be dead code with deprecated API usage.
This function:
- Is never called anywhere in the file
- Uses deprecated draw.textsize() (removed in Pillow 10.0.0)
- Has unused local variables (max_text_width, title_font)
- Uses a bare except: clause
Consider removing this dead code entirely, or if it's needed for future use, fix the deprecated API calls and issues:
- def _create_frame(
-     screenshot: str,
-     text: str,
-     step_number: int,
-     width: int = 1200,
-     height: int = 800,
- ) -> Image.Image:
-     ...entire function...
Based on Ruff hints (F841, E722) and Pillow API deprecation.
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/services/VoiceConversation.ts-234-241 (1)
234-241: Infinite restart loop possible on persistent recognition errors.
If startListening continuously fails (e.g., microphone permission denied), the error handler will keep retrying every 1 second indefinitely. This could drain battery and spam logs.
Add a retry limit or exponential backoff:
+ private recognitionRetryCount: number = 0
+ private static readonly MAX_RECOGNITION_RETRIES = 3

  (error: string) => {
    console.error('Recognition error:', error)
    if (this.onError) this.onError(error)
    // On error, try to restart if still active
-   if (this.isActive) {
+   if (this.isActive && this.recognitionRetryCount < VoiceConversationService.MAX_RECOGNITION_RETRIES) {
+     this.recognitionRetryCount++
      setTimeout(() => {
        if (this.isActive) {
          this.startListening()
        }
      }, 1000)
+   } else if (this.isActive) {
+     this.stop()
+     if (this.onError) this.onError('Max recognition retries exceeded')
    }
  },
Committable suggestion skipped: line range outside the PR's diff.
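The "retry limit or exponential backoff" idea can be illustrated independently of the extension code. The following is a minimal Python sketch, not the service's implementation: `retry_with_backoff`, its parameters, and the injectable `sleep` are hypothetical names introduced here for illustration.

```python
import time
from typing import Callable


def retry_with_backoff(
    operation: Callable[[], bool],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `operation` once, then retry up to `max_retries` times.

    The wait before retry k (0-based) is base_delay * 2**k, so the delays
    grow as 1s, 2s, 4s, ... instead of a fixed 1s forever.
    Returns True on the first success, False once retries are exhausted.
    """
    if operation():
        return True
    for attempt in range(max_retries):
        sleep(base_delay * (2 ** attempt))  # exponential backoff
        if operation():
            return True
    return False  # caller should stop and surface a final error
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller that receives `False` would stop the session and report that the retry budget is exhausted, which mirrors the `else if` branch in the suggestion above.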
browser_ai/agent/media.py-32-40 (1)
32-40: Duplicate validation checks for history.
Lines 32-34 check if history.history is empty, and lines 38-40 check the same condition again. The second check is redundant.
Consolidate the checks:
  def create_history_gif(...) -> None:
      """Create a GIF from the agent's history with overlaid task and goal text."""
-     if not history.history:
-         logger.warning("No history to create GIF from")
-         return
-
-     images = []
-     # if history is empty or first screenshot is None, we can't create a gif
-     if not history.history or not history.history[0].state.screenshot:
+     if not history.history or not history.history[0].state.screenshot:
          logger.warning("No history or first screenshot to create GIF from")
          return
+
+     images = []
browser_ai/agent/service.py-62-62 (1)
62-62: Avoid mutable default argument: Controller() called in parameter defaults.
Calling Controller() in the function signature creates a single shared instance across all calls where no controller is provided. This can lead to unexpected state sharing between Agent instances.
- controller: Controller = Controller(),
+ controller: Controller | None = None,
Then initialize within __init__:
  self.controller = controller if controller is not None else Controller()
browser_ai/agent/service.py-90-90 (1)
90-90: Unused parameter: tool_call_in_content.
This parameter is accepted but never used in the constructor or elsewhere. Either implement its functionality or remove it to avoid confusion.
browser_ai/agent/service.py-453-463 (1)
453-463: Critical: return inside finally block silences exceptions.
The return statement on line 460 inside the finally block will silence any exceptions that were raised in the try or except blocks. This can mask errors and make debugging difficult.
Additionally, the actions variable (line 454-458) is assigned but never used.
  finally:
-     actions = (
-         [a.model_dump(exclude_unset=True) for a in model_output.action]
-         if model_output
-         else []
-     )
      if not result:
-         return
+         result = []
      if state:
          self._make_history_item(model_output, state, result)
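The behavior behind this finding is easy to reproduce in isolation. The sketch below is a standalone illustration with hypothetical function names (`swallow`, `propagate`), not code from service.py: a `return` inside `finally` discards the in-flight exception, while a `finally` block that only performs cleanup lets it propagate. Recent Python versions may also emit a SyntaxWarning for the first form.

```python
def swallow() -> str:
    """A `return` inside `finally` discards the exception raised in `try`."""
    try:
        raise ValueError("lost")
    finally:
        return "swallowed"  # the ValueError never reaches the caller


def propagate(log: list) -> None:
    """Cleanup without `return` runs and then lets the exception continue."""
    try:
        raise ValueError("kept")
    finally:
        log.append("cleanup ran")
```

Calling `swallow()` returns normally even though an exception was raised, which is why the suggestion replaces the early `return` with an assignment.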
🟡 Minor comments (34)
browser_ai/dom/buildDomTree.js-46-59 (1)
46-59: Verify visual distinguishability with this grayscale palette: minimum color distance is only 3.5 (3.5% RGB difference).
The grayscale palette is technically correct but raises valid distinguishability concerns. Color distance analysis reveals #C0C0C0 and #BEBEBE (indices 4 and 5) differ by only 3.5 in RGB space; these are virtually indistinguishable. When highlighting 10+ elements, users cycling through similar gray shades via the modulo indexing (line 60) will struggle to match numeric labels to visual borders.
The 12-color palette cycles in a way that puts these nearly identical colors adjacent (index 4→5), which is problematic when multiple elements are highlighted simultaneously.
Consider visual testing with sequential element highlighting to confirm users can reliably distinguish between highlighted borders at different indices. If distinguishability proves insufficient, increase contrast between adjacent palette entries or reduce the palette size.
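The 3.5 figure can be checked with a plain Euclidean distance in RGB space. This is a minimal sketch under that assumption (the review does not state which metric was used); the helper names and the four-color palette in the usage note are illustrative, and only #C0C0C0 and #BEBEBE come from the comment above.

```python
import math


def hex_to_rgb(color: str) -> tuple:
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_distance(a: str, b: str) -> float:
    """Euclidean distance between two hex colors in RGB space."""
    return math.dist(hex_to_rgb(a), hex_to_rgb(b))


def closest_adjacent_pair(palette: list) -> tuple:
    """Return the neighboring palette entries with the smallest distance.

    Only consecutive entries are compared; the wrap-around from the last
    entry back to the first (modulo indexing) is not considered here.
    """
    return min(zip(palette, palette[1:]), key=lambda pair: rgb_distance(*pair))
```

For example, `rgb_distance("#C0C0C0", "#BEBEBE")` is the square root of 12 (each channel differs by 2), and `closest_adjacent_pair(["#000000", "#C0C0C0", "#BEBEBE", "#FFFFFF"])` picks out the two near-identical grays. A check like this could be run over the real palette to find entries that need more contrast.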
.github/copilot-instructions.md-113-113 (1)
113-113: Fix minor typo in docs path description
Change “relevent” to “relevant” in the line:
docs/ - Documentation and technical specifications (keep all docs and md files here within relevent folders)
to avoid the spelling error in user-facing documentation.
browser_ai_extension/browse_ai/CHANGELOG.md-12-12 (1)
12-12: Verify the version timestamp.
The timestamp 2025.10.04 (October 4) predates the PR creation (November 30, 2025). Ensure this date reflects the actual initial version release or update it to align with current development.
browser_ai_extension/browse_ai/package.json-5-6 (1)
5-6: Complete the package metadata.
The author field contains a placeholder (**) and description is empty. These should be populated before publication.
- "author": "**",
- "description": "",
+ "author": "Your Name <email@example.com>",
+ "description": "Browser.AI Chrome extension with intelligent browsing capabilities",
browser_ai_extension/browse_ai/CHANGELOG.md-15-15 (1)
15-15: Fix markdown syntax for the link.
The image syntax ![create-chrome-ext](https://github.com/guocaoyi/create-chrome-ext) will attempt to render as an image. Use link syntax instead.
- - feat: generator by ![create-chrome-ext](https://github.com/guocaoyi/create-chrome-ext)
+ - feat: generator by [create-chrome-ext](https://github.com/guocaoyi/create-chrome-ext)
browser_ai_extension/browse_ai/src/devtools/DevTools.css-73-75 (1)
73-75: Remove duplicate utility class definitions.
.h-10 (line 74) duplicates line 46, and .w-full (line 75) duplicates line 47. These redundant definitions can cause confusion and maintenance issues.
-/* pill button style (bottom) */
-.rounded-full { border-radius: 9999px; }
-.h-10 { height: 40px; }
-.w-full { width: 100%; }
+/* pill button style (bottom) */
+.rounded-full { border-radius: 9999px; }
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/ui/Card.tsx-31-44 (1)
31-44: Fix type mismatch in CardTitle.
CardTitle declares React.forwardRef<HTMLParagraphElement, ...> but renders an <h3> element (HTMLHeadingElement). The ref type should match the actual element type.
Apply this diff:
  const CardTitle = React.forwardRef<
-   HTMLParagraphElement,
+   HTMLHeadingElement,
    React.HTMLAttributes<HTMLHeadingElement>
  >(({ className, ...props }, ref) => (
CHANGELOG.md-262-262 (1)
262-262: Fix typo in branch name.
The branch name contains a typo: "extention" should be "extension". This should be corrected if the actual branch name is feat/extension-with-chatbot-and-voice (as indicated in the PR title).
Apply this diff:
  - **Project**: Browser.AI
  - **Repository**: Browser.AI by Sathursan-S
- - **Branch**: feat/browser-extention (development)
+ - **Branch**: feat/extension-with-chatbot-and-voice (development)
  - **License**: See LICENSE file
browser_ai_extension/browse_ai/src/manifest.ts-57-59 (1)
57-59: Remove the unused debugger permission from manifest.
The debugger permission (line 57) is declared but not actively used anywhere in the codebase: all code utilizing chrome.debugger is commented out and disabled. The extension uses direct CDP connections via local Playwright setup instead of the extension-proxy mode that required this permission.
The <all_urls> host_permissions (line 59) is legitimate and necessary to support the content scripts that inject on all URLs (manifest line 43). However, this powerful permission scope should be documented in the README to clarify its necessity for users.
Actions:
- Remove 'debugger' from the permissions array (line 57)
- Add a brief note to README.md explaining why <all_urls> host_permissions are required (e.g., "The extension injects content scripts across all websites to enable browser automation capabilities")
browser_ai_extension/browse_ai/src/options/Options.css-437-452 (1)
437-452: Resolve duplicate font-size on .btn to satisfy linter and avoid confusion
In the .btn rule you define font-size twice (15px then 13px), and the linter rightly flags this as suspicious:
  .btn { ... font-size: 15px; ... font-size: 13px; }
Since only the last declaration takes effect, you should remove one of them (probably the earlier 15px) to make the intended size explicit and clear to tooling:
  .btn {
    padding: 16px 32px;
    border: none;
    border-radius: 12px;
-   font-size: 15px;
    font-weight: 600;
    cursor: pointer;
    transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1);
    font-family: inherit;
    position: relative;
    overflow: hidden;
    text-transform: uppercase;
    letter-spacing: 0.5px;
    font-size: 13px;
  }
browser_ai_extension/browse_ai/src/ui/Button.tsx-6-41 (1)
6-41: Default button type to "button" to prevent accidental form submissions
The <button> element currently relies on the browser default (type="submit"), which can cause unintended form submissions if this component is used inside a <form> element. As a shared UI primitive, it's safer to default to type="button" and let callers explicitly opt into "submit" when needed.
Suggested change:
  const Button = React.forwardRef<HTMLButtonElement, ButtonProps>(
-   ({ className = '', variant = 'default', size = 'default', ...props }, ref) => {
+   ({ className = '', variant = 'default', size = 'default', type = 'button', ...props }, ref) => {
      return (
        <button
          className={`${getButtonClasses(variant, size)} ${className}`}
          ref={ref}
+         type={type}
          {...props}
        />
      )
    }
  )
browser_ai_extension/browse_ai/src/sidepanel/components/ConversationMode.css-556-559 (1)
556-559: Missing animation keyframe definitions.
The livePulse (Line 558) and recordPulse (Line 608) animations are referenced but not defined in this file. This will cause the animations to silently fail.
Add the missing keyframe definitions:
+@keyframes livePulse {
+  0%, 100% {
+    transform: scale(1);
+    box-shadow: 0 0 20px rgba(16, 185, 129, 0.4);
+  }
+  50% {
+    transform: scale(1.05);
+    box-shadow: 0 0 30px rgba(16, 185, 129, 0.6);
+  }
+}
+
+@keyframes recordPulse {
+  0%, 100% {
+    transform: scale(1);
+    box-shadow: 0 0 20px rgba(239, 68, 68, 0.4);
+  }
+  50% {
+    transform: scale(1.05);
+    box-shadow: 0 0 30px rgba(239, 68, 68, 0.6);
+  }
+}
Also applies to: 608-609
browser_ai_extension/browse_ai/src/sidepanel/components/ControlButtons.tsx-123-126 (1)
123-126: Status indicator color inconsistent with paused state.
The status dot is always green (bg-green-400) even when isPaused is true and the text shows "Paused". Consider using yellow/amber for the paused state to match the Pause button styling.
  <div className="flex items-center gap-2">
-   <div className="w-3 h-3 rounded-full bg-green-400 animate-pulse shadow-lg shadow-green-400/40"></div>
+   <div className={`w-3 h-3 rounded-full animate-pulse shadow-lg ${
+     isPaused
+       ? 'bg-yellow-400 shadow-yellow-400/40'
+       : 'bg-green-400 shadow-green-400/40'
+   }`}></div>
    <span className="text-sm font-medium text-white/90">{isPaused ? 'Paused' : 'Running'}</span>
  </div>
browser_ai_extension/browse_ai/src/options/Options.tsx-88-92 (1)
88-92: Missing loadServerConfig in useEffect dependency array.
loadServerConfig is called inside the effect but not listed as a dependency. Since loadServerConfig references settings.serverUrl, this can lead to stale closures. Either add loadServerConfig to dependencies (with useCallback), or inline the fetch logic.
Consider wrapping loadServerConfig with useCallback:
+import { useState, useEffect, useCallback } from 'react'
-import { useState, useEffect } from 'react'
  ...
- const loadServerConfig = async () => {
+ const loadServerConfig = useCallback(async () => {
    setConfigStatus('loading')
    // ... rest of function
- }
+ }, [settings.serverUrl])

  useEffect(() => {
    if (settings.serverUrl) {
      loadServerConfig()
    }
- }, [settings.serverUrl])
+ }, [settings.serverUrl, loadServerConfig])
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/options/Options.tsx-346-349 (1)
346-349: parseInt without NaN validation.
If the user clears the input field, parseInt('') returns NaN, which will be stored in state. This could cause unexpected behavior downstream.
Add validation:
- onChange={(e) => handleChange('maxLogs', parseInt(e.target.value))}
+ onChange={(e) => {
+   const val = parseInt(e.target.value, 10)
+   if (!isNaN(val)) handleChange('maxLogs', val)
+ }}
The same issue applies to Lines 462-463 (parseFloat), 561-562 (parseInt), and 581-582 (parseInt).
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/utils/theme.tsx-18-20 (1)
18-20: Validate localStorage value before using as Theme.
localStorage.getItem('theme') could return any string (e.g., if manually edited). The type assertion as Theme is unsafe and could lead to unexpected values being used.
The suggested fix above addresses this by explicitly checking for 'dark'.
browser_ai_extension/browse_ai/src/notification/Notification.tsx-19-24 (1)
19-24: Validate type parameter before assignment.
The type assertion on line 20 is unsafe. params.get('type') can return any string or null, but the code assumes it's a valid NotificationData['type']. This could lead to unexpected behavior if an invalid type is passed.
  const params = new URLSearchParams(window.location.search)
- const type = params.get('type') as NotificationData['type']
+ const rawType = params.get('type')
+ const validTypes = ['user_interaction', 'task_complete', 'error'] as const
+ const type = validTypes.includes(rawType as any) ? rawType as NotificationData['type'] : null
  const message = params.get('message') || ''
browser_ai_extension/browse_ai/src/notification/Notification.tsx-144-147 (1)
144-147: Handle invalid timestamp gracefully.
If notification.timestamp contains an invalid date string, new Date(notification.timestamp).toLocaleString() will display "Invalid Date" to users. Consider adding validation.
  <div className="notification-timestamp">
-   {new Date(notification.timestamp).toLocaleString()}
+   {(() => {
+     const date = new Date(notification.timestamp)
+     return isNaN(date.getTime()) ? 'Unknown time' : date.toLocaleString()
+   })()}
  </div>
browser_ai/utils.py-60-68 (1)
60-68: Singleton decorator is not thread-safe.
If wrapper is called concurrently from multiple threads, the check-then-set on instance[0] can result in multiple instances being created.
If thread-safety is needed:
+ import threading
+
  def singleton(cls):
      instance = [None]
+     lock = threading.Lock()

      def wrapper(*args, **kwargs):
-         if instance[0] is None:
-             instance[0] = cls(*args, **kwargs)
+         if instance[0] is None:
+             with lock:
+                 if instance[0] is None:
+                     instance[0] = cls(*args, **kwargs)
          return instance[0]

      return wrapper
browser_ai_extension/browse_ai/src/utils/helpers.ts-17-27 (1)
17-27: Add error handling to loadSettings.
saveSettings checks chrome.runtime.lastError, but loadSettings does not. Add consistent error handling for storage operations.
  export async function loadSettings(): Promise<ExtensionSettings> {
    return new Promise((resolve) => {
      chrome.storage.sync.get(['settings'], (result: any) => {
+       if (chrome.runtime.lastError) {
+         console.warn('Failed to load settings:', chrome.runtime.lastError)
+         resolve(DEFAULT_SETTINGS)
+         return
+       }
        if (result.settings) {
          resolve({ ...DEFAULT_SETTINGS, ...result.settings })
        } else {
          resolve(DEFAULT_SETTINGS)
        }
      })
    })
  }
browser_ai/controller/service.py-40-46 (1)
40-46: Avoid mutable default argument.
exclude_actions: list[str] = [] is a mutable default that can cause unexpected behavior if modified. Use None and initialize inside the function.
  def __init__(
      self,
-     exclude_actions: list[str] = [],
+     exclude_actions: Optional[list[str]] = None,
      output_model: Optional[Type[BaseModel]] = None,
      latency_analyzer: Optional[LatencyAnalyzer] = None,
  ):
-     self.exclude_actions = exclude_actions
+     self.exclude_actions = exclude_actions or []
browser_ai/actions/extraction.py-137-138 (1)
137-138: Add error handling and timeout for navigation.
page.goto() can fail due to network errors, DNS failures, or invalid constructed URLs (especially for the generic fallback at line 89). Consider wrapping in try/except and adding a timeout.
- await page.goto(search_url)
- await page.wait_for_load_state()
+ try:
+     await page.goto(search_url, timeout=30000)
+     await page.wait_for_load_state(timeout=30000)
+ except Exception as e:
+     logger.warning(f"Navigation to {search_url} failed: {e}")
+     return ActionResult(extracted_content=f"⚠️ Failed to navigate to {site}: {e}", include_in_memory=True)
browser_ai/actions/extraction.py-64-64 (1)
64-64: Use proper URL encoding for search queries.
replace(" ", "+") only handles spaces. Special characters like &, =, #, ? in queries will break URLs or cause incorrect searches.
+ from urllib.parse import quote_plus
+
  async def search_ecommerce(
      params: SearchEcommerceAction,
      browser: BrowserContext,
      location_detector: LocationDetector
  ):
      page = await browser.get_current_page()
-     search_query = params.query.replace(" ", "+")
+     search_query = quote_plus(params.query)
Committable suggestion skipped: line range outside the PR's diff.
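The difference between the two encodings is easy to see on a query containing reserved characters. The sketch below uses only the standard library; `build_search_url` is a hypothetical helper introduced for illustration and is not part of the codebase.

```python
from urllib.parse import quote_plus


def build_search_url(base: str, query: str, param: str = "q") -> str:
    """Build a search URL with the query safely percent-encoded."""
    return f"{base}?{param}={quote_plus(query)}"


query = "rock & roll #1"

# Naive approach: only spaces are handled, so '&' starts a new parameter
# and '#' begins a fragment that is never sent to the server.
naive = query.replace(" ", "+")

# quote_plus encodes reserved characters and turns spaces into '+'.
encoded = quote_plus(query)
```

Here `naive` is `rock+&+roll+#1`, which a server would read as the query `rock ` plus a stray parameter, while `encoded` is `rock+%26+roll+%231` and round-trips the full query.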
browser_ai/location_service.py-257-257 (1)
257-257: Typo in e-commerce site URL.
"mighty ape.co.nz" should be "mightyape.co.nz" (no space).
- preferred_ecommerce_sites=["trademe.co.nz", "themarket.co.nz", "mighty ape.co.nz"]
+ preferred_ecommerce_sites=["trademe.co.nz", "themarket.co.nz", "mightyape.co.nz"]
browser_ai/actions/navigation.py-31-40 (1)
31-40: Inconsistent URL encoding: use urllib.parse.quote_plus instead of manual replacement.
Manual space replacement with + doesn't handle other special characters. For consistency with search_google_with_ai, use urllib.parse.quote_plus.
  async def search_youtube(params: SearchYouTubeAction, browser: BrowserContext):
      page = await browser.get_current_page()
-     search_query = params.query.replace(" ", "+")
      await page.goto(
-         f"https://www.youtube.com/results?search_query={search_query}"
+         f"https://www.youtube.com/results?search_query={urllib.parse.quote_plus(params.query)}"
      )
Committable suggestion skipped: line range outside the PR's diff.
browser_ai_extension/browse_ai/src/services/TextToSpeech.ts-183-187 (1)
183-187: Inconsistent error handling: throws after calling onError.
The speak method calls the onError callback and then throws. Callers might not expect both. Consider either throwing OR calling the callback, not both.
  public speak(
    text: string,
    options: TextToSpeechOptions = {},
    onProgress?: SpeechProgressCallback,
    onEnd?: SpeechEndCallback,
    onError?: SpeechErrorCallback
  ): void {
    if (!this.isSupported) {
      const error = 'Speech Synthesis not supported'
-     if (onError) onError(error)
-     throw new Error(error)
+     if (onError) {
+       onError(error)
+       return
+     }
+     throw new Error(error)
    }
browser_ai/actions/navigation.py-22-29 (1)
22-29: Missing URL encoding for search query.
The params.query is directly interpolated into the URL without encoding. Special characters (e.g., &, #, +) could break the URL or cause unexpected behavior. Use urllib.parse.quote_plus for consistency with search_google_with_ai.
+ import urllib.parse
+
  async def search_google(params: SearchGoogleAction, browser: BrowserContext):
      page = await browser.get_current_page()
      # Try to avoid CAPTCHAs by not using shopping mode for general searches
-     await page.goto(f"https://www.google.com/search?q={params.query}")
+     await page.goto(f"https://www.google.com/search?q={urllib.parse.quote_plus(params.query)}")
      await page.wait_for_load_state()
browser_ai/actions/navigation.py-163-165 (1)
163-165: Missing URL encoding for find_best_website search query.
Manual space replacement doesn't handle special characters. Use urllib.parse.quote_plus for consistency.
- encoded_query = search_query.replace(" ", "+")
- await page.goto(f"https://www.google.com/search?q={encoded_query}")
+ await page.goto(f"https://www.google.com/search?q={urllib.parse.quote_plus(search_query)}")
Committable suggestion skipped: line range outside the PR's diff.
browser_ai/actions/interaction.py-135-139 (1)
135-139: Default mutable argument pattern: browser: BrowserContext = None.
Having browser default to None while the function body assumes it's a valid BrowserContext is misleading. If called without browser, line 141 will raise AttributeError.
Either make browser required or handle the None case:
  async def wait_for_url_change(
      contains_text: str = "",
      timeout_seconds: int = 10,
-     browser: BrowserContext = None
+     browser: BrowserContext
  ) -> ActionResult:
browser_ai_extension/browse_ai/src/sidepanel/SidePanel.tsx-505-514 (1)
505-514: Unexpected behavior: clicking main content area clears conversation state.
The onClick handler on the main content div dismisses the task header and clears conversation messages/intent when clicked anywhere in the content area. This seems unintentional and could frustrate users who accidentally click and lose their conversation.
Consider removing this behavior or making it more intentional (e.g., a dedicated "clear" button):
  <div
    ref={scrollRef}
    className="flex-1 overflow-y-auto ..."
-   onClick={() => {
-     setTaskHeaderDismissed(!taskStatus.is_running)
-     if (mode === 'conversation') {
-       setMessages([])
-       setIntent(null)
-     }
-   }}
  >
browser_ai_extension/browse_ai/src/services/VoiceConversation.ts-369-369 (1)
369-369: Regex cannot properly match emoji characters without unicode flag.
The regex on line 369 attempts to remove emojis but uses character classes that cannot match surrogate pairs (multi-byte emoji characters) without the u flag. This means some emojis may not be removed.
Add the unicode flag to properly handle emoji:
- .replace(/[✅🚀👋🎧🤔❓💡📝🤖🎙️🔊👂📤🚫]/g, '') // Remove common emojis
+ .replace(/[✅🚀👋🎧🤔❓💡📝🤖🎙️🔊👂📤🚫]/gu, '') // Remove common emojis
Based on static analysis hint from Biome.
browser_ai/actions/interaction.py-181-182 (1)
181-182: Bare except with pass silently swallows all errors.
This pattern hides potentially important errors and makes debugging difficult.
At minimum, catch a specific exception or log the error:
  try:
      locator = page.get_by_text(text, exact=False)
      if await locator.count() > 0:
          text_found = True
- except:
-     pass
+ except Exception as e:
+     logger.debug(f"get_by_text lookup failed: {e}")
Based on Ruff hints (E722, S110).
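One practical difference between a bare `except:` and `except Exception` is that the bare form also traps `KeyboardInterrupt` and `SystemExit`. The following standalone sketch shows the logged variant; `safe_lookup` and its `lookup` callable are hypothetical stand-ins for the Playwright call, not code from interaction.py.

```python
import logging

logger = logging.getLogger(__name__)


def safe_lookup(lookup, default=False):
    """Run `lookup()`; log and return `default` on ordinary errors.

    `except Exception` leaves KeyboardInterrupt and SystemExit alone,
    so Ctrl+C and interpreter shutdown still work as expected.
    """
    try:
        return lookup()
    except Exception as exc:
        logger.debug("lookup failed: %s", exc)
        return default
```

With this shape the failure is still tolerated, but it leaves a trace at debug level that can be enabled when investigating why text was not found.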
browser_ai/agent/service.py-76-87 (1)
76-87: Avoid mutable default argument for include_attributes list.
Mutable default arguments in Python are shared across all calls, which can cause unexpected behavior if the list is modified.
- include_attributes: list[str] = [
-     "title",
-     "type",
-     "name",
-     "role",
-     "tabindex",
-     "aria-label",
-     "placeholder",
-     "value",
-     "alt",
-     "aria-expanded",
- ],
+ include_attributes: list[str] | None = None,
Then initialize within __init__:
  self.include_attributes = include_attributes if include_attributes is not None else [
      "title", "type", "name", "role", "tabindex",
      "aria-label", "placeholder", "value", "alt", "aria-expanded",
  ]
browser_ai_extension/browse_ai/src/utils/state.ts-59-70 (1)
59-70: Inconsistent error handling: loadCdpEndpoint doesn't check chrome.runtime.lastError.
Unlike loadTaskStatus and other load functions, loadCdpEndpoint doesn't check for chrome.runtime.lastError. This inconsistency could hide storage errors.
  export const loadCdpEndpoint = async (): Promise<string | null> => {
    return new Promise((resolve) => {
      chrome.storage.local.get([STATE_KEYS.CDP_ENDPOINT], (result) => {
-       if (result[STATE_KEYS.CDP_ENDPOINT]) {
+       if (chrome.runtime.lastError) {
+         console.error('[State] Failed to load CDP endpoint:', chrome.runtime.lastError)
+         resolve(null)
+       } else if (result[STATE_KEYS.CDP_ENDPOINT]) {
          console.log('[State] Loaded CDP endpoint:', result[STATE_KEYS.CDP_ENDPOINT])
          resolve(result[STATE_KEYS.CDP_ENDPOINT])
        } else {
          resolve(null)
        }
      })
    })
  }
// Create notification window
const width = 500
const height = 400
const left = Math.round((screen.width - width) / 2)
const top = Math.round((screen.height - height) / 2)
screen object may be unavailable in service worker context.
Service workers don't have access to window or screen objects. This code may throw a ReferenceError when executed.
Use fallback dimensions or query the active window:
  async function handleShowNotification(
    request: ShowNotificationMessage,
    sendResponse: (response: any) => void,
  ) {
    try {
      const { notificationType, message, details, result } = request
      const timestamp = new Date().toISOString()
      // Create notification window
      const width = 500
      const height = 400
-     const left = Math.round((screen.width - width) / 2)
-     const top = Math.round((screen.height - height) / 2)
+     // Get current window to center the popup
+     const currentWindow = await chrome.windows.getCurrent()
+     const left = currentWindow.left !== undefined && currentWindow.width !== undefined
+       ? Math.round(currentWindow.left + (currentWindow.width - width) / 2)
+       : undefined
+     const top = currentWindow.top !== undefined && currentWindow.height !== undefined
+       ? Math.round(currentWindow.top + (currentWindow.height - height) / 2)
+       : undefined
/// <reference types="vite/client" />

declare const __APP_VERSION__: string
🧩 Analysis chain
🏁 Script executed:
# Find and examine vite configuration files
fd 'vite\.config\.(ts|js)' -x cat -n {}
Repository: Sathursan-S/Browser.AI
Length of output: 1033
🏁 Script executed:
# Search for all occurrences of __APP_VERSION__ in the codebase
rg -n '__APP_VERSION__' -B2 -A2
Repository: Sathursan-S/Browser.AI
Length of output: 289
🏁 Script executed:
# List the structure to understand the project layout
git ls-files | head -30
Repository: Sathursan-S/Browser.AI
Length of output: 865
__APP_VERSION__ is declared but not injected in the Vite configuration.
The ambient declaration correctly references Vite client types, but the global __APP_VERSION__ constant is not defined in vite.config.ts. Either inject this constant using Vite's define option in the configuration, or remove the declaration if it's not needed.
🤖 Prompt for AI Agents
In browser_ai_extension/browse_ai/src/global.d.ts lines 1-3, the ambient
constant __APP_VERSION__ is declared but not provided by Vite; either remove
this declaration if you don't need a compile-time constant, or add the constant
to vite.config.ts using the define option (e.g. define: { __APP_VERSION__:
JSON.stringify(<source>) }) so Vite injects a string value (commonly from
package.json or an env var) matching the declared string type.
/**
* Voice Recognition Service
*
* Provides speech-to-text constructor() {
// Check for browser support
const SpeechRecognitionClass =
(window as any).SpeechRecognition ||
(window as any).webkitSpeechRecognition

if (SpeechRecognitionClass) {
this.isSupported = true
this.recognition = new SpeechRecognitionClass() as SpeechRecognition
console.log('✅ Voice Recognition API is supported and initialized')
} else {
console.warn('❌ Speech Recognition API not supported in this browser')
console.log('Available window properties:', Object.keys(window).filter(k => k.toLowerCase().includes('speech')))
}
}y using Web Speech API.
* Handles microphone input, continuous recognition, and interim results.
*/
Fix malformed JSDoc comment containing code.
Lines 1-20 contain a corrupted JSDoc comment block with constructor code embedded inside it. This appears to be a merge error or paste mistake. The constructor is properly defined again at lines 81-93.
Apply this diff to fix the structure:
/**
* Voice Recognition Service
- *
- * Provides speech-to-text constructor() {
- // Check for browser support
- const SpeechRecognitionClass =
- (window as any).SpeechRecognition ||
- (window as any).webkitSpeechRecognition
-
- if (SpeechRecognitionClass) {
- this.isSupported = true
- this.recognition = new SpeechRecognitionClass() as SpeechRecognition
- console.log('✅ Voice Recognition API is supported and initialized')
- } else {
- console.warn('❌ Speech Recognition API not supported in this browser')
- console.log('Available window properties:', Object.keys(window).filter(k => k.toLowerCase().includes('speech')))
- }
- }y using Web Speech API.
+ *
+ * Provides speech-to-text functionality using Web Speech API.
* Handles microphone input, continuous recognition, and interim results.
*/
🤖 Prompt for AI Agents
In browser_ai_extension/browse_ai/src/services/VoiceRecognition.ts around lines
1 to 20, the top JSDoc block is corrupted by embedded constructor code; remove
the constructor code from the JSDoc so the comment only contains valid
documentation and not executable code, then ensure the actual constructor
implementation remains intact (it's already correctly defined at lines ~81-93).
After editing, run a quick lint/typecheck to confirm no syntax errors introduced
and that the file still compiles.
# Center goal text horizontally, place above step number
x_goal = (image.width - goal_width) // 2
y_goal = y_step - goal_height - padding * 4  # More space between step and goal
Potential NameError: y_step and padding may be undefined.
When display_step is False, the variables y_step and padding are never assigned, but line 252 references them:
y_goal = y_step - goal_height - padding * 4
Add fallback values or restructure the logic:
+ # Initialize defaults
+ y_step = image.height - margin - 10
+ padding = 20
+
if display_step:
# ... existing step number drawing code ...
+ y_step = image.height - margin - step_height - 10
+ padding = 20
# Draw goal text (centered, bottom)
...
y_goal = y_step - goal_height - padding * 4
🤖 Prompt for AI Agents
In browser_ai/agent/media.py around lines 250-252, the computation y_goal =
y_step - goal_height - padding * 4 can raise NameError because y_step and
padding may be undefined when display_step is False; ensure both variables are
always defined before use by providing sensible fallbacks or moving their
assignments outside the display_step branch. Specifically, define a default
padding (e.g., set padding = existing_padding_value or a small constant) before
any conditional, and compute y_step with a fallback (for example derive y_step
from image.height and step_height or set y_step = image.height - step_height -
padding) so y_goal can be calculated regardless of display_step; update logic so
that when display_step is False you still calculate y_goal using the fallback
values.
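The fallback approach suggested above can be sketched as a small standalone function. This is a minimal illustration, not the code in browser_ai/agent/media.py: the function name `goal_text_y`, its parameters, and the default of 20 for `padding` are assumptions chosen to mirror the values in the suggested diff.

```python
def goal_text_y(
    image_height: int,
    goal_height: int,
    margin: int,
    display_step: bool,
    step_height: int = 0,
    padding: int = 20,
) -> int:
    """Return the y coordinate for the goal text (hypothetical helper).

    y_step and padding are always bound before they are used, so the
    calculation cannot raise NameError when display_step is False.
    """
    # Fallback position used when no step number is drawn
    y_step = image_height - margin - 10

    if display_step:
        # When the step number is drawn, leave room for its height
        y_step = image_height - margin - step_height - 10

    # Same formula as in the reviewed code: place the goal above the step
    return y_step - goal_height - padding * 4
```

Assigning the defaults before the `if` keeps a single code path for the final calculation, which is easier to reason about than duplicating the `y_goal` expression in both branches.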
# Create planner message history using full message history
planner_messages = [
PlannerPrompt(self.action_descriptions).get_system_message(),
*self.message_manager.get_messages()[
1:
], # Use full message history except the first
]
self.action_descriptions is undefined - will raise AttributeError.
The _run_planner method references self.action_descriptions on line 1158, but this attribute is never set in __init__ or elsewhere in the class. This will cause a runtime error when the planner is invoked.
Based on the MessageManager initialization (line 168), the action descriptions come from self.controller.registry.get_prompt_description(). Apply this fix:
# Create planner message history using full message history
planner_messages = [
- PlannerPrompt(self.action_descriptions).get_system_message(),
+ PlannerPrompt(self.controller.registry.get_prompt_description()).get_system_message(),
*self.message_manager.get_messages()[
1:
], # Use full message history except the first
]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Create planner message history using full message history
planner_messages = [
PlannerPrompt(self.controller.registry.get_prompt_description()).get_system_message(),
*self.message_manager.get_messages()[
1:
], # Use full message history except the first
]
🤖 Prompt for AI Agents
In browser_ai/agent/service.py around lines 1156 to 1162, _run_planner
references self.action_descriptions which is not defined and will raise
AttributeError; set self.action_descriptions =
self.controller.registry.get_prompt_description() when the MessageManager is
initialized (around line 168) or in __init__ so the planner has the expected
descriptions available; ensure the assignment occurs before any call to
_run_planner and remove any duplicate lookups if MessageManager already stores
the same value.
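The alternative described in the prompt above, assigning the attribute once in `__init__`, might look like the following sketch. `StubRegistry`, `StubController`, `Agent`, and `planner_system_text` are hypothetical stand-ins, not the classes in browser_ai/agent/service.py; only the call chain `controller.registry.get_prompt_description()` is taken from the review comment.

```python
class StubRegistry:
    """Hypothetical stand-in for the controller's action registry."""

    def get_prompt_description(self) -> str:
        return "click_element: click an element\ninput_text: type into a field"


class StubController:
    """Hypothetical stand-in exposing a .registry attribute."""

    def __init__(self) -> None:
        self.registry = StubRegistry()


class Agent:
    def __init__(self, controller: StubController) -> None:
        self.controller = controller
        # Assign once during construction so later methods can rely on it
        self.action_descriptions = controller.registry.get_prompt_description()

    def planner_system_text(self) -> str:
        # Stands in for PlannerPrompt(self.action_descriptions).get_system_message()
        return f"Available actions:\n{self.action_descriptions}"


agent = Agent(StubController())
print(agent.planner_system_text())
```

Either fix works; caching in `__init__` avoids repeating the registry lookup on every planner call, while the inline lookup in the committable suggestion keeps the change to a single line.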
This pull request introduces several documentation, configuration, and changelog updates to the Browser.AI project, focusing on improving developer experience, transparency, and support for new features such as the Chrome extension and intelligent website selection. The most significant changes include the creation of a comprehensive root-level changelog, documentation of a major new intelligent site selection feature, and updates to the GUI documentation to reflect new capabilities.
Project Documentation and Changelog Improvements
- Added CHANGELOG.md documenting all major changes, features, and updates across the Browser.AI project, following the Keep a Changelog format and including migration guides and version history.
- Added CHANGELOG_SUMMARY.md describing the structure, purpose, and usage of the new changelog system for maintainers and users.
Major Feature Release: Intelligent Website Selection
- Added CHANGELOG_INTELLIGENT_SELECTION.md, detailing a new action for dynamic site selection, multi-site fallback strategies, enhanced prompts, backward compatibility, and migration notes.
GUI and Extension Documentation Updates
- Updated GUI_README.md to include the Chrome extension as a supported interface, describe its features, and provide installation instructions and usage workflows.
- Updated .github/copilot-instructions.md to clarify the docs directory structure.
Configuration and Environment Updates
- Updated .env.example to remove the hardcoded Gemini API key and clarify the environment variable setup for LLM providers.
- Updated .python-version to specify Python 3.12 for the project.
Summary by CodeRabbit
Release Notes
New Features
Documentation
Chores