Commit e7a019e

feat: integrate Readability.js and refine content extraction strategy
1 parent a84e60d commit e7a019e

14 files changed

Lines changed: 3231 additions & 73 deletions

README.md

Lines changed: 8 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,6 @@ AI Sidekick is a minimalist, "Arc-style" Chrome Extension that brings multi-LLM
1515
- **Secure Vault**: Client-side encryption (PBKDF2 + AES-GCM) for API keys.
1616
- **Session Persistence**: Keys stay unlocked for the browser session.
1717
- **Auto-Lock**: Automatically locks after 15 minutes of inactivity.
18-
- **Web Mode**: Fallback to `gemini.google.com` (free) if no API key is available.
1918
- **Contextual Actions**:
2019
- Right-click to Explain, Summarize, or Analyze pages.
2120
- **Summarize Button**: One-click summary of the current conversation.
@@ -85,14 +84,21 @@ node tests/run_tests.js
8584

8685
You can customize the prompts used for context menu actions and page analysis in the **Options** page (`Right-click extension icon -> Options`).
8786

87+
#### Automatic Content Extraction
88+
89+
AI Sidekick uses **Mozilla Readability.js** to automatically extract the main content of the active page. This includes:
90+
- **Link Preservation**: Links are preserved in the text as `Text [URL]`.
91+
- **Whitespace Cleaning**: Noise and excessive whitespace are removed for a cleaner prompt.
92+
- **Smart Logic**: Boilerplate (menus, ads) is automatically filtered out.
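The link and whitespace handling listed above can be illustrated with a minimal sketch. The helper names `formatLink` and `cleanWhitespace` are hypothetical and are not taken from the extension's source; the real extraction works on the live DOM via Readability.js and may apply different rules.

```javascript
// Hypothetical helpers; names and exact rules are assumptions for illustration.

// Render a link as `Text [URL]`; fall back to whichever part is present.
function formatLink(text, url) {
  const label = (text || '').trim();
  const href = (url || '').trim();
  if (!href) return label;
  if (!label || label === href) return `[${href}]`;
  return `${label} [${href}]`;
}

// Collapse runs of spaces/tabs and allow at most one empty line in a row.
function cleanWhitespace(text) {
  return text
    .replace(/[ \t]+/g, ' ')     // collapse horizontal whitespace
    .replace(/ ?\n ?/g, '\n')    // drop spaces around line breaks
    .replace(/\n{3,}/g, '\n\n')  // limit consecutive blank lines
    .trim();
}
```

For example, `formatLink('Docs', 'https://example.com')` yields `Docs [https://example.com]`, which keeps the link target visible to the model without any HTML markup.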
93+
8894
#### Available Variables
8995

9096
Use these placeholders to insert dynamic content into your prompts:
9197

9298
| Variable | Description | Context |
9399
| :-------------- | :----------------------------------------------------------------- | :------------------ |
94100
| `{{selection}}` | The text currently selected by the user. | Select Menu Actions |
95-
| `{{content}}` | The full text content of the active page (truncated if necessary). | Analyze Page |
101+
| `{{content}}` | The full text content of the active page (including links). | Analyze Page |
96102
| `{{url}}` | The URL of the active page. | All Actions |
97103
| `{{title}}` | The title of the active page. | All Actions |
98104

THIRD-PARTY-NOTICES.txt

Lines changed: 19 additions & 0 deletions
@@ -26,3 +26,22 @@ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
2626
SOFTWARE.
2727

2828
================================================================================
29+
30+
MOZILLA READABILITY (Readability.js)
31+
Source: https://github.com/mozilla/readability
32+
License: Apache License 2.0
33+
Copyright (c) 2010 Arc90 Inc
34+
35+
Licensed under the Apache License, Version 2.0 (the "License");
36+
you may not use this file except in compliance with the License.
37+
You may obtain a copy of the License at
38+
39+
http://www.apache.org/licenses/LICENSE-2.0
40+
41+
Unless required by applicable law or agreed to in writing, software
42+
distributed under the License is distributed on an "AS IS" BASIS,
43+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
44+
See the License for the specific language governing permissions and
45+
limitations under the License.
46+
47+
================================================================================

docs/changelog.md

Lines changed: 12 additions & 5 deletions
@@ -5,22 +5,29 @@ All notable changes to this project will be documented in this file.
55
## [1.0.2] - 2026-01-18
66

77
### Added
8+
- **Smart Content Extraction**: Integrated Mozilla's Readability.js library for production-grade article extraction (same algorithm as Firefox Reader View).
9+
- **Path Audit Integration**: `tests/audit_paths.js` now runs during `npm test` to verify asset integrity.
810
- **Versioning Protocol**: Established strict version sync rules in `.agent/rules/02-versioning.md`.
911

1012
### Changed
1113
- **Documentation**: Removed "draft" warnings from README; project promoted to Beta status.
1214
- **Audit**: Validated repository consistency and cleanup.
15+
- **Refined Content Extraction**:
16+
- Always-on strategy (removed Auto/URL-only toggle for simplicity).
17+
- Preserves links in `Text [URL]` format using live DOM analysis.
18+
- Improved whitespace cleaning and boilerplate removal.
19+
- Refactored `analyzeCurrentPage` in `sidepanel.js` to use Mozilla Readability.js.
20+
- **Static Analysis**: Added `tests/audit_paths.js` to prevent broken links during refactors.
21+
- **Side Panel Path**: Corrected `openExtension` logic to point to `src/sidepanel.html` after migration.
22+
- **Debug Logging**: Added stylized console logs in the Side Panel to monitor full chat exchanges (prompts, history context, responses).
23+
- **Multi-Model Support**: Integrated Gemini 1.5 Flash/Pro and DeepSeek V3/R1.
24+
- **Client-Side Encryption**: Implemented `CryptoUtils` class for PBKDF2 + AES-GCM local key storage.
1325

1426
## [1.0.1] - 2026-01-18
1527

1628
### Added
1729
- **Antigravity Structure**: Established `.agent/rules` and strict project identity.
1830
- **Source Reorganization**: Moved all extension code to `src/` for cleaner root.
19-
- **Static Analysis**: Added `tests/audit_paths.js` to prevent broken links during refactors.
20-
- **Side Panel Path**: Corrected `openExtension` logic to point to `src/sidepanel.html` after migration.
21-
- **Debug Logging**: Added stylized console logs in the Side Panel to monitor full chat exchanges (prompts, history context, responses).
22-
- **Multi-Model Support**: Integrated Gemini 1.5 Flash/Pro and DeepSeek V3/R1.
23-
- **Client-Side Encryption**: Implemented `CryptoUtils` class for PBKDF2 + AES-GCM local key storage.
2431
- **Arc Browser Support**:
2532
- **[Implemented]** **Single Instance Window**: Added logic to `background.js` using `chrome.storage.session` to track and focus the existing extension window, preventing duplicates.
2633
- **[Added]** **UI Context Bar**: Visual indicator of the active page URL being analyzed in the Side Panel/Popup.

docs/context.md

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@
1717
- Implemented `chrome.storage.session` for resilient window tracking in Arc.
1818
- Added UI Context Bar and "Open Sidekick Here" menu action.
1919
- Integrated ESLint into the build pipeline.
20+
- **Integrated Mozilla Readability.js for production-grade content extraction**.
2021

2122
## Active Questions
2223
- (Empty)

docs/drafts/ui-id-collision-fix.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# Draft: Fix Chat UI ID Collision
2+
3+
## Problem
4+
In `src/sidepanel.js`, the `addMessage` function generates message IDs using `Date.now()`:
5+
```javascript
6+
function addMessage(role, text, isRaw = false) {
7+
const div = document.createElement('div');
8+
div.className = `message ${role}`;
9+
div.id = 'msg-' + Date.now(); // <-- RISK OF COLLISION
10+
// ...
11+
}
12+
```
13+
`Date.now()` has millisecond resolution, so when a user message is added and the AI placeholder message follows within the same millisecond, both receive the same ID. `updateMessage` then finds the first element with that ID (the user bubble) and overwrites its content with the AI response.
14+
15+
## Proposed Strategy
16+
Replace `Date.now()` with a more robust unique identifier.
17+
18+
### Option A: Timestamp + Metadata
19+
```javascript
20+
div.id = `msg-${Date.now()}-${role}-${Math.floor(Math.random() * 1000)}`;
21+
```
22+
23+
### Option B: Global Counter (Cleaner)
24+
Add a counter to the global state:
25+
```javascript
26+
const state = {
27+
// ...
28+
msgCounter: 0
29+
};
30+
31+
function addMessage(role, text, isRaw = false) {
32+
state.msgCounter++;
33+
const div = document.createElement('div');
div.id = `msg-${Date.now()}-${state.msgCounter}`;
34+
// ...
35+
}
36+
```
37+
38+
## Implementation Note
39+
- Apply changes to `addMessage` in `src/sidepanel.js`.
40+
- No changes needed to `updateMessage` or `scrollToBottom`.
41+
- Ensure `loadSettings` (rendering history) also uses unique IDs if it calls `addMessage`.
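
To illustrate Option B in isolation, the sketch below separates ID generation from DOM work so it can be exercised outside the browser. `createIdGenerator` is a hypothetical helper, not part of `src/sidepanel.js`; the point is only that an increasing counter keeps IDs distinct even when `Date.now()` returns the same value twice.

```javascript
// Hypothetical helper (not in src/sidepanel.js): IDs stay unique within one
// page session because the counter increases on every call.
// `now` is injectable only so the clock can be frozen in a test.
function createIdGenerator(prefix = 'msg', now = Date.now) {
  let counter = 0;
  return function nextId() {
    counter += 1;
    return `${prefix}-${now()}-${counter}`;
  };
}

// Usage: two messages created back to back still get distinct IDs.
const nextMessageId = createIdGenerator();
const userId = nextMessageId();
const aiId = nextMessageId();
console.log(userId !== aiId); // true
```

Inside `addMessage`, the assignment would then read `div.id = nextMessageId();`, assuming a single generator is shared by every caller, including any history rendering.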
Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
1+
# Web Mode Implementation - Technical Design
2+
3+
> **Status**: 🚧 Draft - Not Yet Implemented
4+
> **Priority**: Medium
5+
> **Related**: [Roadmap](../roadmap.md) - "Web Mode Fallback (BROKEN)"
6+
7+
## Problem Statement
8+
9+
The current "Gemini Web (Free)" option ([`sidepanel.js:746`](file:///Users/NOBKP/ai-sidekick/src/sidepanel.js#L746)) simply opens `gemini.google.com` in a new tab, providing no integration with the extension. Users without API keys have no fallback for in-panel chat.
10+
11+
## Goals
12+
13+
1. **Zero-Friction Fallback**: Users without API keys can still use AI assistants directly from the side panel.
14+
2. **Multi-Provider Support**: Extend to Gemini, DeepSeek, ChatGPT, and Claude.
15+
3. **Seamless UX**: Minimize context-switching and manual copy-paste.
16+
17+
---
18+
19+
## Technical Approaches
20+
21+
### Option A: Iframe Embedding ⚠️ Limited Viability
22+
23+
**Concept**: Embed web chat interfaces directly in the side panel using `<iframe>`.
24+
25+
**Pros**:
26+
- User stays in side panel.
27+
- No tab switching.
28+
29+
**Cons**:
30+
- **Security Restrictions**: Most AI providers block `<iframe>` embedding via `X-Frame-Options: DENY` or CSP headers.
31+
- **Gemini**: May work (needs testing).
32+
- **ChatGPT**: Blocked.
33+
- **Claude**: Blocked.
34+
- ⚠️ **DeepSeek**: Unknown (needs testing).
35+
36+
**Verdict**: Only viable for Gemini. Not a universal solution.
37+
38+
---
39+
40+
### Option B: Managed Tab with Context Injection
41+
42+
**Concept**: Open the web interface in a new tab and inject the user's prompt automatically.
43+
44+
**Implementation**:
45+
1. User selects "Gemini Web (Free)" and types a message.
46+
2. Extension opens `https://gemini.google.com/app` in a new tab.
47+
3. Use `chrome.scripting.executeScript` to:
48+
- Wait for page load.
49+
- Find chat input field.
50+
- Insert user's prompt.
51+
- Optionally trigger "Send" button.
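
The injection step above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the extension: `fillPrompt` and `injectPrompt` are hypothetical names, the selector is supplied by the caller, and setting `.value` only works if the provider's input is a plain `<textarea>` or `<input>`. The sketch also assumes the `scripting` permission plus host access to the provider's origin, and it leaves waiting for the tab to finish loading to the caller.

```javascript
// Runs inside the provider's page. executeScript serializes this function,
// so it must not reference variables from the extension's scope.
function fillPrompt(selector, prompt) {
  const input = document.querySelector(selector);
  if (!input) return false; // selector no longer matches the page
  input.value = prompt;
  // Notify the page's framework that the value changed.
  input.dispatchEvent(new Event('input', { bubbles: true }));
  return true;
}

// `api` defaults to the global `chrome` object; it is a parameter only so the
// function can be exercised with a stub outside the browser.
async function injectPrompt(tabId, selector, prompt, api = chrome) {
  const results = await api.scripting.executeScript({
    target: { tabId },
    func: fillPrompt,
    args: [selector, prompt],
  });
  // Each injection result carries the function's return value in `result`.
  return results.length > 0 && results[0].result === true;
}
```

A `false` return signals that the selector did not match, which is the failure mode to expect whenever a provider changes its markup.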
52+
53+
**Pros**:
54+
- Works with all providers (no iframe restrictions).
55+
- User sees response in native interface.
56+
57+
**Cons**:
58+
- **Login Required**: User must be logged into each service.
59+
- **DOM Fragility**: Injection relies on stable DOM selectors (e.g., `textarea[aria-label="Chat input"]`). Providers can break this at any time.
60+
- **Privacy Concerns**: Users may not want extension manipulating web pages.
61+
62+
**Verdict**: Technically feasible but fragile and requires maintenance.
63+
64+
---
65+
66+
### Option C: Hybrid Approach (Recommended)
67+
68+
**Concept**: Show an "onboarding dialog" when the user selects Web Mode without an API key.
69+
70+
**UX Flow**:
71+
```
72+
User selects "Gemini Web (Free)" → No API key detected
73+
74+
┌─────────────────────────────────────────────┐
75+
│ ⚠️ No API Key Found │
76+
│ │
77+
│ To use AI Sidekick, you need: │
78+
│ │
79+
│ [Option 1] Enter your API keys (Recommended)│
80+
│ → Private, encrypted, full control │
81+
│ → Get keys: [Gemini Key] [DeepSeek Key] │
82+
│ │
83+
│ [Option 2] Use free web version │
84+
│ → Opens gemini.google.com in new tab │
85+
│ → Requires Google account login │
86+
│ │
87+
│ [ Set Up API Keys ] [ Continue to Web ] │
88+
└─────────────────────────────────────────────┘
89+
```
90+
91+
If user clicks **"Continue to Web"**:
92+
- Open `gemini.google.com/app` in new tab.
93+
- Copy user's prompt to clipboard automatically.
94+
- Show toast: "Prompt copied! Paste it in Gemini to continue."
95+
96+
**Pros**:
97+
- **Clear UX**: Sets expectations (no magic integration).
98+
- **No Fragility**: No reliance on DOM selectors.
99+
- **Privacy-Friendly**: Extension doesn't manipulate web pages.
100+
- **Extensible**: Easy to add ChatGPT, Claude, DeepSeek with same pattern.
101+
102+
**Cons**:
103+
- User must manually paste (1 extra step).
104+
105+
**Verdict**: Best balance of simplicity, reliability, and user experience.
106+
107+
---
108+
109+
## Recommended Implementation (Phase 1)
110+
111+
### 1. Update Model Selection Logic
112+
113+
**File**: `src/sidepanel.js`
114+
115+
**Current Code** (Line 746):
116+
```javascript
117+
if (model === 'gemini-web') {
118+
window.open('https://gemini.google.com/app', '_blank');
119+
return;
120+
}
121+
```
122+
123+
**New Code**:
124+
```javascript
125+
if (model === 'gemini-web') {
126+
showWebModeDialog('gemini', userMessage);
127+
return;
128+
}
129+
```
130+
131+
### 2. Create Web Mode Dialog
132+
133+
**File**: `src/lib/web-mode-dialog.js` (new)
134+
135+
```javascript
136+
function showWebModeDialog(provider, userPrompt) {
137+
const providerConfig = {
138+
gemini: {
139+
name: 'Gemini',
140+
url: 'https://gemini.google.com/app',
141+
keyUrl: 'https://aistudio.google.com/app/apikey'
142+
},
143+
deepseek: {
144+
name: 'DeepSeek',
145+
url: 'https://chat.deepseek.com',
146+
keyUrl: 'https://platform.deepseek.com/api_keys'
147+
},
148+
chatgpt: {
149+
name: 'ChatGPT',
150+
url: 'https://chatgpt.com',
151+
keyUrl: null // No API key option
152+
},
153+
claude: {
154+
name: 'Claude',
155+
url: 'https://claude.ai',
156+
keyUrl: 'https://console.anthropic.com/'
157+
}
158+
};
159+
160+
const config = providerConfig[provider];
161+
162+
// Show dialog with options:
163+
// 1. "Set Up API Keys" → chrome.runtime.openOptionsPage()
164+
// 2. "Continue to Web" → copyToClipboard(userPrompt) + open(config.url)
165+
}
166+
```
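
One way the "Continue to Web" branch could look is sketched below. `continueToWeb` and the `env` parameter are hypothetical names introduced here, not existing code; `env` bundles the browser calls so the flow can be exercised without a browser. The prompt is copied before the tab opens because clipboard writes can be rejected once the panel loses focus.

```javascript
// Hypothetical handler for the "Continue to Web" button.
// `env` is an assumption: { copy(text): Promise, open(url), toast(message) }.
async function continueToWeb(config, userPrompt, env) {
  let copied = false;
  try {
    await env.copy(userPrompt);
    copied = true;
  } catch (err) {
    // Clipboard writes can be rejected, e.g. when the document is not focused.
    copied = false;
  }
  env.open(config.url);
  env.toast(
    copied
      ? `Prompt copied! Paste it in ${config.name} to continue.`
      : `Could not copy the prompt. Copy it manually, then paste it in ${config.name}.`
  );
  return copied;
}

// In the side panel, `env` could be wired to real browser APIs like this:
// const browserEnv = {
//   copy: (text) => navigator.clipboard.writeText(text),
//   open: (url) => window.open(url, '_blank'),
//   toast: showToast, // hypothetical UI helper
// };
```

Passing the browser calls in as `env` keeps the handler free of globals, so the clipboard-failure path can be checked as easily as the success path.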
167+
168+
### 3. Add Clipboard Permission
169+
170+
**File**: `manifest.json`
171+
172+
```json
173+
"permissions": [
174+
"clipboardWrite", // Already exists ✅
175+
// ...
176+
]
177+
```
178+
179+
---
180+
181+
## Future Enhancements (Phase 2+)
182+
183+
### Multi-Provider Dropdown
184+
185+
Instead of just "🌐 Gemini Web (Free)", show:
186+
```html
187+
<optgroup label="🌐 Web Mode (Free, No Keys)">
188+
<option value="gemini-web">Gemini (Google Account)</option>
189+
<option value="deepseek-web">DeepSeek (Free Account)</option>
190+
<option value="chatgpt-web">ChatGPT (OpenAI Account)</option>
191+
<option value="claude-web">Claude (Anthropic Account)</option>
192+
</optgroup>
193+
```
194+
195+
### Browser Action Integration
196+
197+
Add a "Quick Switch" button in the side panel:
198+
```
199+
[API Mode: Gemini Flash ▼]
200+
→ Gemini 2.5 Pro (API)
201+
→ DeepSeek R1 (API)
202+
───────────────
203+
→ Gemini Web (Free) 🌐
204+
→ ChatGPT Web (Free) 🌐
205+
```
206+
207+
---
208+
209+
## Testing Checklist
210+
211+
- [ ] Verify dialog appears when Web Mode selected without API key.
212+
- [ ] Confirm clipboard copy works (test on Mac/Windows/Linux).
213+
- [ ] Ensure correct URL opens for each provider.
214+
- [ ] Test "Set Up API Keys" button redirects to Options.
215+
- [ ] Validate toast notification shows after clipboard copy.
216+
217+
---
218+
219+
## Open Questions
220+
221+
1. **Should we support iframe for Gemini if it works?**
222+
→ Decision: Test first. If unreliable, use unified "open tab + copy" approach.
223+
224+
2. **Should we inject prompt automatically using content scripts?**
225+
→ Decision: No. Too fragile. Manual paste is acceptable trade-off.
226+
227+
3. **Should we track which web interface the user prefers?**
228+
→ Decision: Phase 2. Store in `chrome.storage.local`.
229+
230+
---
231+
232+
## References
233+
234+
- Current Implementation: [`src/sidepanel.js:746`](file:///Users/NOBKP/ai-sidekick/src/sidepanel.js#L746)
235+
- Roadmap Task: [`docs/roadmap.md`](file:///Users/NOBKP/ai-sidekick/docs/roadmap.md)
