feat(messaging): add InMail and connection request tools #428

Open
benzntech wants to merge 1 commit into stickerdaniel:main from benzntech:feat/messaging-enhancements

Conversation

@benzntech

Summary

Add messaging tools for sending InMail messages and connection requests to LinkedIn users.

New Tools

  • send_inmail: Send InMail to LinkedIn users you are not connected to (requires Premium)
  • send_connection_request: Send connection request with optional personalized message

Key Changes

  • _search_people_urns: Extract profile URNs from script tags (Pattern 2 only, no layout-class selectors)
  • _extract_profile_urn_from_page: Extract profile URN from profile page
  • is_urn heuristic: Robust regex check (no hyphen + 10+ chars + starts uppercase)
  • 300-char message length validation for connection requests
  • Both tools support dry-run mode (confirm_send=false)
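
As a rough illustration of the script-tag approach listed above, the sketch below pulls profile URN IDs out of raw script text. It reuses the `urn:li:fsd_profile:` pattern that appears in the PR's in-page JavaScript; the function name `extract_profile_urns` and the sample payload are hypothetical and are not taken from the PR.

```python
import re

# Same URN shape the PR's in-page JavaScript matches.
PROFILE_URN_RE = re.compile(r"urn:li:fsd_profile:([A-Za-z0-9_-]+)")


def extract_profile_urns(script_text: str) -> list[str]:
    """Return unique profile URN IDs in first-seen order."""
    seen: dict[str, None] = {}
    for match in PROFILE_URN_RE.finditer(script_text):
        # dict keys keep insertion order, so this de-duplicates without reordering
        seen.setdefault(match.group(1), None)
    return list(seen)


# Made-up payload; real script-tag contents will differ.
sample = (
    '{"entityUrn":"urn:li:fsd_profile:ACoAAExample01"},'
    '{"entityUrn":"urn:li:fsd_profile:ACoAAExample02"},'
    '{"entityUrn":"urn:li:fsd_profile:ACoAAExample01"}'
)
print(extract_profile_urns(sample))
```

Because the function only sees text, it can run against whatever page is already loaded and needs no layout-class selectors.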

Code Review Fixes Addressed

  • Removed layout-class selectors per scraping rules
  • Removed double navigation in _search_people_urns
  • Fixed is_urn heuristic with robust regex
  • Removed Pattern 3 username/URN conflation
  • Added message length validation
  • Removed dead initializer

Test plan

  • All 32 tests pass
  • Live LinkedIn verification

- Add send_inmail tool for Premium InMail messaging to non-connections
- Add send_connection_request tool using Voyager API (verifyQuotaAndCreate)
- Support profile URN extraction from get_person_profile
- Both tools support dry-run mode with confirm_send=false
@greptile-apps
Contributor

greptile-apps Bot commented May 5, 2026

Greptile Summary

This PR adds send_inmail and send_connection_request tools plus _search_people_urns/_extract_profile_urn_from_page helpers to extract profile URNs for the new messaging flows. Several P1 issues remain that prevent this from working reliably.

  • Double navigation & layout class selectors: _search_people_urns re-navigates to the same search URL that search_people already loaded (violates CLAUDE.md "one section = one navigation"), and Pattern 1 queries .actor-name, .search-result__snippet, .entity-result__primary-subtitle, and similar layout classes (forbidden by CLAUDE.md scraping rules).
  • Pattern 3 username/URN conflation: Despite the PR description claiming Pattern 3 was removed, it still stores a public vanity username in the urn field; downstream send_connection_request then sends trackingId: vanityUsername to the Voyager API, which is an incorrect field and will fail silently.
  • Missing validation + fragile heuristics: The documented 300-character limit on connection request messages is never enforced in code; the is_urn heuristic uses startswith("ACoAA") rather than the regex described in the PR; and _resolve_inmail_compose_href can return a generic compose URL when no InMail button is present, causing send_inmail to return status="sent" for a message never delivered as InMail.

Confidence Score: 2/5

Not safe to merge — multiple P1 defects in the core extractor logic affect correctness of both new tools

Six distinct P1 issues: double navigation violating an explicit codebase rule, layout class selectors in Pattern 1, Pattern 3 username/URN conflation producing wrong API calls, missing 300-char message validation, the fragile is_urn prefix check, and the InMail fallback to a generic compose URL that silently misfires. Multiple P1s compound the score below the 4/5 ceiling.

linkedin_mcp_server/scraping/extractor.py: all six P1 findings are concentrated here

Important Files Changed

| Filename | Overview |
| --- | --- |
| linkedin_mcp_server/scraping/extractor.py | Adds send_inmail, send_connection_request, _search_people_urns, _extract_profile_urn_from_page, and _resolve_inmail_compose_href; multiple P1 issues: double navigation, layout class selectors in Pattern 1, Pattern 3 still present (username/URN conflation), missing 300-char validation, fragile is_urn heuristic, and InMail fallback to generic compose URL |
| linkedin_mcp_server/tools/messaging.py | Registers send_inmail and send_connection_request MCP tools; tool wiring and error handling follow existing patterns correctly |

Sequence Diagram

sequenceDiagram
    participant Tool as messaging.py
    participant Ext as LinkedInExtractor
    participant Page as Playwright Page
    participant LI as LinkedIn

    Note over Tool,LI: send_inmail flow
    Tool->>Ext: send_inmail(username, message, subject, confirm_send, profile_urn)
    Ext->>Page: navigate /in/{username}/
    Page->>LI: GET profile page
    LI-->>Page: profile HTML
    alt profile_urn provided
        Ext->>Ext: build compose_url with ?messagingKind=INMAIL
    else no URN
        Ext->>Page: _resolve_inmail_compose_href()
        Note right of Page: ⚠ Falls back to generic /messaging/compose/ if no InMail button
        Page-->>Ext: href (may be regular compose URL)
    end
    Ext->>Page: navigate compose_url
    Page->>LI: GET compose page
    LI-->>Page: composer HTML
    Ext->>Page: fill subject + message, click Send
    Page->>LI: POST message
    LI-->>Page: confirmation
    Ext-->>Tool: {status, sent}

    Note over Tool,LI: send_connection_request flow
    Tool->>Ext: send_connection_request(username, message, confirm_send, profile_urn)
    Ext->>Page: navigate /in/{username}/
    Page->>LI: GET profile page
    LI-->>Page: profile HTML + JSESSIONID cookie
    Ext->>Page: extract CSRF token from cookie
    Ext->>Page: _extract_profile_urn_from_page()
    alt is_urn check passes
        Ext->>Page: fetch POST voyager API with inviteeProfileUrn
    else username fallback
        Ext->>Page: fetch POST voyager API with trackingId
        Note right of Page: ⚠ trackingId from username is wrong field
    end
    Page->>LI: POST /voyager/api/voyagerRelationshipsDashMemberRelationships
    LI-->>Page: 201 / 429 / 406
    Ext-->>Tool: {status, sent}

    Note over Tool,LI: search_people URN extraction (double nav issue)
    Tool->>Ext: search_people(keywords)
    Ext->>Page: navigate /search/results/people/ nav 1
    Page->>LI: GET search page
    LI-->>Page: search HTML
    Ext->>Ext: _search_people_urns(keywords)
    Ext->>Page: navigate /search/results/people/ nav 2 ⚠
    Page->>LI: GET search page again
    LI-->>Page: search HTML
    Ext->>Page: evaluate JS Pattern 1 layout classes + Pattern 3 username-as-URN
    Page-->>Ext: urns[]
    Ext-->>Tool: {url, sections, urns}

Comments Outside Diff (1)

  1. linkedin_mcp_server/scraping/extractor.py, lines 3350-3380

    P1 _resolve_inmail_compose_href can return a generic compose URL, silently bypassing InMail

    When no InMail-specific button is found, the fallback returns any visible a[href*="/messaging/compose/"] — the ordinary Message button present on every profile. send_inmail then navigates to this regular compose URL and sends a plain message to a user who may not be reachable via direct message. LinkedIn will not raise an error, so the caller receives status="sent" while no InMail was actually delivered. The fallback should return None when no InMail-specific URL can be resolved.
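
One possible shape for the stricter fallback is sketched below. It assumes an InMail-specific compose link can be recognised by the `messagingKind=INMAIL` query parameter shown in the sequence diagram; the helper name `pick_inmail_href` and the `recipient` parameter in the examples are hypothetical, and the real resolver works on live page elements rather than a list of strings.

```python
from typing import Iterable, Optional
from urllib.parse import parse_qs, urlparse


def pick_inmail_href(hrefs: Iterable[str]) -> Optional[str]:
    """Return the first InMail-specific compose link, or None if there is none."""
    for href in hrefs:
        parsed = urlparse(href)
        if "/messaging/compose/" not in parsed.path:
            continue
        # Assumption: only InMail compose links carry messagingKind=INMAIL.
        kinds = parse_qs(parsed.query).get("messagingKind", [])
        if "INMAIL" in kinds:
            return href
    # No generic fallback: the caller should report that InMail is unavailable.
    return None


generic_only = ["/messaging/compose/?recipient=abc"]
with_inmail = generic_only + ["/messaging/compose/?recipient=abc&messagingKind=INMAIL"]
print(pick_inmail_href(generic_only))
print(pick_inmail_href(with_inmail))
```

Returning `None` lets `send_inmail` surface an explicit "InMail not available" status instead of reporting a send that never happened as InMail.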

Prompt To Fix All With AI
Fix the following 6 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 6
linkedin_mcp_server/scraping/extractor.py:2430-2446
**Pattern 1 uses forbidden layout class selectors**

Pattern 1 queries `.actor-name`, `.search-result__title a`, `.subline-level-1`, `.search-result__snippet`, and `.entity-result__primary-subtitle` — all LinkedIn layout class names that CLAUDE.md explicitly forbids ("never class names tied to LinkedIn's layout"). These selectors will silently return empty `name` and `headline` strings as soon as LinkedIn renames or removes these classes during a UI update.

### Issue 2 of 6
linkedin_mcp_server/scraping/extractor.py:2469-2490
**Pattern 3 still conflates username with URN — contradicts PR description and breaks the API call**

Pattern 3 stores a public profile username (e.g. `john-smith-123abc`) in `urn` and passes it to `send_connection_request` as `profile_id`. When the `is_urn` check fails, the code uses this value as `trackingId`, which is the wrong field — the Voyager API expects a `trackingId` from a search result interaction event, not a vanity username. The connection request will receive a 400 or be silently dropped. The PR description states Pattern 3 was removed; it is still present.

### Issue 3 of 6
linkedin_mcp_server/scraping/extractor.py:2412-2421
**Double navigation violates "One section = one navigation" rule**

`search_people` already calls `_navigate_to_page(url)` before invoking `_search_people_urns`. Inside `_search_people_urns`, another `await self._navigate_to_page(search_url)` reloads the same page. This wastes a full page-load round-trip, races with LinkedIn's JS hydration on the first load, and violates the CLAUDE.md rule "One section = one navigation." The DOM from the first navigation is already available; the method should operate on the current page rather than re-navigating.

### Issue 4 of 6
linkedin_mcp_server/scraping/extractor.py:3224-3230
**Missing 300-character validation for connection request message**

The docstring and PR description both document a 300-character limit on the personalized message, but no validation is implemented. LinkedIn's API will return a 400 error for messages exceeding this limit, leaving the caller with an unhelpful failure response instead of an actionable validation error.

```suggestion
        if message and len(message) > 300:
            return self._message_action_result(
                profile_url,
                "message_too_long",
                f"Connection request message exceeds 300 characters ({len(message)}).",
                recipient_selected=True,
            )

        if not confirm_send:
            return self._message_action_result(
                self._page.url,
                "confirmation_required",
                "Set confirm_send=true to send the connection request.",
                recipient_selected=True,
            )
```

### Issue 5 of 6
linkedin_mcp_server/scraping/extractor.py:3234
**`is_urn` heuristic uses fragile prefix check instead of the documented regex**

The PR description explicitly states the `is_urn` heuristic was "Fixed with robust regex (no hyphen + 10+ chars + starts uppercase)". The current `startswith("ACoAA")` will produce a false negative for any URN beginning with a different 5-character prefix (LinkedIn URNs are base64-encoded integers; `ACoAA`, `ACoAAB`, `ACoAAC` etc. are all valid starts). When the heuristic fails, the code sends `trackingId: profile_id` instead of `inviteeProfileUrn`, resulting in a failed or wrong API call.

```suggestion
        import re
        is_urn = bool(
            profile_id
            and re.match(r"^[A-Z][A-Za-z0-9_]{9,}$", profile_id)
        )
```

### Issue 6 of 6
linkedin_mcp_server/scraping/extractor.py:3350-3380
**`_resolve_inmail_compose_href` can return a generic compose URL, silently bypassing InMail**

When no InMail-specific button is found, the fallback returns any visible `a[href*="/messaging/compose/"]` — the ordinary Message button present on every profile. `send_inmail` then navigates to this regular compose URL and sends a plain message to a user who may not be reachable via direct message. LinkedIn will not raise an error, so the caller receives `status="sent"` while no InMail was actually delivered. The fallback should return `None` when no InMail-specific URL can be resolved.

Reviews (1): Last reviewed commit: "feat(messaging): add send_inmail and sen..."

Comment on lines +2430 to +2446
const dataElements = document.querySelectorAll('[data-urn]');
for (const el of dataElements) {
    const urn = el.getAttribute('data-urn');
    if (urn && urn.includes('fsd_profile') && !seen.has(urn)) {
        seen.add(urn);
        const urnPart = urn.match(/urn:li:fsd_profile:([A-Za-z0-9_-]+)/);
        if (urnPart) {
            const nameEl = el.querySelector('.actor-name, .search-result__title a, span[aria-label]');
            const headlineEl = el.querySelector('.subline-level-1, .search-result__snippet, .entity-result__primary-subtitle');
            results.push({
                urn: urnPart[1],
                name: nameEl?.textContent?.trim() || '',
                headline: headlineEl?.textContent?.trim() || '',
                profileUrl: nameEl?.href || ''
            });
        }
    }

P1 Pattern 1 uses forbidden layout class selectors

Pattern 1 queries .actor-name, .search-result__title a, .subline-level-1, .search-result__snippet, and .entity-result__primary-subtitle — all LinkedIn layout class names that CLAUDE.md explicitly forbids ("never class names tied to LinkedIn's layout"). These selectors will silently return empty name and headline strings as soon as LinkedIn renames or removes these classes during a UI update.
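
To illustrate the selection principle offline, the sketch below collects profile links by their `/in/` href alone and never looks at class names. It is not the extractor's code (that runs as JavaScript in the page); the class name `ProfileLinkParser` and the sample markup are hypothetical, and whether link text is a reliable source for the display name is an assumption.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class ProfileLinkParser(HTMLParser):
    """Collect (profile_url, link_text) for anchors whose href contains /in/."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []
        self._href: Optional[str] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and "/in/" in href:
            self._href = href
            self._text = []

    def handle_data(self, data):
        # Only accumulate text while inside a matching anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            url = self._href.split("?")[0]  # drop tracking parameters
            self.links.append((url, "".join(self._text).strip()))
            self._href = None


# Made-up markup; the class attribute is deliberately ignored.
html = (
    '<div class="anything"><a href="/in/jane-doe-123/?trk=x"><span>Jane Doe</span></a>'
    '<a href="/company/acme/">Acme</a></div>'
)
parser = ProfileLinkParser()
parser.feed(html)
print(parser.links)
```

Keying on the URL structure means a class rename on LinkedIn's side would not change what this sketch returns.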

Context Used: CLAUDE.md (source)

Comment on lines +2469 to +2490
// Pattern 3: Look in search result links with tracking params
const links = document.querySelectorAll('a[href*="/in/"][data-test-app-aware-link]');
for (const link of links) {
    const href = link.href;
    // Extract public ID from URL pattern like /in/username-123456/
    const usernameMatch = href.match(/\/in\/([^\/\?]+)/);
    if (usernameMatch) {
        const username = usernameMatch[1];
        // Generate URN from username - this is a best effort
        if (!seen.has(username)) {
            seen.add(username);
            const parent = link.closest('.search-result, .entity-result');
            const name = link.textContent?.trim() || username;
            results.push({
                urn: username, // Use username as identifier
                name: name,
                headline: parent?.querySelector('.entity-result__primary-subtitle')?.textContent?.trim() || '',
                profileUrl: href.split('?')[0]
            });
        }
    }
}

P1 Pattern 3 still conflates username with URN — contradicts PR description and breaks the API call

Pattern 3 stores a public profile username (e.g. john-smith-123abc) in urn and passes it to send_connection_request as profile_id. When the is_urn check fails, the code uses this value as trackingId, which is the wrong field — the Voyager API expects a trackingId from a search result interaction event, not a vanity username. The connection request will receive a 400 or be silently dropped. The PR description states Pattern 3 was removed; it is still present.
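
A minimal sketch of keeping the two identifiers apart is shown below, so a vanity username can never travel in a URN field. The names `SearchHit` and `invitee_urn` are hypothetical, and whether `inviteeProfileUrn` expects the full `urn:li:fsd_profile:` string or only the bare ID is not shown in this review, so the prefix here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SearchHit:
    """One people-search result; the two identifiers are not interchangeable."""

    public_id: str               # vanity username from /in/<public_id>/
    profile_urn: Optional[str]   # fsd_profile ID, None when not found on the page


def invitee_urn(hit: SearchHit) -> str:
    """Return the invitee URN, or raise instead of guessing from the username."""
    if hit.profile_urn is None:
        raise ValueError(f"No profile URN resolved for {hit.public_id!r}")
    # Assumption: the API field takes the full URN string.
    return f"urn:li:fsd_profile:{hit.profile_urn}"


resolved = SearchHit(public_id="jane-doe-123", profile_urn="ACoAAExample01")
unresolved = SearchHit(public_id="john-smith-123abc", profile_urn=None)
print(invitee_urn(resolved))
```

Failing loudly on an unresolved URN gives the caller an actionable error rather than a request that is rejected or dropped downstream.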

Context Used: CLAUDE.md (source)

Comment on lines +2412 to +2421
        # Navigate directly to search URL with keywords
        search_params = f"keywords={quote_plus(keywords)}"
        if location:
            search_params += f"&location={quote_plus(location)}"
        search_url = f"https://www.linkedin.com/search/results/people/?{search_params}"

        await self._navigate_to_page(search_url)
        await detect_rate_limit(self._page)
        await asyncio.sleep(2)  # Wait for JS to hydrate

P1 Double navigation violates "One section = one navigation" rule

search_people already calls _navigate_to_page(url) before invoking _search_people_urns. Inside _search_people_urns, another await self._navigate_to_page(search_url) reloads the same page. This wastes a full page-load round-trip, races with LinkedIn's JS hydration on the first load, and violates the CLAUDE.md rule "One section = one navigation." The DOM from the first navigation is already available; the method should operate on the current page rather than re-navigating.
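
The single-navigation structure could look roughly like the sketch below, where the URN helper takes no URL and only reads what is already loaded. `FakePage` is a hypothetical stand-in that counts navigations; it is not the Playwright page or the project's `_navigate_to_page`, and the page content is made up.

```python
import asyncio
from urllib.parse import quote_plus


class FakePage:
    """Hypothetical stand-in for a browser page that counts navigations."""

    def __init__(self) -> None:
        self.navigations = 0
        self.html = ""

    async def goto(self, url: str) -> None:
        self.navigations += 1
        self.html = '{"entityUrn":"urn:li:fsd_profile:ACoAAExample01"}'


async def extract_urns_from_current_page(page: FakePage) -> list[str]:
    # Reads the already-loaded content; takes no URL and never navigates.
    marker = "urn:li:fsd_profile:"
    return [chunk.split('"')[0] for chunk in page.html.split(marker)[1:]]


async def search_people(page: FakePage, keywords: str) -> dict:
    url = f"https://www.linkedin.com/search/results/people/?keywords={quote_plus(keywords)}"
    await page.goto(url)  # the only navigation for this section
    return {"url": url, "urns": await extract_urns_from_current_page(page)}


page = FakePage()
result = asyncio.run(search_people(page, "data engineer"))
print(page.navigations, result["urns"])
```

Dropping the URL parameter from the helper makes a second navigation impossible by construction, which is easier to keep true than a convention.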

Context Used: CLAUDE.md (source)

Comment on lines +3224 to +3230
        if not confirm_send:
            return self._message_action_result(
                self._page.url,
                "confirmation_required",
                "Set confirm_send=true to send the connection request.",
                recipient_selected=True,
            )

P1 Missing 300-character validation for connection request message

The docstring and PR description both document a 300-character limit on the personalized message, but no validation is implemented. LinkedIn's API will return a 400 error for messages exceeding this limit, leaving the caller with an unhelpful failure response instead of an actionable validation error.

Suggested change
-        if not confirm_send:
-            return self._message_action_result(
-                self._page.url,
-                "confirmation_required",
-                "Set confirm_send=true to send the connection request.",
-                recipient_selected=True,
-            )
+        if message and len(message) > 300:
+            return self._message_action_result(
+                profile_url,
+                "message_too_long",
+                f"Connection request message exceeds 300 characters ({len(message)}).",
+                recipient_selected=True,
+            )
+
+        if not confirm_send:
+            return self._message_action_result(
+                self._page.url,
+                "confirmation_required",
+                "Set confirm_send=true to send the connection request.",
+                recipient_selected=True,
+            )

        # Send connection request via Voyager API
        # Use the URN if available, otherwise use the public identifier
        is_urn = profile_id and profile_id.startswith("ACoAA")

P1 is_urn heuristic uses fragile prefix check instead of the documented regex

The PR description explicitly states the is_urn heuristic was "Fixed with robust regex (no hyphen + 10+ chars + starts uppercase)". The current startswith("ACoAA") will produce a false negative for any URN beginning with a different 5-character prefix (LinkedIn URNs are base64-encoded integers; ACoAA, ACoAAB, ACoAAC etc. are all valid starts). When the heuristic fails, the code sends trackingId: profile_id instead of inviteeProfileUrn, resulting in a failed or wrong API call.

Suggested change
-        is_urn = profile_id and profile_id.startswith("ACoAA")
+        import re
+        is_urn = bool(
+            profile_id
+            and re.match(r"^[A-Z][A-Za-z0-9_]{9,}$", profile_id)
+        )
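
The two heuristics can be compared side by side with the small sketch below. The sample identifiers are invented for illustration and are not real LinkedIn IDs. One trade-off worth noting: the regex excludes hyphens, while the extraction pattern elsewhere in this PR (`[A-Za-z0-9_-]+`) allows them, so an ID containing a hyphen would be rejected by the regex even though the prefix check accepts it.

```python
import re

# Regex from the suggested change: uppercase start, no hyphen, 10+ characters.
URN_ID_RE = re.compile(r"^[A-Z][A-Za-z0-9_]{9,}$")


def is_urn_prefix(profile_id: str) -> bool:
    """Current behaviour: accept only IDs that start with the literal ACoAA."""
    return bool(profile_id) and profile_id.startswith("ACoAA")


def is_urn_regex(profile_id: str) -> bool:
    """Suggested behaviour: shape-based check."""
    return bool(profile_id and URN_ID_RE.match(profile_id))


# Invented samples: a typical-looking ID, a different prefix,
# a vanity username, and an ID containing a hyphen.
samples = ["ACoAAExample01", "ADoAAExample01", "john-smith-123abc", "ACoAA-Example1"]
for value in samples:
    print(f"{value:20} prefix={is_urn_prefix(value)} regex={is_urn_regex(value)}")
```

If hyphenated IDs do occur in practice, adding `-` to the character class would align the check with the extraction regex; that is a judgment call for the PR author.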