feat(post): add search_posts content-search tool #532
Open
AkaNebur wants to merge 3 commits into
Adds a new MCP tool search_posts(keywords, date_posted=None, max_pages=3)
that drives LinkedIn's global "Posts" content-search tab. It surfaces
informal hiring posts ("we're hiring", "Buscamos ...", "join our team")
that often appear before a formal job listing exists -- distinct from
get_feed (the authenticated user's home feed) and get_company_posts (a
single company page), so it gets its own tools/post.py module mirroring
feed.py rather than folding into either.
Follows the existing search-tool conventions:
- New LinkedInExtractor.search_posts method plus a pure
_build_content_search_url static helper that composes
/search/results/content/?keywords=...&origin=FACETED_SEARCH and appends
the datePosted facet as a URL-encoded one-element JSON list via the
existing _encode_list_facet helper, mirroring how search_people encodes
its network/currentCompany facets (content search uses literal
datePosted tokens rather than job search's f_TPR=r<seconds> codes).
Underscore aliases normalise onto LinkedIn's tokens via
_CONTENT_DATE_POSTED_MAP.
- Content search is an infinite scroll with no &start= pagination, so
max_pages maps to scroll depth (~5 scrolls/page via
_CONTENT_SCROLLS_PER_PAGE).
- Returns the canonical {url, sections, references?, section_errors?}
shape: raw innerText under search_results plus feed_post permalink
references. No structured per-post objects (no stable, locale-
independent selector), matching the deliberate get_feed decision and the
AGENTS.md scraping philosophy.
- Invalid date_posted raises FilterValidationError (a ValueError
subclass), re-raised in the tool layer as ToolError so the actionable
message survives mask_error_details. Rate-limit responses surface as a
typed section_errors entry, mirroring get_feed.
- Thin tools/post.py:register_post_tools wired into server.py after
register_feed_tools; tools/__init__.py docstring updated.
- Two-layer tests (tests/test_scraping.py + tests/test_tools.py): URL
building, alias normalisation, scroll-depth mapping,
FilterValidationError -> ToolError surfacing, empty and rate-limited
results, the Field(ge=1, le=10) boundary, and search_posts added to both
timeout sweeps.
Adds the search_posts row to the README tool table (status: working), a Features bullet to docs/docker-hub.md, and the tool entry to the manifest.json tools array, per the CONTRIBUTING.md "Adding a New Tool" checklist.
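The scroll-depth mapping and the canonical result shape described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `scroll_budget`, `build_result`, the value `5`, and the layout of the `references` and `section_errors` entries are assumptions made for the example.

```python
from typing import Any

# Assumed value: the PR text only says "~5 scrolls/page".
CONTENT_SCROLLS_PER_PAGE = 5


def scroll_budget(max_pages: int) -> int:
    """Convert the page-style max_pages argument into a scroll count."""
    if not 1 <= max_pages <= 10:  # same bounds as Field(ge=1, le=10)
        raise ValueError("max_pages must be between 1 and 10")
    return max_pages * CONTENT_SCROLLS_PER_PAGE


def build_result(
    url: str,
    text: str,
    permalinks: list[str],
    rate_limited: bool = False,
) -> dict[str, Any]:
    """Assemble the {url, sections, references?, section_errors?} shape."""
    result: dict[str, Any] = {"url": url, "sections": {}}
    if rate_limited:
        # Hypothetical entry layout; the real typed entry may differ.
        result["section_errors"] = {
            "search_results": {"error_type": "rate_limit"}
        }
        return result
    if text:
        # Raw page text is passed through untouched for the LLM to parse.
        result["sections"]["search_results"] = text
    if permalinks:
        # Hypothetical reference layout, one entry per post permalink.
        result["references"] = {
            "search_results": [
                {"kind": "feed_post", "url": link} for link in permalinks
            ]
        }
    return result
```

With the default `max_pages=3`, this sketch yields a budget of 15 scrolls; a rate-limited response returns an empty `sections` dict plus a `section_errors` entry instead of raising.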
Addresses review feedback on stickerdaniel#532:
- _build_content_search_url now guards on date_posted.strip(), so a whitespace-only value (e.g. " ") is omitted from the URL instead of being appended as an invalid datePosted facet. The stripped value is also used as the alias-map fallback so passthrough tokens are normalised. This keeps the builder in sync with the search_posts validation, which already short-circuits on a falsy strip().
- Add a regression test for the whitespace case, plus a test for the previously-uncovered `elif extracted.error:` branch (a navigation error surfaces a typed section_errors entry, mirroring search_people).
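A rough sketch of the guarded URL builder, to make the whitespace case concrete. The helper bodies here are stand-ins written for this example: the real `_encode_list_facet` and `_CONTENT_DATE_POSTED_MAP` may differ, and every alias other than `past_week` is an assumption.

```python
import json
from typing import Optional
from urllib.parse import quote, quote_plus

# Assumed alias table; only past_week -> past-week appears in the PR text.
CONTENT_DATE_POSTED_MAP = {
    "past_24h": "past-24h",
    "past_week": "past-week",
    "past_month": "past-month",
}

BASE_URL = "https://www.linkedin.com/search/results/content/"


def encode_list_facet(values: list[str]) -> str:
    """Stand-in helper: compact JSON list, then percent-encode everything."""
    return quote(json.dumps(values, separators=(",", ":")), safe="")


def build_content_search_url(
    keywords: str, date_posted: Optional[str] = None
) -> str:
    url = f"{BASE_URL}?keywords={quote_plus(keywords)}&origin=FACETED_SEARCH"
    # Guard on the stripped value so " " does not become a datePosted facet.
    stripped = date_posted.strip() if date_posted else ""
    if stripped:
        # The stripped value doubles as the fallback for passthrough tokens.
        token = CONTENT_DATE_POSTED_MAP.get(stripped, stripped)
        url += f"&datePosted={encode_list_facet([token])}"
    return url
```

Under these assumptions, `build_content_search_url("hiring", "   ")` produces the same URL as omitting `date_posted`, while `" past_week "` and `" past-week "` both end in `&datePosted=%5B%22past-week%22%5D`.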
Thanks for the review! Both findings are addressed in db7d14b:
Full gate green locally and in CI:
Closes #531
Adds a new MCP tool `search_posts(keywords, date_posted=None, max_pages=3)` that drives LinkedIn's global "Posts" content-search tab and returns the matching posts. This is the surface for catching informal hiring posts ("we're hiring", "Buscamos ...", "estamos contratando", "join our team") that often appear before a formal job listing exists, so it is distinct from `get_feed` (the authenticated user's home feed) and `get_company_posts` (a single company page).
The tool follows the existing search-tool conventions. A new `LinkedInExtractor.search_posts` method owns all logic; a thin wrapper in the new `linkedin_mcp_server/tools/post.py` (`register_post_tools`, wired into `server.py` after `register_feed_tools`) only does `get_ready_extractor` + `ctx.report_progress` + the standard error double-catch. A pure `@staticmethod` `_build_content_search_url` composes `/search/results/content/?keywords=...&origin=FACETED_SEARCH` and appends the `datePosted` facet as a URL-encoded one-element JSON list (`["past-week"]`) via the existing `_encode_list_facet` helper, mirroring how `search_people` encodes its `network`/`currentCompany` facets, but with content search's literal `datePosted` tokens instead of job search's `f_TPR=r<seconds>` codes. Underscore aliases (`past_week`) normalise onto LinkedIn's exact tokens via the new `_CONTENT_DATE_POSTED_MAP`.
Content search is an infinite scroll with no `&start=` pagination, so `max_pages` is expressed as scroll depth (`max_scrolls = max_pages * _CONTENT_SCROLLS_PER_PAGE`, ~5 scrolls/page). The result is the canonical `{url, sections, references?, section_errors?}` shape: raw `innerText` under `sections["search_results"]` for the LLM to parse, with `references["search_results"]` surfacing `feed_post` permalinks plus post authors/companies.
Following the AGENTS.md scraping philosophy and the deliberate choice made for `get_feed`, no attempt is made to build structured per-post objects: there is no stable, locale-independent selector for that, so permalinks are surfaced via `references` and the rest stays raw text. A dedicated module (rather than folding into `feed.py`) keeps global content search separate from home-feed scraping, mirroring how `feed.py` and `messaging.py` are their own modules. Invalid `date_posted` raises `FilterValidationError` (a `ValueError` subclass), re-raised in the tool layer as a `ToolError` so the actionable message survives `mask_error_details`, identical to `search_people`. Rate-limit responses are surfaced as a typed `section_errors["search_results"]` entry rather than an exception, mirroring `get_feed`.
Docs and tests land per the CONTRIBUTING.md "Adding a New Tool" checklist: README tool-table row (`working`), `manifest.json` entry, `docs/docker-hub.md` Features bullet, and a `tools/__init__.py` category bullet. 13 new tests cover both layers (URL building, alias normalisation, scroll-depth mapping, `FilterValidationError`/`ToolError` surfacing, empty and rate-limited results, and the `Field(ge=1, le=10)` boundary rejection), and `search_posts` is added to both timeout sweeps.
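The validation path described above (an invalid `date_posted` raising `FilterValidationError`, re-raised as `ToolError` in the tool layer) might look roughly like this. Both exception classes are local stand-ins so the snippet runs on its own, and `validate_date_posted`, `search_posts_tool`, and the alias table are hypothetical names, not the PR's code.

```python
from typing import Optional


class FilterValidationError(ValueError):
    """Stand-in for the project's validation error (a ValueError subclass)."""


class ToolError(Exception):
    """Stand-in for the MCP framework's ToolError."""


# Assumed alias table; only past_week -> past-week appears in the PR text.
DATE_POSTED_ALIASES = {
    "past_24h": "past-24h",
    "past_week": "past-week",
    "past_month": "past-month",
}
VALID_TOKENS = set(DATE_POSTED_ALIASES.values())


def validate_date_posted(value: Optional[str]) -> Optional[str]:
    """Return a normalised token, or None when no filter was requested."""
    if value is None or not value.strip():
        return None  # short-circuit on a falsy strip()
    stripped = value.strip()
    token = DATE_POSTED_ALIASES.get(stripped, stripped)
    if token not in VALID_TOKENS:
        raise FilterValidationError(
            f"Invalid date_posted {value!r}; expected one of "
            f"{sorted(VALID_TOKENS)}"
        )
    return token


def search_posts_tool(keywords: str, date_posted: Optional[str] = None) -> dict:
    """Tool-layer wrapper: convert validation failures into ToolError."""
    try:
        token = validate_date_posted(date_posted)
    except FilterValidationError as exc:
        # Re-raise with the same text so the caller sees the actionable message.
        raise ToolError(str(exc)) from exc
    return {"keywords": keywords, "date_posted": token}
```

Re-raising with `str(exc)` keeps the list of accepted tokens in the message that reaches the client, which is the point of converting to `ToolError` before error masking applies.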
Verified locally on a clean rebase onto `main`: `uv run ruff check .`, `uv run ruff format --check .`, and `uv run ty check` all clean; `uv run pytest` green (the 13 new tests pass and the existing suite is unaffected).
Synthetic prompt
Generated with Claude Opus 4.8 (1M context)