fix discourse thread links in article comments#33
Conversation
Deploying landing-page with
|
| Latest commit: |
de02eec
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://cd1bb35c.landing-page-334.pages.dev |
| Branch Preview URL: | https://opencode-curious-lagoon.landing-page-334.pages.dev |
WalkthroughThis PR implements end-to-end resolution of Discourse discussion URLs for individual articles. It adds a new sync script ( Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/sync-discourse-discussions.ts`:
- Around line 17-28: The current unbounded Promise.all over posts (creating
discussionEntries) can burst the Discourse endpoint; replace the
Promise.all(posts.map(...)) usage with a concurrency-limited mapper (e.g., use
p-map or an async worker pool) to call resolveDiscourseDiscussionUrl at a
controlled rate (suggest 5–10 concurrency). Keep the existing logic that builds
articleUrl via getAbsolutePostUrl and returns either [articleUrl, discussionUrl]
or null, but run those calls through the limiter so discussionEntries is
produced deterministically without overwhelming the endpoint; update the
variable discussionEntries to receive the limiter's results instead of
Promise.all.
In `@src/lib/discourse.server.ts`:
- Around line 9-35: The HTTP request in client.request can hang because there is
no timeout; update the call that creates the request (the request returned by
client.request inside discourse.server.ts) to apply a socket/response timeout
(e.g., 10_000 ms) using request.setTimeout(...) and on timeout destroy/reject
the request with a clear error message so the surrounding Promise rejects; keep
the existing request.on("error", ...) handler intact and ensure the timeout path
triggers reject with a descriptive error (e.g., "Discourse request timed out")
so sync/build flows won't hang.
In `@src/lib/discourse.ts`:
- Around line 7-8: The two anchor regex literals (/
<a[^>]*class=(['"])[^'"]*\bbutton\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu and the
similar
/<a[^>]*class=(['"])[^'"]*\bpost-date\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu)
only match when class appears before href; update these patterns (and the other
similar patterns in the block around lines 15-29) to be attribute-order
resilient by using order-agnostic matching (e.g., combine two lookaheads like
(?=[^>]*\bclass=(['"])[^'"]*\bbutton\b\1)(?=[^>]*\bhref=(['"])([^'"]+)\2)<a[^>]*>
to capture the href regardless of attribute order) and replace the existing
regex literals accordingly so extraction returns the href when href appears
before class.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e100073e-6c27-40c5-ae5d-0766f8f0367b
⛔ Files ignored due to path filters (1)
src/generated/discourse-discussions.jsonis excluded by!**/generated/**
📒 Files selected for processing (16)
.gitignoreAGENTS.mdpackage.jsonscripts/sync-discourse-discussions.tssrc/app/(ar)/[id]/page.tsxsrc/app/(ar)/blog/[slug]/page.tsxsrc/app/(en)/blog/[slug]/ArticlePageClient.tsxsrc/app/(en-root)/en/blog/[slug]/page.tsxsrc/components/DiscourseComments.tsxsrc/lib/article-pages.tssrc/lib/discourse.server.tssrc/lib/discourse.tstest/article-page-social-banners.test.tsxtest/article-pages.test.tstest/discourse-server.test.tstest/discourse.test.ts
| const discussionEntries = await Promise.all( | ||
| posts.map(async (post) => { | ||
| const articleUrl = getAbsolutePostUrl( | ||
| post.lang, | ||
| post.slug, | ||
| post.wpType === "post" && post.wpId === post.slug, | ||
| ); | ||
| const discussionUrl = await resolveDiscourseDiscussionUrl(articleUrl); | ||
|
|
||
| return discussionUrl ? ([articleUrl, discussionUrl] as const) : null; | ||
| }), | ||
| ); |
There was a problem hiding this comment.
Bound resolver concurrency to avoid burst failures against Discourse.
Resolving every post with unbounded Promise.all can overwhelm the endpoint and produce flaky sync output under larger post sets.
Proposed fix
import fs from "node:fs/promises";
import path from "node:path";
+import pLimit from "p-limit";
@@
async function main() {
const posts = [...getAllPosts("ar"), ...getAllPosts("en")].filter((post) => post.commentsEnabled);
+ const limit = pLimit(5);
const discussionEntries = await Promise.all(
- posts.map(async (post) => {
+ posts.map((post) => limit(async () => {
const articleUrl = getAbsolutePostUrl(
post.lang,
post.slug,
post.wpType === "post" && post.wpId === post.slug,
);
const discussionUrl = await resolveDiscourseDiscussionUrl(articleUrl);
return discussionUrl ? ([articleUrl, discussionUrl] as const) : null;
- }),
+ })),
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const discussionEntries = await Promise.all( | |
| posts.map(async (post) => { | |
| const articleUrl = getAbsolutePostUrl( | |
| post.lang, | |
| post.slug, | |
| post.wpType === "post" && post.wpId === post.slug, | |
| ); | |
| const discussionUrl = await resolveDiscourseDiscussionUrl(articleUrl); | |
| return discussionUrl ? ([articleUrl, discussionUrl] as const) : null; | |
| }), | |
| ); | |
| const limit = pLimit(5); | |
| const discussionEntries = await Promise.all( | |
| posts.map((post) => limit(async () => { | |
| const articleUrl = getAbsolutePostUrl( | |
| post.lang, | |
| post.slug, | |
| post.wpType === "post" && post.wpId === post.slug, | |
| ); | |
| const discussionUrl = await resolveDiscourseDiscussionUrl(articleUrl); | |
| return discussionUrl ? ([articleUrl, discussionUrl] as const) : null; | |
| })), | |
| ); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/sync-discourse-discussions.ts` around lines 17 - 28, The current
unbounded Promise.all over posts (creating discussionEntries) can burst the
Discourse endpoint; replace the Promise.all(posts.map(...)) usage with a
concurrency-limited mapper (e.g., use p-map or an async worker pool) to call
resolveDiscourseDiscussionUrl at a controlled rate (suggest 5–10 concurrency).
Keep the existing logic that builds articleUrl via getAbsolutePostUrl and
returns either [articleUrl, discussionUrl] or null, but run those calls through
the limiter so discussionEntries is produced deterministically without
overwhelming the endpoint; update the variable discussionEntries to receive the
limiter's results instead of Promise.all.
| const request = client.request( | ||
| url, | ||
| { | ||
| headers: { | ||
| Accept: "text/html,application/xhtml+xml", | ||
| }, | ||
| method: "GET", | ||
| }, | ||
| (response) => { | ||
| const chunks: Buffer[] = []; | ||
|
|
||
| response.on("data", (chunk) => { | ||
| chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk)); | ||
| }); | ||
|
|
||
| response.on("end", () => { | ||
| resolve({ | ||
| ok: (response.statusCode ?? 500) >= 200 && (response.statusCode ?? 500) < 300, | ||
| body: Buffer.concat(chunks).toString("utf8"), | ||
| }); | ||
| }); | ||
| }, | ||
| ); | ||
|
|
||
| request.on("error", reject); | ||
| request.end(); | ||
| }); |
There was a problem hiding this comment.
Add a request timeout to prevent hanging sync/build flows.
The HTTP request has no timeout, so stalled network calls can block the sync pipeline indefinitely.
Proposed fix
function fetchText(url: URL): Promise<{ ok: boolean; body: string }> {
const client = url.protocol === "http:" ? http : https;
+ const timeoutMs = 10_000;
return new Promise((resolve, reject) => {
const request = client.request(
@@
},
);
+ request.setTimeout(timeoutMs, () => {
+ request.destroy(new Error(`Request timed out after ${timeoutMs}ms`));
+ });
request.on("error", reject);
request.end();
});
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const request = client.request( | |
| url, | |
| { | |
| headers: { | |
| Accept: "text/html,application/xhtml+xml", | |
| }, | |
| method: "GET", | |
| }, | |
| (response) => { | |
| const chunks: Buffer[] = []; | |
| response.on("data", (chunk) => { | |
| chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk)); | |
| }); | |
| response.on("end", () => { | |
| resolve({ | |
| ok: (response.statusCode ?? 500) >= 200 && (response.statusCode ?? 500) < 300, | |
| body: Buffer.concat(chunks).toString("utf8"), | |
| }); | |
| }); | |
| }, | |
| ); | |
| request.on("error", reject); | |
| request.end(); | |
| }); | |
| function fetchText(url: URL): Promise<{ ok: boolean; body: string }> { | |
| const client = url.protocol === "http:" ? http : https; | |
| const timeoutMs = 10_000; | |
| return new Promise((resolve, reject) => { | |
| const request = client.request( | |
| url, | |
| { | |
| headers: { | |
| Accept: "text/html,application/xhtml+xml", | |
| }, | |
| method: "GET", | |
| }, | |
| (response) => { | |
| const chunks: Buffer[] = []; | |
| response.on("data", (chunk) => { | |
| chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk)); | |
| }); | |
| response.on("end", () => { | |
| resolve({ | |
| ok: (response.statusCode ?? 500) >= 200 && (response.statusCode ?? 500) < 300, | |
| body: Buffer.concat(chunks).toString("utf8"), | |
| }); | |
| }); | |
| }, | |
| ); | |
| request.setTimeout(timeoutMs, () => { | |
| request.destroy(new Error(`Request timed out after ${timeoutMs}ms`)); | |
| }); | |
| request.on("error", reject); | |
| request.end(); | |
| }); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/discourse.server.ts` around lines 9 - 35, The HTTP request in
client.request can hang because there is no timeout; update the call that
creates the request (the request returned by client.request inside
discourse.server.ts) to apply a socket/response timeout (e.g., 10_000 ms) using
request.setTimeout(...) and on timeout destroy/reject the request with a clear
error message so the surrounding Promise rejects; keep the existing
request.on("error", ...) handler intact and ensure the timeout path triggers
reject with a descriptive error (e.g., "Discourse request timed out") so
sync/build flows won't hang.
| /<a[^>]*class=(['"])[^'"]*\bbutton\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu, | ||
| /<a[^>]*class=(['"])[^'"]*\bpost-date\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu, |
There was a problem hiding this comment.
Make discussion-link extraction resilient to attribute order.
Current patterns only match <a> tags when class appears before href. If Discourse emits href first, extraction fails and returns null even though the link exists.
Proposed fix
-const DISCUSSION_LINK_PATTERNS = [
- /<a[^>]*class=(['"])[^'"]*\bbutton\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu,
- /<a[^>]*class=(['"])[^'"]*\bpost-date\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu,
-];
+const DISCUSSION_LINK_PATTERNS = [
+ /<a(?=[^>]*\bclass=(['"])[^'"]*\bbutton\b[^'"]*\1)(?=[^>]*\bhref=(['"])([^'"]+)\2)[^>]*>/iu,
+ /<a(?=[^>]*\bclass=(['"])[^'"]*\bpost-date\b[^'"]*\1)(?=[^>]*\bhref=(['"])([^'"]+)\2)[^>]*>/iu,
+];Also applies to: 15-29
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/discourse.ts` around lines 7 - 8, The two anchor regex literals (/
<a[^>]*class=(['"])[^'"]*\bbutton\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu and the
similar
/<a[^>]*class=(['"])[^'"]*\bpost-date\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu)
only match when class appears before href; update these patterns (and the other
similar patterns in the block around lines 15-29) to be attribute-order
resilient by using order-agnostic matching (e.g., combine two lookaheads like
(?=[^>]*\bclass=(['"])[^'"]*\bbutton\b\1)(?=[^>]*\bhref=(['"])([^'"]+)\2)<a[^>]*>
to capture the href regardless of attribute order) and replace the existing
regex literals accordingly so extraction returns the href when href appears
before class.
There was a problem hiding this comment.
Pull request overview
This PR resolves exact Discourse discussion thread URLs at build/dev sync time and passes them into static article pages so the comments CTA can link directly to the matching forum thread.
Changes:
- Added Discourse URL helpers, a server-side resolver, a generated discussion map, and a sync script wired into
dev/build. - Updated article routes and
DiscourseCommentsto pass/use resolved discussion URLs. - Added Vitest coverage and updated related docs/ignore rules.
Reviewed changes
Copilot reviewed 15 out of 17 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
test/discourse.test.ts |
Adds helper tests for article URL building and discussion-link extraction. |
test/discourse-server.test.ts |
Adds resolver tests against a local embed endpoint. |
test/article-pages.test.ts |
Adds generated-map lookup tests for article discussion URLs. |
test/article-page-social-banners.test.tsx |
Adds UI coverage for the comments CTA using a resolved URL. |
src/lib/discourse.ts |
Introduces shared Discourse constants, URL construction, and HTML extraction helpers. |
src/lib/discourse.server.ts |
Adds server-side embed-page fetching and discussion URL resolution. |
src/lib/article-pages.ts |
Adds article-level discussion URL lookup from the generated map. |
src/generated/discourse-discussions.json |
Adds the generated article-to-thread URL map. |
src/components/DiscourseComments.tsx |
Accepts and uses an optional resolved discussion URL for CTA/fallback links. |
src/app/(en)/blog/[slug]/ArticlePageClient.tsx |
Computes canonical article URLs through the shared helper and passes discussion URLs to comments. |
src/app/(en-root)/en/blog/[slug]/page.tsx |
Resolves discussion URLs for English article pages. |
src/app/(ar)/blog/[slug]/page.tsx |
Resolves discussion URLs for Arabic blog pages. |
src/app/(ar)/[id]/page.tsx |
Resolves discussion URLs for root-level Arabic WordPress article pages. |
scripts/sync-discourse-discussions.ts |
Adds the build/dev sync process for generated discussion maps. |
package.json |
Runs the Discourse sync during dev and build. |
AGENTS.md |
Updates local workflow/build documentation. |
.gitignore |
Ignores the generated public discussion-map artifact. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| - Use `pnpm dev` for local work; it runs `tsx scripts/sync-content-assets.ts` first, then `next dev` on port `25211` by default. | ||
| - Use `pnpm build` to verify changes; this repo is a static export (`output: 'export'`), so `pnpm start` serves the generated `out/` directory, not a long-running Next server. | ||
| - Use `pnpm dev` for local work; it runs `tsx scripts/sync-content-assets.ts`, `tsx scripts/sync-discourse-discussions.ts`, then `next dev` on port `25211` by default. |
| const discussionUrl = await resolveDiscourseDiscussionUrl(articleUrl); | ||
|
|
||
| return discussionUrl ? ([articleUrl, discussionUrl] as const) : null; |
| const request = client.request( | ||
| url, | ||
| { | ||
| headers: { | ||
| Accept: "text/html,application/xhtml+xml", | ||
| }, | ||
| method: "GET", | ||
| }, |
| const discussionEntries = await Promise.all( | ||
| posts.map(async (post) => { |
| const DISCUSSION_LINK_PATTERNS = [ | ||
| /<a[^>]*class=(['"])[^'"]*\bbutton\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu, | ||
| /<a[^>]*class=(['"])[^'"]*\bpost-date\b[^'"]*\1[^>]*href=(['"])([^'"]+)\2/iu, | ||
| ]; | ||
|
|
||
| export function getAbsolutePostUrl(lang: Lang, slug: string, isWordPressPost = false): string { | ||
| return new URL(getPostPath(lang, slug, isWordPressPost), SITE_URL).toString(); | ||
| } | ||
|
|
||
| export function extractDiscourseDiscussionUrl(html: string): string | null { | ||
| for (const pattern of DISCUSSION_LINK_PATTERNS) { | ||
| const match = html.match(pattern); | ||
| const href = match?.[3]?.trim(); | ||
|
|
||
| if (!href) { | ||
| continue; | ||
| } | ||
|
|
||
| try { | ||
| return new URL(href, DISCOURSE_URL).toString(); | ||
| } catch { | ||
| continue; | ||
| } | ||
| } | ||
|
|
||
| return null; | ||
| } |
| try { | ||
| return new URL(href, DISCOURSE_URL).toString(); |
Summary
Verification
out/blog/aosus-v3.htmlandout/en/blog/aosus-v3.htmlcontainhttps://discourse.aosus.org/t/topic/5323/2