Skip to content

Conversation

@dcshzj
Copy link
Contributor

@dcshzj dcshzj commented Jul 22, 2025

Problem

What problem are you trying to solve? What issue does this close?

Closes ISOM-2006.

Solution

How did you solve the problem?

Features:

  • Add a check on the destination URL to ensure that it is not malicious just before redirecting the user.
  • Save the malicious scan result into the database with an expiry date.
  • Automatically disable the link and email the link owner when the destination URL is detected to be malicious.

Tests

What tests should be run to confirm functionality?

TO BE ADDED

@dcshzj dcshzj requested review from a team and Copilot July 22, 2025 10:56
@linear
Copy link

linear bot commented Jul 22, 2025

This comment was marked as outdated.

@dcshzj dcshzj force-pushed the feat/actively-check-malicious branch 2 times, most recently from d384206 to d9f9416 Compare July 22, 2025 11:02
@dcshzj dcshzj requested a review from Copilot July 22, 2025 11:03

This comment was marked as outdated.

@dcshzj dcshzj force-pushed the feat/actively-check-malicious branch from d9f9416 to 2ce81ff Compare July 22, 2025 13:46
@dcshzj dcshzj requested a review from Copilot July 22, 2025 13:46
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements malicious link detection and prevention in the URL shortening service. When users access shortened URLs, the system now performs safety checks on the destination URLs and automatically deactivates malicious links while notifying link owners.

  • Adds real-time malicious URL detection during redirects with database caching of scan results
  • Implements automatic link deactivation and email notifications for detected threats
  • Updates repository and service interfaces to support safety browsing functionality

Reviewed Changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/server/repositories/UrlRepository.ts Core implementation for safe browsing expiry tracking and URL deactivation
src/server/modules/redirect/services/RedirectService.ts Malicious link detection logic during redirect flow
src/server/services/email.ts Email notification service for deactivated malicious links
src/server/modules/user/services/UrlManagementService.ts Service methods for deactivating malicious URLs and notifying owners
src/server/util/safeBrowsing.ts Utility for calculating safe browsing result expiry dates
Comments suppressed due to low confidence (2)

test/server/repositories/UrlRepository.test.ts:711

  • Test implementation is skipped for updateSafeBrowsingExpiry method. This leaves the happy path uncovered for a critical security feature.
    it.skip('should update the safe browsing expiry for an existing shortUrl', () => {})

test/server/repositories/UrlRepository.test.ts:725

  • Test implementation is skipped for deactivateShortUrl method. This leaves the happy path uncovered for a critical security feature.
    it.skip('should deactivate an existing shortUrl', () => {})

@dcshzj dcshzj force-pushed the feat/actively-check-malicious branch from 2ce81ff to 705b25f Compare July 23, 2025 02:19
@harishv7 harishv7 requested review from gweiying and thanhdatle July 23, 2025 04:58
Copy link
Collaborator

@seaerchin seaerchin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issues wrt the data structure being changed impacting the code, especially because we rely on it for doing our code scanning.

resolve(cacheLongUrl)

try {
const redirectDestination = JSON.parse(cacheLongUrl)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[issue] i think there are 2 issues here:

first, i think we updated the data structure from string -> json, which might cause issues when we're trying to read the new value. i think our new code cannot reject outright and we'd need to handle the fallback of parsing a string value rather than our json encoded value.

secondly, because our deploy window is so long, we need to handle the case where our old code reads a json encoded value. this will cause our existing flow to mess up, so it needs to be handled properly too.

from our offline discussion, i think we can do a small release so that we wrap the JSON.parse method here to try reading it as a JSON value first, then fallback to raw string

Copy link
Contributor Author

@dcshzj dcshzj Aug 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 647fee7, will push a second PR to revert back the changes

logger.error(`Unable to cache short URL: ${error}`),
)
return longUrl
public getLongUrl: (shortUrl: string) => Promise<RedirectDestination> =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to relax this return condition to RedirectDestination | string due to the previously mentioned issue of data format changing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me do a split of the PRs so they get released separately.

@seaerchin
Copy link
Collaborator

(some stuff that claude pointed out)

  1. our scanning takes place in the hot path, which might delay the redirect, especially because this is user -> us -> webrisk
  2. rate limiting for threat scanning API call (i think we don't need this because we cache our results and we also check if we need to scan first)
  3. silent failure on UrlManagementService:237 when we try to lookup users (<- afaict, we don't throw an error but we do catch the null case)

@adriangohjw
Copy link
Contributor

bugbot run

`longUrl not found in cache:\tshortUrl=${shortUrl}`,
),
)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Cache Parsing Error Causes Premature Cache Misses

The cache format for short URL redirects changed from plain strings to JSON objects. getLongUrlFromCache attempts to JSON.parse() existing string-formatted cache entries. On parsing failure, it incorrectly rejects with NotFoundError, causing valid cached URLs to return 404s or be treated as cache misses until old entries expire. The code should gracefully handle the legacy string format.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful, this was the same point raised by @seaerchin earlier. To mitigate, we will do as a 3-step release:

  1. Update app code to handle both cases, using the JSON.parse function optimistically, then falling back to using strings
  2. Update app code to start inserting using the new JSON object
  3. Update app code to remove the handling of the string case

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise bugbot

@dcshzj dcshzj force-pushed the feat/actively-check-malicious branch 2 times, most recently from 6e62156 to 647fee7 Compare August 4, 2025 16:03
@dcshzj dcshzj requested a review from seaerchin August 5, 2025 02:59
const redirectDestination = await this.getLongUrlFromDatabase(shortUrl)
// FIXME: Cache the entire RedirectDestination object once all app
// clients are updated to use the new RedirectDestination type.
this.cacheShortUrl(shortUrl, redirectDestination.longUrl).catch(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[question] why are we not awaiting the cache call? doesn't seem like it has an impact, but just curious

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I think it's cos the adding to cache part can be done later, so it doesn't impact the person being redirected

@dcshzj dcshzj changed the base branch from feat/add-safebrowsing-expiry to develop August 20, 2025 10:14
@dcshzj dcshzj force-pushed the feat/actively-check-malicious branch from 647fee7 to 2203591 Compare August 20, 2025 10:25
@dcshzj dcshzj merged commit ad43c42 into develop Aug 20, 2025
16 checks passed
dcshzj added a commit that referenced this pull request Aug 20, 2025
* Merge pull request #2402 from opengovsg/feat/add-safebrowsing-expiry

feat: add safeBrowsingExpiry column to urls table

* feat: check for malicious link just before redirect (#2403)

* feat: check for malicious link just before redirect

* fix: do not update as JSON object yet

* fix: update method signature

* 1.83.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants