ActivityCollector collects wrong URLs with non-HTTP/FTP/File protocols #1324

@yuhui

Description

This issue should be considered to be a MAJOR BUG!

Expected Behaviour

ActivityCollector collects link URLs that conform to the expected URI schema. For example:

  • https://www.domain.com/path1/path2/page
  • file:///computer/folder1/folder2/file
  • mailto:[email protected]
  • tel:8675309
  • javascript:

Actual Behaviour

For non-HTTP/FTP/File protocols, ActivityCollector inserts the current page's URL path into the link URL. For example:

  • mailto:[email protected] becomes mailto://www.domain.com/path1/path2/mailto:[email protected]
  • tel:8675309 becomes tel://www.domain.com/path1/path2/tel:8675309
  • javascript: becomes javascript://www.domain.com/path1/path2/javascript:

Reason

When determining the URL to use with the link, ActivityCollector relies on the urlStartsWithScheme() check:

return !url ? false : /^[a-z0-9]+:\/\//i.test(url);

As you can see, this regular expression requires not only an alphanumeric scheme followed by a colon, but also the two slashes (://).

But non-HTTP/FTP/File URLs such as mailto:, tel: and javascript: links do not contain ://. For example, mailto:[email protected] is a valid URL.
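
To make the mismatch concrete, here is a minimal sketch that runs the quoted check against the example links from this issue. The mailto address is a placeholder, and hasScheme is a hypothetical alternative (not Alloy code) that follows the RFC 3986 scheme grammar: a letter, then letters, digits, "+", "-" or ".", then ":", with no "//" required.

```javascript
// The check quoted above: only matches "scheme://".
const urlStartsWithScheme = (url) =>
  !url ? false : /^[a-z0-9]+:\/\//i.test(url);

// Hypothetical alternative (an assumption, not Alloy code):
// RFC 3986 scheme syntax followed by ":", without requiring "//".
const hasScheme = (url) =>
  !url ? false : /^[a-z][a-z0-9+.-]*:/i.test(url);

const samples = [
  "https://www.domain.com/path1/path2/page",
  "file:///computer/folder1/folder2/file",
  "mailto:someone@example.com", // placeholder address
  "tel:8675309",
  "javascript:",
  "folder/page.html", // relative path, no scheme
];

// Print each sample with the result of both checks.
for (const url of samples) {
  console.log(url, urlStartsWithScheme(url), hasScheme(url));
}
```

The quoted check accepts only the first two samples. A scheme-only regular expression like hasScheme accepts the first five, but it would also treat any relative path that begins with "word:" as having a scheme, so it is shown for comparison rather than as a recommended fix.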

As a result, in this code block:

if (href && !urlStartsWithScheme(href)) {
  if (!protocol) {
    protocol = loc.protocol ? loc.protocol : "";
  }
  protocol = protocol ? `${protocol}//` : "";
  if (!host) {
    host = loc.host ? loc.host : "";
  }
  let path = "";
  if (href.substring(0, 1) !== "/") {
    let indx = loc.pathname.lastIndexOf("/");
    indx = indx < 0 ? 0 : indx;
    path = loc.pathname.substring(0, indx);
  }
  href = `${protocol}${host}${path}/${href}`;
}

Since urlStartsWithScheme() returns false, the link URL is treated as a relative path, so the current page's domain and path get inserted into it. As a result, mailto:[email protected] becomes mailto://www.domain.com/path1/path2/mailto:[email protected].
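
The tel: case can be traced with a simplified stand-in for the block above. This is a sketch under assumptions: buildHref is a hypothetical wrapper, loc is a plain object standing in for the page location, and the anchor is assumed to report protocol "tel:" and an empty host, which is consistent with the output described in this issue.

```javascript
// Hypothetical wrapper around the logic quoted in this issue, so it can run
// outside the browser. Not the actual Alloy function signature.
const buildHref = ({ href, protocol, host }, loc) => {
  const urlStartsWithScheme = (url) =>
    !url ? false : /^[a-z0-9]+:\/\//i.test(url);

  if (href && !urlStartsWithScheme(href)) {
    if (!protocol) {
      protocol = loc.protocol ? loc.protocol : "";
    }
    protocol = protocol ? `${protocol}//` : "";
    if (!host) {
      host = loc.host ? loc.host : "";
    }
    let path = "";
    if (href.substring(0, 1) !== "/") {
      let indx = loc.pathname.lastIndexOf("/");
      indx = indx < 0 ? 0 : indx;
      path = loc.pathname.substring(0, indx);
    }
    href = `${protocol}${host}${path}/${href}`;
  }
  return href;
};

// Stand-in for the current page's location (values from the examples above).
const loc = {
  protocol: "https:",
  host: "www.domain.com",
  pathname: "/path1/path2/page",
};

// Assumed anchor properties for a tel: link: own protocol, empty host.
const collected = buildHref(
  { href: "tel:8675309", protocol: "tel:", host: "" },
  loc,
);
console.log(collected); // tel://www.domain.com/path1/path2/tel:8675309
```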

Reproduce Scenario (including but not limited to)

Steps to Reproduce

  1. In a test web page, find a non-HTTP/FTP/File link, for example, a mailto:, tel: or javascript: link.
  2. Click the link.
  3. Observe the link URL that ActivityCollector collects.

Platform and Version

All

Suggestions

  1. In https://github.com/adobe/alloy/blob/c14795bf4ce8aad3ab3f5704b2c253ecafc1f5a8/src/components/ActivityCollector/utils/dom/getAbsoluteUrlFromAnchorElement.js, instead of building your own URL validator, use the URL() constructor to construct a valid URL from a string. See https://developer.mozilla.org/en-US/docs/Web/API/URL/URL.
  2. Improve the unit test to include URLs with a wider range of protocols in the urlsThatStartsWithScheme list:
    const urlsThatStartsWithScheme = [
      "http://example.com",
      "https://example.com",
      "https://example.com:123/example?example=123",
      "file://example.txt",
    ];
  3. Expand the unit test in https://github.com/adobe/alloy/blob/c14795bf4ce8aad3ab3f5704b2c253ecafc1f5a8/test/unit/specs/components/ActivityCollector/utils/dom/getAbsoluteUrlFromAnchorElement.spec.js to test for a wider variety of URLs with other protocols.
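
As a rough illustration of suggestion 1, the sketch below resolves a link with the URL() constructor, using the current page URL as the base. resolveLinkUrl is a hypothetical helper, not Alloy's implementation, the mailto address is a placeholder, and returning null for unparseable input is an assumption about the desired behaviour.

```javascript
// Hypothetical helper (an assumption, not Alloy code): let the platform's URL
// parser decide whether href is absolute or relative to baseUrl.
const resolveLinkUrl = (href, baseUrl) => {
  try {
    return new URL(href, baseUrl).href;
  } catch (e) {
    // new URL() throws a TypeError when the input cannot be parsed.
    return null;
  }
};

const base = "https://www.domain.com/path1/path2/page";

const candidates = [
  "tel:8675309",
  "mailto:someone@example.com", // placeholder address
  "other-page", // relative to the current directory
  "/top-level", // relative to the site root
  "https://example.com:123/example?example=123",
];

// Links that carry their own scheme come back unchanged; relative links are
// resolved against the base.
for (const href of candidates) {
  console.log(href, "->", resolveLinkUrl(href, base));
}
```

The same candidate list could also serve as a starting point for the wider unit test coverage in suggestions 2 and 3.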
