Commas in srcset-URLs are not handled correctly #458

@grob

Description

Although #243 is merged, srcset URLs containing commas are still not parsed/rewritten correctly; see https://web.archive.org/web/*/https://orf.at/ for an example.

The original URLs used in srcset attributes look like this: https://assets.orf.at/mims/2022/03/26/crops/w=875,q=90/1204287_opener_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=bad56ac4b6df02892d3bd744c8e9494d4fd72b50.

A complete srcset example used on this site:

<source media="(max-width: 600px)" srcset="https://assets.orf.at/mims/2022/03/26/crops/w=800,h=450,q=70/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=baff281a0ee94f81ed19d576f7eff4f0ed6e44c9 800w, https://assets.orf.at/mims/2022/03/26/crops/w=1280,h=720,q=60/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=735e42760bcc348a2afed7dde20a17bf2857caaf 1280w">

This results in (see here):

<source media="(max-width: 600px)" srcset="https://web.archive.org/web/20220114214021im_/https://assets.orf.at/mims/2022/03/26/crops/w=800, /web/20220114214021im_/https://orf.at/stories/3243632/h=450, /web/20220114214021im_/https://orf.at/stories/3243632/q=70/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=baff281a0ee94f81ed19d576f7eff4f0ed6e44c9 800w, https://web.archive.org/web/20220114214021im_/https://assets.orf.at/mims/2022/03/26/crops/w=1280, /web/20220114214021im_/https://orf.at/stories/3243632/h=720, /web/20220114214021im_/https://orf.at/stories/3243632/q=60/1204282_master_429226_coronavirus_schule_tests_vorschau_v1_a.jpg?s=735e42760bcc348a2afed7dde20a17bf2857caaf 1280w">
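The rewritten output above is consistent with the attribute being split on every comma, so that fragments such as `h=450` are treated as separate relative URLs. For illustration, here is a minimal Python sketch of comma-tolerant parsing, loosely following the "parse a srcset attribute" steps of the HTML Standard: the URL is the next run of non-whitespace characters (commas included), and only a comma after whitespace-separated descriptors, or at the very end of the URL token, ends a candidate. This is not the Wayback Machine's actual rewriting code; the names `parse_srcset`, `rewrite_srcset` and the `prefix` argument are hypothetical, and the sketch skips details such as parenthesised descriptors and resolving relative URLs.

```python
ASCII_WS = " \t\n\f\r"  # ASCII whitespace as used by HTML attribute parsing


def parse_srcset(value):
    """Split a srcset value into (url, descriptor) pairs (simplified sketch)."""
    candidates = []
    pos, n = 0, len(value)
    while pos < n:
        # Skip whitespace and stray commas between candidates.
        while pos < n and (value[pos] in ASCII_WS or value[pos] == ","):
            pos += 1
        if pos >= n:
            break
        # The URL is the next run of non-whitespace characters;
        # commas inside it are part of the URL.
        start = pos
        while pos < n and value[pos] not in ASCII_WS:
            pos += 1
        url = value[start:pos]
        descriptor = ""
        if url.endswith(","):
            # A trailing comma ends a candidate that has no descriptor.
            url = url.rstrip(",")
        else:
            # Otherwise the descriptor runs up to the next comma.
            start = pos
            while pos < n and value[pos] != ",":
                pos += 1
            descriptor = value[start:pos].strip(ASCII_WS)
        candidates.append((url, descriptor))
    return candidates


def rewrite_srcset(value, prefix):
    """Prepend a (hypothetical) replay prefix to every candidate URL."""
    parts = []
    for url, descriptor in parse_srcset(value):
        parts.append((prefix + url + " " + descriptor).strip())
    return ", ".join(parts)
```

With a prefix such as `https://web.archive.org/web/20220114214021im_/`, each of the two candidates in the example above would get exactly one prefix, and the `w=800,h=450,q=70` path segment would stay intact.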

Labels: archive.org (archive.org services not (just) Heritrix)