Regexes sorting is incorrect #33

@gkalabin

According to the specification:

The list of regular-expressions regex shall be evaluated for a given user-agent string beginning with the first regex-item in the list to the last item. The first matching regex stops processing the list. Regex-matching shall be case sensitive.
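The first-match rule quoted above can be sketched in a few lines of Go. This is only an illustration: the `rule` type, `demoRules`, `firstMatch`, and the two simplified patterns are hypothetical stand-ins, not entries from `regexes.yaml` or code from the library.

```go
package main

import (
	"fmt"
	"regexp"
)

// specificUA contains both "Opera/" and "Opera Mini/", so both demo rules match it.
const specificUA = "Opera/9.80 (VRE; Opera Mini/4.2/28.2794; U; en) Presto/2.8.119 Version/11.10"

// rule pairs a pattern with the family it reports (hypothetical type).
type rule struct {
	family string
	re     *regexp.Regexp
}

// demoRules returns two simplified patterns in list order:
// the specific one first, the broad one second.
func demoRules() []rule {
	return []rule{
		{"Opera Mini", regexp.MustCompile(`Opera Mini/(\d+)`)},
		{"Opera", regexp.MustCompile(`Opera/(\d+)`)},
	}
}

// firstMatch walks the rules in list order and stops at the first match.
// Go's regexp package matches case-sensitively by default.
func firstMatch(rules []rule, ua string) string {
	for _, r := range rules {
		if r.re.MatchString(ua) {
			return r.family
		}
	}
	return "Other"
}

func main() {
	fmt.Println(firstMatch(demoRules(), specificUA)) // prints "Opera Mini"
}
```

Because evaluation stops at the first match, the result depends on list order whenever more than one pattern matches the same string.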

Here is a reproduction showing that sorting the regexes causes wrong detection results:

package main

import (
    "fmt"
    "log"

    "github.com/ua-parser/uap-go/uaparser"
)

const (
    // specificUA is matched by both regex X and regex Y; X precedes Y in the regex list
    specificUA = "Opera/9.80 (VRE; Opera Mini/4.2/28.2794; U; en) Presto/2.8.119 Version/11.10"
    // broadUA is matched by Y
    broadUA    = "Opera/9.80 (Windows NT 5.1; U; ru) Presto/2.5.24 Version/10.53"
)

func main() {
    sortThreshold := 100001
    parser, err := uaparser.NewWithOptions("./uap-core/regexes.yaml", uaparser.EUserAgentLookUpMode, sortThreshold, 0, true, true)
    if err != nil {
        log.Fatal(err)
    }

    // specificUA is matched by X, everything is fine
    beforeSort := parser.Parse(specificUA).UserAgent

    // Trigger the regex sort by parsing broadUA sortThreshold times; this makes regex Y bubble up
    for i := 0; i < sortThreshold; i++ {
        parser.Parse(broadUA)
    }

    // specificUA is now matched by Y, which bubbled up after the sort, so the parse result is wrong
    afterSort := parser.Parse(specificUA).UserAgent
    fmt.Printf("before sort:\t %#v\n after sort:\t %#v\n", beforeSort, afterSort)
}

Result:

$ go run test.go 
2016-08-10 18:21:37.284461949 +0300 MSK Sorting UserAgents slice
before sort:     &uaparser.UserAgent{Family:"Opera Mini", Major:"4", Minor:"2", Patch:""}
 after sort:     &uaparser.UserAgent{Family:"Opera", Major:"11", Minor:"10", Patch:""}
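The mechanism behind this output can be modelled without the library. The sketch below is a simplified assumption about how hit-count sorting interacts with first-match evaluation, not the actual uap-go implementation; `countedRule`, `parse`, `sortByHits`, `simulate`, and the two patterns are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

const (
	specificUA = "Opera/9.80 (VRE; Opera Mini/4.2/28.2794; U; en) Presto/2.8.119 Version/11.10"
	broadUA    = "Opera/9.80 (Windows NT 5.1; U; ru) Presto/2.5.24 Version/10.53"
)

// countedRule is a hypothetical entry that records how often its pattern matched.
type countedRule struct {
	family string
	re     *regexp.Regexp
	hits   int
}

// parse returns the family of the first matching rule and bumps its counter.
func parse(rules []*countedRule, ua string) string {
	for _, r := range rules {
		if r.re.MatchString(ua) {
			r.hits++
			return r.family
		}
	}
	return "Other"
}

// sortByHits moves frequently matched rules to the front,
// discarding the original list order.
func sortByHits(rules []*countedRule) {
	sort.SliceStable(rules, func(i, j int) bool { return rules[i].hits > rules[j].hits })
}

// simulate parses specificUA, parses broadUA n times, sorts by hit count,
// and parses specificUA again.
func simulate(n int) (before, after string) {
	rules := []*countedRule{
		{family: "Opera Mini", re: regexp.MustCompile(`Opera Mini/(\d+)`)}, // X: specific
		{family: "Opera", re: regexp.MustCompile(`Opera/(\d+)`)},           // Y: broad
	}
	before = parse(rules, specificUA)
	for i := 0; i < n; i++ {
		parse(rules, broadUA)
	}
	sortByHits(rules)
	after = parse(rules, specificUA)
	return before, after
}

func main() {
	before, after := simulate(3)
	fmt.Printf("before sort: %s\nafter sort: %s\n", before, after)
}
```

In this model, once Y has more hits than X the sort places Y first, and the same user-agent string that was classified as "Opera Mini" is classified as "Opera".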
