Incorrect processing of closing round bracket #33

@app13y

Naive example

The regular expression (foo)(.*(2.)) does not match foo123, although it should.

Explanation

This is due to incorrect processing of the closing bracket in the foo() function:

/* ... */
if (re[i] == ')') {
    int ind = info->brackets[info->num_brackets - 1].len == -1 
         ? info->num_brackets - 1 
         : depth;
    info->brackets[ind].len = (int) (&re[i] - info->brackets[ind].ptr);
    depth--;
    FAIL_IF(depth < 0, SLRE_UNBALANCED_BRACKETS);
    FAIL_IF(i > 0 && re[i - 1] == '(', SLRE_NO_MATCH);
}
/* ... */

How it is done

  • Only the most recently added capturing group is considered:
    • If it is still open (its len is -1), the encountered closing bracket is matched to that group's opening bracket.
    • Otherwise, the closing bracket is matched to the opening bracket of the group with index depth, which is incorrect in general.

In the example above, the last closing bracket closes the capturing group with index 1, which is foo, rather than the capturing group .*(2.), because depth is 1 at that point. This issue can be triggered by a wide range of regular expressions; the one shown is merely a proof of concept constructed from one of your unit tests.

How it shall be done

An encountered closing bracket shall match the last unmatched opening bracket. One shall search for the last capturing group whose length is still set to -1, rather than assume the capturing group with index depth.

/* ... */
if (re[i] == ')') {
    int ind = search_for_unmatched_bracket(info);
    info->brackets[ind].len = (int) (&re[i] - info->brackets[ind].ptr);
    depth--;
    FAIL_IF(depth < 0, SLRE_UNBALANCED_BRACKETS);
    FAIL_IF(i > 0 && re[i - 1] == '(', SLRE_NO_MATCH);
}
/* ... */
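
The body of search_for_unmatched_bracket() is not given above, so here is one possible sketch of it, together with a small scanner that compares the current index rule with the proposed one. Everything in it is an assumption made for illustration: struct bracket_sketch, struct regex_info_sketch, MAX_BRACKETS and scan_brackets() are hypothetical simplified stand-ins, offsets are used instead of pointers, and escapes and character classes are ignored.

```c
#include <string.h>

#define MAX_BRACKETS 16

/* Hypothetical, simplified stand-ins for the library's bookkeeping. */
struct bracket_sketch {
    int start;  /* offset of the first character after '(' */
    int len;    /* group length, or -1 while the group is still open */
};

struct regex_info_sketch {
    struct bracket_sketch brackets[MAX_BRACKETS];
    int num_brackets;
};

/*
 * Returns the index of the most recently opened group that is still open,
 * or -1 if every group is already closed. Index 0 is skipped because, in
 * this sketch, it stands for the whole expression.
 */
static int search_for_unmatched_bracket(const struct regex_info_sketch *info) {
    int i;
    for (i = info->num_brackets - 1; i > 0; i--) {
        if (info->brackets[i].len == -1) return i;
    }
    return -1;
}

/*
 * Records group boundaries for re. With use_fix == 0 the index is chosen as
 * in the snippet quoted in "Explanation"; otherwise the backward search is
 * used. Returns 0 on success, -1 on unbalanced brackets or too many groups.
 */
static int scan_brackets(const char *re, struct regex_info_sketch *info,
                         int use_fix) {
    int i, ind, depth = 0, re_len = (int) strlen(re);

    info->brackets[0].start = 0;
    info->brackets[0].len = re_len;
    info->num_brackets = 1;

    for (i = 0; i < re_len; i++) {
        if (re[i] == '(') {
            if (info->num_brackets >= MAX_BRACKETS) return -1;
            depth++;
            info->brackets[info->num_brackets].start = i + 1;
            info->brackets[info->num_brackets].len = -1;
            info->num_brackets++;
        } else if (re[i] == ')') {
            if (use_fix) {
                ind = search_for_unmatched_bracket(info);
            } else {
                ind = info->brackets[info->num_brackets - 1].len == -1
                    ? info->num_brackets - 1
                    : depth;
            }
            if (ind < 0) return -1;  /* no open group left to close */
            info->brackets[ind].len = i - info->brackets[ind].start;
            depth--;
            if (depth < 0) return -1;
        }
    }
    return depth == 0 ? 0 : -1;
}
```

For (foo)(.*(2.)), calling scan_brackets() with use_fix set to 0 leaves group 1 with length 11 and group 2 still at -1, while use_fix set to 1 yields lengths 3, 6 and 2 for groups 1, 2 and 3. The backward search is linear in the number of groups, which should be acceptable given that the bracket array is small and fixed in size.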

Regards, Arseny.
