This repository was archived by the owner on May 14, 2020. It is now read-only.

Regexen with UTF8 characters inside a class lead to incorrect rules #1481

Open
@fgsch

Type of Issue

Bug

Description

Rules with UTF-8 characters inside a character class end up with the character's individual UTF-8 bytes as class members instead of the single code point. For example, in 941110 we get:

(?i)[<\xef\xbc\x9c]script[^>\xef\xbc\x9e]*[>\xef\xbc\x9e][\s\S]*?

instead of:

(?i)[<\x{ff1c}]script[^>\x{ff1e}]*[>\x{ff1e}][\s\S]*?

This currently affects 6 rules:

941110 fixed
942430
942420
942431
942421
942432
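
To illustrate why the byte-wise form is wrong, here is a minimal sketch. It uses Python's `re` module as a stand-in for the PCRE engine used by ModSecurity, which is an assumption; the names `byte_class` and `char_class` are made up for this example. Inside a class, each `\xNN` escape is a separate member, so the class matches any one of those bytes rather than the three-byte sequence for U+FF1C:

```python
import re

# U+FF1C (fullwidth less-than sign) is encoded in UTF-8 as the bytes EF BC 9C.
FULLWIDTH_LT = "\uff1c"
# U+FF1E (fullwidth greater-than sign) is encoded as EF BC 9E.
FULLWIDTH_GT = "\uff1e"

# Byte-oriented class, shaped like the incorrect rule:
# four independent members: '<', 0xEF, 0xBC, 0x9C.
byte_class = re.compile(rb"[<\xef\xbc\x9c]")

# Character-oriented class, shaped like the intended rule:
# two members: '<' and the single code point U+FF1C.
char_class = re.compile("[<\uff1c]")

# The byte class consumes only one byte of the intended character ...
partial = byte_class.search(FULLWIDTH_LT.encode("utf-8")).group()

# ... and also matches the first byte of a different character (U+FF1E),
# because both encodings start with 0xEF.
false_hit = byte_class.search(FULLWIDTH_GT.encode("utf-8")).group()

# The character class matches U+FF1C as a whole and rejects U+FF1E.
whole = char_class.fullmatch(FULLWIDTH_LT) is not None
rejected = char_class.search(FULLWIDTH_GT) is None

print(partial, false_hit, whole, rejected)
```

In the byte-wise rule, `[^>\xef\xbc\x9e]*` has the same problem in reverse: it excludes every character whose encoding contains any of those three bytes, not just U+FF1E.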

Your Environment

  • CRS version: v3.2/dev
  • ModSecurity version: 2.9.3
  • Web Server and version: Apache 2.4.39
  • Operating System and version: Debian stretch

Confirmation

[x] I have removed any personal data (email addresses, IP addresses,
passwords, domain names) from any logs posted.
