[BUG] MQTT Connector Fails to Match Topics with Regex Metacharacters #2084

@tijn-hedgehog

Connector name
MQTT Connector

Describe the bug
The topic_to_regex() function in tb_utility.py does not escape regex metacharacters in topic filters before they are used with fullmatch() for internal message routing. This causes topics containing characters like parentheses (), brackets [], dots ., or other regex-special characters to silently fail matching. Messages are successfully received via MQTT subscription but are dropped during internal routing with a debug log entry: "Received message to topic ... with unknown interpreter data".

The root cause is in tb_utility.py line 114:
topic.replace("+", "[^/]+").replace("#", ".+").replace('$', '\\$')

This only converts MQTT wildcards (+, #) and escapes $, but does not escape other regex metacharacters. For example, a topic filter containing "(CCS2)" is interpreted as a regex capture group instead of literal parentheses, causing fullmatch() to fail.
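The mismatch can be reproduced outside the gateway. The following is a minimal sketch, assuming only the replace chain quoted above; `topic_to_regex_current` is a hypothetical stand-in name for illustration, not the gateway's actual function body.

```python
import re


def topic_to_regex_current(topic: str) -> str:
    # Hypothetical stand-in: the same replace chain quoted from tb_utility.py line 114.
    return topic.replace("+", "[^/]+").replace("#", ".+").replace('$', '\\$')


topic_filter = "my/topic/device name (model)/+"
incoming_topic = "my/topic/device name (model)/data"

pattern = topic_to_regex_current(topic_filter)

# "(model)" is parsed as a capture group, so the pattern expects the text
# "model" without parentheses and the real topic does not match.
no_match = re.fullmatch(pattern, incoming_topic)

# A topic without the literal parentheses matches instead.
wrong_match = re.fullmatch(pattern, "my/topic/device name model/data")

print(no_match is None, wrong_match is not None)
```

The pattern only matches a topic that lacks the literal parentheses, which is consistent with the message being received by the subscription but dropped during routing.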

Escaping the parentheses in the config (\(CCS2\)) is not a viable workaround because topicFilter is used for both MQTT subscription (mqtt_connector.py line 485, raw string) and regex matching (mqtt_connector.py line 628, converted string). Escaped parentheses would fix the regex but break the MQTT subscription since the broker receives literal backslashes.

Steps to Reproduce

  1. Configure an MQTT connector with a topic filter containing parentheses:
{
  "topicFilter": "my/topic/device name (model)/+",
  "converter": {
    "type": "json",
    "deviceInfo": {
      "deviceNameExpressionSource": "constant",
      "deviceNameExpression": "MyDevice",
      "deviceProfileExpressionSource": "constant",
      "deviceProfileExpression": "default"
    },
    "timeseries": [{ "type": "double", "key": "value", "value": "${value}" }]
  }
}
  2. Publish an MQTT message to my/topic/device name (model)/data with payload {"value": 42}
  3. Observe that the gateway log shows:
    Received message to topic "my/topic/device name (model)/data" with unknown interpreter data:
  4. The message is received but never routed to the converter: no device is created, no telemetry is stored

Error traceback:
No exception is raised. The message is silently dropped with a DEBUG level log:

|DEBUG| - [mqtt_connector.py] - mqtt_connector - _process_on_message - 695 - Received message to topic "my/topic/device name (model)/data" with unknown interpreter data:

Versions
OS: Debian (Docker container, ARM64)
ThingsBoard IoT Gateway version: 3.8.2
Python version: 3.11

Additional context
Affected source files:

  • tb_utility.py line 114 (topic_to_regex — missing re.escape())
  • mqtt_connector.py line 628 (fullmatch(regex, message.topic) — fails on unescaped metacharacters)
  • mqtt_connector.py line 485 (self.__subscribe(mapping["topicFilter"], ...) — uses raw string, which is correct)

Any MQTT topic containing regex metacharacters ((, ), [, ], ., *, ?, {, }, |, ^) in the literal topic path will trigger this bug. The current workaround is to use + wildcards to skip topic levels containing these characters and extract the device name from the topic using deviceNameExpressionSource: "topic".
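
One possible direction for a fix is to escape the literal parts of the filter with re.escape() and translate only the MQTT wildcards. This is a sketch under assumptions, not a tested patch against the gateway: `topic_to_regex_escaped` is a hypothetical name, and the wildcard translations ("[^/]+" for + and ".+" for #) are carried over unchanged from the line quoted above.

```python
import re


def topic_to_regex_escaped(topic_filter: str) -> str:
    # Hypothetical replacement for illustration; not the gateway's code.
    # Split on the MQTT wildcards while keeping them in the result list.
    parts = re.split(r"([+#])", topic_filter)
    pieces = []
    for part in parts:
        if part == "+":
            pieces.append("[^/]+")  # single-level wildcard, as in the current code
        elif part == "#":
            pieces.append(".+")     # multi-level wildcard, as in the current code
        else:
            # Literal text: escape every regex metacharacter, including "$".
            pieces.append(re.escape(part))
    return "".join(pieces)


pattern = topic_to_regex_escaped("my/topic/device name (model)/+")
matched = re.fullmatch(pattern, "my/topic/device name (model)/data")
print(matched is not None)
```

Because only the regex used for internal routing changes, the raw topicFilter string passed to the subscription stays untouched, so the broker still receives the filter without backslashes.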
