Border router cannot resolve DNS-SD queries sent to itself (TZ-1681) #137

@projectgus

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate
  • Read the documentation to confirm the issue is not addressed there and your configuration is set correctly
  • Tested with the latest version to ensure the issue hasn't been fixed

How often does this bug occur?

always

Expected behavior

I expect the border router device to be able to resolve DNS-SD requests when the DNS server is configured as the border router's mesh address (default behaviour).

Actual behavior (suspected bug)

DNS requests made on the border router itself time out.

When Wi-Fi is not associated, this always happens; when the Wi-Fi STA is associated, it does not always happen. For my application, we can't guarantee that the Border Router will be permanently connected to Wi-Fi, so it needs to be able to resolve services inside the Thread network even when Wi-Fi is unavailable.

Error logs or terminal output

Running the basic_thread_border_router example:

> dataset set active 0e080000000000000000000300000b4a0300000b35060004001fffe00208dead00beef00cafe0708fddead00beef0000051022e41ddb7963a912d2bb8c9ec5e317b6030a4f70656e546872656164010221aa0410410e63e1177e27c8fa2380274673b4790c0402a0f7f8

Done
> dataset commit active

Done
> srp client host name example

Done
> srp server enable

Done
> srp client autostart enable

Done
> srp client service add inst1 test._tcp 777

Done
> srp client host address auto

Done
> ifconfig up

I (56156) OPENTHREAD: Platform UDP bound to port 49153
Done
I (56156) OT_STATE: netif up
> thread start

I(58126) OPENTHREAD:[N] Mle-----------: Role disabled -> detached
Done
> I(58526) OPENTHREAD:[N] Mle-----------: Role detached -> leader
I(58576) OPENTHREAD:[N] Mle-----------: Partition ID 0x62d54d1e
I (58596) OPENTHREAD: Platform UDP bound to port 49154
W(59416) OPENTHREAD:[W] DuaManager----: Failed to perform next registration: InvalidState
W(60416) OPENTHREAD:[W] DuaManager----: Failed to perform next registration: InvalidState

> srp server service

inst1.test._tcp.default.service.arpa.
    deleted: false
    subtypes: (null)
    port: 777
    priority: 0
    weight: 0
    ttl: 7200
    lease: 7200
    key-lease: 680400
    TXT: []
    host: example.default.service.arpa.
    addresses: [fdde:ad00:beef:0:e8d5:7429:8dfd:4779]
Done
> ipaddr

fdde:ad00:beef:0:0:ff:fe00:fc10
fd3d:e7ee:7ffe:1:71f8:f14e:9964:c8ec
fdde:ad00:beef:0:0:ff:fe00:fc00
fdde:ad00:beef:0:0:ff:fe00:3400
fdde:ad00:beef:0:e8d5:7429:8dfd:4779
fe80:0:0:0:b46a:6337:7375:b4bf
Done
> dns config

Server: [fdde:ad00:beef:0:e8d5:7429:8dfd:4779]:53
ResponseTimeout: 6000 ms
MaxTxAttempts: 3
RecursionDesired: yes
ServiceMode: srv_txt_opt
Nat64Mode: allow
Done
> dns resolve example.default.service.arpa.

DNS response for example.default.service.arpa. - 
Error 28: ResponseTimeout
> 

Note that the Error 28: ResponseTimeout only appears after a long delay (i.e. DNS timeout).
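
The length of that delay can be estimated from the dns config output above. This is a minimal Python sketch, assuming the DNS client waits the full ResponseTimeout for each of its MaxTxAttempts before giving up (an assumption about the retry behaviour, not something the logs confirm). The function name is illustrative.

```python
import re

# `dns config` output copied from the border router log above.
DNS_CONFIG = """\
Server: [fdde:ad00:beef:0:e8d5:7429:8dfd:4779]:53
ResponseTimeout: 6000 ms
MaxTxAttempts: 3
RecursionDesired: yes
"""


def worst_case_timeout_ms(config_text: str) -> int:
    """Upper bound on the wait, ASSUMING one full timeout per attempt."""
    timeout = int(re.search(r"ResponseTimeout:\s*(\d+)\s*ms", config_text).group(1))
    attempts = int(re.search(r"MaxTxAttempts:\s*(\d+)", config_text).group(1))
    return timeout * attempts


print(worst_case_timeout_ms(DNS_CONFIG))  # 18000, i.e. roughly 18 seconds
```

Under that assumption, the error would be expected roughly 18 seconds after the dns resolve command is issued.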

Other nodes in the network can resolve DNS-SD using the border router's address. For example, running the ot_cli example on an ESP32-H2:

> dataset set active 0e080000000000000000000300000b4a0300000b35060004001fffe00208dead00beef00cafe0708fddead00beef0000051022e41ddb7963a912d2bb8c9ec5e317b6030a4f70656e546872656164010221aa0410410e63e1177e27c8fa2380274673b4790c0402a0f7f8

Done
> dataset commit active

Done
> I(312384) OPENTHREAD:[N] Mle-----------: Different partition (peer:1658146078, local:196119147)
I(312404) OPENTHREAD:[N] Mle-----------: Attach attempt 0, BetterPartition 
I(313324) OPENTHREAD:[N] Mle-----------: Role leader -> detached
I(313324) OPENTHREAD:[N] Mle-----------: RLOC16 3800 -> 3401
I(313324) OPENTHREAD:[N] Mle-----------: Role detached -> child
I (313354) OT_STATE: Set dns server address: FD3D:E7EE:7FFE:2::808:808
W(316924) OPENTHREAD:[W] DuaManager----: Failed to perform next registration: NotFound
> dns config

Server: [fdde:ad00:beef:0:e8d5:7429:8dfd:4779]:53
ResponseTimeout: 6000 ms
MaxTxAttempts: 3
RecursionDesired: yes
ServiceMode: srv_txt_opt
Nat64Mode: allow
Done
> dns resolve example.default.service.arpa.

DNS response for example.default.service.arpa. - fd3d:e7ee:7ffe:1:71f8:f14e:9964:c8ec TTL:6589 
Done
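
Both logs show the same DNS server address, and on the border router that address is one of its own. The following Python sketch (standard library only, helper name is illustrative) checks this from the ipaddr and dns config output copied from the border router log:

```python
import ipaddress
import re

# `ipaddr` output copied from the border router log above.
BR_IPADDR = """\
fdde:ad00:beef:0:0:ff:fe00:fc10
fd3d:e7ee:7ffe:1:71f8:f14e:9964:c8ec
fdde:ad00:beef:0:0:ff:fe00:fc00
fdde:ad00:beef:0:0:ff:fe00:3400
fdde:ad00:beef:0:e8d5:7429:8dfd:4779
fe80:0:0:0:b46a:6337:7375:b4bf
"""

# First line of the `dns config` output (identical on both nodes).
DNS_SERVER_LINE = "Server: [fdde:ad00:beef:0:e8d5:7429:8dfd:4779]:53"


def dns_server_is_local(ipaddr_output: str, server_line: str) -> bool:
    """Return True if the configured DNS server is one of the node's own addresses."""
    own = {
        ipaddress.IPv6Address(line.strip())
        for line in ipaddr_output.splitlines()
        if line.strip()
    }
    match = re.search(r"\[([0-9a-fA-F:]+)\]:\d+", server_line)
    if match is None:
        raise ValueError("no [address]:port found in server line")
    # IPv6Address normalises the text form, so compressed and
    # uncompressed spellings of the same address compare equal.
    return ipaddress.IPv6Address(match.group(1)) in own


print(dns_server_is_local(BR_IPADDR, DNS_SERVER_LINE))
```

This only confirms that the failing query on the border router is addressed to the device itself. It does not show where the query is lost.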

Steps to reproduce the behavior

See the terminal output above. The CLI commands to run are exactly those shown in the logs.
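
For anyone reproducing this, the active dataset used in the logs can be decoded to confirm the network parameters (for example, that the mesh-local prefix matches the fdde:ad00:beef:0 addresses shown by ipaddr). This is a minimal Python sketch that assumes the dataset is a flat sequence of type/length/value entries with one-byte type and length fields. The name table covers only a subset of TLV types and is illustrative.

```python
# Dataset hex copied from the `dataset set active` command in the logs.
DATASET_HEX = (
    "0e080000000000000000000300000b4a0300000b35060004001fffe0"
    "0208dead00beef00cafe0708fddead00beef0000051022e41ddb7963a912d2bb8c9ec5e317b6"
    "030a4f70656e546872656164010221aa0410410e63e1177e27c8fa2380274673b4790c0402a0f7f8"
)

# Subset of MeshCoP TLV type names; anything else is printed by number.
TLV_NAMES = {
    0: "Channel",
    1: "PAN ID",
    2: "Extended PAN ID",
    3: "Network Name",
    4: "PSKc",
    5: "Network Key",
    7: "Mesh-Local Prefix",
    12: "Security Policy",
    14: "Active Timestamp",
    53: "Channel Mask",
}


def parse_dataset(hex_str: str) -> list[tuple[int, bytes]]:
    """Split a dataset into (type, value) pairs, assuming 1-byte type and length."""
    data = bytes.fromhex(hex_str)
    tlvs = []
    i = 0
    while i < len(data):
        if i + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = data[i], data[i + 1]
        value = data[i + 2 : i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, value))
        i += 2 + length
    return tlvs


for tlv_type, value in parse_dataset(DATASET_HEX):
    name = TLV_NAMES.get(tlv_type, f"type {tlv_type}")
    print(f"{name}: {value.hex()}")
```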

I have reproduced the same behaviour with the current esp-thread-br HEAD (bdf1a1c) on both ESP-IDF v5.3.2 and the current ESP-IDF master HEAD (v5.5-dev-2916-g1c468f6825).

Project release version

unreleased master

System architecture

Intel/AMD 64-bit (modern PC, older Mac)

Operating system

Linux

Operating system version

Arch Linux

Shell

Fish

Additional context

No response
