Failed to run NFF-go in AWS EC2 with ena driver #718

Open
@guesslin

Hi, I can't run nff-go on an AWS EC2 instance. DPDK fails to initialize the port with the ENA driver and prints the error messages below.

  • error message about DPDK port init failure
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: Invalid value for nb_tx_desc(=2048), should be: <= 1024, >= 128, and a product of 1
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: ERROR: Cannot init port  0 !
Oct 08 02:56:02 ip-172-31-41-87 router[18200]: Invalid value for nb_tx_desc(=2048), should be: <= 1024, >= 128, and a 
Full message
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: ------------***-------- Initializing DPDK --------***------------
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: Detected 2 lcore(s)
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: Detected 1 NUMA nodes
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: Selected IOVA mode 'PA'
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: No available hugepages reported in hugepages-1048576kB
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: Probing VFIO support...
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL: Probing VFIO support...
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: PCI device 0000:00:05.0 on NUMA socket -1
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL:   Invalid NUMA socket, default to 0
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL: PCI device 0000:00:06.0 on NUMA socket -1
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL:   Invalid NUMA socket, default to 0
Oct 08 02:56:01 ip-172-31-41-87 router[18195]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL: PCI device 0000:00:05.0 on NUMA socket -1
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL:   Invalid NUMA socket, default to 0
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL: PCI device 0000:00:06.0 on NUMA socket -1
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL:   Invalid NUMA socket, default to 0
Oct 08 02:56:01 ip-172-31-41-87 router[18200]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: PMD: LLQ is not supported. Fallback to host mode policy.
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: PMD: Placement policy: Regular
Oct 08 02:56:02 ip-172-31-41-87 router[18200]: PMD: LLQ is not supported. Fallback to host mode policy.
Oct 08 02:56:02 ip-172-31-41-87 router[18200]: PMD: Placement policy: Regular
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: ------------***------ Initializing scheduler -----***------------
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: DEBUG: Scheduler can use cores: [0 1]
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: ------------***---------- Creating ports ---------***------------
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: Invalid value for nb_tx_desc(=2048), should be: <= 1024, >= 128, and a product of 1
Oct 08 02:56:02 ip-172-31-41-87 router[18195]: ERROR: Cannot init port  0 !
Oct 08 02:56:02 ip-172-31-41-87 router[18200]: Invalid value for nb_tx_desc(=2048), should be: <= 1024, >= 128, and a product of 1

I checked the DPDK source: the code at https://github.com/DPDK/dpdk/blob/main/lib/librte_ethdev/rte_ethdev.c#L2019-L2034 generates this error message. In nff-go/internal/low/low.h, nb_tx_desc is set to 2048.
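To make the failing condition concrete, here is a minimal sketch of the kind of range check that the message describes ("<= 1024, >= 128, and a product of 1"). It is not the DPDK code itself: `struct desc_lim`, `desc_count_valid`, and `desc_count_clamp` are hypothetical names used only for illustration, loosely modeled on the `nb_max`, `nb_min`, and `nb_align` limits a driver reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the descriptor limits a driver reports. */
struct desc_lim {
    uint16_t nb_max;   /* largest allowed ring size */
    uint16_t nb_min;   /* smallest allowed ring size */
    uint16_t nb_align; /* ring size must be a multiple of this */
};

/* Returns true when nb_desc satisfies all three constraints
 * named in the error message. */
static inline bool desc_count_valid(uint16_t nb_desc,
                                    const struct desc_lim *lim)
{
    assert(lim->nb_align > 0);
    return nb_desc <= lim->nb_max &&
           nb_desc >= lim->nb_min &&
           nb_desc % lim->nb_align == 0;
}

/* Clamps a requested ring size into [nb_min, nb_max], then rounds it
 * down to a multiple of nb_align. Assumes nb_min is itself a multiple
 * of nb_align, so rounding down never drops below nb_min. */
static inline uint16_t desc_count_clamp(uint16_t nb_desc,
                                        const struct desc_lim *lim)
{
    assert(lim->nb_align > 0);
    if (nb_desc > lim->nb_max)
        nb_desc = lim->nb_max;
    if (nb_desc < lim->nb_min)
        nb_desc = lim->nb_min;
    return (uint16_t)(nb_desc - nb_desc % lim->nb_align);
}
```

With the limits printed in the log (max 1024, min 128, alignment 1), `desc_count_valid(2048, &lim)` is false, which matches the rejected value, and `desc_count_clamp(2048, &lim)` yields 1024.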

I tried reducing TX_RING_SIZE to 1024. The port init error went away, but I got a different warning and still can't process packets from the DPDK flow.

  • error message
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: WARNING: Can't start new clone for segment1 instance 0
Full message
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: ------------***-------- Initializing DPDK --------***------------
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: Detected 2 lcore(s)
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: Detected 1 NUMA nodes
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: Selected IOVA mode 'PA'
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: No available hugepages reported in hugepages-1048576kB
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: Probing VFIO support...
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL: Probing VFIO support...
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: PCI device 0000:00:05.0 on NUMA socket -1
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL:   Invalid NUMA socket, default to 0
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL: PCI device 0000:00:06.0 on NUMA socket -1
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL:   Invalid NUMA socket, default to 0
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL: PCI device 0000:00:05.0 on NUMA socket -1
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL:   Invalid NUMA socket, default to 0
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL: PCI device 0000:00:06.0 on NUMA socket -1
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL:   Invalid NUMA socket, default to 0
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: EAL:   probe driver: 1d0f:ec20 net_ena
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: PMD: LLQ is not supported. Fallback to host mode policy.
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: PMD: Placement policy: Regular
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: PMD: LLQ is not supported. Fallback to host mode policy.
Oct 08 04:55:31 ip-172-31-41-87 router[20235]: PMD: Placement policy: Regular
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: ------------***------ Initializing scheduler -----***------------
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Scheduler can use cores: [0 1]
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: ------------***---------- Creating ports ---------***------------
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Port 0 MAC address: 06:10:b8:ab:99:db
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: ------------***------ Starting FlowFunctions -----***------------
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Start SCHEDULER at 0 core
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Start STOP at scheduler 0 core
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Start new instance for receiverPort
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: 1
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Start new clone for receiverPort
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: 1 instance 0 at 1 core
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: DEBUG: Start new instance for segment1
Oct 08 04:55:31 ip-172-31-41-87 router[20230]: WARNING: Can't start new clone for segment1 instance 0

Here's the information about the environment.

  • Linux distribution: Ubuntu 18.04 LTS
  • AWS instance type: t3.large
  • Linux kernel version: 4.15.0-1065-aws
  • nff-go version: v0.9.2
  • ethtool -i ens6
driver: ena
version: 2.2.10g
firmware-version:
expansion-rom-version:
bus-info: 0000:00:06.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no
  • ifconfig
ens5: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 172.31.41.87  netmask 255.255.240.0  broadcast 172.31.47.255
        inet6 fe80::4f9:c1ff:fee6:a36f  prefixlen 64  scopeid 0x20<link>
        ether 06:f9:c1:e6:a3:6f  txqueuelen 1000  (Ethernet)
        RX packets 146118  bytes 189039782 (189.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 49107  bytes 4985206 (4.9 MB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9001
        inet 172.31.47.232  netmask 255.255.240.0  broadcast 172.31.47.255
        inet6 fe80::410:b8ff:feab:99db  prefixlen 64  scopeid 0x20<link>
        ether 06:10:b8:ab:99:db  txqueuelen 1000  (Ethernet)
        RX packets 192  bytes 15232 (15.2 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 82  bytes 4244 (4.2 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 3404  bytes 254476 (254.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3404  bytes 254476 (254.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
  • lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma]
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.3 Non-VGA unclassified device: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:03.0 VGA compatible controller: Amazon.com, Inc. Device 1111
00:04.0 Non-Volatile memory controller: Amazon.com, Inc. Device 8061
00:05.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA)
00:06.0 Ethernet controller: Amazon.com, Inc. Elastic Network Adapter (ENA)
