Commit 00a5059

Hyong Youb Kim authored and ferruhy committed
net/enic: support SR-IOV VF using admin channel
The Linux VIC PF driver now requires the use of the admin channel
between PF and VF drivers. Certain devcmds are disabled for VF. The VF
driver is supposed to send control messages through the admin channel
to the PF driver to perform those devcmds.

This commit adds the admin channel to the VF driver (net/enic). The
VF's admin channel consists of normal Tx/Rx queues. VIC firmware
hardwires those queues to PF. Control messages are specially crafted
but otherwise normal packets. The Rx queue uses LSC interrupt
(interrupt vector 0) to notify the driver of new Rx control
messages. The PF driver may send unsolicited request messages
(e.g. asking for VF stats) to VF. Such messages cause LSC interrupts
and are processed on the global interrupt thread.

For devcmds that must be sent through the admin channel, use wrapper
functions. They check if the device is a VF. If VF, use the admin
channel. Otherwise, perform devcmd directly.

Two complications:

- Soft Rx stats
  VF on old VIC models does not have HW Rx counters. In this case, the
  VF driver counts packets/bytes and reports them as device stats.

- Backward compatibility mode
  Old VIC PF drivers on some operating systems may support only
  VF_CAPABILITY_REQUEST message or not support the admin channel at
  all. When the VF driver detects such PF driver, it reverts to the
  compatibility mode and does not use the admin channel. In this mode,
  trust mode (e.g. enabling promiscuous mode) does not work.

Signed-off-by: Hyong Youb Kim <[email protected]>
Reviewed-by: John Daley <[email protected]>
1 parent 177572d commit 00a5059
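
The wrapper pattern described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `struct demo_dev` and every `demo_*` function are hypothetical names, and the two back ends only count calls instead of talking to firmware or to the PF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device state; not the driver's real structure. */
struct demo_dev {
	bool is_vf;        /* true when the device is an SR-IOV VF */
	int devcmd_calls;  /* devcmds issued directly to firmware */
	int admin_msgs;    /* requests sent to the PF over the admin channel */
};

/* Stand-in for issuing the devcmd directly. */
static int demo_devcmd_packet_filter(struct demo_dev *dev, bool promisc)
{
	(void)promisc;
	dev->devcmd_calls++;
	return 0;
}

/* Stand-in for sending a control message to the PF driver. */
static int demo_admin_packet_filter(struct demo_dev *dev, bool promisc)
{
	(void)promisc;
	dev->admin_msgs++;
	return 0;
}

/* Wrapper: a VF goes through the admin channel, anything else uses devcmd. */
static int demo_set_packet_filter(struct demo_dev *dev, bool promisc)
{
	assert(dev != NULL);
	if (dev->is_vf)
		return demo_admin_packet_filter(dev, promisc);
	return demo_devcmd_packet_filter(dev, promisc);
}
```

Callers only ever use the wrapper, so the PF/VF distinction stays in one place. How the real driver handles the backward compatibility mode inside such wrappers is not shown here.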

20 files changed: +1428 -92 lines changed

doc/guides/nics/enic.rst

Lines changed: 52 additions & 68 deletions
@@ -37,7 +37,7 @@ Supported features
 - UDP RSS hashing (1400 series and later adapters)
 - Scattered Rx
 - MTU update
-- SR-IOV on UCS managed servers connected to Fabric Interconnects
+- SR-IOV virtual function
 - Flow API
 - Overlay offload
 
@@ -135,103 +135,87 @@ Configuration information
 TCP, IPv4, TCP-IPv4, IPv6, TCP-IPv6, IPv6 Extension, TCP-IPv6 Extension.
 
 
-SR-IOV mode utilization
+SR-IOV Virtual Function
 -----------------------
 
-UCS blade servers configured with dynamic vNIC connection policies in UCSM
-are capable of supporting SR-IOV. SR-IOV virtual functions (VFs) are
-specialized vNICs, distinct from regular Ethernet vNICs. These VFs can be
-directly assigned to virtual machines (VMs) as 'passthrough' devices.
+VIC 1400 and later series supports SR-IOV. It can be enabled via both
+UCSM and CIMC. Please refer to the following guides to enable SR-IOV
+virtual functions (VFs).
 
-In UCS, SR-IOV VFs require the use of the Cisco Virtual Machine Fabric Extender
-(VM-FEX), which gives the VM a dedicated
-interface on the Fabric Interconnect (FI). Layer 2 switching is done at
-the FI. This may eliminate the requirement for software switching on the
-host to route intra-host VM traffic.
+- CIMC: `Managing vNICs <https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/c/sw/gui/config/guide/4_3/b_cisco_ucs_c-series_gui_configuration_guide_43/b_Cisco_UCS_C-series_GUI_Configuration_Guide_41_chapter_01011.html#d77871e5874a1635>`_
 
-Please refer to `Creating a Dynamic vNIC Connection Policy
-<http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/vm_fex/vmware/gui/config_guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide_chapter_010.html#task_433E01651F69464783A68E66DA8A47A5>`_
-for information on configuring SR-IOV adapter policies and port profiles
-using UCSM.
+- UCSM: `Configuring SRIOV HPN Connection Policies <https://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/ucs-manager/GUI-User-Guides/Network-Mgmt/4-3/b_UCSM_Network_Mgmt_Guide_4_3/b_UCSM_Network_Mgmt_Guide_chapter_01010.html#d21438e9555a1635>`_
 
-Once the policies are in place and the host OS is rebooted, VFs should be
-visible on the host, E.g.:
+Note that the previous SR-IOV implementation that is tied to VM-FEX
+(Cisco Virtual Machine Fabric Extender) has been discontinued, and
+ENIC PMD no longer supports it. The current SR-IOV implementation does
+not require the Fabric Interconnect (FI), as layer 2 switching is done
+within the VIC adapter.
+
+Once SR-IOV is enabled, reboot the host OS and follow OS specific
+steps to create VFs and assign them to virtual machines (VMs) or
+containers as necessary. The VIC physical function (PF) drivers for ESXi
+and Linux support SR-IOV. The following shows simplified steps for
+Linux.
 
 .. code-block:: console
 
+    # echo 4 > /sys/class/net/<pf-interface>/device/sriov_numvfs
+
     # lspci | grep Cisco | grep Ethernet
-    0d:00.0 Ethernet controller: Cisco Systems Inc VIC Ethernet NIC (rev a2)
-    0d:00.1 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-    0d:00.2 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-    0d:00.3 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-    0d:00.4 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-    0d:00.5 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-    0d:00.6 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-    0d:00.7 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
-
-Enable Intel IOMMU on the host and install KVM and libvirt, and reboot again as
-required. Then, using libvirt, create a VM instance with an assigned device.
-Below is an example ``interface`` block (part of the domain configuration XML)
-that adds the host VF 0d:00:01 to the VM. ``profileid='pp-vlan-25'`` indicates
-the port profile that has been configured in UCSM.
+    12:00.0 Ethernet controller: Cisco Systems Inc VIC Ethernet NIC (rev a2)
+    12:00.1 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+    12:00.2 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+    12:00.3 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+    12:00.4 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
+
+Writing 4 to ``sriov_numvfs`` creates 4 VFs. ``lspci`` shows VFs and
+their PCI locations. Interfaces with device ID ``02b7`` are the
+VFs. The following snippet for libvirt XML assigns VF at ``12:00.1``
+to VM.
 
 .. code-block:: console
 
-    <interface type='hostdev' managed='yes'>
-      <mac address='52:54:00:ac:ff:b6'/>
+    <interface type="hostdev" managed="yes">
+      <mac address="fa:16:3e:46:39:c5"/>
       <driver name='vfio'/>
       <source>
-        <address type='pci' domain='0x0000' bus='0x0d' slot='0x00' function='0x1'/>
+        <address type="pci" domain="0x0000" bus="0x12" slot="0x00" function="0x1"/>
       </source>
-      <virtualport type='802.1Qbh'>
-        <parameters profileid='pp-vlan-25'/>
-      </virtualport>
+      <vlan>
+        <tag id="1000"/>
+      </vlan>
     </interface>
 
-
-Alternatively, the configuration can be done in a separate file using the
-``network`` keyword. These methods are described in the libvirt documentation for
-`Network XML format <https://libvirt.org/formatnetwork.html>`_.
-
 When the VM instance is started, libvirt will bind the host VF to
-vfio, complete provisioning on the FI and bring up the link.
-
-.. note::
-
-  It is not possible to use a VF directly from the host because it is not
-  fully provisioned until libvirt brings up the VM that it is assigned
-  to.
-
-In the VM instance, the VF will now be visible. E.g., here the VF 00:04.0 is
-seen on the VM instance and should be available for binding to a DPDK.
+vfio-pci. In the VM instance, the VF will now be visible. In this
+example, VF at ``07:00.0`` is seen on the VM instance and is available
+for binding to DPDK.
 
 .. code-block:: console
 
-    # lspci | grep Ether
-    00:04.0 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
+    # lspci | grep Cisco
+    07:00.0 Ethernet controller: Cisco Systems Inc Device 02b7 (rev a2)
 
-Follow the normal DPDK install procedure, binding the VF to either ``igb_uio``
-or ``vfio`` in non-IOMMU mode.
+There are two known limitations of the current SR-IOV implementation.
 
-In the VM, the kernel enic driver may be automatically bound to the VF during
-boot. Unbinding it currently hangs due to a known issue with the driver. To
-work around the issue, block the enic module as follows.
-Please see :ref:`Limitations <enic_limitations>` for limitations in
-the use of SR-IOV.
+- Software Rx statistics
 
-.. code-block:: console
+  VF on old VIC models does not have hardware Rx counters. In this case,
+  ENIC PMD counts packets/bytes and reports them as device statistics.
 
-    # cat /etc/modprobe.d/enic.conf
-    blacklist enic
+- Backward compatibility mode
 
-    # dracut --force
+  Old PF drivers on ESXi may lack full admin channel support. ENIC PMD
+  detects such PF driver during initialization and reverts to the
+  compatibility mode. In this mode, ENIC PMD does not use the admin channel,
+  and trust mode (e.g. enabling promiscuous mode on VF) is not supported.
 
 .. note::
 
-  Passthrough does not require SR-IOV. If VM-FEX is not desired, the user
+  Passthrough does not require SR-IOV. If SR-IOV is not desired, the user
   may create as many regular vNICs as necessary and assign them to VMs as
-  passthrough devices. Since these vNICs are not SR-IOV VFs, using them as
-  passthrough devices do not require libvirt, port profiles, and VM-FEX.
+  passthrough devices.
 
 
 .. _enic-generic-flow-api:
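
The software Rx statistics limitation documented above amounts to the receive path keeping its own counters when the hardware has none. A minimal sketch of that idea follows. The structure and function names are hypothetical, and packet lengths are passed as a plain array rather than as DPDK mbufs to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue software counters. */
struct demo_soft_rx_stats {
	uint64_t packets;
	uint64_t bytes;
};

/*
 * Account for one receive burst.
 * pkt_lens holds the length in bytes of each of the nb_rx received packets.
 */
static void demo_count_rx_burst(struct demo_soft_rx_stats *st,
				const uint32_t *pkt_lens, size_t nb_rx)
{
	size_t i;

	assert(st != NULL);
	for (i = 0; i < nb_rx; i++)
		st->bytes += pkt_lens[i];
	st->packets += nb_rx;
}
```

Counting once per burst keeps the per-packet cost to a single addition. A stats query would then sum such counters across Rx queues.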

doc/guides/rel_notes/release_24_11.rst

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Updated Cisco enic driver.**
+
+  * Added SR-IOV VF support.
+
 
 Removed Items
 -------------

drivers/net/enic/base/vnic_cq.c

Lines changed: 27 additions & 0 deletions
@@ -24,6 +24,7 @@ int vnic_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
 
 	cq->index = index;
 	cq->vdev = vdev;
+	cq->admin_chan = false;
 
 	cq->ctrl = vnic_dev_get_res(vdev, RES_TYPE_CQ, index);
 	if (!cq->ctrl) {
@@ -40,6 +41,32 @@ int vnic_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
 	return 0;
 }
 
+int vnic_admin_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
+	unsigned int socket_id, unsigned int desc_count, unsigned int desc_size)
+{
+	int err;
+	char res_name[RTE_MEMZONE_NAMESIZE];
+	static int instance;
+
+	cq->index = index;
+	cq->vdev = vdev;
+	cq->admin_chan = true;
+
+	cq->ctrl = vnic_dev_get_res(vdev, RES_TYPE_ADMIN_CQ, index);
+	if (!cq->ctrl) {
+		pr_err("Failed to get admin CQ[%u] resource\n", index);
+		return -EINVAL;
+	}
+
+	snprintf(res_name, sizeof(res_name), "%d-admin-cq-%u", instance++, index);
+	err = vnic_dev_alloc_desc_ring(vdev, &cq->ring, desc_count, desc_size,
+		socket_id, res_name);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 void vnic_cq_init(struct vnic_cq *cq, unsigned int flow_control_enable,
 	unsigned int color_enable, unsigned int cq_head, unsigned int cq_tail,
 	unsigned int cq_tail_color, unsigned int interrupt_enable,
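
`vnic_admin_cq_alloc()` builds a unique ring name from a function-local static counter. The naming step can be sketched in isolation as below. `DEMO_NAMESIZE` and `demo_admin_cq_name()` are hypothetical, and the buffer size of 32 is an assumption standing in for `RTE_MEMZONE_NAMESIZE`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAMESIZE 32 /* assumed stand-in for RTE_MEMZONE_NAMESIZE */

/*
 * Format a unique name for an admin CQ ring.
 * Returns what snprintf() returns: the length the full name needs,
 * which is >= len when the name was truncated.
 */
static int demo_admin_cq_name(char *buf, size_t len, unsigned int index)
{
	static int instance; /* zero-initialized, shared by every caller */

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%d-admin-cq-%u", instance++, index);
}
```

Two calls with the same queue index still produce distinct names because the counter advances on every call. The counter is not synchronized, so this sketch assumes allocation happens on a single thread.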

drivers/net/enic/base/vnic_cq.h

Lines changed: 3 additions & 0 deletions
@@ -59,12 +59,15 @@ struct vnic_cq {
 	unsigned int tobe_rx_coal_timeval;
 	ktime_t prev_ts;
 #endif
+	bool admin_chan;
 };
 
 void vnic_cq_free(struct vnic_cq *cq);
 int vnic_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
 	unsigned int socket_id,
 	unsigned int desc_count, unsigned int desc_size);
+int vnic_admin_cq_alloc(struct vnic_dev *vdev, struct vnic_cq *cq, unsigned int index,
+	unsigned int socket_id, unsigned int desc_count, unsigned int desc_size);
 void vnic_cq_init(struct vnic_cq *cq, unsigned int flow_control_enable,
 	unsigned int color_enable, unsigned int cq_head, unsigned int cq_tail,
 	unsigned int cq_tail_color, unsigned int interrupt_enable,

drivers/net/enic/base/vnic_dev.c

Lines changed: 48 additions & 0 deletions
@@ -47,6 +47,8 @@ struct vnic_dev {
 	dma_addr_t linkstatus_pa;
 	struct vnic_stats *stats;
 	dma_addr_t stats_pa;
+	struct vnic_sriov_stats *sriov_stats;
+	dma_addr_t sriov_stats_pa;
 	struct vnic_devcmd_fw_info *fw_info;
 	dma_addr_t fw_info_pa;
 	struct fm_info *flowman_info;
@@ -164,6 +166,9 @@ static int vnic_dev_discover_res(struct vnic_dev *vdev,
 		case RES_TYPE_RQ:
 		case RES_TYPE_CQ:
 		case RES_TYPE_INTR_CTRL:
+		case RES_TYPE_ADMIN_WQ:
+		case RES_TYPE_ADMIN_RQ:
+		case RES_TYPE_ADMIN_CQ:
 			/* each count is stride bytes long */
 			len = count * VNIC_RES_STRIDE;
 			if (len + bar_offset > bar[bar_num].len) {
@@ -210,6 +215,9 @@ void __iomem *vnic_dev_get_res(struct vnic_dev *vdev, enum vnic_res_type type,
 	case RES_TYPE_RQ:
 	case RES_TYPE_CQ:
 	case RES_TYPE_INTR_CTRL:
+	case RES_TYPE_ADMIN_WQ:
+	case RES_TYPE_ADMIN_RQ:
+	case RES_TYPE_ADMIN_CQ:
 		return (char __iomem *)vdev->res[type].vaddr +
 			index * VNIC_RES_STRIDE;
 	default:
@@ -1143,6 +1151,18 @@ int vnic_dev_alloc_stats_mem(struct vnic_dev *vdev)
 	return vdev->stats == NULL ? -ENOMEM : 0;
 }
 
+int vnic_dev_alloc_sriov_stats_mem(struct vnic_dev *vdev)
+{
+	char name[RTE_MEMZONE_NAMESIZE];
+	static uint32_t instance;
+
+	snprintf((char *)name, sizeof(name), "vnic_sriov_stats-%u", instance++);
+	vdev->sriov_stats = vdev->alloc_consistent(vdev->priv,
+					sizeof(struct vnic_sriov_stats),
+					&vdev->sriov_stats_pa, (uint8_t *)name);
+	return vdev->sriov_stats == NULL ? -ENOMEM : 0;
+}
+
 void vnic_dev_unregister(struct vnic_dev *vdev)
 {
 	if (vdev) {
@@ -1155,6 +1175,10 @@ void vnic_dev_unregister(struct vnic_dev *vdev)
 			vdev->free_consistent(vdev->priv,
 				sizeof(struct vnic_stats),
 				vdev->stats, vdev->stats_pa);
+		if (vdev->sriov_stats)
+			vdev->free_consistent(vdev->priv,
+				sizeof(struct vnic_sriov_stats),
+				vdev->sriov_stats, vdev->sriov_stats_pa);
 		if (vdev->flowman_info)
 			vdev->free_consistent(vdev->priv,
 				sizeof(struct fm_info),
@@ -1355,3 +1379,27 @@ int vnic_dev_set_cq_entry_size(struct vnic_dev *vdev, uint32_t rq_idx,
 
 	return vnic_dev_cmd(vdev, CMD_CQ_ENTRY_SIZE_SET, &a0, &a1, wait);
 }
+
+int vnic_dev_enable_admin_qp(struct vnic_dev *vdev, uint32_t enable)
+{
+	uint64_t a0, a1;
+	int wait = 1000;
+
+	a0 = QP_TYPE_ADMIN;
+	a1 = enable;
+	return vnic_dev_cmd(vdev, CMD_QP_TYPE_SET, &a0, &a1, wait);
+}
+
+int vnic_dev_sriov_stats(struct vnic_dev *vdev, struct vnic_sriov_stats **stats)
+{
+	uint64_t a0, a1;
+	int wait = 1000;
+	int err;
+
+	a0 = vdev->sriov_stats_pa;
+	a1 = sizeof(struct vnic_sriov_stats);
+	err = vnic_dev_cmd(vdev, CMD_SRIOV_STATS_GET, &a0, &a1, wait);
+	if (!err)
+		*stats = vdev->sriov_stats;
+	return err;
+}
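
The admin WQ/RQ/CQ resource types added to `vnic_dev_get_res()` in this file reuse the existing lookup: the address of entry `index` is the mapped base plus `index` strides. A standalone sketch of that arithmetic follows. `struct demo_res` and `demo_get_res()` are hypothetical, the stride of 128 bytes is an assumed stand-in for `VNIC_RES_STRIDE`, and the index bounds check is an addition for this example that the diff above does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RES_STRIDE 128u /* assumed stand-in for VNIC_RES_STRIDE */

/* Hypothetical description of one discovered resource type. */
struct demo_res {
	uint8_t *vaddr;      /* mapped base address of the first entry */
	unsigned int count;  /* number of entries of this type */
};

/* Return the address of entry `index`, or NULL if it does not exist. */
static void *demo_get_res(const struct demo_res *res, unsigned int index)
{
	assert(res != NULL);
	if (res->vaddr == NULL || index >= res->count)
		return NULL;
	return res->vaddr + (size_t)index * DEMO_RES_STRIDE;
}
```

Because the admin queue types follow the same strided layout, adding them only requires new `case` labels in the two existing `switch` statements.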

drivers/net/enic/base/vnic_dev.h

Lines changed: 3 additions & 0 deletions
@@ -199,5 +199,8 @@ int vnic_dev_capable_geneve(struct vnic_dev *vdev);
 uint64_t vnic_dev_capable_cq_entry_size(struct vnic_dev *vdev);
 int vnic_dev_set_cq_entry_size(struct vnic_dev *vdev, uint32_t rq_idx,
 			       uint32_t size_flag);
+int vnic_dev_alloc_sriov_stats_mem(struct vnic_dev *vdev);
+int vnic_dev_sriov_stats(struct vnic_dev *vdev, struct vnic_sriov_stats **stats);
+int vnic_dev_enable_admin_qp(struct vnic_dev *vdev, uint32_t enable);
 
 #endif /* _VNIC_DEV_H_ */

drivers/net/enic/base/vnic_devcmd.h

Lines changed: 49 additions & 0 deletions
@@ -646,6 +646,20 @@ enum vnic_devcmd_cmd {
 	 * bit 2: 64 bytes
 	 */
 	CMD_CQ_ENTRY_SIZE_SET = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 90),
+
+	/*
+	 * enable/disable wq/rq queue pair of qp_type on a PF/VF.
+	 * in: (u32) a0 = wq/rq qp_type
+	 * in: (u32) a1 = enable(1)/disable(0)
+	 */
+	CMD_QP_TYPE_SET = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 97),
+
+	/*
+	 * SRIOV vic stats get
+	 * in: (u64) a0 = host buffer addr for stats dump
+	 * in: (u32) a1 = length of the buffer
+	 */
+	CMD_SRIOV_STATS_GET = _CMDC(_CMD_DIR_WRITE, _CMD_VTYPE_ENET, 98),
 };
 
 /* Modes for exchanging advanced filter capabilities. The modes supported by
@@ -1194,4 +1208,39 @@ typedef enum {
 #define VNIC_RQ_CQ_ENTRY_SIZE_32_CAPABLE (1 << VNIC_RQ_CQ_ENTRY_SIZE_32)
 #define VNIC_RQ_CQ_ENTRY_SIZE_64_CAPABLE (1 << VNIC_RQ_CQ_ENTRY_SIZE_64)
 
+/* CMD_QP_TYPE_SET */
+#define QP_TYPE_ADMIN 0
+
+struct vnic_sriov_stats {
+	uint32_t ver;
+	uint8_t sriov_vlan_membership_cap; /* sriov support vlan-membership */
+	uint8_t sriov_vlan_membership_enabled; /* Default is disabled (0) */
+	uint8_t sriov_rss_vf_full_cap; /* sriov VFs support full rss */
+	uint8_t sriov_host_rx_stats; /* host does rx stats */
+
+	/* IGx/EGx classifier TCAM
+	 */
+	uint32_t ig_classifier0_tcam_cfg; /* IG0 TCAM config entries */
+	uint32_t ig_classifier0_tcam_free; /* IG0 TCAM free count */
+	uint32_t eg_classifier0_tcam_cfg; /* EG0 TCAM config entries */
+	uint32_t eg_classifier0_tcam_free; /* EG0 TCAM free count */
+
+	uint32_t ig_classifier1_tcam_cfg; /* IG1 TCAM config entries */
+	uint32_t ig_classifier1_tcam_free; /* IG1 TCAM free count */
+	uint32_t eg_classifier1_tcam_cfg; /* EG1 TCAM config entries */
+	uint32_t eg_classifier1_tcam_free; /* EG1 TCAM free count */
+
+	/* IGx/EGx flow table entries
+	 */
+	uint32_t sriov_ig_flow_table_cfg; /* sriov IG FTE config */
+	uint32_t sriov_ig_flow_table_free; /* sriov IG FTE free */
+	uint32_t sriov_eg_flow_table_cfg; /* sriov EG FTE config */
+	uint32_t sriov_eg_flow_table_free; /* sriov EG FTE free */
+
+	uint8_t admin_qp_ready[32]; /* admin_qp ready bits (256) */
+	uint16_t vf_index; /* VF index or SRIOV_PF_IDX */
+	uint16_t reserved1;
+	uint32_t reserved2[256 - 23];
+};
+
 #endif /* _VNIC_DEVCMD_H_ */
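
The `reserved2[256 - 23]` member suggests that `struct vnic_sriov_stats` is meant to occupy exactly 256 32-bit words, with 23 words used by the named fields. The sketch below copies the layout into a hypothetical `struct demo_sriov_stats` and checks that arithmetic at compile time. It also adds a helper that tests one bit of `admin_qp_ready`. The bit ordering used there (bit `vf % 8` of byte `vf / 8`) is an assumption and is not confirmed by the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Copy of the layout from the diff, under a hypothetical name. */
struct demo_sriov_stats {
	uint32_t ver;
	uint8_t sriov_vlan_membership_cap;
	uint8_t sriov_vlan_membership_enabled;
	uint8_t sriov_rss_vf_full_cap;
	uint8_t sriov_host_rx_stats;
	uint32_t ig_classifier0_tcam_cfg;
	uint32_t ig_classifier0_tcam_free;
	uint32_t eg_classifier0_tcam_cfg;
	uint32_t eg_classifier0_tcam_free;
	uint32_t ig_classifier1_tcam_cfg;
	uint32_t ig_classifier1_tcam_free;
	uint32_t eg_classifier1_tcam_cfg;
	uint32_t eg_classifier1_tcam_free;
	uint32_t sriov_ig_flow_table_cfg;
	uint32_t sriov_ig_flow_table_free;
	uint32_t sriov_eg_flow_table_cfg;
	uint32_t sriov_eg_flow_table_free;
	uint8_t admin_qp_ready[32];
	uint16_t vf_index;
	uint16_t reserved1;
	uint32_t reserved2[256 - 23];
};

/* 23 words of named fields plus 233 reserved words = 256 words. */
static_assert(offsetof(struct demo_sriov_stats, reserved2) == 23 * sizeof(uint32_t),
	      "named fields should fill 23 words");
static_assert(sizeof(struct demo_sriov_stats) == 256 * sizeof(uint32_t),
	      "structure should be 256 words");

/* Assumed bit order: VF n maps to bit (n % 8) of byte (n / 8). */
static bool demo_admin_qp_ready(const struct demo_sriov_stats *st, unsigned int vf)
{
	assert(st != NULL);
	if (vf >= 8 * sizeof(st->admin_qp_ready))
		return false;
	return (st->admin_qp_ready[vf / 8] >> (vf % 8)) & 1u;
}
```

Pinning the size with a compile-time check catches accidental padding or field changes in a structure that firmware fills through DMA.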
