Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature
supports Ethernet functionality over Omni-Path fabric by encapsulating
the Ethernet packets between HFI nodes.

Architecture
=============
The pattern of exchange of Omni-Path encapsulated Ethernet packets
involves one or more virtual Ethernet switches overlaid on the Omni-Path
fabric topology. A subset of HFI nodes on the Omni-Path fabric is
permitted to exchange encapsulated Ethernet packets across a particular
virtual Ethernet switch. The virtual Ethernet switches are logical
abstractions achieved by configuring the HFI nodes on the fabric for
header generation and processing. In the simplest configuration, all HFI
nodes across the fabric exchange encapsulated Ethernet packets over a
single virtual Ethernet switch. A virtual Ethernet switch is effectively
an independent Ethernet network. The configuration is performed by an
Ethernet Manager (EM), which is part of the trusted Fabric Manager (FM)
application. HFI nodes can have multiple VNICs, each connected to a
different virtual Ethernet switch. The diagram below presents a case
of two virtual Ethernet switches with two HFI nodes.

                             +-------------------+
                             |      Subnet/      |
                             |     Ethernet      |
                             |      Manager      |
                             +-------------------+
                                /          /
                              /           /
                            /            /
                          /             /
+-----------------------------+  +------------------------------+
|  Virtual Ethernet Switch    |  |  Virtual Ethernet Switch     |
|  +---------+    +---------+ |  | +---------+    +---------+   |
|  |  VPORT  |    |  VPORT  | |  | |  VPORT  |    |  VPORT  |   |
+--+---------+----+---------+-+  +-+---------+----+---------+---+
         |                 \        /                 |
         |                   \    /                   |
         |                     \/                     |
         |                    /  \                    |
         |                  /      \                  |
     +-----------+------------+  +-----------+------------+
     |   VNIC    |    VNIC    |  |    VNIC   |    VNIC    |
     +-----------+------------+  +-----------+------------+
     |          HFI           |  |          HFI           |
     +------------------------+  +------------------------+


The Omni-Path encapsulated Ethernet packet format is as described below.

Bits          Field
------------------------------------
Quad Word 0:
0-19          SLID (lower 20 bits)
20-30         Length (in Quad Words)
31            BECN bit
32-51         DLID (lower 20 bits)
52-56         SC (Service Class)
57-59         RC (Routing Control)
60            FECN bit
61-62         L2 (=10, 16B format)
63            LT (=1, Link Transfer Head Flit)

Quad Word 1:
0-7           L4 type (=0x78 ETHERNET)
8-11          SLID[23:20]
12-15         DLID[23:20]
16-31         PKEY
32-47         Entropy
48-63         Reserved

Quad Word 2:
0-15          Reserved
16-31         L4 header
32-63         Ethernet Packet

Quad Words 3 to N-1:
0-63          Ethernet packet (pad extended)

Quad Word N (last):
0-23          Ethernet packet (pad extended)
24-55         ICRC
56-61         Tail
62-63         LT (=01, Link Transfer Tail Flit)

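The bit layout of Quad Words 0 and 1 can be illustrated with a small
Python sketch. It is not driver code. It assumes that bit 0 is the least
significant bit of each 64-bit quad word, and the helper names
(pack_field, extract_field, build_qw0, build_qw1) are hypothetical names
chosen only for this illustration.

```python
def pack_field(word, value, lo, hi):
    """OR 'value' into bits lo..hi (inclusive) of a 64-bit word."""
    width = hi - lo + 1
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in %d bits" % width)
    return word | (value << lo)


def extract_field(word, lo, hi):
    """Read bits lo..hi (inclusive) of a 64-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def build_qw0(slid, dlid, length_qw, sc, rc=0, becn=0, fecn=0):
    """Assemble Quad Word 0 from the fields listed in the table."""
    qw = 0
    qw = pack_field(qw, slid & 0xFFFFF, 0, 19)    # SLID (lower 20 bits)
    qw = pack_field(qw, length_qw, 20, 30)        # Length in quad words
    qw = pack_field(qw, becn, 31, 31)             # BECN bit
    qw = pack_field(qw, dlid & 0xFFFFF, 32, 51)   # DLID (lower 20 bits)
    qw = pack_field(qw, sc, 52, 56)               # Service Class
    qw = pack_field(qw, rc, 57, 59)               # Routing Control
    qw = pack_field(qw, fecn, 60, 60)             # FECN bit
    qw = pack_field(qw, 0b10, 61, 62)             # L2 = 10 (16B format)
    qw = pack_field(qw, 1, 63, 63)                # LT = 1 (head flit)
    return qw


def build_qw1(slid, dlid, pkey, entropy):
    """Assemble Quad Word 1; bits 48-63 are reserved and left as zero."""
    qw = 0
    qw = pack_field(qw, 0x78, 0, 7)                 # L4 type: ETHERNET
    qw = pack_field(qw, (slid >> 20) & 0xF, 8, 11)  # SLID[23:20]
    qw = pack_field(qw, (dlid >> 20) & 0xF, 12, 15) # DLID[23:20]
    qw = pack_field(qw, pkey, 16, 31)               # PKEY
    qw = pack_field(qw, entropy, 32, 47)            # Entropy
    return qw
```

Splitting a 24-bit LID this way places its lower 20 bits in Quad Word 0
and its upper 4 bits in Quad Word 1, as the table describes.
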
The Ethernet packet is padded on the transmit side to ensure that the
VNIC OPA packet is quad word aligned. The 'Tail' field contains the
number of bytes padded. On the receive side, the 'Tail' field is read and
the padding is removed (along with the ICRC, Tail and OPA header) before
passing the packet up the network stack.

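The padding arithmetic can be sketched as follows. The byte counts are
derived from the packet format table rather than taken from the driver:
the OPA header is assumed to span Quad Word 0, Quad Word 1 and the first
32 bits of Quad Word 2 (20 bytes), the ICRC is 4 bytes, and Tail plus LT
share 1 byte. The function and constant names are hypothetical.

```python
QUAD_WORD_BYTES = 8
OPA_HDR_BYTES = 20   # QW0 + QW1 + reserved and L4 header fields of QW2
ICRC_BYTES = 4       # bits 24-55 of the last quad word
TAIL_BYTES = 1       # Tail (6 bits) + LT (2 bits)


def pad_len(eth_len):
    """Bytes of padding needed so the whole packet is quad word aligned."""
    unpadded = OPA_HDR_BYTES + eth_len + ICRC_BYTES + TAIL_BYTES
    return (-unpadded) % QUAD_WORD_BYTES


def total_len(eth_len):
    """Length of the encapsulated packet on the wire, in bytes."""
    return (OPA_HDR_BYTES + eth_len + pad_len(eth_len)
            + ICRC_BYTES + TAIL_BYTES)


def eth_len_on_receive(packet_len, tail):
    """Recover the Ethernet length from the packet length and Tail value."""
    return packet_len - OPA_HDR_BYTES - ICRC_BYTES - TAIL_BYTES - tail
```

For instance, under these assumptions a 60-byte Ethernet frame needs 3
bytes of padding, giving an 88-byte (11 quad word) packet, and the
receiver recovers 60 by subtracting the header, ICRC, Tail byte and the
Tail value.
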
The L4 header field contains the ID of the virtual Ethernet switch that
the VNIC port belongs to. On the receive side, this field is used to
de-multiplex the received VNIC packets to different VNIC ports.

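The receive-side de-multiplexing can be pictured as a table lookup keyed
by the virtual Ethernet switch ID. The sketch below is only an
illustration: it assumes bit 0 is the least significant bit of Quad
Word 2 and that the whole 16-bit L4 header field is used as the ID. The
class and method names are hypothetical.

```python
def vesw_id_from_qw2(qw2):
    """Extract the L4 header field (bits 16-31) from Quad Word 2."""
    return (qw2 >> 16) & 0xFFFF


class VnicDemux:
    """Maps a virtual Ethernet switch ID to the VNIC port serving it."""

    def __init__(self):
        self._ports = {}

    def add_port(self, vesw_id, port):
        if vesw_id in self._ports:
            raise ValueError("switch ID %d already has a port" % vesw_id)
        self._ports[vesw_id] = port

    def lookup(self, qw2):
        """Return the port for a received packet, or None if unknown."""
        return self._ports.get(vesw_id_from_qw2(qw2))
```

A packet whose switch ID has no registered port yields None here; what a
real receiver does with such a packet is outside the scope of this
sketch.
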
Driver Design
==============
The Intel OPA VNIC software design is presented in the diagram below.
OPA VNIC functionality has a HW dependent component and a HW
independent component.

Support has been added for the IB device to allocate and free RDMA
netdev devices. The RDMA netdev supports interfacing with the network
stack, thus creating standard network interfaces. OPA_VNIC is an RDMA
netdev device type.

The HW dependent VNIC functionality is part of the HFI1 driver. It
implements the verbs to allocate and free the OPA_VNIC RDMA netdev.
It involves HW resource allocation/management for VNIC functionality.
It interfaces with the network stack and implements the required
net_device_ops functions. It expects Omni-Path encapsulated Ethernet
packets in the transmit path and provides HW access to them. It strips
the Omni-Path header from the received packets before passing them up
the network stack. It also implements the RDMA netdev control operations.

The OPA VNIC module implements the HW independent VNIC functionality.
It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
registers itself with IB core as an IB client and interfaces with the
IB MAD stack. It exchanges the management information with the Ethernet
Manager (EM) and the VNIC netdev. The VNIC netdev part allocates and frees
the OPA_VNIC RDMA netdev devices. It overrides the net_device_ops functions
set by the HW dependent VNIC driver where required to accommodate any
control operation. It also handles the encapsulation of Ethernet packets
with an Omni-Path header in the transmit path. For each VNIC interface,
the information required for encapsulation is configured by the EM via the
VEMA MAD interface. It also passes any control information to the HW
dependent driver by invoking the RDMA netdev control operations.

        +-------------------+ +----------------------+
        |                   | |       Linux          |
        |     IB MAD        | |      Network         |
        |                   | |       Stack          |
        +-------------------+ +----------------------+
                 |               |          |
                 |               |          |
        +----------------------------+      |
        |                            |      |
        |      OPA VNIC Module       |      |
        |  (OPA VNIC RDMA Netdev     |      |
        |     & EMA functions)       |      |
        |                            |      |
        +----------------------------+      |
                    |                       |
                    |                       |
           +------------------+             |
           |     IB core      |             |
           +------------------+             |
                    |                       |
                    |                       |
        +--------------------------------------------+
        |                                            |
        |      HFI1 Driver with VNIC support         |
        |                                            |
        +--------------------------------------------+
153 +--------------------------------------------+