Segmentation Offloads in the Linux Networking Stack

Introduction
============

This document describes a set of techniques in the Linux networking stack
to take advantage of segmentation offload capabilities of various NICs.

The following technologies are described:
 * TCP Segmentation Offload - TSO
 * UDP Fragmentation Offload - UFO
 * IPIP, SIT, GRE, and UDP Tunnel Offloads
 * Generic Segmentation Offload - GSO
 * Generic Receive Offload - GRO
 * Partial Generic Segmentation Offload - GSO_PARTIAL

TCP Segmentation Offload
========================

TCP segmentation allows a device to segment a single frame into multiple
frames with a data payload size specified in skb_shinfo()->gso_size.
When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
skb_shinfo()->gso_size should be set to a non-zero value.
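
As a rough illustration of how gso_size determines the resulting frames,
the userland sketch below computes how many segments a payload yields and
how many bytes each one carries. The helper names tso_segment_count() and
tso_segment_len() are hypothetical and are not kernel APIs; the sketch
assumes every segment except the last carries exactly gso_size bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a kernel API: number of segments produced for
 * payload_len bytes of TCP payload when each segment carries at most
 * gso_size bytes.
 */
static size_t tso_segment_count(size_t payload_len, size_t gso_size)
{
	assert(gso_size != 0);	/* gso_size must be non-zero for TSO */
	return (payload_len + gso_size - 1) / gso_size;
}

/* Hypothetical helper: payload bytes carried by segment idx (0-based).
 * Only the last segment may be shorter than gso_size.
 */
static size_t tso_segment_len(size_t payload_len, size_t gso_size, size_t idx)
{
	size_t offset = idx * gso_size;

	assert(gso_size != 0 && offset < payload_len);
	return payload_len - offset < gso_size ? payload_len - offset
						: gso_size;
}
```

For example, under these assumptions a 4000 byte payload with a gso_size of
1448 is split into two full segments and one shorter trailing segment.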

TCP segmentation is dependent on support for the use of partial checksum
offload. For this reason TSO is normally disabled if the Tx checksum
offload for a given device is disabled.

In order to support TCP segmentation offload it is necessary to populate
the network and transport header offsets of the skbuff so that the device
drivers will be able to determine the offsets of the IP or IPv6 header and
the TCP header. In addition, as CHECKSUM_PARTIAL is required, csum_start
should also point to the TCP header of the packet.

For IPv4 segmentation we support one of two types in terms of the IP ID.
The default behavior is to increment the IP ID with every segment. If the
GSO type SKB_GSO_TCP_FIXEDID is specified then we will not increment the IP
ID and all segments will use the same IP ID. If a device has
NETIF_F_TSO_MANGLEID set then the IP ID can be ignored when performing TSO
and we will either increment the IP ID for all frames, or leave it at a
static value based on driver preference.
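
The two IP ID behaviors can be sketched as follows. This is a simplified
userland model, not kernel code: assign_ip_ids() and its fixed_id parameter
are hypothetical names, where fixed_id stands in for the case in which
SKB_GSO_TCP_FIXEDID is set. The IPv4 ID is a 16-bit field, so the
incrementing case wraps modulo 65536.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel API: fill ids[] with the IPv4 ID that
 * each of nsegs segments would carry.
 *
 * fixed_id == false: default behavior, the ID increments per segment.
 * fixed_id == true:  models SKB_GSO_TCP_FIXEDID, every segment reuses
 *                    first_id.
 */
static void assign_ip_ids(uint16_t first_id, bool fixed_id,
			  uint16_t *ids, size_t nsegs)
{
	assert(ids != NULL || nsegs == 0);

	for (size_t i = 0; i < nsegs; i++)
		ids[i] = fixed_id ? first_id : (uint16_t)(first_id + i);
}
```

A device with NETIF_F_TSO_MANGLEID set may produce either pattern, so code
that depends on one specific pattern should not assume it for such devices.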

UDP Fragmentation Offload
=========================

UDP fragmentation offload allows a device to fragment an oversized UDP
datagram into multiple IPv4 fragments. Many of the requirements for UDP
fragmentation offload are the same as TSO. However, the IPv4 ID for
fragments should not increment, as a single IPv4 datagram is fragmented.

UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
still receive them from tuntap and similar devices. Offload of UDP-based
tunnel protocols is still supported.
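
To make the fragmentation layout concrete, the sketch below plans the IPv4
fragments for one datagram using the standard IPv4 rules: every fragment
carries the same ID, offsets are expressed in 8-byte units, and the "more
fragments" flag is set on all fragments except the last. struct frag_desc,
plan_fragments(), and the max_frag_payload parameter are hypothetical names
for illustration only and do not correspond to kernel structures or APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct frag_desc {		/* illustrative type, not a kernel structure */
	uint16_t id;		/* IPv4 ID, identical for every fragment */
	uint16_t offset8;	/* fragment offset in 8-byte units */
	uint16_t len;		/* IP payload bytes in this fragment */
	bool more_fragments;	/* MF flag: set on all but the last */
};

/* Hypothetical helper: describe the fragments for ip_payload_len bytes of
 * IP payload (UDP header plus data). Returns the number of descriptors
 * written, at most out_cap.
 */
static size_t plan_fragments(uint16_t id, size_t ip_payload_len,
			     size_t max_frag_payload,
			     struct frag_desc *out, size_t out_cap)
{
	/* Non-final fragments must carry a multiple of 8 bytes. */
	size_t step = max_frag_payload & ~(size_t)7;
	size_t n = 0;

	assert(step != 0);
	for (size_t off = 0; off < ip_payload_len && n < out_cap;
	     off += step, n++) {
		size_t left = ip_payload_len - off;

		out[n].id = id;
		out[n].offset8 = (uint16_t)(off / 8);
		out[n].len = (uint16_t)(left < step ? left : step);
		out[n].more_fragments = left > step;
	}
	return n;
}
```

Contrast this with TSO, where the default is for each segment to carry its
own incrementing IP ID.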

IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
========================================================

In addition to the offloads described above it is possible for a frame to
contain additional headers such as an outer tunnel. In order to account
for such instances an additional set of segmentation offload types was
introduced including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
SKB_GSO_UDP_TUNNEL. These extra segmentation types are used to identify
cases where there is more than one set of headers. For example, in the
case of IPIP and SIT we should have the network and transport headers moved
from the standard list of headers to "inner" header offsets.

Currently only two levels of headers are supported. The convention is to
refer to the tunnel headers as the outer headers, while the encapsulated
data is normally referred to as the inner headers. Below is the list of
calls to access the given headers:

IPIP/SIT Tunnel:
             Outer                   Inner
MAC          skb_mac_header
Network      skb_network_header      skb_inner_network_header
Transport    skb_transport_header

UDP/GRE Tunnel:
             Outer                   Inner
MAC          skb_mac_header          skb_inner_mac_header
Network      skb_network_header      skb_inner_network_header
Transport    skb_transport_header    skb_inner_transport_header

In addition to the above tunnel types there are also SKB_GSO_GRE_CSUM and
SKB_GSO_UDP_TUNNEL_CSUM. These two additional tunnel types reflect the
fact that the outer header also requests a non-zero checksum.

Finally there is SKB_GSO_REMCSUM which indicates that a given tunnel header
has requested a remote checksum offload. In this case the inner headers
will be left with a partial checksum and only the outer header checksum
will be computed.

Generic Segmentation Offload
============================

Generic segmentation offload is a pure software offload that is meant to
deal with cases where device drivers cannot perform the offloads described
above. What occurs in GSO is that a given skbuff will have its data broken
out over multiple skbuffs that have been resized to match the MSS provided
via skb_shinfo()->gso_size.
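
The splitting step can be modeled in userland as shown below. This is only
a sketch under simplifying assumptions: struct seg is an illustrative
stand-in for an output skbuff, gso_split() is a hypothetical name rather
than a kernel function, and header handling is reduced to tracking the TCP
sequence number of the first payload byte in each segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct seg {			/* illustrative stand-in for one skbuff */
	const uint8_t *data;	/* points into the original payload */
	size_t len;		/* payload bytes, at most mss */
	uint32_t tcp_seq;	/* sequence number of the first byte */
};

/* Hypothetical helper: break len bytes of payload into segments of at
 * most mss bytes each. Returns the number of segments written, at most
 * out_cap.
 */
static size_t gso_split(const uint8_t *payload, size_t len, size_t mss,
			uint32_t first_seq, struct seg *out, size_t out_cap)
{
	size_t n = 0;

	assert(mss != 0);
	for (size_t off = 0; off < len && n < out_cap; off += mss, n++) {
		size_t left = len - off;

		out[n].data = payload + off;
		out[n].len = left < mss ? left : mss;
		/* Each segment starts where the previous one ended. */
		out[n].tcp_seq = first_seq + (uint32_t)off;
	}
	return n;
}
```

The real implementation also has to replicate and fix up the protocol
headers and checksums for each resulting skbuff, which this model omits.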

Before enabling any hardware segmentation offload, a corresponding software
offload is required in GSO. Otherwise it becomes possible for a frame to
be re-routed between devices and end up being impossible to transmit.

Generic Receive Offload
=======================

Generic receive offload is the complement to GSO. Ideally any frame
assembled by GRO should be segmented to create an identical sequence of
frames using GSO, and any sequence of frames segmented by GSO should be
able to be reassembled back to the original by GRO. The only exception to
this is IPv4 ID in the case that the DF bit is set for a given IP header.
If the value of the IPv4 ID is not sequentially incrementing it will be
altered so that it is when a frame assembled via GRO is segmented via GSO.

Partial Generic Segmentation Offload
====================================

Partial generic segmentation offload is a hybrid between TSO and GSO. What
it effectively does is take advantage of certain traits of TCP and tunnels
so that instead of having to rewrite the packet headers for each segment
only the inner-most transport header and possibly the outer-most network
header need to be updated. This allows devices that do not support tunnel
offloads or tunnel offloads with checksum to still make use of segmentation.

With the partial offload what occurs is that all headers excluding the
inner transport header are updated such that they will contain the values
that are correct if the header is simply duplicated. The one exception to
this is the outer IPv4 ID field. It is up to the device drivers to
guarantee that the IPv4 ID field is incremented in the case that a given
header does not have the DF bit set.