net: skb_shared_info optimization
skb_dma_unmap() is quite expensive for small packets,
because we use two different cache lines from skb_shared_info.
One to access nr_frags, one to access dma_maps[0]
Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements,
let dma_head alone in a new dma_head field, close to nr_frags,
to reduce cache lines misses.
Tested on my dev machine (bnx2 & tg3 adapters), nice speedup !
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index a39b534..46a3f86 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -5021,7 +5021,7 @@
/* New SKB is guaranteed to be linear. */
entry = *start;
ret = skb_dma_map(&tp->pdev->dev, new_skb, DMA_TO_DEVICE);
- new_addr = skb_shinfo(new_skb)->dma_maps[0];
+ new_addr = skb_shinfo(new_skb)->dma_head;
/* Make sure new skb does not cross any 4G boundaries.
* Drop the packet if it does.
@@ -5155,7 +5155,7 @@
sp = skb_shinfo(skb);
- mapping = sp->dma_maps[0];
+ mapping = sp->dma_head;
tp->tx_buffers[entry].skb = skb;
@@ -5173,7 +5173,7 @@
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
len = frag->size;
- mapping = sp->dma_maps[i + 1];
+ mapping = sp->dma_maps[i];
tp->tx_buffers[entry].skb = NULL;
tg3_set_txd(tp, entry, mapping, len,
@@ -5331,7 +5331,7 @@
sp = skb_shinfo(skb);
- mapping = sp->dma_maps[0];
+ mapping = sp->dma_head;
tp->tx_buffers[entry].skb = skb;
@@ -5356,7 +5356,7 @@
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
len = frag->size;
- mapping = sp->dma_maps[i + 1];
+ mapping = sp->dma_maps[i];
tp->tx_buffers[entry].skb = NULL;