Merge branch 'tunnel_dst_caching'

Tom Herbert says:

====================
ipv4: Cache dst in tunnels

Version 3 of caching routes in tunnels.

Addressed some comments from Eric in this series.

There are two patches (variants) in the series:
1) One dst cached for each tunnel.
2) Percpu dst cache per tunnel, to avoid false sharing.
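
As a rough illustration of variant 2, the following userspace sketch keeps
one cache-line-aligned slot per CPU, so a CPU updating its cached route does
not dirty a line that other CPUs read. It is not the code from the patches:
struct route stands in for the kernel's struct dst_entry, and NR_CPUS_SKETCH,
CACHE_LINE_SIZE, struct dst_slot and the tunnel_dst_*() helpers are
hypothetical names. Reference counting and route validity checks are left out.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 32   /* assumption: matches the 32 CPU test host */
#define CACHE_LINE_SIZE 64  /* assumption: a common x86 cache line size */

/* Stand-in for the kernel's struct dst_entry (a cached route). */
struct route {
	int id;
};

/* One slot per CPU, aligned so that no two slots share a cache line. */
struct dst_slot {
	_Alignas(CACHE_LINE_SIZE) struct route *dst;
};

/* Hypothetical tunnel state: a private cached route for every CPU. */
struct tunnel {
	struct dst_slot cache[NR_CPUS_SKETCH];
};

/* Return the route cached for this CPU, or NULL on a miss. */
static struct route *tunnel_dst_get(const struct tunnel *t, unsigned int cpu)
{
	return t->cache[cpu % NR_CPUS_SKETCH].dst;
}

/* Cache a route after a lookup; writes only this CPU's slot. */
static void tunnel_dst_set(struct tunnel *t, unsigned int cpu, struct route *rt)
{
	t->cache[cpu % NR_CPUS_SKETCH].dst = rt;
}

/* Drop every cached route (hypothetical trigger: tunnel reconfiguration). */
static void tunnel_dst_reset_all(struct tunnel *t)
{
	for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
		t->cache[i].dst = NULL;
}
```

With variant 1 the tunnel would hold a single such pointer shared by all
CPUs. That is simpler and smaller, but every update to it lands on a cache
line that all CPUs transmitting on the tunnel also read.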

Testing with GRE tunnels on a 32 CPU host with bnx2x (RSS support
for GRE), running 200 TCP_RR netperf clients, shows a modest
improvement in CPU utilization with these patches.

Without patches
71.22% CPU utilization
138/180/244 90/95/99% latencies
1.30465e+06 tps
18318 tps/CPU%

With patches
69.84% CPU utilization
142/186/249 90/95/99% latencies
1.30827e+06 tps
18732 tps/CPU%
====================

Signed-off-by: David S. Miller <davem@davemloft.net>