sparc64: fix and optimize irq distribution
irq_choose_cpu() should compare the affinity mask against cpu_online_map
rather than CPU_MASK_ALL, since irq_select_affinity() sets the interrupt's
affinity mask to cpu_online_map "and" CPU_MASK_ALL (which ends up being
just cpu_online_map). The mask comparison in irq_choose_cpu() will always
fail since the two masks are not the same. So the CPU chosen is the first CPU
in the intersection of cpu_online_map and CPU_MASK_ALL, which is always CPU0.
That means all interrupts are reassigned to CPU0...
Distributing interrupts to CPUs in a linearly increasing round robin fashion
is not optimal for the UltraSPARC T1/T2. Also, the irq_rover in
irq_choose_cpu() causes an interrupt to be assigned to a different
processor each time the interrupt is allocated and released. This may lead
to an unbalanced distribution over time.
A static mapping of interrupts to processors is done to optimize and balance
interrupt distribution. For the T1/T2, interrupts are spread to different
cores first, and then to strands within a core.
The following is some benchmarks showing the effects of interrupt
distribution on a T2. The test was done with iperf using a pair of T5220
boxes, each with a 10GBe NIU (XAUI) connected back to back.
TCP | Stock Linear RR IRQ Optimized IRQ
Streams | 2.6.30-rc5 Distribution Distribution
| GBits/sec GBits/sec GBits/sec
--------+-----------------------------------------
1 0.839 0.862 0.868
8 1.16 4.96 5.88
16 1.15 6.40 8.04
100 1.09 7.28 8.68
Signed-off-by: Hong H. Pham <hong.pham@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/arch/sparc/kernel/irq_64.c b/arch/sparc/kernel/irq_64.c
index e5e78f9..bd07505 100644
--- a/arch/sparc/kernel/irq_64.c
+++ b/arch/sparc/kernel/irq_64.c
@@ -45,6 +45,7 @@
#include <asm/cacheflush.h>
#include "entry.h"
+#include "cpumap.h"
#define NUM_IVECS (IMAP_INR + 1)
@@ -256,35 +257,13 @@
int cpuid;
cpumask_copy(&mask, irq_desc[virt_irq].affinity);
- if (cpus_equal(mask, CPU_MASK_ALL)) {
- static int irq_rover;
- static DEFINE_SPINLOCK(irq_rover_lock);
- unsigned long flags;
-
- /* Round-robin distribution... */
- do_round_robin:
- spin_lock_irqsave(&irq_rover_lock, flags);
-
- while (!cpu_online(irq_rover)) {
- if (++irq_rover >= nr_cpu_ids)
- irq_rover = 0;
- }
- cpuid = irq_rover;
- do {
- if (++irq_rover >= nr_cpu_ids)
- irq_rover = 0;
- } while (!cpu_online(irq_rover));
-
- spin_unlock_irqrestore(&irq_rover_lock, flags);
+ if (cpus_equal(mask, cpu_online_map)) {
+ cpuid = map_to_cpu(virt_irq);
} else {
cpumask_t tmp;
cpus_and(tmp, cpu_online_map, mask);
-
- if (cpus_empty(tmp))
- goto do_round_robin;
-
- cpuid = first_cpu(tmp);
+ cpuid = cpus_empty(tmp) ? map_to_cpu(virt_irq) : first_cpu(tmp);
}
return cpuid;