[SCSI] fcoe: Amends previous patch, Round-robin based selection of CPU for post processing of incoming request for FCoE target

Problem: Selection of RX queue on target is based on RX-ID. FCoE used
8 Net Rx queues.  HW post the packets based on rx_id %
num_rx_queue. Due to this has based filtering, only one CPU is busy
servicing incoming request including post-processing of incoming
request. This is gating factor because

1. Only one CPU is utilized 100% while others CPUs are not used at all.

2. CPU which received request assign "sequence' by selecting exchange
   from per CPU pool (num_ddp_context / num_online_cpus,
   approxi.). Due to which if if rate of incoming request is higher
   than rate of servicing request, existing code path end of sending
   "BUSY" response (SAM_STAT_BUSY because unable to allocate
   exchange).

Fix: Fan-out incoming request to all other CPUs excluding the CPU
which is receiving all incoiming request. This path also addresses,
selecting same CPU based on rx_id from received frame for completion
of the request such as "releasing exchange to the per CPU Pool". This
fix is applicable for FCoE target since initiator code path already
takes care of selecting CPU to complete post-processing of request
once OX_ID is assigned.

Notes: N/A

Dependencines: N/A

Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index da73115d..522fbaa 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1358,7 +1358,11 @@
 			do {
 				cpu = fcoe_select_cpu(cpu);
 			} while (!cpu_online(cpu));
-		}
+		} else  if ((fh->fh_type == FC_TYPE_FCP) &&
+			    (ntohs(fh->fh_rx_id) != FC_XID_UNKNOWN)) {
+			cpu = ntohs(fh->fh_rx_id) & fc_cpu_mask;
+		} else
+			cpu = smp_processor_id();
 	}
 	fps = &per_cpu(fcoe_percpu, cpu);
 	spin_lock_bh(&fps->fcoe_rx_list.lock);