KVM: PPC: Book3S HV: Add more barriers in XIVE load/unload code
On POWER9 systems, we push the VCPU context onto the XIVE (eXternal
Interrupt Virtualization Engine) hardware when entering a guest,
and pull the context off the XIVE when exiting the guest. The push
is done with cache-inhibited stores, and the pull with cache-inhibited
loads.
Testing has revealed that it is possible (though very rare) for
the stores to get reordered with the loads so that we end up with the
guest VCPU context still loaded on the XIVE after we have exited the
guest. When that happens, it is possible for the same VCPU context
to then get loaded on another CPU, which causes the machine to
checkstop.
To fix this, we add I/O barrier instructions (eieio) before and
after the push and pull operations. As partial compensation for the
potential slowdown caused by the extra barriers, we remove the eieio
instructions between the two stores in the push operation, and between
the two loads in the pull operation. (The architecture requires
loads to cache-inhibited, guarded storage to be kept in order, and
requires stores to cache-inhibited, guarded storage likewise to be
kept in order, but allows such loads and stores to be reordered with
respect to each other.)
Reported-by: Carol L Soto <clsoto@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c700bed..42639fb 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -989,13 +989,14 @@
beq no_xive
ld r11, VCPU_XIVE_SAVED_STATE(r4)
li r9, TM_QW1_OS
- stdcix r11,r9,r10
eieio
+ stdcix r11,r9,r10
lwz r11, VCPU_XIVE_CAM_WORD(r4)
li r9, TM_QW1_OS + TM_WORD2
stwcix r11,r9,r10
li r9, 1
stw r9, VCPU_XIVE_PUSHED(r4)
+ eieio
no_xive:
#endif /* CONFIG_KVM_XICS */
@@ -1401,8 +1402,8 @@
cmpldi cr0, r10, 0
beq 1f
/* First load to pull the context, we ignore the value */
- lwzx r11, r7, r10
eieio
+ lwzx r11, r7, r10
/* Second load to recover the context state (Words 0 and 1) */
ldx r11, r6, r10
b 3f
@@ -1410,8 +1411,8 @@
cmpldi cr0, r10, 0
beq 1f
/* First load to pull the context, we ignore the value */
- lwzcix r11, r7, r10
eieio
+ lwzcix r11, r7, r10
/* Second load to recover the context state (Words 0 and 1) */
ldcix r11, r6, r10
3: std r11, VCPU_XIVE_SAVED_STATE(r9)
@@ -1421,6 +1422,7 @@
stw r10, VCPU_XIVE_PUSHED(r9)
stb r10, (VCPU_XIVE_SAVED_STATE+3)(r9)
stb r0, (VCPU_XIVE_SAVED_STATE+4)(r9)
+ eieio
1:
#endif /* CONFIG_KVM_XICS */
/* Save more register state */