kvm: mmu: ITLB_MULTIHIT mitigation With some Intel processors, putting the same virtual address in the TLB as both a 4 KiB and 2 MiB page can confuse the instruction fetch unit and cause the processor to issue a machine check resulting in a CPU lockup. Unfortunately when EPT page tables use huge pages, it is possible for a malicious guest to cause this situation. Add a knob to mark huge pages as non-executable. When the nx_huge_pages parameter is enabled (and we are using EPT), all huge pages are marked as NX. If the guest attempts to execute in one of those pages, the page is broken down into 4K pages, which are then marked executable. This is not an issue for shadow paging (except nested EPT), because then the host is in control of TLB flushes and the problematic situation cannot happen. With nested EPT, again the nested guest can cause problems shadow and direct EPT is treated in the same way. [ tglx: Fixup default to auto and massage wording a bit ] Originally-by: Junaid Shahid <junaids@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

commit: b8e8c8303ff28c61046a4d0f6ea99aea609a7dc0 [log] [tgz]
author: Paolo Bonzini <pbonzini@redhat.com> Mon Nov 04 12:22:02 2019 +0100
committer: Thomas Gleixner <tglx@linutronix.de> Mon Nov 04 12:22:02 2019 +0100
tree: f5e0c8bb8b968eb158c07b274d9fdc465fdf95e8
parent: 731dc9df975a5da21237a18c3384f811a7a41cc6 [diff] [blame]
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 32d70ca..b087d17 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c

@@ -213,6 +213,7 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ "mmu_unsync", VM_STAT(mmu_unsync) },
 	{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
 	{ "largepages", VM_STAT(lpages, .mode = 0444) },
+	{ "nx_largepages_splitted", VM_STAT(nx_lpage_splits, .mode = 0444) },
 	{ "max_mmu_page_hash_collisions",
 		VM_STAT(max_mmu_page_hash_collisions) },
 	{ NULL }
@@ -1280,6 +1281,14 @@ static u64 kvm_get_arch_capabilities(void)
 		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, data);
 
 	/*
+	 * If nx_huge_pages is enabled, KVM's shadow paging will ensure that
+	 * the nested hypervisor runs with NX huge pages.  If it is not,
+	 * L1 is anyway vulnerable to ITLB_MULTIHIT explots from other
+	 * L1 guests, so it need not worry about its own (L2) guests.
+	 */
+	data |= ARCH_CAP_PSCHANGE_MC_NO;
+
+	/*
 	 * If we're doing cache flushes (either "always" or "cond")
 	 * we will do one whenever the guest does a vmlaunch/vmresume.
 	 * If an outer hypervisor is doing the cache flush for us
commit	b8e8c8303ff28c61046a4d0f6ea99aea609a7dc0	[log] [tgz]
author	Paolo Bonzini <pbonzini@redhat.com>	Mon Nov 04 12:22:02 2019 +0100
committer	Thomas Gleixner <tglx@linutronix.de>	Mon Nov 04 12:22:02 2019 +0100
tree	f5e0c8bb8b968eb158c07b274d9fdc465fdf95e8
parent	731dc9df975a5da21237a18c3384f811a7a41cc6 [diff] [blame]