intel-iommu: Fix address wrap on 32-bit kernel.

The problem is in dma_pte_clear_range and dma_pte_free_pagetable. When
intel_unmap_single and intel_unmap_sg call them, the end address may be
zero if the 'start_addr + size' rounds up. So no PTE gets cleared. The
uncleared PTE fires the BUG_ON when it's used again to create new mappings.

After I modified dma_pte_clear_range a bit, the BUG_ON is gone.

Tested both 32 and 32 PAE modes on Intel X58 and Q35 platforms.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 0c12d06..002c8b9 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -718,15 +718,17 @@
 static void dma_pte_clear_range(struct dmar_domain *domain, u64 start, u64 end)
 {
 	int addr_width = agaw_to_width(domain->agaw);
+	int npages;
 
 	start &= (((u64)1) << addr_width) - 1;
 	end &= (((u64)1) << addr_width) - 1;
 	/* in case it's partial page */
 	start = PAGE_ALIGN(start);
 	end &= PAGE_MASK;
+	npages = (end - start) / VTD_PAGE_SIZE;
 
 	/* we don't need lock here, nobody else touches the iova range */
-	while (start < end) {
+	while (npages--) {
 		dma_pte_clear_one(domain, start);
 		start += VTD_PAGE_SIZE;
 	}