ARM64: Optimization of HRem and HDiv when the denominator is a power of 2

On ARM64, when the denominator is a power of 2, HDiv and HRem can be
lowered to fewer instructions. For example, a/2 can be lowered to
add+asr and a%2 to cmp+and+csneg. Currently, division by a power of 2
always uses four instructions and the remainder always uses five.

This patch optimizes division by 2 (lowered to two instructions),
the remainder of division by 2 (lowered to three instructions), and
the remainder of division by a power of 2 (lowered to four
instructions).
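
The arithmetic behind these lowerings can be sketched in plain C++ for
32-bit operands. This is an illustration only, not the code generator
itself: the function names are hypothetical, and the four-instruction
sequence assumed for the general remainder (negs+and+and+csneg) is a
guess at one plausible shape, since only the add+asr and cmp+and+csneg
sequences are named above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// a / 2 as add+asr: add the sign bit so that the arithmetic shift
// rounds toward zero instead of toward negative infinity.
// Assumes >> on a negative value is an arithmetic shift
// (guaranteed since C++20, universal in practice before that).
int32_t DivBy2(int32_t a) {
  uint32_t sign = static_cast<uint32_t>(a) >> 31;    // 1 if a < 0, else 0
  int64_t biased = static_cast<int64_t>(a) + sign;   // add
  return static_cast<int32_t>(biased >> 1);          // asr
}

// a % 2 as cmp+and+csneg: take the low bit, then negate it when the
// dividend is negative so the result carries the dividend's sign.
int32_t RemBy2(int32_t a) {
  int32_t low = a & 1;          // and
  return a >= 0 ? low : -low;   // cmp + csneg
}

// a % pow2 in four steps (assumed sequence: negs+and+and+csneg).
// pow2 is passed as its absolute value and must be a power of 2.
int32_t RemByPow2(int32_t a, uint32_t pow2) {
  uint32_t mask = pow2 - 1;
  uint32_t ua = static_cast<uint32_t>(a);
  uint32_t neg = 0u - ua;          // negs: -a with wraparound, sets flags
  uint32_t pos_rem = ua & mask;    // and: remainder if a is positive
  uint32_t neg_rem = neg & mask;   // and: |remainder| if a is negative
  bool negated_is_negative = (neg >> 31) != 0;  // the "mi" condition
  // csneg: keep pos_rem when -a is negative, else return -neg_rem.
  return negated_is_negative ? static_cast<int32_t>(pos_rem)
                             : -static_cast<int32_t>(neg_rem);
}

// Compares the sketches against the built-in operators, which truncate
// toward zero like Java's / and %.
void CheckLowerings() {
  const int32_t samples[] = {0, 1, -1, 2, -2, 7, -7, 1000, -1000,
                             std::numeric_limits<int32_t>::max(),
                             std::numeric_limits<int32_t>::min()};
  for (int32_t a : samples) {
    assert(DivBy2(a) == a / 2);
    assert(RemBy2(a) == a % 2);
    for (int shift = 1; shift <= 30; ++shift) {
      int32_t d = int32_t{1} << shift;
      assert(RemByPow2(a, static_cast<uint32_t>(d)) == a % d);
    }
  }
}
```

Each function body maps to one instruction per commented step, which
is where the two-, three- and four-instruction counts come from.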

Performance improvements on Pixel 2. Each value is the geomean of the
diff for a benchmark group (%); max is the largest diff seen for a
single case in the group. Higher is better.
Big core:
algorithm                 0.664 (max: 1.6)
intrinsics                5.813 (max: 19.0)
micro                     4.734 (max: 22.0)

Little core:
algorithm                 2.097 (max: 5.4)
intrinsics               14.610 (max: 27.3)
micro                    12.687 (max: 35.6)

Test: 012-math, 014-math3, 411-optimizing-arith, 411-checker-hdiv-hrem-pow2
Test: test-art-host, test-art-target
Change-Id: Iaaec6dc8fc0ec5df2b2d0e8692d5dea573b8d284
6 files changed