ARM64: Combine LSR+ADD into ADD_shift for Int32 HDiv/HRem
HDiv/HRem having a constant divisor are optimized by using
multiplication of the dividend by a sort of reciprocal of the divisor.
In case of Int32 the multiplication is done into a 64-bit register
high 32 bits of which are only used.
The multiplication result might need some ADD/SUB corrections.
Currently it is done by extracting high 32 bits with LSR and applying
ADD/SUB. However we can do correcting ADD/SUB on high 32 bits and extracting
those bits with the final right shift. This will eliminate the
extracting LSR instruction.
This CL implements this optimization.
Test: test.py --host --optimizing --jit
Test: test.py --target --optimizing --jit
Change-Id: I5ba557aa283291fd76d61ac0eb733cf6ea975116
4 files changed