crypto: arm64/aes - performance tweak

Shuffle some instructions around in the __hround macro to shave off
0.1 cycles per byte on Cortex-A57.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
1 file changed