- Fix the bug where we zero too many bytes in sha512_Last
(SHORT_BLOCK_LENGTH != BLOCK_LENGTH -2).
- Get rid of an if branch.
- Don't reverse the last two words in 512_Last that are written later.
- make 256_Final and 512_Last look the same.
The old implementation needed 6 sha transformations per iterations:
- 2 for computing sha512 of seed,
- 2 for computing digests of ipads/opads,
- 2 for computing digests of intermediate hashes.
The first 4 transformations are the same in every iteration so we cache
them. A new function hmac_sha512_prepare computes these digests.
We made sha512_Transform visible in pbkdf2 and prevent unneccessary
big/little endian conversions back and forth.