This works because CPUs support hardware 64-bit rotate.
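As an illustration of the 64-bit rotate mentioned above, here is a minimal sketch in plain C. The names rotl64_sketch and rotr64_sketch are hypothetical and are not taken from the hashcat sources. Mainstream compilers typically recognize this shift-and-or idiom and emit a single rotate instruction where the CPU provides one.

```c
#include <stdint.h>

/* Rotate x left by n bits. Masking keeps both shift counts in 0..63,
   so n == 0 and n >= 64 never trigger an out-of-range shift. */
static uint64_t rotl64_sketch (const uint64_t x, const unsigned n)
{
  return (x << (n & 63)) | (x >> ((64 - n) & 63));
}

/* Rotate x right by n bits, with the same masking. */
static uint64_t rotr64_sketch (const uint64_t x, const unsigned n)
{
  return (x >> (n & 63)) | (x << ((64 - n) & 63));
}
```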
Added hc_umullo() and rewrote trunc_mul() for Argon2. No performance
impact, but trunc_mul() is now easier to read.
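For context, Argon2's BlaMka mixing step multiplies the low 32 bits of two 64-bit words and keeps the full 64-bit product. The sketch below shows one way to build that truncated multiply from separate low and high 32-bit multiplies. All names (umullo32_sketch, umulhi32_sketch, trunc_mul_sketch, blamka_sketch) are hypothetical and this is not the actual hc_umullo() or trunc_mul() code.

```c
#include <stdint.h>

/* Low 32 bits of a 32x32-bit product (assumed behavior of a umullo-style helper). */
static uint32_t umullo32_sketch (const uint32_t a, const uint32_t b)
{
  return (uint32_t) ((uint64_t) a * b);
}

/* High 32 bits of a 32x32-bit product (assumed behavior of a umulhi-style helper). */
static uint32_t umulhi32_sketch (const uint32_t a, const uint32_t b)
{
  return (uint32_t) (((uint64_t) a * b) >> 32);
}

/* 64-bit product of the low 32 bits of x and y. */
static uint64_t trunc_mul_sketch (const uint64_t x, const uint64_t y)
{
  const uint32_t xlo = (uint32_t) x;
  const uint32_t ylo = (uint32_t) y;

  return ((uint64_t) umulhi32_sketch (xlo, ylo) << 32) | umullo32_sketch (xlo, ylo);
}

/* BlaMka step as defined by Argon2: x + y + 2 * lo32(x) * lo32(y), modulo 2^64. */
static uint64_t blamka_sketch (const uint64_t x, const uint64_t y)
{
  return x + y + 2 * trunc_mul_sketch (x, y);
}
```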
Re-enabled USE_BITSELECT, USE_ROTATE, and USE_SWIZZLE for OpenCL. We
have a new unit test script; let's see if OpenCL runtimes have
improved.
The previous fix for -m 21800 in multihash mode was incomplete. Hashcat
now shows the correct cracked hash.
Re-enabled --hwmon-disable for users. Because hardware monitoring is
important for SCRYPT and Argon2 performance, a warning is now shown
when disabling it affects speed.
Updated hash modes with OPTS_TYPE_NATIVE_THREADS:
1376x, 1377x, 1378x, 14800, 19500 and 2300x.
Only the first hash in a multihash list was marked as cracked, regardless
of which hash was actually cracked. For example, if the second hash was
cracked, it incorrectly marked the first as cracked and left the second
uncracked. This issue only affected beta versions and only in multihash
cracking mode.
Added deep-comp kernel support for Kerberos modes 28800 and 28900,
enabling multihash cracking for the same user in the same domain, even if
the password was changed or the recording was bad.
Added a rule requiring that device buffer sizes for password candidates,
hooks, and transport (tmps) be smaller than 1/4 of the maximum
allocatable memory. If they are not, hashcat now automatically reduces
kernel-accel down to 1, then halves the number of threads and restores
kernel-accel up to its maximum, repeating until the size requirement is met.
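The reduction loop described above can be sketched roughly as follows. This is a simplified illustration, not hashcat's code. The tuning_sketch_t struct, the fit_buffers_sketch() function, and the size model (accel * threads * bytes per work item) are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tuning state for one device. */
typedef struct
{
  uint32_t kernel_accel;
  uint32_t kernel_threads;

} tuning_sketch_t;

/* Assumed size model: the buffer grows linearly with accel and threads. */
static uint64_t buffer_size_sketch (const tuning_sketch_t *t, const uint64_t bytes_per_item)
{
  return (uint64_t) t->kernel_accel * t->kernel_threads * bytes_per_item;
}

/* Shrink the tuning until the buffer is strictly smaller than max_alloc / 4.
   Returns false if even accel = 1 and threads = 1 does not fit. */
static bool fit_buffers_sketch (tuning_sketch_t *t, const uint32_t accel_max, const uint64_t bytes_per_item, const uint64_t max_alloc)
{
  const uint64_t limit = max_alloc / 4;

  while (buffer_size_sketch (t, bytes_per_item) >= limit)
  {
    if (t->kernel_accel > 1)
    {
      t->kernel_accel--; /* first step: reduce accel down to 1 */

      continue;
    }

    if (t->kernel_threads == 1) return false; /* nothing left to reduce */

    t->kernel_threads /= 2;      /* second step: halve the thread count */
    t->kernel_accel = accel_max; /* then restore accel and try again */
  }

  return true;
}
```

With accel 4, 64 threads, 8192 bytes per work item, and a 1 MiB maximum allocation, this sketch ends at accel 1 and 16 threads.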
Fixed salt length limit verification for -m 20712.
Fixed password length limit for -m 14400.
Fixed unit test salt generator for -m 21100, which could produce duplicate
hashes under certain conditions.
Added the OPTS_TYPE_NATIVE_THREADS flag to the following hash modes
(after benchmarking): 7700, 7701, 9000, 1375x, 1376x, 14800, 19500, 23900.
- Used blake2b_transform() instead of blake2b_update() to avoid compiler problems
on Intel OpenCL and segfaults on POCL (still unsure of exact cause but possibly
related to the shuffle functions in combination with these OpenCL drivers).
- Removed 'bug' comments (these are resolved now).
- Added implementation of 'argon2_hash_block()' for non-warped (CPU) case.
- Introduced 'LBLOCKSIZE' for the size of an argon2 block per thread in u64.
Most of the code should now be able to support any warp/wavefront size.
Improved handling of an autotune edge case. In theory, increasing
accel early can improve accuracy, and it does, but it also prevents
increasing the thread count because the kernel is then more likely to
run into the runtime limit. However, we want to prioritize threads over accel.
This change may slightly reduce performance for algorithms that
benefit from high accel and low thread counts (e.g., 7800, 14900),
but those can be managed by limiting thread count or, preferably,
by setting OPTS_TYPE_NATIVE_THREADS.
Added OPTS_TYPE_NATIVE_THREADS to 7800, 7810, and 14900.
Also fixed encoder bugs in hash modes 29920 and 29940, identified
using the new test_edge.sh script. The encoders in the modules
failed to properly terminate the output string.
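To illustrate the class of bug, here is a minimal sketch of an encoder that writes a hex string into a caller-supplied buffer and terminates it explicitly. The function encode_hex_sketch() is hypothetical and unrelated to the actual 29920 and 29940 module code. Without the final write of '\0', any bytes left over in the buffer would appear appended to the output.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode digest as lowercase hex into out.
   Returns the string length (excluding the terminator), or -1 if out is too small. */
static int encode_hex_sketch (const uint8_t *digest, const size_t digest_len, char *out, const size_t out_size)
{
  static const char hex[] = "0123456789abcdef";

  const size_t needed = (digest_len * 2) + 1; /* hex characters plus terminator */

  if (out_size < needed) return -1;

  size_t pos = 0;

  for (size_t i = 0; i < digest_len; i++)
  {
    out[pos++] = hex[digest[i] >> 4];
    out[pos++] = hex[digest[i] & 0x0f];
  }

  out[pos] = '\0'; /* the step a buggy encoder leaves out */

  return (int) pos;
}
```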