- Python extraction script that can generate a hash from a LUKS2 partition.
- For now, only Argon2id as the KDF, SHA-256 as the hash, and AES as the cipher are supported.
Re-enabled USE_BITSELECT for Intel GPUs.
Optimize the vector version of hc_swap32() to allow using a USE_SWIZZLE-based technique on OpenCL when USE_BITSELECT or USE_ROTATE is not set.
This affects the inner core of nearly all kernels and thus impacts
almost all hash modes. The only functional change is that we now
manually unroll the individual steps of the transform() functions,
saving a small amount of constant memory.
In most cases, JIT compilers would likely detect the unused constant
buffer and remove it automatically, but this makes it explicit.
Tested on newer NVIDIA devices: no speed change observed.
Tested on older NVIDIA devices: visible speed increase.
Tested on AMD devices: visible speed increase across all tested GPUs.
Not yet tested: CPUs, Intel iGPUs, Intel dGPUs.
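As a rough illustration of the hc_swap32() change noted above, the sketch below shows two equivalent ways to byte-swap a 32-bit word: one built from rotates and masks, and one that reorders the bytes directly, which is the general idea behind a swizzle. This is a Python model only. The function names are hypothetical and this is not hashcat's OpenCL code.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def swap32_rotate(x: int) -> int:
    """Byte swap built from two rotates and two masks (hypothetical name)."""
    return (rotl32(x, 8) & 0x00FF00FF) | (rotl32(x, 24) & 0xFF00FF00)


def swap32_swizzle(x: int) -> int:
    """Byte swap by reordering the four bytes directly (hypothetical name)."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


# Both variants reverse the byte order of the same input.
print(hex(swap32_rotate(0x11223344)))
print(hex(swap32_swizzle(0x11223344)))
```

Which variant is faster depends on the device and compiler, which is presumably why the choice is gated behind build-time defines.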
Add hc_uint4_t to SCRYPT to work around an Intel OpenCL alignment bug.
Align large buffers (V1-V4) manually to 1k-byte boundaries.
Replace uint4 xor operator with xor_uint4() function.
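Manual alignment of a buffer offset usually comes down to rounding up to the next multiple of the boundary. A minimal sketch, assuming "1k-byte" means 1024 bytes; the helper name align_up is hypothetical and is not taken from the hashcat source:

```python
def align_up(offset: int, alignment: int = 1024) -> int:
    """Round offset up to the next multiple of alignment.

    alignment must be a power of two so the mask trick below is valid.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


# Offsets already on a boundary stay put; everything else moves up.
for off in (0, 1, 1024, 1025):
    print(off, "->", align_up(off))
```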
In the automatic downtune routine, hashcat reserves a fixed 512MiB of
host memory that the compute runtimes (CUDA, HIP, OpenCL, Metal) are
known to allocate and over which hashcat has no control.
However, hashcat still divides the maximum available host memory by the
active device count, in preparation for later automatically downtuning
the -n and -T parameters when memory is limited.
Hashcat reserves 512MiB per active device. With bridges, the active
devices become bridge units, which for modes 70000, 70100, and 70200
equals the CPU core count. On a 32-core CPU, this multiplies to 16GiB,
even though the memory is actually shared because of threading.
This leads to an overestimation of memory usage.
A simple fix is to divide the 512MiB buffer by the active device count.
This keeps the full 512MiB for a single GPU but avoids overestimating
memory usage with many virtual devices.
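The arithmetic behind the fix described above can be checked with a small model. This is only a sketch of the accounting, not hashcat's actual code; the constant and function names are hypothetical.

```python
MIB = 1024 * 1024
RUNTIME_OVERHEAD = 512 * MIB  # fixed reservation described in the text


def reserved_before(device_count: int) -> int:
    """Old accounting: the full 512MiB is reserved for every active device."""
    return RUNTIME_OVERHEAD * device_count


def reserved_after(device_count: int) -> int:
    """New accounting: the 512MiB is split across the active devices."""
    per_device = RUNTIME_OVERHEAD // device_count
    return per_device * device_count


for n in (1, 32):
    print(n, reserved_before(n) // MIB, "MiB ->", reserved_after(n) // MIB, "MiB")
```

With 32 bridge units the old accounting reserves 16GiB in total, while the new one stays at 512MiB, matching the single-GPU case.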
Fix code handling kernel-accel value in argon2_common.c for CPU,
which was accidentally removed during previous refactoring.
Set thread count to 1 for hash-mode 70000. Oversubscribing the CPU isn't
useful here. This makes it possible to keep the wordlist count low, which
is very welcome for slow hashes like Argon2id.
Fix unit test for 20011/20012/20013 (DiskCryptor) by adding setuptools
to install_modules.sh and replacing AES.MODE_XTS with python_AES.MODE_XTS.
Fix false negative for kernel 32800, which only occurred if all
conditions were true: multihash, -a 3, optimized mode, and password
length between 16 and 31.
Fix Python package name in BUILD_WSL.md command line example.