In the following functions, I changed the type of the parameter that specifies the target of the operation:
- hc_clReleaseMemObject
- hc_clReleaseKernel
- hc_clReleaseProgram
- hc_cuModuleUnload
- hc_cuMemFree
- hc_cuStreamDestroy
- hc_cuEventDestroy
- hc_hipEventDestroy
- hc_hipMemFree
- hc_hipModuleUnload
- hc_hipStreamDestroy
- hc_mtlReleaseMemObject
- hc_mtlReleaseFunction
- hc_mtlReleaseLibrary
With this change, it was possible to remove several lines of code from backend.c, making it more readable.
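The note above does not say what the new parameter type is. As a rough sketch only, one common way such a change shortens the caller is to pass the handle by pointer so the release function can do the NULL check and the reset itself. The names handle_t and release_by_pointer are hypothetical and are not taken from backend.c:

```c
#include <stdlib.h>

// Hypothetical resource handle; a stand-in for a buffer, module or stream.
typedef void *handle_t;

// Assumption: taking the handle by pointer lets the callee skip empty
// handles and clear the caller's copy, so the caller needs no extra lines.
// Returns 1 if something was released, 0 if there was nothing to do.
static int release_by_pointer (handle_t *h)
{
  if (h == NULL || *h == NULL) return 0;

  free (*h);   // stand-in for the real runtime release call

  *h = NULL;   // a second release on the same handle becomes a no-op

  return 1;
}
```

With this shape, a caller can release every handle unconditionally instead of wrapping each call in its own check and reset.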
Automatically set artificial processor count to 1 for Intel iGPU,
since we now use a more accurate preferred thread size instead.
Removed the module_unstable_warning() entry for Intel GPUs on
non-Apple OpenCL platform for hash-mode 21800
Do not always ignore TMTO determination for iGPUs in
scrypt_common.c. We must at least check the available memory
size.
Added preferred thread count and unified memory type to -I output
Removed special characters in machine-readable format from -I output
Changed the default thread count for Intel GPUs (both dGPU and iGPU)
to 32. This is likely because only newer iGPUs are supported by the
Intel Compute Runtime (2024+), and dGPUs already prefer 64
threads (2x32).
As a result, removed the module_unstable_warning() entry for Intel
GPUs on non-Apple OpenCL platform for the following hash modes:
8200, 17200, 17220, 17225, 21700, 25000, 25100, and 25200.
Add hc_uint4_t to SCRYPT to work around an Intel OpenCL alignment bug.
Align large buffers (V1-V4) manually to 1k-byte boundaries.
Replace uint4 xor operator with xor_uint4() function.
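A minimal host-side sketch of the two ideas above, assuming hc_uint4_t is a plain struct of four 32-bit lanes and that "1k-byte boundaries" means rounding a size or offset up to a multiple of 1024. The struct layout and the helper align_up_1k are assumptions for illustration; the actual kernel definitions may differ:

```c
#include <stdint.h>

// Assumed layout: four 32-bit lanes in a plain struct instead of the
// built-in OpenCL uint4 vector type.
typedef struct
{
  uint32_t x;
  uint32_t y;
  uint32_t z;
  uint32_t w;

} hc_uint4_t;

// Lane-wise XOR, replacing the vector "^" operator that a plain struct
// does not provide.
static hc_uint4_t xor_uint4 (const hc_uint4_t a, const hc_uint4_t b)
{
  hc_uint4_t r;

  r.x = a.x ^ b.x;
  r.y = a.y ^ b.y;
  r.z = a.z ^ b.z;
  r.w = a.w ^ b.w;

  return r;
}

// Hypothetical helper: round n up to the next multiple of 1024.
static uint64_t align_up_1k (const uint64_t n)
{
  return (n + 1023) & ~(uint64_t) 1023;
}
```

Spelling out the XOR per lane keeps the code independent of how the compiler aligns the vector type, which is the point of avoiding the built-in type here.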
In the automatic downtune routine, hashcat sets aside a fixed 512MiB of
host memory for buffers that the compute runtimes (CUDA, HIP, OpenCL,
Metal) are known to allocate and over which hashcat has no control.
However, hashcat still divides the maximum available host memory by the
active device count, in preparation for later downtuning the -n and -T
parameters when memory is limited.
Hashcat reserves 512MiB per active device. With bridges, the active
devices become bridge units, which for modes 70000, 70100, and 70200
equals the CPU core count. On a 32-core CPU, this multiplies to 16GiB,
even though the memory is actually shared because of threading.
This leads to an overestimation of memory usage.
A simple fix is to divide the 512MiB buffer by the active device count.
This keeps the full 512MiB for a single GPU but avoids overestimating
memory usage with many virtual devices.
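The arithmetic behind the fix can be sketched as follows. The function names and the RUNTIME_RESERVE constant are hypothetical; only the 512MiB figure and the division by the active device count come from the description above:

```c
#include <stdint.h>

#define MIB             (1024ULL * 1024ULL)
#define RUNTIME_RESERVE (512ULL * MIB)

// Before the fix: every active device reserves the full 512 MiB, so the
// total grows linearly with the device count.
static uint64_t reserve_total_old (const uint64_t active_devices)
{
  return active_devices * RUNTIME_RESERVE;
}

// After the fix: the 512 MiB is divided by the active device count, so the
// total stays at 512 MiB no matter how many virtual devices exist.
static uint64_t reserve_per_device_new (const uint64_t active_devices)
{
  if (active_devices == 0) return RUNTIME_RESERVE; // guard against division by zero

  return RUNTIME_RESERVE / active_devices;
}
```

For 32 bridge units, the old scheme reserves 32 x 512MiB = 16GiB in total, while the new one reserves 16MiB per unit, which sums back to 512MiB. A single GPU still gets the full 512MiB.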
Fix code handling kernel-accel value in argon2_common.c for CPU,
which was accidentally removed during previous refactoring.
Set thread count to 1 for hash-mode 70000. Oversubscribing the CPU isn't
useful here. This keeps the wordlist count low, which is very welcome
for slow hashes like Argon2id.
Fix unit test for 20011/20012/20013 (DiskCryptor) by adding setuptools
to install_modules.sh and replacing AES.MODE_XTS with python_AES.MODE_XTS.
Fix false negative for kernel 32800, which only occurred if all
conditions were true: multihash, -a 3, optimized mode, and password
length between 16 and 31.
Fix Python package name in BUILD_WSL.md command line example.
Fix deprecation warning on m30906.pm
Fix pipeline error with -m 32600 on Apple
Update edge_test.sh
Fix edge test vector generation for hash-types 28501, 28502, 28503, 28504, 28505, 28506