hashcat

mirror of https://github.com/hashcat/hashcat.git synced 2025-07-05 06:12:35 +00:00

Author	SHA1	Message	Date
Jens Steube	13a7b56feb	Improve the logic for when to use funnelshift and when not to. Some algorithms, such as SHA1-HMAC and DCC1, do not work well with it, so it has been disabled for them. Fix the automatic reduction of the kernel-accel maximum based on available memory per device by accounting for the additional size needed to handle register spilling. Fix the tools/benchmark_deep.pl script to recognize benchmark masks more reliably.	2025-06-23 12:30:12 +02:00
Jens Steube	b7c8fcf27c	Removed shared-memory based optimization for SCRYPT on HIP, because the shared-memory buffer is incompatible with TMTO, which is limiting SCRYPT-R to a maximum of 8. This change also simplifies the code, allowing removal of large sections of duplicated code. Removed the section in scrypt_module_extra_tuningdb_block() that increased TMTO when there was insufficient shared memory, as this is no longer applicable. Refactored inc_hash_scrypt.cl almost completely and improved macro names in inc_hash_scrypt.h. Adapted all existing SCRYPT-based plugins to the new standard. If you have custom SCRYPT-based plugins, use hash-mode 8900 as reference. Fixed some compiler warnings in inc_platform.cl. Cleaned up code paths in inc_vendor.h for finding values for HC_ATTR_SEQ and DECLSPEC. Removed option --device-as-default-execution-space from nvrtc for hiprtc compatibility. As a result, added __device__ back to DECLSPEC. Removed option --restrict from nvrtc compile options since we actually alias some buffers. Added --gpu-max-threads-per-block to hiprtc options. Added -D MAX_THREADS_PER_BLOCK to OpenCL options (currently unused). Removed all OPTS_TYPE_MP_MULTI_DISABLE entries for SNMPv3-based plugins. These plugins consume large amounts of memory and, for this reason, limited kernel_accel max to 256. This may still be high, but hashcat will automatically tune down kernel_accel if insufficient memory is detected. Removed command `rocm-smi --resetprofile --resetclocks --resetfans` from benchmark_deep.pl, since some AMD GPUs become artificially slow for a while after running these commands. Replaced load_source() with file_to_buffer() from shared.c, which does the exact same operations. Moved suppress_stderr() and restore_stderr() to shared.c and reused them in both Python bridges and opencl_test_instruction(), where the same type of code existed.	2025-06-21 07:09:20 +02:00
Jens Steube	4b93a6e93c	Add support for detecting unified GPU memory on CUDA and HIP (previously available only for OpenCL and Metal). Do not adjust kernel-accel or scrypt-tmto for GPUs with unified memory, typically integrated GPUs in CPUs (APUs). Redesign the "4-buffer" strategy to avoid overallocation from naive division by four, which can significantly increase memory usage for high scrypt configurations (e.g., 256k:8:1). Update the scrypt B[] access pattern to match the new "4-buffer" design. Allow user-specified kernel-accel and scrypt-tmto values, individually or both, via command line and tuning database. Any unspecified parameters are adjusted automatically. Permit user-defined combinations of scrypt-tmto and kernel-accel even if they may exceed available memory.	2025-06-17 13:32:57 +02:00
Jens Steube	e8052a004b	- Replace naive 32 bit rotate with funnelshift on CUDA/HIP - Replace V_ALIGNBIT_B32 with funnelshift on HIP - Improve RC4 performance by preventing inlining - Fix leftover code in yescrypt-platform.c - Update docs/hashcat-assimilation-bridge-development.md - Only initialize hwmon on host for virtualized backends - Improve SCRYPT tunings on AMD RX6900XT	2025-06-02 11:50:08 +02:00
Gabriele Gristina	b3d3b31c3e	Metal: added support for vectors up to 4	2022-02-10 21:53:08 +01:00
Gabriele Gristina	9d36245d51	Kernels: Set the default Address Space Qualifier for any pointer, refactored / updated KERN_ATTR macros and rc4 cipher functions, in order to support Apple Metal runtime	2022-02-04 19:54:00 +01:00
Jens Steube	3f6c5a0042	Update module_unstable_warning() for -m 172xx on HIP	2021-07-23 21:09:55 +02:00
Jens Steube	5ffcaa980d	HIP Backend: Added support for HIP 4.4 and later, and added a check to rule out older versions because they are incompatible	2021-07-23 16:04:34 +02:00
Jens Steube	bdb7999f07	Switch HIP vector datatypes to OpenCL like ext_vector_type()	2021-07-19 20:24:30 +02:00
Jens Steube	0d8b4b74ad	More CUDA special backports to HIP	2021-07-18 22:56:22 +02:00
Jens Steube	257098a301	Get rid of hip/hip_runtime.h dependency	2021-07-18 21:14:45 +02:00
Jens Steube	45e65dd05a	Backport more ROCm based optimizations to HIP	2021-07-15 23:34:27 +02:00
Jens Steube	d130cc66b3	Optimize ISA code on HIP for V_ALIGNBIT_B32 using a different template for inline assembly	2021-07-15 09:57:41 +02:00
Jens Steube	674ca7d88f	Add GPU threads to kernel cache checksum because it has an influence on HIP offline compile options. Add V_ALIGNBIT_B32 inline assembly wrapper because HIP does not provide amd_bitalign().	2021-07-12 11:27:05 +02:00
Jens Steube	20f7febd4c	Workaround too intensive optimization in -m 2000 using HIPRTC	2021-07-11 15:54:13 +02:00
Jens Steube	1b84a9e53b	Add missing backports from code base v6.2.2. Fix context to thread management. Fix missing code in selftest.c, autotune.c, hashes.c, dispatch.c and backend.c. Use IS_HIP-dependent code instead of IS_CUDA || IS_HIP, which makes future optimization related to inline assembly calls easier. See TODO markers for more optimizations / next steps.	2021-07-11 12:38:59 +02:00
Jens Steube	a22f8149fc	Merge branch 'HIP' into hip	2021-07-10 21:34:09 +02:00
reger-men	ea7b74389f	First draft HIP Version	2021-07-09 03:50:40 +00:00
Jens Steube	62fc3601bb	Wrap atomic functions with hc_ prefix to have better platform control	2021-04-20 17:47:44 +02:00
Jens Steube	73cc3170f4	Fixed both false negative and false positive results in -m 3000 in -a 3 (affected only NVIDIA GPUs)	2021-04-20 17:14:13 +02:00
Jens Steube	316095c151	Some more ROCm performance tuning	2019-06-20 10:04:31 +02:00
Jens Steube	5e0eb288c9	Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL	2019-06-16 18:01:26 +02:00
Jens Steube	7832c54452	Fix constant memory use of bfs_buf	2019-05-11 09:32:16 +02:00
Jens Steube	46f737c5af	Use real constant memory on CUDA	2019-05-10 13:22:26 +02:00
Jens Steube	d0bd33c9d1	Rename CONSTANT_AS to CONSTANT_VK	2019-05-06 14:34:16 +02:00
Jens Steube	ec9925f3b1	Warnings self-check and autotune with CUDA	2019-05-04 21:52:00 +02:00
Jens Steube	3b7304c9d8	Fix recursion in inc_platform.cl	2019-04-26 14:01:14 +02:00
Jens Steube	89119bf24a	Add missing inc_platform.h include	2019-04-26 13:59:43 +02:00
Jens Steube	9faba41848	Use nvrtc to compile PTX (resulting PTX not yet used)	2019-04-26 13:28:44 +02:00
Jens Steube	4b986de5fb	Prepare native CUDA hybrid integration	2019-04-25 14:45:17 +02:00

30 Commits