mirror of https://github.com/hashcat/hashcat.git synced 2025-07-05 06:12:35 +00:00
30 Commits

Author SHA1 Message Date
Jens Steube
13a7b56feb Improve the logic for when to use funnelshift and when not to. Some algorithms, such as SHA1-HMAC and DCC1, do not work well with it, so it has been disabled for them.
Fix the automatic reduction of the kernel-accel maximum based on available memory per device by accounting for the additional size needed to handle register spilling.
Fix the tools/benchmark_deep.pl script to recognize benchmark masks more reliably.
2025-06-23 12:30:12 +02:00
Jens Steube
b7c8fcf27c Removed shared-memory based optimization for SCRYPT on HIP, because the shared-memory buffer is incompatible with TMTO, which is limiting SCRYPT-R to a maximum of 8. This change also simplifies the code, allowing removal of large sections of duplicated code. Removed the section in scrypt_module_extra_tuningdb_block() that increased TMTO when there was insufficient shared memory, as this is no longer applicable.
Refactored inc_hash_scrypt.cl almost completely and improved macro names in inc_hash_scrypt.h. Adapted all existing SCRYPT-based plugins to the new standard. If you have custom SCRYPT-based plugins, use hash-mode 8900 as a reference.
Fixed some compiler warnings in inc_platform.cl.
Cleaned up code paths in inc_vendor.h for finding values for HC_ATTR_SEQ and DECLSPEC.
Removed option --device-as-default-execution-space from nvrtc for hiprtc compatibility. As a result, added __device__ back to DECLSPEC.
Removed option --restrict from nvrtc compile options since we actually alias some buffers.
Added --gpu-max-threads-per-block to hiprtc options.
Added -D MAX_THREADS_PER_BLOCK to OpenCL options (currently unused).
Removed all OPTS_TYPE_MP_MULTI_DISABLE entries for SNMPv3-based plugins.
These plugins consume large amounts of memory; for this reason, kernel_accel max is limited to 256. This may still be high, but hashcat will automatically tune down kernel_accel if insufficient memory is detected.
Removed command `rocm-smi --resetprofile --resetclocks --resetfans` from benchmark_deep.pl, since some AMD GPUs become artificially slow for a while after running these commands.
Replaced load_source() with file_to_buffer() from shared.c, which does the exact same operations.
Moved suppress_stderr() and restore_stderr() to shared.c and reused them in both Python bridges and opencl_test_instruction(), where the same type of code existed.
2025-06-21 07:09:20 +02:00
Jens Steube
4b93a6e93c Add support for detecting unified GPU memory on CUDA and HIP (previously available only for OpenCL and Metal).
Do not adjust kernel-accel or scrypt-tmto for GPUs with unified memory, typically integrated GPUs in CPUs (APUs).
Redesign the "4-buffer" strategy to avoid overallocation from naive division by four, which can significantly increase memory usage for high scrypt configurations (e.g., 256k:8:1).
Update the scrypt B[] access pattern to match the new "4-buffer" design.
Allow user-specified kernel-accel and scrypt-tmto values, individually or both, via command line and tuning database. Any unspecified parameters are adjusted automatically.
Permit user-defined combinations of scrypt-tmto and kernel-accel even if they may exceed available memory.
2025-06-17 13:32:57 +02:00
Jens Steube
e8052a004b - Replace naive 32 bit rotate with funnelshift on CUDA/HIP
- Replace V_ALIGNBIT_B32 with funnelshift on HIP
- Improve RC4 performance by preventing inlining
- Fix leftover code in yescrypt-platform.c
- Update docs/hashcat-assimilation-bridge-development.md
- Only initialize hwmon on host for virtualized backends
- Improve SCRYPT tunings on AMD RX6900XT
2025-06-02 11:50:08 +02:00
Gabriele Gristina
b3d3b31c3e Metal: added support for vectors up to 4 2022-02-10 21:53:08 +01:00
Gabriele Gristina
9d36245d51 Kernels: Set the default Address Space Qualifier for any pointer, refactored / updated KERN_ATTR macros and rc4 cipher functions, in order to support Apple Metal runtime 2022-02-04 19:54:00 +01:00
Jens Steube
3f6c5a0042 Update module_unstable_warning() for -m 172xx on HIP 2021-07-23 21:09:55 +02:00
Jens Steube
5ffcaa980d HIP Backend: Added support for HIP 4.4 and later, and added a check to rule out older versions because they are incompatible 2021-07-23 16:04:34 +02:00
Jens Steube
bdb7999f07 Switch HIP vector datatypes to OpenCL like ext_vector_type() 2021-07-19 20:24:30 +02:00
Jens Steube
0d8b4b74ad More CUDA special backports to HIP 2021-07-18 22:56:22 +02:00
Jens Steube
257098a301 Get rid of hip/hip_runtime.h dependency 2021-07-18 21:14:45 +02:00
Jens Steube
45e65dd05a Backport more ROCm based optimizations to HIP 2021-07-15 23:34:27 +02:00
Jens Steube
d130cc66b3 Optimize ISA code on HIP for V_ALIGNBIT_B32 using a different template for inline assembly 2021-07-15 09:57:41 +02:00
Jens Steube
674ca7d88f Add GPU threads to kernel cache checksum because it has an influence on HIP offline compile options
Add V_ALIGNBIT_B32 inline assembly wrapper because HIP does not provide amd_bitalign()
2021-07-12 11:27:05 +02:00
Jens Steube
20f7febd4c Workaround too intensive optimization in -m 2000 using HIPRTC 2021-07-11 15:54:13 +02:00
Jens Steube
1b84a9e53b Add missing backports from code base v6.2.2
Fix context to thread management
Fix missing code in selftest.c, autotune.c, hashes.c, dispatch.c and backend.c
Using IS_HIP-dependent code, instead of IS_CUDA || IS_HIP, makes future optimizations related to inline assembly calls easier
See TODO markers for more optimizations / next steps
2021-07-11 12:38:59 +02:00
Jens Steube
a22f8149fc Merge branch 'HIP' into hip 2021-07-10 21:34:09 +02:00
reger-men
ea7b74389f First draft HIP Version 2021-07-09 03:50:40 +00:00
Jens Steube
62fc3601bb Wrap atomic functions with hc_ prefix to have better platform control 2021-04-20 17:47:44 +02:00
Jens Steube
73cc3170f4 Fixed both false negative and false positive results in -m 3000 in -a 3 (affected only NVIDIA GPUs) 2021-04-20 17:14:13 +02:00
Jens Steube
316095c151 Some more ROCm performance tuning 2019-06-20 10:04:31 +02:00
Jens Steube
5e0eb288c9 Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL 2019-06-16 18:01:26 +02:00
Jens Steube
7832c54452 Fix constant memory use of bfs_buf 2019-05-11 09:32:16 +02:00
Jens Steube
46f737c5af Use real constant memory on CUDA 2019-05-10 13:22:26 +02:00
Jens Steube
d0bd33c9d1 Rename CONSTANT_AS to CONSTANT_VK 2019-05-06 14:34:16 +02:00
Jens Steube
ec9925f3b1 Warnings self-check and autotune with CUDA 2019-05-04 21:52:00 +02:00
Jens Steube
3b7304c9d8 Fix recursion in inc_platform.cl 2019-04-26 14:01:14 +02:00
Jens Steube
89119bf24a Add missing inc_platform.h include 2019-04-26 13:59:43 +02:00
Jens Steube
9faba41848 Use nvrtc to compile PTX (resulting PTX not yet used) 2019-04-26 13:28:44 +02:00
Jens Steube
4b986de5fb Prepare native CUDA hybrid integration 2019-04-25 14:45:17 +02:00
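
Note on e8052a004b ("Replace naive 32 bit rotate with funnelshift on CUDA/HIP"): the sketch below is a host-side C model of why the two forms are interchangeable, not the kernel code from the repository. A funnel shift left concatenates two 32-bit words into a 64-bit value, shifts it, and returns the upper word; passing the same word for both halves yields a rotate. All function names here (`rotl32_naive`, `funnelshift_l_model`, `rotl32_funnel`, `rotl32_selfcheck`) are hypothetical and chosen for illustration; on a GPU the model function would be replaced by the hardware funnel shift instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Naive 32-bit rotate left: two shifts and an OR. */
static uint32_t rotl32_naive (uint32_t x, unsigned n)
{
  n &= 31;

  return (x << n) | (x >> ((32 - n) & 31));
}

/* Software model of a funnel shift left (assumed semantics):
   build the 64-bit value hi:lo, shift left by (n & 31),
   and return the upper 32 bits. */
static uint32_t funnelshift_l_model (uint32_t lo, uint32_t hi, unsigned n)
{
  const uint64_t v = ((uint64_t) hi << 32) | lo;

  return (uint32_t) ((v << (n & 31)) >> 32);
}

/* Rotate expressed as a funnel shift with both halves set to x. */
static uint32_t rotl32_funnel (uint32_t x, unsigned n)
{
  return funnelshift_l_model (x, x, n);
}

/* Checks that both forms agree for every shift count on one input. */
static void rotl32_selfcheck (uint32_t x)
{
  for (unsigned n = 0; n < 32; n++)
  {
    assert (rotl32_naive (x, n) == rotl32_funnel (x, n));
  }
}
```

Calling `rotl32_selfcheck (0x80000001u)` from a test harness exercises all 32 shift counts. The model only shows functional equivalence; it says nothing about why some algorithms (SHA1-HMAC and DCC1, per 13a7b56feb) perform worse with funnelshift.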