mirror of https://github.com/hashcat/hashcat.git synced 2025-07-21 22:18:44 +00:00

19 Commits

Author SHA1 Message Date
Gabriele Gristina
f663abee44
Added workaround to get rid of internal runtimes memory leaks
hashcat no longer creates and destroys the context and command queue for each enabled device every time it switches from one hash-mode to the next. This matters most in benchmark mode.
Specifically, when using OpenCL with an NVIDIA device, it was not possible to complete the benchmark because clCreateContext leaks memory, slowly consuming all available GPU memory until hashcat could no longer create a new context and disabled the device.

Avoid deprecated HIP functions

All hipCtx* functions have been declared deprecated, so we have replaced them with the new ones. This also fixes a critical bug in handling multiple AMD devices in the same system.
2025-07-06 21:28:37 +02:00
Jens Steube
f8df94f457 Switched all async and non-blocking calls to synchronous and blocking ones. Kept the original async bindings intact. This avoids race conditions like the one fixed in the previous commit, with no performance impact.
Fixed a typedef issue for clEnqueueReadBuffer().
Updated Python/hcshared.py with missing entry for new salt_dimy attribute in salt_t struct.
Fixed a bug in the autotuner when determining the starting value for kernel loops, in cases where the iteration count is N-1 and not a multiple of 1024.
Updated additional plugins to use OPTI_TYPE_REGISTER_LIMIT.
2025-06-30 11:26:05 +02:00
Jens Steube
69a585fa4a Autotune refactoring II: dynamic threads-per-block
- Integrated occupancy hints from vendor APIs (CUDA, HIP) to set a
  dynamic threads-per-block limit per kernel instead of using static
  values.
- Added `find_tuning_function()` to identify the relevant kernel.
- Autotuner now runs in three stages: threads -> loops -> accel. The
  first two stages now stop increasing when the tested kernel runtime
  gets too close to the target runtime (96ms for `-w 3`), leaving
  headroom for the next stage to make finer adjustments.
- Accel tuning now uses a capped floating-point multiplier instead of
  powers of two.
- Removed workarounds for missing thread autotuning in plugins.
- Removed the hardcoded 4GiB host memory limit for accel. Added a
  cross-platform `get_free_memory()` to check actual free RAM during GPU
  initialization, preventing underutilization of high-end GPUs like the
  4090. If needed, users can still cap memory usage with `-T` or `-n`.
- Updated enums for ROCm 6.4.x and CUDA 12.9.
- Added code to detect kernel register spilling. This lets us keep
  enough global memory free for the runtime to handle spills
  efficiently.
2025-06-24 20:19:42 +02:00
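The three-stage flow described in the commit above (threads -> loops -> accel) can be sketched as follows. This is a minimal illustration under stated assumptions, not hashcat's actual autotuner: the `tuning_t` struct, the `measure_fn` callback, the `linear_model_ms()` stand-in for real kernel timing, the 50% headroom factor and the 4.0 multiplier cap are all hypothetical. Only the stage order, the 96 ms target for `-w 3` and the idea of a capped floating-point multiplier come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning state; field names are illustrative only. */
typedef struct
{
  unsigned threads;
  unsigned loops;
  unsigned accel;

} tuning_t;

/* Hypothetical callback: returns the kernel runtime in ms for a setting. */
typedef double (*measure_fn) (const tuning_t *t);

/* Stand-in for a real measurement: runtime grows linearly with the work. */
static double linear_model_ms (const tuning_t *t)
{
  return 0.001 * (double) t->threads * (double) t->loops * (double) t->accel;
}

static tuning_t autotune_sketch (measure_fn measure, const tuning_t max, const double target_ms)
{
  assert (measure != NULL);

  const double coarse_limit = target_ms * 0.5; /* assumed headroom for later stages */
  const double mult_cap     = 4.0;             /* assumed cap on the accel multiplier */

  tuning_t t = { 1, 1, 1 };

  /* Stage 1: double threads while the runtime stays below the coarse limit. */
  while (t.threads * 2 <= max.threads)
  {
    tuning_t next = t;

    next.threads *= 2;

    if (measure (&next) > coarse_limit) break;

    t = next;
  }

  /* Stage 2: same idea for loops, again leaving headroom for stage 3. */
  while (t.loops * 2 <= max.loops)
  {
    tuning_t next = t;

    next.loops *= 2;

    if (measure (&next) > coarse_limit) break;

    t = next;
  }

  /* Stage 3: scale accel by a capped floating-point multiplier. */
  for (int i = 0; i < 8; i++)
  {
    const double ms = measure (&t);

    if (ms <= 0.0) break;

    double mult = target_ms / ms;

    if (mult > mult_cap) mult = mult_cap;

    unsigned next_accel = (unsigned) ((double) t.accel * mult);

    if (next_accel > max.accel) next_accel = max.accel;

    if (next_accel <= t.accel) break; /* no further gain without overshooting */

    t.accel = next_accel;
  }

  return t;
}
```

Calling `autotune_sketch (linear_model_ms, max, 96.0)` with all three maxima set to 1024 keeps the modeled runtime below the 96 ms target. The multiplier in stage 3 lets accel land on values that are not powers of two, which is the point of replacing the doubling step.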
Jens Steube
c033873e4b Update hipDeviceAttribute_t for ROCm 6.x
Add hipDeviceProp_t and bindings for hipGetDeviceProperties(), which is required to retrieve gcnArchName[].
Add gcnArchName[] to select the correct --gpu-architecture value for a specific device when using hiprtc.
Include sm_major and sm_minor for CUDA and gcnArchName for HIP in the kernel filename hash.
Update nvrtc_options[] and hiprtc_options[] to avoid unused variables, eliminating the use of --restrict as a placeholder and preventing nvrtc from aborting.
Add check_file_suffix() and remove_file_suffix() helper functions.
2025-06-18 18:29:47 +02:00
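The commit above names two suffix helpers without showing them. A minimal sketch of what such helpers could look like is given below. The signatures, return types and in-place truncation are assumptions made for illustration; the real hashcat functions may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed signature: returns true if `file` ends with `suffix`. */
static bool check_file_suffix (const char *file, const char *suffix)
{
  assert (file != NULL && suffix != NULL);

  const size_t len_file   = strlen (file);
  const size_t len_suffix = strlen (suffix);

  if (len_suffix > len_file) return false;

  /* Compare only the tail of the filename against the suffix. */
  return memcmp (file + len_file - len_suffix, suffix, len_suffix) == 0;
}

/* Assumed signature: truncates `file` in place if it ends with `suffix`. */
static bool remove_file_suffix (char *file, const char *suffix)
{
  if (check_file_suffix (file, suffix) == false) return false;

  file[strlen (file) - strlen (suffix)] = '\0';

  return true;
}
```

In this sketch, `remove_file_suffix()` reports whether it changed the string, so a caller can tell a stripped name from one that never carried the suffix.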
Rosen Penev
a55d4aa3c9 fix prototypes and old declarations
Signed-off-by: Rosen Penev <rosenp@gmail.com>
2023-08-20 21:13:12 -07:00
justpretending
b2f14f2f5d Fix some typos 2023-07-27 23:11:55 +07:00
jsteube
6ee2658104 Prefix more macros to avoid collisions in other existing libraries 2023-01-30 14:41:12 +00:00
jsteube
f1ff925b6e Prepare rename macros in header files from _MACRO to MACRO 2023-01-17 19:25:40 +00:00
Jens Steube
a15eeac44f Backport some AMD HIP headers from ROCm 5.3.0 2022-11-13 19:14:02 +01:00
Jens Steube
cf352e4f8b Update HIP includes to work with Linux on HIP 5.1.20531+ 2022-04-14 17:46:59 +02:00
Gabriele Gristina
78c7ee2af6 HIP Backend: moved functions to ext_hip.c/ext_hiprtc.c and includes to ext_hip.h/ext_hiprtc.h 2022-01-02 19:12:41 +01:00
Jens Steube
53f6693495 Temporarily enable HIP 4.4/ROCm 4.5 on Linux and globally set native thread count 2021-11-10 19:32:54 +01:00
Jens Steube
cb69e2d413 Added some HIP version checks, fall back to OpenCL automatically
Switched HIP version check from driverVersion to runtimeVersion
Fixed syntax check of HAS_VPERM macro in several kernel includes, which caused an invalid error message for AMD GPUs on Windows
Updated AMD driver requirements
Updated docs/changes.txt with missing changes from previous commits
Fixed invalid vector data type in Murmur Hash in -a 3 mode
Fixed uninitialized variable warning in src/hashes.c
Fixed broken support for --generate-rules-func-min
2021-08-04 20:49:22 +02:00
Jukka Ojanen
c3195d0603 Merge branch 'master' of https://github.com/hashcat/hashcat 2021-07-30 11:34:25 +03:00
Jukka Ojanen
cdf27a1cb3 Implement async run_cuda_kernel_memset() and run_cuda_kernel_memset32() 2021-07-27 18:56:59 +03:00
Jukka Ojanen
a642f7b233 Remove synchronous GPU memory copy functions 2021-07-26 15:36:42 +03:00
Jens Steube
5ffcaa980d HIP Backend: Added support for HIP 4.4 and later, and added a check to rule out older versions because they are incompatible 2021-07-23 16:04:34 +02:00
Jukka Ojanen
8674e23d79 Add async HIP memcpy functions: hc_hipMemcpyDtoDAsync(), hc_hipMemcpyDtoHAsync() and hc_hipMemcpyHtoDAsync(). Implement partially async HIP memset and bzero kernels. 2021-07-20 12:47:10 +03:00
reger-men
ea7b74389f First draft HIP Version 2021-07-09 03:50:40 +00:00