mirror of https://github.com/hashcat/hashcat.git synced 2025-07-06 23:02:35 +00:00
Commit Graph

17 Commits

Author SHA1 Message Date
Jens Steube
69a585fa4a Autotune refactoring II: dynamic threads-per-block
- Integrated occupancy hints from vendor APIs (CUDA, HIP) to set a
  dynamic threads-per-block limit per kernel instead of using static
  values.
- Added `find_tuning_function()` to identify the relevant kernel.
- Autotuner now runs in three stages: threads -> loops -> accel. The
  first two stages stop increasing once the tested kernel runtime gets
  too close to the target runtime (96ms for `-w 3`), leaving headroom
  for the next stage to make finer adjustments.
- Accel tuning now uses a capped floating-point multiplier instead of
  powers of two.
- Removed workarounds for missing thread autotuning in plugins.
- Removed the hardcoded 4GiB host memory limit for accel. Added a
  cross-platform `get_free_memory()` to check actual free RAM during GPU
  initialization, preventing underutilization of high-end GPUs like the
  4090. If needed, users can still cap memory usage with `-T` or `-n`.
- Updated enums for ROCm 6.4.x and CUDA 12.9.
- Added code to detect kernel register spilling. This matters because
  enough global memory must be left free for the runtime to handle
  spills efficiently.
2025-06-24 20:19:42 +02:00
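The three-stage order described in commit 69a585fa4a (threads -> loops -> accel) can be illustrated with a minimal sketch. This is not hashcat's code: `tune_t`, `measure_ms()`, `grow_pow2()`, `scale_accel()` and `autotune()` are hypothetical names, the linear cost model stands in for a real timed kernel launch, and the 50% headroom fraction and 4.0 multiplier cap are assumed values chosen only for illustration. The commit states only that the first two stages stop before reaching the target runtime and that accel uses a capped floating-point multiplier.

```c
#include <assert.h>

/* Hypothetical tuning state; field names are illustrative, not hashcat's. */
typedef struct
{
  unsigned threads;
  unsigned loops;
  unsigned accel;

} tune_t;

/* Assumed cost model standing in for a timed kernel launch:
   runtime grows linearly with the work done per launch. */
double measure_ms (const tune_t *t, double ms_per_unit)
{
  return ms_per_unit * (double) t->threads * (double) t->loops * (double) t->accel;
}

/* Stages 1 and 2: double one parameter, but undo the last step and stop
   as soon as the measured runtime exceeds limit_ms (target minus headroom). */
unsigned grow_pow2 (tune_t *t, unsigned *param, unsigned max, double limit_ms, double ms_per_unit)
{
  assert (*param >= 1);

  while (*param * 2 <= max)
  {
    *param *= 2;

    if (measure_ms (t, ms_per_unit) > limit_ms)
    {
      *param /= 2;

      break;
    }
  }

  return *param;
}

/* Stage 3: scale accel by a floating-point multiplier (target / current runtime),
   capped at mult_cap, instead of stepping through powers of two. */
unsigned scale_accel (tune_t *t, unsigned accel_max, double target_ms, double mult_cap, double ms_per_unit)
{
  const double now_ms = measure_ms (t, ms_per_unit);

  assert (now_ms > 0.0);

  double mult = target_ms / now_ms;

  if (mult > mult_cap) mult = mult_cap;
  if (mult < 1.0)      mult = 1.0;

  double accel = (double) t->accel * mult;

  if (accel > (double) accel_max) accel = (double) accel_max;

  t->accel = (unsigned) accel; /* truncate so the runtime stays at or below target */

  return t->accel;
}

tune_t autotune (double target_ms, double ms_per_unit, unsigned threads_max, unsigned loops_max, unsigned accel_max)
{
  const double headroom = 0.5; /* assumed: stages 1 and 2 stop at 50% of target */
  const double mult_cap = 4.0; /* assumed cap on the accel multiplier */

  tune_t t = { 1, 1, 1 };

  grow_pow2 (&t, &t.threads, threads_max, target_ms * headroom, ms_per_unit);
  grow_pow2 (&t, &t.loops,   loops_max,   target_ms * headroom, ms_per_unit);

  scale_accel (&t, accel_max, target_ms, mult_cap, ms_per_unit);

  return t;
}
```

Under these assumptions, calling `autotune (96.0, 0.0001, 256, 1024, 1024)` leaves threads and loops at their maximums and picks an accel of 3, a value that a powers-of-two search would skip over.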
Rosen Penev
a55d4aa3c9 fix prototypes and old declarations
Signed-off-by: Rosen Penev <rosenp@gmail.com>
2023-08-20 21:13:12 -07:00
justpretending
b2f14f2f5d Fix some typos 2023-07-27 23:11:55 +07:00
jsteube
6ee2658104 Prefix more macros to avoid collisions in other existing libraries 2023-01-30 14:41:12 +00:00
jsteube
f1ff925b6e Prepare rename macros in header files from _MACRO to MACRO 2023-01-17 19:25:40 +00:00
Gabriele Gristina
f8ceb8785e CUDA Backend: moved functions to ext_cuda.c/ext_nvrtc.c and includes to ext_cuda.h/ext_nvrtc.h 2022-01-03 16:29:15 +01:00
Jukka Ojanen
cdf27a1cb3 Implement async run_cuda_kernel_memset() and run_cuda_kernel_memset32() 2021-07-27 18:56:59 +03:00
Jukka Ojanen
a642f7b233 Remove synchronous GPU memory copy functions 2021-07-26 15:36:42 +03:00
Jukka Ojanen
4263cafdcf Add async CUDA memcpy functions: hc_cuMemcpyDtoDAsync(), hc_cuMemcpyDtoHAsync() and hc_cuMemcpyHtoDAsync(). Implement partially async CUDA memset and bzero kernels. 2021-07-20 12:23:39 +03:00
Jens Steube
66ae5125ce Cache cubin instead of PTX to decrease startup time 2020-01-29 15:56:36 +01:00
Jens Steube
33028314f0 Add hc_cuCtxSetCacheConfig() 2019-05-09 00:04:05 +02:00
Jens Steube
ec9925f3b1 Warnings self-check and autotune with CUDA 2019-05-04 21:52:00 +02:00
Jens Steube
a6fa7a2749 Add support for some first CUDA module loader 2019-05-02 14:58:52 +02:00
Jens Steube
4b986de5fb Prepare native CUDA hybrid integration 2019-04-25 14:45:17 +02:00
jsteube
378258d789 Fix caching system for use with AMD and NV, drop BINARY_KERNEL define 2015-12-21 12:01:38 +01:00
jsteube
968265fffb - Prepared for JIT use of hash-mode 1500, 8900 and 9300, works already on OpenCL (AMD)
- Changed PROMPT
2015-12-07 21:37:12 +01:00
Jens Steube
5065474b4e Initial commit 2015-12-04 15:47:52 +01:00