mirror of https://github.com/hashcat/hashcat.git synced 2025-07-24 23:48:19 +00:00
Commit Graph

47 Commits

Author SHA1 Message Date
Gabriele Gristina
0e4b6894ee
module_unstable_warning only for Intel Iris Graphics on Apple Intel 2025-06-26 22:26:31 +02:00
Jens Steube
58fa783095 Enhanced the auto-tune engine: when a kernel runs with a single thread and no accel, it should finish quickly (ideally under 1 ms). If it doesn't, the kernel is likely overloaded with code. If such a kernel also uses barriers (e.g., to load shared storage with multiple threads), high iteration counts cause unnecessary thread waiting. To address this, we now skip increasing the loop count if the runtime exceeds either 1/8 of the target time (based on the -w setting) or a hard-coded threshold of 4 ms.
Improved shared memory handling for -m 10700. Removed the hard-coded limit of 256 threads and now dynamically check the device's shared memory pool to adapt threads accordingly.
Implemented a feature request to display non-default session names early during startup.
Added a check for the number of registers required by a kernel (CUDA and HIP only). This allows us to estimate the max threads per block before entering the auto-tune engine and make pre-adjustments.
Fixed Metal command encoder argument to work with the new auto-tuner's extra kernel invocation.
Fixed incorrect host memory calculation logic during automatic kernel-accel reduction for scrypt-based algorithms. This ensures memory constraints are respected.
Improved several plugins by setting maximum loop counts, and others by using the OPTS_TYPE_NATIVE_THREADS option.
Fixed compilation on Apple platforms by excluding '#include <sys/sysinfo.h>'.
2025-06-25 22:10:29 +02:00
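The loop-count guard described in 58fa783095 can be sketched as follows. This is a minimal illustration, not hashcat's actual code: the function name `skip_loop_increase` and its parameters are hypothetical, and it assumes that "exceeds either" means crossing any one of the two limits is enough. The 1/8 fraction and the 4 ms constant come from the commit message.

```c
#include <stdbool.h>

// Hard-coded threshold named in the commit message (4 ms).
#define LOOP_GUARD_HARD_LIMIT_MSEC 4.0

// Hypothetical helper: returns true when the auto-tuner should stop
// increasing the loop count for a kernel that was measured with a
// single thread and no accel.
//   exec_msec   - measured kernel runtime in milliseconds
//   target_msec - target runtime derived from the -w setting
static bool skip_loop_increase (const double exec_msec, const double target_msec)
{
  const double fraction_limit = target_msec / 8.0;

  // Assumption: exceeding either limit is sufficient to stop.
  return (exec_msec > fraction_limit) || (exec_msec > LOOP_GUARD_HARD_LIMIT_MSEC);
}
```

With a 96 ms target, 1/8 of the target is 12 ms, so in this sketch the 4 ms threshold is the tighter of the two limits.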
Jens Steube
69a585fa4a Autotune refactoring II: dynamic threads-per-block
- Integrated occupancy hints from vendor APIs (CUDA, HIP) to set a
  dynamic threads-per-block limit per kernel instead of using static
  values.
- Added `find_tuning_function()` to identify the relevant kernel.
- Autotuner now runs in three stages: threads -> loops -> accel. The
  first two stages now stop increasing when the tested kernel runtime
  gets too close to the target runtime (96ms for `-w 3`), leaving
  headroom for the next stage to make finer adjustments.
- Accel tuning now uses a capped floating-point multiplier instead of
  powers of two.
- Removed workarounds for missing thread autotuning in plugins.
- Removed the hardcoded 4GiB host memory limit for accel. Added a
  cross-platform `get_free_memory()` to check actual free RAM during GPU
  initialization, preventing underutilization of high-end GPUs like the
  4090. If needed, users can still cap memory usage with `-T` or `-n`.
- Updated enums for ROCm 6.4.x and CUDA 12.9.
- Added code to detect kernel register spilling. This matters because
  we need to leave enough global memory free for the runtime to handle
  spills efficiently.
2025-06-24 20:19:42 +02:00
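The accel stage from 69a585fa4a ("a capped floating-point multiplier instead of powers of two") might look roughly like the sketch below. All names here (`scale_accel`, `max_multiplier`, `accel_max`) are hypothetical, and the commit message does not state the cap value, so it is passed in as a parameter. The sketch also assumes that this stage only ever increases accel.

```c
// Hypothetical helper: scale kernel-accel toward the target runtime
// using a floating-point multiplier that is capped per step, rather
// than doubling accel until the target is overshot.
//   accel          - current kernel-accel value
//   exec_msec      - measured kernel runtime at the current accel
//   target_msec    - target runtime (for example 96 ms for -w 3)
//   max_multiplier - assumed per-step cap on the multiplier
//   accel_max      - assumed upper bound for kernel-accel
static unsigned int scale_accel (const unsigned int accel,
                                 const double exec_msec,
                                 const double target_msec,
                                 const double max_multiplier,
                                 const unsigned int accel_max)
{
  // Without a usable measurement, leave accel unchanged.
  if (exec_msec <= 0.0) return accel;

  double multiplier = target_msec / exec_msec;

  if (multiplier > max_multiplier) multiplier = max_multiplier;

  // Assumption: never shrink accel in this stage.
  if (multiplier < 1.0) multiplier = 1.0;

  unsigned int scaled = (unsigned int) ((double) accel * multiplier);

  if (scaled < 1)         scaled = 1;
  if (scaled > accel_max) scaled = accel_max;

  return scaled;
}
```

A multiplier lets the tuner land on values such as 12 that a powers-of-two search would skip over, which is one plausible reading of why the change gives finer control.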
Jens Steube
ed10e6a913 Autotune and Benchmark refactoring
This change affects three key areas, each improving autotuning:

- Autotune refactoring itself

The main autotune algorithm had become too complex to maintain and has
now been rewritten from scratch. The engine is now closer to the old
v6.0.0 version, using a much more straightforward approach.

Additionally, the backend is now informed when the autotune engine runs
its operations and runs an extra invisible kernel invocation. This
significantly improves runtime accuracy because the same caching
mechanisms that kick in during normal cracking sessions now also apply during
autotuning. This leads to more consistent and reliable automatic
workload tuning.

- Benchmarking and '--speed-only' accuracy bugs fixed

Benchmark runtimes had become too short, especially since the default
benchmark mask changed from '?b?b?b?b?b?b?b' to '?a?a?a?a?a?a?a?a'. For
very fast hashes like NTLM, benchmarks often stopped immediately when
base words needed to be regenerated, producing highly inaccurate
results.

This issue also misled users tuning '-n' values, as manually
oversubscribing kernels could mask the problem, creating the impression
that increasing '-n' had a larger impact on performance than it truly
does. While '-n' still has an effect, it’s not as significant. With this
fix, users achieve the same speed without needing to tune '-n' manually.

The bug was fixed by enforcing a minimum benchmark runtime of 4 seconds,
regardless of kernel runtime or kernel type. This ensures more stable
and realistic benchmark results, but typically increases the benchmark
duration by up to 4 seconds.
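
As a rough illustration of the minimum-runtime rule, the sketch below counts how many kernel invocations of a given duration are needed before a benchmark may stop. The function name `benchmark_invocations` is hypothetical, and the real benchmark loop is certainly more involved. Only the 4-second minimum is taken from the commit message.

```c
// Minimum benchmark runtime from the commit message, in milliseconds.
#define BENCHMARK_MIN_MSEC 4000.0

// Hypothetical helper: number of kernel invocations required until the
// accumulated runtime reaches the enforced minimum, regardless of how
// short a single invocation is.
static unsigned int benchmark_invocations (const double kernel_msec)
{
  // Guard against a non-positive duration, which would never terminate.
  if (kernel_msec <= 0.0) return 0;

  double       elapsed_msec = 0.0;
  unsigned int invocations  = 0;

  while (elapsed_msec < BENCHMARK_MIN_MSEC)
  {
    elapsed_msec += kernel_msec;

    invocations++;
  }

  return invocations;
}
```

A kernel that takes 96 ms per invocation needs 42 invocations in this model, because 41 invocations add up to only 3936 ms.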

- Kernel-Threads set to 32 and plugin configuration cleanup

Some plugin configurations existed solely to work around the old
benchmarking bug and can now be removed. For example,
'OPTS_TYPE_MAXIMUM_THREADS' is no longer required and has been removed
from all plugins, although the parameter itself remains to avoid
breaking custom plugins.

Because increasing threads beyond 32 no longer offers meaningful
performance gains, the default is now capped at 32 (unless overridden
with '-T'). This simplifies GPU memory management. Currently, work-item
counts are indirectly limited by buffer sizes (e.g., 'pws_buf[]'), which
must not exceed 4 GiB (a hard-coded limit). This buffer size depends on
the product of 'kernel-accel', 'kernel-threads', and the device’s
compute units. By reducing the default threads from 1024 to 32, there is
now more space available for base words.
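
The buffer-size arithmetic in this paragraph can be worked through with a small sketch. The helper names, the 256-byte entry size, and the 128 compute units are assumptions chosen for round numbers. Only the 4 GiB limit and the product of accel, threads, and compute units come from the commit message.

```c
#include <stdint.h>

// Hard-coded buffer limit named in the commit message (4 GiB).
#define BUF_LIMIT_BYTES (4ULL * 1024ULL * 1024ULL * 1024ULL)

// Hypothetical helper: size of a per-work-item buffer such as pws_buf[].
// entry_bytes is an assumed size of one buffer entry.
static uint64_t pws_buf_bytes (const uint64_t accel,
                               const uint64_t threads,
                               const uint64_t compute_units,
                               const uint64_t entry_bytes)
{
  return accel * threads * compute_units * entry_bytes;
}

// Hypothetical helper: largest kernel-accel that keeps the buffer
// within the 4 GiB limit for a given thread count.
static uint64_t max_accel_for_limit (const uint64_t threads,
                                     const uint64_t compute_units,
                                     const uint64_t entry_bytes)
{
  const uint64_t per_accel_bytes = threads * compute_units * entry_bytes;

  if (per_accel_bytes == 0) return 0;

  return BUF_LIMIT_BYTES / per_accel_bytes;
}
```

Under these assumed numbers, 1024 threads leave room for an accel of at most 128, while 32 threads leave room for 4096. That is 32 times more headroom for base words from the same 4 GiB.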
2025-06-22 20:17:52 +02:00
Jens Steube
ceb5ff5641 The Assimilation Bridge (Framework) 2025-05-29 15:38:13 +02:00
Gabriele Gristina
003579d21b Modules: Updated module_unstable_warning 2025-05-11 15:40:52 +02:00
Rosen Penev
ae07d65f34 clang-tidy: remove useless casts
Now that const was fixed, pointless casts can be removed.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2023-08-20 21:10:34 -07:00
Gabriele Gristina
54205412a6 Fixed build failure for 10700 optimized with Apple Metal 2023-05-20 17:15:17 +02:00
Gabriele Gristina
d7c914e267
Merge branch 'master' into fix_tokenizer_TOKEN_ATTR_FIXED_LENGTH 2023-04-16 15:30:18 +02:00
Gabriele Gristina
cf3ab8e2dc Handle signed/unsigned PDF permission P value for all PDF hash-modes 2023-04-11 21:17:25 +02:00
Gabriele Gristina
2adf735e51 Fixed bug in input_tokenizer when TOKEN_ATTR_FIXED_LENGTH is used and refactor modules 2023-04-11 19:34:01 +02:00
sein
6e642121e7 allow up to 11 chars for the P value of PDFs 2022-12-22 19:08:42 +01:00
Jens Steube
f6537a2964 Use inline static on HIP for some hash-modes which benefit from it 2022-11-07 15:35:46 +01:00
philsmd
d9749e8799
change interface, add module_benchmark_charset () 2022-07-15 17:17:57 +02:00
Jens Steube
6fce6fb3ff Update all existing modules to use the stock module marker 2022-04-08 14:11:50 +02:00
Gabriele Gristina
a1ced24564 Fixed bug on benchmark engine, add some unstable warnings, updated negative status code 2022-01-22 12:10:09 +01:00
Jens Steube
5015bc0d2e Module Parser: Renamed struct token_t to hc_token_t to avoid naming conflict with token_t on MacOS 2021-12-20 13:19:40 +01:00
Jens Steube
5b4ac09e91 User Options: Add new module function module_hash_decode_postprocess() to override hash specific configurations from command line 2021-11-28 13:58:27 +01:00
Jens Steube
93ba57f183 Update more modules with OPTS_TYPE_MAXIMUM_THREADS 2021-11-14 10:11:53 +01:00
Jens Steube
1d33b57144 PDF 1.7 Kernel: Improved performance on AMD GPU by using shared memory for the scratch buffer
Inspired by https://github.com/reger-men/hashcat/blob/6.2.4/OpenCL/m10700-optimized.cl
2021-10-30 20:16:45 +02:00
Jens Steube
01738fafa0 Deprecated Plugins: Add new module function module_deprecated_notice() to mark a plugin as deprecated and to return a free text user notice
Added option --deprecated-check-disable to enable deprecated plugins
2021-08-10 17:59:52 +02:00
Jens Steube
20a7b9f992 Tuning-Database: Add new module function module_extra_tuningdb_block() to extend hashcat.hctune content from a plugin
See src/modules/module_08900.c as an example
2021-08-01 16:25:37 +02:00
Jens Steube
11295e4679 Fix missing OPTI_TYPE_USES_BITS_64 in several modules 2021-07-14 17:01:46 +02:00
Jens Steube
ff72a8ed21 Remove module_unstable_warning() entries for AMD (legacy) driver after workaround inside UTF16 conversion function is in use 2021-05-08 15:55:32 +00:00
Jens Steube
95489b0473 Update module_unstable_warning() for amdgpu-pro-20.50-1234664-ubuntu-20.04 (legacy) 2021-05-02 18:18:50 +00:00
Jens Steube
98aef2ae92 Module Structure: Add 3rd party library hook management functions. This also requires an update to all existing module_init() 2020-08-29 16:12:15 +02:00
Jens Steube
b9f6777f1b OpenCL Runtime: Add some unstable warnings for some SHA512 based algorithms on AMD GPU on macOS 2020-07-15 11:27:46 +02:00
Jens Steube
4788c61dd2 Add OPTI_TYPE_REGISTER_LIMIT flag to enable register limiting in CUDA 2020-02-04 21:53:27 +01:00
Jens Steube
8039290cd0 Update -m 10700 unstable warning and disable JiT compiler optimization for AMD GPU PRO, too 2020-01-06 13:36:17 +01:00
Jens Steube
4bef41ed1b Update -m 10700 unstable warning and disable JiT compiler optimization in pure kernel mode 2020-01-06 13:24:47 +01:00
philsmd
b2c28289c8
PDF module: -m 10700 missing assignment of tmp_size 2020-01-04 14:08:30 +01:00
Jens Steube
664e595b45 Add unstable warning for -m 10700 for Intel CPU 2019-11-14 12:46:09 +01:00
Jens Steube
f1632b933e Add support to configure hash-mode specific range of number of hashes supported 2019-05-19 14:46:05 +02:00
Jens Steube
e3500ff4aa Add CUDA device attributes to -I 2019-04-30 13:38:44 +02:00
jsteube
926e99811c Add some more NO_UNROLL to avoid module_unstable_warnings 2019-04-20 16:36:43 +02:00
jsteube
773dab9161 Mark -m 10700 as unstable on AMDGPU driver in pure kernel mode 2019-04-06 20:06:19 +02:00
jsteube
b8d609ba16 WPA/WPA2 cracking: In the potfile, replace password with PMK in order to detect already cracked networks across all WPA modes 2019-04-02 11:24:22 +02:00
jsteube
c0a31b3239 Prepare potfile specific module_hash_decode and module_hash_encode hooks 2019-04-01 12:32:11 +02:00
Jens Steube
4115e6b825 Update some unstable_warning on Intel CPU 2019-04-01 11:22:51 +02:00
jsteube
e93590c11d Fix some variable names in modules 2019-03-16 13:30:53 +01:00
jsteube
73d4ca14f1 Mark -m 10700 in optimized mode as unstable on Intel OpenCL runtime 2019-03-04 16:20:13 +01:00
jsteube
d325413b34 Fix -m 10700 activate unstable warning only in optimized mode 2019-03-04 15:34:03 +01:00
jsteube
dc9279c95c New Strategy: Instead of using volatile, mark the mode as unstable. Remove all volatiles 2019-03-03 19:18:56 +01:00
jsteube
21154d6522 Add some module specific warnings for AMDGPU driver in optimized kernel mode 2019-03-02 21:18:30 +01:00
jsteube
88a051629c Support module specific JiT compiler build options 2019-03-02 11:12:13 +01:00
jsteube
bab735b367 Get rid of hash_type variable. This hopefully reduces some confusion for new hashcat kernel developers 2019-02-12 16:02:27 +01:00
jsteube
1cccaad681 Add -m 10700 module 2019-02-10 14:59:26 +01:00