Jens Steube
5e1d37c82e
Make unsupported AMD HIP runtime version error message a bit more human readable
3 years ago
Jens Steube
cb69e2d413
Added some HIP version checks, fall back to OpenCL automatically
...
Switched HIP version check from driverVersion to runtimeVersion
Fixed syntax check of HAS_VPERM macro in several kernel includes causing invalid error message for AMD GPUs on Windows
Updated AMD driver requirements
Updated docs/changes.txt with missing changes from previous commits
Fixed invalid vector data type in Murmur Hash in -a 3 mode
Fixed uninitialized variable warning in src/hashes.c
Fixed broken support for --generate-rules-func-min
3 years ago
Jens Steube
b2d1f42905
Fix self-test functionality if FIXED_LOCAL_SIZE_COMP is used
...
Fix -m 25700 datatype in -a 3 mode and maximum password length in pure kernel mode
Fix -m 12500, 23700 and 23800 if password is exactly length 128
3 years ago
Jens Steube
af5d346244
Merge pull request #2894 from jtojanen/master
...
Allow async execution
3 years ago
Jens Steube
20a7b9f992
Tuning-Database: Add new module function module_extra_tuningdb_block() to extend hashcat.hctune content from a plugin
...
See src/modules/module_08900.c as an example
3 years ago
Jukka Ojanen
3d7ce7162b
Merge branch 'master' of https://github.com/hashcat/hashcat
3 years ago
Jukka Ojanen
62a06f735f
kernel_memset32(): assume offset and size are bytes, not elements
3 years ago
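The byte-versus-element convention named in the commit above can be sketched on the host side. This is a minimal illustration, not the hashcat kernel: `memset32_bytes` and its parameters are hypothetical, and it assumes both `offset` and `size` are multiples of 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side analogue of a 32-bit memset whose offset and size
   arguments are byte counts rather than element counts. Both values are
   assumed to be multiples of sizeof (uint32_t). */
static void memset32_bytes (void *buf, uint32_t value, size_t offset, size_t size)
{
  /* Apply the byte offset first, then view the region as 32-bit words. */
  uint32_t *p = (uint32_t *) ((unsigned char *) buf + offset);

  /* Convert the byte size into the number of 32-bit words to write. */
  const size_t n = size / sizeof (uint32_t);

  for (size_t i = 0; i < n; i++) p[i] = value;
}
```

With a four-word buffer, `memset32_bytes (buf, v, 4, 8)` writes words 1 and 2 only. A caller that passed element counts instead would fill the wrong region.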
Jukka Ojanen
fdbfae9a28
Modify OpenCL clEnqueueFillBuffer() workaround
3 years ago
Jukka Ojanen
dbe2bad098
Maintain code style
3 years ago
Jukka Ojanen
1ee222d43f
Replace free() with hcfree()
3 years ago
Jukka Ojanen
e352a79a05
Extend context in gidd_to_pw_t()
3 years ago
Jens Steube
d4997d1255
Added support for auto-tuning --kernel-threads (-T) on startup
3 years ago
Jukka Ojanen
e154f9e781
Few cleanups
3 years ago
Jukka Ojanen
c3195d0603
Merge branch 'master' of https://github.com/hashcat/hashcat
3 years ago
Jukka Ojanen
1064cce08c
Synchronize before hooks
3 years ago
Jukka Ojanen
81c2ec3caf
Small cleanup in gidd_to_pw_t()
3 years ago
Jukka Ojanen
7a8065d090
Do not call clWaitForEvents() after spin damper when we know that event status is CL_COMPLETE
3 years ago
Jukka Ojanen
c48e6a25a8
Enqueue several commands before clFlush()
3 years ago
Jens Steube
03ed06849a
Merge branch 'HIP44' into master
3 years ago
Jens Steube
d38d40c8ba
Unlock all GPU threads for AMD GPUs if WaveFront size is 32 (basically new models)
...
Add new hash-modes to tools/benchmark_deep.pl
Fix MINGW issue on 64 bit constant in refactored kernel-accel limiting section
3 years ago
Jukka Ojanen
9ed231c99c
Add comment to blocking OpenCL calls
3 years ago
Jens Steube
a4299b74af
Memory Management: Refactored the code responsible for limiting kernel accel in order to avoid out of -host- memory situations
3 years ago
Jukka Ojanen
d7cc8d7cd2
Revert due to module_03200.c, module_25600.c and module_25800.c using device_param->kernel_dynamic_local_mem_size_memset
3 years ago
Jukka Ojanen
8b590f651b
Remove unused gpu_memset and its references
3 years ago
Jens Steube
9c0a37accf
Update driver requirement
3 years ago
Jukka Ojanen
cdf27a1cb3
Implement async run_cuda_kernel_memset() and run_cuda_kernel_memset32()
3 years ago
Jukka Ojanen
d7de3550b1
Add fast event for CUDA and HIP, and use it to synchronize check_hash()
3 years ago
Jens Steube
f6de3e61e0
Merge pull request #2906 from hashcat/master
...
Backports
3 years ago
Jens Steube
fd2cb59d26
AMD GPUs: On Apple OpenCL platform, we ask for the preferred kernel thread size rather than hard-coding 32
...
ECC secp256k1: Removed the inline assembly code for AMD GPUs because the latest JIT compilers optimize it with the same efficiency
3 years ago
Jukka Ojanen
a642f7b233
Remove synchronous GPU memory copy functions
3 years ago
Jukka Ojanen
de5200cffc
Allow async execution of backend
3 years ago
Jens Steube
72e307fbce
Merge pull request #2901 from hashcat/master
...
Backport changes
3 years ago
Jens Steube
959a232828
Merge pull request #2885 from neheb/charfixes
...
const and char fixes
3 years ago
Jens Steube
84a4058edf
Merge pull request #2900 from hashcat/master
...
Backport changes
3 years ago
Jens Steube
640d95a00f
Vendor Detection: Add "Intel" as a valid vendor name for GPU on macOS
3 years ago
Jens Steube
5ffcaa980d
HIP Backend: Added support for HIP 4.4 and later, and added a check to rule out older versions because they are incompatible
3 years ago
Rosen Penev
adaf3f293b
make const char pointers actually const
...
const char* is a non-const pointer that points to const data. Add missing const.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
3 years ago
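The distinction described in the commit body above can be shown in a short C sketch. All names here are hypothetical and are not taken from the hashcat sources.

```c
#include <stddef.h>
#include <string.h>

/* Pointer to const data: the characters are read-only, but the pointer
   itself may still be reassigned. */
static const char *mutable_ptr = "OpenCL";

/* Const pointer to const data: neither the characters nor the pointer
   may be changed after initialization. */
static const char *const fixed_ptr = "CUDA";

/* Array of const pointers to const data, a common shape for lookup tables. */
static const char *const backend_names[] = { "CUDA", "HIP", "OpenCL" };

static size_t backend_count (void)
{
  return sizeof (backend_names) / sizeof (backend_names[0]);
}

static const char *retarget (void)
{
  mutable_ptr = "HIP";      /* allowed: only the pointed-to data is const */
  /* fixed_ptr = "HIP"; */  /* would not compile: the pointer is const too */

  return mutable_ptr;
}
```

Adding the second `const` lets the compiler reject accidental reassignment of the pointer, not only writes through it.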
Jukka Ojanen
9f9333f2ef
Allow async execution of run_cuda_kernel_bzero(), run_hip_kernel_bzero() and run_opencl_kernel_bzero()
3 years ago
Jukka Ojanen
fafd24237d
Define HC_ALIGN macro to control data alignment and use common constant zero buffer in run_cuda_kernel_bzero(), run_hip_kernel_bzero() and run_opencl_kernel_bzero().
3 years ago
Jukka Ojanen
2c2988518d
Remove all calls to clFlush()
3 years ago
Jens Steube
5024865d87
Kernel Threads: Use warp size / wavefront size query instead of hardcoded values as base for kernel threads
...
Kernel Cache: Add kernel threads into hash computation which is later used in the kernel cache filename
Remove some unused function symbol lookups in HIP library
3 years ago
Jens Steube
c990e252d3
Added option --multiply-accel-disable (short: -M) to disable the automatic multiplication of kernel-accel by the multiprocessor count
3 years ago
Jens Steube
a7a899e5a4
Backport changes from #2888 to HIP backend
3 years ago
Jukka Ojanen
cb923d6e46
Replace CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK with CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK_OPTIN
3 years ago
Jukka Ojanen
d23f2d6c2f
Calculate kernel dynamic memory size based on CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK
3 years ago
Jukka Ojanen
8674e23d79
Add async HIP memcpy functions: hc_hipMemcpyDtoDAsync(), hc_hipMemcpyDtoHAsync() and hc_hipMemcpyHtoDAsync(). Implement partially async HIP memset and bzero kernels.
3 years ago
Jukka Ojanen
4263cafdcf
Add async CUDA memcpy functions: hc_cuMemcpyDtoDAsync(), hc_cuMemcpyDtoHAsync() and hc_cuMemcpyHtoDAsync(). Implement partially async CUDA memset and bzero kernels.
3 years ago
Jukka Ojanen
4c0f6dd263
Merge branch 'hashcat:master' into master
3 years ago
Jukka Ojanen
ea5180ac46
Include missing bzero kernel parameters
3 years ago
Jens Steube
257098a301
Get rid of hip/hip_runtime.h dependency
3 years ago
Jukka Ojanen
a2a1d04bcf
Implement gpu_bzero
3 years ago
Jens Steube
45e65dd05a
Backport more ROCm based optimizations to HIP
3 years ago
Jens Steube
cf512faa53
Update large switch() cases in inc_common.cl and some inline assembly common functions for devices managed with HIP backend
3 years ago
Jens Steube
f3bd936971
Add hardware monitor mapping for HIP devices
3 years ago
Jens Steube
219bed457f
Fix use of --gpu-max-threads-per-block
3 years ago
Jens Steube
674ca7d88f
Add GPU threads to kernel cache checksum because it has an influence on HIP offline compile options
...
Add V_ALIGNBIT_B32 inline assembly wrapper because HIP does not provide amd_bitalign()
3 years ago
Jens Steube
23c3c178bf
Limit max threads per block to 64 to enable offline compiler to make better use of available registers
...
Fix double free() of hip_event1/hip_event2 and hip_stream that caused a segfault
Replace hc_cuCtxSetCurrent() with hc_cuCtxPushCurrent() in order to align changes with HIP
Add vector datatype operators (if we decide to use them - currently unused)
3 years ago
Jens Steube
ca3beacd93
Disable dynamic shared memory on HIP, because hipFuncSetAttribute() maps to cudaFuncSetAttribute() and not to cuFuncSetAttribute()
3 years ago
Jens Steube
1b84a9e53b
Add missing backports from code base v6.2.2
...
Fix context to thread management
Fix missing code in selftest.c, autotune.c, hashes.c, dispatch.c and backend.c
Use IS_HIP-dependent code instead of IS_CUDA || IS_HIP; this makes future optimization of inline assembly calls easier
See TODO markers for more optimizations / next steps
3 years ago
Jens Steube
bfe83ec138
Added temperature watchdog for CPU on linux using sysfs
3 years ago
Jens Steube
979f9e9868
Rename hardware monitor interface sysfs to sysfs_amdgpu
3 years ago
Jukka Ojanen
a0eaefa0c2
Missing whitespaces
3 years ago
Jukka Ojanen
e133bd4ec4
Change rc_board_name_amd initial value to CL_INVALID_VALUE. If CHECK_BOARD_NAME_AMD is changed to 0, the previous initial value is indistinguishable from CL_SUCCESS, which equals 0, and the device will be skipped.
3 years ago
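The reasoning in the commit above (a return-code variable that starts at 0 looks like success when the query never runs) can be sketched as follows. This is an illustration under stated assumptions, not the code in backend.c: the function names are hypothetical, and the two constants mirror the values of CL_SUCCESS and CL_INVALID_VALUE from the OpenCL headers so the sketch compiles without an OpenCL SDK.

```c
/* Local copies of the OpenCL values for CL_SUCCESS and CL_INVALID_VALUE. */
#define SKETCH_CL_SUCCESS        0
#define SKETCH_CL_INVALID_VALUE  (-30)

/* Hypothetical stand-in for the optional board-name query; always succeeds. */
static int query_board_name (void)
{
  return SKETCH_CL_SUCCESS;
}

/* Returns 1 if the caller would treat the board name as available.
   check_enabled plays the role of CHECK_BOARD_NAME_AMD, and initial_rc is
   the value the return-code variable holds before any query is made. */
static int have_board_name (int check_enabled, int initial_rc)
{
  int rc = initial_rc;

  if (check_enabled) rc = query_board_name ();

  return rc == SKETCH_CL_SUCCESS;
}
```

With the query disabled, an initial value of 0 makes `have_board_name` report success for data that was never fetched. Starting from an error code avoids that false positive.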
Jukka Ojanen
b3d18f86e2
Fix early return leaks in backend_session_begin
3 years ago
Jukka Ojanen
b976e52bc7
Fix early return leaks in load_kernel; nvrtc_options, nvrtc_options_string, build_log. Ensure build log NULL termination.
3 years ago
Jukka Ojanen
2f7eec2fd7
Fix early return leaks in backend_ctx_init and backend_ctx_devices_init
3 years ago
Jukka Ojanen
bcbb9b0d2c
Fix skipped device param leak in backend_ctx_devices_destroy
3 years ago
Jukka Ojanen
5f109b5862
Fix iconv_ctx and iconv_tmp leaks in backend.c
3 years ago
Jens Steube
8a3eee3fe5
OpenCL Runtime: Workaround JiT crash (SC failed. No reason given.) on macOS by limiting local memory allocations to 32k
3 years ago
Jens Steube
909d5e64a5
Added hash-mode: AES-128/192/256-ECB NOKDF
...
This mode is probably very rare in real-life scenarios,
but it is a nice template for kernels which do
not use a KDF,
or use AES,
or simple fast hashes with lookup tables
or simple optimized kernels in general
3 years ago
Jens Steube
28c08de9ef
Folders: Do not escape the variable cpath_real, to prevent errors in certain OpenCL runtimes that do not support escape characters
3 years ago
Jens Steube
90f0e78b5b
Add event about autodetection process start
...
Remove "..." substring from final messages
Add patient message on "..." startup messages
Add missing docs/changes.txt entry
3 years ago
Jens Steube
7e267b9b37
Merge pull request #2825 from matrix/hwmon_osx_v2
...
Add OSX HW Monitor initial support (2021)
3 years ago
Jens Steube
2d7f67fd23
Alias Devices: Show a warning in case the user specifically listed a device to use which in a later step is skipped because it is an alias of another active device
3 years ago
Gabriele Gristina
bc4ce4cbeb
Add support for CPU/GPU device temperature and fanspeed using iokit (Apple)
3 years ago
Jens Steube
ee7fca82f5
Scrypt Kernels: Re-enable scrypt based kernels to use kernel cache
3 years ago
Jens Steube
7fc0ac4ef1
Do not initialize backend devices in case --id is used and some bug fixes
3 years ago
Jens Steube
65d81c0f7b
Hashrate: Innerloop hashrate prediction requires update because of the new salt_repeats feature and also respect _loop2 kernel runtime
3 years ago
Jens Steube
fdccc8287d
Fixed free memory size output for skipped GPU (both automatic and manual) of --backend-info information screen
3 years ago
Royce Williams
7bea7ca177
tighten output to be physical-terminal friendly
3 years ago
nycex
470e844e5d
use XDG_CACHE_HOME for kernels
3 years ago
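A lookup of this kind might look like the sketch below. It follows the XDG Base Directory convention (use `$XDG_CACHE_HOME` if it is set to an absolute path, otherwise fall back to `$HOME/.cache`). The function name `kernel_cache_dir` and the `hashcat` subdirectory are assumptions for illustration, not taken from the commit.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes the kernel cache directory into out.
   Returns 0 on success, -1 if no base directory is known or the buffer
   is too small. */
static int kernel_cache_dir (char *out, size_t out_size)
{
  const char *xdg = getenv ("XDG_CACHE_HOME");
  int n;

  /* The XDG convention treats unset, empty, or relative values as invalid. */
  if (xdg != NULL && xdg[0] == '/')
  {
    n = snprintf (out, out_size, "%s/hashcat", xdg);
  }
  else
  {
    const char *home = getenv ("HOME");

    if (home == NULL || home[0] == '\0') return -1;

    n = snprintf (out, out_size, "%s/.cache/hashcat", home);
  }

  /* snprintf reports the length it wanted; reject truncated results. */
  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```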
Jens Steube
ce8c121b50
BCRYPT Kernels: Improved bcrypt performance by 6.5% for high-end NVIDIA GPU devices using CUDA backend
3 years ago
Jens Steube
9f5a22a3ab
OpenCL Backend: Use CL_DEVICE_BOARD_NAME_AMD instead of CL_DEVICE_NAME for device name in case OpenCL runtime supports this query
3 years ago
Jens Steube
5c6501444a
Kernels: Add standalone true UTF8 to UTF16 converter kernel that runs after amplifier. Use OPTS_TYPE_POST_AMP_UTF16LE from plugin
3 years ago
Jens Steube
fe91f6276d
CUDA Backend: Do not warn about missing CUDA SDK installation if --stdout is used
3 years ago
Jens Steube
a2fcb03fe3
Update AMD GPU check on macOS warning message
3 years ago
Jens Steube
282eb75fe9
Update module_unstable_warning for benchmark long selection on macOS for CPU and GPU; Allow use of GPU without --force testwise
3 years ago
Jens Steube
bbd6e55968
Add missing null pointer to cuda_module_shared and opencl_program_shared in backend cleanup function
3 years ago
Jens Steube
0c2afde83b
Add support for clUnloadPlatformCompiler()
3 years ago
Jens Steube
59459d0e5b
Fixed memory leak causing problems in sessions with many iterations, for instance --benchmark-all or large mask files
3 years ago
Jens Steube
9a87d5aa01
Fixed out-of-boundary reads in case user activates -S for fast but pure hashes in -a 1 or -a 3 mode
3 years ago
Jens Steube
0ba77fe761
Kernel Development: Kernel cache is disabled automatically in case hashcat is compiled with DEBUG=1
...
See https://github.com/hashcat/hashcat/issues/2750
3 years ago
Jens Steube
1dac869cb7
Removed unnecessary swaps in SCRYPT based algorithms
3 years ago
Jens Steube
15f35fa68c
Scrypt Kernels: Reduced kernel wait times by making it a true split kernel where iteration count = N value
3 years ago
Jens Steube
8e47fdf8f5
Add 4 times single workitem extra buffer size to total extra buffer size to workaround single workitem buffer overflows
3 years ago
Jens Steube
57a8923b81
Update complete SCRYPT workload tuning logic.
...
A detailed description will follow.
Set -m 8900 defaults to 16k:8:1 (default scrypt settings).
3 years ago
Jens Steube
ff96015f53
Add OPTS_TYPE_NATIVE_THREADS for use by plugin developer to enforce native thread count (useful for scrypt)
4 years ago
Jens Steube
67d189e10a
Update calculation of EXTRA_SPACE in backend.c and add upper and lower hard limit
4 years ago
Jens Steube
51e8661070
Update calculation of EXTRA_SPACE in backend.c to make it depend on kernel-accel
4 years ago
Jens Steube
9033975efd
Allow plugins to disable the multiplication of the kernel-accel value with the multiprocessor count of the compute device. Will be used later.
4 years ago
Jens Steube
a0eae9050c
OpenCL Runtime: Workaround JiT compiler deadlock on NVIDIA driver >= 465.89
4 years ago
Jens Steube
3c199bfa1b
Merge pull request #2693 from matrix/out_of_host_memory
...
[backend.c] skipping devices instead of stop with error
4 years ago
Jens Steube
d53913f444
Merge pull request #2672 from matrix/example2info
...
Replaced --example-hashes with --hash-info
4 years ago
Chick3nman
9b6235a5fc
Downgrade Kernel Exec Timeout Warning
...
Kernel times of >450ms are very uncommon and this warning is not blocking, so downgrade it to advice so that it can be hidden along with other advice messages.
4 years ago
Gabriele Gristina
fda0d668e5
Also skip the device in the first checks of backend_session_begin()
4 years ago
Gabriele Gristina
4c2605f7f2
Switch to skipping the device instead of returning -1 for all checks; move CUDA counter update to the end of the loop
4 years ago
Gabriele Gristina
f4dbd46b71
Try skipping devices instead of returning -1
4 years ago
Gabriele Gristina
77e328d659
Removed option --example-hashes; it is now an alias of --hash-info
4 years ago
Gabriele Gristina
3ed1f0d840
Added new option: --hash-info
4 years ago
Jens Steube
04d5e5a119
New Attack-Mode: Association Attack. Like JtR's single mode. Very early stage. See hashcat Forum for detailed writeup.
4 years ago
Jens Steube
57bef8abc9
Display possible NVIDIA CUDA/RTC library loading error message only in case an NVIDIA device was found using OpenCL
4 years ago
Jens Steube
111f39eeb2
OpenCL Runtime: Switched default OpenCL device type on macOS from GPU to CPU. Use -D 2 to enable GPU devices.
4 years ago
Jens Steube
343d3bc0aa
CUDA Backend: Give detailed warning if either the NVIDIA CUDA or the NVIDIA RTC library cannot be initialized
...
CUDA Backend: Do not warn about missing CUDA SDK installation if --backend-ignore-cuda is used
4 years ago
Jens Steube
6a419d068c
CUDA Backend: Use blocking events to avoid 100% CPU core usage (per GPU)
4 years ago
Jens Steube
62a7ae4075
Increase EXTRA_SPACE to leave some room for free device memory
4 years ago
Jens Steube
3ebf4c5f9f
Merge branch 'master' of https://github.com/hashcat/hashcat
4 years ago
Jens Steube
98aef2ae92
Module Structure: Add 3rd party library hook management functions. This also requires an update to all existing module_init()
4 years ago
philsmd
ee5bce1c3e
fixes #2518 : call clear_prompt () more often to avoid misaligned prompt
4 years ago
Jens Steube
a72ba6faab
Add OPTI_TYPE_SLOW_HASH_SIMD_INIT2 and OPTI_TYPE_SLOW_HASH_SIMD_LOOP2
4 years ago
Jens Steube
e21463da4b
Fixed race condition resulting in out of memory error on startup if multiple hashcat instances are started at the same time
4 years ago
Jens Steube
0ff2f8c5e1
OpenCL Devices: Utilize PCI domain to improve alias device detection
4 years ago
philsmd
3e822e97b9
fixes #2460 : better alias detection esp. for macOS
4 years ago
Jens Steube
5628317de8
OpenCL Runtime: Reinterpret return code CL_DEVICE_NOT_FOUND from clGetDeviceIDs() as non-fatal
4 years ago
philsmd
e59f61e8cf
cosmetic: minor code style fixes
4 years ago
Jens Steube
a6a6bb200a
Mark NV 441.x as fixed
4 years ago
Jens Steube
1e469a96a4
Add missing branch in automatic alias device selection
4 years ago
Jens Steube
34f71aaea3
Re-enable POCL if version detected is >= 1.5 and LLVM is >= 9.x and also remove performance warning. Still prefers native OpenCL runtime in alias detection, but this default can be overridden using -d parameter.
4 years ago
Matt Palmer
240d35976a
Fix build warning in DEBUG mode
...
Just a tiny cleanup to avoid an 'unused variable' warning when building
with DEBUG=1.
5 years ago
Jens Steube
008072eb65
OpenCL Runtime: Added a warning if OpenCL runtime NEO, Beignet, POCL or MESA is detected and skip associated devices (override with --force)
5 years ago
Jens Steube
434ad76381
Improve alias device detection to distinguish between Intel CPU and embedded GPU
5 years ago
Jens Steube
ba7163062d
Do not set -cl-std=XXX, to work around a NEO driver bug that causes a hang while compiling -m 22000
5 years ago
Jens Steube
2b2a7ede66
OpenCL Options: Set --spin-damp to 0 (disabled) by default. With the CUDA backend this workaround became deprecated
5 years ago
Jens Steube
8c3808bad5
Fix NUL filename on windows
5 years ago
Jens Steube
3e4d110fd2
Add stderr redirection the regular way
5 years ago
Jens Steube
125e9ec863
Do not redirect stderr to /dev/null to prevent rocm 3.1 from crashing on debian
5 years ago
Jens Steube
f381e1bbf8
Remove force_recompile functionality, doesn't work with cubin anymore
5 years ago
Jens Steube
f96e35649d
Change bitsliced kernels from 3d to 2d invocation mode for slightly better performance
5 years ago
Jens Steube
d9473358ef
Add support for OPTS_TYPE_LOOP_EXTENDED kernel for special cases like VeraCrypt
5 years ago
Jens Steube
c90d83c3eb
Prepare for UNROLL whitelisting
5 years ago
Jens Steube
4788c61dd2
Add OPTI_TYPE_REGISTER_LIMIT flag to enable register limiting in CUDA
5 years ago
Jens Steube
17a64f5019
Set a fixed register count maximum for CUDA kernels. This prevents kernels from going out of control and having negative effects on other kernels from the same source code (for instance 16600)
5 years ago
Jens Steube
c40f474c2e
Add special module option to indicate the kernel is using dynamic shared memory
5 years ago
Jens Steube
fb7bb04587
Do not use dynamic shared memory if dynamic_local_mem_size is a multiple of local_mem_size
5 years ago
Jens Steube
96a2c36f53
Reduce CUDA Toolkit minimum version to 9.0 (even 8.0 should be sufficient)
5 years ago
Jens Steube
aef53f7e10
OpenCL Runtime: Allow the kernel to access post-48k shared memory region on CUDA. Requires both module and kernel preparation
5 years ago
Jens Steube
1fc37c25f9
OpenCL Kernels: Moved "gpu_decompress", "gpu_memset" and "gpu_atinit" into new OpenCL/shared.cl in order to reduce compile time
5 years ago
Jens Steube
08163501cf
Add option to disable cubin cache binaries and moved some redundant kernel load code into specific function
5 years ago
Jens Steube
01085cdab2
Move cujit_opts allocation closer to the calling functions because CUDA library needs it reinitialized after each use
5 years ago
Jens Steube
346637ec43
Improve cujit logging
5 years ago
Jens Steube
66ae5125ce
Cache cubin instead of PTX to decrease startup time
5 years ago