Jens Steube
a4299b74af
Memory Management: Refactored the code responsible for limiting kernel accel in order to avoid out of -host- memory situations
3 years ago
Jens Steube
fd2cb59d26
AMD GPUs: On Apple OpenCL platform, we ask for the preferred kernel thread size rather than hard-coding 32
ECC secp256k1: Removed the inline assembly code for AMD GPUs because the latest JIT compilers optimize it with the same efficiency
3 years ago
Jens Steube
959a232828
Merge pull request #2885 from neheb/charfixes
const and char fixes
3 years ago
Jens Steube
640d95a00f
Vendor Detection: Add "Intel" as a valid vendor name for GPU on macOS
3 years ago
Rosen Penev
adaf3f293b
make const char pointers actually const
const char* is a non const pointer that points to const data. Add
missing const.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
3 years ago
Jens Steube
5024865d87
Kernel Threads: Use warp size / wavefront size query instead of hardcoded values as base for kernel threads
Kernel Cache: Add kernel threads into hash computation which is later used in the kernel cache filename
Remove some unused function symbol lookups in HIP library
3 years ago
Jens Steube
c990e252d3
Added option --multiply-accel-disable (short: -M) to disable the automatic multiplication of kernel-accel by the multiprocessor count
3 years ago
Jens Steube
a7a899e5a4
Backport changes from #2888 to HIP backend
3 years ago
Jukka Ojanen
cb923d6e46
Replace CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK with CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK_OPTIN
3 years ago
Jukka Ojanen
d23f2d6c2f
Calculate kernel dynamic memory size based on CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK
3 years ago
Jukka Ojanen
8674e23d79
Add async HIP memcpy functions: hc_hipMemcpyDtoDAsync(), hc_hipMemcpyDtoHAsync() and hc_hipMemcpyHtoDAsync(). Implement partially async HIP memset and bzero kernels.
3 years ago
Jukka Ojanen
4263cafdcf
Add async CUDA memcpy functions: hc_cuMemcpyDtoDAsync(), hc_cuMemcpyDtoHAsync() and hc_cuMemcpyHtoDAsync(). Implement partially async CUDA memset and bzero kernels.
3 years ago
Jukka Ojanen
4c0f6dd263
Merge branch 'hashcat:master' into master
3 years ago
Jukka Ojanen
ea5180ac46
Include missing bzero kernel parameters
3 years ago
Jens Steube
257098a301
Get rid of hip/hip_runtime.h dependency
3 years ago
Jukka Ojanen
a2a1d04bcf
Implement gpu_bzero
3 years ago
Jens Steube
45e65dd05a
Backport more ROCm based optimizations to HIP
3 years ago
Jens Steube
cf512faa53
Update large switch() cases in inc_common.cl and some inline assembly common functions for devices managed with HIP backend
3 years ago
Jens Steube
f3bd936971
Add hardware monitor mapping for HIP devices
3 years ago
Jens Steube
219bed457f
Fix use of --gpu-max-threads-per-block
3 years ago
Jens Steube
674ca7d88f
Add GPU threads to kernel cache checksum because it has an influence on HIP offline compile options
Add V_ALIGNBIT_B32 inline assembly wrapper because HIP does not provide amd_bitalign()
3 years ago
Jens Steube
23c3c178bf
Limit max threads per block to 64 to enable the offline compiler to make better use of available registers
Fix double free() of hip_event1/hip_event2 and hip_stream that caused a segfault
Replace hc_cuCtxSetCurrent() with hc_cuCtxPushCurrent() in order to align changes with HIP
Add vector datatype operators (if we decide to use them - currently unused)
3 years ago
Jens Steube
ca3beacd93
Disable dynamic shared memory on HIP, because hipFuncSetAttribute() maps to cudaFuncSetAttribute() and not to cuFuncSetAttribute()
3 years ago
Jens Steube
1b84a9e53b
Add missing backports from code base v6.2.2
Fix context to thread management
Fix missing code in selftest.c, autotune.c, hashes.c, dispatch.c and backend.c
Use IS_HIP-dependent code instead of IS_CUDA || IS_HIP; this makes future optimization of inline assembly calls easier
See TODO markers for more optimizations / next steps
3 years ago
Jens Steube
bfe83ec138
Added temperature watchdog for CPU on linux using sysfs
3 years ago
Jens Steube
979f9e9868
Rename hardware monitor interface sysfs to sysfs_amdgpu
3 years ago
Jukka Ojanen
a0eaefa0c2
Missing whitespaces
3 years ago
Jukka Ojanen
e133bd4ec4
Change rc_board_name_amd initial value to CL_INVALID_VALUE. If CHECK_BOARD_NAME_AMD is changed to 0, there is a problem with CL_SUCCESS, which equals 0: the device will be skipped.
3 years ago
Jukka Ojanen
b3d18f86e2
Fix early return leaks in backend_session_begin
3 years ago
Jukka Ojanen
b976e52bc7
Fix early return leaks in load_kernel; nvrtc_options, nvrtc_options_string, build_log. Ensure build log NULL termination.
3 years ago
Jukka Ojanen
2f7eec2fd7
Fix early return leaks in backend_ctx_init and backend_ctx_devices_init
3 years ago
Jukka Ojanen
bcbb9b0d2c
Fix skipped device param leak in backend_ctx_devices_destroy
3 years ago
Jukka Ojanen
5f109b5862
Fix iconv_ctx and iconv_tmp leaks in backend.c
3 years ago
Jens Steube
8a3eee3fe5
OpenCL Runtime: Workaround JiT crash (SC failed. No reason given.) on macOS by limiting local memory allocations to 32k
3 years ago
Jens Steube
909d5e64a5
Added hash-mode: AES-128/192/256-ECB NOKDF
This mode is probably very rare in real-life scenarios,
but it is a nice template for kernels which do
not use a KDF,
or use AES,
or simple fast hashes with lookup tables
or simple optimized kernels in general
3 years ago
Jens Steube
28c08de9ef
Folders: Do not escape the variable cpath_real, to prevent errors in certain OpenCL runtimes that do not support escape characters
3 years ago
Jens Steube
90f0e78b5b
Add event about autodetection process start
Remove "..." substring from final messages
Add patient message on "..." startup messages
Add missing docs/changes.txt entry
3 years ago
Jens Steube
7e267b9b37
Merge pull request #2825 from matrix/hwmon_osx_v2
Add OSX HW Monitor initial support (2021)
3 years ago
Jens Steube
2d7f67fd23
Alias Devices: Show a warning in case the user specifically listed a device to use which in a later step is skipped because it is an alias of another active device
3 years ago
Gabriele Gristina
bc4ce4cbeb
Add support for CPU/GPU device temperature and fanspeed using iokit (Apple)
3 years ago
Jens Steube
ee7fca82f5
Scrypt Kernels: Re-enable scrypt based kernels to use kernel cache
3 years ago
Jens Steube
7fc0ac4ef1
Do not initialize backend devices in case --id is used and some bug fixes
3 years ago
Jens Steube
65d81c0f7b
Hashrate: Update innerloop hashrate prediction for the new salt_repeats feature and also respect _loop2 kernel runtime
3 years ago
Jens Steube
fdccc8287d
Fixed free memory size output for skipped GPU (both automatic and manual) of --backend-info information screen
3 years ago
Royce Williams
7bea7ca177
tighten output to be physical-terminal friendly
3 years ago
nycex
470e844e5d
use XDG_CACHE_HOME for kernels
3 years ago
Jens Steube
ce8c121b50
BCRYPT Kernels: Improved bcrypt performance by 6.5% for high-end NVIDIA GPU devices using CUDA backend
3 years ago
Jens Steube
9f5a22a3ab
OpenCL Backend: Use CL_DEVICE_BOARD_NAME_AMD instead of CL_DEVICE_NAME for device name in case OpenCL runtime supports this query
3 years ago
Jens Steube
5c6501444a
Kernels: Add standalone true UTF8 to UTF16 converter kernel that runs after amplifier. Use OPTS_TYPE_POST_AMP_UTF16LE from plugin
3 years ago
Jens Steube
fe91f6276d
CUDA Backend: Do not warn about missing CUDA SDK installation if --stdout is used
3 years ago