Jens Steube
|
5628317de8
|
OpenCL Runtime: Reinterpret return code CL_DEVICE_NOT_FOUND from clGetDeviceIDs() as non-fatal
|
4 years ago |
philsmd
|
e59f61e8cf
|
cosmetic: minor code style fixes
|
4 years ago |
Jens Steube
|
a6a6bb200a
|
Mark NV 441.x as fixed
|
4 years ago |
Jens Steube
|
1e469a96a4
|
Add missing branch in automatic alias device selection
|
4 years ago |
Jens Steube
|
34f71aaea3
|
Re-enable POCL if version detected is >= 1.5 and LLVM is >= 9.x and also remove performance warning. Still prefers native OpenCL runtime in alias detection, but this default can be overridden using -d parameter.
|
4 years ago |
Matt Palmer
|
240d35976a
|
Fix build warning in DEBUG mode
Just a tiny cleanup to avoid an 'unused variable' warning when building
with DEBUG=1.
|
5 years ago |
Jens Steube
|
008072eb65
|
OpenCL Runtime: Added a warning if OpenCL runtime NEO, Beignet, POCL or MESA is detected and skip associated devices (override with --force)
|
5 years ago |
Jens Steube
|
434ad76381
|
Improve alias device detection to distinguish between Intel CPU and embedded GPU
|
5 years ago |
Jens Steube
|
ba7163062d
|
Do not set -cl-std=XXX to work around NEO driver bug causing a hang while compiling -m 22000
|
5 years ago |
Jens Steube
|
2b2a7ede66
|
OpenCL Options: Set --spin-damp to 0 (disabled) by default. With the CUDA backend this workaround became deprecated
|
5 years ago |
Jens Steube
|
8c3808bad5
|
Fix NUL filename on Windows
|
5 years ago |
Jens Steube
|
3e4d110fd2
|
Add stderr redirection the regular way
|
5 years ago |
Jens Steube
|
125e9ec863
|
Do not redirect stderr to /dev/null to prevent rocm 3.1 from crashing on debian
|
5 years ago |
Jens Steube
|
f381e1bbf8
|
Remove force_recompile functionality, doesn't work with cubin anymore
|
5 years ago |
Jens Steube
|
f96e35649d
|
Change bitsliced kernels from 3d to 2d invocation mode for slightly better performance
|
5 years ago |
Jens Steube
|
d9473358ef
|
Add support for OPTS_TYPE_LOOP_EXTENDED kernel for special cases like VeraCrypt
|
5 years ago |
Jens Steube
|
c90d83c3eb
|
Prepare for UNROLL whitelisting
|
5 years ago |
Jens Steube
|
4788c61dd2
|
Add OPTI_TYPE_REGISTER_LIMIT flag to enable register limiting in CUDA
|
5 years ago |
Jens Steube
|
17a64f5019
|
Set a fixed register count maximum for CUDA kernel. This prevents kernels from going out of control and having negative effects on other kernels from the same source code (for instance 16600)
|
5 years ago |
Jens Steube
|
c40f474c2e
|
Add special module option to indicate the kernel is using dynamic shared memory
|
5 years ago |
Jens Steube
|
fb7bb04587
|
Do not use dynamic shared memory if dynamic_local_mem_size is a multiple of local_mem_size
|
5 years ago |
Jens Steube
|
96a2c36f53
|
Reduce CUDA Toolkit minimum version to 9.0 (even 8.0 should be sufficient)
|
5 years ago |
Jens Steube
|
aef53f7e10
|
OpenCL Runtime: Allow the kernel to access post-48k shared memory region on CUDA. Requires both module and kernel preparation
|
5 years ago |
Jens Steube
|
1fc37c25f9
|
OpenCL Kernels: Moved "gpu_decompress", "gpu_memset" and "gpu_atinit" into new OpenCL/shared.cl in order to reduce compile time
|
5 years ago |
Jens Steube
|
08163501cf
|
Add option to disable cubin cache binaries and moved some redundant kernel load code into specific function
|
5 years ago |
Jens Steube
|
01085cdab2
|
Move cujit_opts allocation closer to the calling functions because CUDA library needs it reinitialized after each use
|
5 years ago |
Jens Steube
|
346637ec43
|
Improve cujit logging
|
5 years ago |
Jens Steube
|
66ae5125ce
|
Cache cubin instead of PTX to decrease startup time
|
5 years ago |
Jens Steube
|
cc4fd48ace
|
Optimize hook buffer size to be copied
|
5 years ago |
Jens Steube
|
041a777025
|
OpenCL Runtime: Unlocked maximum thread count for NVIDIA GPU
|
5 years ago |
Jens Steube
|
ccacc508cb
|
Reenabled support for Intel GPU OpenCL runtime (Beignet and NEO) because a workaround was found (force -cl-std=CL2.0)
|
5 years ago |
Jens Steube
|
fe372dffb7
|
Add RDNA ISA instructions test for ADD/ADDC/SUB/SUBB
|
5 years ago |
Jens Steube
|
df5e2361d3
|
Disable inline assembly instruction tests for CUDA and refer to documented requirements
|
5 years ago |
Jens Steube
|
d0fb171da9
|
Added new options --backend-ignore-cuda and --backend-ignore-opencl, to prevent the CUDA and/or OpenCL interface from being loaded on startup
|
5 years ago |
Jens Steube
|
b3690fcd05
|
Backport instruction test cache from CUDA to OpenCL
|
5 years ago |
Jens Steube
|
2b4d0656d5
|
Cache inline assembly instruction check results for same devices types
|
5 years ago |
Jens Steube
|
5d1d48f5d7
|
Do not check for COPY_PW limits in outside kernels
|
5 years ago |
Jens Steube
|
53254b45aa
|
Backport inc_ecc_secp256k1 inline assembly code for AMD ISA
|
5 years ago |
Jens Steube
|
bfd95d42f6
|
OpenCL Runtime: Reenabled support for Intel GPU OpenCL runtime
|
5 years ago |
Jens Steube
|
2884bded32
|
Initialize some variable to make scan-build happy
|
5 years ago |
Jens Steube
|
00b9f4c557
|
Add kernel accel minimum limit check
|
5 years ago |
Jens Steube
|
424777ae28
|
Add kernel accel limiter based on kernel threads to reduce host memory requirements
|
5 years ago |
Jens Steube
|
f7c3ced548
|
Fix use of calloc() in backend.c
|
5 years ago |
Jens Steube
|
c4dd020685
|
Add support for NVIDIA Jetson AGX Xavier developer kit
|
5 years ago |
Jens Steube
|
53e96a12a0
|
Improve automatic calculation of hook threads value
|
5 years ago |
Jens Steube
|
fe8c17f4c7
|
Support pause/abort in hooks
|
5 years ago |
Jens Steube
|
9c2c73c6cc
|
Clear hook buffers after full kernel chain is finished
|
5 years ago |
Jens Steube
|
7458e4f487
|
Add per-device available memory test of static data (hashlist, ruleset) before test of dynamic data (-n based)
|
5 years ago |
Rosen Penev
|
a6edb84157
|
Fix extra semicolon warnings
These macros don't need a ; but since ; is used, make the macros more
robust by enclosing them in a do while loop.
|
5 years ago |
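The macro change described in commit a6edb84157 above can be illustrated with a minimal sketch. The macro `SWAP_INT` and the function `order_pair` are hypothetical names invented for this example and are not taken from the repository. The idea is that a multi-statement macro wrapped in `do { ... } while (0)` expands to a single statement that expects a trailing `;`, so it neither triggers an extra-semicolon warning nor breaks an unbraced `if`/`else`.

```c
/* Hypothetical multi-statement macro, for illustration only.
 * Wrapping the body in do { ... } while (0) turns it into one statement
 * that requires the trailing ';' written at the call site. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Orders the pair so that *lo <= *hi.
 * Returns 1 if the values were swapped, 0 otherwise. */
int order_pair (int *lo, int *hi)
{
  if (*lo > *hi)
    SWAP_INT (*lo, *hi); /* the ';' completes the do/while statement */
  else
    return 0;            /* with a bare { ... } macro this 'else' would not compile */

  return 1;
}
```

If the macro body were a plain `{ ... }` block instead, the `;` after the call would be an empty extra statement, and the `else` branch would no longer attach to the `if`.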
Jens Steube
|
c12470b978
|
Merge pull request #2188 from neheb/cast
Add casts where needed in C++ mode
|
5 years ago |