Gabriele Gristina
|
24de156ce8
|
Fixed backend active devices checks
|
2 years ago |
Gabriele Gristina
|
fb12de4be6
|
fix style
|
2 years ago |
Gabriele Gristina
|
7eb66e9936
|
Hardware Monitor: Add support for GPU device utilization readings using iokit on Apple Silicon (OpenCL and Metal)
|
2 years ago |
Gabriele Gristina
|
3064c63c71
|
User Options: Change --backend-info/-I option type, from bool to int
|
2 years ago |
Gabriele Gristina
|
b3d3b31c3e
|
Metal: added support for vectors up to 4
|
2 years ago |
Jens Steube
|
be75e4b4ea
|
Rename STR() to M2S() to avoid future collisions and move from kernel source to command line parameter
|
2 years ago |
Gabriele Gristina
|
7ac879f1e4
|
Fixed unused variable warning on Windows
|
2 years ago |
Gabriele Gristina
|
829d49c8ba
|
resync src/backend.c
|
2 years ago |
Gabriele Gristina
|
490702fcfa
|
Backends: added Metal host-code
|
2 years ago |
Jens Steube
|
8293964097
|
Fix coding convention
|
2 years ago |
Gabriele Gristina
|
cd363b32f6
|
Merge branch 'master' into metal_prepare_kernelIncludes_v2
|
2 years ago |
Gabriele Gristina
|
01a28f80f7
|
Updated handling of POCL's known bugs
|
2 years ago |
Gabriele Gristina
|
a1ced24564
|
Fixed bug on benchmark engine, add some unstable warnings, updated negative status code
|
2 years ago |
Gabriele Gristina
|
2e4a136758
|
Refactored standard kernel includes in order to support Apple Metal runtime, updated backend, test units and status code
|
2 years ago |
Gabriele Gristina
|
7650894e02
|
fixed bug in benchmark engine, updated negative status code
|
2 years ago |
Jens Steube
|
dfd316c653
|
Merge pull request #3103 from matrix/backend_session_update_mp_rl
Removed hc_clSetKernelArg() call from backend_session_update_mp_rl()
|
2 years ago |
Jens Steube
|
7a9a1b37d0
|
Merge pull request #3104 from matrix/backend_session_update_mp
Removed hc_clSetKernelArg() call from backend_session_update_mp()
|
2 years ago |
Jens Steube
|
56ef2b4bde
|
Merge pull request #3102 from matrix/backend_cuda_restyle
CUDA Backend: moved functions to ext_cuda.c/ext_nvrtc.c and includes to ext_cuda.h/ext_nvrtc.h
|
2 years ago |
Jens Steube
|
045ca5cb7a
|
Fixed how OPTS_TYPE_AUX* kernels are called in association mode, for instance WPA/WPA2 kernels
|
2 years ago |
Jens Steube
|
668d2179cd
|
Kernels: Refactored standard kernel declaration to use a structure holding u32/u64 attributes to reduce the number of attributes
|
2 years ago |
Gabriele Gristina
|
994083eaf5
|
Removed hc_clSetKernelArg() call from backend_session_update_mp()
|
2 years ago |
Gabriele Gristina
|
0f0cf1fe08
|
Removed hc_clSetKernelArg() call from backend_session_update_mp_rl()
|
2 years ago |
Gabriele Gristina
|
f8ceb8785e
|
CUDA Backend: moved functions to ext_cuda.c/ext_nvrtc.c and includes to ext_cuda.h/ext_nvrtc.h
|
2 years ago |
Gabriele Gristina
|
78c7ee2af6
|
HIP Backend: moved functions to ext_hip.c/ext_hiprtc.c and includes to ext_hip.h/ext_hiprtc.h
|
2 years ago |
Gabriele Gristina
|
26b6054cab
|
OpenCL Backend: moved functions to ext_OpenCL.c and includes to ext_OpenCL.h
|
2 years ago |
Gabriele Gristina
|
861e644057
|
OpenCL Backend: added workaround to make optimized kernels work on Apple Silicon
|
2 years ago |
Jens Steube
|
df6e5480ca
|
Print module_extra_tuningdb_block undefined compute device warning only on GPU
|
2 years ago |
Gabriele Gristina
|
3fd6dac523
|
Set default device-type to GPU with Apple M1
|
2 years ago |
Gabriele Gristina
|
0fae3a4394
|
Added support for Apple Silicon compute devices
|
2 years ago |
Jens Steube
|
d4a54287b1
|
Add missing backslash for RUN_INSTRUCTION_CHECKS() on AMD
|
2 years ago |
Jens Steube
|
3d53188cc3
|
Tuning Database: Added a warning if a module implements module_extra_tuningdb_block but the installed computing device is not found
|
2 years ago |
Jens Steube
|
21f91c5bb8
|
Module Optimizers: Added OPTS_TYPE_MAXIMUM_THREADS to deactivate the else branch route in the section to find -T before compilation
Set the new flag based on some testing with RX6900XT
|
3 years ago |
Gabriele Gristina
|
9be7bc71a5
|
OpenCL Backend: added workaround to support Apple Silicon
|
3 years ago |
Jens Steube
|
53f6693495
|
Temporarily enable HIP 4.4/ROCm 4.5 on Linux and globally set native thread count
|
3 years ago |
Jens Steube
|
f84aca82ca
|
Backend types: The default filter for the device types is now set so that only the GPU is used, except for APPLE, where we set CPU
|
3 years ago |
Jens Steube
|
49a68cd6c1
|
AMD Driver: Updated requirements for AMD Linux drivers to ROCm 4.5 or later due to new HIP interface
|
3 years ago |
Jens Steube
|
576a71af5c
|
Update minimum HIP version from 4.4 to upcoming 4.5
|
3 years ago |
Jens Steube
|
756c29ec57
|
Add missing cleanup on windows if outdated HIP version is detected
|
3 years ago |
Jens Steube
|
733f9c2d77
|
Add better detection of future HIP 4.4
|
3 years ago |
Jens Steube
|
07e58631a5
|
Backend devices: In non -S mode, limit the number of workitems so that no more than 4GB of host memory is required per backend device
|
3 years ago |
Jens Steube
|
4b6654b503
|
Fix unstable plugin to driver warning
|
3 years ago |
Jens Steube
|
c1fd42fe72
|
Reduce work item maximum in -S mode even further, tested with NTLM
|
3 years ago |
Jens Steube
|
bd2cde31ae
|
Backend devices: In -S mode, limit the number of workitems so that no more than 2GB of host memory is required per backend device
|
3 years ago |
Jens Steube
|
4ef1509bc7
|
Backend Devices: Reduce the maximum workitems limit derived from available host memory from 8GB to 4GB per backend device
|
3 years ago |
Jens Steube
|
721e1ea54d
|
Fixed division by zero because backend_ctx->hardware_power_all was not re-inserted after refactoring device_param->hardware_power
|
3 years ago |
Jens Steube
|
8c14fd85ea
|
POCL: Added a workaround for an issue in POCL that treats a quote character as part of the path itself when a path is given to the -I option
|
3 years ago |
Jens Steube
|
b4b2195fa5
|
OpenCL Runtime: The use of clUnloadPlatformCompiler() was disabled after some users received unexpected return codes
|
3 years ago |
Jens Steube
|
85854236d1
|
Merge pull request #2935 from matrix/apple_gpu_workaround
Workaround for 'clEnqueueWriteBuffer(): CL_INVALID_VALUE' with Apple GPU
|
3 years ago |
Jens Steube
|
50e28ff306
|
Merge pull request #2926 from jtojanen/master
Code cleanup and small fixes
|
3 years ago |
Jukka Ojanen
|
6b4786de84
|
Make blocking clEnqueueWriteBuffer() non-blocking
|
3 years ago |