| Author | Commit | Message | Date |
|---|---|---|---|
| Gabriele Gristina | f8ceb8785e | CUDA Backend: moved functions to ext_cuda.c/ext_nvrtc.c and includes to ext_cuda.h/ext_nvrtc.h | 2022-01-03 16:29:15 +01:00 |
| Jukka Ojanen | cdf27a1cb3 | Implement async run_cuda_kernel_memset() and run_cuda_kernel_memset32() | 2021-07-27 18:56:59 +03:00 |
| Jukka Ojanen | a642f7b233 | Remove synchronous GPU memory copy functions | 2021-07-26 15:36:42 +03:00 |
| Jukka Ojanen | 4263cafdcf | Add async CUDA memcpy functions: hc_cuMemcpyDtoDAsync(), hc_cuMemcpyDtoHAsync() and hc_cuMemcpyHtoDAsync(). Implement partially async CUDA memset and bzero kernels. | 2021-07-20 12:23:39 +03:00 |
| Jens Steube | 66ae5125ce | Cache cubin instead of PTX to decrease startup time | 2020-01-29 15:56:36 +01:00 |
| Jens Steube | 33028314f0 | Add hc_cuCtxSetCacheConfig() | 2019-05-09 00:04:05 +02:00 |
| Jens Steube | ec9925f3b1 | Warnings self-check and autotune with CUDA | 2019-05-04 21:52:00 +02:00 |
| Jens Steube | a6fa7a2749 | Add support for some first CUDA module loader | 2019-05-02 14:58:52 +02:00 |
| Jens Steube | 4b986de5fb | Prepare native CUDA hybrid integration | 2019-04-25 14:45:17 +02:00 |
| jsteube | 378258d789 | Fix caching system for use with AMD and NV, drop BINARY_KERNEL define | 2015-12-21 12:01:38 +01:00 |
| jsteube | 968265fffb | - Prepared for JIT use of hash-mode 1500, 8900 and 9300, works already on OpenCL (AMD); - Changed PROMPT | 2015-12-07 21:37:12 +01:00 |
| Jens Steube | 5065474b4e | Initial commit | 2015-12-04 15:47:52 +01:00 |