Jens Steube | 316095c151 | Some more ROCm performance tuning | 2019-06-20 10:04:31 +02:00
Jens Steube | 5e0eb288c9 | Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL | 2019-06-16 18:01:26 +02:00
Jens Steube | 7832c54452 | Fix constant memory use of bfs_buf | 2019-05-11 09:32:16 +02:00
Jens Steube | 46f737c5af | Use real constant memory on CUDA | 2019-05-10 13:22:26 +02:00
Jens Steube | d0bd33c9d1 | Rename CONSTANT_AS to CONSTANT_VK | 2019-05-06 14:34:16 +02:00
Jens Steube | ec9925f3b1 | Warnings self-check and autotune with CUDA | 2019-05-04 21:52:00 +02:00
Jens Steube | 3b7304c9d8 | Fix recursion in inc_platform.cl | 2019-04-26 14:01:14 +02:00
Jens Steube | 89119bf24a | Add missing inc_platform.h include | 2019-04-26 13:59:43 +02:00
Jens Steube | 9faba41848 | Use nvrtc to compile PTX (resulting PTX not yet used) | 2019-04-26 13:28:44 +02:00
Jens Steube | 4b986de5fb | Prepare native CUDA hybrid integration | 2019-04-25 14:45:17 +02:00