Jens Steube
45e65dd05a
Backport more ROCm based optimizations to HIP
3 years ago
Jens Steube
cf512faa53
Update large switch() cases in inc_common.cl and some inline assembly common functions for devices managed with HIP backend
3 years ago
Jens Steube
7faf6859d6
Backport hand-optimized compiler settings in modules from ROCM to HIP
Backport DECLSPEC settings from ROCM to HIP
3 years ago
Jens Steube
1b84a9e53b
Add missing backports from code base v6.2.2
Fix context to thread management
Fix missing code in selftest.c, autotune.c, hashes.c, dispatch.c and backend.c
Use IS_HIP-dependent code instead of IS_CUDA || IS_HIP; this makes future optimizations related to inline assembly calls easier
See TODO markers for more optimizations / next steps
3 years ago
Jens Steube
a22f8149fc
Merge branch 'HIP' into hip
3 years ago
reger-men
ea7b74389f
First draft HIP Version
3 years ago
Jens Steube
9bf0f36d0a
Get rid of MAYBE_VOLATILE for context position by replacing it with zero length check
3 years ago
Jens Steube
ddb641b843
Add option to force disable real SHM access to be used from within the module
4 years ago
Jens Steube
61fe90bacb
Use old-school SHA1 kernel for CPU; it's slightly faster
4 years ago
Jens Steube
b4bac70bd6
Remove inline keyword in DECLSPEC for CPU
4 years ago
Jens Steube
e53bff0fb0
Reenable bitselect() and rotate() on Intel SDK
4 years ago
Jens Steube
c90d83c3eb
Prepare for UNROLL whitelisting
4 years ago
Jens Steube
3561e7b8d7
Add special ROCM detection in OpenCL/inc_vendor.h
4 years ago
Jens Steube
3a5544a554
Help some compilers with 64-bit constants
4 years ago
Jens Steube
cf4cee2f2f
Update selection of API to make use of bitselect and rotate
4 years ago
Jens Steube
89f9ef45b6
Whitelist some OpenCL specific functions
4 years ago
Jens Steube
8ff8c5d536
Add LOCAL_VK to make use of __shared__
5 years ago
Jens Steube
d0bd33c9d1
Rename CONSTANT_AS to CONSTANT_VK
5 years ago
Jens Steube
5ee033673c
Disable name mangling in NVRTC's PTX output and more
5 years ago
Jens Steube
9faba41848
Use nvrtc to compile PTX (resulting PTX not yet used)
5 years ago
Jens Steube
4b986de5fb
Prepare native CUDA hybrid integration
5 years ago
Jens Steube
38c1029f2e
Need volatile for IRIS GPU on Mac OSX for -m 2500 and -m 2501
5 years ago
jsteube
7c6970dbdd
Remove hard-coded static keyword from OpenCL kernels
5 years ago
jsteube
b7cdca09c4
OpenCL Runtime: Workaround JiT compiler error on ROCM 2.3 driver if the 'inline' keyword is used in function declaration
5 years ago
jsteube
d7d716f3ab
Make it easier to include OpenCL kernels into modules
5 years ago
jsteube
9ced13cc94
Get rid of CONSTSPEC macro in OpenCL kernels
5 years ago
Jens Steube
0fb3b3c83e
Declare internal functions in OpenCL kernels as static
5 years ago
jsteube
66d94b06e4
Get rid of src/rp_kernel_on_cpu.c and src/rp_kernel_on_cpu_optimized.c and use OpenCL emulated kernel version
5 years ago
jsteube
e80b1838e8
Rename some functions in inc_common.cl to avoid conflicts with bitops.c
5 years ago
jsteube
adeeaee84a
Replace __kernel, __constant, __global and __local qualifiers with macros for better control
5 years ago
jsteube
7d4bea41a0
Get rid of OpenCL/inc_hash_constants.h and OpenCL/inc_hash_functions.cl
5 years ago