Commit Graph

26 Commits (master)

Author SHA1 Message Date
Gabriele Gristina b3d3b31c3e Metal: added support for vectors up to 4
2 years ago
Gabriele Gristina 9d36245d51 Kernels: Set the default Address Space Qualifier for any pointer, refactored / updated KERN_ATTR macros and rc4 cipher functions, in order to support Apple Metal runtime
2 years ago
Jens Steube 3f6c5a0042 Update module_unstable_warning() for -m 172xx on HIP
3 years ago
Jens Steube 5ffcaa980d HIP Backend: Added support to support HIP 4.4 and later, but added check to rule out older versions because they are incompatible
3 years ago
Jens Steube bdb7999f07 Switch HIP vector datatypes to OpenCL like ext_vector_type()
3 years ago
Jens Steube 0d8b4b74ad More CUDA special backports to HIP
3 years ago
Jens Steube 257098a301 Get rid of hip/hip_runtime.h dependancy
3 years ago
Jens Steube 45e65dd05a Backport more ROCm based optimizations to HIP
3 years ago
Jens Steube d130cc66b3 Optimize ISA code on HIP for V_ALIGNBIT_B32 using a different template for inline assembly
3 years ago
Jens Steube 674ca7d88f Add GPU threads to kernel cache checksum because it has an influence on HIP offline compile options
3 years ago
Jens Steube 20f7febd4c Workaround too intensive optimization in -m 2000 using HIPRTC
3 years ago
Jens Steube 1b84a9e53b Add missing backports from code base v6.2.2
3 years ago
Jens Steube a22f8149fc Merge branch 'HIP' into hip
3 years ago
reger-men ea7b74389f First draft HIP Version
3 years ago
Jens Steube 62fc3601bb Wrap atomic functions with hc_ prefix to have better platform control
3 years ago
Jens Steube 73cc3170f4 Fixed both false negative and false positive result in -m 3000 in -a 3 (affected only NVIDIA GPU)
3 years ago
Jens Steube 316095c151 Some more ROCm performance tuning
5 years ago
Jens Steube 5e0eb288c9 Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL
5 years ago
Jens Steube 7832c54452 Fix constant memory use of bfs_buf
5 years ago
Jens Steube 46f737c5af Use real constant memory on CUDA
5 years ago
Jens Steube d0bd33c9d1 Rename CONSTANT_AS to CONSTANT_VK
5 years ago
Jens Steube ec9925f3b1 Warnings self-check and autotune with CUDA
5 years ago
Jens Steube 3b7304c9d8 Fix recursion in inc_platform.cl
5 years ago
Jens Steube 89119bf24a Add missing inc_platform.h include
5 years ago
Jens Steube 9faba41848 Use nvrtc to compile PTX (resulting PTX not yet used)
5 years ago
Jens Steube 4b986de5fb Prepare native CUDA hybrid integration
5 years ago