Gabriele Gristina
|
5d3ed3e754
|
Remove union from HCFILE, using std file ops in ocl_check_dri, remove debug comments
|
2019-06-28 17:58:08 +02:00 |
|
Gabriele Gristina
|
c2e634c426
|
switch is_gzip from short to bool
|
2019-06-27 23:51:54 +02:00 |
|
Gabriele Gristina
|
481c752456
|
No more compress functions, update example.dict.gz, remove some comments
|
2019-06-27 20:18:47 +02:00 |
|
Gabriele Gristina
|
398c89c75c
|
switch almost all FILE ops, potfile is the only missing
|
2019-06-26 19:06:46 +02:00 |
|
Jens Steube
|
2cda236a18
|
OpenCL Runtime: Do not run a shared- and constant-memory size check if their memory type is of type global memory (typically CPU)
|
2019-06-22 16:01:38 +02:00 |
|
Jens Steube
|
6dfb474adf
|
OpenCL Runtime: Do not run a shared- and constant-memory size check if their memory type is of type global memory (typically CPU)
|
2019-06-22 16:00:48 +02:00 |
|
Gabriele Gristina
|
b2529af172
|
remove original commented code
|
2019-06-22 15:00:50 +02:00 |
|
Gabriele Gristina
|
6cb4abd526
|
Add zlib support v2
|
2019-06-21 21:56:38 +02:00 |
|
Jens Steube
|
955bfeaa14
|
Improve performance of bitsliced algorithms on ROCm
|
2019-06-19 16:35:52 +02:00 |
|
Jens Steube
|
5e0eb288c9
|
Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL
|
2019-06-16 18:01:26 +02:00 |
|
Jens Steube
|
c2fc849e2c
|
Fix minimum threads_per_block check
|
2019-06-06 20:46:20 +02:00 |
|
Jens Steube
|
0568c0746a
|
Emulate effect of reqd_work_group_size() in CUDA
|
2019-06-06 17:49:41 +02:00 |
|
Jens Steube
|
44ecc83d82
|
Do some CUDA and NVRTC version checks on startup
|
2019-06-05 10:53:48 +02:00 |
|
Jens Steube
|
03ed89684e
|
Use --restrict nvrtc option by default
|
2019-06-04 17:35:10 +02:00 |
|
Jens Steube
|
87c336e822
|
Fix format warning in backend.c
|
2019-06-03 13:41:52 +02:00 |
|
Jens Steube
|
1f6c82b6d1
|
Add hc_cuModuleLoadDataExLog wrapper function for more detailed error logging from CUDA
|
2019-06-01 07:47:30 +02:00 |
|
Jens Steube
|
ce8a6fde0a
|
Fix status screen current password query
|
2019-05-14 15:25:36 +02:00 |
|
Jens Steube
|
f84eaa2e4d
|
Fix bitsliced algorithm brute-force with CUDA
|
2019-05-14 14:08:27 +02:00 |
|
Jens Steube
|
523e0f7151
|
Fix free unallocated memory in case OpenCL initialization failed
|
2019-05-14 10:25:49 +02:00 |
|
Jens Steube
|
bca03bb7ed
|
CUDA offers a nice way to query available device memory, no need to brute force
|
2019-05-14 10:09:46 +02:00 |
|
Jens Steube
|
a6bc1d3cc0
|
Experimental kernel-thread autotuner
|
2019-05-11 11:58:18 +02:00 |
|
Jens Steube
|
d59474fded
|
Testwise unlock full thread count on NVIDIA
|
2019-05-10 17:27:15 +02:00 |
|
Jens Steube
|
d378aa7ab9
|
Show host memory requirement on startup
|
2019-05-10 16:37:49 +02:00 |
|
Jens Steube
|
46f737c5af
|
Use real constant memory on CUDA
|
2019-05-10 13:22:26 +02:00 |
|
Jens Steube
|
5d14a59304
|
Need 3.x nvrtc minimum
|
2019-05-10 10:11:12 +02:00 |
|
Jens Steube
|
54feb62e94
|
brute-force nvrtc .dll name
|
2019-05-09 22:17:13 +02:00 |
|
Jens Steube
|
a2b5981303
|
Fix some library names
|
2019-05-09 21:20:50 +02:00 |
|
Jens Steube
|
be8f29ca39
|
Only warn about broken NVIDIA driver
|
2019-05-09 16:30:08 +02:00 |
|
Jens Steube
|
39e150fc1e
|
Use xxx_v2 CUDA symbols
|
2019-05-09 14:37:14 +02:00 |
|
Jens Steube
|
33028314f0
|
Add hc_cuCtxSetCacheConfig()
|
2019-05-09 00:04:05 +02:00 |
|
Jens Steube
|
fb82bfc169
|
Improve thread handling based on FIXED_LOCAL_SIZE
|
2019-05-08 23:30:07 +02:00 |
|
Jens Steube
|
3a3df091c7
|
Fix CUDA num_elements
|
2019-05-08 22:42:52 +02:00 |
|
Jens Steube
|
363e789b89
|
Assume local nvrtc.dll and cuda.dll on Windows
|
2019-05-07 16:52:08 +02:00 |
|
Jens Steube
|
a7d04adba3
|
Fix opencl_devices_active and backend_devices_active
|
2019-05-07 14:17:29 +02:00 |
|
Jens Steube
|
8ff8c5d536
|
Add LOCAL_VK to make use of __shared__
|
2019-05-07 09:01:32 +02:00 |
|
Jens Steube
|
bbed0cd67a
|
Fix test.sh and bitsliced algos
|
2019-05-06 15:06:02 +02:00 |
|
Jens Steube
|
d0bd33c9d1
|
Rename CONSTANT_AS to CONSTANT_VK
|
2019-05-06 14:34:16 +02:00 |
|
Jens Steube
|
64c495dfa5
|
Use CUDA stream for all cuLaunchKernel() invocations
|
2019-05-06 11:23:34 +02:00 |
|
Jens Steube
|
d94f582097
|
Replace CEILDIV() with round_up_multiple_64()
|
2019-05-06 09:36:07 +02:00 |
|
Jens Steube
|
e9c04c2446
|
More CUDA implementation
|
2019-05-05 21:15:46 +02:00 |
|
Jens Steube
|
08dc1acc02
|
More CUDA rewrites
|
2019-05-05 11:57:54 +02:00 |
|
Jens Steube
|
ec9925f3b1
|
Warnings self-check and autotune with CUDA
|
2019-05-04 21:52:00 +02:00 |
|
Jens Steube
|
4df00033d7
|
Prepare CUDA events
|
2019-05-04 10:44:03 +02:00 |
|
Jens Steube
|
f2948460c9
|
Some first kernel invocations
|
2019-05-04 10:13:43 +02:00 |
|
Jens Steube
|
5ee033673c
|
Disable name mangling in NVRTC's PTX output and more
|
2019-05-03 15:50:07 +02:00 |
|
Jens Steube
|
503304f36a
|
Add some first CUDA device memory allocations and host buffer copies
|
2019-05-03 12:07:06 +02:00 |
|
Jens Steube
|
50a6e720ca
|
More OpenCL variables rename
|
2019-05-02 17:30:46 +02:00 |
|
Jens Steube
|
af8e317cf4
|
Begin renaming some OpenCL only variables
|
2019-05-02 17:12:59 +02:00 |
|
Jens Steube
|
a6fa7a2749
|
Add support for some first CUDA module loader
|
2019-05-02 14:58:52 +02:00 |
|
Jens Steube
|
456c57a6d0
|
Set vector width size for CUDA
|
2019-05-01 18:20:19 +02:00 |
|
Jens Steube
|
3c4f4df771
|
Rename some more variables
|
2019-05-01 15:52:56 +02:00 |
|
Jens Steube
|
495d89f831
|
Find alias devices across different backend APIs
|
2019-05-01 07:27:10 +02:00 |
|
Jens Steube
|
6fd936b43a
|
Removed --opencl-platforms filter in order to force backend device numbers to stay constant
|
2019-04-30 16:24:13 +02:00 |
|
Jens Steube
|
e3500ff4aa
|
Add CUDA device attributes to -I
|
2019-04-30 13:38:44 +02:00 |
|
Jens Steube
|
d862458ab5
|
Begin renaming API specific variables in backend section
|
2019-04-29 10:21:59 +02:00 |
|
Jens Steube
|
d73c0ac8a9
|
More CUDA attribute queries
|
2019-04-28 18:54:26 +02:00 |
|
Jens Steube
|
a415422123
|
Initialize CUDA devices and some first attribute queries
|
2019-04-28 14:45:50 +02:00 |
|
Jens Steube
|
58213c81d6
|
Add vector datatypes operators
|
2019-04-26 22:07:56 +02:00 |
|
Jens Steube
|
052e42ccef
|
Fix CUDA_ARCH value
|
2019-04-26 15:14:48 +02:00 |
|
Jens Steube
|
06171958ee
|
Add --gpu-architecture to NVRTC build options
|
2019-04-26 15:10:02 +02:00 |
|
Jens Steube
|
9faba41848
|
Use nvrtc to compile PTX (resulting PTX not yet used)
|
2019-04-26 13:28:44 +02:00 |
|
Jens Steube
|
4045e60021
|
Add nvrtc wrapper for later use
|
2019-04-26 10:03:16 +02:00 |
|
Jens Steube
|
4b986de5fb
|
Prepare native CUDA hybrid integration
|
2019-04-25 14:45:17 +02:00 |
|