Commit Graph

171 Commits (470e844e5d00a54ecc517f9c407e50ac837faa10)

Author SHA1 Message Date
Rosen Penev 98e17d5774 Run through clang-tidy's readability-uppercase-literal-suffix
5 years ago
Gabriele Gristina ae62e597ce (backend) remove unused *rc* vars and cleanup
5 years ago
Jens Steube a7fd1e40f8 Merge pull request #2075 from matrix/zlib_support_2
5 years ago
Gabriele Gristina 2db6dfcd4e fix HCFILE with potfile BUG and something else related to HCFILE wrong usage
5 years ago
Gabriele Gristina ea786f715f avoid logical negation operator
5 years ago
Gabriele Gristina 3161aec3da fix the comments :)
5 years ago
Gabriele Gristina 5679ca3344 Rewrite hc_fopen to better handling file descriptor locking/unlocking functions, saving kernels binary from plain to gzip format
5 years ago
Gabriele Gristina caf34e0e83 Fix some *print* format arguments
5 years ago
Gabriele Gristina 5d3ed3e754 Remove union from HCFILE, using std file ops in ocl_check_dri, remove debug comments
5 years ago
Gabriele Gristina c2e634c426 switch is_gzip from short to bool
5 years ago
Gabriele Gristina 481c752456 No more compress functions, update example.dict.gz, remove some comments
5 years ago
Gabriele Gristina 398c89c75c switch almost all FILE ops, potfile is the only missing
5 years ago
Jens Steube 2cda236a18 OpenCL Runtime: Do not run a shared- and constant-memory size check if their memory type is of type global memory (typically CPU)
5 years ago
Jens Steube 6dfb474adf OpenCL Runtime: Do not run a shared- and constant-memory size check if their memory type is of type global memory (typically CPU)
5 years ago
Gabriele Gristina b2529af172 remove original commented code
5 years ago
Gabriele Gristina 6cb4abd526 Add zlib support v2
5 years ago
Jens Steube 955bfeaa14 Improve performance of bitsliced algorithms on ROCm
5 years ago
Jens Steube 5e0eb288c9 Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL
5 years ago
Jens Steube c2fc849e2c Fix minimum threads_per_block check
5 years ago
Jens Steube 0568c0746a Emulate effect of reqd_work_group_size() in CUDA
5 years ago
Jens Steube 44ecc83d82 Do some CUDA and NVRTC version checks on startup
5 years ago
Jens Steube 03ed89684e Use --restrict nvrtc option by default
5 years ago
Jens Steube 87c336e822 Fix format warning in backend.c
5 years ago
Jens Steube 1f6c82b6d1 Add hc_cuModuleLoadDataExLog wrapper function for more detailed error logging from CUDA
5 years ago
Jens Steube ce8a6fde0a Fix status screen current password query
5 years ago
Jens Steube f84eaa2e4d Fix bitsliced algorithm brute-force with CUDA
5 years ago
Jens Steube 523e0f7151 Fix free unallocated memory in case OpenCL initialization failed
5 years ago
Jens Steube bca03bb7ed CUDA offers a nice way to query available device memory, no need to brute force
5 years ago
Jens Steube a6bc1d3cc0 Experimental kernel-thread autotuner
5 years ago
Jens Steube d59474fded Testwise unlock full thread count on NVidia
5 years ago
Jens Steube d378aa7ab9 Show host memory requirement on startup
5 years ago
Jens Steube 46f737c5af Use real constant memory on CUDA
5 years ago
Jens Steube 5d14a59304 Need 3.x nvrtc minimum
5 years ago
Jens Steube 54feb62e94 brute-force nvrtc .dll name
5 years ago
Jens Steube a2b5981303 Fix some library names
5 years ago
Jens Steube be8f29ca39 Only warn about broken NVIDIA driver
5 years ago
Jens Steube 39e150fc1e Use xxx_v2 CUDA symbols
5 years ago
Jens Steube 33028314f0 Add hc_cuCtxSetCacheConfig()
5 years ago
Jens Steube fb82bfc169 Improve thread handling based on FIXED_LOCAL_SIZE
5 years ago
Jens Steube 3a3df091c7 Fix CUDA num_elements
5 years ago
Jens Steube 363e789b89 Assume local nvrtc.dll and cuda.dll on windows
5 years ago
Jens Steube a7d04adba3 Fix opencl_devices_active and backend_devices_active
5 years ago
Jens Steube 8ff8c5d536 Add LOCAL_VK to make use of __shared__
5 years ago
Jens Steube bbed0cd67a Fix test.sh and bitsliced algos
5 years ago
Jens Steube d0bd33c9d1 Rename CONSTANT_AS to CONSTANT_VK
5 years ago
Jens Steube 64c495dfa5 Use CUDA stream for all cuLaunchKernel() invocations
5 years ago
Jens Steube d94f582097 Replace CEILDIV() with round_up_multiple_64()
5 years ago
Jens Steube e9c04c2446 More CUDA implementation
5 years ago
Jens Steube 08dc1acc02 More CUDA rewrites
5 years ago
Jens Steube ec9925f3b1 Warnings self-check and autotune with CUDA
5 years ago
Jens Steube 4df00033d7 Prepare CUDA events
5 years ago
Jens Steube f2948460c9 Some first kernel invocations
5 years ago
Jens Steube 5ee033673c Disable name mangling in NVRTC's PTX output and more
5 years ago
Jens Steube 503304f36a Add some first CUDA device memory allocations and host buffer copies
5 years ago
Jens Steube 50a6e720ca More OpenCL variables rename
5 years ago
Jens Steube af8e317cf4 Begin renaming some OpenCL only variables
5 years ago
Jens Steube a6fa7a2749 Add support for some first CUDA module loader
5 years ago
Jens Steube 456c57a6d0 Set vector width size for CUDA
5 years ago
Jens Steube 3c4f4df771 Rename some more variables
5 years ago
Jens Steube 495d89f831 Find alias devices across different backend API's
5 years ago
Jens Steube 6fd936b43a Removed --opencl-platforms filter in order to force backend device numbers to stay constant
5 years ago
Jens Steube e3500ff4aa Add CUDA device attributes to -I
5 years ago
Jens Steube d862458ab5 Begin renaming API specific variables in backend section
5 years ago
Jens Steube d73c0ac8a9 More CUDA attribute queries
5 years ago
Jens Steube a415422123 Initialize CUDA devices and some first attribute queries
5 years ago
Jens Steube 58213c81d6 Add vector datatypes operators
5 years ago
Jens Steube 052e42ccef Fix CUDA_ARCH value
5 years ago
Jens Steube 06171958ee Add --gpu-architecture to NVRTC build options
5 years ago
Jens Steube 9faba41848 Use nvrtc to compile PTX (resulting PTX not yet used)
5 years ago
Jens Steube 4045e60021 Add nvrtc wrapper for later use
5 years ago
Jens Steube 4b986de5fb Prepare native CUDA hybrid integration
5 years ago