Jens Steube
|
f96e35649d
|
Change bitsliced kernels from 3d to 2d invocation mode for slightly better performance
|
2020-02-22 07:59:58 +01:00 |
|
Jens Steube
|
d9473358ef
|
Add support for OPTS_TYPE_LOOP_EXTENDED kernel for special cases like VeraCrypt
|
2020-02-20 16:00:21 +01:00 |
|
Jens Steube
|
c90d83c3eb
|
Prepare for UNROLL whitelisting
|
2020-02-15 12:44:12 +01:00 |
|
Jens Steube
|
4788c61dd2
|
Add OPTI_TYPE_REGISTER_LIMIT flag to enable register limiting in CUDA
|
2020-02-04 21:53:27 +01:00 |
|
Jens Steube
|
17a64f5019
|
Set a fixed register count maximum for CUDA kernels. This prevents a kernel from going out of control and negatively affecting other kernels built from the same source code (for instance 16600)
|
2020-02-04 18:31:23 +01:00 |
|
Jens Steube
|
c40f474c2e
|
Add special module option to indicate the kernel is using dynamic shared memory
|
2020-02-02 11:24:38 +01:00 |
|
Jens Steube
|
fb7bb04587
|
Do not use dynamic shared memory if dynamic_local_mem_size is a multiple of local_mem_size
|
2020-02-02 11:15:37 +01:00 |
|
Jens Steube
|
96a2c36f53
|
Reduce CUDA Toolkit minimum version to 9.0 (even 8.0 should be sufficient)
|
2020-02-01 19:32:03 +01:00 |
|
Jens Steube
|
aef53f7e10
|
OpenCL Runtime: Allow the kernel to access post-48k shared memory region on CUDA. Requires both module and kernel preparation
|
2020-02-01 14:27:42 +01:00 |
|
Jens Steube
|
1fc37c25f9
|
OpenCL Kernels: Moved "gpu_decompress", "gpu_memset" and "gpu_atinit" into new OpenCL/shared.cl in order to reduce compile time
|
2020-02-01 09:00:48 +01:00 |
|
Jens Steube
|
08163501cf
|
Add option to disable cubin cache binaries and move some redundant kernel load code into a specific function
|
2020-01-31 17:50:53 +01:00 |
|
Jens Steube
|
01085cdab2
|
Move cujit_opts allocation closer to the calling functions because CUDA library needs it reinitialized after each use
|
2020-01-31 11:59:59 +01:00 |
|
Jens Steube
|
346637ec43
|
Improve cujit logging
|
2020-01-30 11:44:57 +01:00 |
|
Jens Steube
|
66ae5125ce
|
Cache cubin instead of PTX to decrease startup time
|
2020-01-29 15:56:36 +01:00 |
|
Jens Steube
|
cc4fd48ace
|
Optimize hook buffer size to be copied
|
2020-01-26 20:31:38 +01:00 |
|
Jens Steube
|
041a777025
|
OpenCL Runtime: Unlocked maximum thread count for NVIDIA GPU
|
2020-01-24 13:24:19 +01:00 |
|
Jens Steube
|
ccacc508cb
|
Reenabled support for Intel GPU OpenCL runtime (Beignet and NEO) because a workaround was found (force -cl-std=CL2.0)
|
2020-01-24 10:52:12 +01:00 |
|
Jens Steube
|
fe372dffb7
|
Add RDNA ISA instructions test for ADD/ADDC/SUB/SUBB
|
2020-01-06 12:49:57 +01:00 |
|
Jens Steube
|
df5e2361d3
|
Disable inline assembly instruction tests for CUDA and refer to documented requirements
|
2020-01-03 12:27:27 +01:00 |
|
Jens Steube
|
d0fb171da9
|
Added new options --backend-ignore-cuda and --backend-ignore-opencl, to prevent the CUDA and/or OpenCL interface from being loaded on startup
|
2020-01-03 11:51:24 +01:00 |
|
Jens Steube
|
b3690fcd05
|
Backport instruction test cache from CUDA to OpenCL
|
2020-01-03 11:06:55 +01:00 |
|
Jens Steube
|
2b4d0656d5
|
Cache inline assembly instruction check results for same device types
|
2020-01-03 10:44:10 +01:00 |
|
Jens Steube
|
5d1d48f5d7
|
Do not check for COPY_PW limits in outside kernels
|
2019-12-31 21:25:37 +01:00 |
|
Jens Steube
|
53254b45aa
|
Backport inc_ecc_secp256k1 inline assembly code for AMD ISA
|
2019-12-05 15:43:01 +01:00 |
|
Jens Steube
|
bfd95d42f6
|
OpenCL Runtime: Reenabled support for Intel GPU OpenCL runtime
|
2019-11-27 10:28:12 +01:00 |
|
Jens Steube
|
2884bded32
|
Initialize some variables to make scan-build happy
|
2019-11-26 10:55:57 +01:00 |
|
Jens Steube
|
00b9f4c557
|
Add kernel accel minimum limit check
|
2019-11-19 20:38:31 +01:00 |
|
Jens Steube
|
424777ae28
|
Add kernel accel limiter based on kernel threads to reduce host memory requirements
|
2019-11-19 17:59:50 +01:00 |
|
Jens Steube
|
f7c3ced548
|
Fix use of calloc() in backend.c
|
2019-11-17 19:59:23 +01:00 |
|
Jens Steube
|
c4dd020685
|
Add support for NVIDIA Jetson AGX Xavier developer kit
|
2019-11-16 17:27:35 +01:00 |
|
Jens Steube
|
53e96a12a0
|
Improve automatic calculation of hook threads value
|
2019-11-16 11:48:25 +01:00 |
|
Jens Steube
|
fe8c17f4c7
|
Support pause/abort in hooks
|
2019-11-15 14:42:34 +01:00 |
|
Jens Steube
|
9c2c73c6cc
|
Clear hook buffers after full kernel chain is finished
|
2019-11-15 10:12:33 +01:00 |
|
Jens Steube
|
7458e4f487
|
Add per-device available memory test of static data (hashlist, ruleset) before test of dynamic data (-n based)
|
2019-11-14 11:31:00 +01:00 |
|
Rosen Penev
|
a6edb84157
|
Fix extra semicolon warnings
These macros don't need a ; but since ; is used, make the macros more
robust by enclosing them in a do while loop.
|
2019-11-09 16:42:50 -08:00 |
|
Jens Steube
|
c12470b978
|
Merge pull request #2188 from neheb/cast
Add casts where needed in C++ mode
|
2019-11-05 12:28:21 +01:00 |
|
Jens Steube
|
a8555fa048
|
Support use of all available CPU cores for hash-mode specific hooks
|
2019-11-03 12:05:52 +01:00 |
|
Rosen Penev
|
fd8150769d
|
Add casts where needed in C++ mode
Otherwise, -fpermissive must be passed.
|
2019-09-11 18:05:01 -07:00 |
|
Jens Steube
|
57a149276c
|
Do alias check only in case both CUDA and OpenCL devices were detected
|
2019-08-06 12:44:39 +02:00 |
|
Jens Steube
|
97c9e86d15
|
Filehandling: Print a truncation warning in case an oversized line was detected
|
2019-08-06 12:22:24 +02:00 |
|
Rosen Penev
|
dca1a86315
|
Run through Clang's bugprone-macro-parentheses
|
2019-08-03 22:37:38 -07:00 |
|
Rosen Penev
|
6dc72ebcc5
|
Run through Clang's readability-else-after-return
There's no need for a return statement in an else path. Just take it out.
Simplifies the code slightly.
|
2019-08-03 22:37:38 -07:00 |
|
Rosen Penev
|
fb75164126
|
Run through Clang's google-readability-casting
Removes casts where the type is identical.
|
2019-08-03 22:37:38 -07:00 |
|
Rosen Penev
|
2f76326c37
|
Run through Clang's android-cloexec checkers
This is mainly useful with SELinux.
|
2019-08-03 22:37:37 -07:00 |
|
Rosen Penev
|
98e17d5774
|
Run through clang-tidy's readability-uppercase-literal-suffix
1 and l are visually similar and can be confused. This also changes u to U
for consistency.
|
2019-08-03 19:59:17 -07:00 |
|
Gabriele Gristina
|
ae62e597ce
|
(backend) remove unused *rc* vars and cleanup
|
2019-07-10 16:13:11 +02:00 |
|
Jens Steube
|
a7fd1e40f8
|
Merge pull request #2075 from matrix/zlib_support_2
Add zlib support for loading hashlist/wordlist (v2)
|
2019-07-10 10:56:06 +02:00 |
|
Gabriele Gristina
|
2db6dfcd4e
|
Fix HCFILE bug with potfile and other incorrect HCFILE usage
|
2019-07-02 18:27:36 +02:00 |
|
Gabriele Gristina
|
ea786f715f
|
avoid logical negation operator
|
2019-07-02 15:52:17 +02:00 |
|
Gabriele Gristina
|
3161aec3da
|
fix the comments :)
|
2019-07-01 17:27:08 +02:00 |
|
Gabriele Gristina
|
5679ca3344
|
Rewrite hc_fopen to better handle file descriptor locking/unlocking functions; save kernel binaries in gzip format instead of plain
|
2019-07-01 01:30:24 +02:00 |
|
Gabriele Gristina
|
caf34e0e83
|
Fix some *print* format arguments
|
2019-06-29 17:49:57 +02:00 |
|
Gabriele Gristina
|
5d3ed3e754
|
Remove union from HCFILE, using std file ops in ocl_check_dri, remove debug comments
|
2019-06-28 17:58:08 +02:00 |
|
Gabriele Gristina
|
c2e634c426
|
switch is_gzip from short to bool
|
2019-06-27 23:51:54 +02:00 |
|
Gabriele Gristina
|
481c752456
|
No more compress functions, update example.dict.gz, remove some comments
|
2019-06-27 20:18:47 +02:00 |
|
Gabriele Gristina
|
398c89c75c
|
Switch almost all FILE ops; potfile is the only one missing
|
2019-06-26 19:06:46 +02:00 |
|
Jens Steube
|
2cda236a18
|
OpenCL Runtime: Do not run a shared- and constant-memory size check if their memory type is of type global memory (typically CPU)
|
2019-06-22 16:01:38 +02:00 |
|
Jens Steube
|
6dfb474adf
|
OpenCL Runtime: Do not run a shared- and constant-memory size check if their memory type is of type global memory (typically CPU)
|
2019-06-22 16:00:48 +02:00 |
|
Gabriele Gristina
|
b2529af172
|
remove original commented code
|
2019-06-22 15:00:50 +02:00 |
|
Gabriele Gristina
|
6cb4abd526
|
Add zlib support v2
|
2019-06-21 21:56:38 +02:00 |
|
Jens Steube
|
955bfeaa14
|
Improve performance of bitsliced algorithms on ROCm
|
2019-06-19 16:35:52 +02:00 |
|
Jens Steube
|
5e0eb288c9
|
Use __launch_bounds__ in CUDA as replacement for reqd_work_group_size() in OpenCL
|
2019-06-16 18:01:26 +02:00 |
|
Jens Steube
|
c2fc849e2c
|
Fix minimum threads_per_block check
|
2019-06-06 20:46:20 +02:00 |
|
Jens Steube
|
0568c0746a
|
Emulate effect of reqd_work_group_size() in CUDA
|
2019-06-06 17:49:41 +02:00 |
|
Jens Steube
|
44ecc83d82
|
Do some CUDA and NVRTC version checks on startup
|
2019-06-05 10:53:48 +02:00 |
|
Jens Steube
|
03ed89684e
|
Use --restrict nvrtc option by default
|
2019-06-04 17:35:10 +02:00 |
|
Jens Steube
|
87c336e822
|
Fix format warning in backend.c
|
2019-06-03 13:41:52 +02:00 |
|
Jens Steube
|
1f6c82b6d1
|
Add hc_cuModuleLoadDataExLog wrapper function for more detailed error logging from CUDA
|
2019-06-01 07:47:30 +02:00 |
|
Jens Steube
|
ce8a6fde0a
|
Fix status screen current password query
|
2019-05-14 15:25:36 +02:00 |
|
Jens Steube
|
f84eaa2e4d
|
Fix bitsliced algorithm brute-force with CUDA
|
2019-05-14 14:08:27 +02:00 |
|
Jens Steube
|
523e0f7151
|
Fix freeing of unallocated memory in case OpenCL initialization failed
|
2019-05-14 10:25:49 +02:00 |
|
Jens Steube
|
bca03bb7ed
|
CUDA offers a nice way to query available device memory, no need to brute force
|
2019-05-14 10:09:46 +02:00 |
|
Jens Steube
|
a6bc1d3cc0
|
Experimental kernel-thread autotuner
|
2019-05-11 11:58:18 +02:00 |
|
Jens Steube
|
d59474fded
|
Unlock full thread count on NVIDIA as a test
|
2019-05-10 17:27:15 +02:00 |
|
Jens Steube
|
d378aa7ab9
|
Show host memory requirement on startup
|
2019-05-10 16:37:49 +02:00 |
|
Jens Steube
|
46f737c5af
|
Use real constant memory on CUDA
|
2019-05-10 13:22:26 +02:00 |
|
Jens Steube
|
5d14a59304
|
Need 3.x nvrtc minimum
|
2019-05-10 10:11:12 +02:00 |
|
Jens Steube
|
54feb62e94
|
brute-force nvrtc .dll name
|
2019-05-09 22:17:13 +02:00 |
|
Jens Steube
|
a2b5981303
|
Fix some library names
|
2019-05-09 21:20:50 +02:00 |
|
Jens Steube
|
be8f29ca39
|
Only warn about broken NVIDIA driver
|
2019-05-09 16:30:08 +02:00 |
|
Jens Steube
|
39e150fc1e
|
Use xxx_v2 CUDA symbols
|
2019-05-09 14:37:14 +02:00 |
|
Jens Steube
|
33028314f0
|
Add hc_cuCtxSetCacheConfig()
|
2019-05-09 00:04:05 +02:00 |
|
Jens Steube
|
fb82bfc169
|
Improve thread handling based on FIXED_LOCAL_SIZE
|
2019-05-08 23:30:07 +02:00 |
|
Jens Steube
|
3a3df091c7
|
Fix CUDA num_elements
|
2019-05-08 22:42:52 +02:00 |
|
Jens Steube
|
363e789b89
|
Assume local nvrtc.dll and cuda.dll on Windows
|
2019-05-07 16:52:08 +02:00 |
|
Jens Steube
|
a7d04adba3
|
Fix opencl_devices_active and backend_devices_active
|
2019-05-07 14:17:29 +02:00 |
|
Jens Steube
|
8ff8c5d536
|
Add LOCAL_VK to make use of __shared__
|
2019-05-07 09:01:32 +02:00 |
|
Jens Steube
|
bbed0cd67a
|
Fix test.sh and bitsliced algos
|
2019-05-06 15:06:02 +02:00 |
|
Jens Steube
|
d0bd33c9d1
|
Rename CONSTANT_AS to CONSTANT_VK
|
2019-05-06 14:34:16 +02:00 |
|
Jens Steube
|
64c495dfa5
|
Use CUDA stream for all cuLaunchKernel() invocations
|
2019-05-06 11:23:34 +02:00 |
|
Jens Steube
|
d94f582097
|
Replace CEILDIV() with round_up_multiple_64()
|
2019-05-06 09:36:07 +02:00 |
|
Jens Steube
|
e9c04c2446
|
More CUDA implementation
|
2019-05-05 21:15:46 +02:00 |
|
Jens Steube
|
08dc1acc02
|
More CUDA rewrites
|
2019-05-05 11:57:54 +02:00 |
|
Jens Steube
|
ec9925f3b1
|
Warnings self-check and autotune with CUDA
|
2019-05-04 21:52:00 +02:00 |
|
Jens Steube
|
4df00033d7
|
Prepare CUDA events
|
2019-05-04 10:44:03 +02:00 |
|
Jens Steube
|
f2948460c9
|
Some first kernel invocations
|
2019-05-04 10:13:43 +02:00 |
|
Jens Steube
|
5ee033673c
|
Disable name mangling in NVRTC's PTX output and more
|
2019-05-03 15:50:07 +02:00 |
|
Jens Steube
|
503304f36a
|
Add some first CUDA device memory allocations and host buffer copies
|
2019-05-03 12:07:06 +02:00 |
|
Jens Steube
|
50a6e720ca
|
More OpenCL variables rename
|
2019-05-02 17:30:46 +02:00 |
|
Jens Steube
|
af8e317cf4
|
Begin renaming some OpenCL only variables
|
2019-05-02 17:12:59 +02:00 |
|
Jens Steube
|
a6fa7a2749
|
Add support for some first CUDA module loader
|
2019-05-02 14:58:52 +02:00 |
|
Jens Steube
|
456c57a6d0
|
Set vector width size for CUDA
|
2019-05-01 18:20:19 +02:00 |
|
Jens Steube
|
3c4f4df771
|
Rename some more variables
|
2019-05-01 15:52:56 +02:00 |
|
Jens Steube
|
495d89f831
|
Find alias devices across different backend APIs
|
2019-05-01 07:27:10 +02:00 |
|
Jens Steube
|
6fd936b43a
|
Removed --opencl-platforms filter in order to force backend device numbers to stay constant
|
2019-04-30 16:24:13 +02:00 |
|
Jens Steube
|
e3500ff4aa
|
Add CUDA device attributes to -I
|
2019-04-30 13:38:44 +02:00 |
|
Jens Steube
|
d862458ab5
|
Begin renaming API specific variables in backend section
|
2019-04-29 10:21:59 +02:00 |
|
Jens Steube
|
d73c0ac8a9
|
More CUDA attribute queries
|
2019-04-28 18:54:26 +02:00 |
|
Jens Steube
|
a415422123
|
Initialize CUDA devices and some first attribute queries
|
2019-04-28 14:45:50 +02:00 |
|
Jens Steube
|
58213c81d6
|
Add vector datatypes operators
|
2019-04-26 22:07:56 +02:00 |
|
Jens Steube
|
052e42ccef
|
Fix CUDA_ARCH value
|
2019-04-26 15:14:48 +02:00 |
|
Jens Steube
|
06171958ee
|
Add --gpu-architecture to NVRTC build options
|
2019-04-26 15:10:02 +02:00 |
|
Jens Steube
|
9faba41848
|
Use nvrtc to compile PTX (resulting PTX not yet used)
|
2019-04-26 13:28:44 +02:00 |
|
Jens Steube
|
4045e60021
|
Add nvrtc wrapper for later use
|
2019-04-26 10:03:16 +02:00 |
|
Jens Steube
|
4b986de5fb
|
Prepare native CUDA hybrid integration
|
2019-04-25 14:45:17 +02:00 |
|