Jens Steube
|
63f6ca5114
|
Do not use __local memory for whirlpool if running on a device without physical shared memory
|
2017-09-05 16:45:20 +02:00 |
|
jsteube
|
8b0e7087c7
|
Fixed invalid optimization code in kernel 7700 that, depending on the input hash, caused the kernel to loop forever
|
2017-09-03 13:36:14 +02:00 |
|
jsteube
|
151dbc5349
|
Fix replace value in inc_hash_ripemd160.cl
|
2017-09-01 16:35:08 +02:00 |
|
jsteube
|
f859f466ef
|
Fix -m 8300 in -a 0 mode
|
2017-09-01 16:10:29 +02:00 |
|
jsteube
|
f5e04254dc
|
Fix -m 10800 in -a 0 mode
|
2017-09-01 16:06:42 +02:00 |
|
jsteube
|
d3b9febb30
|
Fix some double variable declarations
|
2017-08-30 16:29:25 +02:00 |
|
jsteube
|
40b57677cd
|
OpenCL Kernels: Reactivate Dalibor's XOR optimization on MD5_H on all MD5 based algorithms
|
2017-08-30 15:32:09 +02:00 |
|
jsteube
|
6d112aeb39
|
OpenCL Kernels: Rewrote Keccak kernel to run fully on registers and partially reversed last round
|
2017-08-30 13:27:04 +02:00 |
|
jsteube
|
a378abee66
|
Add missing NEW_SIMD_CODE in -m 6600
|
2017-08-29 12:01:43 +02:00 |
|
jsteube
|
1c169af0ad
|
Make -m 14100 a pure kernel only
|
2017-08-28 22:26:30 +02:00 |
|
jsteube
|
2b9888486e
|
Make -m 14000 a pure kernel only and add volatile for asm statement
|
2017-08-28 22:20:40 +02:00 |
|
jsteube
|
99f416435e
|
Fix invalid use of __constant in LM kernel
|
2017-08-28 19:40:51 +02:00 |
|
jsteube
|
6db2f4cc18
|
Fix typo
|
2017-08-28 15:54:47 +02:00 |
|
jsteube
|
918578bee1
|
Improve some NVIDIA-specific inline assembly
|
2017-08-28 14:15:47 +02:00 |
|
jsteube
|
9de1e557bb
|
More VEGA specific inline assembly to improve SHA1 based kernels
|
2017-08-28 09:24:06 +02:00 |
|
jsteube
|
a0be36d7b8
|
Fix compile error caused by __add3()
|
2017-08-27 19:46:17 +02:00 |
|
jsteube
|
00e38cc2c6
|
Add VEGA specific inline assembly to improve all MD4, MD5, SHA1 and SHA256 based kernels
|
2017-08-27 19:36:07 +02:00 |
|
jsteube
|
7bfd343ec9
|
Optimized rule_op_mangle_dupechar_last(), rule_op_mangle_rotate_right(), rule_op_mangle_rotate_left() and append_block1() in rule engine
|
2017-08-27 16:47:21 +02:00 |
|
jsteube
|
52a97fee75
|
Improve rule engine performance by improving append_0x80_xxx() performance by using precomputed values from constant memory
|
2017-08-27 14:22:20 +02:00 |
|
jsteube
|
3260000357
|
Fix whirlpool pure kernel in -a 0 mode
|
2017-08-26 19:51:37 +02:00 |
|
jsteube
|
e3810d054b
|
Fix some use of pw_t tmp variable
|
2017-08-26 19:48:38 +02:00 |
|
jsteube
|
5e01ff4c53
|
Refactor some u32x to u32 where u32x is not needed
|
2017-08-26 18:31:50 +02:00 |
|
jsteube
|
1aa76eac15
|
Refactor use of __constant to match up with the user selected attack mode
|
2017-08-25 17:52:55 +02:00 |
|
jsteube
|
938c281ee0
|
Resurrect some volatile variables in order to correctly compile pure kernels on AMD drivers
|
2017-08-25 17:06:07 +02:00 |
|
jsteube
|
48fbe81a09
|
Add more inline assembly for AMD ROCm
|
2017-08-25 16:33:00 +02:00 |
|
jsteube
|
6c619155c3
|
Workaround ROCm compiler error in aes256_ExpandKey()
|
2017-08-25 12:10:36 +02:00 |
|
jsteube
|
8c9c36ee2a
|
Fix out-of-bounds access in aesXXX_InvertKey()
|
2017-08-25 11:52:07 +02:00 |
|
jsteube
|
bed7e8f466
|
Remove unused truncate_block_xxx_xx() functions and update kernels to use the _S function
|
2017-08-24 20:07:43 +02:00 |
|
jsteube
|
51dc1c7db3
|
Use truncate_block_4x4_le_S() instead of truncate_block_4x4_le() in -m 6800
|
2017-08-24 19:53:29 +02:00 |
|
jsteube
|
9b73c464d2
|
Fix typo in macro
|
2017-08-24 17:19:16 +02:00 |
|
jsteube
|
7b443ee7ff
|
Optimize performance of rule_op_mangle_title_sep(), rule_op_mangle_purgechar() and rule_op_mangle_replace()
|
2017-08-24 17:14:33 +02:00 |
|
jsteube
|
0de41c2716
|
Some more optimizations for rule engine
|
2017-08-24 15:09:55 +02:00 |
|
jsteube
|
9f8c5a253d
|
More rule engine performance optimizations
|
2017-08-24 00:49:46 +02:00 |
|
jsteube
|
0783289e2f
|
Optimized a0 pure kernel for AMD
|
2017-08-23 13:40:22 +02:00 |
|
jsteube
|
a5659d5619
|
Also switch the optimized kernels' rule engine to make use of kernel rules in constant memory
|
2017-08-23 12:46:14 +02:00 |
|
jsteube
|
1d04de3a8e
|
Limit kernel-loops in straight-mode to 256, thereby allowing rules to be stored in constant memory
|
2017-08-23 12:43:59 +02:00 |
|
jsteube
|
51372438fe
|
Allow OpenCL kernel inline assembly if ROCm drivers were detected
|
2017-08-22 18:47:53 +02:00 |
|
jsteube
|
8853884f2a
|
Fix append_four_byte() in case sm8 is 0
|
2017-08-21 16:04:43 +02:00 |
|
jsteube
|
f32e113942
|
Add missing case in append_block() in pure kernel rule engine
|
2017-08-20 15:08:51 +02:00 |
|
jsteube
|
6907981f08
|
Backport current state of optimized kernel rule engine to CPU
|
2017-08-20 12:50:24 +02:00 |
|
jsteube
|
508f1562f2
|
Fix --stdout kernels, gid_max was still set to u32
|
2017-08-20 12:13:34 +02:00 |
|
jsteube
|
319799bbbf
|
Switch the datatypes of the variables responsible for work-item count and work-item size from u32 to u64
|
2017-08-19 16:39:22 +02:00 |
|
jsteube
|
d9c906e134
|
Move 0x80 to hardcoded position for sha3-256 bit in order to allow ROCm compiler to use registers only
|
2017-08-18 16:22:25 +02:00 |
|
jsteube
|
694cc0b740
|
Remove all calls to overwrite_at_* functions
|
2017-08-17 16:20:01 +02:00 |
|
jsteube
|
e984a829ea
|
Remove no longer needed overwrite_at_* functions
|
2017-08-17 15:53:09 +02:00 |
|
jsteube
|
bf299fe043
|
Optimized 3DES for ROCm
|
2017-08-17 14:03:55 +02:00 |
|
jsteube
|
ad1ce462d1
|
Get rid of ceil() in OpenCL kernels
|
2017-08-17 13:43:35 +02:00 |
|
jsteube
|
53f53fe014
|
Reduced number of required registers in SIP based on maximum possible esalt length
|
2017-08-17 12:16:49 +02:00 |
|
jsteube
|
9ee5da40e0
|
Work around ROCm compiler error for -m 15300
|
2017-08-17 11:25:34 +02:00 |
|
jsteube
|
88e995ddcf
|
Replace some SIMD related function calls
|
2017-08-17 11:18:39 +02:00 |
|