Update default hash settings to 64MiB:3:4 for Argon2 in -m 70000, following RFC 9106 recommendations.
Add option OPTS_TYPE_THREAD_MULTI_DISABLE: allows plugin developers to disable scaling the password candidate batch size based on device thread count. This can be useful for super slow hash algorithms that utilize threads differently, e.g., when the algorithm allows parallelization. Note: thread count for the device can still be set normally.
Add options OPTI_TYPE_SLOW_HASH_DIMY_INIT/LOOP/COMP: enable 2D launches for slow hash init/loop/comp kernel with dimensions X and Y. The Y value must be set via salt->salt_dimy attribute.
Change autotune kernel-loops start value to the lowest multiple of the target hash iteration count, if kernel_loops_min permits.
Fixed a bug in autotune where kernel_threads_max was not respected during initial init and loop-prepare kernel runs.