From febb6692e097ba5915dd2167ec77829591be4665 Mon Sep 17 00:00:00 2001
From: philsmd <921533+philsmd@users.noreply.github.com>
Date: Fri, 3 Jan 2020 11:41:10 +0100
Subject: [PATCH] fixes #2121: explain the utf16-le / utf16-be limitation in
 docs/limits.txt

---
 docs/limits.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/limits.txt b/docs/limits.txt
index 884a781be..0ee928080 100644
--- a/docs/limits.txt
+++ b/docs/limits.txt
@@ -22,6 +22,17 @@ Important: That does not mean UTF-16 file content, which is fully supported.
 
 It only means the filename itself.
 
+##
+## Hashing algorithms that internally use UTF-16 characters can, in special cases, produce false negatives
+##
+
+The UTF-16 conversion implemented in the kernel code is very basic. For performance reasons, it does not
+implement all of the encoding rules required to correctly convert input (for instance, ASCII or UTF-8)
+to UTF-16LE (or UTF-16BE).
+
+The conversion most likely fails for multi-byte characters, because the kernel conversion code essentially
+just pairs every input byte with a zero byte instead of decoding the character first.
+
 ##
 ## The use of --keep-guessing eventually skips reporting duplicate passwords
 ##