This is a quick reply to "Hiding text inside text" by Edwin Wagha: https://youtu.be/4LhL9ypdbQU
Edwin Wagha uses strings of 8 chars, where each char is either ZWS (zero-width space) or ZWNJ (zero-width non-joiner), to represent any single-byte char.
This allows encoding 256 distinct characters using 256 distinct 8-char strings.
Additionally, a ZWJ (zero-width joiner) is inserted before and after the secret message (to aid parsing).
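To make the numbers concrete, here is a minimal sketch of the 8-chars-per-byte scheme as I understand it from the description above. It is not Edwin's actual code: the function names are mine, and which of ZWS/ZWNJ stands for bit 0 or bit 1 is an assumption.

```python
# Sketch only: hide_bytes / reveal_bytes are hypothetical names,
# and the bit assignment (ZWS = 0, ZWNJ = 1) is an assumption.
ZWS, ZWNJ, ZWJ = "\u200b", "\u200c", "\u200d"

def hide_bytes(secret):
    # One byte per char (latin-1), 8 zero-width chars per byte
    bits = "".join(format(b, "08b") for b in secret.encode("latin-1"))
    body = bits.replace("0", ZWS).replace("1", ZWNJ)
    return ZWJ + body + ZWJ  # ZWJ marks start and end of the payload

def reveal_bytes(hidden):
    body = hidden.strip(ZWJ)  # drop the two delimiters
    bits = body.replace(ZWS, "0").replace(ZWNJ, "1")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("latin-1")

hidden = hide_bytes("hi")
print(len(hidden))  # 18 = 2 chars * 8 + 2 delimiters
```

So a secret of n single-byte chars becomes 8n + 2 invisible chars, which is the "8x (+2)" figure.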
If we limit the scope of the encoding to [letters and space] we only need 27 distinct strings.
This can be done with the same chars chosen by Edwin: a string of len 3 made with [ZWS, ZWNJ, ZWJ] → 3³ = 27 distinct strings.
Therefore, instead of 8x (+2), the encoded message is only 3x as long as the secret message.
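A quick sketch of this fixed-length version, using the digits [0, 1, 2] as stand-ins for [ZWS, ZWNJ, ZWJ]. The order in which codes are assigned (space first, then A to Z) and the names `encode_fixed` / `decode_fixed` are my own assumptions; any one-to-one assignment works.

```python
from itertools import product
import string

# 27 symbols: space + A..Z (assignment order is an arbitrary choice)
SYMBOLS = " " + string.ascii_uppercase
# All 3**3 = 27 strings of len 3 over the digits 0, 1, 2
CODES = ["".join(p) for p in product("012", repeat=3)]

fixed_encrypt = dict(zip(SYMBOLS, CODES))
fixed_decrypt = {code: sym for sym, code in fixed_encrypt.items()}

def encode_fixed(text):
    # Chars outside [letters and space] are skipped
    return "".join(fixed_encrypt[c] for c in text.upper() if c in fixed_encrypt)

def decode_fixed(digits):
    # Every symbol is exactly 3 digits, so just read in steps of 3
    return "".join(fixed_decrypt[digits[i:i + 3]] for i in range(0, len(digits), 3))

print(encode_fixed("a b"))  # 001000002
```

Since every code has the same length, no delimiter is needed between symbols.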
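We can try to optimize further: since some letters are used more often than others, we can design an encoding that uses fewer chars for frequent letters. The best base-3 variable-len encoding I came up with gives 6 symbols a shorter code (len 2), 3 symbols an equal one (len 3) and 18 symbols a longer one (len 4). No code is a prefix of another, so the message can be decoded without separators.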
Therefore the size of the secret message varies between 2x and 4x:
# Example with [0, 1, 2] in place of [ZWS, ZWNJ, ZWJ]
decrypt_map = { "01" :" ", "02" :"E", "10" :"T",
"12" :"A", "20" :"O", "21" :"I", "000" :"N",
"111" :"S", "222" :"H", "0010":"R", "0011":"D",
"0012":"L", "0020":"U", "0021":"C", "0022":"M",
"1100":"F", "1101":"W", "1102":"Y", "1120":"P",
"1121":"V", "1122":"B", "2200":"G", "2201":"K",
"2202":"J", "2210":"X", "2211":"Q", "2212":"Z" }
# In my tests long English texts come out around 2.7x, but it varies a lot
example1 = "hello edwin wagha how are you your video is cool" # 2.812x
example2 = "the quick brown fox jumps over the lazy dog s back" # 3.020x

# Section above + this one = demo Python script
# Invert the decoding table to get the encoding table
encrypt_map = {v: k for k, v in decrypt_map.items()}

def encrypt(str1):
    # Encode each known symbol; anything outside [letters and space] is skipped
    result = ""
    for c in str1:
        try:
            result += encrypt_map[c.upper()]
        except KeyError:
            pass
    return result

def decrypt(str1):
    # Codes are prefix-free: emit a symbol as soon as the buffer matches one
    buf = ""
    result = ""
    for c in str1:
        buf = (buf + c) if (len(buf) < 4) else c
        v = decrypt_map.get(buf)
        result += (v or "")
        if v:
            buf = ""
    return result

str1 = "hello edwin wagha how are you your video is cool"
str2 = encrypt(str1)
str3 = decrypt(str2)
str4 = "\nEncryption size: " + str(round(len(str2)/len(str1), 3)) + "x\n"
print(str1, str2, str3, str4, sep="\n")

With base-6 you can encode 6² = 36 distinct symbols in 2-char strings. You just need 6 different invisible zero-width chars. You can also follow the example in the section above to make a base-6 variable-len encoding.
Need more? Base-7 len-2 → 7² = 49 distinct strings, base-8 len-2 → 8² = 64 distinct strings...
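The same idea for any base can be sketched like this. `make_codebook` is a hypothetical helper of mine; it works on plain digits, so you still have to pick which invisible chars stand for each digit, and it only handles bases up to 10.

```python
from itertools import product
import string

def make_codebook(symbols, base):
    # Hypothetical helper: fixed-length base-N codes written with digits 0..base-1
    if not 2 <= base <= 10:
        raise ValueError("this sketch only handles bases 2..10")
    digits = string.digits[:base]
    # Smallest len such that base**len covers every symbol
    length = 1
    while base ** length < len(symbols):
        length += 1
    codes = ("".join(p) for p in product(digits, repeat=length))
    return dict(zip(symbols, codes)), length

# Base-6: 26 letters + 10 digits = 36 symbols fit in len-2 codes
book, length = make_codebook(string.ascii_uppercase + string.digits, 6)
print(length, len(book))  # 2 36
```

With base 3 and the 27 symbols [letters and space] the same helper gives len 3 again.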