@DavidBevi
Last active January 6, 2026 23:49
Optimizing hidden text into text

This is a quick reply to "Hiding text inside text" by Edwin Wagha: https://youtu.be/4LhL9ypdbQU

Hiding text into text

Edwin Wagha uses strings of 8 chars, where each char is either ZWS (zero-width space) or ZWNJ (zero-width non-joiner), to represent any single-byte char. Two choices per position give 2⁸ = 256 distinct 8-char strings, one for each of the 256 possible byte values.

Additionally ZWJ (zero-width joiner) is inserted before and after the secret message (to aid parsing).
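As a rough illustration of the scheme described above, here is a minimal sketch. The bit-to-char assignment (0 → ZWS, 1 → ZWNJ, most significant bit first) and the function names `hide_bytes` / `reveal_bytes` are my assumptions, not necessarily what the video uses.

```python
# Standard Unicode code points for the three zero-width chars
ZWS = "\u200b"   # zero-width space
ZWNJ = "\u200c"  # zero-width non-joiner
ZWJ = "\u200d"   # zero-width joiner

def hide_bytes(secret):
    """Encode each single-byte char as 8 invisible chars, wrapped in ZWJ."""
    # Assumption: bit 0 -> ZWS, bit 1 -> ZWNJ, most significant bit first
    bits = "".join(format(b, "08b") for b in secret.encode("latin-1"))
    body = bits.replace("0", ZWS).replace("1", ZWNJ)
    return ZWJ + body + ZWJ

def reveal_bytes(text):
    """Find the payload between the two ZWJ delimiters and decode it."""
    start = text.index(ZWJ) + 1
    end = text.index(ZWJ, start)
    bits = text[start:end].replace(ZWS, "0").replace(ZWNJ, "1")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("latin-1")

stego = "Hel" + hide_bytes("hi") + "lo"
print(len(hide_bytes("hi")))  # 2 bytes * 8 chars + 2 delimiters = 18
print(reveal_bytes(stego))    # hi
```

The payload costs 8 invisible chars per secret char, plus the 2 delimiters.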

Hide it more efficiently

If we limit the scope of the encoding to [letters and space] we only need 27 distinct strings. This can be done with the same chars chosen by Edwin: a string of len 3 where each char is one of [ZWS, ZWNJ, ZWJ] → 3³ = 27 distinct strings.

Therefore, instead of 8x (+2 delimiters), our secret message is only 3x longer than the plain text. ZWJ now carries data, so it no longer works as a delimiter.
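A minimal sketch of such a fixed-length base-3 encoding, with [0, 1, 2] standing in for the three zero-width chars. The alphabet order (space first, then A to Z) and the names `encode_fixed` / `decode_fixed` are arbitrary choices for illustration.

```python
# 27 symbols; the ordering is an arbitrary assumption
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_fixed(msg):
    """Write each symbol's index (0..26) as exactly 3 base-3 digits."""
    out = []
    for ch in msg.upper():
        i = ALPHABET.find(ch)
        if i < 0:
            continue  # chars outside the 27-symbol alphabet are dropped
        out.append(f"{i // 9}{(i // 3) % 3}{i % 3}")
    return "".join(out)

def decode_fixed(digits):
    """Read the digits back 3 at a time."""
    return "".join(ALPHABET[int(digits[i:i + 3], 3)]
                   for i in range(0, len(digits), 3))

code = encode_fixed("hi")
print(code)                # 022100
print(decode_fixed(code))  # HI
```

Because every code has the same length, decoding needs no delimiters: the receiver just reads 3 chars at a time.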

Even more efficiently?

We can try to optimize further: since some letters are used more often than others, we can design an encoding that uses fewer chars for frequent letters. The best base-3 variable-len encoding I came up with has 6 shorter codes (len 2), 3 equal-length codes (len 3) and 18 longer codes (len 4).
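A quick sanity check on those counts, reading "shorter", "equal" and "longer" as code lengths 2, 3 and 4. By the Kraft inequality, a prefix-free base-3 code with these lengths exists only if the sum of 3^(-len) over all codes is at most 1, and a sum of exactly 1 means no code space is wasted. The variable names below are just for illustration.

```python
from fractions import Fraction

BASE = 3
# code length -> number of symbols using that length
length_counts = {2: 6, 3: 3, 4: 18}

# Kraft sum: each code of length L uses up BASE**-L of the code space
kraft_sum = sum(Fraction(n, BASE ** length)
                for length, n in length_counts.items())

print(sum(length_counts.values()))  # 27 symbols in total
print(kraft_sum)                    # 1
```

The sum is 6/9 + 3/27 + 18/81 = 1, so this length profile fits a prefix-free code exactly, and the message can be decoded without separators.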

Therefore the size of the secret message varies between 2x and 4x:

# Example with [0, 1, 2] in place of [ZWS, ZWNJ, ZWJ]
decrypt_map = {   "01"  :" ",   "02"  :"E",   "10"  :"T",
    "12"  :"A",   "20"  :"O",   "21"  :"I",   "000" :"N",
    "111" :"S",   "222" :"H",   "0010":"R",   "0011":"D",
    "0012":"L",   "0020":"U",   "0021":"C",   "0022":"M",
    "1100":"F",   "1101":"W",   "1102":"Y",   "1120":"P",
    "1121":"V",   "1122":"B",   "2200":"G",   "2201":"K",
    "2202":"J",   "2210":"X",   "2211":"Q",   "2212":"Z" }

# In my tests, long English texts come out around 2.7x, but it varies a lot
example1 = "hello edwin wagha how are you your video is cool"   # 2.812x
example2 = "the quick brown fox jumps over the lazy dog s back" # 3.020x
# Section above + this one = demo Python script
# Invert decrypt_map: letter -> base-3 code
encrypt_map = {v: k for k, v in decrypt_map.items()}

def encrypt(str1):
    # Encode letters and spaces; any other character is silently dropped
    result = ""
    for c in str1:
        result += encrypt_map.get(c.upper(), "")
    return result

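def decrypt(str1):
    # Accumulate digits until the buffer matches a code.
    # No code is a prefix of another, so the first match is the right one.
    buf = ""
    result = ""
    for c in str1:
        # No code is longer than 4 digits: restart the buffer if it overflows
        buf = (buf+c) if (len(buf)<4) else (c)
        v = decrypt_map.get(buf)
        result += (v or "")
        if v: buf = ""
    return result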
    
str1 = "hello edwin wagha how are you your video is cool"
str2 = encrypt(str1)
str3 = decrypt(str2)
str4 = "\nEncryption size: " + str(round(len(str2)/len(str1),3)) + "x\n"

print(str1, str2, str3, str4, sep="\n")
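The demo above stops at [0, 1, 2] digit strings. A minimal sketch of the last step, swapping the digits for the real zero-width chars and tucking them into a visible cover text, could look like this. The insertion point (right after the first visible char) and the names `embed` / `extract` are my assumptions.

```python
ZWS, ZWNJ, ZWJ = "\u200b", "\u200c", "\u200d"

# Translation tables: digit <-> zero-width char
TO_INVISIBLE = str.maketrans("012", ZWS + ZWNJ + ZWJ)
TO_DIGITS = str.maketrans(ZWS + ZWNJ + ZWJ, "012")

def embed(cover, digits):
    """Insert the invisible payload after the first char of the cover text."""
    payload = digits.translate(TO_INVISIBLE)
    return cover[:1] + payload + cover[1:]

def extract(text):
    """Collect the zero-width chars and turn them back into base-3 digits."""
    hidden = "".join(c for c in text if c in (ZWS, ZWNJ, ZWJ))
    return hidden.translate(TO_DIGITS)

digits = "22221"  # "HI" under the table above: "222" + "21"
stego = embed("Nice video!", digits)
print(len(stego) - len("Nice video!"))  # 5 extra invisible chars
print(extract(stego))                   # 22221
```

This assumes the cover text contains no zero-width chars of its own, otherwise `extract` would pick them up as payload.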

And beyond?

With base-6 you can encode 6² = 36 symbols in 2-char strings. You just need 6 different invisible zero-width chars. You can also follow the example in the section above to make a base-6 variable-len encoding.

Need more? Base-7 len-2 → 7² = 49 distinct strings, base-8 len-2 → 8² = 64 distinct strings...
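The same idea for any base can be sketched as a small codebook generator. Digits '0'..'9' stand in for the invisible chars here (so the sketch is limited to base ≤ 10), and `min_length` / `build_codebook` are hypothetical helper names.

```python
from itertools import product
import string

def min_length(base, n_symbols):
    """Smallest fixed code length L such that base**L >= n_symbols."""
    if base < 2:
        raise ValueError("base must be at least 2")
    length = 1
    while base ** length < n_symbols:
        length += 1
    return length

def build_codebook(symbols, base):
    """Assign each symbol a distinct fixed-length string of base-N digits."""
    digits = string.digits[:base]  # stand-ins for `base` invisible chars
    length = min_length(base, len(symbols))
    codes = ("".join(p) for p in product(digits, repeat=length))
    return dict(zip(symbols, codes))

symbols = string.ascii_uppercase + string.digits  # 36 symbols
book = build_codebook(symbols, 6)
print(min_length(6, 36), min_length(7, 49), min_length(8, 64))  # 2 2 2
print(book["A"], book["G"], book["9"])                          # 00 10 55
```

For comparison, `min_length(2, 256)` gives 8 and `min_length(3, 27)` gives 3, matching the two schemes discussed earlier.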
