@gavinanderegg
Last active March 8, 2025 00:47
What Unicode glyphs are spaces in Python 3.13?
\t
\n
\x0b
\x0c
\r
\x1c
\x1d
\x1e
\x1f
SPACE
\x85
NO-BREAK SPACE
OGHAM SPACE MARK
EN QUAD
EM QUAD
EN SPACE
EM SPACE
THREE-PER-EM SPACE
FOUR-PER-EM SPACE
SIX-PER-EM SPACE
FIGURE SPACE
PUNCTUATION SPACE
THIN SPACE
HAIR SPACE
LINE SEPARATOR
PARAGRAPH SEPARATOR
NARROW NO-BREAK SPACE
MEDIUM MATHEMATICAL SPACE
IDEOGRAPHIC SPACE
import unicodedata

# Some SO answers that were helpful
# https://stackoverflow.com/questions/62292312/how-to-iterate-over-utf-8-in-python
# https://stackoverflow.com/questions/27415935/does-unicode-have-a-defined-maximum-number-of-code-points
# https://stackoverflow.com/questions/14960885/print-escaped-representation-of-a-str

# Highest valid Unicode code point.
unicode_max = 0x10ffff

glyphs = [chr(x) for x in range(0, unicode_max + 1)]

for glyph in glyphs:
    if glyph.isspace():
        try:
            print(unicodedata.name(glyph))
        except ValueError:
            # Characters without a Unicode name (e.g. control characters)
            # are printed in their escaped form instead.
            print(glyph.encode('unicode_escape').decode('ASCII'))
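
The listing above shows which characters count as spaces, but not why. The Python documentation for str.isspace() describes a character as whitespace when its general category is Zs or its bidirectional class is WS, B, or S. The sketch below is not part of the original script: whitespace_report is a hypothetical helper name, and the rule it illustrates is taken from the documentation and may differ in other Python versions. It prints both properties for each character so the reason can be read off directly.

```python
import sys
import unicodedata


def whitespace_report():
    """Return (code point, category, bidirectional class) for every
    character that str.isspace() accepts. Hypothetical helper name."""
    rows = []
    for code_point in range(sys.maxunicode + 1):
        ch = chr(code_point)
        if ch.isspace():
            rows.append(
                (code_point, unicodedata.category(ch), unicodedata.bidirectional(ch))
            )
    return rows


if __name__ == "__main__":
    for code_point, category, bidi in whitespace_report():
        print(f"U+{code_point:04X}  {category}  {bidi}")
```

With this output, entries such as \x1c through \x1f are easier to explain: they are control characters (category Cc), and it is their bidirectional class that makes isspace() accept them.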