Created
November 17, 2014 09:33
Some basic Python 2 encoding/decoding/Unicode examples
string_with_utf8 = "T\xc3\xa4ter"  # Python 2 str (raw bytes): the UTF-8 encoding of u"T\xe4ter", not a unicode object
correct_unicode = string_with_utf8.decode("utf-8")  # interpret the bytes as UTF-8
print repr(correct_unicode)  # u'T\xe4ter', correct unicode string
string_with_utf8_new = correct_unicode.encode("utf-8")  # encode back to a UTF-8 str
print repr(string_with_utf8_new)  # 'T\xc3\xa4ter', equals repr(string_with_utf8)
#correct_unicode.encode("ascii")  # UnicodeEncodeError because ASCII has no representation for u'\xe4'
#unicode("T\xc3\xa4ter")  # UnicodeDecodeError: without an encoding argument the default 'ascii' codec is used
bad_unicode = unicode("T\xc3\xa4ter", "latin-1")  # wrong codec: each byte becomes its own code point
print repr(bad_unicode)  # u'T\xc3\xa4ter', not interpreted as UTF-8, so:
correct_unicode = unicode("T\xc3\xa4ter", "utf-8")  # make a unicode string from a UTF-8 representation
print repr(correct_unicode)  # u'T\xe4ter', nice!
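The examples above are Python 2. For comparison, here is a rough sketch of the same round trip under Python 3, where encoded data is `bytes` and text is `str`. The variable names (`utf8_bytes`, `text`, `round_trip`, `mojibake`, `ascii_error`) are illustrative and not part of the original gist.

```python
# Python 3 sketch (an assumption for comparison; the original gist targets Python 2).
utf8_bytes = b"T\xc3\xa4ter"             # bytes: the UTF-8 encoding of the text
text = utf8_bytes.decode("utf-8")        # str: decoded Unicode text
round_trip = text.encode("utf-8")        # bytes again, identical to utf8_bytes

# Encoding to ASCII fails because U+00E4 has no ASCII representation.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Decoding with the wrong codec does not fail; it yields one character per byte.
mojibake = utf8_bytes.decode("latin-1")

print(ascii(text))       # 'T\xe4ter'
print(round_trip)        # b'T\xc3\xa4ter'
print(ascii(mojibake))   # 'T\xc3\xa4ter'
```

Unlike Python 2, Python 3 never mixes `bytes` and `str` implicitly, so a missing codec tends to surface as a `TypeError` rather than an implicit ASCII decode.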