@johnteee
Forked from pyliaorachel/regex_useful.md
Created May 18, 2023 05:29
Some Regular Expressions that may be useful for data cleaning.

Punctuation, US-ASCII

/[!"#$%&()*+,\-.\/:;<=>?@\[\]^_`{|}~]/
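
A minimal sketch of how this class might be used to strip ASCII punctuation during cleaning. The helper name `stripAsciiPunctuation` is hypothetical, and the `g` flag is an addition so that every occurrence is replaced rather than only the first. Note that the class as written does not list the apostrophe or the backslash, both of which appear in the Unicode variant; add `'` and `\\` if those should be stripped too.

```javascript
// The character class from above, with the global flag added (assumption)
// so that replace() removes every match, not just the first one.
const ASCII_PUNCT = /[!"#$%&()*+,\-.\/:;<=>?@\[\]^_`{|}~]/g;

// Hypothetical helper: remove every ASCII punctuation character in the class.
function stripAsciiPunctuation(text) {
  return text.replace(ASCII_PUNCT, "");
}

console.log(stripAsciiPunctuation("Hello, world! (test)")); // "Hello world test"
console.log(stripAsciiPunctuation("it's"));                 // "it's" (apostrophe is not in the class)
```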

Punctuation, including Unicode (\u2000-\u206F: General Punctuation block, \u2E00-\u2E7F: Supplemental Punctuation block)

/[\u2000-\u206F\u2E00-\u2E7F\\'!"#$%&()*+,\-.\/:;<=>?@\[\]^_`{|}~]/
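
As a rough illustration, the Unicode variant can be applied the same way. The helper name `stripPunctuation` and the sample string are assumptions made for this sketch, and the `g` flag is added so all matches are removed. Keep in mind that the General Punctuation block also contains non-punctuation code points such as the typographic spaces U+2000 to U+200A and the zero-width space U+200B, so those are removed as well.

```javascript
// The character class from above, with the global flag added (assumption).
const UNICODE_PUNCT = /[\u2000-\u206F\u2E00-\u2E7F\\'!"#$%&()*+,\-.\/:;<=>?@\[\]^_`{|}~]/g;

// Hypothetical helper: remove ASCII and Unicode punctuation covered by the class.
function stripPunctuation(text) {
  return text.replace(UNICODE_PUNCT, "");
}

// Curly quotes (U+201C, U+201D) and an ellipsis (U+2026), written as escapes.
const sample = "\u201CHello\u201D, it's here\u2026";
console.log(stripPunctuation(sample)); // "Hello its here"
```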

Chinese characters

/[\u4E00-\u9FFF]/
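
A small sketch of detecting and extracting Chinese text with this range. The function names are hypothetical. The range corresponds to the CJK Unified Ideographs block only, so characters in the extension blocks (for example Extension A, U+3400 to U+4DBF) are not matched, and the same code points are also used for Japanese kanji.

```javascript
// Non-global regex for test(), so no lastIndex state is carried between calls.
const CHINESE_CHAR = /[\u4E00-\u9FFF]/;

// Hypothetical helper: does the text contain at least one character in the range?
function containsChinese(text) {
  return CHINESE_CHAR.test(text);
}

// Hypothetical helper: return each run of consecutive characters in the range.
function extractChinese(text) {
  return text.match(/[\u4E00-\u9FFF]+/g) || [];
}

console.log(containsChinese("Hello"));            // false
console.log(extractChinese("Hello 世界, 你好!")); // ["世界", "你好"]
```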

Non-English letters

/[\u00C0-\u1FFF\u2C00-\uD7FF]/
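
One possible use is flagging strings that contain anything outside the basic English alphabet. The helper name below is hypothetical. The ranges are an approximation: they also cover a few non-letter code points, such as the multiplication sign (U+00D7) and the division sign (U+00F7), as the last call shows.

```javascript
const NON_ENGLISH = /[\u00C0-\u1FFF\u2C00-\uD7FF]/;

// Hypothetical helper: true if any character falls in the listed ranges.
function hasNonEnglishLetter(text) {
  return NON_ENGLISH.test(text);
}

console.log(hasNonEnglishLetter("cafe"));       // false
console.log(hasNonEnglishLetter("café"));       // true (é is U+00E9)
console.log(hasNonEnglishLetter("東京"));       // true
console.log(hasNonEnglishLetter("3 \u00D7 4")); // true, although U+00D7 is not a letter
```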

All letters (\w also matches digits and underscore)

/[\u00C0-\u1FFF\u2C00-\uD7FF\w]/
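
A sketch of using this class as a simple word tokenizer. The function name `tokenize`, the `+` quantifier, and the `g` flag are additions for illustration. Because `\w` is part of the class, digits and underscores are kept as tokens too.

```javascript
// One or more consecutive "letter" characters form a token (assumption:
// the + quantifier and g flag are added to the class listed above).
const WORD = /[\u00C0-\u1FFF\u2C00-\uD7FF\w]+/g;

// Hypothetical helper: split text into runs of characters matched by the class.
function tokenize(text) {
  return text.match(WORD) || [];
}

console.log(tokenize("Café costs 3€ in 東京"));
// ["Café", "costs", "3", "in", "東京"]
```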
