@KevinDanikowski
Last active June 23, 2021 14:41
Formats translated text into a URL-friendly path segment for any language
const noHyphenLangs = ['ko', 'ja', 'zh-cn', 'zh-tw', 'ar', 'th']
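const formatTranslationIntoPath = (text, symbol) => { // symbol: language code, e.g. 'zh-cn'; result is not percent-encoded
  let t = text
  const replaceChar = noHyphenLangs.includes(symbol) ? '' : '-'
  t = t.replace(/-/g, ' ') // treat existing hyphens as word breaks
  t = t.trim().replace(/\s+/g, replaceChar) // collapse each whitespace run into one separator
  t = t.replace(/['`]/g, '') // remove quotes
  t = t.replace(/[,()]/g, '') // remove junk
  t = t.normalize('NFD').replace(/\p{Diacritic}/gu, '') // simplify letters for url https://stackoverflow.com/questions/990904/remove-accents-diacritics-in-a-string-in-javascript
  t = t.normalize('NFC') // recompose what NFD split but did not strip (e.g. Hangul syllables)
  // fix letters that NFD does not decompose, so the step above leaves them untouched
  t = t.replace(/[Łł]/g, 'l')
  t = t.replace(/[Đđ]/g, 'd') // match uppercase Đ too; toLowerCase() would otherwise turn it into đ
  return t.toLowerCase()
}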
const ex1 = formatTranslationIntoPath('让我们 尝试-这样-做', 'zh-cn') // 让我们尝试这样做
const ex2 = formatTranslationIntoPath('Việt miễn phí', 'vi') // viet-mien-phi
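
The diacritic step is easier to follow in isolation. The sketch below uses a hypothetical helper, stripDiacritics, which is not part of the gist, to show which characters NFD plus \p{Diacritic} handles and which it leaves alone. It assumes a runtime with Unicode property escapes (recent Node.js or a modern browser).

```javascript
// Hypothetical helper for illustration only: decompose, then drop combining marks.
const stripDiacritics = (s) => s.normalize('NFD').replace(/\p{Diacritic}/gu, '')

// Accented Latin letters decompose into base letter + combining marks, so they simplify.
console.log(stripDiacritics('Việt miễn phí')) // Viet mien phi

// Ł and đ have no canonical decomposition, so they pass through unchanged
// and need an explicit replacement.
console.log(stripDiacritics('Łódź')) // Łodz
console.log(stripDiacritics('đường')) // đuong

// NFD also splits Hangul syllables into jamo; NFC puts them back together.
const hangul = '한국어'
console.log(hangul.normalize('NFD').length) // 8
console.log(hangul.normalize('NFD').normalize('NFC') === hangul) // true
```

Because \p{Diacritic} also covers marks that carry meaning in some scripts, output for languages outside the two examples above is worth checking by hand.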