Created
April 25, 2023 06:53
A small bash_profile snippet and a large wall of text: the products of my macOS terminal locale adventures
# [Mm]ac( )?OS( X)? locale kludges
# Okay here's the 4-1-1 folks:
# - Terminal.app takes care of setting locale env vars by default.
#   Normally this works fine and you get "LANG=en_US.UTF-8" or whatever.
#   But in certain circumstances it fails to do that, and apparently will set
#   "LC_CTYPE=UTF-8" as a kind of fallback? Which is alright, but Linux systems
#   tend to have a "C.UTF-8" locale and not "UTF-8", and also (consequently)
#   some applications (X11? I don't remember) will complain about it not being
#   a valid locale, because nobody bothered testing on a BSD long enough to
#   catch edge cases, or whatever. TL;DR: just make a "C.UTF-8" entry in the
#   relevant locale prefix that's a copy of "UTF-8" and it'll shake out fine.
#   I wrote that I put one in /opt/homebrew/share/locale, but it seems to be
#   gone now, so I guess something pruned it...
# - In an SSH session (or really, anything not launched from Terminal.app), no
#   locale vars are defined, so just default to "LANG=en_US.UTF-8" (setting up
#   ssh env import sounds annoying to maintain across clients).
# - LANG serves as a "default" for any of the LC_* vars, while LC_ALL is
#   more of an "override". In other words, their precedence is thus:
#     LC_ALL > LC_* > LANG
# - Locale search paths are a bit nebulous:
#   - PATH_LOCALE is listed in setlocale(3) next to /usr/share/locale, but
#     not really explained.
#   - FreeBSD's and NetBSD's locale(1) pages specify that `locale -a` "will
#     respect the PATH_LOCALE environment variable, and use it instead of the
#     system default locale directory." One could assume this is standard
#     behavior for the whole operating system.
#   - macOS, more concerned with its own locale system, never clarified this.
#     However, sometime after 10.4, it *did* document /usr/local/share/locale
#     as an additional search path in setlocale(3). Judging by the formatting
#     of this page as well as its common BSD ancestor, PATH_LOCALE indeed
#     always *replaces* the default system search path (but newer macOS will,
#     as far as I can guess, still search the local-admin directory first).
#   - OpenBSD seems to have thrown out PATH_LOCALE and is off doing whatever
#     the fuck it is they're trying to do.
#   - glibc has its own locale path var, LOCPATH, documented in locale(7).
#     Unlike PATH_LOCALE, this one can contain a PATH-style colon-separated
#     list of paths, and the default system path is still searched afterwards.
#     But its locale handling is also much more complex: the aforementioned
#     paths are the ones for "compiled individual locale files" (which live in
#     /usr/lib/locale), and there are also "locale archives", which will not
#     be searched if LOCPATH is set. Frankly, after having read the BSD docs,
#     all this feels a bit overengineered, though I'm not fond of BSD docs
#     being full of holes either...
# - Anyway, decades of UNIX history have led me to wildly overthink this.
#   Here's what I do for a nicer time on macOS:
# (${VAR+set} is used instead of [[ -v VAR ]], which needs bash 4.2+;
# the stock macOS /bin/bash is 3.2.)
if [[ -z ${LANG+set} ]]; then
    # Not under Terminal.app, probably
    if [[ -n ${LC_CTYPE+set} ]]; then
        # If we have the Magic Mac Fallback, make it Linuxy instead
        # (TODO document what complains with each setting)
        [ "$LC_CTYPE" = "UTF-8" ] && export LC_CTYPE=C.UTF-8
    else
        # Nothing is set, define my usual locale
        export LANG=en_US.UTF-8
    fi
fi
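
For anyone who wants to poke at the `LC_ALL > LC_* > LANG` precedence described in the comments, here is a rough sketch of a helper that prints which value would win for a given category. The function name `effective_locale` is made up for illustration and is not part of the snippet above. It treats empty variables as unset, and falling back to `C` when nothing is set is an assumption (the default locale is up to the implementation).

```shell
#!/usr/bin/env bash
# effective_locale CATEGORY (hypothetical helper, not part of the snippet above)
# Prints the locale value that would apply to CATEGORY (e.g. LC_CTYPE),
# following the precedence LC_ALL > LC_* > LANG.
# Empty variables are treated the same as unset ones.
effective_locale() {
    local category=$1
    if [ -n "${LC_ALL:-}" ]; then
        # LC_ALL overrides everything else
        printf '%s\n' "$LC_ALL"
    elif [ -n "${!category:-}" ]; then
        # The category-specific variable, looked up by name
        printf '%s\n' "${!category}"
    elif [ -n "${LANG:-}" ]; then
        # LANG is the default for any category not set explicitly
        printf '%s\n' "$LANG"
    else
        # Assumption: use "C" when nothing at all is set
        printf 'C\n'
    fi
}
```

Running `effective_locale LC_CTYPE` in a fresh shell is a quick way to see whether you landed in the Terminal.app case, the fallback case, or the nothing-is-set case.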