@ynyyn
Last active April 11, 2025 10:45
A simulated implementation of `encodeURIComponent()` in Cocoa Objective-C that provides similar percent-encoding behaviour.

If you're tired of CFURLCreateStringByAddingPercentEscapes(), try this.

I didn't write a Swift version, but it should be easy for you to adapt.


It is both funny and ironic that even now, in 2025, the standard library of Apple’s Cocoa ecosystem still lacks a versatile, out-of-the-box API that handles these most basic (and important) tasks "correctly" across the different common scenarios.

/// A simulated implementation of `encodeURIComponent()` in Cocoa Objective-C that provides similar
/// percent-encoding behaviour.
///
/// Escapes all characters except: `A–Z a–z 0–9 - _ . ! ~ * ' ( )`
+ (NSString *)encodeURIComponent:(NSString *)string {
    NSMutableCharacterSet *uriComponentAllowedCharacterSet;
    // Start from `URLQueryAllowedCharacterSet`, which still allows some characters that
    // `encodeURIComponent()` escapes.
    uriComponentAllowedCharacterSet = [NSCharacterSet.URLQueryAllowedCharacterSet mutableCopy];
    // Trim the character set by removing those extra characters.
    [uriComponentAllowedCharacterSet removeCharactersInString:@" \"#$%&+,/:;<=>?@[\\]^`{|}"];
    // Note: removing just @"$&+,/:;=?@" is enough if you rely on the implementation detail that
    // the other characters are already excluded from the built-in allowed set.
    return [string stringByAddingPercentEncodingWithAllowedCharacters:uriComponentAllowedCharacterSet];
}
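
A minimal usage sketch for the method above. It assumes a standalone command-line build, so the method body is copied into a free function; the names `EncodeURIComponent` and `CountASCIIMismatches` are hypothetical and not part of the gist. The second helper checks the note about the shorter removal string by comparing both variants across the ASCII range. If the built-in query set matches the RFC 3986 query rules, it should report no mismatches.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical free-function wrapper with the same body as the method above,
// so this example compiles without a surrounding class.
static NSString *EncodeURIComponent(NSString *string) {
    NSMutableCharacterSet *allowed = [NSCharacterSet.URLQueryAllowedCharacterSet mutableCopy];
    [allowed removeCharactersInString:@" \"#$%&+,/:;<=>?@[\\]^`{|}"];
    return [string stringByAddingPercentEncodingWithAllowedCharacters:allowed];
}

// Counts the ASCII code points on which the full and the short removal lists disagree.
static NSUInteger CountASCIIMismatches(void) {
    NSMutableCharacterSet *full = [NSCharacterSet.URLQueryAllowedCharacterSet mutableCopy];
    [full removeCharactersInString:@" \"#$%&+,/:;<=>?@[\\]^`{|}"];
    NSMutableCharacterSet *brief = [NSCharacterSet.URLQueryAllowedCharacterSet mutableCopy];
    [brief removeCharactersInString:@"$&+,/:;=?@"];

    NSUInteger mismatches = 0;
    for (unichar c = 0; c < 128; c++) {
        if ([full characterIsMember:c] != [brief characterIsMember:c]) {
            mismatches++;
        }
    }
    return mismatches;
}

int main(void) {
    @autoreleasepool {
        // Reserved characters and spaces are escaped.
        NSLog(@"%@", EncodeURIComponent(@"a b&c=d/e?f"));  // a%20b%26c%3Dd%2Fe%3Ff
        // The characters that encodeURIComponent() leaves alone pass through unchanged.
        NSLog(@"%@", EncodeURIComponent(@"-_.!~*'()"));    // -_.!~*'()
        // Non-ASCII input (U+00E9) is escaped as its UTF-8 bytes.
        NSLog(@"%@", EncodeURIComponent(@"\u00e9"));       // %C3%A9
        // Number of ASCII code points where the two removal lists differ.
        NSLog(@"%lu", (unsigned long)CountASCIIMismatches());
    }
    return 0;
}
```

On macOS this should build with something like `clang -fobjc-arc -framework Foundation example.m`, where `example.m` is whatever file name you choose.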

ynyyn commented Apr 11, 2025

Discussion:

Q: Why not use .alphanumericCharacterSet + "-_.!~*'()" (via addCharactersInString)?

A: It works, but the way it works is somewhat odd.

On Stack Overflow, many people have pointed out that .alphanumericCharacterSet is actually a HUGE set that includes international (Unicode) characters, not just ASCII.

Take a look at [NSCharacterSet.alphanumericCharacterSet bitmapRepresentation]. It covers many Unicode characters.

This leads me to believe that passing .alphanumericCharacterSet as the "allowed characters" set is not its intended use.
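
The point above can be checked with a short sketch. It compares membership of two non-ASCII letters in `.alphanumericCharacterSet` and in `URLQueryAllowedCharacterSet`, and it builds an explicit ASCII-only allowed set as a third option. The helper names `ExplicitURIComponentAllowedSet` and `EncodeWithExplicitSet` are hypothetical. The explicit set is an alternative shown for comparison, not the approach the gist uses.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: an explicit, ASCII-only allowed set spelling out exactly the
// characters that encodeURIComponent() leaves unescaped.
static NSCharacterSet *ExplicitURIComponentAllowedSet(void) {
    return [NSCharacterSet characterSetWithCharactersInString:
            @"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            @"abcdefghijklmnopqrstuvwxyz"
            @"0123456789"
            @"-_.!~*'()"];
}

// Hypothetical helper: percent-encodes using the explicit set above.
static NSString *EncodeWithExplicitSet(NSString *string) {
    return [string stringByAddingPercentEncodingWithAllowedCharacters:
            ExplicitURIComponentAllowedSet()];
}

int main(void) {
    @autoreleasepool {
        NSCharacterSet *alnum = NSCharacterSet.alphanumericCharacterSet;
        NSCharacterSet *query = NSCharacterSet.URLQueryAllowedCharacterSet;

        // U+00E9 (e with acute accent) and U+4E2D (a CJK ideograph) are Unicode letters,
        // so the alphanumeric set contains them; the URL query set is ASCII-only.
        NSLog(@"alnum contains U+00E9: %d", [alnum characterIsMember:0x00E9]);  // 1
        NSLog(@"alnum contains U+4E2D: %d", [alnum characterIsMember:0x4E2D]);  // 1
        NSLog(@"query contains U+00E9: %d", [query characterIsMember:0x00E9]);  // 0

        // Compare the raw bitmap sizes of the two sets (values not asserted here).
        NSLog(@"alnum bitmap bytes: %lu", (unsigned long)alnum.bitmapRepresentation.length);
        NSLog(@"query bitmap bytes: %lu", (unsigned long)query.bitmapRepresentation.length);

        // The explicit set escapes U+4E2D as its UTF-8 bytes and the space as %20.
        NSLog(@"%@", EncodeWithExplicitSet(@"\u4e2d b"));  // %E4%B8%AD%20b
    }
    return 0;
}
```

As far as I can tell, the comment on this method in Apple's `NSURL.h` header says that characters in the allowed set outside the 7-bit ASCII range are ignored, which would explain why the `.alphanumericCharacterSet` approach still produces correct output despite the oversized set.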
