@kennypete
Last active September 4, 2025 08:04

String measurement in Vim

Vim has five built-in functions for measuring the length, width, or number of characters in a string:

  • strcharlen()
  • strchars()
  • strwidth()
  • strdisplaywidth()
  • strlen()

Sometimes these all return the same value. For example, if a line contains only printable ASCII characters (here a #, spaces, and letters), like this:

# An example

All five functions return 12 for that line. For example, with the cursor anywhere on that line, :echo strlen(getline('.')) returns 12.
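As a quick check of that claim, the following minimal sketch measures the same text as a string literal rather than a buffer line. It is not part of the original script, and the names `s` and `results` are illustrative only.

```vim
vim9script
# Measure a plain ASCII string with all five functions.
# 's' and 'results' are hypothetical names used for illustration.
const s = '# An example'
const results = [strcharlen(s), strchars(s), strwidth(s),
  strdisplaywidth(s), strlen(s)]
echo results
# Echoes [12, 12, 12, 12, 12]: every character is one code point,
# one display cell, and one byte.
```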

Many things cause the functions to return different values, and the differences are worth knowing because different use cases call for different functions. The following script illustrates a case where all five differ and, importantly, why.

vim9script
const S_MEASURE: func = (lnum: number = line('.')): list<string> => {
  return [$"strcharlen={getline(lnum)->strcharlen()}",
          $"strchars={getline(lnum)->strchars()}",
          $"strwidth={getline(lnum)->strwidth()}",
          $"strdisplaywidth={getline(lnum)->strdisplaywidth()}",
          $"strlen={getline(lnum)->strlen()}"]
}
# Call S_MEASURE() on line 18, reporting to line 15
18->S_MEASURE()
  ->insert("  #", 0)
  ->join()
  ->setline(15)

  # strcharlen=63 strchars=64 strwidth=65 strdisplaywidth=69 strlen=70

  # Example line 18 (NB: with &tabstop of 8, as set in the modeline):
  # A tab->	, CJK char 漢, combining char é, and 4-byte emoji 😊!
  # strcharlen=63      - The number of characters, with combining
  #                      characters not counted, so if you substitute
  #                      with `:s/./_/g` it would replace with 63x _
  #                      because the é is two Unicode code points
  #                      (U+0065,U+0301), but only one character.
  # strchars=64        - This is the number of distinct Unicode code
  #                      points.  So the combining acute accent, U+0301,
  #                      is counted too and the result is 64.  Another
  #                      way to think about this is to:
  #                        :echo str2list(getline('.'))->len()
  #                      [Note: If `->strchars(1)` had been used in the
  #                      example instead of the default (equivalent
  #                      to `->strchars(0)`), it would skip combining
  #                      characters and return the same as
  #                      `strcharlen()`.]
  # strwidth=65        - This is the number of display cells the string
  #                      occupies, like strdisplaywidth, but with the
  #                      tab character only counted as one cell.  The
  #                      tab here fills five cells (columns 12 to 16),
  #                      so it is strdisplaywidth (69) - 4, so 65.
  # strdisplaywidth=69 - This is the number of cells the string
  #                      occupies, visually.  One way to think
  #                      about this value is to consider each monospaced
  #                      column of the display as one unit, and the
  #                      position of the cursor when on the last
  #                      visible character of the display:
  #                        :echo virtcol('.')
  #                      ...on that last visible character echoes 69.
  #                      However, beware 'conceal' (e.g. in help files,
  #                      where `|` characters are used for hotlinks).
  #                      Concealed characters may be hidden from view,
  #                      but are still counted in the number returned
  #                      by `strdisplaywidth()` even though they don’t
  #                      occupy any visible space on screen.
  # strlen=70          - The number of bytes in the string.  This is
  #                      never less than strchars or strcharlen, but
  #                      a width can exceed it (a lone tab is one
  #                      byte, yet up to 'tabstop' cells).  In the
  #                      example line, there are 63 characters, as
  #                      shown with strcharlen, but:
  #                        漢 (is three bytes: e6 bc a2; one character)
  #                        é’s combining acute (is two bytes: cc 81; 0)
  #                        😊 (is four bytes: f0 9f 98 8a; one)
  #                      So, strlen is 63 + (3 - 1) + 2 + (4 - 1) = 70.
  #                      (Note: Using `g8` in Normal mode shows the
  #                      UTF-8 bytes of the character under the cursor.)
# vim:textwidth=73:tabstop=8:

Copy the script above into a Vim buffer and source it with :so to see this in action. To watch the output being regenerated on line 15, first clear that line's contents with D or d$ (not dd, which would remove the line itself and shift the example off line 18).
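Because the script depends on its own line numbers, a variation that is easier to paste elsewhere is to measure a string literal instead. The sketch below is not part of the original script. It rebuilds the text of line 18 with escapes and assumes 'encoding' is utf-8, 'vartabstop' is empty, and the emoji is drawn two cells wide. The names `sample` and `measured` are illustrative.

```vim
vim9script
# Note: this changes 'tabstop' and 'list' for the current buffer/window,
# because strdisplaywidth() uses the current window's settings.
setlocal tabstop=8 nolist

# Same text as line 18 of the script, built with escapes:
# \t = tab, \u6f22 = 漢, e\u0301 = e + combining acute, \U0001f60a = 😊
const sample = "  # A tab->\t, CJK char \u6f22, combining char e\u0301,"
  .. " and 4-byte emoji \U0001f60a!"

# 'measured' is a hypothetical name; keys mirror the function names.
const measured = {
  strcharlen: strcharlen(sample),
  strchars: strchars(sample),
  strwidth: strwidth(sample),
  strdisplaywidth: strdisplaywidth(sample),
  strlen: strlen(sample),
}
echo measured
# Under the assumptions above this should match the values the script
# reports for line 18: 63, 64, 65, 69 and 70 respectively.
```

Measuring a literal removes the need to keep the example on a fixed line, at the cost of not showing the characters themselves in the source.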

You may also like to change:

  • Line 18 itself, or
  • The 18 on line 10 of the script, so that it points to a different line, to compare the outputs of the functions.
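Two small variations worth trying are a lone tab and a lone combining sequence, since each isolates one of the differences described in the script's comments. This is a hedged sketch with illustrative names (`tab`, `e_acute`, and the `*_sizes` constants), assuming 'encoding' is utf-8 and 'vartabstop' is empty.

```vim
vim9script
# Note: this changes 'tabstop' and 'list' for the current buffer/window.
setlocal tabstop=8 nolist

# A lone tab is one byte and one strwidth() cell, but at screen
# column 0 it fills a whole tab stop for strdisplaywidth().
const tab = "\t"
const tab_sizes = [strlen(tab), strwidth(tab), strdisplaywidth(tab)]
echo tab_sizes
# Echoes [1, 1, 8]

# The optional second argument is the starting screen column;
# from column 3 the tab only has to fill 5 cells to reach column 8.
const tab_from_col3 = strdisplaywidth(tab, 3)
echo tab_from_col3
# Echoes 5

# A base letter followed by a combining acute accent (U+0301).
const e_acute = "e\u0301"
const acute_sizes = [strchars(e_acute), strchars(e_acute, true),
  strcharlen(e_acute), strlen(e_acute)]
echo acute_sizes
# Echoes [2, 1, 1, 3]: two code points, one character, three bytes.
```

The tab case also shows that a display width can be larger than the byte count returned by `strlen()`.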