@kennypete
Created September 4, 2025 08:02
String measurement in Neovim

Neovim has five built-in functions for measuring the length, width, or number of characters in a string:

  • strcharlen()
  • strchars()
  • strwidth()
  • strdisplaywidth()
  • strlen()

Sometimes they all return the same value. For example, if a line contains only a #, spaces, and ASCII letters, like this:

# An example

All five functions return 12 for that line. For example, with the cursor anywhere on it, :echo strlen(getline('.')) returns 12.

Many things can cause the functions to return different values. The differences are worth knowing because different use cases call for different functions. The following script illustrates a line where all five differ and, importantly, why.

local function s_measure(lnum)
  lnum = lnum or vim.fn.line('.')
  local line = vim.fn.getline(lnum)
  return { "strcharlen=" .. vim.fn.strcharlen(line),
           "strchars=" .. vim.fn.strchars(line),
           "strwidth=" .. vim.fn.strwidth(line),
           "strdisplaywidth=" .. vim.fn.strdisplaywidth(line),
           "strlen=" .. vim.fn.strlen(line), }
end
-- Call s_measure() on line 19, reporting to line 16
local measurements = s_measure(19)
table.insert(measurements, 1, " --")
local output = table.concat(measurements, " ")
vim.fn.setline(16, output)

 -- strcharlen=63 strchars=64 strwidth=65 strdisplaywidth=69 strlen=70

 -- Example line 19 (NB: with &tabstop of 8, as set in the modeline):
 -- A tab->	, CJK char 漢, combining char é, and 4-byte emoji 😊!
 -- strcharlen=63      - The number of character cells with combining
 --                      characters not counted, so if you substitute
 --                      with `:s/./_/g` it would replace with 63x _
 --                      because the é is two Unicode code points
 --                      (U+0065,U+0301), but only one cell.
 -- strchars=64        - This is the number of individual Unicode code
 --                      points.  So the combining acute accent, U+0301,
 --                      is counted too and the result is 64.  Another
 --                      way to think about this is to:
 --                        :echo str2list(getline('.'))->len()
 --                      [Note: If `strchars(line, 1)` were used in the
 --                      example instead of the default (equivalent to
 --                      `strchars(line, 0)`), it would skip combining
 --                      characters and return the same as
 --                      `strcharlen()`.]
 -- strwidth=65        - This is the number of display cells the string
 --                      occupies, like strdisplaywidth, but with the
 --                      tab character counted as only one cell.  A
 --                      tab's width varies with its position (in this
 --                      example, five cells), so strwidth is
 --                      strdisplaywidth (69) - 4 = 65.  It exceeds
 --                      strcharlen (63) because 漢 and 😊 each occupy
 --                      two cells.
 -- strdisplaywidth=69 - This is the number of cells the string
 --                      occupies, visually.  One way to think
 --                      about this value is to consider each monospaced
 --                      column of the display as one unit, and the
 --                      position of the cursor when on the last
 --                      visible character of the display:
 --                        :echo virtcol('.')
 --                      ...on that last visible character echoes 69.
 --                      However, beware 'conceal' (e.g. in help files,
 --                      where `|` characters are used for hotlinks).
 --                      Concealed characters may be hidden from view,
 --                      but are NOT excluded from the number
 --                      returned by `strdisplaywidth()` even though
 --                      they don't occupy any visible space on screen.
 -- strlen=70          - The number of bytes in the string.  This is
 --                      never less than strchars or strcharlen, but
 --                      strdisplaywidth can exceed it (e.g., a lone
 --                      tab is one byte but up to 'tabstop' cells).
 --                      In the example line, there are 63
 --                      characters, as shown with strcharlen, but:
 --                        漢 (is three bytes: e6 bc a2; one character)
 --                        é's combining acute (two bytes: cc 81; none)
 --                        😊 (is four bytes: f0 9f 98 8a; one character)
 --                      So, strlen is 63 + (3 - 1) + 2 + (4 - 1) = 70.
 --                      (Note: Using `g8` in Normal mode on a
 --                      displayed character shows its UTF-8 bytes.)
-- vim:textwidth=73:tabstop=8:
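
The byte and code point counts explained in the comments can also be checked without any Neovim functions. The following is a minimal sketch in plain Lua and is not part of the original script. `count_codepoints` is a hypothetical helper, and the tab and the combining acute accent are written as escapes on the assumption that they might otherwise be altered when the text is copied.

```lua
-- The example line 19, rebuilt as a Lua string.  "\t" is the tab and
-- "\204\129" is the combining acute accent U+0301 (bytes cc 81).
local line = " -- A tab->\t, CJK char 漢, combining char e\204\129,"
  .. " and 4-byte emoji 😊!"

-- Count UTF-8 code points: every byte that is not a continuation
-- byte (0x80..0xBF) starts a new code point.
local function count_codepoints(s)
  local n = 0
  for i = 1, #s do
    local b = s:byte(i)
    if b < 0x80 or b >= 0xC0 then
      n = n + 1
    end
  end
  return n
end

local bytes = #line                       -- the quantity strlen() reports
local codepoints = count_codepoints(line) -- the quantity strchars() reports
print(bytes, codepoints)
```

Run with a standalone Lua interpreter, or with :luafile in Neovim, this should print 70 and 64, matching the strlen and strchars values reported on line 16.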

Copying the script above into Neovim, saving the file as s_measure.lua, and sourcing it shows this in action. First delete the contents of line 16 of the script with D or d$ (not dd, which would remove the line itself and shift the line numbers the script relies on), then source the file with :so to see the output regenerated on line 16.

You may also like to change:

  • Line 19 itself, or
  • The 19 on line 11 of the script, pointing to a different line, to compare the outputs of the functions.
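
Another value worth varying is 'tabstop', since it determines the gap between strwidth and strdisplaywidth. The following is a rough sketch of the arithmetic in plain Lua, not Neovim's actual implementation. `tab_cells` is a hypothetical helper, and it assumes plain 'tabstop' behaviour (no 'vartabstop', no 'list' mode) and that the text before the tab is all single-width characters, as it is on line 19.

```lua
-- Cells occupied by a tab that starts after `cells_before` display
-- cells: it runs up to the next multiple of `tabstop`.
local function tab_cells(cells_before, tabstop)
  return tabstop - (cells_before % tabstop)
end

local before_tab = 11 -- " -- A tab->" is 11 single-width cells
local strwidth = 65   -- from the example; counts the tab as 1 cell

for _, ts in ipairs({ 2, 4, 8 }) do
  local cells = tab_cells(before_tab, ts)
  -- Swap the tab's single strwidth cell for its real on-screen width.
  local displaywidth = strwidth - 1 + cells
  print(("tabstop=%d: tab=%d cells, strdisplaywidth=%d")
    :format(ts, cells, displaywidth))
end
```

With a tabstop of 8 the tab comes out at five cells and the total at 69, which agrees with the values in the script. The other tabstop values can be compared against Neovim by changing the modeline and sourcing the script again.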