Vim’s libcall() is used to call a function in either a Windows .dll or Linux .so library.
Vim’s builtin.txt:
libcall({libname}, {funcname}, {argument})

Call function {funcname} in the run-time library {libname} with single argument {argument}. This is useful to call functions in a library that you especially made to be used with Vim. Since only one argument is possible, calling standard library functions is rather limited.

The result is the String returned by the function. If the function returns NULL, this will appear as an empty string "" to Vim. If the function returns a number, use libcallnr()! If {argument} is a number, it is passed to the function as an int; if {argument} is a string, it is passed as a null-terminated string. This function will fail in restricted-mode.

libcall() allows you to write your own 'plug-in' extensions to Vim without having to recompile the program. It is NOT a means to call system functions! If you try to do so Vim will very probably crash.

For Win32, the functions you write must be placed in a DLL and use the normal C calling convention (NOT Pascal which is used in Windows System DLLs). The function must take exactly one parameter, either a character pointer or a long integer, and must return a character pointer or NULL. The character pointer returned must point to memory that will remain valid after the function has returned (e.g. in static data in the DLL). If it points to allocated memory, that memory will leak away. Using a static buffer in the function should work, it's then freed when the DLL is unloaded.
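The contract described in the help (exactly one parameter, a character pointer returned, and memory that stays valid after the call) can be sketched as follows. This is only a minimal illustration: the function name greet and the 128-byte buffer size are assumptions made for this sketch, not anything defined by Vim.

```c
#include <stdio.h>

/* Hypothetical example function (not part of Vim): takes exactly one
 * string argument and returns a pointer to static storage, so the
 * result remains valid after the function has returned. */
const char *greet(const char *name) {
    static char buffer[128];  /* exists for as long as the library is loaded */

    if (name == NULL) {
        name = "";
    }
    /* snprintf truncates long input instead of overflowing the buffer. */
    snprintf(buffer, sizeof buffer, "Hello, %s", name);
    return buffer;
}
```

Assuming this were compiled into a shared library, it could then be called from Vim with something like :echo libcall('/path/to/greet.so', 'greet', 'Vim'), where the path is a placeholder. Because the buffer is static, each call overwrites the previous result, which is fine here since Vim copies the returned string.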
Some scenarios may be well-suited to using libcall(). One use
case is a large dictionary. Although Vim9 script is compiled, Vim
cannot pre-compile it ahead of time. So, a dictionary or list used by a
plugin has to be created afresh in every new Vim instance.
That’s fine, usually, but what if you have a dictionary or list
that is several, or even hundreds of, megabytes?
ℹ️ The Unicode character database, for example, is a few hundred megabytes.
Especially if you are writing it for your own setup, where the vagaries
of operating systems, versions, etc., are of concern only to you, using
a pre-compiled .dll or .so may have advantages.
As noted in the help, the returned result is always either a
string (with libcall()) or a number (with libcallnr()). That may
seem limiting, but it won’t be in all cases.
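For the number-returning case, which Vim calls through libcallnr() rather than libcall(), a library function might look like the sketch below. The function name codepoint_to_int and its -1 error convention are assumptions made purely for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: converts a hexadecimal code point string such as
 * "00A1" to its numeric value. Returns -1 for NULL, empty, or invalid
 * input, or for values outside the Unicode range. */
int codepoint_to_int(const char *hex) {
    char *end = NULL;
    long value;

    if (hex == NULL || *hex == '\0') {
        return -1;
    }
    value = strtol(hex, &end, 16);
    if (*end != '\0' || value < 0 || value > 0x10FFFF) {
        return -1;  /* trailing characters or out-of-range value */
    }
    return (int)value;
}
```

From Vim, such a function would be called with libcallnr() in place of libcall(), since the result is an int and not a character pointer.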
Keep in mind, returning a string means that even
dictionaries of dictionaries are feasible (using eval() on
that returned string, which is what is demonstrated in the
following example).
Depending on how it’s written, the compiled .dll or .so should be
very fast, though some factors may significantly impact performance.
One such factor is using WSL and having the .so in a location
on the Windows file system (/mnt/c/Users/…, for example).
🔥 I am no C programmer! The following C code is potentially not
great! (And it was created with some AI help.) That is not an issue for
this example, and, at any rate, the same concept has been tested on
a 300 MB .dll and .so, and the string for any requested key was
returned in under 0.1 seconds. So, there may well be room for improvement,
but it does the job for this illustration.
ℹ️ Skip this if a C compiler already exists on your Windows PC (though the scripts calling the compiler may need adjustment).
MSYS2 with gcc is a relatively simple means of adding the gcc C compiler to a 64-bit Windows PC. Steps:
- Install MSYS2 from https://www.msys2.org/
- In MSYS2 UCRT64, install gcc with:
      pacman -S mingw-w64-ucrt-x86_64-gcc
- Validate that gcc has installed:
      gcc --version
  Something like this should be returned:
      gcc.exe (Rev3, Built by MSYS2 project) 14.1.0
      Copyright (C) 2024 Free Software Foundation, Inc.
The following eg.c file creates a static array of key-value pairs
where the key is the Unicode code point and the value is a tiny
subset of the Unicode database (i.e., the XML version)
content for each associated code point.
Each value is a string, which subsequently may be turned into a Vim
dictionary with eval(). As noted in the comments, this has been
left as-is for this demo, aside from adding more comments and removing
over 155,000 pairs.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdint.h>

// NB: The _real_ Unicode code points table has >155k entries.
// TABLE_SIZE has been left as-is, i.e., sized for that full table.
#define TABLE_SIZE 262144

// One entry of the static source data.
typedef struct {
    const char *key;
    const char *value;
} KeyValue;

// One node in a hash bucket's linked list (separate chaining).
typedef struct HashNode {
    const char *key;
    const char *value;
    struct HashNode *next;
} HashNode;

static HashNode *hash_table[TABLE_SIZE] = {0};
static int initialized = 0;

// Static array of key-value pairs.
// The 'real' data is >155,000 key-value pairs.
static KeyValue dictionary[] = {
    {"00A0", "{'na': 'NO-BREAK SPACE', 'gc': 'Zs', 'bc': 'CS', 'dt': 'nb', 'dm': '0020'}"},
    {"00A1", "{'na': 'INVERTED EXCLAMATION MARK', 'gc': 'Po', 'bc': 'ON'}"},
    {"00A2", "{'na': 'CENT SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
    {"00A3", "{'na': 'POUND SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
    {"00A4", "{'na': 'CURRENCY SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
    {"00A5", "{'na': 'YEN SIGN', 'gc': 'Sc', 'bc': 'ET'}"},
    {"00A6", "{'na': 'BROKEN BAR', 'gc': 'So', 'bc': 'ON'}"},
    {"00A7", "{'na': 'SECTION SIGN', 'gc': 'Po', 'bc': 'ON'}"},
    {"00A8", "{'na': 'DIAERESIS', 'gc': 'Sk', 'bc': 'ON', 'dt': 'com', 'dm': '0020 0308'}"},
    {"00A9", "{'na': 'COPYRIGHT SIGN', 'gc': 'So', 'bc': 'ON'}"},
    {"00AA", "{'na': 'FEMININE ORDINAL INDICATOR', 'dt': 'sup', 'dm': '0061'}"},
    {"00AB", "{'na': 'LEFT-POINTING DOUBLE ANGLE QUOTATION MARK', 'gc': 'Pi', 'bc': 'ON', 'bm': 'Y'}"},
    {"00AC", "{'na': 'NOT SIGN', 'gc': 'Sm', 'bc': 'ON'}"},
    {"00AD", "{'na': 'SOFT HYPHEN', 'gc': 'Cf', 'bc': 'BN'}"},
    {"00AE", "{'na': 'REGISTERED SIGN', 'gc': 'So', 'bc': 'ON'}"},
    {"00AF", "{'na': 'MACRON', 'gc': 'Sk', 'bc': 'ON', 'dt': 'com', 'dm': '0020 0304'}"},
    {NULL, NULL} // End marker
};
// FNV-1a hash function.
// Again, this has been left as produced by Claude AI for the full
// table of ~155k key/value pairs, most of which are omitted here.
uint32_t hash(const char *key) {
    uint32_t h = 2166136261u;      // FNV offset basis (32-bit)
    for (; *key; key++) {
        h ^= (unsigned char)*key;  // hash bytes as unsigned values
        h *= 16777619u;            // FNV prime (32-bit)
    }
    return h % TABLE_SIZE;
}
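// Build the hash table from the static array (runs once, on first lookup).
void init_hash_table(void) {
    if (initialized) return;
    for (int i = 0; dictionary[i].key != NULL; i++) {
        uint32_t index = hash(dictionary[i].key);
        HashNode *new_node = malloc(sizeof(HashNode));
        if (new_node == NULL) {
            break;  // Out of memory: remaining keys will not be found.
        }
        new_node->key = dictionary[i].key;
        new_node->value = dictionary[i].value;
        // Insert at the head of the bucket's linked list.
        new_node->next = hash_table[index];
        hash_table[index] = new_node;
    }
    initialized = 1;
}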
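// Entry point called from Vim: returns the value string for a key,
// or "{}" (an empty Vim dictionary literal) if the key is not found.
const char* get_value(const char *key) {
    if (key == NULL) {
        return "{}";
    }
    if (!initialized) {
        init_hash_table();
    }
    uint32_t index = hash(key);
    HashNode *current = hash_table[index];
    while (current != NULL) {
        if (strcmp(current->key, key) == 0) {
            return current->value;
        }
        current = current->next;
    }
    return "{}";
}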
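As a possible alternative, not the approach used in eg.c, the same lookup could be done without any malloc() calls by keeping the static array sorted and using the standard bsearch() function. The sketch below assumes the keys are stored in strcmp() order; the names Entry, entries, compare_entry, and get_value_sorted are hypothetical, and the values are shortened versions of those above.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *key;
    const char *value;
} Entry;

/* Assumption: this array is kept sorted by key in strcmp() order,
 * which bsearch() requires. Values are abbreviated for the sketch. */
static const Entry entries[] = {
    {"00A0", "{'na': 'NO-BREAK SPACE', 'gc': 'Zs'}"},
    {"00A1", "{'na': 'INVERTED EXCLAMATION MARK', 'gc': 'Po'}"},
    {"00A2", "{'na': 'CENT SIGN', 'gc': 'Sc'}"},
};

/* bsearch() comparator: the first argument is the search key,
 * the second is a pointer to an array element. */
static int compare_entry(const void *key, const void *entry) {
    return strcmp((const char *)key, ((const Entry *)entry)->key);
}

/* Returns the value string for a key, or "{}" if it is not present. */
const char *get_value_sorted(const char *key) {
    const Entry *hit;

    if (key == NULL) {
        return "{}";
    }
    hit = bsearch(key, entries, sizeof entries / sizeof entries[0],
                  sizeof entries[0], compare_entry);
    return hit != NULL ? hit->value : "{}";
}
```

The trade-off is a binary search per lookup instead of a hash lookup, in exchange for no start-up pass over the data and no allocated memory. Whether that is faster for the full table has not been measured here. Note that keys of differing lengths would need to be sorted in strcmp() order, not numeric order.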
If this is saved as eg.c, the command line gcc -O3 -shared -o eg.dll eg.c
(Windows) or gcc -O3 -fPIC -shared -o eg.so eg.c (Linux) should create
the associated eg.dll or eg.so file.
💡 These commands are saved to eg_dll.sh and eg_so.sh in
the libcall_Vim_builtin.7z file, below. The former should be run in the MSYS2 UCRT64 shell and
the latter in Linux / WSL.
To directly use libcall() with eg.dll or eg.so, the
following Windows and Linux instructions explain how.
Open Vim, enter command-line mode (with :), then enter the following,
replacing {FULLPATH} with the full path of the directory containing eg.dll:
call append('$', libcall('{FULLPATH}\eg', 'get_value', '00A1'))
ℹ️ In Windows, the .dll extension is omitted from {libname}, hence
it is {FULLPATH}\eg, with no .dll extension. This is explained
in builtin.txt.
The demo files eg_dll_test.vim and eg_so_test.vim may be used to see
this in action. Their content is not reproduced here, nor are
they essential. They use eval() to take the returned
string, turn it into a dictionary, and return the value for
a specified key. They are also shown working in the
animated .gif files, below.