ap29600 · June 26, 2025 21:28
diff --git a/primer.txt b/primer.txt
 not yet described in this note: memory management, parsing, bytecode, printing

 C types
 [UW]   [48]byte unsigned
 [GHIL] [1248]byte int 
 C      1-byte char
 S      string
 F      8-byte double
 A      generic value
 A[1234] A (*f)(A x[, A y[, A z[, ...]]])
 VS(size) vector attribute

 K types
 heap
  tA            mixed array
  t[mM]         [dictionary,columnar table], looks like tA (keys; values)
  tE            enumeration, looks like tL (start;end)
  t[B[GHIL]CFS] [bit,[1248]byte int,char,float,symbol] list
  t[lf]         8byte [int,float]
  to            function block, looks like tA (source string;bytecode;argument names;captures...)
  tp            projection, looks like tA (verb;arg_or_GAP...)
  tq            train, looks like tA (verb1;verb2;...)
  tr            applied adverb, looks like tA ,(verb) with the adverb stored in the header.
 inline
  ti            4byte int
  tc            1byte char
  ts            4byte symbol
  t[uvw]        1byte builtin [monad,dyad,adverb]
  t[x]          6byte function pointer tagged by 1byte arity

 anatomy of arrays, access macros
 heap objects are allocated in a buddy system with minimum block size and alignment of 32 bytes
 code may assume this to read/write in full blocks and use vector intrinsics
 -32                             -24                                             -12 -11 -10 -9  -8              -4               0
 v                               v                                               v   v   v   v   v               v               v
 +---+====----====----====----===+----====----====----====----===+----====----===+---+===+---+===+----====----===+----====----===+----====
 |idx|###########################|         next  chunk           |###############|off|adv|ari|typ|   ref count   |    length     | content..
 +---+====----====----====----===+----====----====----====----===+----====----===+---+===+---+===+----====----===+----====----===+----====
  _b                                            _X                                _O  _E  _k  _T         _r              _n           _V
 #          currently unused
 idx        index of memory allocator bucket
 next chunk next chunk in buddy memory allocator
 off        symbol list offset
 adv        adverb in derived verb
 ari        arity of function
 typ        type tag of heap object
 ref count  reference count of heap object
 length     number of elements in heap object
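
 the diagram can be cross-checked with a small C sketch. the struct below is a
 hypothetical mirror of the 32-byte header, not the project's own definition:
 the field names, the helpers `rel` and `hdr_of`, the unsigned field types and
 the 8-byte `next` field are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical mirror of the header drawn above; offsets are relative to content */
typedef struct {
  uint8_t  idx;        /* -32      index of memory allocator bucket */
  uint8_t  unused0[7]; /* -31..-25 currently unused */
  uint64_t next;       /* -24      next chunk (assumed to be an 8-byte pointer) */
  uint8_t  unused1[4]; /* -16..-13 currently unused */
  uint8_t  off;        /* -12      symbol list offset */
  uint8_t  adv;        /* -11      adverb in derived verb */
  uint8_t  ari;        /* -10      arity of function */
  uint8_t  typ;        /* -9       type tag */
  uint32_t refcount;   /* -8       reference count (signedness is an assumption) */
  uint32_t length;     /* -4       number of elements */
} hdr;

_Static_assert(sizeof(hdr) == 32, "header must fill one minimum-size block");

/* offset of a header field relative to the start of the content */
static ptrdiff_t rel(size_t field_offset) {
  return (ptrdiff_t)field_offset - (ptrdiff_t)sizeof(hdr);
}

/* recover the header from a pointer to the content */
static hdr *hdr_of(void *content) {
  return (hdr *)((char *)content - sizeof(hdr));
}
```

 with this layout `rel(offsetof(hdr, typ))` is -9 and `rel(offsetof(hdr, length))`
 is -4, matching the column labels in the diagram.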

 common accessors
 [xyz]t           type
 [xyz]n           count
 [xyz]w           log2 of bitwidth of each element
 [xyz][VACGHILSF] pointer to [xyz]'s typed data

 constructors
 an(n,t)           any list from length and type
 a[cils](n)        [char,int,long,symbol] scalar from integral value
 a[uvw]+n          builtin [monad,dyad,adverb] from integral value
 ax(p,k)           verb from function pointer and arity
 a[ABC[GHIL]FS](n) [mixed,bit,char,[1248]byte int,float,symbol] list from length
 aV(t,n,p)         any list from type, length and raw data pointer

 conversions
 ct(x,t)           to specified type
 c[BC[GHIL]FS](x)  to [bit,char,[1248]byte int,float,symbol] array
 gZ(x)             enumeration to integer array
 gl(x)             extract heap allocated long

 reference count
 unref
  [xyz](value)     evaluate value, then release [xyz]. e.g. monads are likely to end with `x(<result>)`, dyads with `y(<result>)`
 ref 
  _R(value)        evaluate and acquire value
  [xyz]R           evaluate and acquire [xyz]
 conventions
  (most) monads release their argument. exceptions are usually named with an underscore (e.g. gl_(A x), mtc_(A x,A y))
  (most) [dy,tri]ads release only y[, z]
  (most) 8-args release all their arguments.
  (most) 1+8-args release the later 8 args
  functions may signal their behaviour in the signature 
   A f(A x, A y, /*01*/) / releases y but not x
   A f(A x, A y, /*00*/) / releases neither
   A f(A x, A*a, U n /*0111..*/) / releases only the later n args
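
 the conventions above can be illustrated with a toy model. everything in this
 sketch is hypothetical (the `vec` type, `vnew`, `ref`, `unref`, `sum`, `sum_`,
 `dot`, and the `live` counter); it only mimics who releases what and does not
 use the project's A type or macros.

```c
#include <assert.h>
#include <stdlib.h>

/* toy refcounted vector; a stand-in for a heap object, not the real A type */
typedef struct { int rc; int n; long v[]; } vec;
static int live = 0; /* number of objects not yet freed, to spot leaks */

static vec *vnew(int n) {
  vec *x = malloc(sizeof(vec) + (size_t)n * sizeof(long));
  x->rc = 1; x->n = n; live++;
  return x;
}
static vec *ref(vec *x) { x->rc++; return x; }                      /* acquire */
static void unref(vec *x) { if (--x->rc == 0) { free(x); live--; } } /* release */

/* monad that releases its argument, like (most) monads */
static long sum(vec *x) {
  long s = 0;
  for (int i = 0; i < x->n; i++) s += x->v[i];
  unref(x);
  return s;
}

/* underscore variant: borrows x and leaves its count untouched */
static long sum_(vec *x) {
  long s = 0;
  for (int i = 0; i < x->n; i++) s += x->v[i];
  return s;
}

/* dyad that releases y but not x, i.e. the 01 pattern; assumes equal lengths */
static long dot(vec *x, vec *y) {
  long s = 0;
  for (int i = 0; i < x->n; i++) s += x->v[i] * y->v[i];
  unref(y);
  return s;
}
```

 a caller that wants to keep using a value after passing it to a releasing
 function acquires it first, e.g. `sum(ref(x))`; passing `x` directly hands
 over the caller's reference.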

 variable naming scheme and macros
 A [xyz]  function arguments
 A a[8]   function arguments to 8-adic
 Ab8      A b[8];
 Lij      i=x[0], j=x[1]

 control structures
 _(body...)             { return ({body...}); }
 I(cond, body...)J(cond, body...)E(body...) if-then-else
 P(cond, body...)       if cond then return body...
 F[j](n, body...)       for i[j] below n
 W(cond, body...)       while
 S(value, cases...)     switch
  C(value, body...)     case value do body... and break
  R(value, body...)     case value return body...
  R[GHILFA...](body...) case t[GHILFA...]: return body...
  R_(body...)           default return body...
  S4(value, b,o,d,y)    switch with cases 0..<4
 B(cond, body...)       if cond then body... and break
 X(cases...)            switch on xt
 Y(cases...)            switch on yt
 e[cdilnoprstvz](value) produce [compile,domain,index,length,nyi,io,parse,rank,stack,type,value,limit] error and consume value
 N(value,cleanup...)    forward error: if value is an error value (zero) then cleanup... and return zero
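
 to give a feel for how such macros read in practice, here is a sketch with
 assumed expansions for `P`, `F` and `W` in standard C. the definitions are
 guesses based on the descriptions above, not copies of the real macros (which
 may rely on GNU statement expressions, as `_` does); `firstneg` and `steps`
 are hypothetical examples.

```c
#include <assert.h>
#include <stddef.h>

/* assumed expansions, for illustration only */
#define P(c, ...) if (c) { return (__VA_ARGS__); }                          /* if c then return body */
#define F(n, ...) for (size_t i = 0; i < (size_t)(n); i++) { __VA_ARGS__; } /* for i below n */
#define W(c, ...) while (c) { __VA_ARGS__; }                                /* while c do body */

/* index of the first negative element, or n if there is none */
static size_t firstneg(const long *v, size_t n) {
  F(n, P(v[i] < 0, i))
  return n;
}

/* number of collatz steps needed to reach 1 */
static int steps(unsigned long x) {
  int k = 0;
  W(x > 1, x = x & 1 ? 3 * x + 1 : x / 2, k++)
  return k;
}
```

 because the body is a macro argument, statements are separated with commas
 rather than semicolons, which is why bodies tend to be written as expressions.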

 function declarations:
 A[01234](name, body...) [01234]-arg function
 AA(name, body...)       [1-8]-arg function (args in A a[8], count in U n)
 AX(name, body...)       1+[1-8]-arg function (args in A x,A a[8], count in U n)
 X[12](name, body...)    [12]-arg function with switch on xt.
 Y2(name, body...)       2-arg function with switch on yt. 

 dispatch techniques:
 most primitives are implemented as one function per valence; the dispatch code
 indexes the vectors v1,v2,v8 from a.h (e.g. `?` is implemented by `unq`
 (monad), `que` (dyad), `ins` (tri+)). overloads are usually handled based on
 type (with `X1()` to introduce a function) or rank (with `rnk` from `f.c`) and
 are resolved within the function.
 these functions can be thin dispatch layers and delegate to specialized
 implementations; e.g. `asc` dispatches to `ascZ`,`ascA`,`grdm` and `opn` based
 on whether x is numeric, mixed, map, or symbol.
 functions that mostly just dispatch on numeric types often use the `G` macro
 to select from an array of function pointers based on width.
 for example a function `f` that calls `fG` on byte arrays, `fH` on short
 arrays, etc. may be implemented as:
 `X1(f,R_(et(x))RGHIL(G(&fG,fH,fI,fL)[xw-3](x)))`
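
 the same idea in plain C, as a sketch: a table of function pointers indexed by
 log2 of the element bitwidth minus 3. the names `sumfn`, `SUM`, `sumG`..`sumL`
 and `sum` are hypothetical, and the table is written out by hand instead of
 using the `G` macro, whose definition is not shown in this note.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t (*sumfn)(const void *p, size_t n);

/* generate one specialized implementation per element width */
#define SUM(name, T)                                  \
  static int64_t name(const void *p, size_t n) {      \
    const T *v = p;                                   \
    int64_t s = 0;                                    \
    for (size_t i = 0; i < n; i++) s += v[i];         \
    return s;                                         \
  }
SUM(sumG, int8_t)
SUM(sumH, int16_t)
SUM(sumI, int32_t)
SUM(sumL, int64_t)

/* w is log2 of the element bitwidth: 3,4,5,6 for 1,2,4,8-byte ints */
static int64_t sum(const void *p, size_t n, int w) {
  static const sumfn tbl[4] = {sumG, sumH, sumI, sumL};
  return tbl[w - 3](p, n);
}
```

 the caller is expected to have ruled out non-integer types first, which is the
 role of the `R_(et(x))` default case in the one-liner above.
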
 to avoid reimplementing the same implicit mapping logic in every arithmetic
 function, they share the following scheme:
  - toplevel arith functions like `add` and `sub` all call the `ari` function with their own verb index as extra argument
  - `ari` does implicit mapping mostly independently of the verb
  - `ari` then dispatches back to special code like `addZZ` or `addzZ` for each verb at ranks 0 and 1
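
 a much reduced sketch of this scheme: thin per-verb entry points pass a verb
 index to one shared driver, which handles atom extension and then calls back
 into a per-verb kernel. the names `ADD`, `SUB`, `kern`, `addk`, `subk` and the
 signatures are hypothetical, and the real `ari` works on K values with
 separate rank 0 and rank 1 kernels rather than on raw `long` arrays.

```c
#include <assert.h>
#include <stddef.h>

enum { ADD, SUB, NVERBS };          /* hypothetical verb indices */
typedef long (*kern)(long, long);   /* per-verb special code */
static long addk(long a, long b) { return a + b; }
static long subk(long a, long b) { return a - b; }
static const kern kerns[NVERBS] = {addk, subk};

/* shared driver: an argument of length 1 is treated as an atom and extended.
   returns the number of results written to r, or 0 on length mismatch. */
static size_t ari(int v, const long *x, size_t xn, const long *y, size_t yn, long *r) {
  size_t n = xn > yn ? xn : yn;
  if (xn != n && xn != 1) return 0;
  if (yn != n && yn != 1) return 0;
  for (size_t i = 0; i < n; i++)
    r[i] = kerns[v](x[xn == 1 ? 0 : i], y[yn == 1 ? 0 : i]);
  return n;
}

/* toplevel functions only supply their own verb index */
static size_t add(const long *x, size_t xn, const long *y, size_t yn, long *r) {
  return ari(ADD, x, xn, y, yn, r);
}
static size_t sub(const long *x, size_t xn, const long *y, size_t yn, long *r) {
  return ari(SUB, x, xn, y, yn, r);
}
```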

 file structure:
 see readme.txt