Introduction

My name is Alrick Grandison. This is the initial design for my programming language, Azen. Azen is designed to be a modern alternative to C. I will build the language using LLVM and stream my progress on YouTube. The initial design shown in this document is subject to change, but I will do my best to stay true to it.

For the time being, I will be keeping the compiler source code closed.

Tooling features I hope to accomplish:

LLVM compiler that compiles to machine code
Auto generating import modules

Algodal Zen Programming Language (Azen)

Algodal Zen Programming Language or Azen is a modern programming language that provides the low-level control and efficiency of C while incorporating modern features expected in contemporary languages. Designed as an alternative to C, Azen was created with C developers in mind.

C, like most established languages, prioritizes stability, making it difficult to introduce modern features without breaking compatibility. While this is beneficial for maintaining legacy code, C was created over 50 years ago, before many modern programming concepts and best practices were established. Continuing to use C, whether for maintaining legacy code or for leveraging new features introduced in C23 and beyond, is perfectly valid. However, Azen offers a fresh start, integrating modern features while aiming for stability.

One of Azen’s key advantages is its 100% ABI compatibility with C. This means you can:

Write new code in Azen while integrating it with existing C projects.
Gradually rewrite parts, or even the entirety, of a C project in Azen.
Use Azen for entirely new projects.

While Azen takes inspiration from several modern languages, it follows its own philosophy, emphasizing different priorities and solutions. We encourage developers to choose the right tool for their projects, whether that’s Azen or another language, since no single language is the perfect fit for all situations. Azen focuses on simplicity and intuitiveness while introducing modern memory management features for automatic coding safety and software security.

The file extension for this language is .azen.

Key Features of `Azen`

Azen retains some syntactic similarities with C, making it easy for C developers to transition. However, it also introduces breaking changes and modern conveniences. Additionally, Azen is accessible to developers without a C, C++, or assembly background, as it provides automatic memory management by default.

Some features of Azen:

Default Zero-Initialization of all variables
Format Strings and Raw Strings
Multiple Return Values
Unnamed Parameters
Coroutines
Structure Merge
Structure Association
Import and Module
Defer
Scopeless Block
Type Template and Memory Template
Macro
Raw Pointer, Managed Pointer, Value Reference, Variable Reference and Transient Pointer Reference Model
Code Configuration
Inline C and ASM embedding

Azen is well suited for systems programming and other low-level programming, as well as high-level programming. Examples include drivers, kernels, operating systems, applications and video game development.

Write whatever you desire in Azen 😊 and happy coding!

Azen Syntax

import "std:io";

export "main" proc {
    print("Hello world!");
}

A basic Hello world! program in Azen. import "std:io"; imports the io module from the standard library into the default namespace. The io module provides the print function that we use to print to the screen. Anything you define in a module is private unless you make it public or export it. Here, export "main" exports the procedure we defined; "main" is the label saved in the object binary to identify it. The proc keyword defines a function, called a procedure in Azen. The main procedure is where the program starts. In this case it has no signature, which means that it takes no parameters and returns no result.

Comments

A line comment starts with #--.

#-- This is a line comment

A block comment is enclosed between #--[[ and #--]].

#--[[
    "This is a block comment"
    "This is a block comment"
#--]]

Imports

Include a module using the import keyword.

import "math.azen"; #-- import an Azen file as a module into default namespace
import "utils.c"; #-- import a C file as a module into default namespace
import "converter.yml"; #-- import files via config file into default namespace
import "std:hash"; #-- import the standard hash module into default namespace
import "math.azen" in math; #-- import an Azen file as a module into math namespace
import "utils.c" in utils; #-- import a C file as a module into utils namespace
import "converter.yml" in cv; #-- import files via config file into cv namespace
import "std:hash" in hash; #-- import the standard hash module into hash namespace
import "[lang=  c] algorithm"; #-- import a C file as a module into default namespace
import "[lang=azen] vector"; #-- import an Azen file as a module into default namespace
import "[lang=yml] bitmap"; #-- import a config file into default namespace

The import keyword can be used to import Azen source files (modules), C source files and Azen Import Configs (YAML files). It differentiates between file types by extension. If the file has no extension or the extension is unknown, you can use the import specifier (square brackets) to specify the type of the file you are importing. If you import a C file into an Azen module, the C source is automatically translated to Azen source by the import feature. An Azen source file is a module. Everything in a module is private unless made public to other modules using the pub keyword, or the export keyword, which also exports the label for storing in object binaries. There is a special file called an "Azen Import Config", which is just a YAML file that defines the source file (.azen, .c or other .yml) and any associated library binaries (.so, .dll, .a, .lib, .o, .obj) that need to be linked by the linker.

When a namespace is provided via the in syntax, the labels from that module are guarded by the provided name. For instance, you can write math.sin(45) or utils.print_amount().

Primitive Types

Primitive types are the lowest-level types available.

type	size	description
`bool`	8 bits / 1 byte*	boolean
`b001`	1 bit	boolean
`b004`	4 bits	boolean
`b008`	8 bits / 1 byte	boolean
`b016`	16 bits / 2 bytes	boolean
`b032`	32 bits / 4 bytes	boolean
`schr`	1 byte	signed character (typically -128 to 127)
`uchr`	1 byte	unsigned character (typically 0 to 255)
`sint`	32 bits / 4 bytes*	signed integer
`s008`	1 byte	signed 8-bit integer
`s016`	2 bytes	signed 16-bit integer
`s032`	4 bytes	signed 32-bit integer
`s064`	8 bytes	signed 64-bit integer
`uint`	32 bits / 4 bytes*	unsigned integer
`u008`	1 byte	unsigned 8-bit integer
`u016`	2 bytes	unsigned 16-bit integer
`u032`	4 bytes	unsigned 32-bit integer
`u064`	8 bytes	unsigned 64-bit integer
`real`	32 bits / 4 bytes*	real number (could vary, e.g., float or double)
`r032`	4 bytes	32-bit floating point number (single precision)
`r064`	8 bytes	64-bit floating point number (double precision)
`r080`	10 bytes	extended precision floating point number
`void`	0 bytes	absence of type and size

* Size may change based on platform limitations.
NB: void has special meaning when used as a pointer.

Numeric Values

Numeric values can be written in several ways. Octal (0o), hexadecimal (0x), binary (0b) and Unicode (0u) literals can be as long as the largest literal the platform can represent. Any number can be visually separated using underscores (_). Numbers can also be written with an exponent using e; only base 10 is currently supported. As a special case, 0c produces the number representing a character. Exactly one character is allowed.

0;
1000;
1_000_000;
1e10;
1.0;
1.52e-10;
0xdd110;
0ued52ddee;
0b101010;
0o7707;
0c"A"; #-- utf-8 character
0c"Z";

String Values

A string is a series of UTF-8 characters.

"Hello World";
"Nice to meet you 😊!";
"Hello world\nIt is nice to meet you!";

The above are normal strings. They allow escape characters.

String Concatenation

Just like in C, adjacent strings are automatically concatenated.

"Hello " "World. " 
"My name is Alrick." " Nice to meet you";

Block String

"""
3-double quotes create a block string.
That reads text newlines
to allow strings to break
like this.
"""

Format Strings

These are strings that you can pass variables to. Number variables are converted to their Unicode character representation. A format string is defined using the \f prefix.

\f"Hello World, my name is {name}";
\f"Answer is equal to {number}";

Raw Strings

Both normal and format strings support escape characters. However, if you want a string to be a simple text representation and ignore all special symbols then use a raw string via \r.

\r"This is a raw string. Backslash \ means nothing here, likewise curly braces {}";
\r"C:\paths\"; #-- useful for representing paths

Hex and Unicode in Normal Strings

"Hello, I am \xffeedd\e years old";
"I am \uddee1100\e years old";

The Unicode and hex escapes must be terminated with \e.

Hex and Unicode in Format Strings

\f"Hello, I am {0xffeedd} years old";
\f"I am {0uddee1100} years old";

Ranges

A range creates a sequence of values without explicitly listing them. It uses .. between two numeric values. Ranges are inclusive.

1..20; #-- 1 to 20 inclusive

Variables

Variables are defined as type label. All variables are automatically initialized to 0 if no explicit initialization is provided. If you do not want a variable to be initialized to 0, assign the value undefined to it.

sint x; #-- initialized to 0
sint y = 5; #-- initialized to 5
sint z = undefined; #-- uninitialized
#-- Arrays and structures are also initialized to 0 by default, whether global or local.
#-- To define any variable without 0 initialization, you must assign it `undefined`.

Final Variables

A variable can be made constant programmatically using the fn keyword. The compiler already treats a variable as constant if its value does not change after initialization, so fn is not needed for optimization. However, fn is useful when, for example, a programmer is creating a library and wants to ensure that a consumer of the library does not change the value of a variable, or when making a reference to a read-only value.

sint x fn = 100; #-- x value can not be changed
sint y = 20; #-- y value can be changed
y = 10; #-- y value is changed
y fn; #-- y value can no longer be changed
sint z = 100;
z fn = 50; #-- value can no longer be changed

The variable x was made fn at initialization. As the cases of y and z show, a variable can also be made fn after initialization. Once a variable is made fn, this cannot be undone.

Arrays

Arrays are variables that are a list of items. They are defined by the type[size] label.

sint[10] x;
real[10] y = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0};
schr[10] z = {1, 2};  #-- only index 0 and 1 items are assigned values. The rest is initialized to 0

Assignment to the array is by {} which lists the values of the list. Each item in the list is a single unit. The list can be up to the size of the array or less. If less, the assignment is done in sequence starting from 0 index.

uint[10] x = {[2]=3, [4]=5};

Specific indices can be assigned during initialization by using [] within the {}.

uint[10] x = {[0..3]=1, [4..6]=7};

Specific ranges of indices can be assigned a single value by using the range specifier (..). In this case, the items at indices 0 to 3 were assigned 1 and those at indices 4 to 6 were assigned 7.

uint[10] x = 41..50;

Arrays can be assigned ranges. In this case, the items of the array are assigned 41, 42, and so on, up to the item at index 9, which is assigned 50.

uint[10] x = {[0..3]=4..7, [4..6]=9..11};

Here we are using ranges to specify both the indices and the values.

Strings can be assigned directly to arrays without any {}.

schr[12] x = "Hello World!";

In this case, the string "Hello World!" is copied into the array x.

Arrays can take their exact size from the initializing value. This is done using _ as the size, which is automatically replaced by the actual size.

sint[_] x = 5..20; #-- size 16
sint[_] y = {1, 2, 3, 4, 5, 6, 7}; #-- size 7
schr[_] z = "Hi, my name is Azen!"; #-- size 20

You can define arrays with readonly items. That is, you can not change the value of individual items of the array. This is done by appending fn.

sint[3] x fn = {5, 6, 7};
#-- compile error: x[0] = 8;
x[0]; #-- can only read items

Struct

Structure types are very similar to C, except that there is no need for a typedef.

struct MyStructuredType {
    sint x, y, z;
    real a, b, c;
    uint f;
}; #-- this semicolon is required because you can allocate variables at the same time

MyStructuredType mst;
mst = {1, 2, 3, 4.0, 5.0, 6.0, 7};
MyStructuredType ms = {.x=4, .f=12};

struct AnotherType {
    real x;
    schr y;
} at, at2 = {4.4, 0c"A"};

#-- unnamed struct
struct {
    sint x, y, z;
} pos;

pos = {1, 2, 3};

Enum

Enums are similar to C, with the addition that they can specify more than integer types. By default, enums are signed integer types, as in C.

enum MySintEnums {
    ZERO,
    ONE,
    TWO,
    THREE = 3
}; #-- ; required

MySintEnums x; #-- this is now a type(sint) enum type that only accepts specific values
x = 5; #-- compile error: value is not enum listed
x = ZERO; #-- OK and enum value names can be used without any type specifier, compiler figures it out
x = 0; #-- also ok, using raw values without enum name is allowed
x = MySintEnums:ZERO; #-- can use the type specifier to specify which type's enum you are using

type(real) enum MyRealEnums {
    ZERO,
    TWO = 2.0001,
    THREE,
    SIX = 6
} y, z = THREE;

#-- using the `type` template you can give enums different types

sint a = TWO; 
real b = TWO;

#-- you can assign enums to variables of their base types, as in the case of `a` and `b`

MySintEnums c = TWO;
MyRealEnums d = TWO;

#-- different enums values can share names. The compiler will know which value to assign
#-- based on the type of the variable. If the compiler can not decipher it for some reason
#-- it will throw an error

auto e = MySintEnums:TWO;
auto f = MyRealEnums:TWO;

#-- otherwise the programmer can specify when needed

Union

Unions are very similar to C, with the addition of type-only values. Each member type in a union must be unique.

union MyUnionType {
    sint x;
    real y;
    uint z;
    schr a;
};

union MyUnionType2 {
    sint x, y; #-- compiler error: multiple members of the exact same type
    real z;
    real a; #-- compiler error: multiple members with the same exact type
};

MyUnionType ut;
ut = sint:5;
ut = real:10;
ut = uint:25;
ut = 0c"A";
ut = 4.4;
ut.x = 16;
ut.y = 7.7;
ut.z = 2;
ut.a = 0c"B";
#-- all the above is allowed

union MyUnionType3 {
    sint _;
    real _;
} ut3 = 500;

#-- you can ignore naming members
#-- in that case, you can only assign by cast (without member access).

Logic Operation

Logic operations work on any truthy or falsy expression.

and - both conditions are true
or - either condition is true
xor - one and only one condition is true
! (NOT) - the condition is false

5 and (6 < 4);
!7;
6 == 7 or 5 != 8;

Binary Functions

asm is a keyword that is used to access built-in assembly functions. These include binary functions such as and, or, xor and not. Binary operations only work on primitive numeric types.

asm.and(lhs, rhs) - and binary operation between two values
asm.or(lhs, rhs) - or binary operation between two values
asm.xor(lhs, rhs) - xor binary operation between two values
asm.not(val) - not binary operation on one value
asm.shl(val, amt) - shift left binary operation on one value by an amount
asm.shr(val, amt) - shift right binary operation on one value by an amount
asm.rol(val, amt) - rotate left binary operation on one value by an amount
asm.ror(val, amt) - rotate right binary operation on one value by an amount

asm.and(5, 8);
asm.shl(8, 1);
asm.not(120);

If statement

Unlike in C, parentheses around the condition are not required.

if(false) {}
if true {}
if x == 4 {}
if y != 5 and x == 5 {}

Switch Which and Match Which

The which selector is similar to if but is more low level. It is the equivalent of switch in C, and it can also be used as a matcher, similar to those in other modern languages.

Which as switch

import "std:io";

sint x = 40;

which x {
    when 5 {
        print("five");
    }

    when 7, 8 {
        print("seven or eight");
    }

    when else {
        print(\f"x = {x}");
    }
}

Which as match

import "std:io";

sint x = 40;

sint y = which x when (5, (7, 8), else) -> (6, 12, 30);

For Loop

for sint i in 1..10 {}
for auto i in 1..10 {}
for i in 1..10 {}

The type of i can be omitted, in which case it is inferred. for can iterate over ranges, arrays and structures. The type of i must be inferred for structures. Of course, i can be named anything.

sint[10] a;
struct S { sint i, j; real k;} s;

for i in a {}
for i in s {}

While

while i < 10 {}
while i < 10 | i++ {}
while i < 10 | i++ | sint i {}
while i < 10 || sint i {}

Generic Block

with {}
with {
    sint x, y;
    real z;
}

Procedures

Azen represents functions as procedures. Procedures (or functions) are defined with the proc keyword.

proc (sint a) -> sint foo { return a; }
proc () -> void bar {}
proc (sint a, sint b) -> sint goo { return a + b; }
proc (sint a, sint b) -> (sint, sint) jar { return a, b; }

A procedure is defined as proc input -> output label. The signature of the function is defined using input -> output. As you can see in the examples, procedures can have multiple inputs and outputs.

With procedures, you are not required to give the inputs and outputs labels.

proc (sint) -> sint foo { return $1; }
proc () -> void bar {}
proc (sint, sint) -> sint goo { return $1 + $2; }
proc (sint, sint) -> (sint, sint) jar { return $1, $2; }

If the parameters (inputs) do not have labels, you can refer to them using the parameter specifier, the $ symbol. $1, $2 and so on refer to the inputs from left to right.

For procedures that input nothing and output nothing you can drop the signature.

proc () -> void foo1 {}
proc void -> void foo2 {}
proc -> void foo3 {}
proc void foo4 {}
proc foo5 {}
proc {}

foo1, foo2, foo3, foo4, foo5 and the unnamed procedure all have the same signature.

You can define an unlabelled procedure by dropping the label.

proc sint->sint {}

This is useful when creating nested functions or callbacks.

Coroutines

Procedures can save their state and resume from the paused state. These are called coroutines. The procedure is marked as a coroutine, and when it pauses, it yields its return value.

proc 'coroutine' sint -> sint co_fn {
    if $1 == 5 {
        return 'yield' 1;
    }
    return $1;
}

Chain Procedures

A procedure can return from itself and from its caller at the same time. This is called a chain procedure. When it does a long return, it returns from itself and from the caller. The caller of a chain procedure must have the same return type in order to call it.

proc 'chain' sint -> sint long_fn {
    if $1 == 5 {
        return 'long' 1;
    }
    return $1;
}

Use (Pure Functions)

Access to outside variables from scoped (regular) blocks can be limited using the use keyword.

use() with {} #-- the block can not access any variables defined outside of it
sint x, y, z;
use(x, y) with {} #-- the block can only access x and y from outside of it
use(x, y) proc foo {} #-- the procedure can only access x and y from outside of it
use(extern sint errorc, extern sint maxvar) proc bar {} #-- only the global variables errorc and maxvar can be accessed inside the procedure block

Single line Blocks

then keyword is used to create a single statement scoped-block.

if true then print();
for i in 1..10 then print();
while i < 10 | i++ then print();
with then print();
proc pprint then print();

Scopeless Block

A scopeless block is a block that doesn't create a new scope. This is similar to the preprocessor if-block (#if) in C. Scopeless blocks are created using begin and end instead of {}. Anything that takes a regular block can take a scopeless block instead. use does not work with scopeless blocks.

if true begin print(); end
for i in 1..10 begin print(); end
while i < 10 | i++ begin print(); end
with begin print(); end
proc pprint begin print(); end

Pointers

Azen supports raw pointers, which are equivalent to pointers in C. This allows Azen to do low-level programming, such as embedded programming, without any issues. Raw pointers require manual management by the programmer but allow fine-grained access to and manipulation of memory.

Azen answers the critique of C's memory safety with its Managed Pointers. Managed Pointers are part of Azen's unique memory management strategy, known as the Transient-Pointer-Reference-Model (TPRM). Managed Pointers are pointers created with a Memory Manager. The manager frees the memory automatically when it is no longer needed and prohibits use-after-free and other memory security problems. TPRM is Azen's solution to automatic memory management and memory security, comparable to the garbage collector, reference counting and ownership models of other programming languages.

Transient-Pointer-Reference-Model (TPRM)

The transient pointer reference model is a memory management and security model invented for Azen that ensures that:

Only one variable (the strong pointer) has full pointer privileges
All other variables (weak pointers or references) refer to that variable
Memory is freed when it goes out of scope

TPRM is designed to address memory safety concerns, including security and integrity.

TPRM works with only managed pointers.

Managed Pointers

These are pointers created by Memory Managers. A Memory Manager is a special structure that provides every procedure needed to manage memory. The standard library provides one, and programmers can create their own. The standard memory manager is Memory in the memory module.

import "std:memory";

export "main" proc {
    sint^ ptr = Memory.alloc(); #-- allocates 4 bytes
}

The manager Memory is used to allocate memory the size of a sint and assign it to the variable ptr. The manager can infer the type and size from the variable the memory is being assigned to.

import "std:memory";

export "main" proc {
    sint[10]^ ptr1 = Memory.alloc(); #-- allocates 40 bytes
    sint[*]^ ptr2 = Memory.alloc(); #-- compile error: needs explicit size
    sint[*]^ ptr3 = Memory.alloc(2); #-- allocates 8 bytes
    schr[*]^ ptr4 = Memory.alloc(2); #-- allocates 2 bytes
    void[*]^ ptr5 = Memory.alloc(2); #-- allocates 2 bytes
}

In this example, when * is used as the array size, the manager needs the programmer to specify how much memory to allocate. alloc takes the number of items to allocate, so for ptr3, 2 allocates 8 bytes because it is allocating memory for 2 integers, while ptr4 and ptr5 get 2 bytes because their unit size is one byte. * is a special, and very important, size specifier for arrays. It can only be used in pointer types. It means the array can have any size, or that its size is not known at compile time; that is, the array is dynamic. For ptr1, the manager will throw an error if there is an attempt to change the size of the array. However, the sizes of the other variables can increase or decrease without restriction.

export "main" proc {
    sint[5..10]^ ptr1 = Memory.alloc(); #-- allocates 20 bytes
    sint[10..*]^ ptr2 = Memory.alloc(); #-- allocates 40 bytes
    sint[*..10]^ ptr3 = Memory.alloc(4); #-- allocates 16 bytes    
}

Dynamic arrays can be defined with range sizes. For ptr1, the array can hold a minimum of 5 items and a maximum of 10 items; the default allocation is for the minimum amount. For ptr2, the minimum is 10 but the maximum can be any number. For ptr3, the minimum can be any amount but the maximum is 10, so you are required to specify the size when allocating.

Range Sizes of Arrays are only supported by Managed Pointers.

Managed Pointers provide features that can be accessed through pointer procedures. Managed pointers are also larger than raw pointers because they keep track of the size of the memory they point to at all times.

export "main" proc {
    sint[*]^ p = Memory.null(); #-- do not allocate any memory but assign the manager
    p = {4, 5, 6, 7, 8}; #-- automatically allocate the size needed (20 bytes) and store the array value
    p[3]; #-- 7
    p^.len(); #-- pointer procedure to access the length of the pointer array: 5
    p^.size(); #-- pointer procedure to access the size of the pointer array: 20
    p^.unit_size(); #-- pointer procedure to get the unit size: 4
    p^.append(17); #-- adds an item to the array and extend the pointer length: {4, 5, 6, 7, 8, 17}
    p^; #-- access the pointer address: 0xff525225515ac1515
    p; #-- access the memory value: {4, 5, 6, 7, 8, 17}
    p^ = Memory.alloc(20); #-- assigns new memory. The previous is automatically freed.

    #-- current memory pointed to by p is freed automatically at the end of the scope.
}

As the example above shows, you do not have to allocate memory during initialization. You can assign null, and the manager will automatically allocate memory when you assign a value. Using the variable name on its own accesses the value. For pointer access, append the postfix ^ to the variable name. This gives you the memory address and lets you call the built-in memory manager procedures.

Raw Pointers

Raw pointers are equivalent to C pointers, which are low-level pointers. Raw pointers are created using ^^, the raw pointer specifier. Memory Managers do not work with raw pointers.

sint^^ ptr1 = malloc(sizeof(sint));
defer mfree(ptr1);
sint[*]^^ ptr2 = malloc(sizeof(sint) * 100);
ptr1 = 4; #-- assign value
ptr2 = {4, 6, 8}; #-- assign value
ptr1^^; #-- access pointer
ptr2^^; #-- access pointer
sint[10]^^ ptr3; #-- null pointer
return ptr2;

The defer keyword provides some support for manual management by running the statement given to it at the end of the current scope. It can be used to call mfree at the end of the scope. Raw pointers do not have any special pointer procedures. However, their values and addresses are accessed in a similar fashion to managed pointers. You have to ensure that you allocate enough memory. The compiler does add some built-in checks for out-of-bounds and null pointer access. These checks can be removed using a compiler flag. A raw pointer can not be converted into a managed pointer, nor vice versa.

Value Reference and Variable Reference

A value reference is a reference to the value in a variable rather than to the variable that holds the value. It works in tandem with managed pointers, where it points to the memory pointed to by the managed pointer rather than to the managed pointer variable. A value reference is defined using &. Value references have no equivalent in C. A variable reference is a reference to the variable itself. It is equivalent to taking the address of a variable in C. It is defined using &&. A variable reference is always a raw pointer and does not work with managed pointers.

Value Reference is a direct reference to the value of the variable
Variable Reference is the address of the variable

sint x = 10;
typeof(x); #-- sint
typeof(&x); #-- sint&
typeof(&&x); #-- sint^^
sint^^ y = &&x;
typeof(y); #-- sint^^
typeof(&y); #-- sint^^&
typeof(&&y); #-- sint^^2

Notice the last line, typeof(&&y); #-- sint^^2. That is how Azen represents pointers to pointers: with a number literal. sint^^2 means a pointer to a sint raw pointer. sint^^3 would be a pointer to a pointer to a sint raw pointer.

sint^ mp1 = Memory.null(); #-- managed pointer to sint
sint^2 mp2 = Memory.null(); #-- pointer to managed pointer to sint
sint^3 mp3 = Memory.null(); #-- pointer to pointer to managed pointer to sint
sint^^ rp1; #-- raw pointer to sint
sint^^2 rp2; #-- pointer to raw pointer to sint
sint^^3 rp3; #-- pointer to pointer to raw pointer to sint

Implementing TPRM

sint[*]^ p1 = Memory.null(); #-- Memory Manager `Memory` gets pointer p1
p1 = {4, 5, 6, 7, 8}; #-- value is assigned to p1 memory; lets call the memory m1
typeof(p1); #-- sint[*]^
sint[*]^& r1 = &p1; #-- a value reference of m1 is created and assigned
foo(r1); #-- can pass the reference to procedures; if p1 were passed directly, the memory would be freed automatically
typeof(r1); #-- sint[*]^&
sint[*]^ p2 = p1^; #-- TPRM: p2 becomes the strong pointer of m1; p1 becomes a weak pointer, which is basically a reference
typeof(p2); #-- sint[*]^
typeof(p1); #-- sint[*]^(&)
p1 = p2^;
typeof(p2); #-- sint[*]^(&)
typeof(p1); #-- sint[*]^

Although Azen is a strongly typed language, TPRM allows the type of a managed pointer to change between strong pointer ^ and weak pointer ^(&), based on which pointer is currently assigned the memory. A weak pointer can only reference the memory, just like a regular reference. The difference between a reference and a pointer is that a pointer can manipulate the memory (increase, reduce, free, allocate, etc.) while a reference can only read from or write to it. Anything done to the memory through the pointer is visible through the reference. Once a strong pointer's memory is freed, it is freed for the weak pointers and references as well, and any further access is forbidden.

Sample Program showing pointers and references

import "std:memory";
import "std:io";

export "main" proc {
    sint[*]^ list_of_ints = Memory.alloc(10); #-- if the allocation fails, the runtime will throw an error (no need to check manually)
    initialize(&list_of_ints);
    add_extra_ints(&list_of_ints, 10);
    print_ints(&list_of_ints);
    #-- list_of_ints is auto free here
}

proc (sint[*]^& ref) -> void initialize {
    for i in 0..(ref^.len()-1) {
        sint v= (i + 1) * 10;
        ref[i] = v;
    }
}

proc (sint[*]^& ref) -> void print_ints {
    for i in 0..(ref^.last_index()) {
        print(ref[i]);
    }
}

proc (sint[*]^& ref, sint amount) -> void add_extra_ints  {
    sint next_index = ref^.len();
    ref^.extend_alloc(amount);
    initialize(ref^.get_section(next_index)); #-- creates a reference to a section of the memory
}

Code Configuration

Azen provides a special way to extend the language or modify how it behaves. This is called configuration and is written using single quotes ''. The quoted configuration follows the syntax that it is modifying.

schr[*]^ secret = Memory.alloc() 'zeroedfree';
struct 'align:4' MyData {schr d1, d2;};
proc 'inline' (sint a, sint b) -> sint sum {return a + b;}

In the above example, 'zeroedfree' tells the memory manager to clear the memory before freeing it. This is useful for keeping secrets out of junk memory. 'align:4' tells the compiler to ensure the structure is 4-byte aligned. 'inline' tells the compiler to always inline this procedure whenever it is called.

Embedded C and Assembly

C and assembly code can be written inside Azen code by embedding it. The Azen compiler will call a C compiler to compile the C code and an assembler to assemble the assembly. To embed code, use the code specifier \[]"". The string portion can be a format string, where you pass fields directly from Azen into the foreign source code, or you can use a vars specifier, which automatically formats the variables for the foreign code; this is especially important for assembly.

\[code=ASM, platform=x86, syntax=i86, vars=[
    f = myvar,
    myvar2
]]
"""
xor eax, eax
"""

\[code=C, compiler=clang]
"""
#include "maths.h"

int func() {return 5;}
"""

Azen Source Strings

Azen source code can be treated like values. This is useful for macro definition. Azen source is defined between ``.

sint x = 5;
`sint y = 7`; #-- automatically written to the source file
sint z = `x + y`;
print(z);

You can also use format source strings.

schr[_] name = "Alrick";
\f`sint var_{name} = 20`;
print(\f`var_{name}`);

Macro

Azen macros are similar to the C preprocessor. Macros are used for text replacement, functional text replacement and code generation.

macro HUNDRED -> 100;
macro NAME -> "ALRICK";
macro CONDITION -> `if $1 == true`;
macro FUNC(x, y) -> \f`export "{x}" proc {y};`;
macro FUNC(x 'Type', y 'Type') -> (x) * (y);
macro BLOCK -> with {};
macro BLOCK -> with begin end;

To generate repetitive code, macro has a for feature.

macro COUNT for i in 1..4 -> \f`print({i});\n`;
COUNT;
#-- resolved to:
#-- print(1);
#-- print(2);
#-- print(3);
#-- print(4);
