My name is Alrick Grandison. This is my initial design for my programming language, Azen. Azen is designed to be a modern alternative to C. I will build the language using LLVM and stream my progress on YouTube. The initial design shown in this document is subject to change. However, I will do my best to stay true to it.
For the time being, I will be keeping the compiler source code closed.
Tooling features I hope to accomplish:
- LLVM compiler that compiles to machine code
- Auto generating import modules
Algodal Zen Programming Language, or Azen, is a modern programming language that provides the low-level control and efficiency of C while incorporating modern features expected in contemporary languages. Designed as an alternative to C, Azen was created with C developers in mind.
C, like most established languages, prioritizes stability, making it difficult to introduce modern features without breaking compatibility. While this is beneficial for maintaining legacy code, C was created over 50 years ago, before many modern programming concepts and best practices were established. Continuing to use C, whether for maintaining legacy code or leveraging new features introduced in C23 and beyond, is perfectly valid. However, Azen offers a fresh start, integrating modern features while aiming for stability.
One of Azen’s key advantages is its 100% ABI compatibility with C. This means you can:
- Write new code in Azen while integrating it with existing C projects.
- Gradually rewrite parts, or even the entirety, of a C project in Azen.
- Use Azen for entirely new projects.
While Azen takes inspiration from several modern languages, it follows its own philosophy, emphasizing different priorities and solutions. We encourage developers to choose the right tool for their projects, whether that’s Azen or another language, since no single language is the perfect fit for all situations. Azen focuses on simplicity and intuitiveness while introducing modern memory management features for automatic code safety and software security.
The file extension for this language is `.azen`.
Azen retains some syntactic similarities with C, making it easy for C developers to transition. However, it also introduces breaking changes and modern conveniences. Additionally, Azen is accessible to developers without a C, C++, or assembly background, as it provides automatic memory management by default.
Some features of Azen:
- Default Zero-Initialization of all variables
- Format Strings and Raw Strings
- Multiple Return Values
- Unnamed Parameters
- Coroutines
- Structure Merge
- Structure Association
- Import and Module
- Defer
- Scopeless Block
- Type Template and Memory Template
- Macro
- Raw Pointer, Managed Pointer, Value Reference, Variable Reference and Transient Pointer Reference Model
- Code Configuration
- Inline C and ASM embedding
Azen is well-suited for systems programming and other low-level programming as well as high-level programming. Examples include drivers, kernels, operating systems, applications and video game development.
Write whatever you desire in Azen 😊 and happy coding!
import "std:io";
export "main" proc {
print("Hello world!");
}
A basic `Hello world!` program in Azen. `import "std:io";` imports the `io` module from the standard library into the default namespace. The `io` module provides the `print` function that we use to print to the screen. Anything you define in a module is private unless you make it public or export it. Here we export the function we defined using `export "main"`, where `"main"` is the label saved in the object binary to identify the function. The keyword `proc` defines a function, which Azen calls a procedure. The main function is where the program starts. In this case, the main function has no signature, which means that it takes no parameters and returns no result.
A line comment starts with `#--`.
#-- This is a line comment
A block comment is enclosed between `#--[[` and `#--]]`.
#--[[
"This is a block comment"
"This is a block comment"
#--]]
Include a module using the `import` keyword.
import "math.azen"; #-- import an Azen file as a module into default namespace
import "utils.c"; #-- import a C file as a module into default namespace
import "converter.yml"; #-- import files via config file into default namespace
import "std:hash"; #-- import the standard hash module into default namespace
import "math.azen" in math; #-- import an Azen file as a module into math namespace
import "utils.c" in utils; #-- import a C file as a module into utils namespace
import "converter.yml" in cv; #-- import files via config file into cv namespace
import "std:hash" in hash; #-- import the standard hash module into hash namespace
import "[lang=c] algorithm"; #-- import a C file as a module into default namespace
import "[lang=azen] vector"; #-- import an Azen file as a module into default namespace
import "[lang=yml] bitmap"; #-- import a config file into default namespace
The `import` keyword can be used to import Azen source files (modules), C source files and Azen Import Configs (YAML files). It differentiates between file types by extension. If the file does not have an extension or the extension is unknown, you can use the import specifier (square brackets) to specify the type of the file you are importing. If you import a C file into an Azen module, the C source is automatically translated to Azen source by the import feature. An Azen source file is a module. Everything in a module is private unless it is made available to other modules with the `pub` keyword or the `export` keyword, which also exports the label for storage in object binaries. There is a special file called an "Azen Import Config", which is just a YAML file that defines the source files (.azen, .c or other .yml) and any associated library binaries (.so, .dll, .a, .lib, .o, .obj) that the linker needs to link.
When a namespace is provided via the `in` syntax, the labels from that module are guarded by the provided name. For instance, you can write `math.sin(45)` or `utils.print_amount()`.
Primitive types are the lowest-level types available.
type | size | description |
---|---|---|
bool | 8 bits / 1 byte* | boolean |
b001 | 1 bit | boolean |
b004 | 4 bits | boolean |
b008 | 8 bits / 1 byte | boolean |
b016 | 16 bits / 2 bytes | boolean |
b032 | 32 bits / 4 bytes | boolean |
schr | 1 byte | signed character (typically -128 to 127) |
uchr | 1 byte | unsigned character (typically 0 to 255) |
sint | 32 bits / 4 bytes* | signed integer |
s008 | 1 byte | signed 8-bit integer |
s016 | 2 bytes | signed 16-bit integer |
s032 | 4 bytes | signed 32-bit integer |
s064 | 8 bytes | signed 64-bit integer |
uint | 32 bits / 4 bytes* | unsigned integer |
u008 | 1 byte | unsigned 8-bit integer |
u016 | 2 bytes | unsigned 16-bit integer |
u032 | 4 bytes | unsigned 32-bit integer |
u064 | 8 bytes | unsigned 64-bit integer |
real | 32 bits / 4 bytes* | real number (could vary, e.g., float or double) |
r032 | 4 bytes | 32-bit floating point number (single precision) |
r064 | 8 bytes | 64-bit floating point number (double precision) |
r080 | 10 bytes | extended precision floating point number |
void | 0 bytes | absence of type and size |
*Size may change based on platform limitations.
Note: void has a special meaning when used as a pointer.
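Since Azen targets ABI compatibility with C, the fixed-width types in the table line up naturally with C's `<stdint.h>` types. The header-style sketch below shows one possible correspondence. The mapping is an assumption made for illustration and is not taken from the Azen design. `r080` and the bit-sized booleans are left out because C has no portable equivalent for them.

```c
#include <stdint.h>

/* Hypothetical C-side aliases for Azen's fixed-width primitives.
   The alias names mirror the table; the mapping itself is an assumption. */
typedef int8_t   s008;
typedef int16_t  s016;
typedef int32_t  s032;
typedef int64_t  s064;
typedef uint8_t  u008;
typedef uint16_t u016;
typedef uint32_t u032;
typedef uint64_t u064;
typedef float    r032;  /* assumes IEEE 754 single precision */
typedef double   r064;  /* assumes IEEE 754 double precision */

/* Compile-time checks (C11) that the sizes match the table. */
_Static_assert(sizeof(s008) == 1 && sizeof(u008) == 1, "8-bit types are 1 byte");
_Static_assert(sizeof(s016) == 2 && sizeof(u016) == 2, "16-bit types are 2 bytes");
_Static_assert(sizeof(s032) == 4 && sizeof(u032) == 4, "32-bit types are 4 bytes");
_Static_assert(sizeof(s064) == 8 && sizeof(u064) == 8, "64-bit types are 8 bytes");
_Static_assert(sizeof(r032) == 4 && sizeof(r064) == 8, "float sizes match the table");
```

A header like this would let C code that links against Azen code spell the shared types the same way on both sides.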
These are numeric values, and there are multiple ways of representing them. Octal (`0o`), hexadecimal (`0x`), binary (`0b`) and Unicode (`0u`) literals can be as long as the largest literal the platform can represent. All numbers can be visually separated using the underscore `_`. Numbers can also be written with an exponent using `e`, which is currently supported only for base 10. As a special case, `0c` produces the number that represents a character; the string after it must contain exactly one character.
0;
1000;
1_000_000;
1e10;
1.0;
1.52e-10;
0xdd110;
0ued52ddee;
0b101010;
0o7707;
0c"A"; #-- utf-8 character
0c"Z";
A string is a series of UTF-8 characters.
"Hello World";
"Nice to meet you 😊!";
"Hello world\nIt is nice to meet you!";
The above are normal strings. They allow escape characters.
Just like in C, multiple adjacent strings are automatically concatenated.
"Hello " "World. "
"My name is Alrick." " Nice to meet you";
"""
3-double quotes create a block string.
That reads text newlines
to allow strings to break
like this.
"""
Format strings are strings into which you can interpolate variables. Numeric variables are converted to their Unicode text representation. A format string is defined with the `\f` prefix.
\f"Hello World, my name is {name}";
\f"Answer is equal to {number}";
Both normal and format strings support escape characters. However, if you want a string to be plain text that ignores all special symbols, use a raw string via `\r`.
\r"This is a raw string. Backslash \ means nothing here, and neither do curly braces {}";
\r"C:\\paths\"; #-- useful for representing paths
"Hello, I am \xffeedd\e years old";
"I am \uddee1100\e years old";
Unicode (`\u`) and hex (`\x`) escapes must be terminated with `\e`.
\f"Hello, I am {0xffeedd} years old";
\f"I am {0uddee1100} years old";
Ranges create a sequence of values without explicitly listing the values. A range is written with `..` between two numeric values. Ranges are inclusive.
1..20; #-- 1 to 20 inclusive
Variables are defined as `type label`. All variables are automatically initialized to 0 if no explicit initialization is provided. If you do not want a variable to be initialized to 0, assign the value `undefined` to it.
sint x; #-- initialized to 0
sint y = 5; #-- initialized to 5
sint z = undefined; #-- uninitialized
#-- Arrays and structures are also initialized to 0 by default, whether global or local.
#-- To define any variable without 0 initialization, you must assign it `undefined`.
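For readers coming from C, it may help to see what the same guarantee costs there. In C, only objects with static storage are zeroed automatically; a plain local starts with an indeterminate value, which is the behavior Azen's `undefined` opts back into. The sketch below shows the explicit C spelling of the zero-initialized cases. The type and function names are made up for illustration.

```c
#include <stddef.h>

struct point { int x, y, z; };

/* Returns a struct whose members are all zero.
   C needs the explicit `= {0}`; Azen does this by default. */
struct point make_zeroed_point(void) {
    struct point p = {0};
    return p;
}

/* Sums a locally declared, zero-initialized array; always returns 0. */
int sum_zeroed_array(void) {
    int values[10] = {0};  /* comparable to `sint[10] values;` in Azen */
    int sum = 0;
    for (size_t i = 0; i < 10; i++) {
        sum += values[i];
    }
    return sum;
}
```

Without the `= {0}` initializers, both functions would read indeterminate values in C.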
A variable can be made constant programmatically using the `fn` keyword. The compiler already treats a variable as constant when its value does not change after initialization, so `fn` is not needed for optimization. However, `fn` is useful when, for example, a library author wants to ensure that a programmer consuming the library cannot change the value of a variable, or when making a reference to a read-only value.
sint x fn = 100; #-- x value can not be changed
sint y = 20; #-- y value can be changed
y = 10; #-- y value is changed
y fn; #-- y value can no longer be changed
sint z = 100;
z fn = 50; #-- value can no longer be changed
The variable `x` was made `fn` on initialization. As the cases of `y` and `z` show, a variable can also be made `fn` after initialization. Once a variable is made `fn`, this cannot be undone.
Arrays are variables that hold a list of items. They are defined as `type[size] label`.
sint[10] x;
real[10] y = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0};
schr[10] z = {1, 2}; #-- only index 0 and 1 items are assigned values. The rest is initialized to 0
An array is assigned with `{}`, which lists the values. Each item in the list is a single unit. The list can contain up to as many items as the array size. If it contains fewer, the values are assigned in sequence starting from index 0.
uint[10] x = {[2]=3, [4]=5};
Specific indices can be assigned during initialization by using `[]` within the `{}`.
uint[10] x = {[0..3]=1, [4..6]=7};
A range of indices can be assigned a single value by using the range specifier `..`. In this case, the items at indices 0 to 3 are assigned 1 and those at indices 4 to 6 are assigned 7.
uint[10] x = 41..50;
Arrays can be assigned ranges. In this case, the items of the array are assigned 41, 42, and so on up to the item at index 9, which is assigned 50.
uint[10] x = {[0..3]=4..7, [4..6]=9..11};
Here we are using ranges to specify both the indices and the values.
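For comparison, C has no range initializers in the standard language, so the same effect takes explicit loops. The sketch below spells out what the two range forms do, using inclusive bounds as Azen ranges do. The helper names are hypothetical and are not part of Azen or any library.

```c
#include <stddef.h>

/* Assigns `value` to items first..last (inclusive),
   comparable to `{[0..3]=1}` in the Azen examples. */
void fill_index_range(unsigned *items, size_t first, size_t last, unsigned value) {
    for (size_t i = first; i <= last; i++) {
        items[i] = value;
    }
}

/* Assigns start, start+1, ... to items first..last (inclusive),
   comparable to `{[0..3]=4..7}` in the Azen examples. */
void fill_with_sequence(unsigned *items, size_t first, size_t last, unsigned start) {
    for (size_t i = first; i <= last; i++) {
        items[i] = start + (unsigned)(i - first);
    }
}
```

Calling `fill_with_sequence(x, 0, 9, 41)` on a 10-item array gives the same contents as `uint[10] x = 41..50;`.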
Strings can be assigned directly to arrays without any `{}`.
schr[12] x = "Hello World!";
In this case, the string "Hello World!" is copied into the array `x`.
An array can take its exact size from the value it is initialized with. This is done using `_` as the size, which is automatically replaced by the actual size.
sint[_] x = 5..20; #-- size 16
sint[_] y = {1, 2, 3, 4, 5, 6, 7}; #-- size 7
schr[_] z = "Hi, my name is Azen!"; #-- size 20
You can define arrays with read-only items, meaning you cannot change the value of individual items of the array. This is done by appending `fn`.
sint[3] x fn = {5, 6, 7};
#-- compile error: x[0] = 8;
x[0]; #-- can only read items
The structure type is very similar to C's, except that there is no need for a typedef.
struct MyStructuredType {
sint x, y, z;
real a, b, c;
uint f;
}; #-- this semicolon is required because you can declare variables at the same time
MyStructuredType mst;
mst = {1, 2, 3, 4.0, 5.0, 6.0, 7};
MyStructuredType ms = {.x=4, .f=12};
struct AnotherType {
real x;
schr y;
} at, at2 = {4.4, 0c"A"};
#-- unnamed struct
struct {
sint x, y, z;
} pos;
pos = {1, 2, 3};
Enums are similar to C's, with the addition that they can be based on more than integer types.
By default, enums are signed integer types, as in C.
enum MySintEnums {
ZERO,
ONE,
TWO,
THREE = 3
}; #-- ; required
MySintEnums x; #-- this is now a type(sint) enum type that only accepts specific values
x = 5; #-- compile error: value is not enum listed
x = ZERO; #-- OK and enum value names can be used without any type specifier, compiler figures it out
x = 0; #-- also ok, using raw values without enum name is allowed
x = MySintEnums:ZERO; #-- can use the type specifier to specify which type's enum you are using
type(real) enum MyRealEnums {
ZERO,
TWO = 2.0001,
THREE,
SIX = 6
} y, z = THREE;
#-- using the `type` template you can give enums different types
sint a = TWO;
real b = TWO;
#-- you can assign enum values to variables of the base types, as in the case of `a` and `b`
MySintEnums c = TWO;
MyRealEnums d = TWO;
#-- different enums' values can share names. The compiler will know which value to assign
#-- based on the type of the variable. If the compiler cannot decipher it for some reason
#-- it will throw an error
auto e = MySintEnums:TWO;
auto f = MyRealEnums:TWO;
#-- otherwise the programmer can specify when needed
Unions are very similar to C's, with the addition of type-only values.
Each member type in a union must be unique.
union MyUnionType {
sint x;
real y;
uint z;
schr a;
};
union MyUnionType2 {
sint x, y; #-- compiler error: multiple members of the exact same type
real z;
real a; #-- compiler error: multiple members of the exact same type
};
MyUnionType ut;
ut = sint:5;
ut = real:10;
ut = uint:25;
ut = 0c"A";
ut = 4.4;
ut.x = 16;
ut.y = 7.7;
ut.z = 2;
ut.a = 0c"B";
#-- all of the above are allowed
union MyUnionType3 {
sint _;
real _;
} ut3 = 500;
#-- you can ignore naming members
#-- in that case, you can only assign by cast (without member access).
Logical operators work on any truthy or falsy expression.
- `and`: both conditions are true
- `or`: at least one condition is true
- `xor`: one and only one condition is true
- `!` (NOT): the condition is false
5 and (6 < 4);
!7;
6 == 7 or 5 != 8;
`asm` is a keyword used to access built-in assembly functions. These include bitwise functions such as `and`, `or`, `xor` and `not`. The bitwise operations only work on primitive numeric types.
- `asm.and(lhs, rhs)`: bitwise AND of two values
- `asm.or(lhs, rhs)`: bitwise OR of two values
- `asm.xor(lhs, rhs)`: bitwise XOR of two values
- `asm.not(val)`: bitwise NOT of one value
- `asm.shl(val, amt)`: shift a value left by an amount
- `asm.shr(val, amt)`: shift a value right by an amount
- `asm.rol(val, amt)`: rotate a value left by an amount
- `asm.ror(val, amt)`: rotate a value right by an amount
asm.and(5, 8);
asm.shl(8, 1);
asm.not(120);
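Most of these functions correspond directly to C operators (`&`, `|`, `^`, `~`, `<<`, `>>`); in C, `5 & 8` is 0 and `8 << 1` is 16. Rotation is the exception, since C has no rotate operator. The sketch below shows the usual way to write 32-bit rotations in C. The function names are hypothetical, and the 32-bit width is an assumption; the Azen design does not state how `asm.rol` and `asm.ror` pick a width.

```c
#include <stdint.h>

/* Rotates a 32-bit value left; the amount is taken modulo 32
   so that no shift is ever by 32 or more bits. */
uint32_t rotate_left32(uint32_t value, unsigned amount) {
    amount &= 31u;
    return (value << amount) | (value >> ((32u - amount) & 31u));
}

/* Rotates a 32-bit value right; the amount is taken modulo 32. */
uint32_t rotate_right32(uint32_t value, unsigned amount) {
    amount &= 31u;
    return (value >> amount) | (value << ((32u - amount) & 31u));
}
```

Masking both shift counts keeps the expression well defined when the amount is 0, and compilers commonly recognize this pattern and emit a single rotate instruction.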
Unlike in C, parentheses around the condition are not required.
if(false) {}
if true {}
if x == 4 {}
if y != 5 and x == 5 {}
The `which` selector is similar to `if` but is more low-level. It is the equivalent of `switch` in C, and it can be used as a matcher as in other modern languages.
import "std:io";
sint x = 40;
which x {
when 5 {
print("five");
}
when 7, 8 {
print("seven or eight");
}
when else {
print(\f"x = {x}");
}
}
import "std:io";
sint x = 40;
sint y = which x when (5, (7, 8), else) -> (6, 12, 30);
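Read as an expression, the last form appears to pair each `when` case with the value at the same position after `->`. Under that reading, which is an interpretation of the example rather than something the design states, the closest C equivalent is a function wrapping a `switch`. The function name is hypothetical.

```c
/* C counterpart of:
     which x when (5, (7, 8), else) -> (6, 12, 30)
   assuming cases and results are matched by position. */
int select_value(int x) {
    switch (x) {
    case 5:
        return 6;
    case 7:
    case 8:
        return 12;
    default:
        return 30;
    }
}
```

With `x = 40` as in the example, this reading selects the `else` result, 30.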
for sint i in 1..10 {}
for auto i in 1..10 {}
for i in 1..10 {}
The type of `i` can be omitted, in which case it is inferred. `for` can iterate over ranges, arrays and structures. The type of `i` must be inferred for structures. Of course, `i` can be named anything.
sint[10] a;
struct S { sint i, j; real k;} s;
for i in a {}
for i in s {}
while i < 10 {}
while i < 10 | i++ {}
while i < 10 | i++ | sint i {}
while i < 10 || sint i {}
with {}
with {
sint x, y;
real z;
}
Functions in Azen are called procedures. Procedures are defined with the `proc` keyword.
proc (sint a) -> sint foo { return a; }
proc () -> void bar {}
proc (sint a, sint b) -> sint goo { return a + b; }
proc (sint a, sint b) -> (sint, sint) jar { return a, b; }
A procedure is defined as `proc input -> output label`. The signature of the procedure is the `input -> output` part. As you can see in the examples, procedures can have multiple inputs and outputs.
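C developers usually emulate multiple return values by returning a small struct. The sketch below shows that C idiom for the `jar` example. It only illustrates the C side of the comparison; how Azen lowers multiple outputs at the ABI level is not stated in this design, and the struct name is made up.

```c
/* A pair of ints, standing in for Azen's `(sint, sint)` output. */
struct int_pair {
    int first;
    int second;
};

/* C idiom for `proc (sint a, sint b) -> (sint, sint) jar`. */
struct int_pair jar(int a, int b) {
    struct int_pair result = { a, b };
    return result;
}
```

The caller then reads `jar(3, 4).first` and `jar(3, 4).second`, where Azen lets the procedure return `a, b` directly.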
With procedures, you are not required to give the inputs and outputs labels.
proc (sint) -> sint foo { return $1; }
proc () -> void bar {}
proc (sint, sint) -> sint goo { return $1 + $2; }
proc (sint, sint) -> (sint, sint) jar { return $1, $2; }
If the parameters (inputs) do not have labels, you can refer to them using the parameter specifier, the `$` symbol. `$1`, `$2` and so on refer to the inputs from left to right.
For procedures that take no input and produce no output, you can drop the signature.
proc () -> void foo1 {}
proc void -> void foo2 {}
proc -> void foo3 {}
proc void foo4 {}
proc foo5 {}
proc {}
`foo1`, `foo2`, `foo3`, `foo4`, `foo5` and the unnamed procedure all have the same signature.
You can define an unlabelled procedure by simply dropping the label.
proc sint->sint {}
This is useful when creating nested functions or callbacks.
Procedures can save their state and later resume from where they paused. Such procedures are called coroutines. A procedure is marked as a coroutine with 'coroutine', and it pauses by yielding its return value with return 'yield'.
proc 'coroutine' sint -> sint co_fn {
if $1 == 5 {
return 'yield' 1;
}
return $1;
}
A procedure can return from itself and from its caller at the same time. This is called a chain procedure. When it performs a long return, it returns from itself and from the caller. The caller of a chain procedure must have the same return type in order to call it.
proc 'chain' sint -> sint long_fn {
if $1 == 5 {
return 'long' 1;
}
return $1;
}
Access to outside variables from scoped (regular) blocks can be limited using the `use` keyword.
use() with {} #-- the block cannot access any variables defined outside of it
sint x, y, z;
use(x, y) with {} #-- of the variables defined outside the block, only x and y can be accessed inside it
use(x, y) proc foo {} #-- of the variables defined outside the procedure, only x and y can be accessed inside it
use(extern sint errorc, extern sint maxvar) proc bar {} #-- only the global variables errorc and maxvar can be accessed inside the procedure block
The `then` keyword is used to create a single-statement scoped block.
if true then print();
for i in 1..10 then print();
while i < 10 | i++ then print();
with then print();
proc pprint then print();
A scopeless block is a block that doesn't create a new scope. This is similar to the preprocessor if-block (`#if`) in C. Scopeless blocks are created using `begin` and `end` instead of `{}`. Anything that accepts a regular block can be given a scopeless block instead. `use` does not work with scopeless blocks.
if true begin print(); end
for i in 1..10 begin print(); end
while i < 10 | i++ begin print(); end
with begin print(); end
proc pprint begin print(); end
Azen supports raw pointers, which are equivalent to pointers in C. This allows Azen to be used for low-level work such as embedded programming. Raw pointers require manual management by the programmer but allow fine-grained access to and manipulation of memory.
Azen answers the critique of C regarding "memory safety concerns" with its Managed Pointers. Managed Pointers are part of Azen's unique memory management strategy, known as the Transient Pointer Reference Model (TPRM). Managed Pointers are pointers created with a Memory Manager. This manager frees the memory automatically when it is no longer needed and prohibits use-after-free and other memory security problems. The TPRM is Azen's solution to automatic memory management and memory security, comparable to the Garbage Collector Model, Reference Counter Model and Ownership Model of other programming languages.
The Transient Pointer Reference Model is a memory management and security model invented for Azen that ensures that:
- Only one variable (strong-pointer) has access to full pointer privileges
- All other variables (weak pointers or references) refer to that variable
- Memory is freed when out of scope
The TPRM is designed to address memory safety concerns, including security and integrity.
TPRM works only with managed pointers.
Managed pointers are pointers created by Memory Managers. A Memory Manager is a special structure that provides every procedure needed to manage memory. The standard library provides one, and programmers can create their own. The standard memory manager is `Memory` in the `memory` module.
import "std:memory";
export "main" proc {
sint^ ptr = Memory.alloc(); #-- allocates 4 bytes
}
The manager `Memory` is used to allocate memory the size of a `sint` and assign it to the variable `ptr`. The manager can infer the type and size from the variable the memory is being assigned to.
import "std:memory";
export "main" proc {
sint[10]^ ptr1 = Memory.alloc(); #-- allocates 40 bytes
sint[*]^ ptr2 = Memory.alloc(); #-- compile error: needs explicit size
sint[*]^ ptr3 = Memory.alloc(2); #-- allocates 8 bytes
schr[*]^ ptr4 = Memory.alloc(2); #-- allocates 2 bytes
void[*]^ ptr5 = Memory.alloc(2); #-- allocates 2 bytes
}
In this example, when `*` is used as the array size, the manager needs the programmer to specify how much memory to allocate. `alloc` takes the number of items to allocate, so in the case of `ptr3`, 2 allocates 8 bytes because it allocates memory for 2 integers, while `ptr4` and `ptr5` get 2 bytes because their items are one byte each.
`*` is a special, and very important, size specifier for arrays. It can only be used in pointer types. It means that the array can have any size, or that its size is not known at compile time; that is, the array is dynamic. For `ptr1`, the manager will throw an error if there is an attempt to change the size of the array. However, the sizes of the other arrays can grow or shrink without restriction.
import "std:memory";
export "main" proc {
sint[5..10]^ ptr1 = Memory.alloc(); #-- allocates 20 bytes
sint[10..*]^ ptr2 = Memory.alloc(); #-- allocates 40 bytes
sint[*..10]^ ptr3 = Memory.alloc(4); #-- allocates 16 bytes
}
Dynamic arrays can be defined with ranges as sizes. For `ptr1`, the array can hold a minimum of 5 items and a maximum of 10 items. The default allocation is for the minimum amount. For `ptr2`, the minimum is 10 items but the maximum can be any number. For `ptr3`, the minimum can be any amount but the maximum is 10; in that case you are required to specify the size when allocating.
Range sizes of arrays are only supported by Managed Pointers.
Managed Pointers provide features that can be accessed through pointer procedures. Managed pointers are also larger than raw pointers because they keep track of the size of the memory they point to at all times.
import "std:memory";
export "main" proc {
sint[*]^ p = Memory.null(); #-- do not allocate any memory but assign the manager
p = {4, 5, 6, 7, 8}; #-- automatically allocate the size needed (20 bytes) and store the array value
p[3]; #-- 7
p^.len(); #-- pointer procedure to access the length of the pointer array: 5
p^.size(); #-- pointer procedure to access the size of the pointer array: 20
p^.unit_size(); #-- pointer procedure to get the unit size: 4
p^.append(17); #-- adds an item to the array and extend the pointer length: {4, 5, 6, 7, 8, 17}
p^; #-- access the pointer address: 0xff525225515ac1515
p; #-- access the memory value: {4, 5, 6, 7, 8, 17}
p^ = Memory.alloc(20); #-- assigns new memory. The previous is automatically freed.
#-- current memory pointed to by p is freed automatically at the end of the scope.
}
As the example above shows, you don't have to allocate memory during initialization. You can assign it null and the manager will automatically allocate memory when you assign a value to it. Using the variable name by itself is value access. For pointer access, postfix the name with the pointer attribute `^`. This allows you to access the memory address as well as call built-in memory manager procedures.
Raw pointers are equivalent to C pointers, which are low-level pointers. Raw pointers are created using `^^`, the raw pointer specifier. Memory Managers do not work with raw pointers.
sint^^ ptr1 = malloc(sizeof(sint));
defer mfree(ptr1);
sint[*]^^ ptr2 = malloc(sizeof(sint) * 100);
ptr1 = 4; #-- assign value
ptr2 = {4, 6, 8}; #-- assign value
ptr1^^; #-- access pointer
ptr2^^; #-- access pointer
sint[10]^^ ptr3; #-- null pointer
return ptr2;
The `defer` keyword provides some support for manual management by running the statement given to it at the end of the current scope. It can be used to call `mfree` at the end of the scope. Raw pointers do not have any special pointer procedures. However, their values and addresses are accessed in a similar fashion to managed pointers. You have to ensure that you allocate enough memory. The compiler does add some built-in checks for out-of-bounds and null pointer access. These checks can be removed using a compiler flag. A raw pointer cannot be converted into a managed pointer, nor vice versa.
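C has no `defer`, so the usual substitute is a single cleanup label that every exit path jumps to. The sketch below shows that C pattern to make clear what `defer` saves the programmer from writing. The function is a made-up example; it uses the standard `malloc` and `free`.

```c
#include <stdlib.h>

/* Returns 1 + 2 + ... + count, or -1 if the allocation fails. */
long sum_first(size_t count) {
    long result = -1;
    int *items = malloc(count * sizeof *items);
    if (items == NULL) {
        goto cleanup;
    }

    for (size_t i = 0; i < count; i++) {
        items[i] = (int)(i + 1);
    }
    result = 0;
    for (size_t i = 0; i < count; i++) {
        result += items[i];
    }

cleanup:
    free(items);  /* the statement a `defer` would run at scope exit */
    return result;
}
```

With `defer`, the release is written once next to the allocation and runs on every exit path, so the label and the `goto` are not needed.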
A value reference is a reference to the value in the variable rather than to the variable that holds the value. It works in tandem with managed pointers, where it points to the memory pointed to by the managed pointer rather than to the managed pointer variable. A value reference is defined using `&`. C has no equivalent of a value reference. A variable reference is a reference to the variable itself. It is equivalent to taking the address of a variable in C. It is defined using `&&`. A variable reference is always a raw pointer and does not work with Managed Pointers.
- Value Reference is a direct reference to the value of the variable
- Variable Reference is the address of the variable
sint x = 10;
typeof(x); #-- sint
typeof(&x); #-- sint&
typeof(&&x); #-- sint^^
sint^^ y = &&x;
typeof(y); #-- sint^^
typeof(&y); #-- sint^^&
typeof(&&y); #-- sint^^2
Notice the last line, `typeof(&&y); #-- sint^^2`. That's how Azen represents pointers to pointers: with a number literal. `sint^^2` means a pointer to a sint raw pointer. If it were `sint^^3`, it would be a pointer to a pointer to a sint raw pointer.
sint^ mp1 = Memory.null(); #-- managed pointer to sint
sint^2 mp2 = Memory.null(); #-- pointer to managed pointer to sint
sint^3 mp3 = Memory.null(); #-- pointer to pointer to managed pointer to sint
sint^^ rp1; #-- raw pointer to sint
sint^^2 rp2; #-- pointer to raw pointer to sint
sint^^3 rp3; #-- pointer to pointer to raw pointer to sint
sint[*]^ p1 = Memory.null(); #-- Memory Manager `Memory` gets pointer p1
p1 = {4, 5, 6, 7, 8}; #-- value is assigned to p1 memory; lets call the memory m1
typeof(p1); #-- sint[*]^
sint[*]^& r1 = &p1; #-- a value reference of m1 is created and assigned
foo(r1); #-- can pass the reference to procedures; if p1 was passed directly, the memory would have been freed automatically
typeof(r1); #-- sint[*]^&
sint[*]^ p2 = p1^; #-- TPRM: p2 becomes the strong pointer of m1; p1 becomes a weak pointer, which is basically a reference
typeof(p2); #-- sint[*]^
typeof(p1); #-- sint[*]^(&)
p1 = p2^;
typeof(p2); #-- sint[*]^(&)
typeof(p1); #-- sint[*]^
Although Azen is a strongly typed language, the TPRM allows Azen to change the type of a managed pointer to either a strong pointer `^` or a weak pointer `^(&)`, based on which pointer currently holds the memory. A weak pointer can only reference the memory, just like a regular reference. The difference between a reference and a pointer is that a pointer can manipulate the memory (grow, shrink, free, allocate, etc.) while a reference can only read from or write to it. Anything the pointer does to the memory is visible through the reference.
Once the memory of a strong pointer is freed, it is freed for the weak pointers and references as well, and any further access is forbidden.
import "std:memory";
import "std:io";
export "main" proc {
sint[*]^ list_of_ints = Memory.alloc(10); #-- if the allocation fails, the runtime will throw an error (no need to check manually)
initialize(&list_of_ints);
add_extra_ints(&list_of_ints, 10);
print_ints(&list_of_ints);
#-- list_of_ints is auto free here
}
proc (sint[*]^& ref) -> void initialize {
for i in 0..(ref^.len()-1) {
sint v= (i + 1) * 10;
ref[i] = v;
}
}
proc (sint[*]^& ref) -> void print_ints {
for i in 0..(ref^.last_index()) {
print(ref[i]);
}
}
proc (sint[*]^& ref, sint amount) -> void add_extra_ints {
sint next_index = ref^.len();
ref^.extend_alloc(amount);
initialize(ref^.get_section(next_index)); #-- creates a reference to a section of the memory
}
Azen provides a special way to extend the language or modify how the language behaves. This is called configuration and is written in single quotes `''`. The single-quoted configuration follows the syntax that it modifies.
schr[*]^ secret = Memory.alloc() 'zeroedfree';
struct 'align:4' MyData {schr d1, d2;};
proc 'inline' (sint a, sint b) -> sint sum {return a + b;}
In the above example, 'zeroedfree' tells the memory manager to clear the memory before freeing it. This is useful for keeping secrets out of freed (junk) memory. 'align:4' tells the compiler to ensure the structure is 4-byte aligned. 'inline' tells the compiler to always inline this procedure wherever it is called.
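For a sense of what a zero-before-free policy involves, here is how it might be written by hand in C. This is a sketch of the general technique only, with hypothetical function names; it is not the implementation behind 'zeroedfree'. The `volatile` qualifier is there because a plain `memset` right before `free` may be optimized away by the compiler.

```c
#include <stdlib.h>

/* Overwrites `size` bytes with zero. Writing through a volatile
   pointer keeps the compiler from eliding the stores. */
void secure_zero(void *block, size_t size) {
    volatile unsigned char *bytes = block;
    for (size_t i = 0; i < size; i++) {
        bytes[i] = 0;
    }
}

/* Clears a heap block and then releases it. The caller must pass
   the block's size, since C's allocator does not expose it. */
void zeroed_free(void *block, size_t size) {
    if (block == NULL) {
        return;
    }
    secure_zero(block, size);
    free(block);
}
```

A managed pointer already tracks the size of its memory, which is what would allow a memory manager to apply this kind of policy without the caller passing a size.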
C and ASM code can be written inside Azen code by embedding it. The Azen compiler will call a C compiler to compile the C code and an assembler to assemble the assembly. To embed code, use the code specifier `\[]""`. The string portion can be a format string, through which you can pass fields directly from Azen into the foreign source code, or you can use a `vars` specifier, which automatically formats the variables for the foreign code; this is especially important for assembly.
\[code=ASM, platform=x86, syntax=i86, vars=[
f = myvar,
myvar2
]]
"""
xor eax, eax
"""
\[code=C, compiler=clang]
"""
#include "maths.h"
int func() {return 5;}
"""
Azen source code can be treated like a value. This is useful for macro definitions. Azen source is written between backticks (``).
sint x = 5;
`sint y = 7`; #-- automatically written to the source file
sint z = `x + y`;
print(z);
You can also use a format source string:
schr[_] name = "Alrick";
\f`sint var_{name} = 20`;
print(\f`var_{name}`);
Azen's macros are similar to the C preprocessor. Macros are used for text replacement, functional text replacement and code generation.
macro HUNDRED -> 100;
macro NAME -> "ALRICK";
macro CONDITION -> `if $1 == true`;
macro FUNC(x, y) -> \f`export "{x}" proc {y};`;
macro FUNC(x 'Type', y 'Type') -> (x) * (y);
macro BLOCK -> with {};
macro BLOCK -> with begin end;
To generate repetitive code, `macro` has a `for` feature.
macro COUNT for i in 1..4 -> \f`print({i});\n`;
COUNT;
#-- resolved to:
#-- print(1);
#-- print(2);
#-- print(3);
#-- print(4);
Initially I had settled on calling my language Zen and using the extension .zen, because that name reflects my language being simple and easy to code in. However, I recently discovered two other programming language projects that use the same name. So I have renamed the language to Algodal Zen Programming Language, or Azen for short, and use the extension .azen to distinguish my language from other languages called Zen.