Lately I've been struggling with creating custom file formats for my game. Every time I wanted to create one, I would go through the same tedious, boring design process, usually deliberating over whether it should be binary, XML, JSON, etc. There are so many options for doing the exact same thing, but they all have tradeoffs, and the right choice depends heavily on what you're storing.
Take renderable objects as an example. Meshes usually contain arbitrary primitives whose counts aren't fixed, so jumping around and manually placing next_offset and previous_offset variables with magic numbers everywhere is extremely annoying. They sometimes describe scenes of sub-renderables as well, so you would also have to somehow store the transform components along with how many children a node currently has (which is unknown up front!), and so on.
Outputting this kind of data into a binary file is awkward too: later you have to read it back in a very specific sequence, redefine the constants in the actual program that reads the file, and so on. It's a pain.
The files usually shared a common pattern that seemed possible to put into some perspective, but I didn't know what it was. So, at some point, just for fun, I thought: "Well, if you put these grouped chunks of objects on a board and look at how they relate to each other, you would notice they can be represented as trees!"
And that's literally it; it's nothing more complicated than this. Even better, it's fairly trivial to implement. I was a little blown away when I thought of it, and I immediately went ahead and started implementing and architecting it. It took a couple of weeks to think through the architecture alone, and it wasn't easy to come up with something practical, but in the end it worked!
Now, with the architecture I came up with, it's a lot easier to write each of the nodes in an array out to memory with a simple procedure call, and the procedure itself does all of the work with the recursive tree approach, which is simple too. You can easily move nodes around and things will still work perfectly.
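As a rough illustration (the names here just mirror the example below, not a fixed set), a scene file can be viewed as a tree like this:

scene
├── node (transform, mesh id, flags)
│   └── node_children_ids
├── node (transform, mesh id, flags)
└── node (transform, mesh id, flags)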
Here's a code example written in Odin:
// ------ scene ------
scene_tree_node := push_binary_node(Magic_Number(BNCFT.scene_tree), nil, size_of(Scene_Tree))
scene_tree_data := cast(^Scene_Tree)scene_tree_node.data_ptr
scene_tree_data.node_count = len(gltf_scene.nodes)
for node in gltf_scene.nodes
{
	s_tree_node := push_binary_node(Magic_Number(BNCFT.scene_tree_node), scene_tree_node, size_of(Scene_Tree_Node))
	s_tree_node_data := cast(^Scene_Tree_Node)s_tree_node.data_ptr
	s_tree_node_data^ = cast(Scene_Tree_Node)node
}

scene_node := push_binary_node(Magic_Number(BNCFT.scene), nil, size_of(Scene))
scene_data := cast(^Scene)scene_node.data_ptr
scene_data.node_count = len(gltf_data.nodes)
for node in gltf_data.nodes
{
	s_node := push_binary_node(Magic_Number(BNCFT.node), scene_node, size_of(Node))
	s_node_data := cast(^Node)s_node.data_ptr
	s_node_data.transform = Transform {
		position    = node.translation,
		scale       = node.scale,
		orientation = node.rotation,
	}
	s_node_data.mesh_id = cast(int)node.mesh.?
	if len(node.children) > 0
	{
		s_child_node := push_binary_node(Magic_Number(BNCFT.node_children_ids), s_node, cast(Memory_Size)(size_of(Child_Node) * len(node.children)))
		s_child_node_data := slice.from_ptr(cast(^Child_Node)s_child_node.data_ptr, len(node.children))
		for child_node, child_node_id in node.children {
			s_child_node_data[child_node_id] = cast(Child_Node)child_node
		}
		s_node_data.child_count = len(node.children)
		s_node_data.flags += {.has_child}
	}
}

scene_memory_buffer := allocate_and_write_nodes_to_memory({
	scene_tree_node,
	scene_node,
})

output_scene_file_name := strings.concatenate([]string{cmd_arguments[2], "_sc.scene"})
if output_buffer_to_file(output_folder_path, output_scene_file_name, scene_memory_buffer) == -1
{
	fmt.println("[ERROR] failed to output the scene memory buffer!")
	return
}
The push_binary_node(...) procedure takes the following arguments:
push_binary_node :: proc(type: Magic_Number, parent: ^Binary_Node, data_size: Memory_Size, allocator := context.temp_allocator) -> ^Binary_Node
{
// ...
}
Here, the allocate_and_write_nodes_to_memory(...) procedure takes an array of nodes and writes them out to memory. This resulting memory is the entire file, already constructed and organized for us; I didn't have to do much here at all. Each node's children get written out into that node's memory chunk, like the following:
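[ node header ][ node data ][ child 0 header ][ child 0 data ][ child 0's children ... ][ child 1 header ][ child 1 data ] ...

Each child chunk recursively has the same layout, so the entire file is just this pattern nested.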
The headers have a constant size, and they are necessary to tell you what the next main node is.
Suppose you have a polygon file with four nodes: scene_obj, poly_obj, vertex_buffer, index_buffer. These four nodes are the main sequential nodes you iterate through, so the headers chain from scene_obj to poly_obj to vertex_buffer to index_buffer, and that's it; the headers don't iterate through their children. The children, in turn, point to each other in exactly the same way, and the cycle continues as the hierarchy gets deeper and deeper. This is what really helped make the tree concept a practical thing.
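Visually, the chain of main headers for that polygon file would look like this:

scene_obj --next_offset--> poly_obj --next_offset--> vertex_buffer --next_offset--> index_buffer

(and prev_offset links the same chain backwards)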
Now, reading a file back is also quite straightforward. Here's an example in Odin:
poly_obj_file := _load_file(strings.concatenate([]string{folder_path, "/", json_root["poly_obj"].(json.String)}))
// ...
current_poly_obj_file_offset := 0x0
poly_obj_file_is_done := false
for poly_obj_file_is_done == false
{
	// read the constant-size header at the current offset
	current_poly_obj_file_header := get_header_from_file(task_data.poly_obj_file, current_poly_obj_file_offset)
	#partial switch cast(model_format.BNPFT)current_poly_obj_file_header.magic_number
	{
	case .vertex_buffer:
		{
			current_poly_obj_file_offset += size_of(model_format.Binary_Node_Header)
			vertex_buffer_info := get_data_from_file(model_format.Vertex_Buffer_Info, task_data.poly_obj_file, current_poly_obj_file_offset)
			vertex_buffer_size := size_of(Generic_Model_Vertex) * vertex_buffer_info.vertex_count
			vertex_buffer_slice := task_data.poly_obj_file[
				current_poly_obj_file_offset + size_of(model_format.Vertex_Buffer_Info):
				current_poly_obj_file_offset + size_of(model_format.Vertex_Buffer_Info) + vertex_buffer_size]
			_ = push_buffer(
				command_buffer, Push_Storage_Buffer_Create_Info {
					buffer = raw_data(vertex_buffer_slice),
					size = vertex_buffer_size,
				},
				input_resource_handle = task_data.gpu_vertex_buffer,
			)
		}
	case .index_buffer:
		{
			current_poly_obj_file_offset += size_of(model_format.Binary_Node_Header)
			index_buffer_info := get_data_from_file(model_format.Index_Buffer_Info, task_data.poly_obj_file, current_poly_obj_file_offset)
			index_buffer_size := size_of(u32) * index_buffer_info.index_count
			index_buffer_slice := task_data.poly_obj_file[
				current_poly_obj_file_offset + size_of(model_format.Index_Buffer_Info):
				current_poly_obj_file_offset + size_of(model_format.Index_Buffer_Info) + index_buffer_size]
			_ = push_buffer(
				command_buffer, Push_Storage_Buffer_Create_Info {
					buffer = raw_data(index_buffer_slice),
					size = index_buffer_size,
				},
				input_resource_handle = task_data.gpu_index_buffer,
			)
		}
	}
	// jump to the next main node through the absolute next_offset in the header
	if current_poly_obj_file_header.next_offset == model_format.BINARY_NODE_HEADER_NEXT_OFFSET_UNDEFINED_VALUE {
		poly_obj_file_is_done = true
	}
	else {
		current_poly_obj_file_offset = cast(int)current_poly_obj_file_header.next_offset
	}
}
Here we didn't have to go through the meshes themselves; we just iterate through the main nodes we care about (the vertex_buffer and index_buffer nodes). And as you can see, we didn't have to do much here at all, just simple buffer offset increments.
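The get_header_from_file(...) and get_data_from_file(...) helpers are just thin reads out of the file buffer; a minimal sketch of them could look something like this (using core:mem, with all error checking stripped out):

get_header_from_file :: proc(file: []byte, offset: int) -> model_format.Binary_Node_Header
{
	// copy the constant-size header out of the buffer at the given offset
	header: model_format.Binary_Node_Header
	mem.copy(&header, raw_data(file[offset:]), size_of(model_format.Binary_Node_Header))
	return header
}

get_data_from_file :: proc($T: typeid, file: []byte, offset: int) -> T
{
	// same idea, but for an arbitrary node data struct
	data: T
	mem.copy(&data, raw_data(file[offset:]), size_of(T))
	return data
}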
However, you'll also notice that you can't simply index into a specific node this way, since you don't know ahead of time which main nodes were written out; you have to iterate through them and find out. So maybe providing a layout buffer with a hash map structure for faster indexing could help when needed? I'm not sure.
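For example, something like this could build that index in a single pass over the main headers (a rough sketch; Node_Location is a made-up struct just for this, and it assumes each main node type appears at most once):

Node_Location :: struct
{
	offset: int,
	size:   model_format.Memory_Size,
}

build_node_index :: proc(file: []byte) -> map[model_format.Magic_Number]Node_Location
{
	index := make(map[model_format.Magic_Number]Node_Location)
	offset := 0
	for
	{
		header := get_header_from_file(file, offset)
		// later nodes with the same magic number would overwrite earlier ones
		index[model_format.Magic_Number(header.magic_number)] = Node_Location {
			offset = offset,
			size   = header.total_size,
		}
		if header.next_offset == model_format.BINARY_NODE_HEADER_NEXT_OFFSET_UNDEFINED_VALUE {
			break
		}
		offset = cast(int)header.next_offset
	}
	return index
}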
And that's all there is to this tree approach; there's really not much else to talk about because of how simple it is. That said, I wouldn't go absolutely nuts with this for everything; that would be stupid. In most scenarios you don't really need this approach, and honestly JSON would do just fine for most purposes anyway. But at least it's an interesting idea I got to work out, and it's neat!
Thank you for reading the article!
And here are the implementation details behind all of this:
Memory_Offset :: distinct u64
Memory_Size :: distinct u64
Magic_Number :: distinct u64
/*
An important thing to note is that your magic numbers must reserve 0 as the undefined value for things to work out. As an example:

BNSFT :: enum Magic_Number // Binary_Node_Shading_File_Type
{
	undefined   = 0, // `undefined` must be at zero
	shading_obj = 1,
	image_obj   = 2,
	sampler_obj = 3,
	material    = 4,
	image       = 5,
	sampler     = 6,
}

You have to follow this sort of enum if you want things to work out with the current implementation;
you can modify the implementation a little if you have a better approach to this.
*/
Binary_Node_Header :: struct
{
	magic_number: int,
	// NOTE: need to know the total size of the chunk (which does not include the header)
	total_size: Memory_Size,
	prev_offset: Memory_Offset,
	prev_magic_number: Magic_Number,
	next_offset: Memory_Offset,
	next_magic_number: Magic_Number,
}
BINARY_NODE_HEADER_NEXT_OFFSET_UNDEFINED_VALUE :: ~Memory_Offset(0x0)
BINARY_NODE_HEADER_PREV_OFFSET_UNDEFINED_VALUE :: ~Memory_Offset(0x0)
DEFAULT_BINARY_NODE_HEADER_VALUE :: Binary_Node_Header {
	magic_number = 0x0,
	total_size = 0x0,
	prev_offset = BINARY_NODE_HEADER_PREV_OFFSET_UNDEFINED_VALUE,
	prev_magic_number = 0x0,
	next_offset = BINARY_NODE_HEADER_NEXT_OFFSET_UNDEFINED_VALUE,
	next_magic_number = 0x0,
}
get_total_node_header_memory_size :: #force_inline proc(node: ^Binary_Node) -> Memory_Size { return cast(Memory_Size)(size_of(Binary_Node_Header) * node.total_node_count) }
get_total_node_header_memory_offset :: #force_inline proc(node: ^Binary_Node) -> Memory_Offset { return cast(Memory_Offset)(size_of(Binary_Node_Header) * node.total_node_count) }
get_total_node_children_size :: #force_inline proc(node: ^Binary_Node) -> Memory_Size
{
	return cast(Memory_Size)(get_total_node_header_memory_size(node)) + cast(Memory_Size)(node.total_node_size - node.data_size)
}
get_total_node_children_offset :: #force_inline proc(node: ^Binary_Node) -> Memory_Offset
{
	return cast(Memory_Offset)(get_total_node_header_memory_size(node)) + cast(Memory_Offset)(node.total_node_size - node.data_size)
}
Binary_Node :: struct
{
	next, prev: ^Binary_Node,
	parent: ^Binary_Node,
	child: ^Binary_Node,
	// NOTE: the idea is that we could use the total size of the
	// node as total backend memory so we could pre-allocate later
	total_node_size: Memory_Size,
	total_node_count: int,
	data_size: Memory_Size,
	data_ptr: rawptr,
	// NOTE: an undefined magic number is 0
	type: Magic_Number,
}
push_binary_node :: proc(type: Magic_Number, parent: ^Binary_Node, data_size: Memory_Size, allocator := context.temp_allocator) -> ^Binary_Node
{
	node_ptr := new(Binary_Node, allocator)
	node_ptr^ = Binary_Node {
		parent = parent,
		total_node_size = data_size,
		data_size = data_size,
		data_ptr = cast(rawptr)make([^]byte, data_size, allocator),
		type = type,
	}
	if parent != nil
	{
		// new children are prepended: the previous head of the child list
		// becomes this node's `next` sibling
		if parent.child != nil
		{
			prev_child_ptr := parent.child
			prev_child_ptr.prev = node_ptr
			node_ptr.next = prev_child_ptr
		}
		parent.child = node_ptr
	}
	// propagate this node's size and count up the ancestor chain
	for parent_node := node_ptr.parent;
		parent_node != nil;
		parent_node = parent_node.parent
	{
		parent_node.total_node_size += data_size
		parent_node.total_node_count += 1
	}
	return node_ptr
}
get_total_node_memory_size :: #force_inline proc(node: ^Binary_Node) -> Memory_Size
{
	return size_of(Binary_Node_Header) + get_total_node_header_memory_size(node) + node.total_node_size
}
allocate_memory_buffer_from_nodes :: proc(nodes: []^Binary_Node, allocator := context.allocator) -> []byte
{
	total_memory_size: Memory_Size
	for node in nodes {
		total_memory_size += size_of(Binary_Node_Header) + get_total_node_header_memory_size(node) + node.total_node_size
	}
	return make([]byte, total_memory_size, allocator)
}
Write_Node_To_Memory_Info :: struct
{
	memory: []byte,
	current_offset: Memory_Offset,
	previous_offset: Memory_Offset,
	previous_magic_number: Magic_Number,
	next_offset: Memory_Offset,
	next_magic_number: Magic_Number,
}
write_node_to_memory :: proc(node: ^Binary_Node, write_info: Write_Node_To_Memory_Info)
{
	node_header := cast(^Binary_Node_Header)slice.as_ptr(write_info.memory[:size_of(Binary_Node_Header)])
	node_header^ = DEFAULT_BINARY_NODE_HEADER_VALUE
	node_header.magic_number = cast(int)node.type
	node_header.total_size = get_total_node_memory_size(node) - size_of(Binary_Node_Header)
	if write_info.previous_magic_number != 0 // is not undefined
	{
		node_header.prev_offset = cast(Memory_Offset)write_info.previous_offset
		node_header.prev_magic_number = write_info.previous_magic_number
	}
	if write_info.next_magic_number != 0 // is not undefined
	{
		node_header.next_offset = cast(Memory_Offset)write_info.next_offset
		node_header.next_magic_number = write_info.next_magic_number
	}
	data_slice := write_info.memory[size_of(Binary_Node_Header) : size_of(Binary_Node_Header) + node.data_size]
	data_ptr := slice.as_ptr(data_slice)
	mem.copy(data_ptr, node.data_ptr, cast(int)node.data_size)
	/* NOTE:
		make sure we get the first ever allocated child node to maintain proper order,
		since the child list unfortunately stores them in the opposite order;
		this could be avoided by also storing the first ever child
		node in the parent, but i didn't bother
	*/
	first_child_node := node.child
	for ; first_child_node != nil; first_child_node = first_child_node.next
	{
		if first_child_node.next == nil {
			break
		}
	}
	curr_child_node_mem_off := cast(Memory_Offset)(size_of(Binary_Node_Header) + node.data_size)
	for child_node := first_child_node; child_node != nil; child_node = child_node.prev
	{
		child_prev_offset: Memory_Offset
		child_prev_magic_number: Magic_Number
		child_next_offset: Memory_Offset
		child_next_magic_number: Magic_Number
		child_prev_offset = BINARY_NODE_HEADER_PREV_OFFSET_UNDEFINED_VALUE
		child_next_offset = BINARY_NODE_HEADER_NEXT_OFFSET_UNDEFINED_VALUE
		// `next` points at the older sibling (written before this one),
		// `prev` points at the newer sibling (written after this one)
		if child_node.next != nil
		{
			child_prev_offset = write_info.current_offset + curr_child_node_mem_off - cast(Memory_Offset)get_total_node_memory_size(child_node.next)
			child_prev_magic_number = child_node.next.type
		}
		if child_node.prev != nil
		{
			child_next_offset = write_info.current_offset + curr_child_node_mem_off + cast(Memory_Offset)get_total_node_memory_size(child_node)
			child_next_magic_number = child_node.prev.type
		}
		child_buffer_min_offset := cast(Memory_Size)curr_child_node_mem_off
		child_buffer_max_offset := cast(Memory_Size)curr_child_node_mem_off + get_total_node_memory_size(child_node)
		write_node_to_memory(child_node, Write_Node_To_Memory_Info {
			memory = write_info.memory[child_buffer_min_offset:child_buffer_max_offset],
			current_offset = write_info.current_offset + curr_child_node_mem_off,
			previous_offset = child_prev_offset,
			previous_magic_number = child_prev_magic_number,
			next_offset = child_next_offset,
			next_magic_number = child_next_magic_number,
		})
		curr_child_node_mem_off += cast(Memory_Offset)get_total_node_memory_size(child_node) // skip to the next node's chunk
	}
}
allocate_and_write_nodes_to_memory :: proc(nodes: []^Binary_Node, allocator := context.allocator) -> []byte
{
	memory := allocate_memory_buffer_from_nodes(nodes, allocator)
	min_offset := Memory_Offset(0)
	max_offset := Memory_Offset(0)
	for node, id in nodes
	{
		previous_offset := (id > 0) ? min_offset - cast(Memory_Offset)get_total_node_memory_size(nodes[id - 1]) : BINARY_NODE_HEADER_PREV_OFFSET_UNDEFINED_VALUE
		previous_magic_number := (id > 0) ? nodes[id - 1].type : 0
		next_offset := (id < len(nodes) - 1) ? min_offset + cast(Memory_Offset)get_total_node_memory_size(node) : BINARY_NODE_HEADER_NEXT_OFFSET_UNDEFINED_VALUE
		next_magic_number := (id < len(nodes) - 1) ? nodes[id + 1].type : 0
		total_node_size := get_total_node_memory_size(node)
		max_offset += cast(Memory_Offset)total_node_size
		write_node_to_memory(node, Write_Node_To_Memory_Info {
			memory = memory[min_offset:max_offset],
			current_offset = min_offset,
			previous_offset = previous_offset,
			previous_magic_number = previous_magic_number,
			next_offset = next_offset,
			next_magic_number = next_magic_number,
		})
		min_offset += cast(Memory_Offset)total_node_size
	}
	return memory
}