This is not meant to be a hard proposal for a new file format. It's just something I've occasionally thought about and decided to put out into the world to crystallize my own thinking and invite criticism.
Why a new ANSi Art file format?
.ANS files have limitations:
- I can only use certain fonts: IBM Codepages, Amiga fonts, and a few others. The XBin format allows a user-defined font to be used, but these can only be 8 pixels wide and up to 32 pixels high. What if I wanted to use a font only 4 pixels wide and 8 pixels high? Maybe I want to create my own font and give it a name that people can see.
- SAUCE records are attached to the end of the file. This is problematic when streaming data: if you are defining how something should be displayed then you need to specify this before sending the data you're displaying, not afterwards.
- SAUCE records are also fairly wasteful and contain lots of unused or padded bytes, this is especially true for records with comments, and strings have to be encoded as Codepage 437.
- ANSi art might be UTF-8 instead of a single-byte encoding scheme.
- If we're displaying an ansimation then we need to know exactly how to throttle the stream (or not at all) to display it at the correct rate.
- There are certain characters that we can't display, but can in .BIN or .XB files, specifically the carriage return, line feed, substitute (escape), and tab characters, as these are interpreted as whitespace but an artist might want to use the literal representation of these codes in their work.
- The code for ANSi art is awkward and requires interpreting numerical values from ASCII using a parser. Things could be made simpler.
So this new file format works as follows:
The file starts with the literal characters ANSi in ASCII, the magic number.
Then follows a single byte which is equivalent to the ANSiFlags byte in SAUCE records, but with the three unused bits reserved for the font definition, UTF-8, and ANSimation flags:
Font definition | UTF-8 | ANSimation | Aspect Ratio | Letter Spacing | Blink |
---|---|---|---|---|---|
0 | 0 | 0 | 0 0 | 0 0 | 0 |
Aspect ratio, letter spacing, and blink work the same as SAUCE, but the ANSimation flag is a replacement for the ANSimation filetype in the SAUCE record. As you'd expect the UTF-8 flag means we're encoding as UTF-8 and not within the 0-255 byte range, and when the Font definition bit is set we're going to be supplying our own font bitmask.
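To make the flags byte concrete, here is a minimal sketch of decoding it. It assumes the table above is read from the most significant bit down to the least significant bit, which matches how the SAUCE ANSiFlags byte lays out aspect ratio, letter spacing, and blink. The names `AnsiFlags` and `decode_flags` are made up for illustration.

```python
from typing import NamedTuple


class AnsiFlags(NamedTuple):
    font_definition: bool  # assumed bit 7
    utf8: bool             # assumed bit 6
    ansimation: bool       # assumed bit 5
    aspect_ratio: int      # 2-bit field, bits 4-3, same meaning as SAUCE
    letter_spacing: int    # 2-bit field, bits 2-1, same meaning as SAUCE
    blink: bool            # bit 0, interpreted as in SAUCE


def decode_flags(byte: int) -> AnsiFlags:
    """Split the flags byte into its fields (bit order is an assumption)."""
    return AnsiFlags(
        font_definition=bool(byte & 0x80),
        utf8=bool(byte & 0x40),
        ansimation=bool(byte & 0x20),
        aspect_ratio=(byte >> 3) & 0b11,
        letter_spacing=(byte >> 1) & 0b11,
        blink=bool(byte & 0x01),
    )
```

The raw 2-bit values are returned untouched, since their interpretation is left to SAUCE.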
After the flags we store two unsigned 16-bit integers for the number of columns and rows of the display. Zero is an acceptable value for the rows field, but columns must always have a non-zero value. In the case of ANSimations the rows field should be set, as relative and absolute cursor positioning might otherwise break. In other cases a zero rows value lets us specify an infinite scrollback buffer, that is, a display that isn't limited by a predetermined number of rows.
The unsigned 16 bit integers (u16) used for the columns and row values are in little-endian byte order. This is also true for all subsequent u16s and u32s that follow.
If we have the ANSimation bit set then we read in a u16 for the baud rate. If this is set to zero then limit to whatever rate the stream of bytes allows.
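As a rough sketch of reading the fixed part of the header described so far (magic number, flags, columns, rows, and the optional baud rate), something like the following would do. The function name `parse_fixed_header` and the returned dictionary keys are hypothetical, and the ANSimation bit is assumed to be bit 5 of the flags byte.

```python
import struct

MAGIC = b"ANSi"
# Assumption: reading the flags table from most to least significant bit,
# the ANSimation flag lands on bit 5.
ANSIMATION_BIT = 0x20


def parse_fixed_header(data: bytes) -> dict:
    """Read magic, flags, columns, rows and (if flagged) the baud rate."""
    # "<" selects little-endian with no padding: 4s + B + H + H = 9 bytes.
    magic, flags, columns, rows = struct.unpack_from("<4sBHH", data, 0)
    if magic != MAGIC:
        raise ValueError("missing ANSi magic number")
    if columns == 0:
        raise ValueError("columns must be non-zero")
    offset = struct.calcsize("<4sBHH")
    baud = None
    if flags & ANSIMATION_BIT:
        (baud,) = struct.unpack_from("<H", data, offset)
        offset += 2
    return {"flags": flags, "columns": columns, "rows": rows,
            "baud": baud, "offset": offset}
```

The returned `offset` points at whatever comes next in the header, so a caller can carry on reading from there.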
If the Font definition bit is set then we read a u8 for the width, then a u8 for the height, and then a bitmask for the font. This is assumed to be in the same order as the XBin file format, except now we have to pad the bitmask to accommodate fonts with widths that aren't exactly divisible by 8, for example for a font with the dimensions 6x7:
# LSB MSB
1 0 0 0 0 0 0 0 0
2 0 0 1 1 0 0 0 0
3 0 1 0 0 1 0 0 0
4 0 1 1 1 1 0 0 0
5 0 1 0 0 1 0 0 0
6 0 1 0 0 1 0 0 0
7 0 0 0 0 0 0 0 0
You can work out how many bytes the font data section contains from the width and height of the font and the number of padding bits needed. For a 6x7 font, for example, each row gets two bits of padding, bringing it to exactly 1 byte, so each character takes 7 bytes and the whole font takes 1x7x256=1792 bytes.
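The arithmetic can be sketched as a small helper. This assumes each row of a glyph is padded up to a whole number of bytes and that a font always holds 256 glyphs, as in the example; `font_data_size` is a name I've made up.

```python
def font_data_size(width: int, height: int, glyphs: int = 256) -> int:
    """Size in bytes of the font bitmask for the given glyph dimensions."""
    # Round the width up to the next multiple of 8 bits, i.e. whole bytes.
    bytes_per_row = (width + 7) // 8
    return bytes_per_row * height * glyphs


print(font_data_size(6, 7))  # the 6x7 example: 1 * 7 * 256
```

A font wider than 8 pixels would simply need two or more bytes per row under the same rule.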
Obviously having UTF-8 bit set with the font data doesn't make much sense unless you want to define characters up to U+00FF, and allow fallback with whatever provision the system allows.
While it's possible with this format to turn letter-spacing on with a custom font definition, it only really makes sense in the context of IBM Codepages. In any case if this is to occur then the last column is duplicated for the range of characters it would apply to if this were an IBM Codepage.
Now the metadata for the file format is defined. These are four UTF-8 strings, each preceded by its length (measured in bytes, not characters). A fifth, the font name, follows afterwards.
u8 - Length of Title string
UTF-8 String for Title.
u8 - Length of Author string
UTF-8 String for Author.
u8 - Length of Group string
UTF-8 String for Group.
u16 - Length of Comment string
UTF-8 String for Comment.
If the length of any of these fields is zero, then the field has no value.
Obviously this provides the same function as a SAUCE record, except the encoding is no longer limited to Codepage 437, and you get a slight increase in size for each field, and a reduction in wastage.
Then we have one final field for the Font Name, which works in two ways. If we're not supplying the font bitmask ourselves we can specify the font we want to use, similar to a SAUCE record, e.g. IBM VGA, or if we are supplying our own font we can give it a name.
u8 - Length of Font Name string
UTF-8 String for Font Name.
This means the smallest possible header would be 15 bytes:
ANSi (Literal String - 4 bytes)
ANSiFlags (1 byte)
columns & rows field (4 bytes)
bytes for metadata (5 zeroed bytes)
length for font name (1 zeroed byte)
In most cases I would expect the header to be twice that size.
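To double-check the 15-byte figure, here is a small sketch that assembles the smallest header from the pieces listed above. `minimal_header` is a made-up name and the 80-column default is only for illustration.

```python
import struct


def minimal_header(columns: int = 80, rows: int = 0) -> bytes:
    """Smallest possible header: no flags set, no metadata, no font name."""
    if columns == 0:
        raise ValueError("columns must be non-zero")
    return (b"ANSi"                                  # magic number, 4 bytes
            + struct.pack("<BHH", 0, columns, rows)  # flags + columns + rows, 5 bytes
            + bytes(5)                               # title, author, group (u8 each) + comment (u16)
            + bytes(1))                              # font name length (u8)


print(len(minimal_header()))
```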
Now the header is out of the way, we have a u32 to specify the length in bytes of the data that follows this value. It may be important to know how many bytes are yet to come down the pipe, for instance if we're showing some sort of progress bar or visual guide to how far an ANSimation has progressed. If we're not dealing with data of a known length, for instance when streaming something we're generating in response to some external force, then this value can be zero, but we must send a special terminator sequence, described later.
The bytes that follow are interpreted literally and displayed according to their value (either UTF-8 or not) unless we encounter CR, LF, TAB, Escape, or backslash. CR, LF, and TAB work exactly as they would in normal ANSi Art. If we see a backslash then we print the literal representation of the next character, which means we can show the glyph for CR, LF, TAB, Escape, or backslash. If we see Escape without a backslash first then we know we're going to process an escape sequence. The u8 after the Escape determines how many values follow, then we read that many values as u16s, then we read a terminator. For example:
Escape | Number of Values | Value | Terminator |
---|---|---|---|
0x1b | 0x01 | 0x0002 (u16) | 0x4a (J) |
Would be equivalent to Esc[2J (clear screen).
The motivation for changing these escape sequences to binary equivalents is that it makes the logic for parsing sequences much more straightforward. Using the current system usually means we're creating a vector of indeterminate length, reading bytes and converting each from an ASCII symbol to its numerical equivalent, then multiplying by 10 if we get another digit and adding that to it, etc.
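To show how little work the binary form needs, here is a minimal sketch of reading one escape sequence from a buffer. The name `read_escape` and its return shape are my own invention, and it does no bounds checking beyond what Python does itself.

```python
ESC = 0x1B


def read_escape(data: bytes, pos: int):
    """Read one binary escape sequence starting at data[pos].

    Returns (values, terminator, next_pos).
    """
    if data[pos] != ESC:
        raise ValueError("expected Escape (0x1b)")
    count = data[pos + 1]          # u8: how many u16 values follow
    start = pos + 2
    end = start + 2 * count
    # Each value is a little-endian u16; no digit-by-digit accumulation.
    values = [int.from_bytes(data[i:i + 2], "little")
              for i in range(start, end, 2)]
    terminator = chr(data[end])    # single byte, e.g. "J"
    return values, terminator, end + 1


print(read_escape(bytes([0x1B, 0x01, 0x02, 0x00, 0x4A]), 0))
```

The length of the value list is known up front from the count byte, so nothing has to grow while scanning for a terminator.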
This means that the data isn't pure UTF-8 text, and is not meant to be read as such.
Escape | Number of Values | Terminator |
---|---|---|
0x1b | 0x00 | 0x73 (s) |
Is the same as Esc[s (save cursor position).
In the unlikely case that we need more than 255 values in a sequence, sequences are chained until all values are exhausted.
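Going the other way, writing a sequence is just as short. This is a sketch with a hypothetical `encode_escape` helper. It refuses more than 255 values and does not attempt chaining, since the exact form of a chained sequence isn't pinned down here.

```python
ESC = 0x1B


def encode_escape(terminator: str, *values: int) -> bytes:
    """Build Escape, value count (u8), values (u16 little-endian), terminator."""
    if len(values) > 255:
        # Chaining is left out of this sketch.
        raise ValueError("more than 255 values would need chained sequences")
    # to_bytes raises OverflowError for values outside the u16 range.
    body = b"".join(v.to_bytes(2, "little") for v in values)
    return bytes([ESC, len(values)]) + body + terminator.encode("ascii")


print(encode_escape("J", 2).hex(" "))  # the clear-screen example
print(encode_escape("s").hex(" "))     # the save-cursor example
```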
There are two MS-DOS ANSI escape sequences for Set Mode and Reset Mode that don't neatly map onto this new escaping system, so instead they are redefined with the terminators { and } respectively, e.g.
Escape | Number of Values | Value | Terminator |
---|---|---|---|
0x1b | 0x01 | 0x0007 | 0x7b ({) |
Is equivalent to ESC[=7h, which enables line wrapping and is used in some ansimations to prevent the display from scrolling when the cursor reaches the end of the screen. Using the terminator } would be equivalent to ESC[=7l, which turns line wrapping off.
Finally, if we want to signal that we're going to stop sending bytes that should be interpreted as ANSi Art, then we send:
Escape | Number of Values | Terminator |
---|---|---|
0x1b | 0x00 | 0x7e (~) |
Which would be equivalent to Esc[~, if such an escape sequence existed.