        The Symbolic Programming Language
        =================================
        a guide for humans, from zero

Welcome! This is a complete, from-scratch introduction to Symbolic, a small systems language with no keywords. If you have written code in any language before, you can finish this guide and be productive. If you have never programmed, you can still follow along - every symbol is explained the first time it appears.

It is modeled on the structure of The Rust Programming Language: we start by getting a program running, then build up concepts one at a time, each with runnable examples.

Prefer learning by doing? Work through Learn Symbolic by Building first - seven hands-on projects from a tip calculator to Conway's Game of Life - then come back here for the systematic reference.

How to read the examples. Every block of code in this guide is a real program or fragment. Lines beginning with ::: are comments - the compiler ignores them, and we use them to show expected output.

Table of contents

  1. Getting Started
  2. The Symbol Grammar
  3. Registers & Mutability
  4. Numbers & Operators
  5. Data Flow & Width
  6. Control Flow
  7. Functions
  8. Memory & Segments
  9. Hashes
  10. Structs, Enums & Matching
  11. Generics, Traits & Closures
  12. Ternary Computing
  13. Building & Targets

1. Getting Started

1.1 Install the compiler

The recommended path is install.sh, which builds the Rust seed (symc0), bootstraps the self-hosting compiler (symc), compiles the package manager (sigil), and puts everything on your PATH:

git clone <this-repo> symbolic && cd symbolic
bash install.sh
source ~/.symbolic/env

There is no assembler, linker, or C compiler in Symbolic's pipeline. symc writes finished executables by itself.

To build only the Rust seed for development:

cargo build --release -p symc0
# produces ./target/release/symc0

1.2 Your first program

Create a file hello.sym:

::: hello.sym - my first Symbolic program
((hello, world\n)) > @screen
!!

Compile and run it (the self-hosting symc reads source from stdin and writes a binary to stdout):

symc < hello.sym > hello && chmod +x hello && ./hello
::: hello, world

Using the Rust seed directly (symc0 takes a filename and -o):

./target/release/symc0 --no-run --target x64-linux -o hello hello.sym
./hello
::: hello, world

Let's read those two lines:

((hello, world\n)) > @screen
'-----.-----'      |  '--.--'
    a string       |   the screen
    literal        |
                   '-- "flow this value into that destination"
!!
'-- halt the program (like `exit(0)`)

  • (( ... )) is a string literal. Escapes: \n newline, \t tab, \r carriage return, \0 NUL, \\ backslash, \" quote. To print parentheses, escape them as \(( and \)) so they aren't mistaken for the closing )). (A lone ) not followed by another ) also works unescaped.) These are resolved at compile time into the literal's bytes: no runtime cost.
  • > means flow - send the value on the left to the destination on the right.
  • @screen is the screen segment - standard output.
  • !! halts the program.

That's a whole program. No main, no imports, no boilerplate. Top-level statements are the program.

1.3 Printing a number

@screen prints raw bytes. To print a number as decimal, use the built-in :wrint ("write integer"):

:wrint { [42] } !
::: 42

We'll explain the :name { [args] } ! call shape in Chapter 7; for now, treat :wrint { [x] } ! as "print the number x followed by a newline."


2. The Symbol Grammar

Symbolic has no reserved words. Instead, a handful of rules generate the entire language. Learn these four and the rest is recognition, not memorization.

   +-----------------------------------------------------------------
   |  RULE 1 .  -  negates       flips an operator's meaning
   |  RULE 2 .  +  extends       loosens / broadens an operator
   |  RULE 3 .  ,  reduces       shrinks width, adds delay
   |  RULE 4 . spacing matters   touching = modifier
   |                             spaced   = standalone token
   +-----------------------------------------------------------------
   plus:  doubling / tripling intensifies   ( +  ->  ++  ->  +++ )

Watch one symbol family grow from these rules:

+         add            ::: base
++        multiply       ::: RULE: doubling intensifies "increase"
+++       power          ::: tripling intensifies further
-         subtract       ::: RULE 1: "-" is the decreasing counterpart
--        divide
---       modulo

And the ? family:

?[c]{ ... }      run once if c is true        ::: the conditional
?[c]{ ... }?     loop while c is true          ::: RULE 4: trailing `?` = back-edge
-?{ ... }        else                          ::: RULE 1: `-` negates the `?`
??               pattern match

Two more conventions you'll see everywhere:

  • ::: starts a comment to end of line.
  • Sigils introduce names: $x is a register, :f is a label/function, ::T is a type, #h is a hash cell, @s is a memory segment. The sigil tells you what kind of thing a name is, instantly.

Naming rule. Register, label, and type names use a–z A–Z 0–9 _ and are at most 6 characters long. Short names are a deliberate constraint that keeps code dense and scannable.


3. Registers & Mutability

A register is Symbolic's variable. It holds a 64-bit integer by default.

3.1 Creating a register

You create a register by flowing a value into it with > ~$name:

42 > ~$x          ::: create $x, give it 42
:wrint { [$x] } !  ::: 42

  • $x reads the register.
  • ~$x declares / writes the register. The ~ marks ownership and mutability - it says "I am (re)binding this name."

3.2 Immutable by default

Reading a register with $x never changes it. To change its value, you write to it again with ~:

0 > ~$n           ::: $n = 0
$n + 1 > ~$n      ::: $n = 1   (read $n, add 1, write back)
$n + 1 > ~$n      ::: $n = 2
:wrint { [$n] } !  ::: 2

This read-compute-write pattern is the bread and butter of Symbolic. Because the write target is explicit (~$n), data flow is always visible: values move left-to-right into named slots.

3.3 Ownership and moves

The ~ sigil is also Symbolic's ownership marker (the same idea as Rust's ownership). A plain $x is a use; ~$x is a binding. For the scalar registers in this guide the distinction is simply "read vs. write," but for larger values it governs moves - a value flowed into a new owner is moved, not copied, and the compiler tracks initialization so you cannot read a register before it has been written.

5 > ~$a
$a > ~$b          ::: $b now owns the value; for scalars this is a copy
:wrint { [$b] } !  ::: 5

4. Numbers & Operators

Everything in this chapter returns a value you can flow somewhere.

4.1 Integer literals

42            ::: decimal
0xFF          ::: hexadecimal -> 255
0b1010        ::: binary      -> 10
(A)           ::: a character literal -> its byte value, 65

4.2 Arithmetic

3 + 4         ::: 7     add
3 ++ 4        ::: 12    multiply
3 +++ 4       ::: 81    power  (3 to the 4th)
10 - 3        ::: 7     subtract
10 -- 3       ::: 3     divide (integer)
10 --- 3      ::: 1     modulo (remainder)

A worked example:

3 ++ 4 > ~$area     ::: 12
$area + 1 > ~$area  ::: 13
:wrint { [$area] } !  ::: 13

4.3 Comparisons

Comparisons evaluate to 1 (true) or 0 (false), so you can store or print them directly. Read them through the grammar: = is the comparison base, + extends it upward (greater), - negates it downward (less).

5 == 5        ::: 1   equal
5 != 3        ::: 1   not equal
5 =+ 3        ::: 1   greater-or-equal     (= extended by +)
5 =++ 3       ::: 1   greater-than         (extended twice)
5 --= 3       ::: 0   less-than            (- negates toward "less")
5 -= 3        ::: 0   less-or-equal

7 =++ 4 > ~$big
:wrint { [$big] } !   ::: 1

4.4 Bitwise & shifts

0xFF & 0x0F   ::: 15    bitwise AND
12 &+ 3       ::: 15    bitwise OR     (& extended by +)
12 -&+ 10     ::: 6     bitwise XOR    (- negates, & , extended)
-& 0          ::: -1    bitwise NOT    (prefix; flips all bits)

1 -< 4        ::: 16    shift left     (-< )
256 <+ 2      ::: 64    shift right    (<+ )
1 --< 1       ::: rotate left
1 <++ 1       ::: rotate right
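
The two rotate rows list no result, so here is a rough model of what a rotate does. This is Python, not Symbolic, and it assumes the rotates act on the full 64-bit default register width, which this guide does not state outright. The names rotl64, rotr64 and to_signed are hypothetical helpers, not Symbolic built-ins.

```python
MASK64 = (1 << 64) - 1  # assumption: rotates act on a 64-bit register

def rotl64(value, count):
    """Rotate left: bits pushed off the top re-enter at the bottom."""
    value &= MASK64
    count %= 64
    return ((value << count) | (value >> (64 - count))) & MASK64

def rotr64(value, count):
    """Rotate right: bits pushed off the bottom re-enter at the top."""
    value &= MASK64
    count %= 64
    return ((value >> count) | (value << (64 - count))) & MASK64

def to_signed(value):
    """Reinterpret an unsigned 64-bit pattern as a two's-complement integer."""
    return value - (1 << 64) if value & (1 << 63) else value

print(rotl64(1, 1))             # 2
print(hex(rotr64(1, 1)))        # 0x8000000000000000
print(to_signed(rotr64(1, 1)))  # -9223372036854775808
```

Unlike a shift, a rotate loses no bits: rotating right and then left by the same count gives back the original value.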

4.5 Precedence

From lowest to highest binding power:

comparisons        ==  !=  --=  -=  =+  =++
bitwise            &   &+  -&+
shifts / rotates   -<  <+  --<  <++
add / subtract     +   -
multiply/div/mod   ++  --  ---
power              +++          (right-associative)

So 2 + 3 ++ 4 is 2 + (3 ++ 4) = 2 + 12 = 14. When in doubt, compute a sub-result into a register first - it reads clearly and never surprises you.
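
To see the table at work, here is a small precedence-climbing evaluator written in Python. It is a model of the table above, not the symc parser: it only handles integer literals and spaced binary operators, it leaves out the rotates, and it uses Python's floor division for --, which matches integer division only for non-negative operands. The function name evaluate is hypothetical.

```python
# Binding power per operator, taken from the precedence table (higher binds tighter).
LEVELS = [
    ["==", "!=", "--=", "-=", "=+", "=++"],  # comparisons (lowest)
    ["&", "&+", "-&+"],                       # bitwise
    ["-<", "<+"],                             # shifts (rotates omitted in this sketch)
    ["+", "-"],                               # add / subtract
    ["++", "--", "---"],                      # multiply / divide / modulo
    ["+++"],                                  # power (highest)
]
BINDING = {op: level for level, ops in enumerate(LEVELS, start=1) for op in ops}
RIGHT_ASSOC = {"+++"}

APPLY = {
    "==": lambda a, b: int(a == b), "!=": lambda a, b: int(a != b),
    "--=": lambda a, b: int(a < b), "-=": lambda a, b: int(a <= b),
    "=+": lambda a, b: int(a >= b), "=++": lambda a, b: int(a > b),
    "&": lambda a, b: a & b, "&+": lambda a, b: a | b, "-&+": lambda a, b: a ^ b,
    "-<": lambda a, b: a << b, "<+": lambda a, b: a >> b,
    "+": lambda a, b: a + b, "-": lambda a, b: a - b,
    "++": lambda a, b: a * b, "--": lambda a, b: a // b, "---": lambda a, b: a % b,
    "+++": lambda a, b: a ** b,
}

def evaluate(source):
    """Evaluate a space-separated expression such as '2 + 3 ++ 4'."""
    tokens = source.split()
    pos = 0

    def parse(min_bp):
        nonlocal pos
        left = int(tokens[pos], 0)   # base 0 accepts 42, 0xFF and 0b1010
        pos += 1
        while pos < len(tokens) and BINDING[tokens[pos]] >= min_bp:
            op = tokens[pos]
            pos += 1
            bp = BINDING[op]
            # Right-associative operators let the same level recurse on the right.
            right = parse(bp if op in RIGHT_ASSOC else bp + 1)
            left = APPLY[op](left, right)
        return left

    return parse(1)

print(evaluate("2 + 3 ++ 4"))     # 14
print(evaluate("2 +++ 3 +++ 2"))  # 512, because power groups as 2 +++ (3 +++ 2)
print(evaluate("1 -< 2 + 1"))     # 8, because add binds tighter than shift
```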


5. Data Flow & Width

5.1 The flow operator >

> is how values move. Its left side is a value; its right side is a destination - a register (~$x), a hash cell (~#h), a memory segment (@screen), or a control target.

99 > ~$x          ::: into a register
$x > @screen      ::: into the screen (prints the raw byte 99 = 'c')

5.2 Returning and breaking with the > family

The > symbol composes with ! and ? into the control verbs:

>!?      return the value to the caller of the current function
>?       conditional flow / jump
!!>      break out of the innermost loop

These are introduced properly in Chapters 6 and 7.

5.3 Bit widths with the comma

By default a register and a flow are 64-bit. Touching commas reduce the width (RULE 3 - ", reduces"):

$a,        ::: $a as a 32-bit register
$a,,       ::: 16-bit
$a,,,      ::: 8-bit
>,         ::: a 32-bit flow

A spaced comma is a different token entirely - a sequence separator - because spacing changes meaning (RULE 4). You will rarely need widths until you do systems-level work; they're here when you need them.


6. Control Flow

6.1 The conditional ?[ ... ]{ ... }

? opens a condition in [ ]; the body goes in { }. The body runs once if the condition is non-zero (true):

5 > ~$x
?[$x =++ 3]{           ::: if x > 3
    ((big\n)) > @screen
}
::: big

6.2 Else with -?

-? is "else" - literally the ? negated by -:

?[$x =++ 10]{ ((big\n)) > @screen }
-?{ ((small\n)) > @screen }
::: small

You can chain -?{ ?[...]{...} -?{...} } to get else-if ladders.

6.3 Loops ?[ ... ]{ ... }?

Add a trailing ? to the closing brace and the conditional becomes a loop: the condition is re-checked on every pass, and the body repeats while it is true.

0 > ~$i
?[$i --= 5]{           ::: while i < 5
    :wrint { [$i] } !
    $i + 1 > ~$i
}?
::: 0 1 2 3 4

Counters work exactly as written: a register updated across the loop's back-edge keeps one stable location.

6.4 Breaking out with !!>

!!> jumps to the exit of the innermost loop:

0 > ~$i
?[1 == 1]{             ::: loop "forever"...
    ?[$i =+ 5]{ !!> }  ::: ...until i >= 5, then break
    :wrint { [$i] } !
    $i + 1 > ~$i
}?
:wrint { [999] } !      ::: 0 1 2 3 4 999

6.5 Halting and panicking

!!      halt the program normally (exit 0)
!!!     panic (abort)

6.6 Pattern matching ??

?? matches a value against patterns; each arm is pattern > { body } and _ is the catch-all (see Chapter 10):

?? $n {
    0 > { ((zero\n))  > @screen }
    1 > { ((one\n))   > @screen }
    _ > { ((other\n)) > @screen }   ::: wildcard arm
}

7. Functions

7.1 Declaring a function

A function is a label :name, a parameter block { ... }, a body, and a return:

:add { [i64:$a] & [i64:$b] }
    $a + $b >!?

  • :add is the function's name.
  • { [i64:$a] & [i64:$b] } is the parameter list: each parameter is [type:$name], separated by &.
  • $a + $b >!? computes the sum and returns it with >!?.

Why the type annotation? Writing [i64:$a] (rather than just [$a]) is what tells the compiler "this is a declaration of a parameter," as opposed to [expr] which is an argument at a call site. The type also documents the parameter. i64 is the 64-bit integer used throughout this guide.

7.2 Calling a function

A call is :name { [arg] & [arg] ... } !. The trailing ! means "invoke." Capture the result with > ~$dest:

:add { [20] & [22] } ! > ~$sum
:wrint { [$sum] } !   ::: 42

A function with no parameters is called :name { } !.

7.3 Many arguments

Functions take any number of arguments - there is no fixed limit. The first six travel in registers and the rest on the stack, automatically:

:sum8 { [i64:$a]&[i64:$b]&[i64:$c]&[i64:$d]&[i64:$e]&[i64:$f]&[i64:$g]&[i64:$h] }
    $a + $b + $c + $d + $e + $f + $g + $h >!?
:sum8 { [1]&[2]&[3]&[4]&[5]&[6]&[7]&[8] } ! > ~$r
:wrint { [$r] } !   ::: 36

7.4 Early return

>!? can appear anywhere, including inside a conditional, to return early:

:max { [i64:$a] & [i64:$b] }
    ?[$a =+ $b]{ $a >!? }   ::: if a >= b, return a
    $b >!?
:max { [3] & [9] } ! > ~$m
:wrint { [$m] } !   ::: 9

7.5 Recursion

Functions may call themselves. Here is factorial:

:fac { [i64:$n] }
    ?[$n --= 2]{ 1 >!? }    ::: base case: n < 2 -> 1
    $n - 1 > ~$m
    :fac { [$m] } ! > ~$r
    $n ++ $r >!?            ::: n * fac(n-1)
:fac { [5] } ! > ~$f
:wrint { [$f] } !   ::: 120

Tip - program shape. Define all your functions first, then write the top-level statements that drive them. Each function body ends at its top-level >!? return.

7.6 The built-in functions

These behave like functions and are always available; they are how a program talks to the outside world:

Built-in                          Meaning
:wrint { [n] } !                  print integer n (decimal + newline)
:wrch { [b] } !                   write one byte b to the screen
:wrbuf { [ptr] & [len] } !        write len bytes starting at ptr
:rdall { } ! -> ptr               read all of stdin; returns a pointer to [len:i64][bytes...]
:alloc { [n] } ! -> ptr           allocate n bytes on the heap
:ld8 / :ld64 { [addr] } ! -> v    load a byte / 8 bytes from memory
:st8 / :st64 { [addr] & [v] } !   store a byte / 8 bytes to memory

We use these next.


8. Memory & Segments

8.1 Segments

Memory is addressed through named segments, all written @name:

Segment                   Purpose
@screen                   display output (stdout)
@inp                      keyboard / standard input
@mem                      general RAM (@mem[addr])
@stck                     the stack
@sec                      secure memory
@net @rng @time @sys      network, randomness, clock, system vectors

((hi\n)) > @screen        ::: write to the display

8.2 The heap: allocate, store, load

Use :alloc to get memory and the :ld*/:st* built-ins to use it:

:alloc { [64] } ! > ~$p        ::: 64 bytes, $p points at them
123456789 > ~$v
:st64 { [$p] & [$v] } !         ::: store 8 bytes at $p
:ld64 { [$p] } ! > ~$r          ::: read them back
:wrint { [$r] } !                ::: 123456789

65 > ~$c
:st8 { [$p + 8] & [$c] } !       ::: store the byte 65 ('A') at $p+8
:ld8 { [$p + 8] } ! > ~$b
:wrch { [$b] } !                  ::: A
:wrch { [10] } !                  ::: newline

Addresses are ordinary integers, so [$p + 8 + $i] indexes naturally.

8.3 Reading standard input

:rdall slurps all of stdin into a heap buffer laid out as [length: 8 bytes][raw bytes...]. This echo reads input and writes it back:

:rdall { } ! > ~$buf
:ld64 { [$buf] } ! > ~$len      ::: first 8 bytes = the length
0 > ~$i
?[$i --= $len]{
    :ld8 { [$buf + 8 + $i] } ! > ~$ch
    :wrch { [$ch] } !
    $i + 1 > ~$i
}?
!!

Compiled to ./echo (as in §1.2), it round-trips its input:

echo -n "round trip" | ./echo
::: round trip

This is exactly the input path the self-hosting compiler uses to read your source code.
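
The [length: 8 bytes][raw bytes...] layout can be pictured with a short Python model. This is only a sketch of the buffer format described above: it assumes the length is stored little-endian (natural on x64, but not stated in this guide), and pack_rdall and unpack_rdall are hypothetical names.

```python
import struct

def pack_rdall(data: bytes) -> bytes:
    """Build a buffer shaped like :rdall's result: an 8-byte length, then the bytes."""
    return struct.pack("<q", len(data)) + data   # "<q" = little-endian signed 64-bit

def unpack_rdall(buf: bytes) -> bytes:
    """Read the length from the first 8 bytes, then slice out the payload after it."""
    (length,) = struct.unpack_from("<q", buf, 0)
    return buf[8:8 + length]

buf = pack_rdall(b"round trip")
print(len(buf))            # 18: 8 bytes of length + 10 bytes of text
print(unpack_rdall(buf))   # b'round trip'
```

This mirrors the echo program: the length is read once from offset 0, and every byte is then found at offset 8 + i.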


9. Hashes

Symbolic has first-class hash cells: named, program-global storage organized into three tiers by lifetime, written with one, two, or three #.

Form       Tier         Meaning
#name      ephemeral    a mutable global cell
##name     persistent   survives across runs (where the platform supports it)
###name    ROM          a constant baked into the binary

9.1 Declaring and using cells

A declaration is #name <constant>. After that, read it as #name and write to it with > ~#name:

###MAX 1000          ::: a ROM constant
#count 0             ::: a mutable counter

###MAX > ~$m
:wrint { [$m] } !     ::: 1000

5 > ~#count          ::: write the cell
#count + 1 > ~#count
:wrint { [#count] } ! ::: 6

Cells are how a long-running program (like the compiler) keeps global state - buffers, counters, tables.

9.2 Content-addressed hashes

The spec's most distinctive feature: #[:fn & key] computes a slot by applying a hash function :fn to a key, then reads or writes that slot. It is a built-in hash map with collision handling, expressed in the grammar:

:h { [i64:$k] } $k --- 64 >!?   ::: our hash function: key mod 64

111 > ~#[:h & 7]      ::: store 111 at slot hash(7)
222 > ~#[:h & 71]     ::: 71 mod 64 == 7 -> collides; resolved by probing
#[:h & 7] > ~$a
:wrint { [$a] } !      ::: 111
#[:h & 71] > ~$b
:wrint { [$b] } !      ::: 222

The runtime keeps an open-addressed table; colliding keys probe to the next free slot, so distinct keys never clobber each other.
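
If open addressing is new to you, this Python sketch shows the idea. It is a model under assumptions, not the Symbolic runtime: it assumes linear probing one slot at a time, a fixed table of 64 slots, and that each slot stores the key next to the value so a probe can tell "my key" from "someone else's". ProbedTable is a hypothetical name.

```python
class ProbedTable:
    """A toy open-addressed table with linear probing (assumed strategy)."""

    def __init__(self, hash_fn, capacity=64):
        self.hash_fn = hash_fn
        self.slots = [None] * capacity   # each slot is None or a (key, value) pair

    def _find(self, key):
        """Return the slot index holding `key`, or the first free slot on its probe path."""
        start = self.hash_fn(key) % len(self.slots)
        for step in range(len(self.slots)):
            index = (start + step) % len(self.slots)
            if self.slots[index] is None or self.slots[index][0] == key:
                return index
        raise MemoryError("table is full")

    def store(self, key, value):
        self.slots[self._find(key)] = (key, value)

    def load(self, key):
        entry = self.slots[self._find(key)]
        return None if entry is None else entry[1]

table = ProbedTable(lambda k: k % 64)   # same hash as :h above: key mod 64
table.store(7, 111)
table.store(71, 222)                    # 71 mod 64 == 7, so it probes on to slot 8
print(table.load(7), table.load(71))    # 111 222
```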


10. Structs, Enums & Matching

10.1 Types

A type is named with ::Name. A struct groups fields; each field is written type:[name], separated by &:

::Plyr { i32:[hp] & i32:[mp] & f64:[spd] }

Create one with a value list, read fields with ., and write a field with > ~$p.field:

::Plyr { [100] & [50] & [1.5] } > ~$p
$p.hp > @screen      ::: 100   (writes the raw byte; use :wrint to see decimal)
75 > ~$p.hp          ::: fields are mutable
$p.hp > @screen      ::: 75
$p.mp > @screen      ::: 50

10.2 Enums and pattern matching

The ?? operator matches a value against patterns. Each arm is pattern > { body }, and _ is the wildcard that matches anything else, so a match that ends with a _ arm is always exhaustive:

3 > ~$n
?? $n {
    0 > { :wrint { [100] } ! }
    1 > { :wrint { [101] } ! }
    _ > { :wrint { [999] } ! }   ::: $n is 3 -> 999
}

Enums are types with named variants (::Shape { .Circle { i64:[r] } & ... }), selected with .Variant; you match their variants the same way.


11. Generics, Traits & Closures

11.1 Generics with $T$

A generic type parameter is written $T$ (a name between two dollar signs). It lets one function work for any type:

:id $T$ { [$T$ $x] }
    $x >!?
:id { [42] } ! > ~$r
:wrint { [$r] } !   ::: 42

Here $T$ after the function name introduces the type parameter, and [$T$ $x] declares a parameter $x of that generic type. A single compiled body serves every instantiation.

11.2 Traits & impls with ^.^

^.^ introduces an implementation block - methods attached to a type, optionally satisfying a trait. Each method is a label with a braced body, and several methods are scoped together inside one { }:

^.^ ::Plyr {
    :heal { [i32:$h] } $p.hp + $h >!? }
    :dbl  { } $p.hp ++ 2 >!? }
}

Both inherent impls (^.^ ::Type { ... }) and trait impls (^.^ ::Type ::Trait { ... }, naming the trait after the type) use this same shape - the ^.^ marker makes an implementation block unambiguous.

Method dispatch is static - each call resolves to a concrete function at compile time and lowers to a direct call (no vtable, no runtime lookup). This is what makes the abstractions below zero-cost: they compile to exactly the code you'd write by hand.

11.3 Iterators (zero-cost)

An iterator is just a struct holding its state, advanced by ^.^ methods. Since the state lives in the struct (no per-step heap allocation) and the methods are statically dispatched (direct calls), the loop compiles to a tight increment-compare-call with no overhead - see examples/iterator.sym:

::Rng { i64:[cur] & i64:[end] }
^.^ ::Rng {
    :more { $self.cur --= $self.end >!? }                                      ::: more items?
    :take { $self.cur > ~$v  $self.cur + 1 > ~$nc  $nc > ~$self.cur  $v >!? }  ::: yield + advance
}
::Rng { [0] & [5] } > ~$it
?[1 == 1]{ $it :more ! > ~$m  ?[$m == 0]{ !!> }  $it :take ! > ~$x  :wrint { [$x] } ! }?   ::: 0 1 2 3 4

(The split more/take keeps it allocation-free. A single next returning the Option enum is also valid, but Option is heap-allocated, so it is not zero-cost in a hot loop.)

11.4 Closures with >{ }

A closure is an anonymous function written >{ ... } that captures registers from its surrounding scope by value. Capture-only closures take no arguments:

3 > ~$a
4 > ~$b
>{ $a + $b } > ~$add   ::: captures $a and $b
$add ! > ~$r            ::: call with `!`
:wrint { [$r] } !        ::: 7

To take an argument, write the parameter as type:$name > before the body:

10 > ~$base
>{ i32:$x > $x + $base } > ~$f   ::: parameter $x, capturing $base
$f ! { [5] } > ~$s                ::: call with an argument list
:wrint { [$s] } !                  ::: 15

The closure's body is the value of its last expression; captured registers are frozen at the point the closure is created.
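
"Frozen at the point the closure is created" is worth a picture. The Python sketch below is an analogy, not Symbolic: make_adder (a hypothetical helper) takes a snapshot of its argument, which is how capture by value behaves, while the plain lambda shows Python's own default of looking the name up at call time.

```python
def make_adder(base):
    # `base` is copied into this call's scope, so later changes to the
    # caller's variable cannot reach the returned function.
    return lambda x: x + base

base = 10
by_value = make_adder(base)    # snapshot taken here, like creating the closure
late = lambda x: x + base      # Python's default: `base` is looked up when called
base = 99                      # rebinding after creation

print(by_value(5))   # 15: still sees the captured 10
print(late(5))       # 104: sees the new value
```

Under the rule described above, a Symbolic closure behaves like by_value: later writes to a captured register do not change what the closure sees.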


12. Ternary Computing

Symbolic is unusual in offering base-3 types alongside binary ones. This is useful for logic with a third state and for balanced-ternary arithmetic.

12.1 trit - three-valued logic

A trit is a Kleene three-state truth value: False, True, Unknown. The built-ins :tand, :tor, :tnot, :teq implement Kleene logic, where Unknown propagates sensibly (e.g. True OR Unknown = True, but False OR Unknown = Unknown).
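
A compact way to remember Kleene logic is to order the values False < Unknown < True: AND is then the minimum and OR the maximum. The Python model below borrows the built-in names for readability, but the integer encoding is an assumption of this sketch, not the representation Symbolic uses.

```python
FALSE, UNKNOWN, TRUE = 0, 1, 2   # assumed encoding, ordered False < Unknown < True

def tand(a, b):
    return min(a, b)    # one False is enough to decide an AND

def tor(a, b):
    return max(a, b)    # one True is enough to decide an OR

def tnot(a):
    return 2 - a        # swaps False and True, leaves Unknown alone

def teq(a, b):
    if UNKNOWN in (a, b):
        return UNKNOWN  # cannot compare against something unknown
    return TRUE if a == b else FALSE

print(tor(TRUE, UNKNOWN) == TRUE)        # True
print(tor(FALSE, UNKNOWN) == UNKNOWN)    # True
print(tand(FALSE, UNKNOWN) == FALSE)     # True
```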

12.2 trool - four-valued logic

A trool adds a distinct Undefined state that always propagates: anything combined with Undefined is Undefined. This models hardware "don't care / X" signals.
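
The difference from Unknown is easiest to see side by side: True OR Unknown is still True, but True OR Undefined is Undefined. Here is a minimal Python model of that absorbing behaviour. The encoding and the names lift, trool_and, trool_or and trool_not are assumptions of this sketch, not Symbolic built-ins.

```python
FALSE, UNKNOWN, TRUE, UNDEFINED = 0, 1, 2, 3   # assumed encoding

def lift(op):
    """Wrap a three-valued operation so that Undefined always wins."""
    def wrapped(*args):
        if UNDEFINED in args:
            return UNDEFINED
        return op(*args)
    return wrapped

trool_and = lift(min)                # Kleene AND on the three ordinary values
trool_or = lift(max)                 # Kleene OR on the three ordinary values
trool_not = lift(lambda a: 2 - a)    # Kleene NOT on the three ordinary values

print(trool_or(TRUE, UNKNOWN) == TRUE)          # True: Unknown can be outvoted
print(trool_or(TRUE, UNDEFINED) == UNDEFINED)   # True: Undefined cannot
```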

12.3 Balanced ternary integers

The tadd/tsub built-ins do single-digit balanced ternary arithmetic (digits 0, +1, -1), and the wider ti*/tu* types extend this to multi-digit integers, with carries handled by the runtime. Balanced ternary represents negative numbers without a separate sign bit - a genuinely different way to count.
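
To make "no separate sign bit" concrete, this Python sketch converts integers to balanced ternary digits and adds single digits with a carry. It illustrates the number system only; the function names are hypothetical and say nothing about how the Symbolic built-ins are called or implemented.

```python
def to_balanced_ternary(n):
    """Digits of n in balanced ternary, least significant first, each in {-1, 0, +1}."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        r = n % 3            # Python's % yields 0, 1 or 2 here, even for negative n
        if r == 2:
            r = -1           # write 2 as 3 - 1: digit -1, and carry into the next place
        digits.append(r)
        n = (n - r) // 3
    return digits

def from_balanced_ternary(digits):
    """Inverse of to_balanced_ternary: sum of digit * 3**position."""
    return sum(d * 3 ** i for i, d in enumerate(digits))

def add_digits(a, b, carry_in=0):
    """Add two balanced-ternary digits; return (digit, carry_out)."""
    total = a + b + carry_in
    if total > 1:
        return total - 3, 1
    if total < -1:
        return total + 3, -1
    return total, 0

print(to_balanced_ternary(5))    # [-1, -1, 1]   i.e. 9 - 3 - 1
print(to_balanced_ternary(-5))   # [1, 1, -1]    every digit flipped
print(add_digits(1, 1))          # (-1, 1)       1 + 1 = 3 - 1
```

Negating a number just flips every digit, which is why no sign bit is needed.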


13. Building & Targets

13.1 Compiler flags

The self-hosting symc reads source from stdin and writes a binary to stdout. The --target flag selects the output format (default x64-linux):

symc [--target T] [-O0..3] [--no-std] [--heap MiB] < prog.sym > out
symc --target wasm32 < prog.sym > prog.wasm

Flag                           Default      Meaning
--target T                     x64-linux    output format (see §13.2)
-O0..-O3                       -O2          optimization level (fold/DCE/copy-prop at -O1, algebraic identities at -O2, strength reduction at -O3)
--no-std                       off          don't prepend the std prelude; used when the source already bundles std (the toolchain itself does)
--heap <max> or <min>:<max>    0:4096       the produced binary's bump heap, MiB, Java -Xms:-Xmx style (see below)

--heap <max> (or <min>:<max>) sets the bump-allocator heap baked into the output binary, in MiB — like Java's -Xms/-Xmx:

  • max (default 4 GiB) is the heap's hard ceiling — a lazily-mapped BSS reservation on Unix, large enough that the compiler self-compiles to every target (incl. wasm32) without exhausting it. --heap 64 sets just the max.
  • min (default 0) is pre-faulted resident at startup (-Xms): the first min MiB are touched so they're committed immediately rather than lazily on first use. --heap 256:512 → 256 MiB resident up front, 512 MiB ceiling. If min > max it's clamped to max.
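
The two rules above (a bare number sets only the max; min is clamped to max) can be written down as a few lines of Python. This is a sketch of the documented behaviour, not symc's actual argument parser, and parse_heap is a hypothetical name.

```python
DEFAULT_MAX_MIB = 4096   # the documented default ceiling, 4 GiB

def parse_heap(arg=None):
    """Return (min_mib, max_mib) for a --heap value such as "64" or "256:512"."""
    if arg is None:
        return (0, DEFAULT_MAX_MIB)          # documented default 0:4096
    if ":" in arg:
        min_text, max_text = arg.split(":", 1)
        min_mib, max_mib = int(min_text), int(max_text)
    else:
        min_mib, max_mib = 0, int(arg)       # a bare number sets just the max
    return (min(min_mib, max_mib), max_mib)  # min > max is clamped to max

print(parse_heap())           # (0, 4096)
print(parse_heap("64"))       # (0, 64)
print(parse_heap("256:512"))  # (256, 512)
print(parse_heap("512:256"))  # (256, 256)
```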

It is honored on every hosted platform, each using that platform's natural allocation:

Platform                         --heap          How the heap is provided
Linux / macOS / FreeBSD          yes             BSS region, lazily mapped by the kernel (a large ceiling costs nothing until touched)
arm64 / riscv64 / loongarch64    yes             same lazily-mapped BSS
Windows                          yes (≤ 1 GiB)   VirtualAlloc(NULL, min(heap,1 GiB), MEM_RESERVE|MEM_COMMIT) at startup; committed up front, so capped at 1 GiB to avoid over-committing
UEFI / bare-metal                no (fixed)      fixed 16 MiB (no OS to grow it)
wasm32                           n/a             WebAssembly linear memory (64 KiB pages, growable), sized by its own page model

Because the Unix heap is a lazily-mapped BSS, a 4 GiB default is free until touched, so the allocator never runs off the end on any realistic program. Lower it for embedded / valgrind & callgrind builds (--heap 64) — a multi-GiB bump heap can't be mapped under Valgrind, which is why tests/iai.sh compiles its workloads with symc --heap 64. Raise it for programs that allocate even more.

A subtlety worth knowing: a binary's own heap size is fixed by the compiler that builds it (the emitter reads the running compiler's #heapb when it writes the output's BSS), not by --heap passed to that same binary later. So to grow the compiler's own heap you rebuild it through one self-host stage with the new size; --heap on a normal symc invocation sizes the program symc is currently producing.

The Rust seed symc0 takes a filename and uses -o:

symc0 [--no-run] [--target T] [-o OUT] FILE.sym
symc0 --dump-tokens FILE.sym     # print the token stream
symc0 --dump-ast    FILE.sym     # print the parsed item count
symc0 --dump-ir     FILE.sym     # print IR statistics
symc0 --emit-asm    FILE.sym     # print assembly text instead of a binary

13.2 Targets

All thirteen targets are selected with --target. Every output is a raw, self-contained binary — no assembler or linker is involved.

--target         Output format                            Verified by
x64-linux        static x86-64 Linux ELF                  native execution + fixpoint
x64-macos        x86-64 Mach-O                            format check
x64-windows      x86-64 PE/COFF (.exe)                    format check
x64-uefi         x86-64 PE UEFI application               format check + qemu OVMF
x64-freebsd      x86-64 FreeBSD ELF (OSABI 9)             format check
arm64-linux      AArch64 Linux ELF                        qemu-aarch64
arm64-macos      AArch64 Mach-O                           format check
ios-arm64        AArch64 Mach-O (LC_BUILD_VERSION iOS)    format check
android-arm64    AArch64 Linux ELF + PT_NOTE              format check
riscv64-linux    RISC-V 64 Linux ELF                      qemu-riscv64
loongarch64      LoongArch64 Linux ELF                    qemu-loongarch64
wasm32           WebAssembly binary (WASI)                browser WASI sandbox
spirv            SPIR-V binary                            validator

13.3 The self-hosting compiler

The compiler is the five runes std/ lex/ parse/ ir/ back/, assembled by build-symc.sh and compiled to symc. To reproduce the bootstrap fixpoint:

bash install.sh   # builds symc0 -> symc.lcc -> symc.s1 -> s2 -> s3, asserts s2==s3

The assembled single-file source lives in sigil/runes/symc/src/main.sym. See docs/SELFHOSTING.md for a guided tour.

13.4 Testing & benchmarking

Differential fuzzer (tests/fuzz.sh). The strongest correctness check: it generates random integer programs (tests/fuzzgen.py) in both Symbolic and C, compiles the Symbolic at every optimization level and the C with cc, runs all five, and requires byte-exact agreement. Any divergence is a real bug:

tests/fuzz.sh [symc] [count] [start-seed]
tests/fuzz.sh dist/x64-linux/symc 300 1     # 300 random programs, seeds 1..300

How the outputs disagree points at the guilty layer:

  • the four -O0..-O3 outputs disagreeing → an optimizer / regalloc / codegen bug;
  • all agreeing but != the C oracle → a front-end (parse/lower) or codegen bug.

The C oracle uses uint64_t (defined wraparound), so it's sound. Failing seeds are printed and reproducible: python3 tests/fuzzgen.py --seed <n> --sym p.sym --c p.c regenerates the exact program.
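
The triage rule for a failing seed is simple enough to state as code. The Python sketch below only restates the two cases described in this section; it is not taken from tests/fuzz.sh, and classify is a hypothetical name.

```python
def classify(opt_outputs, oracle_output):
    """Triage one fuzz case.

    opt_outputs   -- the program's output at -O0, -O1, -O2 and -O3 (bytes each)
    oracle_output -- the output of the C version of the same program
    """
    if len(set(opt_outputs)) > 1:
        # The optimization levels disagree with each other.
        return "optimizer / regalloc / codegen bug"
    if opt_outputs[0] != oracle_output:
        # All levels agree, but on the wrong answer.
        return "front-end (parse/lower) or codegen bug"
    return "ok"

print(classify([b"42\n"] * 4, b"42\n"))                           # ok
print(classify([b"42\n", b"42\n", b"41\n", b"42\n"], b"42\n"))    # optimizer / regalloc / codegen bug
print(classify([b"41\n"] * 4, b"42\n"))                           # front-end (parse/lower) or codegen bug
```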

A related check, tests/validate-rcx.sh, guards changes to the backend: it re-runs the 3-stage self-host fixpoint (s2 == s3) and the kernel scoreboard, failing on any miscompile.

Benchmark suite (the bench rune + sigil bench). Benchmarking is native — std provides a monotonic clock (:nowns) and decimal print (:bint); the bench rune (sigil/runes/bench/) is a Criterion-style harness on top:

sigil bench       # times the compile, then samples each benchmark

Each benchmark takes 10 samples at varying iteration counts and reports [min median max] per-op plus a least-squares slope (ns/iter, cancels fixed overhead), classifies outliers with Tukey's fences, auto-scales units (ns/us/ms), and compares the median to a saved criterion-baseline to flag Improved/Regressed/No change. It also prints a perf stat-style cache/memory summary (via perf_event_open; shows n/a when a CI container restricts it).
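
Two of those statistics are easy to reproduce by hand. The Python sketch below shows a least-squares slope (why fixed overhead cancels) and Tukey's fences with the usual 1.5 × IQR multiplier. It is an illustration, not the bench rune's code: the quartile method, the multiplier, and both function names are assumptions.

```python
import statistics

def slope_ns_per_iter(iterations, total_ns):
    """Least-squares slope of total time against iteration count.

    A constant per-sample overhead shifts every point up equally, so it
    changes the intercept but not the slope.
    """
    mean_x = statistics.fmean(iterations)
    mean_y = statistics.fmean(total_ns)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(iterations, total_ns))
    den = sum((x - mean_x) ** 2 for x in iterations)
    return num / den

def tukey_outliers(samples, k=1.5):
    """Samples outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if s < low or s > high]

# 100 ns of fixed overhead plus 5 ns per iteration.
iters = [10, 20, 30, 40]
times = [100 + 5 * n for n in iters]
print(slope_ns_per_iter(iters, times))   # 5.0

print(tukey_outliers([10, 11, 12, 13, 14, 15, 16, 17, 18, 100]))   # [100]
```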

Cross-language comparisons live alongside it:

  • sigil/runes/bench/vs.sh — the same 64-bit LCG+xorshift loop in every available language (C, Rust, Java, Node, Python, …), each self-timing its loop (startup/JIT excluded), normalized to ns/iter vs C. Symbolic's from-scratch backend lands ~1.18× gcc -O2. The printed 64-bit result must match across languages — a built-in correctness check, so a "win" is never a miscompile.
  • sigil/runes/bench/kernels.sh — a multi-kernel scoreboard vs cc -O3 -march=native, each kernel's integer result checked exactly.
  • sigil/runes/bench/ilp.sh — an instruction-level-parallelism companion to vs.sh (independent chains rather than one latency-bound recurrence).

Appendix A: Complete Symbol Index

NAMES & SIGILS
  $name      register (variable)            ~        ownership / mutability / write
  :name      label / function               ::Name   type
  :::        comment                         #name    hash cell (ephemeral)
  ##name     hash cell (persistent)          ###name  hash cell (ROM constant)
  @name      memory segment                  $T$      generic type parameter
  '          lifetime annotation

LITERALS
  42         decimal int    0xFF hex    0b1010 binary
  3.14       float          (A)  char   ((text\n)) string

ARITHMETIC                COMPARISON               BITWISE / SHIFT
  +   add                  ==  equal                &    and
  ++  multiply             !=  not equal            &+   or
  +++ power                =+  >=                   -&+  xor
  -   subtract             =++ >                    -&   not (prefix)
  --  divide               --= <                    -<   shift left
  --- modulo               -=  <=                   <+   shift right
                                                    --<  rotate left
                                                    <++  rotate right

FLOW & CONTROL
  >          flow value -> destination        >!?    return from function
  >?         conditional flow / jump          >,     32-bit flow (touching comma)
  ?[c]{}     run once if c                    ?[c]{}? loop while c
  -?{}       else                             ??     pattern match
  !          call                             !!     halt
  !!>        break loop                       !!!    panic
  -?!  ?!    conditional calls                -!     dam (guard)

ENCAPSULATION
  [ ]  grouping / argument & parameter lists       { }  blocks / scopes
  &    argument separator / bitwise-and             .   field access
  ,    width & temporal reduction

CONTENT-ADDRESSED
  #[:fn & key]   hash `key` with `:fn`, address the resulting slot

IMPL / CLOSURE
  ^.^   implementation block        >{ ... }   closure (captures by value)

Appendix B: Grammar

A simplified EBNF of the core language (the subset the self-hosting compiler accepts; the reference compiler accepts a superset including types, traits, generics, closures, and ternary forms):

program     := item* ;
item        := fn_decl | cell_decl | stmt ;

fn_decl     := LABEL generic? param_block? stmt* '>!?' ;
generic     := '$' NAME '$' ;
param_block := '{' ( '[' type? REG ']' ('&' '[' type? REG ']')* )? '}' ;
type        := NAME ':' | '$' NAME '$' ;

cell_decl   := ('#'|'##'|'###') NAME INT ;

stmt        := if | loop | break | halt | expr_stmt ;
if          := '?' '[' expr ']' '{' stmt* '}' ( '-?' '{' stmt* '}' )? ;
loop        := '?' '[' expr ']' '{' stmt* '}?' ;
break       := '!!>' ;
halt        := '!!' ;
expr_stmt   := expr ( '>!?' | '>' ('~')? (REG | CELL) )? ;

expr        := primary ( binop expr )* ;
primary     := INT | REG | CELL | call | '[' expr ']'
             | '-' primary | '-&' primary ;
call        := LABEL '{' ( '[' expr ']' ('&' '[' expr ']')* )? '}' '!' ;

binop       := '+'|'++'|'+++'|'-'|'--'|'---'
             | '=='|'!='|'--='|'-='|'=+'|'=++'
             | '&'|'&+'|'-&+'|'-<'|'<+'|'--<'|'<++' ;

        You now know Symbolic. Go read examples/features/ -
        there is one runnable program per feature - then open
        sigil/runes/symc/src/main.sym and watch the language describe itself.

Sources & inspiration: structure modeled on The Rust Programming Language and its table of contents; authoritative symbol semantics are defined in spec.md.