Author’s Note: This article is for my in-progress language Esta. I’m about 5 months new to Rust and 3 months new to programming language design so I’d appreciate any insights on things I can do better as a comment on the associated Reddit post or as an Issue/PR on the repo. Thanks and I hope you enjoy! :)

Background

We live in a great era for language design. Within the last 5-10 years, several innovative languages have come out and won over the hearts of many developers with a newfound focus on memory safety (Rust), runtime interoperability (JVM: Kotlin, V8: Typescript, BEAM: Elixir), first class concurrency (Go, Pony), dependent types (Idris), Language oriented Programming (Racket) and many more inspired features. In this spirit, I have decided to throw my hat into the ring as well and create my own language for fun.

So you want to build a Programming Language

To be honest, for a while, compilers and langauge design seemed like the “final frontier” of computer science to me. This is understandable if you ever pick up a textbook only to be inundated by the formal expressions of type coercion or or the implementation of an LALR(1) parser. I have actually had some background translating C back and from x86 so the idea of assembly was not too intimidating but I’d assume it would have definitely been high on the list if I had attempted this project a few years earlier.

The best way to fight this complexity is to dive straight in and try to build a simple project. There are a myriad of good tutorials online but the one I finally committed to was Crafting Interpreters. Bob works on the Dart Language and has a fantastic way of balancing abstract concepts with concrete implementations.

This tutorial is writen in Java but can be easily imitated in pretty much any other sane language so I decided to do mine in Rust. It’s really a testament to Bob’s skills as a teacher to make this process not only highly enjoyable, but very educational and inspiring. My langauge (that I will talk about later in the article) takes a lot of ideas from the Lox language he implements.

I won’t have time to link all of these below but I’d like to take a shoutout to some great languages implemented in Rust that gave me a lot of good design practices to try to build off of: Gleam, Gluon, Iridium.

Building an AST in Rust

Rust has turned out to be both a fantastic and a terrible language to write a compiler in. I’ll go ahead and focus on the bad stuff first. Writing a language will almost always require you to parse your tokens into some kind of Abstract Syntax Tree (AST). If you want to learn more about AST, there are a million great sources but for now, I can link you to a page from Crafting Interpreters for a more in depth explanation. Typically you’d implement a treelike data structure in a language like C in the following way:

typedef struct Node {
    int data;
    struct Node* left;
    struct Node* right;
} Node;

int main()
{
    Node n;
    Node l;
    n.left = &l;
}

The same code would translate to rust like:

#[derive(Clone, Default, Debug)]
struct Node {
    data: isize,
    left: Box<Node>,
    right: Box<Node>,
}

fn main() {
    let mut n: Node = Default::default();
    let l: Node = Default::default();
    n.left = Box::new(l);
}

But when we run the code:

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
timeout: the monitored command dumped core

Why is this? Since Rust has no Null types, left and right must always also be initialized in our code. This means that when we try to create a node, the rust compiler will also generate code to generate the children nodes. Since these chilren nodes try to do the same thing, this is recursion with no base case, which blows the stack.

If this kind of memory management seems interesting to you, check out Learning Rust with entirely too many linked lists. The simple way to fix this error is to make left and right Option<Box<Node>>. Option is an enum that has two variants, Some<T> and None. By defaulting to None, the code will stop recursing once the first struct has been allocated.

#[derive(Clone, Default, Debug)]
struct Node {
    data: isize,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

fn main() {
    let mut n: Node = Default::default();
    let mut l: Node = Default::default();
    n.left = Some(Box::new(l));
    println!("{:?}", n);
}
    Finished dev [unoptimized + debuginfo] target(s) in 0.53s
    Running `target/debug/playground
    Node { data: 0, left: Some(Node { data: 0, left: None, right: None }), right: None }

We have several problems with this approach, but the biggest is that nodes in our AST aren’t homogenous. Some of them have one child, some have none and some don’t even have a fixed amount. One strategy that is pretty popular in the Rust community is to make an enum of Tuple Structs. There are many examples, but one concrete one you can take a look at is from Nick Fitsgerald here.

In my language (Esta), the AST looks like:

#[derive(Debug, Clone)]
pub enum Stmt {
    Block(Vec<Box<Stmt>>, bool),
    If(Box<Expr>, Box<Stmt>, Box<Stmt>),
    While(Box<Expr>, Box<Stmt>),
    Return(Option<Box<Expr>>),
    Declaration(Identifier),
    FunDecl(Identifier, Vec<Identifier>, Box<Stmt>),
    Assignment(Box<Expr>, Box<Expr>),
    Struct(String, Vec<Identifier>),
}

#[derive(Debug, Clone)]
pub enum Expr {
    Id(Identifier),
    Literal(Literal),
    List(Vec<Box<Expr>>),
    BinaryOp(Box<Expr>, Opcode, Box<Expr>),
    UnaryOp(Opcode, Box<Expr>),
    FunCall(String, Vec<Expr>),
}

Nice things about writing an AST with enum’s of tuple structs

  1. This is concise. By using tuple structs, you can skip a lot of boilerplate code for initializing and formally definining each of these structs. When your AST has 20+ different types, this can quickly become overwhelming and less of a pain to refactor and extend.

  2. Rust has fantastic support for enums storing other types. Unlike in C or C++, where enums only map a tag to an integer, Rust enums can store data. This unification between a type of the container and the contents of the container is extremely useful when doing iteration.

  3. Enums are practically made for Rust’s pattern matching. If you don’t know, pattern matching in Rust is extremely powerful and can be used to quickly dispatch generic data based on its type. I’ve included an example of this in my code below for effect.

fn dispatch_expr(ctx: &mut AsmCtx, e: &Expr, l_value: bool) -> DispatchRet {
    match e {
        Expr::Id(id) => make_identifier(ctx, id, l_value),
        Expr::Literal(lit) => make_literal(ctx, &lit),
        Expr::BinaryOp(lhs, op, rhs) => make_binary(ctx, &lhs, &op, &rhs),
        Expr::UnaryOp(op, rhs) => make_unary(ctx, &op, rhs),
        Expr::FunCall(id, args) => make_funcall(ctx, &id, &args),
    }
}

Pain points of writing an AST with enum’s of tuple structs

  1. Initializing enums with tuple structs and boxes is a bit wordy. Since every variant must be namespaced, a simple enum initialization may turn into something like:
    let inst = MetaAsm::Inst(MetaInst::new_label(ByteCode::JUMPZ, cont_lbl.clone()));
    
  2. As far as I can tell, Rust has no way for a function to accept a specific variant of an enum. A lot of my functions are meant to process a specific type of an expression, e.g. a function that only takes literal expressions. To get around with this, I have had to add a number of if let Expr::Literal(lit) = e expressions to my functions which is a little annoying, because I’d like to somehow enforce that this function can only be called on enums of a certain variant. I do accept that this is a safer method because it forces me to accept all possible inputs, but that dosen’t mean it makes me happy to use it. Here’s an example of what I have to do vs. what I’d like to do.
enum Measure {
    Inches(i32),
    Centimeters(i32),
}

// Will compile
fn print_cm(b: Measure) {
    if let Measure::Centimeters(b) = b {
        println!("{}", b);
    }
}

// Won't compile
/*
fn print_cm(b: Measure::Centimeters) {
    println!("{}", b);
}
*/

fn main() {
    let b = Measure::Centimeters(3);
    print_cm(b);
}
  1. My last issue is that tuple structs have no field names. While this certainly makes them easier to instantiate, it leads to inconsistent and misleading naming whenever I have to unpack them into discrete values. I worry that as my project grows, this will lead to inconsitent variable naming, which will make it harder for me to use common interfaces and designs.

Traversing over the AST

Once we have built our AST, we want to be able to do something with it. Typically, this happens in one of three ways. A pre-order traversal looks at the parent node first, and then at the children. This is useful when you want to implement some kind of top-down algorithm which derives information from the parent and propogates it down to the children. An in-order traversal visits the left subtree first, then examines the parent, and then finally the right subtree. I have not found a use-case for this type of traversal, but it is possible that it might make sense where an operator has an order of operations. The last traversal is a post-order traversal in which the children are examined first, and then the parents are examined only after. This is useful when the parent node needs information from it’s children because you can percolate it back up the tree after the nodes return from their traversals.

A lot of languages try to implement this in two ways. Procedural/Object Oriented languages like Java often employ a software design called the Visitor Pattern. The idea with a visitor pattern is that there is some common traversal order and iterator shared between all traversals, and then each specific traversal also implements other functions which can do certain things. This is the design pattern that Crafting Interpreters uses. I believe that this is also the pattern that the Rust Compiler uses, although the project is so large I think it is misleading to say it totally relies on the visitor pattern. The first few iterations of my traversal very closely followed a visitor pattern I found on Github which implements the Visitor Pattern on a enum of tuple structs.

// Source:
// https://github.com/rust-unofficial/patterns/blob/master/patterns/visitor.md
mod visit {
    use ast::*;

    pub trait Visitor<T> {
        fn visit_name(&mut self, n: &Name) -> T;
        fn visit_stmt(&mut self, s: &Stmt) -> T;
        fn visit_expr(&mut self, e: &Expr) -> T;
    }
}

struct Interpreter;
impl Visitor<i64> for Interpreter {
    fn visit_name(&mut self, n: &Name) -> i64 { panic!() }
    fn visit_stmt(&mut self, s: &Stmt) -> i64 {
        match *s {
            Stmt::Expr(ref e) => self.visit_expr(e),
            Stmt::Let(..) => unimplemented!(),
        }
    }

    fn visit_expr(&mut self, e: &Expr) -> i64 {
        match *e {
            Expr::IntLit(n) => n,
            Expr::Add(ref lhs, ref rhs) => self.visit_expr(lhs) + self.visit_expr(rhs),
            Expr::Sub(ref lhs, ref rhs) => self.visit_expr(lhs) - self.visit_expr(rhs),
        }
    }
}

In my experience, this is a very good pattern for “top-down” iteration but suffers for algorithms where data must flow “bottom-up”. One example of this was when I wanted to check typing on binary operators so that expressions like 1 + 1 would show the + operator that there are two numbers below, but operations like 1 + 'a' would show the + operator that the types below didn’t match. The issue with this is that in most cannonical visitor pattern implementations, all information is stored in the self state. This makes it extremely awkward to keep adding and removing information from the self state as levels are passed.

This is where we get into our second type of traversal, common in the functional programming world. I’m not sure what this is formally called but I have been calling it a Fold and that seems to be somewhat consitent with what I read on forums. The idea is that similar to the traversal, each node returning its own data, and then some kind of reduce function that takes many instances of returned data and turns it into one instance of data to continue going up. The best code I found regarding this was on a Reddit post in r/haskell. Op eventually comes up with a very cool folding traversal that shows off just how functional Rust can be if you know what you are doing: Link.

My Fold Trait

I opted to work with something in the middle that can be used for both “bottom-up” and “top-down” algorithms. This is seen in the associated types, DownT and UpT. You’ll see later an example of what these types define to, but if they are not used, they can also be defined to the empty tuple (), which is a zero-space data type in Rust. As it might sound, DownT is propogated to the child functions as an argument, and is useful for “top-down” algorithms, where as UpT is returned from child functions and therefore is useful for “bottom-up” algorithms. The way this design works is to perform what I like to call a “Dispatch” where a single Stmt or Expr is decoded into it’s variant and sent to a specialized function to process. Once this specialized function is done, it returns it’s value, which propogates back up the call tree to it’s parent expression. My main interest now is if Self::DownT should be mutable or not, but I haven’t run into an issue with that as of yet.

pub trait Fold {
    type UpT; // This value is passed down before the traversal
    type DownT; // This value is passed up after the traversal

    fn reduce(children: Vec<Option<Self::UpT>>) -> Option<Self::UpT> {
        None
    }

    fn fold_block(down: &Self::DownT, body: &Vec<Box<Stmt>>, is_scope: &bool) -> Option<Self::UpT> {
        let children = body.iter().map(|b| Self::fold_stmt(down, b)).collect();
        Self::reduce(children)
    }

    // Other fold functions are similar, but skipped for brevity

    fn fold_stmt(down: &Self::DownT, s: &Stmt) -> Option<Self::UpT> {
        match s {
            Stmt::Block(body, is_scope) => Self::fold_block(down, body, is_scope),
            Stmt::If(test, body, alter) => Self::fold_if(down, test, body, alter),
            Stmt::While(test, body) => Self::fold_while(down, test, body),
            Stmt::Return(value) => Self::fold_return(down, value),
            Stmt::Declaration(id) => Self::fold_declaration(down, id),
            Stmt::FunDecl(id, params, body) => Self::fold_fundecl(down, id, params, body),
            Stmt::Assignment(lhs, rhs) => Self::fold_assignment(down, lhs, rhs),
            Stmt::Struct(id, fields) => Self::fold_struct(down, id, fields),
        }
    }

    fn fold_expr(down: &Self::DownT, e: &Expr) -> Option<Self::UpT> {
        match e {
            Expr::Id(id) => Self::fold_id(down, id),
            Expr::Literal(lit) => Self::fold_literal(down, lit),
            Expr::BinaryOp(lhs, op, rhs) => Self::fold_binary(down, lhs, op, rhs),
            Expr::UnaryOp(op, rhs) => Self::fold_unary(down, op, rhs),
            Expr::FunCall(id, args) => Self::fold_funcall(down, id, args),
            Expr::List(xs) => Self::fold_list(down, xs),
            Expr::Dot(this, action) => Self::fold_dot(down, this, action),
        }
    }

And as a last thing, here is it in action being used to discover all unique structs in a program. This uses a “bottom-up” approach, so as you can see DownT is defined as the empty tuple, but UpT is a Vector of Struct Definitions. The custom reduce function is used to combine several vectors returned by child nodes into a single vector and then continue to propogate this up. I only got this working last night, so it is a very much work in progress, but after about 4 different designs, this one seems like the most solid by far.

impl TypeCollector {
    pub fn collect_types(body: &Stmt) -> Option<Vec<EstaStruct>> {
        if let Some(s) = TypeCollector::fold_stmt(&(), body) {
            let s = s
                .into_iter()
                .enumerate()
                .map(move |(i, s)| EstaStruct { tag: i, ..s })
                .collect();
            Some(s)
        } else {
            None
        }
    }
}

#[derive(Debug, Clone, Default)]
pub struct EstaStruct {
    pub tag: usize,                     // Unique identifier
    pub size: usize,                    // Size of the entire struct
    pub fields: HashMap<String, usize>, // Offset of each field.
}

impl Fold for TypeCollector {
    type UpT = Vec<EstaStruct>;
    type DownT = ();

    fn reduce(children: Vec<Option<Self::UpT>>) -> Option<Self::UpT> {
        if children.len() > 0 {
            let mut new_children = Vec::new();
            // TODO: Can make this more functional with iter() and collect()
            for child in children {
                if let Some(child) = child {
                    new_children.extend(child)
                }
            }
            Some(new_children)
        } else {
            None
        }
    }

    fn fold_struct(_: &Self::DownT, id: &String, fields: &Vec<Identifier>) -> Option<Self::UpT> {
        Some(vec![EstaStruct::new(&Stmt::Struct(
            id.clone(),
            fields.clone(),
        ))])
    }

Conclusion

Thanks for sticking around to the end! I hope to add several more parts as I continue to add features and flesh out my Virtual Machine as well as add compound data types, closures and heap allocation. Like I said above, I’m about 5 months new to Rust and 3 months new to programming language design so I’d appreciate any insights on things I can do better as a comment on the associated Reddit post or as an Issue/PR on the repo.