Architecture

This section explains how Vex source moves through the compiler and where the major subsystems live in the repository.

Pipeline Overview

text
source (.vx)
  -> lexer
  -> parser / syntax tree
  -> HIR lowering
  -> type inference + borrow analysis + semantic checks
  -> codegen path selection
    -> LLVM/native path
    -> SIR path for data-parallel graphs
  -> link / run / test integration

The key split is between two codegen paths:

  • a native LLVM-oriented path for ordinary systems code
  • a SIR path for tensor-, SIMD-, and graph-oriented lowering
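
As a rough illustration of the "codegen path selection" step, the sketch below routes a function to one of the two paths. The `FnSummary` type, its fields, and the selection rule are assumptions made for this example; the source does not describe how the compiler actually decides.

```rust
/// The two codegen paths named in the pipeline overview.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodegenPath {
    Native,
    Sir,
}

/// Hypothetical summary of a checked function; not a real Vex type.
pub struct FnSummary {
    pub uses_tensor_ops: bool,
    pub uses_simd_ops: bool,
}

/// Assumed rule: anything that touches tensor or SIMD operators goes to SIR,
/// everything else takes the native LLVM path.
pub fn select_path(f: &FnSummary) -> CodegenPath {
    if f.uses_tensor_ops || f.uses_simd_ops {
        CodegenPath::Sir
    } else {
        CodegenPath::Native
    }
}
```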

Major Crates

text
crates/
  vex-lexer         tokenization
  vex-parser        syntax parsing and recovery
  vex-syntax        syntax node definitions
  vex-hir           semantic IR, inference, borrow checking
  vex-sir           Silicon IR graphs and backend lowering
  vex-diagnostics   diagnostics formatting and reporting
  vex-compiler      main codegen pipeline and prelude/runtime integration

Front End

vex-lexer

Responsible for turning source text into tokens.

vex-parser

Builds the syntax tree and handles recovery from malformed input well enough to support diagnostics and tooling.

vex-hir

This is where most language semantics become concrete:

  • name resolution
  • type inference
  • pattern handling
  • borrow and move analysis
  • enum/result/option semantics

Native Codegen Path

The native path in vex-compiler lowers checked HIR into LLVM IR and from there into object code and linked executables.

This path is used for ordinary systems programming, CLI tools, servers, runtime code, and general application code.

SIR Path

vex-sir handles data-parallel graph lowering for tensor- and array-oriented computation. It is the basis for:

  • SIMD-friendly expression lowering
  • graph fusion
  • compute backends such as SPIR-V, WGSL, and Metal
  • graph/runtime dispatch decisions for heterogeneous execution

If you are working on vectorized operators, tensors, masks, or GPU-facing computation, this is the path to understand.

Runtime and Tooling

Outside the language crates, the repository also includes:

  • runtime support under lib/runtime/
  • CLI tooling under tools/vex-cli/
  • editor/LSP tooling under tools/vex-lsp/
  • formatter and supporting developer tools under tools/

Where to Start Reading

  • Start in Guide if you want language semantics.
  • Start in CLI Reference if you want execution and command behavior.
  • Start in GPU & SIR and SIMD if you care about the graph path.

Architecture Pages

Lexer (vex-lexer)

The lexer uses Logos for high-performance tokenization.

Token Types

rust
#[derive(Logos, Debug, Clone, PartialEq)]
pub enum Token {
    // Keywords
    #[token("fn")] Fn,
    #[token("let")] Let,
    #[token("let!")] LetMut,
    #[token("if")] If,
    #[token("else")] Else,
    // ...

    // Literals
    #[regex(r"[0-9]+", parse_int)]
    IntLiteral(i64),

    #[regex(r#""[^"]*""#, parse_string)]
    StringLiteral(String),

    // Identifiers
    #[regex(r"[a-zA-Z_][a-zA-Z0-9_]*")]
    Identifier,
}

Automatic Semicolon Insertion

rust
fn insert_semicolons(tokens: Vec<Token>) -> Vec<Token> {
    // Insert semicolons at newlines when:
    // - Previous token can end a statement
    // - Next token can start a statement
    // - Not inside brackets/parens
}
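
The three rules in the comments above could be implemented roughly as follows. This is a sketch over a simplified, hypothetical `Tok` enum (the real `Token` enum is larger and has no `Newline` or `Semi` variant in the excerpt above), and the `can_end_stmt` and `can_start_stmt` sets are assumptions chosen for illustration.

```rust
/// Simplified token set for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    Ident(String),
    Int(i64),
    Let,
    Assign,
    Plus,
    LParen,
    RParen,
    Newline,
    Semi,
}

// Assumed set of tokens that may end a statement.
fn can_end_stmt(t: &Tok) -> bool {
    matches!(t, Tok::Ident(_) | Tok::Int(_) | Tok::RParen)
}

// Assumed set of tokens that may start a statement.
fn can_start_stmt(t: &Tok) -> bool {
    matches!(t, Tok::Ident(_) | Tok::Int(_) | Tok::Let | Tok::LParen)
}

/// Replaces a newline with `Semi` when all three rules hold; drops it otherwise.
pub fn insert_semicolons(tokens: Vec<Tok>) -> Vec<Tok> {
    let mut out: Vec<Tok> = Vec::with_capacity(tokens.len());
    let mut depth: usize = 0; // current paren nesting
    let mut iter = tokens.into_iter().peekable();
    while let Some(tok) = iter.next() {
        match tok {
            Tok::LParen => {
                depth += 1;
                out.push(Tok::LParen);
            }
            Tok::RParen => {
                depth = depth.saturating_sub(1);
                out.push(Tok::RParen);
            }
            Tok::Newline => {
                let prev_ends = out.last().map_or(false, can_end_stmt);
                // Collapse consecutive newlines to find the next real token.
                while iter.peek() == Some(&Tok::Newline) {
                    iter.next();
                }
                // End of input is treated as an acceptable statement boundary.
                let next_starts = iter.peek().map_or(true, can_start_stmt);
                if depth == 0 && prev_ends && next_starts {
                    out.push(Tok::Semi);
                }
            }
            other => out.push(other),
        }
    }
    out
}
```

With these rules, a newline after `x +` is dropped because `+` cannot end a statement, so the expression continues on the next line.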

Parser (vex-parser)

Built on Rowan for lossless syntax trees with error recovery.

Syntax Kinds

rust
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub enum SyntaxKind {
    // Expressions
    BinaryExpr,
    UnaryExpr,
    CallExpr,
    IndexExpr,
    FieldExpr,

    // Statements
    LetStmt,
    ExprStmt,
    ReturnStmt,

    // Items
    FnDef,
    StructDef,
    EnumDef,
    NODE_CONTRACT_IMPL,
    NODE_CONTRACT,

    // Types
    PathType,
    RefType,
    ArrayType,
    FnType,

    // Patterns
    IdentPat,
    TuplePat,
    StructPat,

    // Tokens
    Ident,
    IntLit,
    StringLit,
    // ...
}

Error Recovery

rust
fn parse_block(&mut self) -> Block {
    self.expect(T!['{']);

    let mut stmts = vec![];
    while !self.at(T!['}']) && !self.at_end() {
        match self.parse_stmt() {
            Ok(stmt) => stmts.push(stmt),
            Err(e) => {
                self.error(e);
                self.recover_to(&[T!['}'], T![;]]);
            }
        }
    }

    self.expect(T!['}']);
    Block { stmts }
}
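
One detail the loop above depends on is that `recover_to` always makes progress, otherwise a bad token would be reported forever. The sketch below shows one way to guarantee that over a flat token slice. The `Tok` enum, the `Ident ;` statement grammar, and the choice to consume `;` but stop before `}` are assumptions for this example, not the behavior of the real parser.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok {
    Ident,
    Semi,
    RBrace,
    Garbage,
}

pub struct Parser<'a> {
    tokens: &'a [Tok],
    pos: usize,
    pub errors: Vec<String>,
}

impl<'a> Parser<'a> {
    pub fn new(tokens: &'a [Tok]) -> Self {
        Parser { tokens, pos: 0, errors: Vec::new() }
    }

    fn peek(&self) -> Option<Tok> {
        self.tokens.get(self.pos).copied()
    }

    /// Skips tokens until a sync token is next. A `;` is consumed so the
    /// caller advances; a `}` is left in place for the block to close on.
    fn recover_to(&mut self, sync: &[Tok]) {
        while let Some(t) = self.peek() {
            if sync.contains(&t) {
                if t == Tok::Semi {
                    self.pos += 1;
                }
                return;
            }
            self.pos += 1;
        }
    }

    /// Statement grammar for this sketch: `Ident ;`
    fn parse_stmt(&mut self) -> Result<(), String> {
        match self.peek() {
            Some(Tok::Ident) => {
                self.pos += 1;
                if self.peek() == Some(Tok::Semi) {
                    self.pos += 1;
                    Ok(())
                } else {
                    Err(format!("expected `;` at token {}", self.pos))
                }
            }
            other => Err(format!("unexpected {:?} at token {}", other, self.pos)),
        }
    }

    /// Parses statements up to `}` and returns how many parsed cleanly.
    pub fn parse_block_body(&mut self) -> usize {
        let mut ok = 0;
        while let Some(t) = self.peek() {
            if t == Tok::RBrace {
                break;
            }
            match self.parse_stmt() {
                Ok(()) => ok += 1,
                Err(e) => {
                    self.errors.push(e);
                    self.recover_to(&[Tok::RBrace, Tok::Semi]);
                }
            }
        }
        ok
    }
}
```

A malformed statement in the middle of a block then costs one diagnostic, and the statements after it still parse.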

HIR (vex-hir)

High-level IR with semantic information, powered by Salsa for incremental computation.

Type System

rust
pub enum Ty {
    // Primitives
    Int(IntTy),    // i8, i16, i32, i64, i128
    Uint(UintTy),  // u8, u16, u32, u64, u128
    Float(FloatTy), // f16, f32, f64
    Bool,
    Char,
    Str,
    Never,
    Unit,

    // Compound
    Tuple(Vec<Ty>),
    Array(Box<Ty>, usize),
    Slice(Box<Ty>),
    Ref(Box<Ty>, Mutability, Lifetime),
    Ptr(Box<Ty>, Mutability),

    // User-defined
    Adt(AdtId, Substs),
    Fn(FnSig),
    Closure(ClosureId, Substs),

    // Inference
    Infer(InferTy),
    Error,
}

Borrow Checker

The borrow checker implements Polonius-style Non-Lexical Lifetimes (NLL).

rust
pub struct BorrowChecker {
    cfg: ControlFlowGraph,
    facts: AllFacts,
    regions: RegionInference,
}

impl BorrowChecker {
    pub fn check(&mut self) -> Vec<BorrowError> {
        // Phase 1: Build control flow graph
        self.build_cfg();

        // Phase 2: Compute liveness
        self.compute_liveness();

        // Phase 3: Compute regions
        self.compute_regions();

        // Phase 4: Check borrows
        self.check_borrows()
    }
}

Four-Phase Analysis

  1. Immutability Check: Verify let vs let! usage
  2. Move Analysis: Track ownership transfers
  3. Borrow Analysis: Verify borrowing rules
  4. Lifetime Analysis: NLL region inference
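
To make the move-analysis phase concrete, here is a minimal sketch that flags a use after a move in straight-line code. The `Stmt` enum is hypothetical and far simpler than real HIR, and the sketch ignores control flow, which the actual checker handles through its control flow graph.

```rust
use std::collections::HashMap;

/// Straight-line statements for illustration; not Vex HIR.
pub enum Stmt {
    Let(&'static str),  // declare and initialize a variable
    Move(&'static str), // pass the variable by value
    Use(&'static str),  // read or borrow the variable
}

/// Returns (statement index, variable) for every use or move of a
/// variable that has already been moved.
pub fn find_use_after_move(stmts: &[Stmt]) -> Vec<(usize, &'static str)> {
    // Maps each declared variable to `true` once it has been moved.
    let mut moved: HashMap<&'static str, bool> = HashMap::new();
    let mut errors = Vec::new();
    for (i, stmt) in stmts.iter().enumerate() {
        match stmt {
            Stmt::Let(name) => {
                moved.insert(*name, false);
            }
            Stmt::Move(name) => {
                if moved.get(name).copied().unwrap_or(false) {
                    errors.push((i, *name));
                }
                moved.insert(*name, true);
            }
            Stmt::Use(name) => {
                if moved.get(name).copied().unwrap_or(false) {
                    errors.push((i, *name));
                }
            }
        }
    }
    errors
}
```

Feeding it `Let("data")`, `Move("data")`, `Use("data")` reports the third statement, the same situation as a "borrow of moved value" diagnostic.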

VUMM (Vex Unified Memory Model)

Automatic memory strategy selection.

rust
pub enum BoxKind {
    Unique,     // Single owner
    SharedRc,   // Reference counted (single-thread)
    AtomicArc,  // Atomic ref counted (multi-thread)
    Unknown,    // Not yet determined
}

pub struct VummAnalysis {
    kinds: HashMap<ExprId, BoxKind>,
}

impl VummAnalysis {
    pub fn analyze(&mut self, hir: &Hir) {
        // Phase 1: Escape analysis
        self.escape_analysis();

        // Phase 2: Clone analysis
        self.clone_analysis();

        // Phase 3: Thread analysis
        self.thread_analysis();

        // Phase 4: Kind decision
        self.decide_kinds();

        // Phase 5: Elision optimization
        self.optimize_elisions();
    }
}
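
The kind decision in phase 4 might combine the earlier phases' results along these lines. The `AllocFacts` type and the decision table are assumptions for illustration; the source does not spell out the real rules, and escape analysis (which presumably feeds the elision phase) is left out of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BoxKind {
    Unique,
    SharedRc,
    AtomicArc,
    Unknown,
}

/// Hypothetical per-allocation facts gathered by the earlier phases.
pub struct AllocFacts {
    pub is_cloned: bool,      // clone analysis: more than one owner observed
    pub crosses_thread: bool, // thread analysis: value reaches another thread
}

/// Assumed decision rule: a single owner stays `Unique`; shared ownership
/// is reference counted, atomically only when threads are involved.
pub fn decide_kind(f: &AllocFacts) -> BoxKind {
    match (f.is_cloned, f.crosses_thread) {
        (false, _) => BoxKind::Unique,
        (true, false) => BoxKind::SharedRc,
        (true, true) => BoxKind::AtomicArc,
    }
}
```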

SIR (Silicon IR)

Intermediate representation for data-parallel compute, targeting GPU backends as well as CPU SIMD and scalar fallbacks.

Node Types

rust
pub enum SirNode {
    // Data
    Constant(Value),
    Parameter(usize),
    Tensor(Shape, DType),

    // Arithmetic
    Add(NodeId, NodeId),
    Sub(NodeId, NodeId),
    Mul(NodeId, NodeId),
    Div(NodeId, NodeId),

    // Tensor ops
    MatMul(NodeId, NodeId),
    Transpose(NodeId),
    Reshape(NodeId, Shape),

    // Control flow
    If(NodeId, NodeId, NodeId),
    Loop(NodeId, NodeId),

    // Parallel
    ParallelFor(Range, NodeId),
    Reduce(NodeId, ReduceOp),
    Scan(NodeId, ScanOp),
}

Backends

rust
pub trait Backend {
    fn compile(&self, graph: &SirGraph) -> CompiledKernel;
    fn execute(&self, kernel: &CompiledKernel, inputs: &[Tensor]) -> Vec<Tensor>;
}

pub struct SpirVBackend;   // Vulkan
pub struct WgslBackend;    // WebGPU
pub struct MetalBackend;   // Metal (Apple GPUs)
pub struct ScalarBackend;  // CPU fallback
pub struct SimdBackend;    // CPU SIMD
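
To show what a CPU fallback has to do at minimum, the sketch below evaluates a tiny graph node by node. It uses a cut-down, hypothetical `Node` enum over `f64` scalars rather than the real `SirNode` and `Tensor` types, and it assumes `NodeId` is an index and that operands always have lower ids than their users.

```rust
/// Assumed representation: a node id is an index into the graph slice.
pub type NodeId = usize;

/// Cut-down node set for illustration.
pub enum Node {
    Constant(f64),
    Parameter(usize),
    Add(NodeId, NodeId),
    Mul(NodeId, NodeId),
}

/// Evaluates every node in order and returns all node values.
/// Relies on the graph being stored in topological order.
pub fn eval_scalar(graph: &[Node], inputs: &[f64]) -> Vec<f64> {
    let mut values: Vec<f64> = Vec::with_capacity(graph.len());
    for node in graph {
        let v = match node {
            Node::Constant(c) => *c,
            Node::Parameter(i) => inputs[*i],
            Node::Add(a, b) => values[*a] + values[*b],
            Node::Mul(a, b) => values[*a] * values[*b],
        };
        values.push(v);
    }
    values
}
```

For the graph `p0 * p1 + 2.0` with inputs `[3.0, 4.0]`, the last node evaluates to `14.0`.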

Automatic Differentiation

rust
pub struct Autograd {
    forward_graph: SirGraph,
    backward_graph: SirGraph,
    adjoints: HashMap<NodeId, NodeId>,
}

impl Autograd {
    pub fn backward(&mut self, output: NodeId) -> SirGraph {
        // Reverse-mode automatic differentiation.
        // Seed the output adjoint: d(output)/d(output) = 1.
        let seed = self.constant(1.0);
        self.adjoints.insert(output, seed);

        for node in self.forward_graph.reverse_postorder() {
            let adjoint = self.adjoints[&node];

            match &self.forward_graph[node] {
                // d(a + b)/da = 1 and d(a + b)/db = 1
                SirNode::Add(a, b) => {
                    self.accumulate_adjoint(*a, adjoint);
                    self.accumulate_adjoint(*b, adjoint);
                },
                // d(a * b)/da = b and d(a * b)/db = a
                SirNode::Mul(a, b) => {
                    self.accumulate_adjoint(*a, self.mul(adjoint, *b));
                    self.accumulate_adjoint(*b, self.mul(adjoint, *a));
                },
                // ... other ops
            }
        }

        self.backward_graph.clone()
    }
}
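
The `Add` and `Mul` rules above can be checked numerically on a small example. The following sketch works on plain `f64` values instead of building a backward graph, uses a hypothetical `Node` enum, and assumes operands have lower ids than their users so that a reverse index scan visits users before operands.

```rust
pub type NodeId = usize;

pub enum Node {
    Input(f64),
    Add(NodeId, NodeId),
    Mul(NodeId, NodeId),
}

/// Returns d(output)/d(node) for every node in the graph.
pub fn gradients(graph: &[Node], output: NodeId) -> Vec<f64> {
    // Forward pass: compute the value of each node.
    let mut val = vec![0.0; graph.len()];
    for (i, n) in graph.iter().enumerate() {
        val[i] = match n {
            Node::Input(x) => *x,
            Node::Add(a, b) => val[*a] + val[*b],
            Node::Mul(a, b) => val[*a] * val[*b],
        };
    }
    // Backward pass: seed the output with 1 and push adjoints to operands.
    let mut adj = vec![0.0; graph.len()];
    adj[output] = 1.0;
    for i in (0..=output).rev() {
        let g = adj[i];
        match &graph[i] {
            Node::Input(_) => {}
            Node::Add(a, b) => {
                adj[*a] += g;
                adj[*b] += g;
            }
            Node::Mul(a, b) => {
                adj[*a] += g * val[*b];
                adj[*b] += g * val[*a];
            }
        }
    }
    adj
}
```

For `f = x * y + x` at `x = 3`, `y = 4`, the adjoints of the two inputs come out as `5` (which is `y + 1`) and `3` (which is `x`). The accumulation with `+=` matters here because `x` is used twice.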

Runtime

C Runtime Layer

lib/runtime/
├── runtime/
│   └── src/
│       ├── alloc/
│       │   ├── slab.c      # Slab allocator
│       │   ├── arena.c     # Arena allocator
│       │   └── vumm.c      # VUMM runtime support
│       ├── platform/
│       │   ├── syscall.c   # Raw syscalls
│       │   └── thread.c    # Threading primitives
│       └── core/
│           ├── panic.c     # Panic handling
│           └── print.c     # Basic I/O

Slab Allocator

c
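#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint64_t */
#include <pthread.h>  /* pthread_mutex_t */

typedef struct slab {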
    void* memory;
    size_t object_size;
    size_t capacity;
    uint64_t* bitmap;
    struct slab* next;
} slab_t;

typedef struct slab_cache {
    slab_t* partial;
    slab_t* full;
    slab_t* empty;
    size_t object_size;
    pthread_mutex_t lock;
} slab_cache_t;
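
The `bitmap` field suggests one bit per object slot. The sketch below shows how such a bitmap can hand out and reclaim slots; it is written in Rust for consistency with the other examples, and the `SlabBitmap` type and the convention that a set bit means "in use" are assumptions, not a description of `slab.c`. In the C layout above, the object address for a slot would then be `memory + slot * object_size`.

```rust
/// Bitmap over slot indices; a set bit means the slot is in use.
pub struct SlabBitmap {
    words: Vec<u64>,
    capacity: usize,
}

impl SlabBitmap {
    pub fn new(capacity: usize) -> Self {
        SlabBitmap {
            words: vec![0; (capacity + 63) / 64],
            capacity,
        }
    }

    /// Claims the lowest free slot, or returns None when the slab is full.
    pub fn alloc(&mut self) -> Option<usize> {
        for (w, word) in self.words.iter_mut().enumerate() {
            if *word != u64::MAX {
                // The lowest zero bit of `word` is the lowest set bit of `!word`.
                let bit = (!*word).trailing_zeros() as usize;
                let slot = w * 64 + bit;
                if slot >= self.capacity {
                    return None; // only padding bits past the capacity remain
                }
                *word |= 1u64 << bit;
                return Some(slot);
            }
        }
        None
    }

    /// Marks a slot as free again.
    pub fn free(&mut self, slot: usize) {
        assert!(slot < self.capacity);
        self.words[slot / 64] &= !(1u64 << (slot % 64));
    }
}
```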

Thread-Local Caching

c
typedef struct thread_cache {
    void* free_list[SIZE_CLASSES];
    size_t free_count[SIZE_CLASSES];
} thread_cache_t;

__thread thread_cache_t* tlc = NULL;

Diagnostics (vex-diagnostics)

Rich error reporting with source locations.

rust
pub struct Diagnostic {
    severity: Severity,
    message: String,
    span: Span,
    labels: Vec<Label>,
    notes: Vec<String>,
    suggestions: Vec<Suggestion>,
}

impl Diagnostic {
    pub fn emit(&self, source: &str) {
        // Pretty-print error with source context
        // Colored output, underlines, suggestions
    }
}
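
The core of the pretty-printing step is mapping a span to a source line and an underline. A minimal sketch, assuming a span has already been resolved to a 1-based line, a 0-based column, and a length (the `render_snippet` function and its parameters are hypothetical, and color and multi-line spans are ignored):

```rust
/// Renders one source line with a caret underline and a label beneath it.
/// `line` is 1-based, `col` is a 0-based column, `len` is the span length.
pub fn render_snippet(source: &str, line: usize, col: usize, len: usize, label: &str) -> String {
    let text = source.lines().nth(line.saturating_sub(1)).unwrap_or("");
    let gutter = line.to_string();
    // Blank gutter of the same width, used on lines without a line number.
    let pad = " ".repeat(gutter.len());
    format!(
        "{p} |\n{g} | {t}\n{p} | {s}{c} {l}",
        p = pad,
        g = gutter,
        t = text,
        s = " ".repeat(col),
        c = "^".repeat(len),
        l = label
    )
}
```

Because the underline is indented by the same gutter width plus `col` spaces, the carets line up under the offending text regardless of how many digits the line number has.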

Example Output

error[E0382]: borrow of moved value: `data`
  --> src/main.vx:10:20
   |
8  |     let data = vec![1, 2, 3];
   |         ---- move occurs because `data` has type `Vec<i32>`
9  |     consume(data);
   |             ---- value moved here
10 |     println(data);
   |             ^^^^ value borrowed here after move
   |
help: consider cloning the value if you need to use it again
   |
9  |     consume(data.clone());
   |                 ++++++++

Build System Integration

Incremental Compilation

rust
#[salsa::query_group(CompilerDatabase)]
pub trait Compiler {
    #[salsa::input]
    fn source(&self, file: FileId) -> Arc<String>;

    fn tokens(&self, file: FileId) -> Arc<Vec<Token>>;
    fn syntax(&self, file: FileId) -> Arc<SyntaxTree>;
    fn hir(&self, file: FileId) -> Arc<Hir>;
    fn types(&self, file: FileId) -> Arc<TypeInfo>;
}
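
The idea behind the query group is that derived queries are cached and recomputed only when an input they depend on has changed. The sketch below imitates that with a revision counter and the standard library alone. It is not Salsa's API: the `Db` type, the `token_count` query, and the whitespace-based "tokenizer" are stand-ins for illustration.

```rust
use std::collections::HashMap;

/// Toy database with one input (`source`) and one derived query (`token_count`).
#[derive(Default)]
pub struct Db {
    revision: u64,
    sources: HashMap<u32, (String, u64)>,     // file -> (text, revision it changed at)
    token_counts: HashMap<u32, (usize, u64)>, // file -> (value, revision it was computed at)
    pub recomputations: usize,
}

impl Db {
    /// Setting an input bumps the global revision.
    pub fn set_source(&mut self, file: u32, text: &str) {
        self.revision += 1;
        self.sources.insert(file, (text.to_string(), self.revision));
    }

    /// Recomputes only if the source changed after the cached value was computed.
    pub fn token_count(&mut self, file: u32) -> usize {
        let (text, changed_at) = match self.sources.get(&file) {
            Some((t, r)) => (t.as_str(), *r),
            None => return 0,
        };
        if let Some(&(value, computed_at)) = self.token_counts.get(&file) {
            if computed_at >= changed_at {
                return value; // cache hit
            }
        }
        let value = text.split_whitespace().count();
        self.recomputations += 1;
        self.token_counts.insert(file, (value, self.revision));
        value
    }
}
```

Calling `token_count` twice without touching the source runs the computation once; editing the file invalidates only that file's cached value.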

Parallel Compilation

rust
use rayon::prelude::*; // provides `par_iter`

pub fn compile_crate(files: Vec<FileId>) -> Result<(), Error> {
    // Parse all files in parallel
    let parsed: Vec<_> = files
        .par_iter()
        .map(|f| parse(f))
        .collect();

    // Type check with dependency ordering
    let sorted = topological_sort(&parsed);
    for batch in sorted {
        batch.par_iter().for_each(|f| type_check(f));
    }

    // Generate code
    files.par_iter().for_each(|f| codegen(f));

    Ok(())
}
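
The "dependency ordering" step needs batches in which every file's dependencies have already been checked. A minimal sketch of that batching, using only the standard library: the `topological_batches` name, the `u32` file ids, and the dependency-map input are assumptions for this example, and it expects every file to appear as a key in the map.

```rust
use std::collections::HashMap;

/// Groups files into batches so that each file's dependencies sit in an
/// earlier batch. `deps[f]` lists the files that `f` depends on.
/// Returns None if the dependencies contain a cycle.
pub fn topological_batches(deps: &HashMap<u32, Vec<u32>>) -> Option<Vec<Vec<u32>>> {
    let mut remaining: HashMap<u32, Vec<u32>> = deps.clone();
    let mut done: Vec<u32> = Vec::new();
    let mut batches = Vec::new();
    while !remaining.is_empty() {
        // A file is ready once all of its dependencies are done.
        let mut ready: Vec<u32> = remaining
            .iter()
            .filter(|(_, ds)| ds.iter().all(|d| done.contains(d)))
            .map(|(f, _)| *f)
            .collect();
        if ready.is_empty() {
            return None; // nothing can make progress: cycle
        }
        ready.sort(); // deterministic order within a batch
        for f in &ready {
            remaining.remove(f);
        }
        done.extend(&ready);
        batches.push(ready);
    }
    Some(batches)
}
```

Files within one batch do not depend on each other, which is what makes it safe to type check a batch in parallel.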

Released under the MIT License.