Architecture
This section explains how Vex source moves through the compiler and where the major subsystems live in the repository.
Pipeline Overview
source (.vx)
-> lexer
-> parser / syntax tree
-> HIR lowering
-> type inference + borrow analysis + semantic checks
-> codegen path selection
   -> LLVM/native path
   -> SIR path for data-parallel graphs
-> link / run / test integration
The important split is that Vex has both:
- a native LLVM-oriented path for ordinary systems code
- a SIR path for tensor-, SIMD-, and graph-oriented lowering
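As an illustration of the split, path selection can be thought of as a per-function routing decision. The sketch below is hypothetical: OpKind, CodegenPath, and select_path are invented names for this example and are not taken from the Vex codebase, and the real selection criteria may be more involved.

```rust
// Hypothetical sketch: none of these names come from the Vex codebase.

/// Coarse classification of the operations found in a function body.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpKind {
    Scalar,   // ordinary control flow and arithmetic
    TensorOp, // tensor or array computation
    SimdOp,   // explicit vector or mask operation
}

/// The two lowering paths described in the pipeline overview.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodegenPath {
    Native, // lower through LLVM
    Sir,    // lower through the SIR graph path
}

/// Assumed rule: route to SIR if any operation is data-parallel,
/// otherwise stay on the native path.
pub fn select_path(ops: &[OpKind]) -> CodegenPath {
    if ops
        .iter()
        .any(|op| matches!(op, OpKind::TensorOp | OpKind::SimdOp))
    {
        CodegenPath::Sir
    } else {
        CodegenPath::Native
    }
}
```

Under this assumed rule, a function containing only scalar operations stays on the native path, while a single tensor or SIMD operation is enough to route it to SIR.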
Major Crates
crates/
vex-lexer tokenization
vex-parser syntax parsing and recovery
vex-syntax syntax node definitions
vex-hir semantic IR, inference, borrow checking
vex-sir Silicon IR graphs and backend lowering
vex-diagnostics diagnostics formatting and reporting
vex-compiler main codegen pipeline and prelude/runtime integration
Front End
vex-lexer
Responsible for turning source text into tokens.
vex-parser
Builds the syntax tree and handles recovery from malformed input well enough to support diagnostics and tooling.
vex-hir
This is where most language semantics become concrete:
- name resolution
- type inference
- pattern handling
- borrow and move analysis
- enum/result/option semantics
Native Codegen Path
The native path in vex-compiler lowers checked HIR into LLVM IR and from there into object code and linked executables.
This path is used for ordinary systems programming, CLI tools, servers, runtime code, and general application code.
SIR Path
vex-sir handles data-parallel graph lowering for tensor- and array-oriented computation. It is the basis for:
- SIMD-friendly expression lowering
- graph fusion
- compute backends such as SPIR-V, WGSL, and Metal
- graph/runtime dispatch decisions for heterogeneous execution
If you are working on vectorized operators, tensors, masks, or GPU-facing computation, this is the path to understand.
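To give a feel for what graph fusion buys, here is a minimal sketch of elementwise fusion: a small expression graph is evaluated in one pass per element, so no intermediate arrays are materialized. The Node type and eval_fused function are hypothetical and are not the vex-sir API.

```rust
// Hypothetical sketch of elementwise fusion; not the vex-sir API.

#[derive(Debug, Clone)]
pub enum Node {
    Input(usize),      // index into the input slices
    Const(f32),        // scalar broadcast to every element
    Add(usize, usize), // operands are indices of earlier nodes
    Mul(usize, usize),
}

/// Evaluate the last node of `graph` for every element in a single pass.
/// Nodes may only reference earlier nodes (topological order), and every
/// input slice must hold at least `len` elements.
pub fn eval_fused(graph: &[Node], inputs: &[&[f32]], len: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(len);
    // One scratch slot per node, reused for every element.
    let mut scratch = vec![0.0f32; graph.len()];
    for i in 0..len {
        for (id, node) in graph.iter().enumerate() {
            scratch[id] = match *node {
                Node::Input(k) => inputs[k][i],
                Node::Const(c) => c,
                Node::Add(a, b) => scratch[a] + scratch[b],
                Node::Mul(a, b) => scratch[a] * scratch[b],
            };
        }
        out.push(*scratch.last().expect("graph must not be empty"));
    }
    out
}
```

For the graph x * y + 1, an unfused evaluator would allocate a temporary array for x * y before adding the constant. The fused loop keeps each intermediate in a scalar slot, which is the shape a SIMD or GPU backend would then vectorize.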
Runtime and Tooling
Outside the language crates, the repository also includes:
- runtime support under lib/runtime/
- CLI tooling under tools/vex-cli/
- editor/LSP tooling under tools/vex-lsp/
- formatter and supporting developer tools under tools/
Where to Start Reading
- Start in Guide if you want language semantics.
- Start in CLI Reference if you want execution and command behavior.
- Start in GPU & SIR and SIMD if you care about the graph path.
Architecture Pages
- Compiler Pipeline: source to HIR to native/SIR lowering
- SIR & Backends: graph path, backend maturity, and acceleration routing
- Runtime & Tooling: runtime model, CLI, tests, docs, and editor tooling
Lexer (vex-lexer)
The lexer uses Logos for high-performance tokenization.
Token Types
#[derive(Logos, Debug, Clone, PartialEq)]
pub enum Token {
// Keywords
#[token("fn")] Fn,
#[token("let")] Let,
#[token("let!")] LetMut,
#[token("if")] If,
#[token("else")] Else,
// ...
// Literals
#[regex(r"[0-9]+", parse_int)]
IntLiteral(i64),
#[regex(r#""[^"]*""#, parse_string)]
StringLiteral(String),
// Identifiers
#[regex(r"[a-zA-Z_][a-zA-Z0-9_]*")]
Identifier,
}
Automatic Semicolon Insertion
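The three insertion conditions described in this section can be sketched as a runnable pass over a toy token type. Everything here is an assumption made for illustration: the Tok enum, the explicit Newline token, and the can_end/can_start sets are hypothetical, and the real lexer may track newlines and statement boundaries differently.

```rust
// Hypothetical sketch of newline-driven semicolon insertion.

#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    Ident(String),
    Int(i64),
    Plus,
    LParen,
    RParen,
    Newline,
    Semi,
}

// Assumed set of tokens that can end a statement.
fn can_end(t: &Tok) -> bool {
    matches!(t, Tok::Ident(_) | Tok::Int(_) | Tok::RParen)
}

// Assumed set of tokens that can start a statement.
fn can_start(t: &Tok) -> bool {
    matches!(t, Tok::Ident(_) | Tok::Int(_) | Tok::LParen)
}

/// Drop newline tokens, replacing each with `;` when the previous token can
/// end a statement, the next one can start one, and we are not inside parens.
pub fn insert_semicolons(tokens: &[Tok]) -> Vec<Tok> {
    let mut out = Vec::new();
    let mut depth = 0usize;
    for (i, tok) in tokens.iter().enumerate() {
        match tok {
            Tok::LParen => {
                depth += 1;
                out.push(tok.clone());
            }
            Tok::RParen => {
                depth = depth.saturating_sub(1);
                out.push(tok.clone());
            }
            Tok::Newline => {
                let prev_ends = out.last().map_or(false, can_end);
                let next_starts = tokens[i + 1..]
                    .iter()
                    .find(|t| **t != Tok::Newline)
                    .map_or(false, can_start);
                if depth == 0 && prev_ends && next_starts {
                    out.push(Tok::Semi);
                }
            }
            _ => out.push(tok.clone()),
        }
    }
    out
}
```

With these assumed sets, a newline after a trailing + does not terminate the statement, because + cannot end one.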
fn insert_semicolons(tokens: Vec<Token>) -> Vec<Token> {
// Insert semicolons at newlines when:
// - Previous token can end a statement
// - Next token can start a statement
// - Not inside brackets/parens
}
Parser (vex-parser)
Built on Rowan for lossless syntax trees with error recovery.
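Error recovery in parsers of this kind is commonly done by recording the error and skipping ahead to a synchronization token. The sketch below shows that idea on a toy grammar of let statements. It uses plain Rust rather than Rowan, and Tok, LetStmt, and parse_lets are hypothetical names, not part of vex-parser.

```rust
// Hypothetical sketch of skip-to-sync-token recovery; not the vex-parser API.

#[derive(Debug, Clone, PartialEq)]
pub enum Tok {
    Let,
    Ident(String),
    Eq,
    Int(i64),
    Semi,
}

#[derive(Debug, PartialEq)]
pub struct LetStmt {
    pub name: String,
    pub value: i64,
}

/// Parse a sequence of `let <ident> = <int> ;` statements.
/// On a malformed statement, record an error and resume after the next `;`.
pub fn parse_lets(toks: &[Tok]) -> (Vec<LetStmt>, Vec<String>) {
    let mut stmts = Vec::new();
    let mut errors = Vec::new();
    let mut pos = 0;
    while pos < toks.len() {
        match &toks[pos..] {
            [Tok::Let, Tok::Ident(name), Tok::Eq, Tok::Int(value), Tok::Semi, ..] => {
                stmts.push(LetStmt { name: name.clone(), value: *value });
                pos += 5;
            }
            _ => {
                errors.push(format!("malformed statement at token {pos}"));
                // Recovery: skip to the next `;`, or to the end of input.
                while pos < toks.len() && toks[pos] != Tok::Semi {
                    pos += 1;
                }
                pos += 1; // step past the `;` itself
            }
        }
    }
    (stmts, errors)
}
```

The point of the design is that one malformed statement produces one error and parsing continues, so later statements still reach diagnostics and tooling.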
Syntax Kinds
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub enum SyntaxKind {
// Expressions
BinaryExpr,
UnaryExpr,
CallExpr,
IndexExpr,
FieldExpr,
// Statements
LetStmt,
ExprStmt,
ReturnStmt,
// Items
FnDef,
StructDef,
EnumDef,
NODE_CONTRACT_IMPL,
NODE_CONTRACT,
// Types
PathType,
RefType,
ArrayType,
FnType,
// Patterns
IdentPat,
TuplePat,
StructPat,
// Tokens
Ident,
IntLit,
StringLit,
// ...
}
Error Recovery
fn parse_block(&mut self) -> Block {
self.expect(T!['{']);
let mut stmts = vec![];
while !self.at(T!['}']) && !self.at_end() {
match self.parse_stmt() {
Ok(stmt) => stmts.push(stmt),
Err(e) => {
self.error(e);
self.recover_to(&[T!['}'], T![;]]);
}
}
}
self.expect(T!['}']);
Block { stmts }
}
HIR (vex-hir)
High-level IR with semantic information, powered by Salsa for incremental computation.
Type System
pub enum Ty {
// Primitives
Int(IntTy), // i8, i16, i32, i64, i128
Uint(UintTy), // u8, u16, u32, u64, u128
Float(FloatTy), // f16, f32, f64
Bool,
Char,
Str,
Never,
Unit,
// Compound
Tuple(Vec<Ty>),
Array(Box<Ty>, usize),
Slice(Box<Ty>),
Ref(Box<Ty>, Mutability, Lifetime),
Ptr(Box<Ty>, Mutability),
// User-defined
Adt(AdtId, Substs),
Fn(FnSig),
Closure(ClosureId, Substs),
// Inference
Infer(InferTy),
Error,
}
Borrow Checker
The borrow checker implements Polonius-style Non-Lexical Lifetimes (NLL).
pub struct BorrowChecker {
cfg: ControlFlowGraph,
facts: AllFacts,
regions: RegionInference,
}
impl BorrowChecker {
pub fn check(&mut self) -> Vec<BorrowError> {
// Phase 1: Build control flow graph
self.build_cfg();
// Phase 2: Compute liveness
self.compute_liveness();
// Phase 3: Compute regions
self.compute_regions();
// Phase 4: Check borrows
self.check_borrows()
}
}
Four-Phase Analysis
- Immutability Check: Verify let vs let! usage
- Move Analysis: Track ownership transfers
- Borrow Analysis: Verify borrowing rules
- Lifetime Analysis: NLL region inference
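To make the move-analysis phase concrete, here is a minimal sketch that detects use-after-move on straight-line code. It ignores control flow, borrows, and lifetimes, so it is far simpler than a CFG-based checker. Stmt and check_moves are hypothetical names for this example only.

```rust
// Hypothetical sketch of use-after-move detection on straight-line code.
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    Let(&'static str),  // introduce an owned binding
    Move(&'static str), // transfer ownership, e.g. passing by value
    Use(&'static str),  // read or borrow the binding
}

/// Return one message per use of a binding after it was moved.
pub fn check_moves(body: &[Stmt]) -> Vec<String> {
    // For each binding: the statement index where it was moved, if any.
    let mut moved_at: HashMap<&'static str, Option<usize>> = HashMap::new();
    let mut errors = Vec::new();
    for (i, stmt) in body.iter().enumerate() {
        match stmt {
            Stmt::Let(name) => {
                moved_at.insert(*name, None);
            }
            Stmt::Move(name) | Stmt::Use(name) => {
                let prior = moved_at.get(name).copied().flatten();
                match prior {
                    Some(at) => errors.push(format!(
                        "use of moved value `{name}` at statement {i} (moved at {at})"
                    )),
                    None => {
                        if matches!(stmt, Stmt::Move(_)) {
                            moved_at.insert(*name, Some(i));
                        }
                    }
                }
            }
        }
    }
    errors
}
```

A let, followed by a move, followed by a use of the same binding yields a single error pointing at the use and at the statement where the move happened.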
VUMM (Vex Unified Memory Model)
Automatic memory strategy selection.
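As a rough illustration of what strategy selection could look like, the sketch below maps two analysis facts to an ownership kind. The decision rule is an assumption made for this example, not the documented VUMM algorithm. The Facts struct and decide_kind function are hypothetical, and the enum mirrors only three of the BoxKind variants.

```rust
// Hypothetical sketch of a kind decision; the rule itself is an assumption.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BoxKind {
    Unique,    // single owner
    SharedRc,  // reference counted, single thread
    AtomicArc, // atomically reference counted, multiple threads
}

/// Facts that earlier analysis phases are assumed to have produced.
pub struct Facts {
    pub cloned: bool,          // is the value ever cloned or shared?
    pub crosses_threads: bool, // does any handle reach another thread?
}

/// Assumed rule: pick the cheapest kind that is still sound for these facts.
pub fn decide_kind(facts: &Facts) -> BoxKind {
    match (facts.cloned, facts.crosses_threads) {
        (false, _) => BoxKind::Unique, // a single owner can simply be moved
        (true, false) => BoxKind::SharedRc,
        (true, true) => BoxKind::AtomicArc,
    }
}
```

The ordering reflects cost: unique ownership needs no counter, a non-atomic count is cheaper than an atomic one, and the atomic variant is chosen only when sharing crosses threads.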
pub enum BoxKind {
Unique, // Single owner
SharedRc, // Reference counted (single-thread)
AtomicArc, // Atomic ref counted (multi-thread)
Unknown, // Not yet determined
}
pub struct VummAnalysis {
kinds: HashMap<ExprId, BoxKind>,
}
impl VummAnalysis {
pub fn analyze(&mut self, hir: &Hir) {
// Phase 1: Escape analysis
self.escape_analysis();
// Phase 2: Clone analysis
self.clone_analysis();
// Phase 3: Thread analysis
self.thread_analysis();
// Phase 4: Kind decision
self.decide_kinds();
// Phase 5: Elision optimization
self.optimize_elisions();
}
}
SIR (Silicon IR)
Intermediate representation for data-parallel compute, targeting GPU backends as well as CPU SIMD and scalar fallbacks.
Node Types
pub enum SirNode {
// Data
Constant(Value),
Parameter(usize),
Tensor(Shape, DType),
// Arithmetic
Add(NodeId, NodeId),
Sub(NodeId, NodeId),
Mul(NodeId, NodeId),
Div(NodeId, NodeId),
// Tensor ops
MatMul(NodeId, NodeId),
Transpose(NodeId),
Reshape(NodeId, Shape),
// Control flow
If(NodeId, NodeId, NodeId),
Loop(NodeId, NodeId),
// Parallel
ParallelFor(Range, NodeId),
Reduce(NodeId, ReduceOp),
Scan(NodeId, ScanOp),
}
Backends
pub trait Backend {
fn compile(&self, graph: &SirGraph) -> CompiledKernel;
fn execute(&self, kernel: &CompiledKernel, inputs: &[Tensor]) -> Vec<Tensor>;
}
pub struct SpirVBackend; // Vulkan
pub struct WgslBackend; // WebGPU
pub struct MetalBackend; // Apple Silicon
pub struct ScalarBackend; // CPU fallback
pub struct SimdBackend; // CPU SIMD
Automatic Differentiation
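The adjoint rules for addition and multiplication are easiest to see on scalars. The following is a minimal, self-contained sketch of reverse-mode differentiation over a tape of Add and Mul operations. Op and gradients are hypothetical names, and the sketch works on f64 values rather than SIR graphs.

```rust
// Hypothetical scalar sketch of reverse-mode differentiation.

#[derive(Debug, Clone, Copy)]
pub enum Op {
    Input(f64),
    Add(usize, usize), // operands are indices of earlier tape entries
    Mul(usize, usize),
}

/// Return d(output)/d(node) for every node on the tape. The tape must be in
/// topological order (operands before users); the output is the last entry.
pub fn gradients(tape: &[Op]) -> Vec<f64> {
    // Forward pass: compute each node's value.
    let mut value = vec![0.0; tape.len()];
    for (i, op) in tape.iter().enumerate() {
        value[i] = match *op {
            Op::Input(v) => v,
            Op::Add(a, b) => value[a] + value[b],
            Op::Mul(a, b) => value[a] * value[b],
        };
    }
    // Backward pass: seed the output adjoint with 1, then visit users
    // before their operands so each adjoint is complete when it is read.
    let mut adjoint = vec![0.0; tape.len()];
    if let Some(last) = adjoint.last_mut() {
        *last = 1.0;
    }
    for (i, op) in tape.iter().enumerate().rev() {
        let g = adjoint[i];
        match *op {
            Op::Input(_) => {}
            Op::Add(a, b) => {
                adjoint[a] += g;
                adjoint[b] += g;
            }
            Op::Mul(a, b) => {
                adjoint[a] += g * value[b];
                adjoint[b] += g * value[a];
            }
        }
    }
    adjoint
}
```

For f = x * y + x with x = 3 and y = 4, the adjoint of x accumulates from both uses (4 from the product and 1 from the sum), giving 5, while the adjoint of y is 3.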
pub struct Autograd {
forward_graph: SirGraph,
backward_graph: SirGraph,
adjoints: HashMap<NodeId, NodeId>,
}
impl Autograd {
pub fn backward(&mut self, output: NodeId) -> SirGraph {
// Reverse-mode automatic differentiation
self.adjoints.insert(output, self.constant(1.0));
for node in self.forward_graph.reverse_postorder() {
let adjoint = self.adjoints[&node];
match &self.forward_graph[node] {
SirNode::Add(a, b) => {
self.accumulate_adjoint(*a, adjoint);
self.accumulate_adjoint(*b, adjoint);
},
SirNode::Mul(a, b) => {
self.accumulate_adjoint(*a, self.mul(adjoint, *b));
self.accumulate_adjoint(*b, self.mul(adjoint, *a));
},
// ... other ops
}
}
self.backward_graph.clone()
}
}
Runtime
C Runtime Layer
lib/runtime/
├── runtime/
│ └── src/
│ ├── alloc/
│ │ ├── slab.c # Slab allocator
│ │ ├── arena.c # Arena allocator
│ │ └── vumm.c # VUMM runtime support
│ ├── platform/
│ │ ├── syscall.c # Raw syscalls
│ │ └── thread.c # Threading primitives
│ └── core/
│ ├── panic.c # Panic handling
│ └── print.c # Basic I/O
Slab Allocator
typedef struct slab {
void* memory;
size_t object_size;
size_t capacity;
uint64_t* bitmap;
struct slab* next;
} slab_t;
typedef struct slab_cache {
slab_t* partial;
slab_t* full;
slab_t* empty;
size_t object_size;
pthread_mutex_t lock;
} slab_cache_t;
Thread-Local Caching
typedef struct thread_cache {
void* free_list[SIZE_CLASSES];
size_t free_count[SIZE_CLASSES];
} thread_cache_t;
__thread thread_cache_t* tlc = NULL;
Diagnostics (vex-diagnostics)
Rich error reporting with source locations.
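The core of source-located reporting is turning a span into an underline beneath the offending text. The sketch below renders a single line with a caret underline and a label. The Span struct and render_line function are hypothetical and simplified: the real vex-diagnostics types are not shown here, and the sketch assumes ASCII source so that byte offsets equal columns.

```rust
// Hypothetical sketch of span underlining; not the vex-diagnostics API.

/// Byte range within a single source line (ASCII assumed).
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Render one source line followed by a caret underline and a label.
pub fn render_line(line_no: usize, line: &str, span: &Span, label: &str) -> String {
    let gutter = line_no.to_string();
    // Blank gutter of the same width keeps the `|` columns aligned.
    let pad = " ".repeat(gutter.len());
    // Always draw at least one caret, even for an empty span.
    let underline = "^".repeat(span.end.saturating_sub(span.start).max(1));
    format!(
        "{gutter} | {line}\n{pad} | {}{underline} {label}",
        " ".repeat(span.start)
    )
}
```

Calling render_line(10, "println(data);", &Span { start: 8, end: 12 }, "value borrowed here after move") places four carets directly under data.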
pub struct Diagnostic {
severity: Severity,
message: String,
span: Span,
labels: Vec<Label>,
notes: Vec<String>,
suggestions: Vec<Suggestion>,
}
impl Diagnostic {
pub fn emit(&self, source: &str) {
// Pretty-print error with source context
// Colored output, underlines, suggestions
}
}
Example Output
error[E0382]: borrow of moved value: `data`
--> src/main.vx:10:20
|
8 | let data = vec![1, 2, 3];
| ---- move occurs because `data` has type `Vec<i32>`
9 | consume(data);
| ---- value moved here
10 | println(data);
| ^^^^ value borrowed here after move
|
help: consider cloning the value if you need to use it again
|
9 | consume(data.clone());
| ++++++++
Build System Integration
Incremental Compilation
#[salsa::query_group(CompilerDatabase)]
pub trait Compiler {
#[salsa::input]
fn source(&self, file: FileId) -> Arc<String>;
fn tokens(&self, file: FileId) -> Arc<Vec<Token>>;
fn syntax(&self, file: FileId) -> Arc<SyntaxTree>;
fn hir(&self, file: FileId) -> Arc<Hir>;
fn types(&self, file: FileId) -> Arc<TypeInfo>;
}
Parallel Compilation
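Type checking in dependency order with parallelism inside each step implies grouping files into batches, where every file in a batch depends only on files from earlier batches. One way to compute such batches is sketched here. dependency_batches is a hypothetical helper that uses plain u32 ids in place of FileId, and it makes no claim about how the compiler's own topological sort is implemented.

```rust
// Hypothetical sketch of dependency batching for parallel type checking.
use std::collections::HashMap;

/// Group files into batches so that every file appears after all of its
/// dependencies. `deps[f]` lists the files that `f` depends on.
/// Returns None if the dependencies contain a cycle.
pub fn dependency_batches(deps: &HashMap<u32, Vec<u32>>) -> Option<Vec<Vec<u32>>> {
    let mut remaining: HashMap<u32, Vec<u32>> = deps.clone();
    let mut batches = Vec::new();
    while !remaining.is_empty() {
        // A file is ready once none of its dependencies is still pending.
        let mut ready: Vec<u32> = remaining
            .iter()
            .filter(|(_, ds)| ds.iter().all(|d| !remaining.contains_key(d)))
            .map(|(f, _)| *f)
            .collect();
        if ready.is_empty() {
            return None; // every pending file waits on another: a cycle
        }
        ready.sort_unstable(); // deterministic order within a batch
        for f in &ready {
            remaining.remove(f);
        }
        batches.push(ready);
    }
    Some(batches)
}
```

Files within one batch have no dependencies on each other, so each batch can be handed to a parallel iterator while the batches themselves run in sequence.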
pub fn compile_crate(files: Vec<FileId>) -> Result<(), Error> {
// Parse all files in parallel
let parsed: Vec<_> = files
.par_iter()
.map(|f| parse(f))
.collect();
// Type check with dependency ordering
let sorted = topological_sort(&parsed);
for batch in sorted {
batch.par_iter().for_each(|f| type_check(f));
}
// Generate code
files.par_iter().for_each(|f| codegen(f));
Ok(())
}
Next Steps
- Language Reference - Complete syntax reference
- Contributing - How to contribute
- API Documentation - Internal API docs