Rex Spec (Locked Semantics)
This document records the intended, production-facing semantics of the current Rex implementation. When behavior changes, this file and the corresponding regression tests should be updated together.
Regression tests live in:
rex/tests/spec_semantics.rsrex/tests/record_update.rsrex/tests/typeclasses_system.rsrex/tests/negative.rs
Notation
Γ ⊢ e : τmeans “under type environmentΓ, expressionehas typeτ”.C τmeans a typeclass predicate (constraint) for classCat typeτ.- “Ground” means “contains no free type variables” (
ftv(τ) = ∅). - Rex’s multi-parameter classes are represented internally by packing the parameters into tuples:
- unary
C aisPredicate { class: C, typ: a } - binary
C t aisPredicate { class: C, typ: (t, a) } - etc.
- unary
Lexical Comments
Rex comments are lexical trivia and are removed before parsing:
//starts a line comment that runs to the next newline or end of file./* ... */starts a block comment. Block comments may span lines.- Nested block comments are not supported.
- The legacy
{- ... -}spelling is ordinary syntax, not a comment.
String Literals
String literals are decoded during lexing.
- Double-quoted (
"...") and single-quoted ('...') literals both producestringvalues. - C-style simple escapes are supported:
\a,\b,\f,\n,\r,\t,\v,\\,\",\', and\?. - Octal escapes use one to three octal digits (
\0through\777). - Hex escapes use
\xfollowed by one or more hexadecimal digits. - Unicode escapes use
\ufollowed by exactly four hexadecimal digits, or\Ufollowed by exactly eight hexadecimal digits. - Backslash-newline is a line continuation and produces no character.
- Unsupported or malformed escape sequences are lexical errors.
Program Entry Points
A Rex source is a compilation unit with zero or more declarations and an optional final expression. Entry-point execution uses a single program entry point:
- If the source defines a top-level
fn main, that function is the entry point. It is an error for the same source to also contain a final expression. - If the source does not define
main, the final expression is treated as an implicit zero-argument entry point. - If the source does not define
mainand has no final expression, entry-point execution is an error.
The CLI supplies arguments to an explicit main from a JSON object passed with
--inputs. The object keys must exactly match the main parameter names, and
each value is converted with json_to_rex using the corresponding parameter
type. JSON inputs require concrete parameter types. A runnable source without
main uses its final expression as an implicit entry point and conceptually has
the empty input object {}.
Module Imports
Rex distinguishes between:
- program entry-point execution, and
- module files used by the import system (declaration-only).
When a .rex file is loaded as a module via the module system, it must not contain a top-level
expression result.
Syntax
Top-level imports support three forms:
import foo.bar as Bar;
import foo.bar (*);
import foo.bar (x, y, z as q);
Rules:
- Import declarations are terminated by explicit semicolons.
import <module> as <Alias>imports the module namespace and requires qualified access (Alias.member).import <module> (*)imports all exported values, types, and classes into unqualified scope.import <module> (x, y as z)imports selected exported values, types, and classes into unqualified scope.as <Alias>on the module and(...)import clauses are mutually exclusive.
Visibility and Exports
Only exported (pub) values, types, and classes are importable through (*) and item clauses.
Module aliases expose all export namespaces for qualified lookup:
-
Alias.valueresolves against exported values (including constructors). -
Alias.Typeresolves against exported type names in type positions. -
Alias.Classresolves against exported class names in class-constraint positions. -
Missing requested exports are module errors.
-
Private (non-
pub) values are not importable.
Name Binding and Conflicts
- Imported unqualified names participate in lexical shadowing.
- Lexically bound names (lambda params,
letvars, pattern bindings) shadow imported names. - Importing a name that conflicts with a local top-level declaration is a module error.
- Importing the same unqualified name more than once (including via aliasing) is a module error.
Type/class rewrites run with declaration ordering semantics:
- In binder forms that carry type syntax (
\ (x : T) -> ...,let rec f : T = ...), the binder being introduced does not suppress alias resolution inside its own annotation. - Missing alias members used in type/class positions (function signatures, annotations,
whereconstraints, instance headers, and superclass clauses) are reported as module errors.
Module Initialization
- Importing a module does not execute arbitrary top-level expressions.
- Module initialization is declaration-driven: exported values/types/classes are registered from declarations, and import resolution rewrites references to canonical internal symbols.
- Cyclic imports are supported via strongly connected component (SCC) loading of module interfaces.
Let Rec Bindings
Syntax
Recursive bindings use let rec with comma-separated entries:
let rec
a = ...,
b = ...
in
body
Rules:
let recentries are separated by commas.let recbindings must bind variables (not arbitrary patterns).
Top-Level Declaration Terminators
Top-level type, fn, declare fn, and import declarations are terminated by explicit
semicolons:
import math.core as Math;
type Box a = Box a;
fn inc x: i32 -> i32 = x + 1;
declare fn host_value : i32;
Rules:
- The semicolon terminates the declaration, not a nested type or expression itself.
- The terminating semicolon is found at top-level expression/type depth; semicolons nested inside parentheses, brackets, braces, or blocks do not terminate the declaration.
- Indentation and newlines do not delimit declarations.
Explicit Type Parameters
Type variables used in annotations, constraints, class heads, or instance heads must be declared by the syntactic form that binds them.
Examples:
type Box a = Box a;
fn id<a> x: a -> a = x;
declare fn host_id<a> x: a -> a;
let id<a>: a -> a = \x -> x in id 1
class Size a where { size : a -> i32; }
instance<a> Show (List a) <= Show a where { show = prim_show; }
Rules:
- A bare unknown type name is an error, even when it starts with a lowercase letter.
- Top-level
fn,declare fn, namedlet, class methods, and instance methods bind type parameters with<...>after the value name. typeandclassdeclarations bind type parameters with whitespace after the declaration head.instancedeclarations bind type parameters with<...>immediately afterinstance.
Top-Level fn Recursion
Top-level fn declarations are mutually recursive within a module.
This means:
- A top-level
fnmay reference itself. - A top-level
fnmay reference other top-levelfndeclarations in the same module, regardless of declaration order.
Operationally, top-level fn recursion follows the same fixed-point semantics as recursive
bindings in let rec, but at declaration scope.
Record Projection
Syntax
Field projection is an expression:
base.field
Typing (Definite Fields)
Let Γ ⊢ base : T. Projection is well-typed iff the field is definitely available on T:
- If
Tis a record type{ ..., field : τ, ... }, thenΓ ⊢ base.field : τ. - If
Tis a single-variant ADT whose payload is a record containingfield : τ, thenΓ ⊢ base.field : τ. - If
Tis a multi-variant ADT, projection is accepted only if the typechecker can prove the current constructor is a specific record-carrying variant (typically viamatchrefinement or by tracking known constructors through let-bound variables).
If the typechecker cannot prove the constructor for a multi-variant ADT, the field is considered “not definitely available”, and projection is rejected.
Evaluation
Evaluation is strict in base. At runtime, projection reads the field out of the record payload:
- for plain records/dicts, it indexes the map by the field symbol.
- for record-carrying ADT values, it indexes the payload record/dict.
Missing fields are a runtime error (EngineError::UnknownField) when projection is attempted on a
non-record-like value.
Record Update
Syntax
Record update is an expression:
{ base with { field1 = e1, field2 = e2 } }
Typing (Definite Fields)
Let Γ ⊢ base : T. Record update is well-typed iff:
- Each updated field exists on the definite record shape of
T. Tis one of:- a record type
{ field: Ty, ... }, OR - a single-variant ADT whose payload is a record, OR
- a multi-variant ADT after the expression has been refined to a specific record-carrying constructor (the typechecker tracks this refinement).
- a record type
- For each update
fieldᵢ = eᵢ, the update expression unifies with the declared field type.
If the base type is a multi-variant ADT and the typechecker cannot prove the current constructor, record update is rejected (the field is “not definitely available”).
Typing: Known-Constructor Refinement
The typechecker refines “which constructor is known” via two mechanisms:
- Pattern matching: within a
case K { ... } -> ...arm, the scrutinee is known to beK. - Let-bound known constructors: when a variable is bound to a value constructed with a record-carrying constructor, the variable may carry “known variant” information forward.
This enables the common pattern:
type Sum = A { x: i32 } | B { x: i32 };
let s: Sum = A { x = 1 } in
match s with {
case A {x} -> { s with { x = x + 1 } };
case B {x} -> { s with { x = x + 2 } };
}
Evaluation
Evaluation is strict:
- Evaluate
baseto a value. - Evaluate all update expressions (left-to-right in the implementation’s map iteration order).
- Apply updates:
- If
baseis a plain record/dict value, updates replace existing fields. - If
baseis an ADT whose payload is a record/dict, updates replace fields in the payload and re-wrap the constructor tag.
- If
Runtime errors:
- Updating a non-record-like runtime value is
EngineError::UnsupportedExpr.
Type Classes: Coherence, Resolution, and Ambiguity
Instance Coherence (No Overlap)
For each class C, instance heads are non-overlapping:
- When injecting a new instance head
H, it is rejected if it unifies with any existing head for the same classC.
This forbids overlap and preserves deterministic method resolution.
Regression: spec_typeclass_instance_overlap_is_rejected (rex/tests/spec_semantics.rs).
Qualified Class Names in instance Headers
The class name in an instance header may be qualified through a module alias:
import dep as D;
instance D.Pick i32 where {
pick = 7;
}
The alias member must be an exported class from the referenced module; otherwise import-use validation fails before typechecking/evaluation.
Method Resolution (Runtime)
At runtime, class methods are resolved by unification against the inferred call type.
Let m be a class method, and let its call site be typed with monomorphic call type τ_call.
Resolution:
- Determine the “instance parameter type” for the method by unifying
τ_callwith the method’s scheme and extracting the predicate corresponding to the method’s defining class. - If the instance parameter type is still headed by a type variable (not ground enough to pick an
instance), the use is ambiguous:
- If
mis used as a function value (i.e.τ_callis a function type), the engine returns an overloaded function value and defers resolution until the function is applied with concrete arguments. - If
mis used as a value (non-function), the engine errors (EngineError::AmbiguousOverload).
- If
- If exactly one instance head unifies with the instance parameter type, its method body is specialized and evaluated.
- If none match, the engine errors (
EngineError::MissingTypeclassImpl). - If more than one match (should not occur given non-overlap), the engine errors
(
EngineError::AmbiguousTypeclassImpl).
Regression: spec_typeclass_method_value_without_type_is_ambiguous (rex/tests/spec_semantics.rs).
Overloaded Method Values (Deferred Resolution)
If a class method is used as a function value, the engine may defer instance selection until the function is applied with concrete argument types. This supports idioms like:
let f = map ((+) 1) in
( f [1, 2, 3]
, f (Some 41)
)
Here f is polymorphic over the Functor dictionary; at each call site, the engine resolves
map using the argument type (List i32 vs Option i32) and dispatches to the corresponding
instance method body.
Instance-Method Checking (Static)
Inside an instance method body, only the instance context is available as “given” constraints:
- Given predicates start with the instance’s explicit context.
- The superclass closure of that context is added (repeat until fixed point).
- The instance head itself is also considered given (dictionary recursion).
Rules:
- Ground predicates required by the method body must be entailed by the given set (via instance search).
- Non-ground predicates are not resolved by instance search (that would be unsound); they must appear explicitly in the instance context.
This is what makes instance methods predictable and prevents “magical” selection based on unifying type variables with arbitrary instance heads.
Integer Literals
Integer literals are overloaded over integral types.
- A literal like
4introduces a fresh type variableαwith predicateIntegral α. - A negative literal like
-3introducesαwith predicatesIntegral αandAdditiveGroup α(so it can only specialize to signed numeric types). - Binary subtraction uses
Subtractive, which includes unsigned integer types. Unary negation still requiresAdditiveGroup. - Division uses
Divisive, which includes primitive integer and floating-point types. Integer division follows Rust’s integer division semantics. - Integer
+,-,*,/, and%are checked at runtime for primitive integer types. Arithmetic overflow raisesinteger overflow (T)and arithmetic underflow raisesinteger underflow (T), whereTis the concrete integer type. - Context can specialize
α(for example,let x: u64 = 4 in x). - Unannotated
letbindings whose definition is an integer literal are kept monomorphic. This lets use sites specialize the binding consistently in that scope (for example,let x = 4 in (x + 1, x + 2)). - If
αremains ambiguous, normal defaulting rules apply.
Examples:
let x: u8 = 4 in x
let f: i64 -> i64 = \x -> x in f 4
let x = 4 in (x is u16)
let x: i16 = -3 in x
Attempting to use a negative literal at an unsigned type is a type error (for example
let x: u8 = -3 in x).
Float Literals
Float literals are overloaded over primitive floating-point types.
- A literal like
3.0introduces a fresh type variableαwith predicateField α. - Context can specialize
αtof32orf64(for example,let x: f64 = 3.0 in x). - If
αremains ambiguous, normal defaulting rules choosef32. - Unannotated
letbindings whose definition is a float literal are kept monomorphic, matching integer literal bindings. - Float literals do not imply integer-to-float coercions. A mixed expression such as
1 + 2.0is still a type error unless the values are explicitly converted by user code.
Examples:
let x: f64 = 3.0 in x
let f: f64 -> f64 = \x -> x in f 3.0
let x = 3.0 in (x is f32)
Defaulting
Defaulting runs after type inference and before evaluation.
Eligible Variables
A type variable α is eligible for defaulting iff:
αappears only in simple predicates of the formC α(not in compound types), and- at least one such
Cis in the numeric defaultable set:AdditiveMonoid,MultiplicativeMonoid,Subtractive,AdditiveGroup,Ring,Divisive,Field,Integral.
Eq and Ord are allowed as companion predicates when a numeric defaultable predicate is also
present. They do not make a type variable defaultable on their own. This lets expressions like
if x == 0.0 then ... default through the float literal’s Field predicate without making
unconstrained equality default to an arbitrary numeric type.
If α appears in any non-simple predicate or any non-defaultable class predicate, it is not
defaulted.
Candidate Types (Order Matters)
The candidate list is constructed in this order:
- Traverse the typed expression (depth-first) and collect every concrete (ground) 0-arity type constructor that appears as the type of a subexpression (unique, in first-seen order).
- Append (if not already present):
f32,i32,string.
Choosing a Default
For an eligible variable α with required predicates { C₁ α, ..., Cₙ α }, choose the first
candidate type T such that all predicates are satisfied in the empty context:
entails([], Cᵢ T) for all i
If no candidate satisfies all predicates, α remains ambiguous.
Example: zero (type α with AdditiveMonoid α) defaults to f32 when no other concrete type is
present:
zero
Regression: spec_defaulting_picks_a_concrete_type_for_numeric_classes (rex/tests/spec_semantics.rs).