SPy Internals: Sources and Architecture

SPy is both a language and the implementation of that language. To understand how SPy is built and how its standard library works, it helps to read the source tree with a mental model of the two distinct levels at which code can live. This note gives a guided tour of the most important parts.

Two Levels: Interpreter and App

SPy draws a sharp line between two execution levels, directly inspired by the design of PyPy.

Interpreter level (also called interp-level) is the level at which the SPy virtual machine is implemented: the type system, the evaluator, the operator dispatch, the compiler back-ends. Code at this level manipulates wrapped SPy objects (the W_* family: W_I32, W_Str, W_Type, …). The SPy interpreter serves two distinct purposes: it is used both for standard interpretation of SPy code (the usual development workflow) and for ahead-of-time compilation of SPy code to native executables or C. In both cases the same VM and the same interp-level machinery are at work. Under the hood, the interpreter executes a typed AST by delegating low-level operations to a small C library called libspy.
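To make the idea of wrapped objects concrete, here is a minimal Python sketch. Only the names W_Object and W_I32 come from SPy; the class bodies and the add_i32 helper are illustrative assumptions, and the real classes are considerably richer.

```python
from dataclasses import dataclass


class W_Object:
    """Toy stand-in for the root of the wrapped-object hierarchy."""


@dataclass(frozen=True)
class W_I32(W_Object):
    """Toy wrapper around a host-level Python int, seen by SPy as an i32."""
    value: int


def add_i32(w_a: W_I32, w_b: W_I32) -> W_I32:
    # Hypothetical interp-level helper: unwrap the operands, compute on the
    # host-level values, and wrap the result again. Overflow is ignored here.
    return W_I32(w_a.value + w_b.value)


w_result = add_i32(W_I32(2), W_I32(40))
print(w_result)  # W_I32(value=42)
```

The point of the sketch is the direction of travel: interp-level code never hands a bare host value back to the program being run, it always re-wraps the result.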

App level (also called application level) is the level of code written in SPy, running inside the SPy VM. The standard library modules (e.g. _range.spy, _list.spy) live here. App-level code cannot reach into the internals of the VM; it only sees what the VM exposes as public names.

The distinction is analogous to other language implementations:

- In CPython, the interpreter is written in C and the standard library in a mix of C and Python.
- In PyPy, the interpreter is written in RPython and the standard library in Python.
- In SPy, the interpreter is currently written in Python and C, and the standard library in SPy.

A key difference from CPython is that, because SPy is designed to be efficient, app-level code is a first-class implementation vehicle: fundamental objects like list, dict, and tuple are implemented directly in SPy rather than requiring a C extension.

Note

The interp-level/app-level split is a permanent architectural feature of SPy. What is not permanent is the role of Python. The long-term goal, stated explicitly in the SPy roadmap, is:

“Northern star: have enough features to be able to write the SPy interpreter in SPy itself.”

This mirrors a classic milestone in language design: C compilers are written in C, and the Rust compiler is written in Rust. Being able to implement the language’s own toolchain in the language itself is a sign of maturity and a major practical win. In SPy’s case, it would allow the interpreter to be compiled by SPy’s own compiler, making it dramatically faster.

There are two complementary directions toward this goal, both already underway:

- Moving interp-level code into the stdlib. Individual functions and data structures are progressively migrated from Python (vm/modules/) to pure SPy (stdlib/). Moving abs, min, and max from builtins.py into a stdlib/_builtins.spy is a small example. A more spectacular roadmap item is rewriting str and bytes in pure SPy; currently str is split between interp-level Python and a C implementation in libspy/src/str.c.
- Rewriting the interpreter itself in SPy. This is the more ambitious direction. The VM (SPyVM), the AST interpreter, the type system, the operator dispatch, and the compiler back-ends should all eventually be expressed in SPy. This would be the step that allows SPy to compile itself.

Some parts of libspy will likely remain in C indefinitely — in particular the helpers called by the code the C back-end generates — but the goal is to keep that residue as slim as possible.

The ultimate ambition goes even further: remove the dependency on Python completely. The CLI and the parser are currently written in Python too, and the plan is to rewrite them in SPy as well. This has interesting implications for what SPy’s stdlib will eventually need to contain. To fully describe and compile its own toolchain, SPy will need library-level equivalents of the tools it currently borrows from Python — things like argument parsing (today handled by typer), build system integration (ninja), and so on. In other words, the growth of the stdlib is not only driven by language features but also by the goal of self-sufficiency.

A practical consequence of the current architecture: interp-level code can depend on anything defined at interp level, but it cannot call app-level (stdlib) functions, because stdlib is only loaded on demand, after the VM is initialised. The reverse is fine: stdlib .spy files freely use primitive types and operators that are wired up at interp level.

Source Tree Overview

spy/
├── vm/            # The virtual machine (interp-level)
│   ├── vm.py      # SPyVM class — the central entry point
│   ├── b.py       # BUILTINS, OPERATOR registries (early bootstrap)
│   ├── primitive.py   # W_I32, W_F64, W_Bool, … — primitive wrapped types
│   ├── object.py      # W_Object, W_Type — root of the type hierarchy
│   ├── function.py    # W_ASTFunc, W_BuiltinFunc, W_FuncType
│   ├── astframe.py    # AST interpreter — executes SPy functions
│   ├── modframe.py    # Module-level frame
│   ├── opspec.py      # W_OpSpec, W_MetaArg — metafunction dispatch
│   └── modules/   # Interp-level module definitions (see below)
├── analyze/
│   ├── scope.py       # ScopeAnalyzer — static name resolution
│   ├── symtable.py    # SymTable, Symbol, ImportRef
│   └── importing.py   # ImportAnalyzer — module loading pipeline
├── parser.py      # SPy source → AST
├── ast.py         # AST node definitions
├── doppler.py     # Redshift: blue-time evaluation / partial evaluation
├── backend/       # C and other compilation back-ends
├── libspy/        # C runtime (builtins.c, str.c, operator.c, …)
└── tests/         # Test suite
stdlib/            # App-level standard library (written in SPy)
├── _range.spy
├── _list.spy
├── _dict.spy
├── _slice.spy
├── _tuple.spy
├── _file.spy
└── …

`spy/vm/modules/`: Interp-level Modules

The modules/ subdirectory contains the built-in modules implemented in Python. The most important ones:

- builtins.py: the builtins module, which provides abs, min, max, print, len, repr, hash, dir, getattr, setattr, and the int/float type aliases. Functions here are decorated with @BUILTINS.builtin_func and registered into the BUILTINS ModuleRegistry.
- operator/: the operator module, which covers all binary operators (+, -, *, …), comparison operators, attribute access, and calls. These are metafunctions that dispatch to the correct implementation based on the static types of their arguments.
- math.py: the math module, which provides trigonometric functions, sqrt, fabs, etc., backed by the C standard library.
- types.py: the types module, which holds meta-types and helpers used by the type system.
- unsafe/: low-level unsafe operations (raw memory access, pointer casts).
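The decorator-based registration described for builtins.py can be pictured with a toy registry. This is a sketch, not SPy's actual ModuleRegistry: the class body and the `content` attribute are assumptions made for illustration. Only the names ModuleRegistry, BUILTINS, and builtin_func appear in the text above.

```python
from typing import Callable, Dict


class ModuleRegistry:
    """Toy registry: collects the functions that make up one module."""

    def __init__(self, modname: str) -> None:
        self.modname = modname
        self.content: Dict[str, Callable] = {}  # hypothetical attribute

    def builtin_func(self, fn: Callable) -> Callable:
        # Used as a decorator: record the function under its own name
        # and hand it back unchanged.
        self.content[fn.__name__] = fn
        return fn


BUILTINS = ModuleRegistry("builtins")


@BUILTINS.builtin_func
def abs(x: int) -> int:  # deliberately shadows Python's abs in this demo
    return -x if x < 0 else x


print(sorted(BUILTINS.content))     # ['abs']
print(BUILTINS.content["abs"](-3))  # 3
```

A registry of this shape is plain data until something turns it into a live module, which is why registration can happen at Python import time, before any VM exists.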

`spy/vm/vm.py` and `SPyVM`

SPyVM is the central class of the implementation. Every SPy program runs inside an instance of it — whether the program is being interpreted or ahead-of-time compiled. Its __init__ method defines the bootstrapping sequence:

self.make_module(BUILTINS)
self.make_module(OPERATOR)
self.make_module(TYPES)
self.make_module(MATH)
self.make_module(UNSAFE)
self.make_module(RAW_BUFFER)
self.make_module(JSFFI)
self.make_module(POSIX)
self.make_module(TIME)
self.make_module(SPY)
self.make_module(_TESTING_HELPERS)
self.call_INITs()

make_module() registers a ModuleRegistry (an interp-level Python object) as a live W_Module inside the VM’s global namespace. These modules are available immediately, before any .spy file has been parsed or executed. This is the hard boundary: code that runs as part of make_module() cannot call anything defined in stdlib, because no stdlib module has been loaded at that point.

After the make_module() calls, the VM sets self.path = [str(STDLIB)], pointing at the stdlib/ directory. From that point on, vm.import_("_range") will find and execute stdlib/_range.spy.
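The ordering constraint can be illustrated with a toy VM. Everything in it is an assumption made for illustration: the ToyVM class, the string placeholders used as modules, and the in-memory dict standing in for the stdlib/ directory. It only mirrors the sequence described above: interp-level modules first, search path second, stdlib imports on demand.

```python
class ToyVM:
    """Toy model of the bootstrap order, not the real SPyVM."""

    def __init__(self, stdlib_sources: dict) -> None:
        self.modules: dict = {}
        self.path: list = []  # empty while bootstrapping
        self._stdlib_sources = stdlib_sources

        # Step 1: interp-level modules are registered eagerly.
        for name in ("builtins", "operator"):
            self.make_module(name)

        # Step 2: only now does the stdlib become reachable.
        self.path = ["stdlib"]

    def make_module(self, name: str) -> None:
        self.modules[name] = f"<interp-level module {name}>"

    def import_(self, name: str) -> str:
        # Already-loaded modules are returned as-is.
        if name in self.modules:
            return self.modules[name]
        if not self.path or name not in self._stdlib_sources:
            raise ImportError(f"cannot import {name!r}")
        # Lazily "execute" the app-level module on first use.
        self.modules[name] = f"<app-level module {name}>"
        return self.modules[name]


vm = ToyVM({"_range": "...source of _range.spy..."})
print(sorted(vm.modules))  # ['builtins', 'operator']
vm.import_("_range")
print(sorted(vm.modules))  # ['_range', 'builtins', 'operator']
```

In this toy, any attempt to import `_range` during step 1 would raise ImportError because the path is still empty, which is the same one-way dependency the text describes.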

`stdlib/`: App-level Standard Library

The stdlib/ directory contains modules written in SPy itself. They are loaded lazily — only when something actually needs them — via the import pipeline described below. They can use all primitive types and operators exposed by the interp-level modules, but they cannot call interp-level Python functions directly.

The naming convention _name.spy (leading underscore) signals that these are internal modules not meant to be imported explicitly by users. Their public names (range, list, dict, …) are injected directly into the builtin scope, so user code can write range(10) without any import statement.

Two Phases: Scope Analysis and Import

When a .spy file is loaded, the pipeline has two distinct phases.

Phase 1: Scope Analysis (`analyze/`)

ScopeAnalyzer (scope.py) performs a purely static pass over the AST. It builds a SymTable for each scope (module, function, class), resolving every name to a Symbol that records where the name comes from.
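A minimal sketch of per-scope symbol tables, assuming a simple chain in which lookup walks outward from the innermost scope. The SymTable and Symbol bodies below are toy stand-ins, and the fields `defined_in` and `parent` are hypothetical; SPy's actual classes may be organised differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Symbol:
    name: str
    defined_in: str  # hypothetical: name of the scope that owns the symbol


@dataclass
class SymTable:
    name: str
    parent: Optional["SymTable"] = None
    symbols: Dict[str, Symbol] = field(default_factory=dict)

    def define(self, name: str) -> None:
        self.symbols[name] = Symbol(name, self.name)

    def lookup(self, name: str) -> Optional[Symbol]:
        # Walk outward: function scope, then module scope, then builtins.
        scope: Optional[SymTable] = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None


builtins_scope = SymTable("builtins")
builtins_scope.define("range")
module_scope = SymTable("mymod", parent=builtins_scope)
module_scope.define("helper")
func_scope = SymTable("mymod::main", parent=module_scope)
func_scope.define("i")

print(func_scope.lookup("i").defined_in)      # mymod::main
print(func_scope.lookup("range").defined_in)  # builtins
print(func_scope.lookup("missing"))           # None
```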

The root of all name resolution is SymTable.from_builtins() (symtable.py). This method builds the builtins scope: a flat map from name to ImportRef. An ImportRef is simply a (modname, attr) pair — for example:

"range"  →  ImportRef("_range",  "range")
"list"   →  ImportRef("_list",   "list")
"abs"    →  ImportRef("builtins","abs")

No module is loaded here. This is purely a static description of where each builtin name lives.

When the scope analyser encounters a use of range inside a function, it captures the symbol from the builtins scope and records "_range" in the module’s implicit_imports set. This set is the output of Phase 1 that feeds Phase 2.
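The way builtin names turn into implicit imports can be sketched in a few lines of Python. Several assumptions apply: Python's own `ast` module stands in for SPy's parser (so the sample source must also be valid Python), `collect_implicit_imports` is a hypothetical helper, local shadowing is ignored, and every referenced module is recorded, including builtins. Only the three ImportRef entries are taken from the text.

```python
import ast
from typing import Dict, NamedTuple, Set


class ImportRef(NamedTuple):
    modname: str
    attr: str


# A three-entry excerpt of the builtins scope, copied from the text above.
BUILTIN_REFS: Dict[str, ImportRef] = {
    "range": ImportRef("_range", "range"),
    "list": ImportRef("_list", "list"),
    "abs": ImportRef("builtins", "abs"),
}


def collect_implicit_imports(source: str) -> Set[str]:
    """Return the modules that the builtin names used in `source` live in."""
    implicit_imports: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        # Only plain name reads matter; local shadowing is ignored here.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            ref = BUILTIN_REFS.get(node.id)
            if ref is not None:
                implicit_imports.add(ref.modname)
    return implicit_imports


src = """
def main() -> None:
    for i in range(3):
        print(abs(i))
"""
print(sorted(collect_implicit_imports(src)))  # ['_range', 'builtins']
```

Note that nothing is loaded while collecting: the result is just a set of module names, which matches the purely static nature of this phase.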

Phase 2: Import Resolution (`ImportAnalyzer`)

ImportAnalyzer (importing.py) implements the actual module loading. It starts from a root module name and:

1. Parses the .spy file and runs ScopeAnalyzer on it (Phase 1).
2. Reads implicit_imports and any explicit import statements to build the dependency graph.
3. Performs a depth-first post-order traversal of that graph to produce a linear import order, with dependencies always before dependents.
4. Executes each module in order by running it through ModFrame.
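The depth-first post-order traversal in the third step can be sketched as follows. The `import_order` function and the dependency graph are made up for illustration (in particular, which stdlib module depends on which is an assumption); the sketch only shows why post-order yields dependencies before dependents, and how pre-loaded modules drop out of the result.

```python
from typing import Dict, List, Set


def import_order(root: str, deps: Dict[str, List[str]],
                 already_imported: Set[str]) -> List[str]:
    """Linearize `deps` so that every module comes after its dependencies."""
    order: List[str] = []
    visited: Set[str] = set(already_imported)  # pre-loaded modules are skipped

    def visit(modname: str) -> None:
        if modname in visited:
            return
        visited.add(modname)  # mark first, so cycles cannot recurse forever
        for dep in deps.get(modname, []):
            visit(dep)
        order.append(modname)  # post-order: emitted after all dependencies

    visit(root)
    return order


# Hypothetical dependency graph, as Phase 1 might have produced it.
deps = {
    "main": ["_list", "_range", "builtins"],
    "_list": ["_range", "operator"],
    "_range": ["builtins", "operator"],
}
print(import_order("main", deps, already_imported={"builtins", "operator"}))
# ['_range', '_list', 'main']
```

Seeding the visited set with the pre-loaded modules is one simple way to model the "already imported" shortcut: those modules are never appended to the order, so they are never executed a second time.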

The key property is that builtins (and all the other interp-level modules registered by make_module()) are already present in vm.modules_w before ImportAnalyzer runs. When it encounters them in the dependency graph, it simply records them as “already imported” and moves on.

This is what makes _range.spy safe to write: it uses i32, bool, and < — all of which resolve to the pre-loaded builtins and operator modules. No circular dependency, no bootstrap problem.

Summary

| Level | Written in | Loaded when | Can depend on |
| --- | --- | --- | --- |
| Interp-level modules (`vm/modules/`) | Python + C | VM `__init__` (`make_module`) | Other interp-level modules |
| App-level stdlib (`stdlib/*.spy`) | SPy | First use (lazy, via `ImportAnalyzer`) | Interp-level modules; other stdlib modules loaded before it |

The two-level architecture means SPy can express more and more of itself in SPy over time — the same trajectory C compilers, Rust, and PyPy all followed on the path to self-hosting.
