SPy Internals: Sources and Architecture

SPy is not just a language — it is also its own implementation. To understand how SPy is built, and how its standard library works, it helps to read the source tree with a mental model of two distinct levels at which code can live. This note gives a guided tour of the most important parts.

Two Levels: Interpreter and App

SPy draws a sharp line between two execution levels, directly inspired by the design of PyPy.

Interpreter level (also called interp-level) is the level at which the SPy virtual machine itself is implemented: the type system, the evaluator, operator dispatch, and the compiler back-ends. Code at this level manipulates wrapped SPy objects (the W_* family: W_I32, W_Str, W_Type, …). The SPy interpreter serves two purposes: it interprets SPy code directly (the usual development workflow), and it drives ahead-of-time compilation of SPy code to C or to native executables. In both cases the same VM and the same interp-level machinery are at work. Under the hood, the interpreter executes a typed AST and delegates low-level operations to a small C library called libspy.
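To make the idea of "wrapped objects" concrete, here is a minimal sketch in plain Python. The names ToyW_Object, ToyW_I32 and toy_i32_add are hypothetical and are not SPy's actual classes or functions. The sketch only illustrates the general pattern in which interp-level code receives wrapped values, unwraps them, computes on the host-level value, and wraps the result again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyW_Object:
    """Hypothetical root of a toy wrapped-object hierarchy (not SPy's W_Object)."""


@dataclass(frozen=True)
class ToyW_I32(ToyW_Object):
    """Toy wrapper around a host-level Python int."""
    value: int


def toy_i32_add(w_a: ToyW_Object, w_b: ToyW_Object) -> ToyW_I32:
    """Interp-level style addition: check types, unwrap, compute, rewrap."""
    if not isinstance(w_a, ToyW_I32) or not isinstance(w_b, ToyW_I32):
        raise TypeError("toy_i32_add expects two ToyW_I32 operands")
    # Assumption: overflow handling is ignored in this sketch.
    return ToyW_I32(w_a.value + w_b.value)


w_result = toy_i32_add(ToyW_I32(40), ToyW_I32(2))
print(w_result.value)  # 42
```

App-level code never sees the wrapper: from inside the VM, the result is simply the integer 42.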

App level (also called application level) is the level of code written in SPy, running inside the SPy VM. The standard library modules (e.g. _range.spy, _list.spy) live here. App-level code cannot reach into the internals of the VM; it only sees what the VM exposes as public names.

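The distinction is analogous to the split found in other language implementations, for example between CPython's C core and the Python code that runs on top of it.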

A key difference from CPython is that, because SPy is designed to be efficient, app-level code is a first-class implementation vehicle: fundamental objects like list, dict, and tuple are implemented directly in SPy rather than requiring a C extension.

A practical consequence of the current architecture: interp-level code can depend on anything defined at interp level, but it cannot call app-level (stdlib) functions, because stdlib is only loaded on demand, after the VM is initialised. The reverse is fine: stdlib .spy files freely use primitive types and operators that are wired up at interp level.

Source Tree Overview

spy/
├── vm/            # The virtual machine (interp-level)
│   ├── vm.py      # SPyVM class — the central entry point
│   ├── b.py       # BUILTINS, OPERATOR registries (early bootstrap)
│   ├── primitive.py   # W_I32, W_F64, W_Bool, … — primitive wrapped types
│   ├── object.py      # W_Object, W_Type — root of the type hierarchy
│   ├── function.py    # W_ASTFunc, W_BuiltinFunc, W_FuncType
│   ├── astframe.py    # AST interpreter — executes SPy functions
│   ├── modframe.py    # Module-level frame
│   ├── opspec.py      # W_OpSpec, W_MetaArg — metafunction dispatch
│   └── modules/   # Interp-level module definitions (see below)
├── analyze/
│   ├── scope.py       # ScopeAnalyzer — static name resolution
│   ├── symtable.py    # SymTable, Symbol, ImportRef
│   └── importing.py   # ImportAnalyzer — module loading pipeline
├── parser.py      # SPy source → AST
├── ast.py         # AST node definitions
├── doppler.py     # Redshift: blue-time evaluation / partial evaluation
├── backend/       # C and other compilation back-ends
├── libspy/        # C runtime (builtins.c, str.c, operator.c, …)
└── tests/         # Test suite
stdlib/            # App-level standard library (written in SPy)
├── _range.spy
├── _list.spy
├── _dict.spy
├── _slice.spy
├── _tuple.spy
├── _file.spy
└── …

spy/vm/modules/ — Interp-level Modules

The modules/ subdirectory contains the built-in modules implemented in Python. The most important ones:

spy/vm/vm.py and SPyVM

SPyVM is the central class of the implementation. Every SPy program runs inside an instance of it — whether the program is being interpreted or ahead-of-time compiled. Its __init__ method defines the bootstrapping sequence:

self.make_module(BUILTINS)
self.make_module(OPERATOR)
self.make_module(TYPES)
self.make_module(MATH)
self.make_module(UNSAFE)
self.make_module(RAW_BUFFER)
self.make_module(JSFFI)
self.make_module(POSIX)
self.make_module(TIME)
self.make_module(SPY)
self.make_module(_TESTING_HELPERS)
self.call_INITs()

make_module() registers a ModuleRegistry (an interp-level Python object) as a live W_Module inside the VM’s global namespace. These modules are available immediately, before any .spy file has been parsed or executed. This is the hard boundary: code that runs as part of make_module() cannot call anything defined in stdlib, because stdlib does not exist yet.

After the make_module() calls, the VM sets self.path = [str(STDLIB)], pointing at the stdlib/ directory. From that point on, vm.import_("_range") will find and execute stdlib/_range.spy.
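The ordering described above can be sketched with a toy model. ToyVM below is a hypothetical stand-in, not the real SPyVM: its make_module() takes a name and a plain dict rather than a ModuleRegistry, and its import_() only records the file's source text instead of parsing and executing it. What it does show is the sequence: interp-level modules are registered eagerly, and only afterwards does the search path make stdlib files reachable.

```python
import tempfile
from pathlib import Path


class ToyVM:
    """Toy model of the bootstrap order; not the real SPyVM."""

    def __init__(self, stdlib_dir: Path) -> None:
        self.modules: dict[str, dict[str, object]] = {}
        # Step 1: interp-level modules are registered eagerly.
        self.make_module("builtins", {"abs": abs})
        self.make_module("operator", {})
        # Step 2: only now does the search path point at the stdlib.
        self.path = [str(stdlib_dir)]

    def make_module(self, name: str, namespace: dict[str, object]) -> None:
        self.modules[name] = namespace

    def import_(self, name: str) -> dict[str, object]:
        # Already-registered modules are returned as they are.
        if name in self.modules:
            return self.modules[name]
        for directory in self.path:
            candidate = Path(directory) / f"{name}.spy"
            if candidate.exists():
                # A real VM would parse and execute the file here.
                self.modules[name] = {"__source__": candidate.read_text()}
                return self.modules[name]
        raise ImportError(f"no module named {name!r}")


with tempfile.TemporaryDirectory() as tmp:
    stdlib_dir = Path(tmp)
    (stdlib_dir / "_range.spy").write_text("# toy stand-in for stdlib/_range.spy\n")

    vm = ToyVM(stdlib_dir)
    loaded_at_startup = sorted(vm.modules)
    vm.import_("_range")
    loaded_after_import = sorted(vm.modules)

print(loaded_at_startup)    # ['builtins', 'operator']
print(loaded_after_import)  # ['_range', 'builtins', 'operator']
```

Nothing in ToyVM.__init__ could call into _range, because at that point the module has not been looked up, let alone loaded.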

stdlib/ — App-level Standard Library

The stdlib/ directory contains modules written in SPy itself. They are loaded lazily — only when something actually needs them — via the import pipeline described below. They can use all primitive types and operators exposed by the interp-level modules, but they cannot call interp-level Python functions directly.

The naming convention _name.spy (leading underscore) signals that these are internal modules not meant to be imported explicitly by users. Their public names (range, list, dict, …) are injected directly into the builtin scope, so user code can write range(10) without any import statement.

Two Phases: Scope Analysis and Import

When a .spy file is loaded, the pipeline has two distinct phases.

Phase 1 — Scope Analysis (analyze/)

ScopeAnalyzer (scope.py) performs a purely static pass over the AST. It builds a SymTable for each scope (module, function, class), resolving every name to a Symbol that records where the name comes from.

The root of all name resolution is SymTable.from_builtins() (symtable.py). This method builds the builtins scope: a flat map from name to ImportRef. An ImportRef is simply a (modname, attr) pair — for example:

"range"  →  ImportRef("_range",  "range")
"list"   →  ImportRef("_list",   "list")
"abs"    →  ImportRef("builtins","abs")

No module is loaded here. This is purely a static description of where each builtin name lives.

When the scope analyser encounters a use of range inside a function, it captures the symbol from the builtins scope and records "_range" in the module’s implicit_imports set. This set is the output of Phase 1 that feeds Phase 2.
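As a rough illustration of Phase 1, the sketch below uses Python's own ast module as a stand-in for the SPy parser. ToyImportRef, TOY_BUILTINS and toy_implicit_imports are hypothetical names, and the logic is far simpler than a real scope analysis: it ignores shadowing by local definitions and simply looks up every loaded name in a static builtins map. Only names whose ImportRef points at a stdlib module (leading underscore) are recorded here, which is an assumption of this sketch.

```python
import ast
from typing import NamedTuple


class ToyImportRef(NamedTuple):
    """Toy (modname, attr) pair; a static pointer, nothing is loaded."""
    modname: str
    attr: str


# Entries mirror the examples given in the text.
TOY_BUILTINS = {
    "range": ToyImportRef("_range", "range"),
    "list": ToyImportRef("_list", "list"),
    "abs": ToyImportRef("builtins", "abs"),
}


def toy_implicit_imports(source: str) -> set[str]:
    """Collect stdlib modules that the source needs through builtin names."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            ref = TOY_BUILTINS.get(node.id)
            # Assumption: only stdlib modules (leading underscore) are recorded.
            if ref is not None and ref.modname.startswith("_"):
                found.add(ref.modname)
    return found


# Python syntax is used here so that ast.parse accepts it.
SOURCE = """
def total(n):
    acc = 0
    for i in range(n):
        acc = acc + i
    return acc
"""

implicit_imports = toy_implicit_imports(SOURCE)
print(sorted(implicit_imports))  # ['_range']
```

The function never imports or executes anything: its output is just a set of module names for the next phase to act on.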

Phase 2 — Import Resolution (ImportAnalyzer)

ImportAnalyzer (importing.py) implements the actual module loading. It starts from a root module name and:

  1. Parses the .spy file and runs ScopeAnalyzer on it (Phase 1).

  2. Reads implicit_imports and any explicit import statements to build the dependency graph.

  3. Performs a depth-first post-order traversal of that graph to produce a linear import order — dependencies always before dependents.

  4. Executes each module in order by running it through ModFrame.

The key property is that builtins (and all the other interp-level modules registered by make_module()) are already present in vm.modules_w before ImportAnalyzer runs. When it encounters them in the dependency graph, it simply records them as “already imported” and moves on.
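The traversal in step 3, together with the "already imported" shortcut, can be sketched as follows. The function toy_import_order and the dependency graph TOY_DEPS are made up for illustration; in particular, the edge from _list to _range is an assumption, not something taken from the real stdlib. Pre-loaded interp-level modules are seeded into the visited set, so the traversal never schedules them.

```python
def toy_import_order(
    root: str,
    deps: dict[str, list[str]],
    preloaded: set[str],
) -> list[str]:
    """Depth-first post-order traversal: dependencies before dependents."""
    order: list[str] = []
    # Pre-loaded modules count as already imported and are never scheduled.
    visited = set(preloaded)

    def visit(modname: str) -> None:
        if modname in visited:
            return
        # Marking before recursing also stops a cycle from looping forever.
        visited.add(modname)
        for dep in deps.get(modname, []):
            visit(dep)
        # Post-order: a module is emitted only after all its dependencies.
        order.append(modname)

    visit(root)
    return order


# Hypothetical dependency graph for a user module called "main".
TOY_DEPS = {
    "main": ["_list", "_range"],
    "_list": ["_range", "builtins"],
    "_range": ["builtins", "operator"],
}

order = toy_import_order("main", TOY_DEPS, preloaded={"builtins", "operator"})
print(order)  # ['_range', '_list', 'main']
```

Executing the modules in this order guarantees that by the time main runs, everything it refers to already exists in the VM.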

This is what makes _range.spy safe to write: it uses i32, bool, and < — all of which resolve to the pre-loaded builtins and operator modules. No circular dependency, no bootstrap problem.

Summary

| Level | Written in | Loaded when | Can depend on |
|---|---|---|---|
| Interp-level modules (vm/modules/) | Python + C | VM __init__ (make_module) | Other interp-level modules |
| App-level stdlib (stdlib/*.spy) | SPy | First use (lazy, via ImportAnalyzer) | Interp-level modules; other stdlib modules loaded before it |

The two-level architecture means SPy can express more and more of itself in SPy over time — the same trajectory C compilers, Rust, and PyPy all followed on the path to self-hosting.