Chapter 3: Register-Based VM

In Chapter 2, we built persistent storage for accounts and blocks. Now we need something to actually execute smart contract code. That’s the virtual machine — the computational engine at the heart of our blockchain.

What We’re Building

By the end of this chapter, you’ll have implemented:

Module	Purpose
`opcodes.rs`	Instruction set definition (43 opcodes)
`memory.rs`	Registers and linear memory
`gas.rs`	Gas metering and cost tables
`executor.rs`	The fetch-decode-execute loop
`tracer.rs`	Execution tracing for debugging

What is a Virtual Machine?

A virtual machine (VM) is a software implementation of a computer. Instead of running machine code on physical silicon, it interprets bytecode — a compact binary representation of instructions.

The CPU Analogy

Physical CPU	Blockchain VM
Transistors on silicon	Rust code
x86/ARM machine code	Custom bytecode
Registers (RAX, RBX, etc.)	Virtual registers (R0-R15)
RAM	Linear memory buffer
Disk I/O	Storage (SLOAD/SSTORE)
Clock cycles	Gas units

Stack vs Register Architecture

There are two main VM designs:

Stack-Based (like Ethereum’s EVM)

Operations pop operands from a stack and push results back:

PUSH 3      ; stack: [3]
PUSH 4      ; stack: [3, 4]
ADD         ; pops 3 and 4, pushes 7 → stack: [7]

Pros: Simple to implement, compact bytecode Cons: Lots of stack manipulation, harder to optimize

Register-Based (like our VM, Lua, Dalvik)

Operations specify source and destination registers explicitly:

LOAD R0, 3    ; R0 = 3
LOAD R1, 4    ; R1 = 4
ADD R2, R0, R1  ; R2 = R0 + R1 = 7

Pros: Fewer instructions, easier to optimize, maps well to real CPUs Cons: Larger instruction encoding (need register numbers)

Why Register-Based? A Deeper Look

Historical Context:

Most VMs fall into two categories:

Architecture	Examples	Advantages	Disadvantages
Stack-Based	EVM, JVM (early), WebAssembly	Compact bytecode, simple to compile to	More instructions, harder to optimize
Register-Based	Lua VM, Android ART, x86/ARM CPUs	Fewer instructions, easier to optimize	Larger bytecode (register operands)

Stack-Based Example (EVM):

; Compute (a + b) * c

PUSH a    ; Stack: [a]
PUSH b    ; Stack: [a, b]
ADD       ; Stack: [a+b]
PUSH c    ; Stack: [a+b, c]
MUL       ; Stack: [(a+b)*c]

Total: 5 instructions

Register-Based Example (Minichain):

; Compute (a + b) * c

LOADI R0, a    ; R0 = a
LOADI R1, b    ; R1 = b
ADD R2, R0, R1 ; R2 = a + b
LOADI R3, c    ; R3 = c
MUL R4, R2, R3 ; R4 = (a+b) * c

Total: 5 instructions (same), but clearer semantics

Why Register-Based Wins for Learning:

Explicit Data Flow:
- Stack: “Where is the value? Top? Second from top?”
- Registers: “Value is in R2, explicitly named”
Easier to Debug:

Stack VM trace:         Register VM trace:
PUSH 10                 LOADI R0, 10
PUSH 20                 LOADI R1, 20
ADD                     ADD R2, R0, R1
  Stack: [30]            R0=10, R1=20, R2=30 ← Clear!

Matches Physical Hardware:
- CPUs have registers (x86: RAX, RBX, etc.)
- Understanding register-based VMs → easier to understand real CPUs
Optimization Opportunities:
- Register allocation is a well-studied problem
- Easier to implement JIT compilation
- Less stack manipulation overhead

Why EVM Chose Stack-Based?

Performance Comparison:

Operation	Stack VM	Register VM	Winner
Simple ADD	3 instructions (PUSH, PUSH, ADD)	3 instructions (LOADI, LOADI, ADD)	Tie
Complex expression	More stack juggling (DUP, SWAP)	Direct register access	Register
Function calls	Stack frame management	Explicit register save	Register
Debugging	Opaque stack state	Visible register state	Register

Real-World Example:

Expression: result = (a * b) + (c * d)

Stack VM (EVM):
PUSH a      ; [a]
PUSH b      ; [a, b]
MUL         ; [a*b]
PUSH c      ; [a*b, c]
PUSH d      ; [a*b, c, d]
MUL         ; [a*b, c*d]
ADD         ; [(a*b)+(c*d)]

7 instructions, max stack depth: 3

Register VM (Minichain):
LOADI R0, a          ; R0 = a
LOADI R1, b          ; R1 = b
MUL R2, R0, R1       ; R2 = a * b
LOADI R3, c          ; R3 = c
LOADI R4, d          ; R4 = d
MUL R5, R3, R4       ; R5 = c * d
ADD R6, R2, R5       ; R6 = R2 + R5

7 instructions, 7 registers used (R0-R6)

Both use same number of instructions, but register version:

Intermediate values (ab, cd) have explicit homes (R2, R5)
No stack depth tracking needed
Easier to parallelize (R2 and R5 independent)

Why 16 Registers?

Register Count	Trade-off
8 registers	Compact encoding (3 bits), but frequent spills to memory
16 registers	4-bit encoding (good fit in bytes), enough for most expressions
32 registers	More flexibility, but wastes encoding space

x86-64 has 16 general-purpose registers. We match that convention.

3.1 Architecture Overview

Our VM has three main state components:

┌─────────────────────────────────────────────────────────────┐
│                         VM State                             │
├─────────────────┬──────────────────┬────────────────────────┤
│   Registers     │  Execution Ctx   │      Memory            │
│   R0-R15        │  caller/address  │   (linear buffer)      │
│   64-bit each   │  value/block/ts  │                        │
├─────────────────┴──────────────────┴────────────────────────┤
│                     Program Counter (PC)                     │
│                     Gas Remaining                            │
├─────────────────────────────────────────────────────────────┤
│                   Bytecode (read-only)                       │
└─────────────────────────────────────────────────────────────┘
                              │
                              │ SLOAD/SSTORE
                              ▼
                    ┌─────────────────┐
                    │  Contract       │
                    │  Storage        │
                    │  (from Ch. 2)   │
                    └─────────────────┘

Registers

We have 16 general-purpose registers (R0-R15), each holding a 64-bit unsigned integer:

pub const NUM_REGISTERS: usize = 16;

pub struct Registers {
    values: [u64; NUM_REGISTERS],
}

impl Registers {
    pub fn new() -> Self {
        Self { values: [0; NUM_REGISTERS] }
    }

    pub fn get(&self, r: u8) -> u64 {
        self.values[r as usize % NUM_REGISTERS]
    }

    pub fn set(&mut self, r: u8, value: u64) {
        self.values[r as usize % NUM_REGISTERS] = value;
    }
}

Memory

Linear memory is a byte-addressable buffer that grows on demand:

pub struct Memory {
    data: Vec<u8>,
    max_size: usize,
}

impl Memory {
    pub fn new(max_size: usize) -> Self {
        Self {
            data: Vec::new(),
            max_size,
        }
    }

    /// Read a byte from memory.
    pub fn load8(&self, offset: u32) -> u8 {
        self.data.get(offset as usize).copied().unwrap_or(0)
    }

    /// Write a byte to memory, growing if needed.
    pub fn store8(&mut self, offset: u32, value: u8) -> Result<(), VmError> {
        let offset = offset as usize;
        if offset >= self.max_size {
            return Err(VmError::MemoryOverflow);
        }
        if offset >= self.data.len() {
            self.data.resize(offset + 1, 0);
        }
        self.data[offset] = value;
        Ok(())
    }
}

Program Counter

The program counter (PC) points to the next instruction to execute:

pub struct Vm {
    pub registers: Registers,
    pub memory: Memory,
    pub pc: usize,           // Program counter
    pub gas_remaining: u64,  // Gas left
    pub bytecode: Vec<u8>,   // The contract code
    pub halted: bool,        // Has execution stopped?
}

3.2 Instruction Encoding

Each instruction is encoded as a sequence of bytes. The first byte is the opcode, followed by operands.

Encoding Formats

We use three main formats:

Format	Bytes	Example	Description
R	1 + 1	`HALT`	No operands
RR	1 + 1	`NOT R0`	One register
RRR	1 + 2	`ADD R0, R1, R2`	Two or three registers
RI	1 + 1 + 8	`LOADI R0, 12345`	Register + immediate value

ADD instruction (3 bytes):
┌────────┬────────┬────────┐
│ opcode │ dst|s1 │ s2|pad │
│  0x01  │  0x01  │  0x20  │
└────────┴────────┴────────┘
         R0  R1    R2

dst = destination register (4 bits)
s1  = source register 1 (4 bits)
s2  = source register 2 (4 bits)
pad = padding (4 bits)

Opcode Layout

We organize opcodes into categories:

Range	Category	Examples
0x00-0x0F	Control	HALT, NOP, JUMP, CALL, RET
0x10-0x1F	Arithmetic	ADD, SUB, MUL, DIV, MOD
0x20-0x2F	Bitwise	AND, OR, XOR, NOT, SHL, SHR
0x30-0x3F	Comparison	EQ, NE, LT, GT, LE, GE
0x40-0x4F	Memory	LOAD8, LOAD64, STORE8, STORE64
0x50-0x5F	Storage	SLOAD, SSTORE
0x70-0x7F	Immediate	LOADI, MOV
0x80-0x8F	Context	CALLER, CALLVALUE, ADDRESS
0xF0-0xFF	Debug	LOG

3.3 Opcode Definitions

Our VM uses a 43-instruction set organized into 9 categories: control flow, arithmetic, bitwise, comparison, memory, storage, immediate, context, and debug operations.

Instruction Categories

Our instruction set is organized by opcode ranges:

Range	Category	Purpose	Example Instructions
`0x00-0x0F`	Control Flow	Program control	HALT, JUMP, JUMPI, CALL, RET
`0x10-0x1F`	Arithmetic	Math operations	ADD, SUB, MUL, DIV, MOD
`0x20-0x2F`	Bitwise	Bit manipulation	AND, OR, XOR, NOT, SHL, SHR
`0x30-0x3F`	Comparison	Value comparison	EQ, LT, GT, LE, GE, ISZERO
`0x40-0x4F`	Memory	RAM operations (temp)	LOAD8, LOAD64, STORE8, STORE64
`0x50-0x5F`	Storage	Disk operations (persistent)	SLOAD, SSTORE
`0x70-0x7F`	Immediate	Constants & data movement	LOADI, MOV
`0x80-0x8F`	Context	Execution environment	CALLER, ADDRESS, TIMESTAMP
`0xF0-0xFF`	Debug	Debugging & logging	LOG

Opcode Enum Definition

/// All VM opcodes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Opcode {
    // Control Flow (0x00-0x0F)
    HALT = 0x00,
    NOP = 0x01,
    JUMP = 0x02,
    JUMPI = 0x03,
    CALL = 0x04,
    RET = 0x05,
    REVERT = 0x0F,

    // Arithmetic (0x10-0x1F)
    ADD = 0x10,
    SUB = 0x11,
    MUL = 0x12,
    DIV = 0x13,
    MOD = 0x14,
    ADDI = 0x15,

    // Bitwise (0x20-0x2F)
    AND = 0x20,
    OR = 0x21,
    XOR = 0x22,
    NOT = 0x23,
    SHL = 0x24,
    SHR = 0x25,

    // Comparison (0x30-0x3F)
    EQ = 0x30,
    NE = 0x31,
    LT = 0x32,
    GT = 0x33,
    LE = 0x34,
    GE = 0x35,
    ISZERO = 0x36,

    // Memory - RAM (0x40-0x4F)
    LOAD8 = 0x40,
    LOAD64 = 0x41,
    STORE8 = 0x42,
    STORE64 = 0x43,
    MSIZE = 0x44,
    MCOPY = 0x45,

    // Storage - Disk (0x50-0x5F)
    SLOAD = 0x50,
    SSTORE = 0x51,

    // Immediate (0x70-0x7F)
    LOADI = 0x70,
    MOV = 0x71,

    // Context (0x80-0x8F)
    CALLER = 0x80,
    CALLVALUE = 0x81,
    ADDRESS = 0x82,
    BLOCKNUMBER = 0x83,
    TIMESTAMP = 0x84,
    GAS = 0x85,

    // Debug (0xF0-0xFF)
    LOG = 0xF0,
}

Why separate Memory and Storage opcodes?

Aspect	Memory (RAM)	Storage (Disk)
Persistence	Temporary — cleared after execution	Permanent — survives across transactions
Speed	Fast (in-memory array)	Slow (database read/write)
Cost	Cheap (~3 gas per operation)	Expensive (~100 gas read, ~5,000-20,000 gas write)
Use Case	Temporary computation, buffers	Contract state, balances

This mirrors real computers: fast volatile RAM for active work, slow persistent disk for data that must survive.

Implementing Opcode Parsing

impl Opcode {
    /// Parse a byte as an opcode.
    pub fn from_byte(byte: u8) -> Option<Self> {
        match byte {
            0x00 => Some(Opcode::HALT),
            0x01 => Some(Opcode::NOP),
            0x02 => Some(Opcode::JUMP),
            0x03 => Some(Opcode::JUMPI),
            0x04 => Some(Opcode::CALL),
            0x05 => Some(Opcode::RET),
            0x0F => Some(Opcode::REVERT),

            0x10 => Some(Opcode::ADD),
            0x11 => Some(Opcode::SUB),
            0x12 => Some(Opcode::MUL),
            0x13 => Some(Opcode::DIV),
            0x14 => Some(Opcode::MOD),
            0x15 => Some(Opcode::ADDI),

            0x20 => Some(Opcode::AND),
            0x21 => Some(Opcode::OR),
            0x22 => Some(Opcode::XOR),
            0x23 => Some(Opcode::NOT),
            0x24 => Some(Opcode::SHL),
            0x25 => Some(Opcode::SHR),

            0x30 => Some(Opcode::EQ),
            0x31 => Some(Opcode::NE),
            0x32 => Some(Opcode::LT),
            0x33 => Some(Opcode::GT),
            0x34 => Some(Opcode::LE),
            0x35 => Some(Opcode::GE),
            0x36 => Some(Opcode::ISZERO),

            0x40 => Some(Opcode::LOAD8),
            0x41 => Some(Opcode::LOAD64),
            0x42 => Some(Opcode::STORE8),
            0x43 => Some(Opcode::STORE64),
            0x44 => Some(Opcode::MSIZE),
            0x45 => Some(Opcode::MCOPY),

            0x50 => Some(Opcode::SLOAD),
            0x51 => Some(Opcode::SSTORE),

            0x70 => Some(Opcode::LOADI),
            0x71 => Some(Opcode::MOV),

            0x80 => Some(Opcode::CALLER),
            0x81 => Some(Opcode::CALLVALUE),
            0x82 => Some(Opcode::ADDRESS),
            0x83 => Some(Opcode::BLOCKNUMBER),
            0x84 => Some(Opcode::TIMESTAMP),
            0x85 => Some(Opcode::GAS),

            0xF0 => Some(Opcode::LOG),

            _ => None,
        }
    }

    /// Get the number of bytes this instruction consumes (including opcode).
    pub fn instruction_size(&self) -> usize {
        match self {
            // No operands (1 byte total)
            Opcode::HALT | Opcode::NOP | Opcode::RET => 1,

            // One register (2 bytes: opcode + register)
            Opcode::JUMP | Opcode::NOT | Opcode::LOG |
            Opcode::MSIZE | Opcode::CALLER | Opcode::CALLVALUE |
            Opcode::ADDRESS | Opcode::BLOCKNUMBER |
            Opcode::TIMESTAMP | Opcode::GAS => 2,

            // Two registers (2 bytes: opcode + packed registers)
            Opcode::MOV | Opcode::LOAD8 | Opcode::STORE8 |
            Opcode::LOAD64 | Opcode::STORE64 |
            Opcode::SLOAD | Opcode::SSTORE | Opcode::ISZERO => 2,

            // Three registers (3 bytes: opcode + 2 packed register bytes)
            Opcode::ADD | Opcode::SUB | Opcode::MUL |
            Opcode::DIV | Opcode::MOD | Opcode::AND |
            Opcode::OR | Opcode::XOR | Opcode::SHL |
            Opcode::SHR | Opcode::EQ | Opcode::NE |
            Opcode::LT | Opcode::GT | Opcode::LE |
            Opcode::GE | Opcode::MCOPY | Opcode::JUMPI => 3,

            // Register + 64-bit immediate (10 bytes)
            Opcode::LOADI => 10,  // 1 (opcode) + 1 (reg) + 8 (immediate)

            // Fallback
            _ => 2,
        }
    }
}

3.4 Gas Metering

Gas prevents infinite loops and makes computation economically accountable. Every instruction costs gas. If you run out, execution halts.

The Gas Model

/// Gas costs for operations.
pub struct GasCosts;

impl GasCosts {
    // Tier 1: Very cheap (simple register operations)
    pub const ZERO: u64 = 0;       // HALT, NOP
    pub const BASE: u64 = 2;       // ADD, SUB, AND, OR, MOV

    // Tier 2: Cheap (more complex ALU operations)
    pub const LOW: u64 = 3;        // MUL, comparison ops

    // Tier 3: Medium (division, shifts)
    pub const MID: u64 = 5;        // DIV, MOD, SHL, SHR

    // Tier 4: Memory operations
    pub const MEMORY_READ: u64 = 3;
    pub const MEMORY_WRITE: u64 = 3;
    pub const MEMORY_GROW_PER_BYTE: u64 = 1;

    // Tier 5: Storage (expensive!)
    pub const SLOAD: u64 = 100;          // Read from storage
    pub const SSTORE_SET: u64 = 20000;   // Write to empty slot
    pub const SSTORE_RESET: u64 = 5000;  // Overwrite existing slot

    // Control flow
    pub const JUMP: u64 = 8;
    pub const CALL: u64 = 700;
}

/// Gas meter tracks remaining gas.
pub struct GasMeter {
    remaining: u64,
    used: u64,
}

impl GasMeter {
    pub fn new(limit: u64) -> Self {
        Self { remaining: limit, used: 0 }
    }

    /// Consume gas, returning error if insufficient.
    pub fn consume(&mut self, amount: u64) -> Result<(), VmError> {
        if self.remaining < amount {
            return Err(VmError::OutOfGas {
                required: amount,
                remaining: self.remaining,
            });
        }
        self.remaining -= amount;
        self.used += amount;
        Ok(())
    }

    pub fn remaining(&self) -> u64 {
        self.remaining
    }

    pub fn used(&self) -> u64 {
        self.used
    }
}

Why These Costs?

Operation	Cost	Rationale
ADD, SUB	2	Single CPU cycle equivalent
MUL	3	Slightly more complex
DIV	5	Division is slow on real CPUs
SLOAD	100	Disk read (SSD ~100μs)
SSTORE (new)	20,000	Permanent storage is expensive

Memory Expansion Costs

Memory isn’t free. When you access a higher address than before, you pay for the expansion:

impl GasMeter {
    /// Calculate gas for memory expansion.
    pub fn memory_expansion_cost(current_size: usize, new_size: usize) -> u64 {
        if new_size <= current_size {
            return 0;
        }
        let expansion = (new_size - current_size) as u64;
        expansion * GasCosts::MEMORY_GROW_PER_BYTE
    }
}

3.5 The Executor

The executor is the heart of the VM — the fetch-decode-execute loop:

use crate::{
    gas::{GasCosts, GasMeter},
    memory::Memory,
    opcodes::Opcode,
};
use minichain_core::Address;
use thiserror::Error;

#[derive(Error, Debug, Clone, PartialEq, Eq)]
pub enum VmError {
    #[error("Out of gas: required {required}, remaining {remaining}")]
    OutOfGas { required: u64, remaining: u64 },

    #[error("Invalid opcode: 0x{0:02X}")]
    InvalidOpcode(u8),

    #[error("Division by zero")]
    DivisionByZero,

    #[error("Memory overflow")]
    MemoryOverflow,

    #[error("Invalid jump destination: {0}")]
    InvalidJump(usize),

    #[error("Stack underflow")]
    StackUnderflow,

    #[error("Execution reverted")]
    Reverted,
}

/// Execution result.
pub struct ExecutionResult {
    pub success: bool,
    pub gas_used: u64,
    pub return_data: Vec<u8>,
    pub logs: Vec<u64>,  // LOG opcode outputs
}

/// The virtual machine state.
pub struct Vm {
    registers: [u64; 16],
    memory: Memory,
    pc: usize,
    gas: GasMeter,
    bytecode: Vec<u8>,
    halted: bool,

    // Context
    caller: Address,
    address: Address,
    call_value: u64,
    block_number: u64,
    timestamp: u64,

    // Outputs
    logs: Vec<u64>,
}

The Execution Loop

impl Vm {
    /// Create a new VM with the given bytecode and gas limit.
    pub fn new(
        bytecode: Vec<u8>,
        gas_limit: u64,
        caller: Address,
        address: Address,
        call_value: u64,
    ) -> Self {
        Self {
            registers: [0; 16],
            memory: Memory::new(1024 * 1024),  // 1MB max
            pc: 0,
            gas: GasMeter::new(gas_limit),
            bytecode,
            halted: false,
            caller,
            address,
            call_value,
            block_number: 0,
            timestamp: 0,
            logs: Vec::new(),
        }
    }

    /// Run the VM until it halts or runs out of gas.
    pub fn run(&mut self) -> Result<ExecutionResult, VmError> {
        while !self.halted && self.pc < self.bytecode.len() {
            self.step()?;
        }

        Ok(ExecutionResult {
            success: self.halted,  // HALT = success, out of bounds = failure
            gas_used: self.gas.used(),
            return_data: Vec::new(),  // TODO: implement return data
            logs: std::mem::take(&mut self.logs),
        })
    }

    /// Execute a single instruction.
    fn step(&mut self) -> Result<(), VmError> {
        // Fetch
        let opcode_byte = self.bytecode[self.pc];
        let opcode = Opcode::from_byte(opcode_byte)
            .ok_or(VmError::InvalidOpcode(opcode_byte))?;

        // Decode & Execute
        match opcode {
            Opcode::HALT => {
                self.gas.consume(GasCosts::ZERO)?;
                self.halted = true;
            }

            Opcode::NOP => {
                self.gas.consume(GasCosts::ZERO)?;
                self.pc += 1;
            }

            Opcode::ADD => {
                self.gas.consume(GasCosts::BASE)?;
                let (dst, s1, s2) = self.decode_rrr();
                let result = self.registers[s1].wrapping_add(self.registers[s2]);
                self.registers[dst] = result;
                self.pc += 3;
            }

            Opcode::SUB => {
                self.gas.consume(GasCosts::BASE)?;
                let (dst, s1, s2) = self.decode_rrr();
                let result = self.registers[s1].wrapping_sub(self.registers[s2]);
                self.registers[dst] = result;
                self.pc += 3;
            }

            Opcode::MUL => {
                self.gas.consume(GasCosts::LOW)?;
                let (dst, s1, s2) = self.decode_rrr();
                let result = self.registers[s1].wrapping_mul(self.registers[s2]);
                self.registers[dst] = result;
                self.pc += 3;
            }

            Opcode::DIV => {
                self.gas.consume(GasCosts::MID)?;
                let (dst, s1, s2) = self.decode_rrr();
                let divisor = self.registers[s2];
                if divisor == 0 {
                    return Err(VmError::DivisionByZero);
                }
                self.registers[dst] = self.registers[s1] / divisor;
                self.pc += 3;
            }

            Opcode::LOADI => {
                self.gas.consume(GasCosts::BASE)?;
                let dst = self.decode_r();
                let immediate = self.decode_imm64();
                self.registers[dst] = immediate;
                self.pc += 10;
            }

            Opcode::MOV => {
                self.gas.consume(GasCosts::BASE)?;
                let (dst, src) = self.decode_rr();
                self.registers[dst] = self.registers[src];
                self.pc += 2;
            }

            Opcode::JUMP => {
                self.gas.consume(GasCosts::JUMP)?;
                let target = self.decode_r();
                let addr = self.registers[target] as usize;
                if addr >= self.bytecode.len() {
                    return Err(VmError::InvalidJump(addr));
                }
                self.pc = addr;
            }

            Opcode::JUMPI => {
                self.gas.consume(GasCosts::JUMP)?;
                let (cond, target) = self.decode_rr();
                if self.registers[cond] != 0 {
                    let addr = self.registers[target] as usize;
                    if addr >= self.bytecode.len() {
                        return Err(VmError::InvalidJump(addr));
                    }
                    self.pc = addr;
                } else {
                    self.pc += 2;
                }
            }

            Opcode::LOG => {
                self.gas.consume(GasCosts::BASE)?;
                let src = self.decode_r();
                self.logs.push(self.registers[src]);
                self.pc += 2;
            }

            // ... implement remaining opcodes similarly

            _ => {
                return Err(VmError::InvalidOpcode(opcode_byte));
            }
        }

        Ok(())
    }
}

Decoding Helpers

These helper functions extract register operands from the packed binary instruction format. Since we pack multiple 4-bit register numbers into single bytes to keep bytecode compact, we need bit-shifting and masking to extract individual register indices. Each helper corresponds to one of the instruction formats (R, RR, RRR) defined in section 3.2.

impl Vm {
    /// Decode a single register operand: [opcode, RRRR____]
    fn decode_r(&self) -> usize {
        ((self.bytecode[self.pc + 1] >> 4) & 0x0F) as usize
    }

    /// Decode two register operands: [opcode, RRRR_SSSS]
    fn decode_rr(&self) -> (usize, usize) {
        let byte = self.bytecode[self.pc + 1];
        let r1 = ((byte >> 4) & 0x0F) as usize;
        let r2 = (byte & 0x0F) as usize;
        (r1, r2)
    }

    /// Decode three register operands: [opcode, DDDD_SSS1, SSS2____]
    fn decode_rrr(&self) -> (usize, usize, usize) {
        let b1 = self.bytecode[self.pc + 1];
        let b2 = self.bytecode[self.pc + 2];
        let dst = ((b1 >> 4) & 0x0F) as usize;
        let s1 = (b1 & 0x0F) as usize;
        let s2 = ((b2 >> 4) & 0x0F) as usize;
        (dst, s1, s2)
    }

    /// Decode a 64-bit immediate value (little-endian).
    fn decode_imm64(&self) -> u64 {
        let start = self.pc + 2;
        let bytes = &self.bytecode[start..start + 8];
        u64::from_le_bytes(bytes.try_into().unwrap())
    }
}

3.6 Storage Integration

The VM connects to the storage layer from Chapter 2 via callbacks:

/// Storage interface for the VM.
pub trait StorageBackend {
    /// Read 32 bytes from storage slot.
    fn sload(&self, key: &[u8; 32]) -> [u8; 32];

    /// Write 32 bytes to storage slot.
    fn sstore(&mut self, key: &[u8; 32], value: &[u8; 32]);
}

impl Vm {
    /// Execute SLOAD: read from persistent storage.
    fn execute_sload<S: StorageBackend>(
        &mut self,
        storage: &S,
    ) -> Result<(), VmError> {
        self.gas.consume(GasCosts::SLOAD)?;

        let (dst, key_reg) = self.decode_rr();

        // Convert register value to 32-byte key
        let key_value = self.registers[key_reg];
        let mut key = [0u8; 32];
        key[24..32].copy_from_slice(&key_value.to_be_bytes());

        // Load from storage
        let value = storage.sload(&key);

        // Store first 8 bytes in register (simplified)
        self.registers[dst] = u64::from_be_bytes(value[24..32].try_into().unwrap());

        self.pc += 2;
        Ok(())
    }

    /// Execute SSTORE: write to persistent storage.
    fn execute_sstore<S: StorageBackend>(
        &mut self,
        storage: &mut S,
    ) -> Result<(), VmError> {
        // Check if slot was empty for gas calculation
        let (key_reg, value_reg) = self.decode_rr();

        let key_value = self.registers[key_reg];
        let mut key = [0u8; 32];
        key[24..32].copy_from_slice(&key_value.to_be_bytes());

        let current = storage.sload(&key);
        let is_empty = current == [0u8; 32];

        // Different costs for new vs existing slot
        let cost = if is_empty {
            GasCosts::SSTORE_SET
        } else {
            GasCosts::SSTORE_RESET
        };
        self.gas.consume(cost)?;

        // Write to storage
        let mut value = [0u8; 32];
        value[24..32].copy_from_slice(&self.registers[value_reg].to_be_bytes());
        storage.sstore(&key, &value);

        self.pc += 2;
        Ok(())
    }
}

3.7 Execution Tracing

For debugging, we can record every instruction executed:

use crate::opcodes::Opcode;

/// A single trace entry.
#[derive(Debug, Clone)]
pub struct TraceStep {
    pub pc: usize,
    pub opcode: Opcode,
    pub gas_before: u64,
    pub gas_after: u64,
    pub registers: [u64; 16],
}

/// Records execution for debugging.
pub struct Tracer {
    steps: Vec<TraceStep>,
    enabled: bool,
}

impl Tracer {
    pub fn new(enabled: bool) -> Self {
        Self {
            steps: Vec::new(),
            enabled,
        }
    }

    pub fn record(&mut self, step: TraceStep) {
        if self.enabled {
            self.steps.push(step);
        }
    }

    pub fn steps(&self) -> &[TraceStep] {
        &self.steps
    }

    /// Print a human-readable trace.
    pub fn print_trace(&self) {
        for (i, step) in self.steps.iter().enumerate() {
            println!(
                "{:4}: PC={:04X} {:?} gas={} R0={} R1={} R2={}",
                i,
                step.pc,
                step.opcode,
                step.gas_after,
                step.registers[0],
                step.registers[1],
                step.registers[2],
            );
        }
    }
}

Example Trace Output

   0: PC=0000 LOADI gas=999998 R0=10 R1=0 R2=0
   1: PC=000A LOADI gas=999996 R0=10 R1=20 R2=0
   2: PC=0014 ADD   gas=999994 R0=10 R1=20 R2=30
   3: PC=0017 LOG   gas=999992 R0=10 R1=20 R2=30
   4: PC=0019 HALT  gas=999992 R0=10 R1=20 R2=30

3.8 Complete Example

Let’s trace through a simple program:

; Calculate 10 + 20 and log the result

LOADI R0, 10      ; R0 = 10
LOADI R1, 20      ; R1 = 20
ADD R2, R0, R1    ; R2 = R0 + R1 = 30
LOG R2            ; output 30
HALT              ; stop

Bytecode encoding:

70 00 0A 00 00 00 00 00 00 00   ; LOADI R0, 10
70 10 14 00 00 00 00 00 00 00   ; LOADI R1, 20
10 20 10                        ; ADD R2, R0, R1
F0 20                           ; LOG R2
00                              ; HALT

Execution:

Step	PC	Instruction	R0	R1	R2	Gas Used
0	0x00	LOADI R0, 10	10	0	0	2
1	0x0A	LOADI R1, 20	10	20	0	4
2	0x14	ADD R2, R0, R1	10	20	30	6
3	0x17	LOG R2	10	20	30	8
4	0x19	HALT	10	20	30	8

Result: Success, gas used = 8, logs = [30]

Summary

We’ve built a register-based virtual machine:

Component	What It Does
`Registers`	16 × 64-bit general-purpose registers
`Memory`	Growable linear memory buffer
`Opcode`	43 instructions (arithmetic, logic, memory, storage)
`GasMeter`	Tracks and limits computation
`Vm`	The fetch-decode-execute loop
`Tracer`	Records execution for debugging

Design Decisions

Decision	Rationale
Register-based	More intuitive than stack-based
16 registers	Addressable with 4 bits, enough for most code
64-bit values	Matches modern CPUs, simpler than 256-bit
Gas metering	Prevents infinite loops, enables economic accountability
32-byte storage slots	Compatible with Ethereum, fits hashes

What’s Next?

We have a VM, but writing bytecode by hand is tedious. In Chapter 4, we’ll build an assembler that compiles human-readable assembly into bytecode:

; This is much nicer than raw bytes!
.entry main

main:
    LOADI R0, 0       ; storage slot 0
    SLOAD R1, R0      ; load current counter
    ADDI R1, R1, 1    ; increment
    SSTORE R0, R1     ; store back
    HALT

The assembler will handle labels, .entry directives, and emit the compact bytecode our VM expects.