Fun Projects

Aimulet

An autonomous NetHack player — PTY capture, structured game-state side-channels, LLM planner, Raylib visualisation.

Rust · PTY / WINCHAIN · OpenAI-compatible API · LM Studio · Raylib · Games

NetHack is one of the oldest video games still actively played, and one of the hardest. It is also notoriously difficult for AI to play well: the state space is enormous, the game is deeply stochastic, permanent death makes every mistake expensive, and much of the knowledge needed to play well lives only in the manual and in the lore accumulated over roughly forty years of play. It is, in short, a demanding test of whether an LLM-based agent can reason about a non-trivial interactive environment.

Aimulet runs NetHack in a PTY (pseudo-terminal), captures the terminal output frame by frame, and feeds it to a perception layer that converts the raw ANSI output into structured game state: map tiles, monster positions, inventory, status effects and the most recent messages. This structured state goes to the LLM planner, which produces a sequence of actions. The actions are written back to the PTY as keystrokes.
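To make the perception step concrete, here is a minimal sketch of turning raw terminal output into a queryable grid. It is not Aimulet's actual parser: the `Screen` type, its methods and the 24x80 size are assumptions made for illustration, and only the CSI cursor-position sequence (`ESC [ row ; col H`) is interpreted. Every other CSI sequence is skipped.

```rust
// Assumed terminal size for this sketch: a conventional 24x80 screen.
const ROWS: usize = 24;
const COLS: usize = 80;

/// Hypothetical screen buffer that mirrors what the terminal would display.
pub struct Screen {
    cells: Vec<Vec<char>>,
    row: usize,
    col: usize,
}

impl Screen {
    pub fn new() -> Self {
        Screen { cells: vec![vec![' '; COLS]; ROWS], row: 0, col: 0 }
    }

    /// Apply a chunk of terminal output to the grid.
    /// Cursor positioning is honoured; colours, clears and other
    /// CSI sequences are consumed and ignored.
    pub fn feed(&mut self, input: &str) {
        let mut chars = input.chars().peekable();
        while let Some(c) = chars.next() {
            match c {
                '\x1b' if chars.peek() == Some(&'[') => {
                    chars.next(); // consume '['
                    let mut params = String::new();
                    let mut final_byte = None;
                    // Parameters run until a final byte in the range '@'..='~'.
                    for p in chars.by_ref() {
                        if ('@'..='~').contains(&p) {
                            final_byte = Some(p);
                            break;
                        }
                        params.push(p);
                    }
                    if final_byte == Some('H') {
                        // CSI coordinates are 1-based; missing values default to 1.
                        let mut nums = params
                            .split(';')
                            .map(|s| s.parse::<usize>().unwrap_or(1));
                        let r = nums.next().unwrap_or(1);
                        let col = nums.next().unwrap_or(1);
                        self.row = r.saturating_sub(1).min(ROWS - 1);
                        self.col = col.saturating_sub(1).min(COLS - 1);
                    }
                }
                '\r' => self.col = 0,
                '\n' => self.row = (self.row + 1).min(ROWS - 1),
                ch if !ch.is_control() => {
                    self.cells[self.row][self.col] = ch;
                    self.col = (self.col + 1).min(COLS - 1);
                }
                _ => {}
            }
        }
    }

    /// Position (row, col) of the first cell holding `glyph`, e.g. '@' for the player.
    pub fn find(&self, glyph: char) -> Option<(usize, usize)> {
        for (r, line) in self.cells.iter().enumerate() {
            if let Some(c) = line.iter().position(|&ch| ch == glyph) {
                return Some((r, c));
            }
        }
        None
    }

    /// Text of one screen row with trailing blanks removed (useful for the message line).
    pub fn line(&self, row: usize) -> String {
        self.cells[row].iter().collect::<String>().trim_end().to_string()
    }
}
```

Feeding each captured frame into `feed` and then calling `find('@')` or `line(0)` gives the planner coordinates and message text instead of escape codes. A real perception layer would also need to handle screen clears, scrolling and colour attributes.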

The WINCHAIN approach

NetHack's terminal output alone isn't enough context for reliable planning — the LLM needs to know what just happened, what threats are present, what the inventory means in context. Aimulet builds a structured side-channel (called WINCHAIN internally) that accumulates per-turn game events — combat results, pickup events, level transitions, hunger state — and annotates the planner's context window with them. This lets the LLM reason about recent history without having to re-parse the terminal from scratch each turn.
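A side-channel of this kind can be pictured as a bounded event log that is rendered into text for the planner's prompt. The sketch below assumes that shape. The `GameEvent` variants, the `EventLog` type and the output format are hypothetical names chosen to mirror the event categories listed above, not WINCHAIN's real data model.

```rust
use std::collections::VecDeque;

/// Hypothetical event kinds, one per category mentioned in the text.
#[derive(Debug, Clone, PartialEq)]
pub enum GameEvent {
    Combat { target: String, hit: bool },
    Pickup { item: String },
    LevelChange { from: u32, to: u32 },
    Hunger { state: String },
}

/// Bounded history of (turn, event) pairs; the oldest entry is dropped when full.
pub struct EventLog {
    capacity: usize,
    entries: VecDeque<(u64, GameEvent)>,
}

impl EventLog {
    pub fn new(capacity: usize) -> Self {
        EventLog { capacity, entries: VecDeque::with_capacity(capacity) }
    }

    /// Record an event for the given turn, evicting the oldest if at capacity.
    pub fn record(&mut self, turn: u64, event: GameEvent) {
        if self.capacity == 0 {
            return;
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back((turn, event));
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Render the history as plain lines suitable for a prompt annotation.
    pub fn to_context(&self) -> String {
        self.entries
            .iter()
            .map(|(turn, ev)| {
                let text = match ev {
                    GameEvent::Combat { target, hit: true } => format!("hit {}", target),
                    GameEvent::Combat { target, hit: false } => format!("missed {}", target),
                    GameEvent::Pickup { item } => format!("picked up {}", item),
                    GameEvent::LevelChange { from, to } => {
                        format!("moved from level {} to level {}", from, to)
                    }
                    GameEvent::Hunger { state } => format!("hunger is now {}", state),
                };
                format!("[turn {}] {}", turn, text)
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Capping the log keeps the annotation within a fixed share of the context window, at the cost of forgetting older events. A fuller design might summarise evicted entries instead of discarding them.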

Local models and Raylib

The LLM interface speaks an OpenAI-compatible API, so the same code runs against a remote model or against LM Studio serving a local model on-device. Most development uses LM Studio, which keeps latency low and lets the agent play at a reasonable pace without API costs accumulating.
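Because the endpoint shape is the same in both cases, switching between a remote and a local model can come down to a base URL and a model name. The sketch below builds the URL and JSON body for a chat completions request using only the standard library. `PlannerConfig`, its methods and the model name `local-model` are hypothetical, and `http://localhost:1234/v1` is assumed to be the address of a local LM Studio server. A real client would more likely use an HTTP crate and a JSON serialiser than hand-built strings.

```rust
/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters need a \u00XX escape.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical planner settings: only these two values differ between
/// a remote provider and a local server.
pub struct PlannerConfig {
    pub base_url: String,
    pub model: String,
}

impl PlannerConfig {
    /// Full URL of the chat completions endpoint.
    pub fn endpoint(&self) -> String {
        format!("{}/chat/completions", self.base_url.trim_end_matches('/'))
    }

    /// JSON body with one system message and one user message.
    pub fn request_body(&self, system: &str, user: &str) -> String {
        format!(
            "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"system\",\"content\":\"{}\"}},{{\"role\":\"user\",\"content\":\"{}\"}}],\"stream\":false}}",
            json_escape(&self.model),
            json_escape(system),
            json_escape(user)
        )
    }
}
```

The structured game state and event annotations would go into the `user` message, with the playing instructions in `system`.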

A Raylib-based visualisation layer renders the agent's current map view, its inventory and a live trace of the LLM's reasoning steps alongside the NetHack terminal itself. Watching the agent think through a dangerous room is unexpectedly compelling.

Why Rust

PTY management, ANSI parsing and turn-by-turn state synchronisation need reliability and low latency. Rust's ownership model catches, at compile time, the aliasing and lifetime bugs that tend to appear in systems managing shared mutable terminal state, and the async ecosystem (Tokio) runs the PTY I/O, LLM streaming and Raylib render loop concurrently without threads fighting over state.
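One common way to keep concurrent producers from fighting over state is to give the state a single owner and have everything else send it messages. The sketch below illustrates that pattern. To stay dependency-free it swaps Tokio tasks for standard-library threads and `std::sync::mpsc` channels, and the `Msg` and `Shared` types and the `run` function are hypothetical. Whether Aimulet is organised this way is an assumption.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical messages sent to the one thread that owns the game state.
enum Msg {
    Frame(String), // raw terminal output from the PTY reader
    Action(char),  // keystroke chosen by the planner
}

/// State with exactly one owner, so no locks are required.
#[derive(Default, Debug, PartialEq)]
pub struct Shared {
    pub frames_seen: usize,
    pub last_frame: String,
    pub keys_sent: String,
}

/// Run two producers (stand-ins for the PTY reader and the planner)
/// and fold their messages into a single owned state.
pub fn run(frames: Vec<String>, actions: Vec<char>) -> Shared {
    let (tx, rx) = mpsc::channel::<Msg>();

    let frame_tx = tx.clone();
    let reader = thread::spawn(move || {
        for f in frames {
            frame_tx.send(Msg::Frame(f)).unwrap();
        }
    });
    let planner = thread::spawn(move || {
        for a in actions {
            tx.send(Msg::Action(a)).unwrap();
        }
    });

    let mut state = Shared::default();
    // The loop ends once both senders have been dropped.
    for msg in rx {
        match msg {
            Msg::Frame(f) => {
                state.frames_seen += 1;
                state.last_frame = f;
            }
            Msg::Action(k) => state.keys_sent.push(k),
        }
    }
    reader.join().unwrap();
    planner.join().unwrap();
    state
}
```

Because `state` never leaves the receiving thread, the compiler rejects any attempt by a producer to mutate it directly. Messages from the two producers may interleave in any order, but each producer's own messages arrive in the order they were sent.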