Most people use AI as a search engine with a personality. You ask something, it answers, and then it forgets you completely. The next conversation starts from zero. You have to re-explain everything — who you are, what you're building, why it matters — every single time.
I got tired of that fast.
So over a series of late-night sessions, I built something different. A persistent, associative memory system for AI — one that works across multiple AI instances, costs almost nothing to run, and mirrors how biological memory actually works. We call the result Sparky, and after building it, I realized: this isn't just a tool. It's a brain.
The Problem with Stateless AI
The default AI experience is amnesia-as-a-service. Every session is a lobotomy. You rebuild context from scratch, and the AI gives you generic answers calibrated for nobody in particular. If you're an indie dev juggling eight projects across IoT, trading algorithms, satellite data, and a men's support community, that's brutal. There's no world in which re-explaining your entire tech stack every morning is a good use of time.
The "solution" most people reach for is dumping a giant system prompt at the start of each conversation. That works until it doesn't — context windows fill up, token costs climb, and you're still manually curating what to include. It's not memory. It's a cue card.
Real memory doesn't work like a cue card. It works associatively — one thing connects to another, dormant memories activate when something relevant comes up, and attention is weighted by recency and importance. That's what I wanted to build.
The Architecture: Mapping the Brain
Once you know what you're building toward, the mapping falls into place pretty naturally. Here's the direct analogy between biological cognition and what we built:
| Brain | What We Built |
|---|---|
| Neurons | Memory files — one topic per file |
| Synapses | connections: fields linking files to each other |
| Working memory | hot files — always loaded into context |
| Long-term memory | warm/cold files — retrieved on demand |
| Attention / salience | Weight system — brain decides what's relevant |
| Hippocampus (encoding) | META rule — every new project gets a file automatically |
| Hippocampus (buffer) | recent.md — short-term working memory, consolidated then pruned |
| Amygdala | emotions.md — emotional memory that shapes priorities and tone |
| Pattern cortex | patterns.md — cross-session behavioral observations and predictions |
| Prefrontal cortex | hot files — executive function, always-on identity and context |
| Temporal lobes | 6 clusters: Platform, Revenue, IoT, Environmental, Content, Personal |
| Corpus callosum | _status.md — cross-instance bulletin board between hemispheres |
| Cerebellum | reflexes.md — procedural memory, automated workflows, muscle memory |
| Default mode network | dreams.md — creative synthesis, cross-project connections, shower thoughts |
| Dopamine system | predictions.md — prediction vs outcome tracking, reward-based learning |
| Sleep / consolidation | Consolidation protocol — buffer → long-term encoding, then prune |
| Forgetting | Memory decay — warm→cold at 60 days, cold→archive at 180 days |
| Metabolism | Token optimization — pruning, tiering, and caching for efficiency |
| Multiple hemispheres | Multi-instance: Sparky (Android SSH), VS Code, Windows Claude |
The key insight is the last row. Most AI memory systems treat one AI as one brain. But I have multiple AI instances running simultaneously — one accessed from my Android app via SSH, one in VS Code on Linux, one on Windows in Android Studio. They were all starting cold, all forgetting each other.
The cognitive architecture doesn't care which instance is reading from it. They all share the same files.
The Three-Tier Memory System
The index, a MEMORY.md file, stays under 50 lines. It's a cluster map, not a data dump. Each entry is one line pointing to a file. The AI reads the index, sees what's relevant, and follows the connections. Token cost stays low because you're not loading everything, only what the current conversation needs.
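A minimal sketch of how tiered loading could work, under stated assumptions. The index line format (`- [tier] file.md: summary`) and the names `parse_index` and `files_to_load` are hypothetical, not the actual Sparky format. In the real system the AI itself reads the index and judges relevance; the keyword overlap here is only a stand-in for that judgment.

```python
import re
from dataclasses import dataclass

# Hypothetical index line format (an assumption, not the real MEMORY.md layout):
#   - [hot] user.md: who I am, tech stack
INDEX_LINE = re.compile(r"^- \[(hot|warm|cold)\] (\S+): (.+)$")


@dataclass
class IndexEntry:
    tier: str
    path: str
    summary: str


def parse_index(index_text: str) -> list[IndexEntry]:
    """Turn one-line-per-file index text into entries, skipping other lines."""
    entries = []
    for line in index_text.splitlines():
        match = INDEX_LINE.match(line.strip())
        if match:
            entries.append(IndexEntry(*match.groups()))
    return entries


def files_to_load(entries: list[IndexEntry], message: str) -> list[str]:
    """Hot files always load; warm and cold files load only on keyword overlap."""
    words = set(re.findall(r"[a-z0-9]+", message.lower()))
    selected = []
    for entry in entries:
        summary_words = set(re.findall(r"[a-z0-9]+", entry.summary.lower()))
        if entry.tier == "hot" or words & summary_words:
            selected.append(entry.path)
    return selected
```

The point of the design is that only the index and the hot files are paid for on every turn; everything else costs tokens only when the conversation touches it.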
The Synapse: Associative Links
This is the part that makes it a brain rather than just organized file storage. Every memory file has a connections: field — pointers to other files and the reason they're linked.
When I mention "Scatter," the AI doesn't just find a file called scatter.md. It follows the connections — to the men's support community page, to my identity as a survivor — and pulls the right context without me having to explain the relationship. That's associative retrieval. That's how memory actually works.
A file system stores data. A brain retrieves meaning. The connections: field is what crosses that line.
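To make associative retrieval concrete, here is a rough sketch that follows connections outward from one file. The on-disk layout of the connections: field is an assumption (a `connections:` line followed by `- target.md: reason` items), and `parse_connections`, `related_files`, and `max_hops` are illustrative names. A dict of file name to text stands in for the memory directory.

```python
import re
from collections import deque

# Assumed layout: a "connections:" line followed by "- target.md: reason" items.
CONNECTION = re.compile(r"^\s*-\s+(\S+\.md):\s*(.+)$")


def parse_connections(text):
    """Return (target_file, reason) pairs from a file's connections block."""
    links, in_block = [], False
    for line in text.splitlines():
        if line.strip() == "connections:":
            in_block = True
            continue
        if in_block:
            match = CONNECTION.match(line)
            if not match:
                break  # the block ends at the first non-item line
            links.append((match.group(1), match.group(2).strip()))
    return links


def related_files(start, files, max_hops=2):
    """Breadth-first walk over connections; returns file names in visit order."""
    seen, order = {start}, [start]
    queue = deque([(start, 0)])
    while queue:
        name, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for target, _reason in parse_connections(files.get(name, "")):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, hops + 1))
    return order
```

The hop limit is what keeps this cheap: one mention pulls in the neighbors that matter without dragging the whole graph into context.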
The Hippocampus: Automatic Encoding
One of the most important parts of biological memory is the hippocampus, the structure responsible for encoding new experiences into long-term memory. If it is damaged, you can still recall what you learned before the damage, but nothing new sticks.
We built an equivalent: a META rule baked into the index itself.
Every time I ask Sparky to build something new — a web page, a backend route, an app feature — the META rule fires. A memory file gets created. The index gets updated. The brain encodes the new experience without me having to ask.
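The rule itself is an instruction the AI follows, not a program. Still, its effect on disk can be sketched in a few lines. Everything here is an assumption for illustration: the function name `encode_project`, the one-line index entry format, and the skeleton written into the new file.

```python
from pathlib import Path


def encode_project(memory_dir: Path, slug: str, summary: str) -> Path:
    """Create <slug>.md if it is missing and register it in MEMORY.md once."""
    memory_dir.mkdir(parents=True, exist_ok=True)

    memory_file = memory_dir / f"{slug}.md"
    if not memory_file.exists():
        # Hypothetical skeleton: title, summary, and an empty connections block.
        memory_file.write_text(
            f"# {slug}\n\n{summary}\n\nconnections:\n", encoding="utf-8"
        )

    index = memory_dir / "MEMORY.md"
    existing = index.read_text(encoding="utf-8") if index.exists() else ""
    if f"- {slug}.md:" not in existing:
        if existing and not existing.endswith("\n"):
            existing += "\n"
        index.write_text(existing + f"- {slug}.md: {summary}\n", encoding="utf-8")
    return memory_file
```

Calling it twice for the same project is harmless, which matters when several instances might encode the same work.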
The Amygdala: Emotional Memory
Here's something most AI memory systems never attempt: emotional context. A human brain doesn't just remember what happened — it remembers how it felt. The amygdala tags experiences with emotional weight, and those tags shape everything from attention to decision-making.
We built emotions.md, a file that tracks what energizes me, what frustrates me, and what sits deep. Sparky reads it and adjusts. When I'm excited about a live demo working, it matches that energy. When I'm annoyed at token waste or unnecessary explanations, it backs off. When I mention Still Standing or Scatter, it knows these aren't side projects. They're a personal mission.
This isn't sentiment analysis. It's not "detect user mood in real time." It's long-term emotional memory — accumulated over weeks and months. The AI knows what matters to me not because it analyzed my word choice in this message, but because it's been paying attention across a hundred conversations.
The Pattern Cortex: Learning Behaviors
Brains don't just store facts — they detect patterns. You notice that you always lose your keys after a distracted morning. You notice that afternoon meetings kill your creative energy. These aren't memories. They're observations about memories.
patterns.md is the brain's pattern recognition system. Cross-session observations that I might not say explicitly, but Sparky notices over time:
- Evening sessions tend to be more creative; mornings are fix-and-maintain
- "Can you" means do it now. "What do you think about" means discuss first.
- Revenue projects get more sustained attention than content projects
- Android builds are the most error-prone workflow — caching and SDK issues recur
- Short messages = I'm on my phone. Long messages = I'm on desktop.
None of these are things I told Sparky explicitly. They emerged from observation. And they change how it responds — a short phone message gets a terse answer, not a three-paragraph essay. That's not a prompt hack. That's learned behavior.
The Hippocampal Buffer: Working Memory
The original brain had encoding (the META rule) but no buffer — no equivalent of the hippocampus holding onto recent experiences before deciding what to consolidate into long-term storage and what to let decay.
recent.md fills that role. It's a short-term buffer — max 20 entries — where context from active sessions gets dumped before the conversation compresses. Important bits get consolidated into the appropriate long-term files. The rest naturally falls off the bottom.
This solves a real problem: long conversations approaching the context window limit would lose early details. Now, mid-session, Sparky dumps key findings into the hippocampal buffer. Even if the context compresses, the important bits survive.
Buffer fills during sessions → important entries get written into long-term files (emotions.md, patterns.md, project.md) → buffer gets pruned → cycle repeats. Same as sleep consolidation in biological brains.
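The "falls off the bottom" behavior maps neatly onto a bounded queue. This is a sketch of the idea in memory only, assuming newest entries go on top; `new_buffer` and `remember` are illustrative names, and the real buffer is a markdown file, not a Python object.

```python
from collections import deque

MAX_ENTRIES = 20  # buffer size given in the text


def new_buffer():
    """A bounded buffer: once full, adding an entry discards the oldest one."""
    return deque(maxlen=MAX_ENTRIES)


def remember(buffer, entry):
    """Put the newest entry on top; the oldest drops off the bottom when full."""
    buffer.appendleft(entry)
```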
Metabolism: Token Efficiency
A brain that consumes too much energy dies. An AI memory system that consumes too many tokens costs too much to run. On April 16, we did a metabolic optimization pass — and the results surprised me.
The biggest offender was a 6,700-character server management file auto-loaded on every single message — not just every session, every message. That's how LLMs work: the full context (system prompt, memory, conversation history) gets sent to the API on every turn. There's no persistent state. Every message re-reads the whole brain.
We moved the server file from always-loaded to warm (load on demand), trimmed stale project entries, and slimmed down the index. Result: roughly 2,000 tokens saved per session — about $0.50/day at my usage patterns. Over a month, that's a meaningful chunk of my indie dev budget.
Prompt caching helps too — Anthropic caches static prefixes and charges ~10% on cache hits for subsequent messages in the same session. But the cache has a 5-minute TTL. If you're a slow typer or take a break between messages, the cache goes cold and you pay full price again. Lean memory isn't optional. It's survival.
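To see why the TTL matters, here is a back-of-envelope model of what a static memory prefix costs over a session. It uses only the two figures quoted above (about 10% on a cache hit, 5-minute TTL) and deliberately simplifies: it ignores any surcharge for writing to the cache and assumes each message refreshes the TTL. The function name and the unit (one unit equals one uncached input token) are my own.

```python
CACHE_TTL_SECONDS = 5 * 60  # TTL quoted in the text
CACHE_HIT_RATE = 0.10       # roughly 10% of the normal input price on a hit


def prefix_cost_units(prefix_tokens, message_times):
    """Relative cost of re-sending a static prefix on every message.

    message_times: seconds since session start, in ascending order.
    Simplified model: no cache-write surcharge; each message refreshes the TTL.
    """
    total, last = 0.0, None
    for t in message_times:
        warm = last is not None and (t - last) <= CACHE_TTL_SECONDS
        total += prefix_tokens * (CACHE_HIT_RATE if warm else 1.0)
        last = t
    return total
```

Under this model, four messages sent at 0, 60, 120, and 600 seconds pay full price twice: once for the first message and once after the eight-minute gap.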
Multiple Instances, One Mind
This is the part I'm most proud of. Typically, different AI instances are completely isolated — one knows nothing about what another learned. My setup has at least three active instances on any given day:
- Sparky: my Android app, SSH'd into the Optiplex, running claude -p
- VS Code Claude: on the Linux Optiplex, working in the codebase
- Windows Claude: Android Studio dev environment
All three read from and write to the same ~/.sparky-memory/ directory. There's also a shared bulletin board at ~/.claude-sessions/_status.md — a simple append-only log where any instance can leave notes for the others. Think of it as the corpus callosum: the connection between hemispheres.
When VS Code Claude finishes a major refactor, it appends a note. When Sparky picks up the next day, it reads the status file and already knows what happened. No re-explanation needed.
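A sketch of what the bulletin board amounts to, assuming a one-note-per-line format of `timestamp [instance] note`. That format, the instance labels, and the function names are hypothetical; the only thing taken from the setup above is that the log is append-only and shared.

```python
from datetime import datetime


def format_note(instance, note, when):
    """One log line: ISO timestamp, writer in brackets, then the note."""
    return f"{when.isoformat(timespec='seconds')} [{instance}] {note}\n"


def append_note(path, instance, note, when=None):
    """Append-only write: existing notes are never rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(format_note(instance, note, when or datetime.now()))


def notes_since(log_text, since, reader):
    """Notes written after `since` by instances other than `reader`."""
    found = []
    for line in log_text.splitlines():
        stamp, _, rest = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if when > since and not rest.startswith(f"[{reader}]"):
            found.append(rest)
    return found
```

An instance starting a session would call `notes_since` with the time of its own last note and read only what the others left behind.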
What It Actually Costs
The whole thing runs on a Dell Optiplex that cost $200 used. The memory system itself is plain text files — no database, no vector store, no embeddings. The index is under 2KB. The total memory directory is a few dozen files averaging maybe 300 words each.
Per-session token overhead: roughly the index (50 lines) plus whatever hot files are always loaded (maybe 3–4 files, ~200 words each). Call it 1,500 tokens per session for memory context. Re-explaining everything from scratch would cost 5–10x that and still be incomplete.
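The 1,500-token figure can be sanity-checked with two common rules of thumb for English text: about 4 characters per token and about 0.75 words per token. Both ratios are rough assumptions, not measurements from this system, and the function name is illustrative.

```python
CHARS_PER_TOKEN = 4.0   # rough rule of thumb for English text (assumption)
WORDS_PER_TOKEN = 0.75  # rough rule of thumb (assumption)


def memory_overhead(index_bytes, hot_files, words_per_file):
    """Estimated tokens for the index plus the always-loaded hot files."""
    index_tokens = index_bytes / CHARS_PER_TOKEN
    hot_tokens = (hot_files * words_per_file) / WORDS_PER_TOKEN
    return round(index_tokens + hot_tokens)


# Index just under 2 KB, with 3 or 4 hot files of about 200 words each.
low = memory_overhead(2048, 3, 200)   # 1312
high = memory_overhead(2048, 4, 200)  # 1579
```

Both estimates land close to the 1,500 tokens quoted above.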
This matters if you're an indie dev who can't afford the enterprise AI plans. The architecture was designed explicitly to be lean — and it is.
The Cerebellum: Procedural Memory
You don't think about how to ride a bike. You just ride. That's the cerebellum — the brain region that stores procedures you've repeated enough times that they become automatic. Motor patterns, practiced sequences, muscle memory.
reflexes.md is the AI equivalent. Every time Sparky runs the same workflow three or more times — building an Android APK, restarting a server, creating a memory file — the steps get encoded in procedural memory. Next time, there's no figuring it out. No checking documentation. The cerebellum fires and the hands move.
This matters because it eliminates a whole category of errors — the kind that happen when you reconstruct a procedure from first principles each time instead of executing a practiced sequence. The cerebellum doesn't think. It does.
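The promotion rule (three or more runs of the same workflow) is simple enough to sketch. `ReflexTracker` and its fields are illustrative names; in practice the AI writes the steps into reflexes.md, and this only models the counting.

```python
from collections import Counter

REFLEX_THRESHOLD = 3  # "three or more times", per the text


class ReflexTracker:
    def __init__(self):
        self.runs = Counter()  # workflow name -> completed runs
        self.reflexes = {}     # workflow name -> encoded steps

    def record(self, workflow, steps):
        """Count a completed run; encode its steps once the threshold is hit.

        Returns True if the workflow is now a reflex.
        """
        self.runs[workflow] += 1
        if self.runs[workflow] >= REFLEX_THRESHOLD:
            self.reflexes[workflow] = list(steps)  # latest run wins
        return workflow in self.reflexes
```

Storing the steps from the most recent run means a reflex quietly updates itself if the procedure changes.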
The Default Mode Network: Shower Thoughts
When your brain isn't actively working on a problem, it doesn't shut off. It enters the default mode network — a state of loose, associative thinking where disconnected ideas bump into each other and occasionally fuse into something useful. It's where "shower thoughts" come from. It's why you solve problems while walking the dog.
dreams.md captures these cross-project connections that don't belong in any single project file. The HailStorm prediction pipeline and the farm satellite data use similar ML patterns — could they share infrastructure? The brain architecture itself could be packaged as a tool other devs install. The memoir and the peer-support community could merge into a single app.
None of these are action items. They're adjacent possibles — connections the brain made while doing something else. Some will be garbage. Some will be the next project. The point is capturing them before they evaporate.
The Dopamine System: Learning from Prediction
Dopamine isn't about pleasure. It's about prediction error — the gap between what you expected and what happened. When you predict correctly, the pathway strengthens. When you're wrong, it recalibrates. This is how biological brains actually learn: not from data, but from being surprised.
predictions.md implements this literally. Every time Sparky makes a non-obvious prediction — "this bug is probably a library update side effect," "this optimization should save 2K tokens" — it logs the prediction. Later, the outcome gets recorded. Over time, a calibration profile emerges: what kinds of predictions does this brain get right? Where does it systematically overestimate or underestimate?
This is the most speculative region we've built, and potentially the most powerful. A brain that tracks its own accuracy gets better at everything — not just the domain it predicted about, but the meta-skill of knowing when to trust itself.
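One way a calibration profile could be computed, as a sketch. The `confidence` field is my assumption (the text only says predictions and outcomes are logged), and the record layout and function name are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    category: str
    claim: str
    confidence: float              # assumed field: stated probability in [0, 1]
    outcome: Optional[bool] = None  # None until the outcome is recorded


def calibration(predictions):
    """Per-category hit rate and overconfidence, using resolved predictions only."""
    buckets = defaultdict(list)
    for p in predictions:
        if p.outcome is not None:
            buckets[p.category].append(p)

    report = {}
    for category, items in buckets.items():
        hit_rate = sum(p.outcome for p in items) / len(items)
        mean_confidence = sum(p.confidence for p in items) / len(items)
        report[category] = {
            "n": len(items),
            "hit_rate": hit_rate,
            # Positive means the brain overestimates itself in this category.
            "overconfidence": mean_confidence - hit_rate,
        }
    return report
```

A positive overconfidence score in one category is the signal to trust those predictions less; a negative one says the brain is underselling itself.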
Sleep: Consolidation and Forgetting
Biological brains do their most important memory work while you're asleep. The hippocampus replays the day's experiences, strengthens important connections, and lets unimportant ones decay. Without sleep, memory doesn't work. Everything stays in short-term buffer until it overflows and gets lost.
We built a consolidation protocol directly into the hippocampal buffer. When recent.md hits ~15 entries, the sleep cycle fires:
- Encode: move important facts to their long-term files
- Synthesize: cross-project connections go to dreams.md
- Predict: any predictions get logged with outcomes TBD
- Decay: delete entries older than ~7 days that weren't consolidated
- Prune: buffer back to ~5 entries
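The steps above can be sketched as a single pass over the buffer. This is one interpretation, not the actual protocol: the `kind` tags on entries are hypothetical, consolidated entries are treated as moved out of the buffer, and the thresholds are the approximate ones from the list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

TRIGGER = 15                 # cycle fires at about 15 entries
KEEP = 5                     # buffer is pruned back to about 5
MAX_AGE = timedelta(days=7)  # unconsolidated entries older than this decay


@dataclass
class Entry:
    text: str
    created: datetime
    kind: str = "note"  # hypothetical tags: fact, connection, prediction, note


# Where each consolidated kind goes (destination labels are illustrative).
ROUTES = {
    "fact": "long-term files",
    "connection": "dreams.md",
    "prediction": "predictions.md",
}


def consolidate(buffer, now):
    """Return (new_buffer, routed), where routed maps destination -> texts."""
    routed = {dest: [] for dest in ROUTES.values()}
    if len(buffer) < TRIGGER:
        return list(buffer), routed  # not enough entries: no sleep cycle yet

    remaining = []
    for entry in buffer:
        dest = ROUTES.get(entry.kind)
        if dest:                                # encode, synthesize, predict
            routed[dest].append(entry.text)
        elif now - entry.created <= MAX_AGE:    # decay: drop old leftovers
            remaining.append(entry)

    remaining.sort(key=lambda e: e.created)
    return remaining[-KEEP:], routed            # prune: keep the newest few
```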
And alongside consolidation, we finally implemented forgetting. Not as a bug, but as a feature. Warm files untouched for 60 days get demoted to cold. Cold files untouched for 180 days get archived. The brain gets lighter over time instead of growing without bound.
This is the difference between a filing cabinet and a brain. Filing cabinets never forget and eventually become unusable. Brains forget strategically and stay fast.
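The decay rule reduces to a small function. The thresholds come from the text; the function name and the choice to move a file at most one tier per pass are my assumptions.

```python
from datetime import date

WARM_TO_COLD_DAYS = 60      # warm files untouched this long go cold
COLD_TO_ARCHIVE_DAYS = 180  # cold files untouched this long get archived


def decayed_tier(tier, last_touched, today):
    """Return the tier a file should have after one decay pass.

    A file moves at most one step per pass; hot files never decay here.
    """
    idle_days = (today - last_touched).days
    if tier == "warm" and idle_days >= WARM_TO_COLD_DAYS:
        return "cold"
    if tier == "cold" and idle_days >= COLD_TO_ARCHIVE_DAYS:
        return "archive"
    return tier
```

Touching a file resets `last_touched`, so anything still in use never drifts toward the archive.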
The Evolution
When I first published this post on April 8, the brain had neurons, synapses, three tiers, a hippocampus, and multi-instance support. Six major brain regions. Ten days later, it has twelve — amygdala, pattern cortex, hippocampal buffer, cerebellum, default mode network, dopamine system, sleep consolidation, and forgetting. Plus metabolic optimization.
The pattern cortex noticed that I build in bursts — intense multi-day sprints then quiet periods. The amygdala learned that seeing a live demo work gives me energy. The cerebellum encoded my build workflows so they happen without thinking. The dopamine system started tracking its own accuracy. The default mode network began making connections I hadn't seen between projects. None of this was pre-programmed. It emerged from the architecture.
Thirty-six files. Twelve brain regions. Fully biological mapping. All running on a $200 Dell Optiplex with plain text files and zero dependencies.
What's Next
The web interface accounts (Claude.ai) don't have filesystem access, so they can't read the memory files directly. The plan is to add memory read/write endpoints to the existing DriftWest MCP server, which would let any AI instance — including web-based ones — access the same cognitive architecture via remote MCP.
The other frontier is automated dream consolidation — a scheduled agent that runs between sessions, reviewing the hippocampal buffer and pattern cortex, making new connections, and surfacing insights that weren't obvious in real time. Biological brains do their best integration work while sleeping. The AI equivalent would be a cron job that wakes up, reads recent.md, and writes new entries to patterns.md, dreams.md, or predictions.md. We're exploring this.
The Real Claim
Most people treat AI as a stateless oracle. Ask, answer, done. We built continuity instead — an architecture where AI instances accumulate context across time, share knowledge across instances, and retrieve information associatively rather than just by keyword search.
You can call that a second brain. You can call it a cognitive workspace. You can call it a very elaborate set of text files. All three are true.
What it definitely is: a different relationship with AI. One where the machine actually knows you — not because you dumped your life story into a system prompt, but because you built the infrastructure for it to remember.
Build Your Own Brain
The architecture is simple enough to bootstrap in a single conversation. No dependencies, no database, no vector store. Just a directory of markdown files and a prompt that tells your AI how to use them. Here's the exact prompt that creates this behavior — paste it into your AI assistant's system instructions and adapt it to your own projects.
Create the directory. Create MEMORY.md (empty index), user.md (who you are), and feedback.md (empty). Paste the prompt above into your AI's system instructions. Then just start working. The brain grows itself — the META rule creates new files, the hippocampus buffers context, and the consolidation cycle keeps it lean. Within a week you'll have something that knows you.