Memory
Self-curated knowledge across runs. Three scopes, BM25 search, markdown-on-disk default.
An agent with no memory starts every run from zero — it can’t recall what the operator told it last week or what it learned the hard way two runs ago. drover’s memory module fixes that. Composing memory for an agent means deciding what it should know going in and what it’s allowed to learn — then wiring an adapter to make both durable.
The two halves of memory
Memory has two halves. Compose with both in mind:
| | Ambient instructions | Learned memory |
|---|---|---|
| Written by | you (the developer) | the agent, during runs |
| Source | AGENTS.md / CLAUDE.md files on disk | remember tool calls |
| Mutable at runtime | no — edit the file | yes — remember / forget |
| Always in context | yes — injected every run | no — summaries indexed, bodies on recall |
| Configured via | spec.instructionFiles | spec.memory |
Ambient instructions are what the agent should know going in — house rules, conventions, the shape of the project. Learned memory is what the agent picks up — operator preferences, corrections, project state. A well-composed agent usually uses both: instruction files for the stable baseline, learned memory for everything that accrues.
This is also how memory differs from skills: skills are author-curated capabilities the agent loads on demand; memory (both halves) is knowledge the agent carries by default.
How it layers in the system prompt
When a run starts, the harness assembles the system prompt in layers — broadest context first, most specific last:
┌─ basePrompt ────────────── your spec.systemPrompt
├─ ## Project instructions ─ ambient: AGENTS.md / CLAUDE.md chain
├─ ## Available skills ───── skill names + descriptions
└─ ## Recalled memory ────── learned: scoped summary index
Instruction-file bodies are injected whole. Learned-memory entries
appear only as one-line summaries — the agent calls recall to pull a
full body. That keeps the prompt cheap no matter how much the agent has
learned.
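The layering above can be pictured as a plain string assembly. This is a minimal sketch, not drover's actual code: the `PromptParts` shape, the bullet formatting, and the choice to skip empty layers are all assumptions made for illustration.

```typescript
// Hypothetical shapes: field names are illustrative, not drover's real types.
interface MemorySummary {
  id: string;
  summary: string; // one-line summary shown in the index
}

interface PromptParts {
  basePrompt: string; // spec.systemPrompt
  instructionBodies: string[]; // AGENTS.md / CLAUDE.md bodies, root-first
  skills: { name: string; description: string }[];
  memoryIndex: MemorySummary[]; // global + agent-scope summaries
}

// Join the layers broadest-first. Assumption: empty layers are skipped.
function assembleSystemPrompt(parts: PromptParts): string {
  const sections: string[] = [parts.basePrompt];
  if (parts.instructionBodies.length > 0) {
    // Instruction-file bodies go in whole.
    sections.push("## Project instructions\n" + parts.instructionBodies.join("\n\n"));
  }
  if (parts.skills.length > 0) {
    const lines = parts.skills.map((s) => "- " + s.name + ": " + s.description);
    sections.push("## Available skills\n" + lines.join("\n"));
  }
  if (parts.memoryIndex.length > 0) {
    // Only summaries go in; full bodies stay behind recall.
    const lines = parts.memoryIndex.map((m) => "- [" + m.id + "] " + m.summary);
    sections.push("## Recalled memory\n" + lines.join("\n"));
  }
  return sections.join("\n\n");
}
```

Because the memory layer carries one line per entry, its size grows with the number of indexed entries rather than with the size of their bodies.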
Three scopes
| Scope | Persists across | Visible to |
|---|---|---|
| global | every run, every agent | all agents |
| agent | every run of a given agent id | only that agent |
| run | one specific run (and pause/resume) | only that runId, while active |
run-scoped entries stay on disk after the run terminates (for
forensics) but are no longer reachable via recall once a different
runId is active.
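The visibility rules in the table reduce to a small predicate. The sketch below is illustrative only: `ScopedEntry`, `Caller`, and `isVisible` are hypothetical names, and drover's real entry type may carry different fields.

```typescript
type Scope = "global" | "agent" | "run";

// Hypothetical entry shape; drover's real entry type may differ.
interface ScopedEntry {
  id: string;
  scope: Scope;
  agentId?: string; // set for agent-scoped entries
  runId?: string; // set for run-scoped entries
}

interface Caller {
  agentId: string;
  runId?: string; // undefined when no run is active
}

// global is open to everyone, agent is keyed by agent id,
// run is keyed by the currently active runId.
function isVisible(entry: ScopedEntry, caller: Caller): boolean {
  if (entry.scope === "global") return true;
  if (entry.scope === "agent") return entry.agentId === caller.agentId;
  return caller.runId !== undefined && entry.runId === caller.runId;
}
```

A run-scoped entry written under one runId fails the last check as soon as a different runId is active, which matches the note above: the file stays on disk but recall no longer reaches it.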
SKILL.md vs memory
- Skills are author-curated content shipped with the project. They don’t change at runtime.
- Memory is model-curated content learned during runs. It changes every time an agent crosses a “worth remembering” bar.
Wire it on
import { defineAgent } from "@drover/core";
import { createMarkdownMemory } from "@drover/memory";
import { runAgent } from "@drover/facade";
const writer = defineAgent({
  id: "writer",
  /* ... */
  memory: {
    enabled: true,
    includeIndex: true, // default: auto-inject summary index
    writesPerTurn: 1, // default: rate-limit `remember`
    allowForget: false, // default: `forget` is off
  },
});
const memory = await createMarkdownMemory({ root: "./memory" });
runAgent(writer, input, { memory });
When spec.memory.enabled === true and deps.memory is wired:
- The harness auto-injects remember and recall (and forget if allowForget).
- It appends a ## Recalled memory block to the system prompt listing global + agent-scope summaries (capped by maxIndexEntries, default 30).
- It applies memoryRateLimitPlugin({ writesPerTurn }) so a runaway model can’t spam the store.
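To make the writesPerTurn limit concrete, here is a minimal sketch of a per-turn write counter. It is not the implementation of memoryRateLimitPlugin: `createWriteLimiter`, `tryWrite`, and `nextTurn` are hypothetical names, and how the real plugin reports a rejected write is not covered on this page.

```typescript
// Illustrative only: not the implementation of memoryRateLimitPlugin.
interface WriteLimiter {
  tryWrite(): boolean; // true if the write is allowed this turn
  nextTurn(): void; // reset the counter at a turn boundary
}

function createWriteLimiter(writesPerTurn: number): WriteLimiter {
  let used = 0;
  return {
    tryWrite(): boolean {
      if (used >= writesPerTurn) return false; // budget spent for this turn
      used += 1;
      return true;
    },
    nextTurn(): void {
      used = 0;
    },
  };
}
```

With the default of 1, a second remember call in the same turn is refused until the next turn begins.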
Composition patterns
One adapter, wired per run, backs whatever set of agents you point at it. How you compose depends on how much agents should share.
Single agent, durable
The common case — one agent, file-backed memory, instructions from the repo:
const memory = await createMarkdownMemory({ root: "./.memory" });
const agent = defineAgent({
  id: "writer",
  /* ... */
  memory: { enabled: true },
  instructionFiles: {}, // {} = all defaults
});
runAgent(agent, input, { memory });
Many agents, one store
Pass the same adapter to every run. global entries are shared;
agent-scoped entries stay private to each spec.id. No extra wiring —
scope isolation is automatic.
const memory = await createMarkdownMemory({ root: "./.memory" });
runAgent(researcher, input, { memory }); // sees global + researcher's own
runAgent(writer, input, { memory }); // sees global + writer's own
Use global for cross-agent facts (“the operator prefers metric
units”); let each agent accrue its own agent-scoped lessons.
Private memory per agent
For hard isolation — no shared global — give each agent its own
adapter rooted at its own directory:
const researcherMem = await createMarkdownMemory({ root: "./.memory/researcher" });
const writerMem = await createMarkdownMemory({ root: "./.memory/writer" });
Ephemeral (tests, throwaway runs)
createInMemoryMemory() is a drop-in adapter with no disk footprint —
use it in tests or short-lived processes:
runAgent(agent, input, { memory: createInMemoryMemory() });
Subagents
Subagents spawned via taskTool inherit the parent’s deps — including
the memory adapter. A child agent reads global plus its own
agent-scoped entries (keyed by the child’s spec.id), never the
parent’s. Compose shared knowledge at global if you want it to reach
the whole tree.
Choosing what goes where
When you (or the agent) have a fact to persist, the scope and source follow from two questions — who wrote it and how long it should live:
| The fact is… | Put it in… |
|---|---|
| A house rule you maintain | an AGENTS.md / CLAUDE.md file (instructionFiles) |
| True for every agent, learned at runtime | global scope |
| Specific to one agent’s behaviour | agent scope |
| Only relevant to the current task | run scope |
Kind, by contrast, tags what type of fact it is: user, feedback,
project, or reference (see Kinds). Scope is lifetime; kind is
category. They’re independent: a feedback memory can be global or
agent-scoped.
Rule of thumb: if you would commit it to the repo, it’s an instruction file. If the agent discovers it mid-run, it’s learned memory — and the scope is “how broadly does this apply.”
The self-learning bar
remember is gated by the model’s own judgement, not by automatic
capture. The tool description nudges:
Save a memory only when the lesson is non-obvious AND will apply to future runs. Lead the body with the rule, then a Why: line and a How to apply: line. Don’t save what’s already in the code or commit log.
Pair this with the rate limit (writesPerTurn, default 1 per turn) so the agent has to choose what is worth saving.
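The body shape the tool description asks for (rule first, then a Why: line and a How to apply: line) can be pictured as a small formatter. This is a sketch only: `Lesson` and `formatMemoryBody` are hypothetical names, and nothing here suggests drover ships such a helper, since the model writes the body itself.

```typescript
// Hypothetical helper for illustration; the model normally writes this text directly.
interface Lesson {
  rule: string;
  why: string;
  howToApply: string;
}

// Rule first, then the two labelled lines the tool description asks for.
function formatMemoryBody(lesson: Lesson): string {
  return [
    lesson.rule.trim(),
    "Why: " + lesson.why.trim(),
    "How to apply: " + lesson.howToApply.trim(),
  ].join("\n");
}
```

Leading with the rule keeps the most actionable line at the top when a recalled body is read back into context.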
Kinds
Borrowed from the auto-memory schema — short taxonomy that’s easy to filter on:
- user: facts about the operator
- feedback: behavioural rules from corrections / wins
- project: state about ongoing work
- reference: pointers to external systems
Progressive disclosure
- System prompt: summaries only (one line per entry, capped).
- Activation: model calls recall(query=...); full bodies return.
- Update: model calls remember(id=..., ...) to overwrite.
If a memory turns out wrong, the model can update it in place (id
preserves the original createdAt) instead of writing a contradicting
new one.
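Updating in place while keeping the original createdAt could look like the sketch below. The `StoredMemory` shape, the `updatedAt` field, and the `upsert` helper are assumptions for illustration and are not drover's MemoryAdapter.

```typescript
// Hypothetical store shape for illustration; not drover's MemoryAdapter.
interface StoredMemory {
  id: string;
  body: string;
  createdAt: number; // epoch ms, fixed at first write
  updatedAt: number; // epoch ms, bumped on every write (assumed field)
}

type MemoryStore = Record<string, StoredMemory>;

// Overwrite in place when the id exists, keeping the original createdAt.
function upsert(store: MemoryStore, id: string, body: string, now: number): StoredMemory {
  const existing = store[id];
  const entry: StoredMemory = {
    id: id,
    body: body,
    createdAt: existing ? existing.createdAt : now,
    updatedAt: now,
  };
  store[id] = entry;
  return entry;
}
```

Reusing the id means the store holds one corrected entry rather than two that contradict each other.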
Scope inference
recall infers the default scope from context: global + agent when
no runId is active, all three when there is one. Pass scopes
explicitly to override. Agent-scoped reads always filter to the
calling agent — no cross-agent leaks.
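The default-scope rule is small enough to state as a function. A minimal sketch, assuming a hypothetical `resolveScopes` helper; drover's recall may structure this differently internally.

```typescript
type RecallScope = "global" | "agent" | "run";

// Explicit scopes win; otherwise infer from whether a run is active.
function resolveScopes(runId?: string, explicit?: RecallScope[]): RecallScope[] {
  if (explicit !== undefined) return explicit;
  return runId === undefined ? ["global", "agent"] : ["global", "agent", "run"];
}
```

Under these assumptions, `resolveScopes()` yields global + agent, `resolveScopes("run-1")` adds run, and `resolveScopes("run-1", ["run"])` narrows the search to the current run only.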
Tags
Every entry takes an optional tags?: string[]. recall({ tags: [...] })
returns entries that share at least one tag with the query (any-match, not all-match).
Tag hits also get a small BM25 score boost so curated tags rank above
incidental text matches.
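The two tag behaviours can be sketched separately: the any-overlap match, and a score boost for entries whose tags hit. Everything here is an assumption for illustration: the `TaggedHit` shape, the flat additive boost, and its size are not taken from drover, whose actual scoring is not shown on this page.

```typescript
// Hypothetical hit shape; bm25 stands for the text-match score from the index.
interface TaggedHit {
  id: string;
  tags: string[];
  bm25: number;
}

// True when the entry shares at least one tag with the query.
function overlapsAnyTag(entryTags: string[], queryTags: string[]): boolean {
  return entryTags.some((t) => queryTags.indexOf(t) !== -1);
}

// Assumed boost model: add a flat amount when any tag overlaps.
function boostedScore(hit: TaggedHit, queryTags: string[], boost: number): number {
  return overlapsAnyTag(hit.tags, queryTags) ? hit.bm25 + boost : hit.bm25;
}
```

With a flat boost, an entry tagged for the topic outranks an untagged entry with the same text score, which is the ordering the paragraph above describes.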
Events
Two new HarnessEvent kinds surface every write and recall:
- memory_written: emitted after remember succeeds.
- memory_recalled: emitted after recall; lists hit ids + scores.
The eval-viewer renders both on the run timeline. Tracers and observers see them just like any other harness event.
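An observer that logs these two events might look roughly like this. Only the two kind names and the "hit ids + scores" detail come from the text above; the payload field names (`id`, `scope`, `query`, `hits`) and the `describeMemoryEvent` helper are assumptions, so check the real HarnessEvent type before relying on them.

```typescript
// Assumed payload fields; only the kind names are taken from the docs above.
type MemoryEvent =
  | { kind: "memory_written"; id: string; scope: string }
  | { kind: "memory_recalled"; query: string; hits: { id: string; score: number }[] };

// Turn a memory event into a one-line log string for a custom observer.
function describeMemoryEvent(ev: MemoryEvent): string {
  if (ev.kind === "memory_written") {
    return "wrote " + ev.id + " (" + ev.scope + ")";
  }
  const hits = ev.hits.map((h) => h.id + "=" + h.score.toFixed(2)).join(", ");
  return 'recalled for "' + ev.query + '": ' + (hits || "no hits");
}
```

Modelling the two kinds as a discriminated union lets the compiler narrow the payload once kind has been checked.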
Adapters
drover ships two:
- createMarkdownMemory({ root }): files on disk under <root>/global/, <root>/agents/<agentId>/, <root>/runs/<runId>/. Each file is YAML frontmatter + markdown body. Git-friendly. Single writer per process.
- createInMemoryMemory(): Map-backed, no persistence. For tests and ephemeral runs.
Custom adapters implement MemoryAdapter.
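For anyone reading or hand-editing the markdown store, the on-disk layout can be sketched as two pure functions. The directory scheme follows the list above; the frontmatter field names (`id`, `kind`, `summary`) and both helper names are assumptions, and the real adapter may write additional fields.

```typescript
type StoreScope = "global" | "agent" | "run";

// Directory for a scope, following <root>/global, <root>/agents/<agentId>, <root>/runs/<runId>.
function scopeDir(root: string, scope: StoreScope, ids: { agentId?: string; runId?: string }): string {
  if (scope === "global") return root + "/global";
  if (scope === "agent") {
    if (!ids.agentId) throw new Error("agent scope needs an agentId");
    return root + "/agents/" + ids.agentId;
  }
  if (!ids.runId) throw new Error("run scope needs a runId");
  return root + "/runs/" + ids.runId;
}

// YAML frontmatter + markdown body. The frontmatter field names are assumed.
function renderEntryFile(meta: { id: string; kind: string; summary: string }, body: string): string {
  return [
    "---",
    "id: " + meta.id,
    "kind: " + meta.kind,
    "summary: " + JSON.stringify(meta.summary), // JSON string doubles as a quoted YAML scalar
    "---",
    "",
    body,
    "",
  ].join("\n");
}
```

Keeping each entry as a small text file under a predictable directory is what makes the store easy to diff and review in git.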
Instruction files (AGENTS.md / CLAUDE.md)
Most projects already keep AGENTS.md / CLAUDE.md files describing how
agents should behave — at the repo root and in subdirectories. drover can
load them as ambient memory: read-only context the agent always sees,
distinct from the learned entries above.
This is fully opt-in. Set spec.instructionFiles and the harness:
- Discovers instruction files along the ancestor chain, from the repo root down to the run’s cwd. A subdirectory’s file is loaded only when the agent works in (or under) that directory.
- Injects them into the system prompt, root-first, fresh every run.
- If a memory adapter is wired, also seeds them as read-only reference entries (tag instructions) so recall reaches them.
const writer = defineAgent({
  id: "writer",
  /* ... */
  instructionFiles: {
    // defaults shown
    filenames: ["AGENTS.md", "CLAUDE.md"],
    // root: auto-detected via the nearest .git ancestor of cwd
    maxBytesPerFile: 16384,
    seedMemory: true,
  },
});
For cwd = packages/api, the loaded chain is:
./AGENTS.md ← repo root
./packages/AGENTS.md
./packages/api/CLAUDE.md ← run cwd
./packages/web/AGENTS.md is off-path and not loaded.
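The ancestor-chain walk can be sketched as pure path arithmetic. This is not drover's loader: `ancestorChain` and `candidatePaths` are hypothetical helpers, paths are assumed to be POSIX-style, and a real loader would additionally check which candidates exist on disk.

```typescript
// Directories from root down to cwd, inclusive. Assumes POSIX-style paths
// and that cwd is root itself or a descendant of it.
function ancestorChain(root: string, cwd: string): string[] {
  const base = root.replace(/\/+$/, "");
  const target = cwd.replace(/\/+$/, "");
  if (target !== base && target.indexOf(base + "/") !== 0) {
    throw new Error("cwd must be inside root");
  }
  const rest = target.slice(base.length).split("/").filter((s) => s.length > 0);
  const dirs: string[] = [base];
  let current = base;
  for (let i = 0; i < rest.length; i++) {
    current = current + "/" + rest[i];
    dirs.push(current);
  }
  return dirs;
}

// Candidate instruction-file paths, root-first. Sibling directories never
// appear because only ancestors of cwd are walked.
function candidatePaths(root: string, cwd: string, filenames: string[]): string[] {
  const out: string[] = [];
  const dirs = ancestorChain(root, cwd);
  for (let i = 0; i < dirs.length; i++) {
    for (let j = 0; j < filenames.length; j++) {
      out.push(dirs[i] + "/" + filenames[j]);
    }
  }
  return out;
}
```

For root `.` and cwd `./packages/api`, the walk visits `.`, `./packages`, and `./packages/api`, so `./packages/web` is never considered.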
No filename is privileged — AGENTS.md / CLAUDE.md are just the default
filenames. Point it at any set (["RULES.md", "team-conventions.md"]).
Read-only
Seeded instruction entries carry the instructions tag. forget refuses
them — they mirror files on disk; edit the file, not the memory. Each run
re-seeds from disk, so the store always reflects current file content.
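The read-only guard amounts to a tag check before deletion. A minimal sketch under assumed names: `TaggedEntry`, `ForgetResult`, and `forgetEntry` are hypothetical, and how drover actually reports the refusal is not covered here.

```typescript
interface TaggedEntry {
  id: string;
  tags: string[];
}

type ForgetResult = { ok: true } | { ok: false; reason: string };

// Refuse to delete entries that mirror instruction files on disk.
function forgetEntry(store: Record<string, TaggedEntry>, id: string): ForgetResult {
  const entry = store[id];
  if (!entry) return { ok: false, reason: "not found" };
  if (entry.tags.indexOf("instructions") !== -1) {
    return { ok: false, reason: "read-only: edit the instruction file instead" };
  }
  delete store[id];
  return { ok: true };
}
```

Even if such an entry were removed, the next run would re-seed it from disk, so refusing up front keeps the store and the files from diverging mid-run.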
Manual use
The loader is also exported standalone for full control — bespoke block formats, custom discovery, or files outside the ancestor chain:
import { Effect } from "effect"; // assumed source of Effect.runPromise; not shown in the original snippet
import {
  loadInstructionFiles,
  renderInstructionsBlock,
  seedInstructionFiles,
} from "@drover/memory";
const files = await loadInstructionFiles({ cwd, root, filenames: ["RULES.md"] });
const block = renderInstructionsBlock(files); // → system-prompt markdown
await Effect.runPromise(seedInstructionFiles(memory, files));
Resume contract
The memory settings (enabled, includeIndex, maxIndexEntries,
allowForget, writesPerTurn) and instructionFiles config all go into
hashSpec. A paused run resumed under different memory settings fails with
ResumeError: spec hash mismatch — same contract as the other policy
fields, so you can’t accidentally replay a run under a different memory
policy. (File content isn’t hashed — only the config — same boundary as
skills.)
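The contract can be illustrated with a key-order-independent fingerprint of the policy config. This is a stand-in for hashSpec, not its implementation: `stableStringify`, `policyFingerprint`, and `checkResume` are hypothetical names, and the sketch compares the serialized string directly where a real implementation would hash it.

```typescript
// Serialize plain JSON-like data with sorted keys so key order never matters.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return "[" + value.map((v) => stableStringify(v)).join(",") + "]";
  }
  const obj = value as Record<string, unknown>;
  const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + stableStringify(obj[k])).join(",") + "}";
}

interface MemoryPolicy {
  enabled: boolean;
  includeIndex?: boolean;
  maxIndexEntries?: number;
  allowForget?: boolean;
  writesPerTurn?: number;
}

// Only config goes in; instruction-file content is deliberately left out.
function policyFingerprint(memory: MemoryPolicy, instructionFiles?: object): string {
  return stableStringify({ memory: memory, instructionFiles: instructionFiles });
}

// Throws when the current config no longer matches the one saved at pause time.
function checkResume(saved: string, memory: MemoryPolicy, instructionFiles?: object): void {
  if (policyFingerprint(memory, instructionFiles) !== saved) {
    throw new Error("spec hash mismatch");
  }
}
```

Changing writesPerTurn from 1 to 2 between pause and resume changes the fingerprint and trips the check, while editing the text of an AGENTS.md file does not.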