Memory

Self-curated knowledge across runs. Three scopes, BM25 search, markdown-on-disk default.

An agent with no memory starts every run from zero — it can’t recall what the operator told it last week or what it learned the hard way two runs ago. drover’s memory module fixes that. Composing memory for an agent means deciding what it should know going in and what it’s allowed to learn — then wiring an adapter to make both durable.

The two halves of memory

Memory has two halves. Compose with both in mind:

| | Ambient instructions | Learned memory |
|---|---|---|
| Written by | you (the developer) | the agent, during runs |
| Source | AGENTS.md / CLAUDE.md files on disk | remember tool calls |
| Mutable at runtime | no (edit the file) | yes (remember / forget) |
| Always in context | yes (injected every run) | no (summaries indexed, bodies on recall) |
| Configured via | spec.instructionFiles | spec.memory |

Ambient instructions are what the agent should know going in — house rules, conventions, the shape of the project. Learned memory is what the agent picks up — operator preferences, corrections, project state. A well-composed agent usually uses both: instruction files for the stable baseline, learned memory for everything that accrues.

This is also how memory differs from skills: skills are author-curated capabilities the agent loads on demand; memory (both halves) is knowledge the agent carries by default.

How it layers in the system prompt

When a run starts, the harness assembles the system prompt in layers — broadest context first, most specific last:

┌─ basePrompt ────────────── your spec.systemPrompt
├─ ## Project instructions ─ ambient: AGENTS.md / CLAUDE.md chain
├─ ## Available skills ───── skill names + descriptions
└─ ## Recalled memory ────── learned: scoped summary index

Instruction-file bodies are injected whole. Learned-memory entries appear only as one-line summaries — the agent calls recall to pull a full body. That keeps the prompt cheap no matter how much the agent has learned.
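The layering can be pictured as a pure function over four inputs. This is a minimal sketch, not the harness's actual code: `PromptLayers`, `assembleSystemPrompt`, the bullet formatting, and the choice to omit empty sections are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the real harness types are not shown in this document.
interface PromptLayers {
  basePrompt: string;
  instructionFiles: { path: string; body: string }[]; // ordered root-first
  skills: { name: string; description: string }[];
  memoryIndex: { id: string; summary: string }[];
}

function assembleSystemPrompt(layers: PromptLayers): string {
  // Broadest context first: the base prompt always leads.
  const sections: string[] = [layers.basePrompt];

  if (layers.instructionFiles.length > 0) {
    // Instruction-file bodies are injected whole.
    const bodies = layers.instructionFiles.map((f) => `### ${f.path}\n${f.body}`);
    sections.push(["## Project instructions", ...bodies].join("\n\n"));
  }
  if (layers.skills.length > 0) {
    const lines = layers.skills.map((s) => `- ${s.name}: ${s.description}`);
    sections.push(["## Available skills", ...lines].join("\n"));
  }
  if (layers.memoryIndex.length > 0) {
    // Learned memory contributes one summary line per entry, never the body.
    const lines = layers.memoryIndex.map((m) => `- [${m.id}] ${m.summary}`);
    sections.push(["## Recalled memory", ...lines].join("\n"));
  }
  return sections.join("\n\n");
}
```

Because the memory layer only ever adds one line per entry, prompt growth is bounded by the index cap rather than by the size of the stored bodies.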

Three scopes

| Scope | Persists across | Visible to |
|---|---|---|
| global | every run, every agent | all agents |
| agent | every run of a given agent id | only that agent |
| run | one specific run (and pause/resume) | only that runId, while active |

run-scoped entries stay on disk after the run terminates (for forensics) but are no longer reachable via recall once a different runId is active.
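The visibility rules above reduce to one predicate. The sketch below is illustrative only: `Entry`, `Caller`, and `isVisible` are hypothetical names, and the real adapter may store scope ownership differently.

```typescript
type Scope = "global" | "agent" | "run";

// Hypothetical entry shape: only the fields needed to decide visibility.
interface Entry {
  scope: Scope;
  agentId?: string; // owner, for agent-scoped entries
  runId?: string;   // owner, for run-scoped entries
  summary: string;
}

interface Caller {
  agentId: string;
  runId?: string; // undefined when no run is active
}

function isVisible(entry: Entry, caller: Caller): boolean {
  switch (entry.scope) {
    case "global":
      return true; // every agent, every run
    case "agent":
      return entry.agentId === caller.agentId; // private to one agent id
    case "run":
      // Reachable only while the same runId is active.
      return caller.runId !== undefined && entry.runId === caller.runId;
  }
}
```

A run-scoped entry from an earlier run fails the last check as soon as a different runId is active, which matches the forensics-only behaviour described above.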

SKILL.md vs memory

  • Skills are author-curated content shipped with the project. They don’t change at runtime.
  • Memory is model-curated content learned during runs. It changes every time an agent crosses a “worth remembering” bar.

Wire it on

ts
import { defineAgent } from "@drover/core";
import { createMarkdownMemory } from "@drover/memory";
import { runAgent } from "@drover/facade";

const writer = defineAgent({
  id: "writer",
  /* ... */
  memory: {
    enabled: true,
    includeIndex: true,        // default — auto-inject summary index
    writesPerTurn: 1,          // default — rate-limit `remember`
    allowForget: false,        // default — `forget` is off
  },
});

const memory = await createMarkdownMemory({ root: "./memory" });

runAgent(writer, input, { memory });

When spec.memory.enabled === true and deps.memory is wired:

  1. The harness auto-injects remember and recall (and forget if allowForget).
  2. It appends a ## Recalled memory block to the system prompt listing global + agent-scope summaries (capped by maxIndexEntries, default 30).
  3. It applies memoryRateLimitPlugin({ writesPerTurn }) so a runaway model can’t spam the store.
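To make the third step concrete, here is a minimal sketch of a per-turn write budget. It is a stand-in for the behaviour attributed to memoryRateLimitPlugin, not its implementation: the `WriteLimiter` class and its method names are hypothetical.

```typescript
// Hypothetical per-turn write budget, illustrating `writesPerTurn`.
class WriteLimiter {
  private writesThisTurn = 0;
  private readonly writesPerTurn: number;

  constructor(writesPerTurn: number) {
    this.writesPerTurn = writesPerTurn;
  }

  /** Reset the budget; call at the start of every model turn. */
  startTurn(): void {
    this.writesThisTurn = 0;
  }

  /** Returns true if a `remember` call may proceed, false if the budget is spent. */
  tryWrite(): boolean {
    if (this.writesThisTurn >= this.writesPerTurn) return false;
    this.writesThisTurn += 1;
    return true;
  }
}
```

With the default of 1, a second `remember` in the same turn is rejected and the budget is restored on the next turn.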

Composition patterns

One adapter, wired per run, backs whatever set of agents you point at it. How you compose depends on how much agents should share.

Single agent, durable

The common case — one agent, file-backed memory, instructions from the repo:

ts
const memory = await createMarkdownMemory({ root: "./.memory" });

const agent = defineAgent({
  id: "writer",
  /* ... */
  memory: { enabled: true },
  instructionFiles: {},   // {} = all defaults
});

runAgent(agent, input, { memory });

Many agents, one store

Pass the same adapter to every run. global entries are shared; agent-scoped entries stay private to each spec.id. No extra wiring — scope isolation is automatic.

ts
const memory = await createMarkdownMemory({ root: "./.memory" });

runAgent(researcher, input, { memory });   // sees global + researcher's own
runAgent(writer, input, { memory });       // sees global + writer's own

Use global for cross-agent facts (“the operator prefers metric units”); let each agent accrue its own agent-scoped lessons.

Private memory per agent

For hard isolation — no shared global — give each agent its own adapter rooted at its own directory:

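ts
const researcherMem = await createMarkdownMemory({ root: "./.memory/researcher" });
const writerMem = await createMarkdownMemory({ root: "./.memory/writer" });

// Each run gets its own adapter, so nothing is shared, not even global entries.
runAgent(researcher, input, { memory: researcherMem });
runAgent(writer, input, { memory: writerMem });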

Ephemeral (tests, throwaway runs)

createInMemoryMemory() is a drop-in adapter with no disk footprint — use it in tests or short-lived processes:

ts
runAgent(agent, input, { memory: createInMemoryMemory() });

Subagents

Subagents spawned via taskTool inherit the parent’s deps — including the memory adapter. A child agent reads global plus its own agent-scoped entries (keyed by the child’s spec.id), never the parent’s. Compose shared knowledge at global if you want it to reach the whole tree.

Choosing what goes where

When you (or the agent) have a fact to persist, the scope and source follow from two questions — who wrote it and how long it should live:

| The fact is… | Put it in… |
|---|---|
| A house rule you maintain | an AGENTS.md / CLAUDE.md file (instructionFiles) |
| True for every agent, learned at runtime | global scope |
| Specific to one agent’s behaviour | agent scope |
| Only relevant to the current task | run scope |

The kind, separately, tags what type of fact it is: user, feedback, project, or reference (see Kinds). Scope is lifetime; kind is category. They’re independent: a feedback memory can be global or agent-scoped.

Rule of thumb: if you would commit it to the repo, it’s an instruction file. If the agent discovers it mid-run, it’s learned memory — and the scope is “how broadly does this apply.”

The self-learning bar

remember is gated by the model’s own judgement, not by automatic capture. The tool description nudges:

Save a memory only when the lesson is non-obvious AND will apply to future runs. Lead the body with the rule, then a Why: line and a How to apply: line. Don’t save what’s already in the code or commit log.

Pair this with the rate limit (writesPerTurn, default 1 per turn) so the agent has to choose what is worth saving.

Kinds

The kinds are borrowed from the auto-memory schema, a short taxonomy that’s easy to filter on:

  • user — facts about the operator
  • feedback — behavioural rules from corrections / wins
  • project — state about ongoing work
  • reference — pointers to external systems

Progressive disclosure

  1. System prompt — summaries only (one line per entry, capped).
  2. Activation — model calls recall(query=...); full bodies return.
  3. Update — model calls remember(id=..., ...) to overwrite.

If a memory turns out wrong, the model can update it in place (id preserves the original createdAt) instead of writing a contradicting new one.
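The update-in-place behaviour can be sketched as an upsert keyed on id. This is an illustration under assumptions: the `MemoryEntry` fields other than `id` and `createdAt`, the `updatedAt` timestamp, and the generated-id format are hypothetical.

```typescript
// Hypothetical entry shape; only `id` and `createdAt` are named in the docs.
interface MemoryEntry {
  id: string;
  summary: string;
  body: string;
  createdAt: number;
  updatedAt: number; // assumed field, for illustration
}

function remember(
  store: Map<string, MemoryEntry>,
  input: { id?: string; summary: string; body: string },
  now: number,
): MemoryEntry {
  // Without an id this is a fresh entry; with one, it overwrites in place.
  const id = input.id ?? `mem-${store.size + 1}`;
  const existing = store.get(id);
  const entry: MemoryEntry = {
    id,
    summary: input.summary,
    body: input.body,
    createdAt: existing?.createdAt ?? now, // an overwrite keeps the original timestamp
    updatedAt: now,
  };
  store.set(id, entry);
  return entry;
}
```

Overwriting by id leaves one entry in the store, so the summary index never carries both the wrong lesson and its correction.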

Scope inference

recall infers the default scope from context: global + agent when no runId is active, all three when there is one. Pass scopes explicitly to override. Agent-scoped reads always filter to the calling agent — no cross-agent leaks.
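As a sketch, the inference rule is a two-branch default with an explicit override. `resolveScopes` and its parameter shapes are hypothetical names chosen for illustration.

```typescript
type Scope = "global" | "agent" | "run";

// Hypothetical helper mirroring the rule: explicit scopes win, otherwise
// include "run" only when a run is active.
function resolveScopes(ctx: { runId?: string }, explicit?: Scope[]): Scope[] {
  if (explicit !== undefined) return explicit;
  return ctx.runId === undefined
    ? ["global", "agent"]
    : ["global", "agent", "run"];
}
```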

Tags

Optional tags?: string[] on every entry. recall({ tags: [...] }) returns matches when ANY tag overlaps (set union, not intersection). Tag hits also give a small BM25 score boost so curated tags rank above incidental text matches.
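A rough sketch of any-overlap tag matching with a score boost follows. The boost size, the additive per-hit formula, and the `Scored` shape are assumptions; the document only says the boost is small and that tag hits should outrank incidental text matches.

```typescript
// `textScore` stands in for whatever BM25 score the text query produced.
interface Scored {
  id: string;
  tags: string[];
  textScore: number;
}

const TAG_BOOST = 0.5; // hypothetical value, chosen only for the example

function recallByTags(
  entries: Scored[],
  wanted: string[],
): { id: string; score: number }[] {
  const wantedSet = new Set(wanted);
  return entries
    .map((e) => ({ e, hits: e.tags.filter((t) => wantedSet.has(t)).length }))
    .filter(({ hits }) => hits > 0) // ANY overlapping tag is enough to match
    .map(({ e, hits }) => ({ id: e.id, score: e.textScore + TAG_BOOST * hits }))
    .sort((a, b) => b.score - a.score);
}
```

An entry matching two requested tags gets twice the boost in this sketch; a real implementation might cap or weight the boost differently.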

Events

Two new HarnessEvent kinds surface every write and recall:

  • memory_written — emitted after remember succeeds.
  • memory_recalled — emitted after recall; lists hit ids + scores.

The eval-viewer renders both on the run timeline. Tracers and observers see them just like any other harness event.
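An observer might handle the two event kinds like this. The payload fields (`kind`, `id`, `hits`) are guesses for illustration; only the event names and the fact that recalls list hit ids and scores come from the text above.

```typescript
// Hypothetical payload shapes for the two memory events.
type MemoryEvent =
  | { kind: "memory_written"; id: string }
  | { kind: "memory_recalled"; hits: { id: string; score: number }[] };

// Render one timeline line per event, narrowing on the discriminant.
function describeMemoryEvent(event: MemoryEvent): string {
  switch (event.kind) {
    case "memory_written":
      return `wrote ${event.id}`;
    case "memory_recalled":
      return `recalled ${event.hits
        .map((h) => `${h.id} (${h.score.toFixed(2)})`)
        .join(", ")}`;
  }
}
```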

Adapters

drover ships two:

  • createMarkdownMemory({ root }) — files on disk under <root>/global/, <root>/agents/<agentId>/, <root>/runs/<runId>/. Each file is YAML frontmatter + markdown body. Git-friendly. Single writer per process.
  • createInMemoryMemory() — Map-backed, no persistence. For tests and ephemeral runs.

Custom adapters implement MemoryAdapter.
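For orientation, the on-disk layout described for createMarkdownMemory maps scopes to directories roughly as follows. `scopeDir` and `Target` are hypothetical helpers, not part of the adapter's API; file naming inside each directory is not specified here.

```typescript
// Hypothetical description of where an entry belongs.
type Target =
  | { scope: "global" }
  | { scope: "agent"; agentId: string }
  | { scope: "run"; runId: string };

function scopeDir(root: string, target: Target): string {
  const base = root.replace(/\/+$/, ""); // drop trailing slashes
  switch (target.scope) {
    case "global":
      return `${base}/global`;
    case "agent":
      return `${base}/agents/${target.agentId}`;
    case "run":
      return `${base}/runs/${target.runId}`;
  }
}
```

A custom adapter backed by a database could keep the same three-way split as a scope column plus an owner id.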

Instruction files (AGENTS.md / CLAUDE.md)

Most projects already keep AGENTS.md / CLAUDE.md files describing how agents should behave — at the repo root and in subdirectories. drover can load them as ambient memory: read-only context the agent always sees, distinct from the learned entries above.

This is fully opt-in. Set spec.instructionFiles and the harness:

  1. Discovers instruction files along the ancestor chain — from the repo root down to the run’s cwd. A subdirectory’s file is loaded only when the agent works in (or under) that directory.
  2. Injects them into the system prompt, root-first, fresh every run.
  3. If a memory adapter is wired, also seeds them as read-only reference entries (tag instructions) so recall reaches them.

ts
const writer = defineAgent({
  id: "writer",
  /* ... */
  instructionFiles: {
    // defaults shown
    filenames: ["AGENTS.md", "CLAUDE.md"],
    // root: auto-detected via the nearest .git ancestor of cwd
    maxBytesPerFile: 16384,
    seedMemory: true,
  },
});

For cwd = packages/api, the loaded chain is:

./AGENTS.md              ← repo root
./packages/AGENTS.md
./packages/api/CLAUDE.md ← run cwd

./packages/web/AGENTS.md is off-path and not loaded.

No filename is privileged — AGENTS.md / CLAUDE.md are just the default filenames. Point it at any set (["RULES.md", "team-conventions.md"]).
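Ancestor-chain discovery can be sketched as a walk from the repo root down to cwd, checking each configured filename in each directory. This is an illustration, not the loader itself: `instructionChain` is a hypothetical name, paths are taken relative to the repo root, and the sketch assumes that every matching filename in a directory is loaded.

```typescript
function instructionChain(
  cwdFromRoot: string,               // e.g. "packages/api"
  filenames: string[],               // e.g. ["AGENTS.md", "CLAUDE.md"]
  exists: (path: string) => boolean, // injected so the sketch needs no filesystem
): string[] {
  const parts = cwdFromRoot.split("/").filter((s) => s !== "" && s !== ".");

  // Directories from the root down to cwd: ".", "./packages", "./packages/api".
  const dirs: string[] = ["."];
  for (let i = 1; i <= parts.length; i++) {
    dirs.push("./" + parts.slice(0, i).join("/"));
  }

  // Root-first, so more specific files land later in the prompt.
  const chain: string[] = [];
  for (const dir of dirs) {
    for (const name of filenames) {
      const path = `${dir}/${name}`;
      if (exists(path)) chain.push(path);
    }
  }
  return chain;
}

const onDisk = new Set([
  "./AGENTS.md",
  "./packages/AGENTS.md",
  "./packages/api/CLAUDE.md",
  "./packages/web/AGENTS.md", // off-path for cwd = packages/api
]);
const chain = instructionChain("packages/api", ["AGENTS.md", "CLAUDE.md"], (p) => onDisk.has(p));
// chain is ["./AGENTS.md", "./packages/AGENTS.md", "./packages/api/CLAUDE.md"]
```

Sibling directories are never visited, which is why ./packages/web/AGENTS.md stays out of the chain.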

Read-only

Seeded instruction entries carry the instructions tag. forget refuses them — they mirror files on disk; edit the file, not the memory. Each run re-seeds from disk, so the store always reflects current file content.
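The refusal can be pictured as a guard on the instructions tag. A minimal sketch, assuming a hypothetical `forget` signature and result shape; the real tool's error format is not documented here.

```typescript
interface Entry {
  id: string;
  tags: string[];
}

type ForgetResult = { ok: true } | { ok: false; reason: string };

function forget(store: Map<string, Entry>, id: string): ForgetResult {
  const entry = store.get(id);
  if (entry === undefined) return { ok: false, reason: "not found" };
  if (entry.tags.indexOf("instructions") !== -1) {
    // Seeded entries mirror files on disk, so deletion is refused.
    return { ok: false, reason: "read-only: edit the instruction file instead" };
  }
  store.delete(id);
  return { ok: true };
}
```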

Manual use

The loader is also exported standalone for full control — bespoke block formats, custom discovery, or files outside the ancestor chain:

ts
import { Effect } from "effect";   // assumed source of Effect.runPromise
import {
  loadInstructionFiles,
  renderInstructionsBlock,
  seedInstructionFiles,
} from "@drover/memory";

const files = await loadInstructionFiles({ cwd, root, filenames: ["RULES.md"] });
const block = renderInstructionsBlock(files);   // → system-prompt markdown
await Effect.runPromise(seedInstructionFiles(memory, files));

Resume contract

The memory settings (enabled, includeIndex, maxIndexEntries, allowForget, writesPerTurn) and instructionFiles config all go into hashSpec. A paused run resumed under different memory settings fails with ResumeError: spec hash mismatch — same contract as the other policy fields, so you can’t accidentally replay a run under a different memory policy. (File content isn’t hashed — only the config — same boundary as skills.)
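The contract amounts to hashing the config and comparing on resume. The sketch below shows the idea only: the real hashSpec algorithm is not described in this document, so the key-sorted serialization and the FNV-1a hash are stand-ins, and `assertResumable` is a hypothetical helper.

```typescript
// Serialize with sorted keys so key order does not change the hash.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
    const fields = keys.map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${fields.join(",")}}`;
  }
  return JSON.stringify(value);
}

// 32-bit FNV-1a, used here only as a compact stand-in hash.
function hashSpec(spec: unknown): string {
  const text = stableStringify(spec);
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Refuse to resume when the current config hashes differently.
function assertResumable(pausedHash: string, currentSpec: unknown): void {
  if (hashSpec(currentSpec) !== pausedHash) {
    throw new Error("ResumeError: spec hash mismatch");
  }
}
```

Only config values feed the hash in this sketch, mirroring the boundary above: editing an AGENTS.md body changes nothing here, while changing writesPerTurn does.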
