Agent Memory

@betterdb/agent-memory is standalone agent memory for Valkey: the short-term caching tiers from @betterdb/agent-cache plus a semantic long-term MemoryStore backed by Valkey Search (FT.*).

Where the cache tiers are exact-match and ephemeral, the memory tier is semantic and durable: store memories with remember(), retrieve the most relevant ones with recall() (semantic similarity blended with recency and importance), and keep stores bounded with TTLs, capacity eviction, and consolidate().

Prerequisites

  • Valkey 8.0+ with the valkey-search module loaded (for the FT.* commands)
  • Or Amazon ElastiCache for Valkey (8.0+)
  • Or Google Cloud Memorystore for Valkey
  • Node.js >= 20
  • An embedding function you provide

Installation

npm install @betterdb/agent-memory iovalkey

iovalkey is a peer dependency - install it alongside the package.

Quick start

The AgentMemory facade wires the short-term cache tiers and the long-term memory store together over a single client and name:

import Valkey from 'iovalkey';
import { AgentMemory } from '@betterdb/agent-memory';

const client = new Valkey('redis://localhost:6379');

const agent = new AgentMemory({
  client,
  name: 'my_agent',
  // embed() stands for your own embedding call; it must resolve to number[].
  embedFn: async (text) => embed(text),
});

// Create the vector index and register discovery markers (idempotent).
await agent.initialize();

// Long-term memory:
await agent.memory.remember('User prefers dark mode', {
  agentId: 'assistant',
  importance: 0.8,
  tags: ['preferences'],
});

const hits = await agent.memory.recall('what theme does the user like?', {
  agentId: 'assistant',
  k: 5,
});

// Short-term cache tiers (from @betterdb/agent-cache):
agent.llm;
agent.tool;
agent.session;

await agent.close();

You can also use the MemoryStore directly, without the cache tiers:

import { MemoryStore } from '@betterdb/agent-memory';

// `client` and `embedFn` are the same client and embedding function as in the quick start.
const memory = new MemoryStore({ client, name: 'my_agent', embedFn });
await memory.ensureIndex();
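
Both snippets above assume an embedFn already exists. For wiring things up locally before a real model is available, a stand-in along the following lines may be enough. This is only a sketch: toyEmbed, hashToken, and DIM are hypothetical names, and the one thing taken from this page is the shape of embedFn (async, text in, number[] out). A hashed bag-of-words vector is not semantic, so as the Scoring and capacity section warns, recall may return nothing with it unless the threshold is raised.

```typescript
// Hypothetical stand-in for a real embedding model (not part of the package).
const DIM = 64; // arbitrary vector size chosen for this sketch

// FNV-1a hash of a token, returned as an unsigned 32-bit integer.
function hashToken(token: string): number {
  let h = 2166136261;
  for (let i = 0; i < token.length; i++) {
    h ^= token.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return h >>> 0;
}

// Hashed bag-of-words: each word increments one bucket, then the vector
// is L2-normalized. Text with no words yields an all-zero vector.
function toyEmbed(text: string): number[] {
  const vec = new Array<number>(DIM).fill(0);
  const tokens = text.toLowerCase().split(/\W+/).filter(Boolean);
  for (const t of tokens) {
    vec[hashToken(t) % DIM] += 1;
  }
  const norm = Math.hypot(...vec) || 1;
  return vec.map((v) => v / norm);
}

// Matches the embedFn shape used on this page: async, returns number[].
const embedFn = async (text: string): Promise<number[]> => toyEmbed(text);
```

Swap in a real embedding model before relying on recall quality; this stand-in only matches texts that share exact words.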

MemoryStore API

Method | Description
ensureIndex() | Create the {name}:mem:idx vector index if absent (idempotent). Resolves the vector dimension from embedFn.
remember(content, options?) | Embed and store a memory; returns its id. Options: importance (0..1), tags, ttl (seconds), and scope (threadId, agentId, namespace).
recall(query, options?) | Semantic search scoped by threadId/agentId/namespace/tags, ranked by a composite of similarity, recency (half-life decay), and importance. Returns MemoryHit[]. Recalled memories are reinforced (last-access + access-count bumped) unless reinforce: false.
forget(id) | Delete a single memory by id.
forgetByScope(scope) | Delete all memories matching a scope and/or tags.
consolidate(options) | Summarize a set of memories (via a summarize callback) into one new memory and optionally delete the sources. Select candidates by scope, tags, olderThanSeconds, or maxImportance.
currentConfig() / refreshConfig() | Read the live recall/eviction tunables; with configRefresh enabled the store periodically re-reads them from {name}:__mem_config.
close() | Stop the config-refresh timer and tear down discovery heartbeats.

Scoring and capacity

Recall ranks by compositeScore - a weighted blend of similarity, recency (true half-life decay), and importance. The defaults are tunable via MemoryStoreOptions (weights, halfLifeSeconds, defaultThreshold) or live via config refresh. Set maxItemsPerScope to cap the number of memories per scope; a write that pushes a scope over the cap evicts the lowest-scoring items, ranked by importance and recency.
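
To make the blend concrete, here is a minimal sketch of how such a score and the capacity eviction could be computed. It is not the package's implementation: the weight values, the linear combination, the eviction ranking, and the names ScoreWeights, exampleWeights, recencyScore, StoredItem, and pickEvictions are all assumptions made for illustration. Only the idea of a weighted blend with half-life decay comes from the description above.

```typescript
// Illustrative only: weights and formulas are assumptions for this sketch.
interface ScoreWeights {
  similarity: number;
  recency: number;
  importance: number;
}

// Hypothetical weights; the package's real defaults are not shown here.
const exampleWeights: ScoreWeights = { similarity: 0.6, recency: 0.2, importance: 0.2 };

// Half-life decay: the recency term halves every `halfLifeSeconds`.
function recencyScore(ageSeconds: number, halfLifeSeconds: number): number {
  return Math.pow(0.5, Math.max(0, ageSeconds) / halfLifeSeconds);
}

// Weighted blend of similarity (0..1), recency (0..1), and importance (0..1).
function compositeScore(
  similarity: number,
  ageSeconds: number,
  importance: number,
  halfLifeSeconds: number,
  weights: ScoreWeights = exampleWeights,
): number {
  return (
    weights.similarity * similarity +
    weights.recency * recencyScore(ageSeconds, halfLifeSeconds) +
    weights.importance * importance
  );
}

interface StoredItem {
  id: string;
  importance: number;
  ageSeconds: number;
}

// Returns the ids to evict so that at most `maxItems` remain, dropping the
// items with the lowest importance + recency first.
function pickEvictions(items: StoredItem[], maxItems: number, halfLifeSeconds: number): string[] {
  const overflow = items.length - maxItems;
  if (overflow <= 0) return [];
  const keepScore = (m: StoredItem) => m.importance + recencyScore(m.ageSeconds, halfLifeSeconds);
  return [...items]
    .sort((a, b) => keepScore(a) - keepScore(b))
    .slice(0, overflow)
    .map((m) => m.id);
}
```

With a one-hour half-life, a memory last touched an hour ago contributes half the recency of a fresh one, so an old but important memory can still outrank a recent trivial one.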

recall only returns candidates whose cosine distance is within a threshold (default 0.25, i.e. similarity >= ~0.875). The default is tuned for real semantic embeddings, where a relevant memory lands well inside it. A weak or non-semantic embedFn can push every candidate past the threshold and yield no hits; if that happens, raise the threshold per call (recall(query, { threshold })) or globally (defaultThreshold).
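
The relationship between the distance threshold and the similarity figure can be sketched as follows. The mapping similarity = 1 - distance / 2 is an assumption: it treats cosine distance as ranging over 0..2 and happens to reproduce the 0.875 figure for a distance of 0.25, but this page does not state which formula the package uses. Candidate, distanceToSimilarity, and withinThreshold are hypothetical names.

```typescript
// Assumption: cosine distance lies in [0, 2] and is rescaled to a 0..1
// similarity as 1 - distance / 2. Not confirmed as the package's formula.
function distanceToSimilarity(distance: number): number {
  return 1 - distance / 2;
}

interface Candidate {
  id: string;
  distance: number; // cosine distance reported by the vector search
}

// Keep only candidates at or inside the distance threshold.
// A larger threshold is more permissive.
function withinThreshold(candidates: Candidate[], threshold = 0.25): Candidate[] {
  return candidates.filter((c) => c.distance <= threshold);
}

const candidates: Candidate[] = [
  { id: 'close', distance: 0.1 },
  { id: 'far', distance: 0.4 },
];

const strict = withinThreshold(candidates);        // default 0.25 keeps only 'close'
const relaxed = withinThreshold(candidates, 0.5);  // raised threshold keeps both
```

Raising the threshold trades precision for recall: more candidates pass the filter and are then ordered by the composite score.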

Observability

Set telemetry: { registry } to register Prometheus metrics (agent_memory_*: items, recall total/hits/empty/latency, embedding calls, evictions, consolidations) and OpenTelemetry spans for each operation. With discovery enabled (default in the facade), the store publishes a marker to the shared __betterdb:caches registry so BetterDB Monitor can auto-discover it.

See also

  • Agent Memory (Python) - the Python port with the same surface and data format.
  • Agent Cache - the short-term llm/tool/session cache tiers bundled into the facade.
  • Retrieval - a lower-level vector retrieval SDK without the recency/importance scoring.