First Memory Deep Dive
Follow a single claim through every pipeline layer and understand the data transformations that make epistemic memory work.
The Claim
A user sends in chat:
"Tôi thích uống cà phê đen mỗi sáng"
(I like drinking black coffee every morning)
The AI agent decides this is worth remembering and calls epistemic_store.
L0: Policy Firewall
The claim is first classified by type:
| Check | Result |
|---|---|
| Is PII? | No (food preference) |
| Is garbage? | No |
| Claim type | behavioral |
| Policy verdict | PASS |
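The checks above can be sketched as a small gate function. Everything below is an illustrative assumption — the PII patterns, the type heuristic, and the function name are not the pipeline's real rules:

```typescript
// Illustrative L0 gate: PII patterns and type heuristics are assumptions,
// not the pipeline's actual policy rules.
type PolicyVerdict = { pass: boolean; claimType: string };

const PII_PATTERNS = [
  /\b\d{3}-\d{2}-\d{4}\b/,        // SSN-like identifier
  /\b[\w.+-]+@[\w-]+\.[\w.]+\b/,  // email address
];

function policyFirewall(claim: string): PolicyVerdict {
  if (PII_PATTERNS.some((re) => re.test(claim))) {
    return { pass: false, claimType: "pii" };
  }
  if (claim.trim().length < 3) {
    return { pass: false, claimType: "garbage" };
  }
  // Habitual phrasing ("every", "mỗi") suggests a behavioral claim.
  const behavioral = /\b(every|each)\b|mỗi/i.test(claim);
  return { pass: true, claimType: behavioral ? "behavioral" : "declarative" };
}

policyFirewall("Tôi thích uống cà phê đen mỗi sáng");
// → { pass: true, claimType: "behavioral" }
```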
L0.5: Sentence Classifier
Fast regex checks filter non-storable content:
- Not a greeting ("xin chào", "hello") → ✅
- Not a question → ✅
- Not a command ("hãy làm", "please do") → ✅
- Contains factual assertion → ✅ proceed to L1
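A minimal sketch of these fast-path filters, with illustrative patterns rather than the classifier's actual rule set:

```typescript
// Illustrative L0.5 fast-path patterns (assumed, not the real classifier).
const GREETING = /^\s*(xin chào|chào|hello|hi|hey)\b/i;
const QUESTION = /\?\s*$/;
const COMMAND  = /^\s*(hãy|please|làm ơn)\b/i;

// True when the sentence may carry a storable factual assertion.
function isStorable(sentence: string): boolean {
  return !GREETING.test(sentence) &&
         !QUESTION.test(sentence) &&
         !COMMAND.test(sentence);
}

isStorable("xin chào!");                          // → false (greeting)
isStorable("Tôi thích uống cà phê đen mỗi sáng"); // → true
```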
L1: Claim Normalizer
The Vietnamese text is parsed into a structured triple:
```json
{
  "subject": "user",
  "predicate": "likes drinking",
  "object": "black coffee every morning",
  "kind": "behavioral",
  "decayClass": "STABLE"
}
```

The normalizer tries a regex fast path first (handling roughly 70% of patterns), then falls back to LLM extraction for complex sentences.
L2: Confidence Scorer
Confidence is computed via a multi-factor sigmoid function:
confidence = σ(α·source + β·corroboration − γ·conflict + δ·kind)
| Factor | Value | Weight | Contribution |
|---|---|---|---|
| source = user_explicit | 1.0 | α = 0.4 | +0.40 |
| corroboration | 0 (first mention) | β = 0.2 | +0.00 |
| conflict | 0 (none found) | γ = 0.3 | -0.00 |
| kind = behavioral | 0.8 | δ = 0.1 | +0.08 |
Weighted sum: 0.40 + 0.00 − 0.00 + 0.08 = 0.48 → σ(0.48) ≈ 0.618 → after final rounding and adjustment, the stored confidence is 0.741.
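The scoring step can be reproduced directly from the table's weights. The helper name is illustrative, and the pipeline's final adjustment from 0.618 to 0.741 is not modeled here:

```typescript
// The L2 formula with the weights from the table above (α=0.4, β=0.2,
// γ=0.3, δ=0.1). The final adjustment step (0.618 → 0.741) is not modeled.
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

function scoreConfidence(f: {
  source: number; corroboration: number; conflict: number; kind: number;
}): number {
  const raw =
    0.4 * f.source +        // α·source
    0.2 * f.corroboration - // β·corroboration
    0.3 * f.conflict +      // γ·conflict (subtracted)
    0.1 * f.kind;           // δ·kind
  return sigmoid(raw);
}

scoreConfidence({ source: 1.0, corroboration: 0, conflict: 0, kind: 0.8 });
// → ≈ 0.618 (raw score 0.48)
```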
L3: Conflict Detector
Vector and keyword search checks for contradictions:
- Search: "user likes drinking black coffee every morning" → 0 existing matches
- No contradictions found → claim proceeds cleanly
- Entropy delta: +0.00
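A minimal sketch of the vector half of this check, assuming cosine similarity over stored claim embeddings. The function names and the 0.85 threshold are assumptions for illustration:

```typescript
// Sketch of the L3 vector probe: cosine similarity against stored claim
// embeddings. Names and the 0.85 threshold are illustrative assumptions.
interface StoredClaim { claim: string; vec: number[] }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function findCandidateConflicts(queryVec: number[], stored: StoredClaim[],
                                threshold = 0.85): string[] {
  return stored
    .filter((m) => cosine(queryVec, m.vec) >= threshold)
    .map((m) => m.claim);
}

// First mention: nothing stored yet, so the search returns no candidates.
findCandidateConflicts([0.1, 0.9], []); // → []
```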
L4: Embedding + Storage
The claim is embedded and stored in LanceDB:
```js
// 36-column record written to LanceDB
{
  id: "mem_a1b2c3d4",
  claim: "Tôi thích uống cà phê đen mỗi sáng",
  subject: "user",
  predicate: "likes drinking",
  object: "black coffee every morning",
  kind: "behavioral",
  confidence: 0.741,
  tier: "WORKING",
  decayClass: "STABLE",
  source: "user_explicit",
  channelId: "telegram:123456",
  storedAt: "2026-01-15T10:30:00.000Z",
  lastAccessed: "2026-01-15T10:30:00.000Z",
  vector: Float32Array[1536],
  // ... 22 more columns
}
```

L5: Tier Router
The confidence score determines the memory's tier:
| Tier | Confidence Range | Behavior |
|---|---|---|
| QUARANTINE | < 0.30 | Hidden, never injected |
| CANDIDATE | 0.30 – 0.49 | Available on search only |
| WORKING | 0.50 – 0.89 | Auto-injected into prompts |
| FACT | ≥ 0.90 | Permanent, high-priority injection |
Our coffee claim has confidence 0.741 → WORKING tier. It will be auto-injected into future conversations.
Use epistemic_promote to manually promote a memory to the FACT tier, or epistemic_demote to drop it back.
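The tier thresholds translate directly into a routing function; the function name is illustrative, the boundaries come from the table above:

```typescript
// The tier table as a routing function (boundaries per the table above;
// the function name is an illustrative assumption).
type Tier = "QUARANTINE" | "CANDIDATE" | "WORKING" | "FACT";

function routeTier(confidence: number): Tier {
  if (confidence >= 0.90) return "FACT";
  if (confidence >= 0.50) return "WORKING";
  if (confidence >= 0.30) return "CANDIDATE";
  return "QUARANTINE";
}

routeTier(0.741); // → "WORKING"
```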
Summary
A single claim traverses seven processing stages in under 200 ms, resulting in a richly annotated memory with confidence scoring, decay classification, conflict checking, and automatic tier assignment. This is what makes epistemic memory fundamentally different from key-value storage.