First Memory Deep Dive

Follow a single claim through every pipeline layer and understand the data transformations that make epistemic memory work.

The Claim

A user sends in chat:

"Tôi thích uống cà phê đen mỗi sáng"
(I like drinking black coffee every morning)

The AI agent decides this is worth remembering and calls epistemic_store.

L0: Policy Firewall

The claim is first classified by type:

Check            Result
Is PII?          No (food preference)
Is garbage?      No
Claim type       behavioral
Policy verdict   PASS
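In code, the L0 checks might look like the sketch below. The PII patterns, the garbage heuristic, and the `policyVerdict` name are illustrative assumptions for this walkthrough, not the pipeline's actual rule set:

```typescript
// Hypothetical L0 policy checks (illustrative assumptions, not the real rules).
const PII_PATTERNS = [
  /\b\d{3}-\d{2}-\d{4}\b/, // SSN-shaped number
  /\b\d{16}\b/,            // card-number-shaped number
];

function policyVerdict(text: string): "PASS" | "REJECT" {
  if (PII_PATTERNS.some((p) => p.test(text))) return "REJECT"; // PII check
  if (text.trim().length < 3) return "REJECT";                 // garbage check (simplified)
  return "PASS";
}
```

Our coffee claim contains no PII-shaped tokens and is non-trivial text, so it passes.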

L0.5: Sentence Classifier

Fast regex checks filter non-storable content:

  • Not a greeting ("xin chào", "hello") → ✅
  • Not a question → ✅
  • Not a command ("hãy làm", "please do") → ✅
  • Contains factual assertion → ✅ proceed to L1
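A minimal sketch of these fast-path filters, assuming regex rules like the examples above (the exact pattern lists and the `isStorable` name are assumptions):

```typescript
// Hypothetical L0.5 sentence filters; the real rule set is more extensive.
const GREETING = /\b(xin chào|hello|hi)\b/i;       // greetings carry no facts
const COMMAND = /\b(hãy làm|please do)\b/i;        // commands are instructions

function isStorable(text: string): boolean {
  if (GREETING.test(text)) return false;
  if (text.trim().endsWith("?")) return false;     // questions are not claims
  if (COMMAND.test(text)) return false;
  return true;                                     // treat the rest as assertions
}
```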

L1: Claim Normalizer

The Vietnamese text is parsed into a structured triple:

{
  "subject": "user",
  "predicate": "likes drinking",
  "object": "black coffee every morning",
  "kind": "behavioral",
  "decayClass": "STABLE"
}

The normalizer tries a regex fast path first (covering ~70% of patterns), then falls back to LLM extraction for complex sentences.
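The regex fast path can be sketched as follows. The pattern and the `normalizeFast` name are assumptions; for simplicity this sketch keeps the object in Vietnamese, whereas the real normalizer also produces the English form shown above:

```typescript
interface Triple { subject: string; predicate: string; object: string; }

// Hypothetical fast-path pattern for "Tôi thích uống/ăn ..." (I like drinking/eating ...).
const LIKE_PATTERN = /^Tôi thích (uống|ăn) (.+)$/i;

function normalizeFast(text: string): Triple | null {
  const m = LIKE_PATTERN.exec(text);
  if (!m) return null; // no fast-path match → caller falls back to LLM extraction
  const verb = m[1] === "uống" ? "drinking" : "eating";
  return { subject: "user", predicate: `likes ${verb}`, object: m[2] };
}
```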

L2: Confidence Scorer

Confidence is computed via a multi-factor sigmoid function:

confidence = σ(α·source + β·corroboration − γ·conflict + δ·kind)
Factor                   Value               Weight    Contribution
source = user_explicit   1.0                 α = 0.4   +0.40
corroboration            0 (first mention)   β = 0.2   +0.00
conflict                 0 (none found)      γ = 0.3   −0.00
kind = behavioral        0.8                 δ = 0.1   +0.08

Weighted sum: 0.40 + 0.00 − 0.00 + 0.08 = 0.48 → σ(0.48) ≈ 0.618 → rounded and adjusted: 0.741
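The formula above maps directly to code; weights are taken from the table, and the `rawConfidence` name is an assumption (the final rounding/adjustment step is not reproduced here):

```typescript
// Multi-factor sigmoid confidence, using the α/β/γ/δ weights from the table.
const sigmoid = (x: number) => 1 / (1 + Math.exp(-x));

function rawConfidence(
  source: number, corroboration: number, conflict: number, kind: number,
): number {
  const alpha = 0.4, beta = 0.2, gamma = 0.3, delta = 0.1;
  return sigmoid(alpha * source + beta * corroboration - gamma * conflict + delta * kind);
}
```

For our claim, `rawConfidence(1.0, 0, 0, 0.8)` gives σ(0.48) ≈ 0.618.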

L3: Conflict Detector

Vector and keyword search checks for contradictions:

  • Search: "user likes drinking black coffee every morning" → 0 existing matches
  • No contradictions found → claim proceeds cleanly
  • Entropy delta: +0.00
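The vector half of this search reduces to a nearest-neighbor similarity check. A minimal cosine-similarity helper, assuming plain number arrays rather than LanceDB's native search API:

```typescript
// Cosine similarity between two embedding vectors; a conflict detector
// would flag stored claims whose similarity exceeds some threshold.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```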

L4: Embedding + Storage

The claim is embedded and stored in LanceDB:

// 36-column record written to LanceDB
{
  id: "mem_a1b2c3d4",
  claim: "Tôi thích uống cà phê đen mỗi sáng",
  subject: "user",
  predicate: "likes drinking",
  object: "black coffee every morning",
  kind: "behavioral",
  confidence: 0.741,
  tier: "WORKING",
  decayClass: "STABLE",
  source: "user_explicit",
  channelId: "telegram:123456",
  storedAt: "2026-01-15T10:30:00.000Z",
  lastAccessed: "2026-01-15T10:30:00.000Z",
  vector: Float32Array[1536],
  // ... 22 more columns
}

L5: Tier Router

The confidence score determines the memory's tier:

Tier         Confidence Range   Behavior
QUARANTINE   < 0.30             Hidden, never injected
CANDIDATE    0.30 – 0.49        Available on search only
WORKING      0.50 – 0.89        Auto-injected into prompts
FACT         ≥ 0.90             Permanent, high-priority injection

Our coffee claim has confidence 0.741 → WORKING tier. It will be auto-injected into future conversations.
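The tier table translates directly into a threshold ladder; the thresholds below come from the table, while the `routeTier` name is an assumption:

```typescript
type Tier = "QUARANTINE" | "CANDIDATE" | "WORKING" | "FACT";

// Route a confidence score to a tier, checking from highest tier down.
function routeTier(confidence: number): Tier {
  if (confidence >= 0.90) return "FACT";
  if (confidence >= 0.50) return "WORKING";
  if (confidence >= 0.30) return "CANDIDATE";
  return "QUARANTINE";
}
```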

Tip: Use epistemic_promote to manually boost memories to FACT tier, or epistemic_demote to drop them back.

Summary

A single claim traverses 7 processing stages in under 200 ms, resulting in a richly annotated memory with confidence scoring, decay classification, conflict checking, and automatic tier assignment. This is what makes epistemic memory fundamentally different from key-value storage.