diff --git a/README.md b/README.md
index c62d5a8..1659ed4 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,16 @@ Works **alongside** `memory-core` (OpenClaw's built-in memory) — doesn't replace it.
 
+### Regex + LLM Hybrid (v0.2.0)
+
+By default, Cortex uses fast regex patterns (zero cost, instant). Optionally, you can plug in **any OpenAI-compatible LLM** for deeper analysis:
+
+- **Ollama** (local, free): `mistral:7b`, `qwen2.5:7b`, `llama3.1:8b`
+- **OpenAI**: `gpt-4o-mini`, `gpt-4o`
+- **OpenRouter / vLLM / any OpenAI-compatible API**
+
+The LLM runs **on top of regex** — it enhances, never replaces. If the LLM is down, Cortex falls back silently to regex-only.
+
 ## 🎬 Demo
 
 Try the interactive demo — it simulates a real bilingual dev conversation and shows every Cortex feature in action:
@@ -236,6 +246,52 @@ Add to your OpenClaw config:
 }
 ```
 
+### LLM Enhancement (optional)
+
+Add an `llm` section to enable AI-powered analysis on top of regex:
+
+```json
+{
+  "plugins": {
+    "openclaw-cortex": {
+      "enabled": true,
+      "llm": {
+        "enabled": true,
+        "endpoint": "http://localhost:11434/v1",
+        "model": "mistral:7b",
+        "apiKey": "",
+        "timeoutMs": 15000,
+        "batchSize": 3
+      }
+    }
+  }
+}
+```
+
+| Setting | Default | Description |
+|---------|---------|-------------|
+| `enabled` | `false` | Enable LLM enhancement |
+| `endpoint` | `http://localhost:11434/v1` | Any OpenAI-compatible API endpoint |
+| `model` | `mistral:7b` | Model identifier |
+| `apiKey` | `""` | API key (optional, for cloud providers) |
+| `timeoutMs` | `15000` | Timeout per LLM call |
+| `batchSize` | `3` | Messages to buffer before calling the LLM |
+
+**Examples:**
+
+```jsonc
+// Ollama (local, free)
+{ "endpoint": "http://localhost:11434/v1", "model": "mistral:7b" }
+
+// OpenAI
+{ "endpoint": "https://api.openai.com/v1", "model": "gpt-4o-mini", "apiKey": "sk-..." }
+
+// OpenRouter
+{ "endpoint": "https://openrouter.ai/api/v1", "model": "meta-llama/llama-3.1-8b-instruct", "apiKey": "sk-or-..." }
+```
+
+The LLM receives batches of messages and returns structured JSON: detected threads, decisions, closures, and mood. Results are merged with regex findings — the LLM can catch things regex misses (nuance, implicit decisions, context-dependent closures).
+
 Restart OpenClaw after configuring.
 
 ## How It Works
@@ -273,28 +329,49 @@ Thread and decision detection supports English, German, or both:
 
 - **Topic patterns**: "back to", "now about", "jetzt zu", "bzgl."
 - **Mood detection**: frustrated, excited, tense, productive, exploratory
 
+### LLM Enhancement Flow
+
+When `llm.enabled: true`:
+
+```
+message_received → regex analysis (instant, always)
+                 → buffer message
+                 → batch full? → LLM call (async, fire-and-forget)
+                 → merge LLM results into threads + decisions
+                 → LLM down? → silent fallback to regex-only
+```
+
+The LLM sees a conversation snippet (configurable batch size) and returns:
+- **Threads**: title, status (open/closed), summary
+- **Decisions**: what was decided, who, impact level
+- **Closures**: which threads were resolved
+- **Mood**: overall conversation mood
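+
+To make this concrete, here is a minimal TypeScript sketch of the flow under stated assumptions: the type and helper names (`LlmAnalysis`, `analyzeWithRegex`, `callLlm`, `mergeFindings`) are hypothetical, not Cortex's actual internals.
+
+```ts
+// Sketch only: names and shapes below are assumptions inferred from this README.
+type Thread = { title: string; status: "open" | "closed"; summary: string };
+type Decision = { what: string; who: string; impact: string }; // impact level, e.g. "high" (values assumed)
+
+interface LlmAnalysis {
+  threads: Thread[];
+  decisions: Decision[];
+  closures: string[]; // titles of threads the LLM considers resolved
+  mood: string;       // e.g. "productive", "frustrated"
+}
+
+declare function analyzeWithRegex(msg: string): void; // regex layer: instant, every message
+declare function callLlm(batch: string[], timeoutMs: number): Promise<LlmAnalysis>;
+declare function mergeFindings(result: LlmAnalysis): void; // enhance regex findings, never replace them
+
+const buffer: string[] = [];
+
+function onMessage(msg: string, cfg: { batchSize: number; timeoutMs: number }): void {
+  analyzeWithRegex(msg);                   // always runs, regardless of LLM availability
+  buffer.push(msg);
+  if (buffer.length < cfg.batchSize) return;
+
+  const batch = buffer.splice(0, buffer.length);
+  // Fire-and-forget: the hook returns immediately; the promise settles in the background.
+  callLlm(batch, cfg.timeoutMs)
+    .then(mergeFindings)
+    .catch(() => {
+      // LLM down or timed out → regex results already stand; nothing is lost.
+    });
+}
+```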
+
 ### Graceful Degradation
 
 - Read-only workspace → runs in-memory, skips writes
 - Corrupt JSON → starts fresh, next write recovers
 - Missing directories → creates them automatically
 - Hook errors → caught and logged, never crashes the gateway
+- LLM timeout/error → falls back to regex-only, no data loss
 
 ## Development
 
 ```bash
 npm install
-npm test              # 270 tests
+npm test              # 288 tests
 npm run typecheck     # TypeScript strict mode
 npm run build         # Compile to dist/
 ```
 
 ## Performance
 
-- Zero runtime dependencies (Node built-ins only)
-- All hook handlers are non-blocking (fire-and-forget)
+- Zero runtime dependencies (Node built-ins only — even LLM calls use `node:http`)
+- Regex analysis: instant, runs on every message
+- LLM enhancement: async, batched, fire-and-forget (never blocks hooks)
 - Atomic file writes via `.tmp` + rename
-- Tested with 270 unit + integration tests
+- Noise filter prevents garbage threads from polluting state
+- Tested with 288 unit + integration tests
 
 ## Architecture