Persistent AI memory via filesystem knowledge retrieval. BM25 search with query intelligence, cross-encoder reranking, and pluggable LLM answer extraction. Rust-accelerated regex pipeline runs in under 1 ms. Continuous consolidation turns episodic traces into semantic memories. Integrates with Claude Code and MCP-compatible tools.
Query classification routes queries to 8 intent types. Three-stage reranking: BM25, bi-encoder, cross-encoder. Answer extraction via Rust-accelerated regex with an optional LLM fallback (local or Anthropic). Continuous consolidation runs on every memory write.
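The three-stage funnel can be sketched as follows. This is an illustrative, self-contained toy, assuming simple token-overlap stand-ins for BM25, the bi-encoder, and the cross-encoder; none of these function names come from the project's actual API.

```python
"""Sketch of the 3-stage rerank funnel: cheap lexical recall, then an
embedding-style narrowing pass, then an expensive joint (query, doc) scorer.
All scorers are stand-ins so the pipeline shape is runnable."""

def lexical_score(query: str, doc: str) -> int:
    # Stand-in for BM25: raw term overlap.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d)

def bi_encoder_score(query: str, doc: str) -> float:
    # Stand-in for bi-encoder cosine similarity: Jaccard overlap.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0

def cross_encoder_score(query: str, doc: str) -> float:
    # Stand-in for a joint cross-encoder: overlap penalized by doc length.
    return bi_encoder_score(query, doc) / (1 + len(doc.split()) / 50)

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Stage 1: cheap lexical recall over the whole index (wide net).
    hits = sorted(corpus, key=lambda d: lexical_score(query, d), reverse=True)[:100]
    # Stage 2: bi-encoder narrows the candidate pool.
    hits = sorted(hits, key=lambda d: bi_encoder_score(query, d), reverse=True)[:20]
    # Stage 3: cross-encoder does the final, most expensive ranking.
    return sorted(hits, key=lambda d: cross_encoder_score(query, d), reverse=True)[:k]

corpus = [
    "the user prefers dark mode in every editor",
    "meeting notes from the quarterly planning session",
    "the user switched from dark mode to light mode last week",
]
print(retrieve("does the user prefer dark mode", corpus, k=1))
```

Each stage trades recall for precision: the pool shrinks as the per-candidate cost grows, which is why the expensive cross-encoder only ever sees a short list.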
| LongMemEval (500q, committed) | Accuracy |
|---|---|
| single-session-preference | 90.0% |
| single-session-assistant | 87.5% |
| knowledge-update | 87.2% |
| single-session-user | 82.9% |
| multi-session | 61.7% |
| temporal-reasoning | 45.9% |
| Overall | 69.0% |
| LOCOMO (1,986q, experimental) | Accuracy |
|---|---|
| Multi-hop | 82.6% |
| Adversarial | 79.5% |
| Open-domain | 78.7% |
| Single-hop | 55.7% |
| Temporal | 42.7% |
| Overall | 74.7% |
LongMemEval: GPT-4o-mini generation + judge, results committed. LOCOMO: BM25 + answer extraction, no LLM in retrieval path. Temporal reasoning is the primary target for v0.4.
7 PyO3 crates built with maturin. Python fallback preserved in every module.
| Crate | Speedup | Wired into |
|---|---|---|
| remanentia_temporal | 14.2× | temporal_graph, date_normalizer |
| remanentia_answer_extractor | 11.4× | answer_extractor |
| remanentia_fact_decomposer | ~7× | fact_decomposer |
| remanentia_answer_normalizer | ~6× | answer_normalizer |
| remanentia_search | ~3-5× | memory_index (BM25) |
| arcane_stdp | ~2-3× | snn_backend |
| remanentia_entity_extractor | ~2× | entity_extractor |
Full regex pipeline: 0.60 ms (Rust) vs 9.07 ms (Python) on 470K chars, a ~15× speedup on large workloads
| Component | Avg (ms) |
|---|---|
| parse_dates (47K chars) | 0.323 |
| regex_entities (47K chars) | 0.391 |
| extract_answer (47K chars) | 0.075 |
| normalize_answer | 0.001 |
| answers_match | 0.002 |
| fuzzy_match | 0.001 |
| KnowledgeStore.add_note | 0.027 |
| Total (regex pipeline) | 0.60 |
Measured with `time.perf_counter`; enforced by 27 budget-asserted performance tests.
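A budget-asserted performance test in this style can be sketched as below. The workload, the regex, and the 50 ms budget are all illustrative stand-ins, not the project's real fixtures or budgets.

```python
"""Sketch of a budget-asserted perf test: time a function with
time.perf_counter over several runs, take best-of-N to reduce noise,
and assert against a millisecond budget."""
import re
import time

def regex_entities(text: str) -> list[str]:
    # Toy stand-in for the entity-extraction regex pass.
    return re.findall(r"\b[A-Z][a-z]+\b", text)

def assert_budget(fn, arg, budget_ms: float, runs: int = 20) -> float:
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        fn(arg)
        best = min(best, (time.perf_counter() - t0) * 1000.0)
    assert best <= budget_ms, f"{fn.__name__}: {best:.3f} ms > {budget_ms} ms"
    return best

text = "Alice met Bob in Paris " * 2000  # ~47K chars, mirroring the table above
elapsed = assert_budget(regex_entities, text, budget_ms=50.0)
print(f"regex_entities best-of-20: {elapsed:.3f} ms")
```

Best-of-N rather than mean keeps the assertion stable on noisy CI machines while still catching real regressions.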
Four backends, zero cloud dependency required. Fully offline capable.
Benchmarked: Qwen 2.5 3B (44% LongMemEval, 8.5s/query) on AMD RX 6600 XT via ROCm. 8 GGUF models tested.
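The "regex first, LLM only as optional fallback" extraction path might look like the sketch below. The `Protocol`, function names, and regex are hypothetical; the source only states that extraction is regex-based with optional local (GGUF) or Anthropic LLM fallback.

```python
"""Illustrative shape of answer extraction with a pluggable LLM backend:
the fast regex path handles most cases; any object with a .complete()
method can serve as the fallback, local or cloud."""
import re
from typing import Optional, Protocol

class LlmBackend(Protocol):
    def complete(self, prompt: str) -> str: ...

def extract_answer(question: str, context: str,
                   llm: Optional[LlmBackend] = None) -> Optional[str]:
    # Fast path: Rust-accelerated regex in the real system; plain regex here.
    m = re.search(r"answer is ([\w -]+)", context, re.IGNORECASE)
    if m:
        return m.group(1).strip()
    # Optional fallback: hand the hard cases to a local or cloud LLM.
    if llm is not None:
        return llm.complete(f"Context: {context}\nQ: {question}\nA:").strip()
    return None

print(extract_answer("When?", "The answer is 14 June 2021."))
```

Keeping the backend behind a structural protocol is what makes "zero cloud dependency required" cheap: the offline GGUF path and the Anthropic path are interchangeable, and passing no backend at all still works.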
Built on 70+ experiments. We publish what fails.
SNN retrieval via STDP-modified weight matrices provides zero discriminative signal across 4 learning rules, 8 architectures, and GPU sweeps. Root cause: input current dominates recurrent dynamics by 100–2,300×, so the SNN operates in a perturbative regime where learned connectivity cannot matter. The system uses BM25 + cross-encoder reranking + optional LLM because that is what 70+ experiments showed works. 1,343 tests at 100% coverage enforce this honestly.
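The perturbative-regime failure is easy to see in a toy calculation. All magnitudes below are invented to mirror the reported 100–2,300× ratio, not measured from the system.

```python
"""Toy illustration of the failure mode: when feed-forward input current
dwarfs the recurrent current, STDP-learned weights cannot shape the
output. Weight scale, rates, and drive are invented for illustration."""
import random

random.seed(0)
N = 100
# STDP-modified recurrent weights: small, as learned weights typically are.
W = [[random.gauss(0.0, 0.01) for _ in range(N)] for _ in range(N)]
rates = [random.random() for _ in range(N)]  # pre-synaptic activity
I_input = 5.0                                # feed-forward drive per neuron

# Recurrent current into each neuron: weighted sum of pre-synaptic rates.
recurrent = [sum(W[i][j] * rates[j] for j in range(N)) for i in range(N)]
avg_rec = sum(abs(r) for r in recurrent) / N
ratio = I_input / avg_rec
print(f"input/recurrent current ratio: {ratio:.0f}x")  # >> 1: perturbative
```

When that ratio sits in the hundreds, no learning rule applied to `W` can meaningfully change which neurons fire, which is consistent with the zero discriminative signal observed across all rules and architectures.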
Open to strategic partners, research collaborators, and investors.
protoscience@anulum.li