Remanentia

v0.3.1
remanentem — the state of remaining

Persistent AI memory via filesystem knowledge retrieval. BM25 search with query intelligence, cross-encoder reranking, and pluggable LLM answer extraction. Rust-accelerated regex pipeline under 1ms. Continuous consolidation turns episodic traces into semantic memories. Integrates with Claude Code and MCP-compatible tools.

69.0%   LongMemEval (500q, committed)
0.6ms   regex pipeline (47K chars)
1,343   tests, 100% coverage
7       Rust acceleration crates

How It Works

Index        BM25 + embeddings, filesystem-native
Search       BM25 + cross-encoder rerank + RRF
Extract      Rust regex + optional LLM
Consolidate  Episodic → semantic + entity graph

Query classification routes each query to one of 8 intent types. Reranking runs in 3 stages: BM25, bi-encoder, cross-encoder. Answer extraction uses Rust-accelerated regex with an optional LLM fallback (local or Anthropic). Consolidation runs on every memory write.
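A minimal sketch of the staged retrieval, using rank_bm25 and sentence-transformers as stand-ins; the model ids, the top-10 cutoff, and the RRF constant k=60 are illustrative assumptions, not Remanentia's actual API.

```python
# Sketch: BM25 → bi-encoder → RRF fusion → cross-encoder rerank.
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, CrossEncoder, util

docs = ["migrated the indexer to Rust in November",
        "preferred dark roast coffee since 2022"]   # filesystem-indexed notes
query = "when did the Rust migration happen?"

# Stage 1: BM25 over tokenized notes.
bm25_scores = BM25Okapi([d.split() for d in docs]).get_scores(query.split())
bm25_rank = sorted(range(len(docs)), key=lambda i: -bm25_scores[i])

# Stage 2: bi-encoder cosine similarity.
enc = SentenceTransformer("all-MiniLM-L6-v2")
sims = util.cos_sim(enc.encode(query), enc.encode(docs))[0]
bi_rank = sorted(range(len(docs)), key=lambda i: -float(sims[i]))

def rrf(*rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf(bm25_rank, bi_rank)

# Stage 3: cross-encoder reranks the fused short-list.
ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
best = fused[int(ce.predict([(query, docs[i]) for i in fused[:10]]).argmax())]
print(docs[best])
```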

Benchmarks

LongMemEval (500q, committed)
  single-session-preference   90.0%
  knowledge-update            87.2%
  single-session-assistant    87.5%
  single-session-user         82.9%
  multi-session               61.7%
  temporal-reasoning          45.9%
  Overall                     69.0%

LOCOMO (1,986q, experimental)
  Multi-hop                   82.6%
  Adversarial                 79.5%
  Open-domain                 78.7%
  Single-hop                  55.7%
  Temporal                    42.7%
  Overall                     74.7%

LongMemEval: GPT-4o-mini for generation and judging; results committed. LOCOMO: BM25 + answer extraction, with no LLM in the retrieval path. Temporal reasoning is the primary target for v0.4.
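For intuition, a toy illustration of regex-first extraction as used in the LOCOMO run (no LLM in the retrieval path, LLM only as optional fallback); the date pattern and function shape are illustrative, not the project's actual rule set.

```python
import re

# Toy date pattern; the real rule set is far richer.
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

def extract_answer(question: str, context: str, llm=None):
    if question.lower().startswith("when"):
        match = DATE.search(context)
        if match:
            return match.group(0)              # regex hit: no LLM needed
    if llm is not None:
        return llm.extract(question, context)  # optional fallback
    return None                                # NullBackend path: regex-only

print(extract_answer("When did the migration land?",
                     "The Rust migration landed on 2024-11-03."))
```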

Rust Acceleration

7 PyO3 crates built with maturin. A pure-Python fallback is preserved in every module (sketched below the table).

Crate                          Speedup   Wired into
remanentia_temporal            14.2×     temporal_graph, date_normalizer
remanentia_answer_extractor    11.4×     answer_extractor
remanentia_fact_decomposer     ~7×       fact_decomposer
remanentia_answer_normalizer   ~6×       answer_normalizer
remanentia_search              ~3-5×     memory_index (BM25)
arcane_stdp                    ~2-3×     snn_backend
remanentia_entity_extractor    ~2×       entity_extractor

Full regex pipeline: 0.60ms (Rust) vs 9.07ms (Python) on 470K chars, a 14.1× speedup on large workloads.
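The fallback pattern, as a minimal sketch: the crate name comes from the table above, but the exact exported symbol is an assumption.

```python
# Try the PyO3 crate first; fall back to pure Python if it isn't installed.
try:
    from remanentia_answer_normalizer import normalize_answer  # Rust path
    RUST_ACCELERATED = True
except ImportError:
    import re
    RUST_ACCELERATED = False

    def normalize_answer(text: str) -> str:
        # Pure-Python fallback: lowercase, drop punctuation and articles.
        text = re.sub(r"[^\w\s]", " ", text.lower())
        return " ".join(t for t in text.split() if t not in {"a", "an", "the"})
```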

Pipeline Performance

Component                    Avg (ms)
parse_dates (47K chars)      0.323
regex_entities (47K chars)   0.391
extract_answer (47K chars)   0.075
normalize_answer             0.001
answers_match                0.002
fuzzy_match                  0.001
KnowledgeStore.add_note      0.027
Total (regex pipeline)       0.60

Measured with time.perf_counter across 27 budget-asserted performance tests.
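What a budget-asserted test looks like, as a standalone sketch: the stand-in normalizer and the 0.01ms budget are hypothetical, chosen with headroom over the 0.001ms average in the table.

```python
import re
import time

def normalize_answer(text: str) -> str:
    # Stand-in normalizer so the test runs on its own.
    return re.sub(r"[^\w\s]", "", text.lower()).strip()

def test_normalize_answer_budget():
    n = 1_000
    start = time.perf_counter()
    for _ in range(n):
        normalize_answer("The answer is 42.")
    avg_ms = (time.perf_counter() - start) / n * 1000
    assert avg_ms < 0.01, f"avg {avg_ms:.4f}ms exceeds the 0.01ms budget"
```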

Pluggable LLM Backend

Four backends, zero cloud dependency required. Fully offline capable.

AutoBackend       Tries local → Anthropic → Null. Zero config.
LocalLLMBackend   OpenAI-compatible HTTP. llama.cpp, Ollama, vLLM.
AnthropicBackend  Claude Haiku/Sonnet via API.
NullBackend       Explicit no-LLM. Regex-only extraction.

Benchmarked: Qwen 2.5 3B (44% LongMemEval, 8.5s/query) on AMD RX 6600 XT via ROCm. 8 GGUF models tested.
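A minimal sketch of the Auto selection chain. The four class names come from the list above; the probe URL, method shapes, and environment-variable check are assumptions, not Remanentia's actual API.

```python
import os
import urllib.request

class NullBackend:
    def extract(self, question: str, context: str) -> None:
        return None  # explicit no-LLM: caller stays on regex-only extraction

class LocalLLMBackend:
    def __init__(self, base_url: str = "http://localhost:8080/v1"):
        self.base_url = base_url  # llama.cpp / Ollama / vLLM endpoint

class AnthropicBackend:
    def __init__(self):
        self.api_key = os.environ["ANTHROPIC_API_KEY"]

def auto_backend():
    try:  # probe a local OpenAI-compatible server first
        urllib.request.urlopen("http://localhost:8080/v1/models", timeout=1)
        return LocalLLMBackend()
    except OSError:
        pass
    if os.environ.get("ANTHROPIC_API_KEY"):  # then the Anthropic API
        return AnthropicBackend()
    return NullBackend()                     # finally, regex-only
```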

What Makes This Different

🔍 Filesystem-Native
Indexes existing project artifacts — session logs, code, research docs. Your files are the memory.

Zero Infrastructure
No vector database, no cloud service. One Python process. Optional local or cloud LLM.

🔗 MCP Integration
4 MCP tools: recall, remember, status, graph. Read and write paths. Claude Code native.
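How the recall and remember tools might be exposed over MCP stdio using the official Python SDK's FastMCP; a sketch with a toy in-memory store, not Remanentia's actual server module (status and graph omitted for brevity).

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("remanentia")
_notes: list[str] = []  # toy store standing in for the filesystem index

@mcp.tool()
def remember(text: str) -> str:
    """Write a note; in Remanentia, consolidation runs on every write."""
    _notes.append(text)
    return "stored"

@mcp.tool()
def recall(query: str) -> str:
    """Return the notes sharing the most tokens with the query (toy scoring)."""
    q = set(query.lower().split())
    ranked = sorted(_notes, key=lambda n: -len(q & set(n.lower().split())))
    return "\n".join(ranked[:3]) or "no matches"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```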

Built on 70+ experiments. We publish what fails.

SNN retrieval via STDP-modified weight matrices provided zero discriminative signal across 4 learning rules, 8 architectures, and GPU sweeps. Root cause: input current dominates recurrent dynamics by 100–2,300×, so the SNN operates in a perturbative regime where learned connectivity cannot matter. The system uses BM25 + cross-encoder reranking + optional LLM because that is what 70+ experiments showed works; 1,343 tests at 100% coverage keep it honest.

Stack

Rust (PyO3) · maturin · Python 3.10+ · BM25 · Cross-Encoder · MCP (stdio) · FastAPI · sentence-transformers · llama.cpp / Ollama · Anthropic API · GLiNER2 · NumPy · ROCm / CUDA · AGPL-3.0
Part of the SCPN framework by Anulum
Beta — v0.3.1 · 1,343 tests · 100% coverage · 7 Rust crates · PyPI live

Open to strategic partners, research collaborators, and investors.

protoscience@anulum.li