From Machine Learning

[R] First open-source implementation of Hebbian fast-weight write-back for the BDH architecture

The BDH (Dragon Hatchling) paper (arXiv:2509.26507) describes a Hebbian synaptic plasticity mechanism in which model weights update during inference. The released code computes the co-activation product and then discards it; the write-back itself was never implemented publicly. I implemented it.

The model rewrites its own decoder weights during inference, using sparse activation codes as addresses. The same token always produces the same code regardless of position.
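To make the mechanism concrete, here is a minimal NumPy sketch of a Hebbian fast-weight write-back. All names (`decoder_W`, `sparse_code`, `hebbian_write_back`, `eta`) and shapes are illustrative assumptions, not the repo's actual API; the point is the outer-product update addressed by a deterministic sparse code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_sparse, d_model = 256, 64

# Toy decoder weights; scale and shapes are arbitrary for illustration.
decoder_W = rng.normal(scale=0.02, size=(d_sparse, d_model))

def sparse_code(x, k=16):
    """Top-k sparsification: a deterministic function of the input,
    so the same token always yields the same code (a content address)."""
    z = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    z[idx] = x[idx]
    return z

def hebbian_write_back(W, code, value, eta=0.1):
    """Add the co-activation outer product into the weights --
    the step that is computed but never applied in the released code."""
    return W + eta * np.outer(code, value)

x = rng.normal(size=d_sparse)      # pre-synaptic activation
value = rng.normal(size=d_model)   # post-synaptic activation
code = sparse_code(x)

W_before = decoder_W
decoder_W = hebbian_write_back(decoder_W, code, value)

# Only the k rows addressed by the sparse code are written.
changed = int(np.any(decoder_W != W_before, axis=1).sum())
print(changed)  # 16
```

Because the code is sparse, each write touches only the addressed rows, which is what makes the later per-row consolidation statistics meaningful.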

Consolidation (v2): Once episodic fast weights work, the next question is whether they can be written back into the slow weights without destroying the signal. Dense write-back degrades it; selective write-back (top 10% of rows by episode activity) preserves most of it:

| | n2 | n4 | n8 |
|---|---|---|---|
| Control (no consolidation) | 97.2% | 95.5% | 97.4% |
| Dense write-back | 75.4% | 68.1% | 89.8% |
| Selective (rowtop10) | 97.5% | 97.1% | 96.2% |
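A minimal sketch of the selective consolidation idea, under my own assumptions about the bookkeeping: fast weights and a per-row activity counter accumulate over an episode, and only the top 10% of rows by activity are folded into the slow weights. Function and variable names are hypothetical, not taken from the repo.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rows, d = 256, 64

slow_W = rng.normal(scale=0.02, size=(n_rows, d))  # persistent weights
fast_W = np.zeros((n_rows, d))                     # episodic fast weights
activity = np.zeros(n_rows)                        # per-row write counter

# Simulated episode: each step writes 16 randomly addressed rows.
for _ in range(32):
    code = np.zeros(n_rows)
    code[rng.choice(n_rows, size=16, replace=False)] = 1.0
    value = rng.normal(size=d)
    fast_W += 0.1 * np.outer(code, value)
    activity += code

def consolidate_selective(slow, fast, activity, frac=0.10):
    """Write back only the top-`frac` rows by episode activity;
    all other rows of the slow weights are left untouched."""
    k = max(1, int(frac * len(activity)))
    rows = np.argsort(activity)[-k:]
    out = slow.copy()
    out[rows] += fast[rows]
    return out

new_slow = consolidate_selective(slow_W, fast_W, activity)
n_changed = int(np.any(new_slow != slow_W, axis=1).sum())
print(n_changed)  # 25 rows (10% of 256, rounded down)
```

Dense write-back would instead be `slow + fast`, touching every row that was written during the episode, which is the variant the table shows degrading performance.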

Verified on independent hardware (H100) and seed. Counter-benchmarks stay in the 91–95% range.

Base mechanism: The baseline without write-back scores 1% (chance). The best Hebbian run hits 99.0 / 98.0 / 97.5 on n2/n4/n8, reproduced across independent seeds. Five bugs had to be fixed along the way; all are documented in the README.
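For readers unfamiliar with the benchmark, here is one plausible shape of an n-back associative-recall episode; the exact task construction in the repo may differ (sequence length, vocabulary size, and the `-1` padding are my assumptions).

```python
import numpy as np

rng = np.random.default_rng(2)

def nback_episode(n, length=16, vocab=32):
    """At each position t >= n the correct answer is the token presented
    n steps earlier; earlier positions have no target (marked -1)."""
    seq = rng.integers(0, vocab, size=length)
    targets = np.full(length, -1)
    targets[n:] = seq[:-n]
    return seq, targets

seq, targets = nback_episode(n=4)
```

With a uniform output over the vocabulary, accuracy sits at roughly 1/vocab, which is why the no-write-back baseline lands at chance level.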

Limitations: This is a mechanism proof on a synthetic n-back associative-recall task with a 25M-parameter model. It is not yet validated on natural language; the next step is FineWeb-Edu.

Repo (Apache 2.0): https://github.com/fleeb83/bdh-fast-weights

Independent researcher, no lab. Happy to answer any questions.

submitted by /u/fleebrun83

