Trained learned planners

This repository contains the trained networks from the paper "Planning behavior in a recurrent neural network that plays Sokoban", presented at the ICML 2024 Mechanistic Interpretability Workshop.

To load and use the networks, refer to the learned-planner repository and, if needed, to the training code.

Model details

Hyperparameters:

See model/*/cp_*/cfg.json for the hyperparameters that were used to train a particular run.
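As a minimal sketch of how these configuration files could be collected, the snippet below walks a local copy of this repository and parses every cfg.json it finds. It assumes that `model` in the pattern above stands for the model-type directory (for example drc33), followed by a run ID and a cp_* checkpoint directory. The function name `load_run_configs` and the `repo_root` argument are hypothetical and are not part of the learned-planner code.

```python
import json
from pathlib import Path


def load_run_configs(repo_root):
    """Map each checkpoint directory (relative to repo_root) to its parsed cfg.json.

    Assumed layout (not confirmed by the repository itself):
        <model type>/<run id>/cp_<number>/cfg.json
    """
    root = Path(repo_root)
    configs = {}
    for cfg_path in sorted(root.glob("*/*/cp_*/cfg.json")):
        with cfg_path.open() as f:
            # Key by the checkpoint directory, e.g. "drc33/<run id>/cp_<number>".
            configs[cfg_path.parent.relative_to(root).as_posix()] = json.load(f)
    return configs
```

Calling `load_run_configs(".")` from the root of a local clone would return one dictionary entry per checkpoint; the keys inside each entry depend on the training code and are not assumed here.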

Best Models:

The best model of each type is stored in the following directory:

Model	Directory	Parameter Count
DRC(3, 3)	`drc33/bkynosqi/cp_2002944000`	1,285,125 (1.29M)
DRC(1, 1)	`drc11/eue6pax7/cp_2002944000`	987,525 (0.99M)
ResNet	`resnet/syb50iz7/cp_2002944000`	3,068,421 (3.07M)
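For scripting, the table above can be written down as a small lookup. This is only a sketch: the directory strings are copied from the table, while `BEST_CHECKPOINTS`, `best_checkpoint_dir`, and the `repo_root` argument are hypothetical names introduced for illustration. Loading the weights themselves is left to the learned-planner repository.

```python
from pathlib import Path

# Directories copied from the table above; the dictionary itself is illustrative.
BEST_CHECKPOINTS = {
    "DRC(3, 3)": "drc33/bkynosqi/cp_2002944000",
    "DRC(1, 1)": "drc11/eue6pax7/cp_2002944000",
    "ResNet": "resnet/syb50iz7/cp_2002944000",
}


def best_checkpoint_dir(repo_root, model_name):
    """Return the path of the best checkpoint for model_name under repo_root."""
    try:
        relative = BEST_CHECKPOINTS[model_name]
    except KeyError:
        known = ", ".join(sorted(BEST_CHECKPOINTS))
        raise KeyError(f"unknown model {model_name!r}; expected one of: {known}") from None
    return Path(repo_root) / relative
```

The returned path is not checked for existence, so it can be built before the repository has been downloaded.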

Probes & SAEs:

The trained probes and SAEs are stored in the probes and saes directories, respectively.

Training dataset:

The Boxoban set of levels by DeepMind.

Citation

If you use any of these artifacts, please cite our work:

@inproceedings{garriga-alonso2024planning,
    title={Planning behavior in a recurrent neural network that plays Sokoban},
    author={Adri{\`a} Garriga-Alonso and Mohammad Taufeeque and Adam Gleave},
    booktitle={ICML 2024 Workshop on Mechanistic Interpretability},
    year={2024},
    url={https://openreview.net/forum?id=T9sB3S2hok}
}