![Language models can explain neurons in language models](https://search.ai.wiki/wp-content/uploads/2023/05/language-models-can-explain-neurons-in-language-models.jpg)
Language models can explain neurons in language models
We use GPT-4 to automatically write explanations for the behavior of neurons in large language models and to score those explanations. We release a dataset of these (imperfect) explanations and scores for every neuron in GPT-2.
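To make "scoring an explanation" concrete, here is a minimal sketch of one way such a score could be computed. The summary above does not say how scores are calculated, so this sketch assumes a score is the correlation between a neuron's real activations and the activations predicted from its explanation. The function name `correlation_score` and the sample values are hypothetical and are not taken from the released dataset.

```python
import math
from typing import Sequence


def correlation_score(real: Sequence[float], simulated: Sequence[float]) -> float:
    """Pearson correlation between real and simulated per-token activations.

    Assumption: a good explanation lets a simulator predict activations
    that rise and fall together with the neuron's real activations.
    """
    if len(real) != len(simulated):
        raise ValueError("activation sequences must have the same length")
    if len(real) < 2:
        raise ValueError("need at least two tokens to compute a correlation")

    n = len(real)
    mean_real = sum(real) / n
    mean_sim = sum(simulated) / n

    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((r - mean_real) * (s - mean_sim) for r, s in zip(real, simulated))
    var_real = sum((r - mean_real) ** 2 for r in real)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)

    # Assumption: a constant sequence carries no signal, so score it as 0.
    if var_real == 0 or var_sim == 0:
        return 0.0
    return cov / math.sqrt(var_real * var_sim)


# Hypothetical activations for a five-token text excerpt.
real_activations = [0.0, 0.1, 2.3, 0.0, 1.9]
simulated_activations = [0.0, 0.0, 2.0, 0.5, 2.0]
score = correlation_score(real_activations, simulated_activations)
```

Under this assumption, a score near 1 means the explanation predicts the neuron's behavior well on the sampled text, and a score near 0 means it captures little of it.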