Interpretability in the wild: a circuit for indirect object identification in GPT-2 small K Wang, A Variengien, A Conmy, B Shlegeris, J Steinhardt International Conference on Learning Representations 2023, 2023 | 383 | 2023 |
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model M Hanna, O Liu, A Variengien Advances in Neural Information Processing Systems 36, 2024 | 135* | 2024 |
Towards self-organized control: using neural cellular automata to robustly control a cart-pole agent. A Variengien, S Pontes-Filho, T Glover, S Nichele Innovation in Machine Intelligence (IMI) 1: 1–14, 2021 | 43* | 2021 |
A journey in ESN and LSTM visualisations on a language task A Variengien, X Hinaut arXiv preprint arXiv:2012.01748, 2020 | 15 | 2020 |
Look Before You Leap: A Universal Emergent Decomposition of Retrieval Tasks in Language Models A Variengien, E Winsor Mechanistic Interpretability Workshop - 41st International Conference on …, 2024 | 4 | 2024 |
BELLS: A framework towards future proof benchmarks for the evaluation of LLM safeguards D Dorn, A Variengien, CR Segerie, V Corruble NextGen AI Safety Workshop - 41st International Conference on Machine Learning, 2024 | 4 | 2024 |
AI Safety Institutes: Can countries meet the challenge? A Variengien …, 2024 | 1 | 2024 |
Modelling Cross-Situational Learning on Full Sentences in Few Shots with Simple RNNs X Hinaut, SR Oota, A Variengien, F Alexandre Proceedings of the Annual Meeting of the Cognitive Science Society 46, 2024 | | 2024 |
Recurrent Neural Networks Models for Developmental Language Acquisition: Reservoirs Outperform LSTMs X Hinaut, A Variengien SNL 2020 - 12th Annual Meeting of the Society for the Neurobiology of Language, 2020 | | 2020 |