Grounding large language models in interactive environments with online reinforcement learning T Carta, C Romac, T Wolf, S Lamprier, O Sigaud, PY Oudeyer International Conference on Machine Learning, 3676-3713, 2023 | 131 | 2023 |
TeachMyAgent: a benchmark for automatic curriculum learning in deep RL C Romac, R Portelas, K Hofmann, PY Oudeyer International Conference on Machine Learning, 9052-9063, 2021 | 30 | 2021 |
Meta automatic curriculum learning R Portelas, C Romac, K Hofmann, PY Oudeyer arXiv preprint arXiv:2011.08463, 2020 | 9 | 2020 |
Deep recurrent Q-learning vs deep Q-learning on a simple partially observable Markov decision process with Minecraft C Romac, V Béraud arXiv preprint arXiv:1903.04311, 2019 | 8 | 2019 |
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Q Gallouédec, E Beeching, C Romac, E Dellandréa arXiv preprint arXiv:2402.09844, 2024 | 5 | 2024 |
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting MS Aissi, C Romac, T Carta, S Lamprier, PY Oudeyer, O Sigaud, L Soulier, ... arXiv preprint arXiv:2410.19920, 2024 | | 2024 |
SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling L Gaven, C Romac, T Carta, S Lamprier, O Sigaud, PY Oudeyer arXiv preprint arXiv:2410.12481, 2024 | | 2024 |
Les IA face au réel [AIs Facing the Real World] C Romac, T Carta, PY Oudeyer Pour la Science 557 (3), 24-31, 2024 | | 2024 |