‪Michel Ma‬ - ‪Google Scholar‬


Cited by

	All	Since 2019
Citations	14	14
h-index	2	2
i10-index	0	0

[Chart: citations per year. 2023: 7, 2024: 7]

Co-authors

Pierre-Luc Bacon, University of Montreal. Verified email at mila.quebec
Tianwei Ni, Mila, University of Montreal. Verified email at mila.quebec
Pierluca D'Oro, Mila & Meta. Verified email at mila.quebec


Michel Ma


PhD candidate, University of Montreal, Mila

Verified email at mila.quebec

Reinforcement Learning, Deep Learning


Title	Cited by	Year
When do transformers shine in RL? Decoupling memory from credit assignment. T Ni, M Ma, B Eysenbach, PL Bacon. Advances in Neural Information Processing Systems 36, 2024	8	2024
Long-term credit assignment via model-based temporal shortcuts. M Ma, P D'Oro, Y Bengio, PL Bacon. Deep RL Workshop NeurIPS 2021, 2021	5	2021
Counterfactual Policy Evaluation and the Conditional Monte Carlo Method. M Ma, B Pierre-Luc. Offline Reinforcement Learning Workshop, NeurIPS, 2020	1	2020
Do Transformer World Models Give Better Policy Gradients? M Ma, T Ni, C Gehring, P D'Oro, PL Bacon. arXiv preprint arXiv:2402.05290, 2024		2024
Bridging State and History Representations: Understanding Self-Predictive RL. T Ni, B Eysenbach, E Seyedsalehi, M Ma, C Gehring, A Mahajan, ... arXiv preprint arXiv:2401.08898, 2024		2024
A Differentiable Sequence Model Perspective on Policy Gradients. M Ma, P D'Oro, T Ni, C Gehring, PL Bacon		2023
Parsimonious reasoning in reinforcement learning for better credit assignment. M Ma		2022


Articles 1–7