I am a postdoc at VU Amsterdam working with Vincent François-Lavet. I received my PhD from Leiden University, where I worked with Thomas Moerland, Mike Preuss, and Aske Plaat. I received my master's degree from Leiden University in 2020 and my bachelor's degree from BLCU, China, in 2018.
I'm interested in reinforcement learning and in making agents more autonomous using [intrinsic motivation, foundation models, world models...], mostly in games and robotic tasks.
I co-host the BeNeRL seminar and serve as a reviewer.
Contact: z.yang(at)liacs.leidenuniv.nl
Google Scholar  | 
LinkedIn  | 
Twitter  | 
Github  | 
CV
Keywords: Foundation Models, Unsupervised Skill Discovery
Keywords: World Models, Autonomy, Unsupervised RL
Combine episodic control (EC) and RL: the agent learns to switch automatically between EC and RL.
Use episodic memory directly for action selection in continuous action spaces; it outperforms SOTA RL agents.
Systematically illustrate why and how Go-Explore works in tabular and deep RL settings: the explore phase ('exp') helps the agent step into unseen areas.
Pre-train and fine-tune neural networks on Sokoban tasks: agents pre-trained on 1-box tasks learn faster on 2/3-box tasks, but not vice versa.
Template based on Hyunseung's website. Latest update: 08/2024.