Richard Chen

Learning Montezuma’s Revenge from a Single Demonstration

We’ve trained an agent to achieve a high score of 74,500 on Montezuma’s Revenge from a single human demonstration, better than any previously published result. Our algorithm is simple: the agent plays a sequence of games starting from carefully chosen states from the demonstration, and learns from them by…
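Roughly, the idea can be sketched in a few lines: start episodes from states near the end of the demonstration, and move the starting point earlier once the agent reliably finishes from there. The toy below is a minimal self-contained illustration of that backward curriculum; the made-up ChainEnv with its restore_state hook and the tabular Q-update are stand-ins for the real emulator-plus-RL setup, not the actual training code.

```python
import random
from collections import defaultdict

DEMO = [0, 1, 0, 1, 1, 0, 1, 0]  # stand-in demo: the one action sequence that scores

class ChainEnv:
    """Toy emulator: reward 1 only for following DEMO from the start state to the end."""
    def restore_state(self, t):
        self.t = t                              # jump to step t of the demonstration
    def step(self, action):
        ok = action == DEMO[self.t]
        self.t += 1
        done = (not ok) or self.t == len(DEMO)
        return self.t, float(ok and self.t == len(DEMO)), done

def episode(env, q, start, eps=0.1, lr=0.5):
    """One rollout from a demo state, with a 1-step Q-update standing in for the RL algorithm."""
    env.restore_state(start)
    t, ret = start, 0.0
    while True:
        a = random.randrange(2) if random.random() < eps else int(q[(t, 1)] > q[(t, 0)])
        t2, r, done = env.step(a)
        q[(t, a)] += lr * (r + (0.0 if done else max(q[(t2, 0)], q[(t2, 1)])) - q[(t, a)])
        ret += r
        if done:
            return ret
        t = t2

random.seed(0)
env, q = ChainEnv(), defaultdict(float)
frontier = len(DEMO) - 1                        # first train from the last demo state
while frontier >= 0:
    wins = sum(episode(env, q, frontier) for _ in range(50))
    if wins > 30:                               # mastered this stretch: start earlier
        frontier -= 1
print("learned to finish from the very first demo state")
```

The point of the curriculum is that exploration is only ever needed over a short horizon: the reward at the end of the demonstration is always within reach of the current starting state.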

Evolved Policy Gradients

We’re releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test time that were outside their training regime, like learning to navigate…
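The structure of the method is a nested loop, and a caricature fits in a short script: an inner loop in which an agent minimizes the evolved loss by gradient descent without ever seeing reward, and an outer loop that applies evolution strategies to the loss parameters, scored by how well the trained agents actually do. The quadratic task, the two-parameter loss, and all constants below are illustrative assumptions; EPG itself evolves a neural-network loss over full RL trajectories.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_return(theta, target):
    """Outer-loop objective: how good the trained agent actually is."""
    return -(theta - target) ** 2

def inner_train(phi, target, steps=20, lr=0.2):
    """Inner loop: the agent minimises the *evolved* loss
    L_phi(theta, obs) = phi[0] * (theta - phi[1] * obs)^2 and never sees reward."""
    theta = 0.0
    for _ in range(steps):
        obs = target + rng.normal(0.0, 0.1)                # noisy task observation
        theta -= lr * 2 * phi[0] * (theta - phi[1] * obs)  # dL_phi / dtheta
    return true_return(theta, target)

phi = np.array([0.1, 0.0])                                 # evolved loss parameters
for _ in range(300):
    eps = rng.normal(size=(32, 2))                         # ES perturbations of the loss
    scores = np.array([inner_train(phi + 0.1 * e, rng.uniform(-2, 2)) for e in eps])
    # Evolution strategies: move phi toward perturbations whose agents scored well.
    phi += 0.05 * eps.T @ ((scores - scores.mean()) / (scores.std() + 1e-8)) / 32

print("evolved loss parameters:", np.round(phi, 2))  # phi[1] should drift toward 1,
# so that minimising the evolved loss pulls theta onto the observed target
```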

Learning to Model Other Minds

We’re releasing an algorithm which accounts for the fact that other agents are learning too, and discovers self-interested yet collaborative strategies like tit-for-tat in the iterated prisoner’s dilemma. This algorithm, Learning with Opponent-Learning Awareness (LOLA), is a small step towards agents that model other minds.

LOLA, a collaboration…

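The update rule itself is compact: instead of treating the opponent as a fixed part of the environment, a LOLA agent differentiates its value through one anticipated learning step of the opponent. Below is a self-contained toy for the iterated prisoner's dilemma with memory-one policies, using exact values and finite-difference gradients; the step sizes, initialization, and the finite-difference shortcut are assumptions of this sketch, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.96
R1 = np.array([-1.0, -3.0, 0.0, -2.0])  # player 1's payoff for outcomes CC, CD, DC, DD
R2 = R1[[0, 2, 1, 3]]                    # player 2's payoffs swap the CD and DC outcomes

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def values(th1, th2):
    """Exact (scaled) discounted values of the iterated PD for memory-one policies.
    th[0] is the logit of cooperating on move 1, th[1:5] the logits of cooperating
    after last round's outcome CC, CD, DC, DD, seen from that player's own side."""
    p1, p2 = sigmoid(th1), sigmoid(th2)
    c1, c2 = p1[1:], p2[[1, 3, 2, 4]]    # player 2's own view swaps CD and DC
    P = np.stack([c1 * c2, c1 * (1 - c2), (1 - c1) * c2, (1 - c1) * (1 - c2)], axis=1)
    p0 = np.array([p1[0] * p2[0], p1[0] * (1 - p2[0]),
                   (1 - p1[0]) * p2[0], (1 - p1[0]) * (1 - p2[0])])
    m = (1 - GAMMA) * p0 @ np.linalg.inv(np.eye(4) - GAMMA * P)
    return m @ R1, m @ R2

def grad(f, args, i, h=1e-4):
    """Central finite differences of scalar f w.r.t. args[i] (stand-in for autodiff)."""
    g = np.zeros_like(args[i])
    for j in range(g.size):
        up = [a.copy() for a in args]; up[i][j] += h
        dn = [a.copy() for a in args]; dn[i][j] -= h
        g[j] = (f(*up) - f(*dn)) / (2 * h)
    return g

def v1(a, b): return values(a, b)[0]
def v2(a, b): return values(a, b)[1]

# The LOLA step: maximise your value *after* the opponent's anticipated
# learning step, which itself depends on your own parameters.
def lola1(a, b, eta=3.0): return v1(a, b + eta * grad(v2, [a, b], 1))
def lola2(a, b, eta=3.0): return v2(a + eta * grad(v1, [a, b], 0), b)

th1, th2 = rng.normal(0, 0.5, 5), rng.normal(0, 0.5, 5)
for _ in range(300):
    g1, g2 = grad(lola1, [th1, th2], 0), grad(lola2, [th1, th2], 1)
    th1, th2 = th1 + 0.5 * g1, th2 + 0.5 * g2

# In the paper this shaping is what lets tit-for-tat emerge (cooperate after
# CC and DC, defect after CD and DD); naive learners defect instead.
print(np.round(sigmoid(th1), 2), np.round(sigmoid(th2), 2))
```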

Better Exploration with Parameter Noise

We’ve found that adding adaptive noise to the parameters of reinforcement learning algorithms frequently boosts performance. This exploration method is simple to implement and very rarely decreases performance, so it’s worth trying on any problem.

[Figure: side-by-side comparison of action space noise and parameter space noise. Caption: Parameter noise helps algorithms more…]
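The recipe has two ingredients that fit in a few lines: sample the noise once per episode in weight space instead of per step in action space, and adapt its scale so the perturbed policy's actions stay roughly a target distance from the unperturbed ones. The linear policy, the distance measure, and the constants below are assumptions of this sketch, not the released implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(w, sigma):
    """Parameter-space noise: one weight perturbation, held fixed for a whole
    episode, so exploration is consistent across the episode."""
    return w + sigma * rng.normal(size=w.shape)

def adapt_sigma(sigma, w, w_noisy, states, target=0.1, factor=1.01):
    """Adaptive scale: grow sigma if the noise barely changes behaviour,
    shrink it if actions drift more than `target` from the clean policy."""
    dist = np.mean(np.abs(states @ w_noisy - states @ w))
    return sigma * factor if dist < target else sigma / factor

w, sigma = np.zeros(4), 0.05
for _ in range(200):
    w_noisy = perturb(w, sigma)        # sample once, then act with w_noisy all episode
    states = rng.normal(size=(64, 4))  # stand-in for states visited this episode
    # ... run the episode with actions s @ w_noisy, update w with any RL algorithm ...
    sigma = adapt_sigma(sigma, w, w_noisy, states)

print("adapted sigma:", round(sigma, 3))
```

Because the perturbation is held fixed for the whole episode, the agent explores with a consistent strategy rather than with uncorrelated per-step jitter, which is why the same trick plugs into essentially any algorithm.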
