Pieter Abbeel – Jay van Zyl @ ecosystem.Ai

Evolved Policy Gradients

OpenAI April 18, 2018

We’re releasing an experimental metalearning approach called Evolved Policy Gradients, a method that evolves the loss function of learning agents, which can enable fast training on novel tasks. Agents trained with EPG can succeed at basic tasks at test time that were outside their training regime, like learning to navigate

Igor Mordatch Pieter Abbeel Smitha Milli

Interpretable Machine Learning through Teaching

OpenAI February 15, 2018

We’ve designed a method that encourages AIs to teach each other with examples that also make sense to humans. Our approach automatically selects the most informative examples to teach a concept — for instance, the best images to describe the concept of dogs — and experimentally we found our approach to be

John Schulman Jonathan Ho Kevin Frans Peter Chen Pieter Abbeel

Learning a Hierarchy

OpenAI October 26, 2017

We’ve developed a hierarchical reinforcement learning algorithm that learns high-level actions useful for solving a range of tasks, so that tasks requiring thousands of timesteps can be solved quickly. Our algorithm, when applied to a set of navigation problems, discovers a set of high-level actions for walking and crawling in different directions,

Alex Ray Bob McGrew Jonas Schneider Josh Tobin Lerrel Pinto Marcin Andrychowicz Peter Welinder Pieter Abbeel Wojciech Zaremba Xue Bin Peng

Generalizing from Simulation

OpenAI October 19, 2017

Our latest robotics techniques allow robot controllers, trained entirely in simulation and deployed on physical robots, to react to unplanned changes in the environment as they solve simple tasks. That is, we’ve used these techniques to build closed-loop systems rather than open-loop ones as before. The simulator need not match

Igor Mordatch Ilya Sutskever Maruan Al-Shedivat Pieter Abbeel Trapit Bansal Yura Burda

Meta-Learning for Wrestling

OpenAI October 11, 2017

We show that for the task of simulated robot wrestling, a meta-learning agent can learn to quickly defeat a stronger non-meta-learning agent.

Igor Mordatch Jakob Foerster Maruan Al-Shedivat Pieter Abbeel Richard Chen Shimon Whiteson

Learning to Model Other Minds

OpenAI September 14, 2017

We’re releasing an algorithm that accounts for the fact that other agents are learning too, and discovers self-interested yet collaborative strategies like tit-for-tat in the iterated prisoner’s dilemma. This algorithm, Learning with Opponent-Learning Awareness (LOLA), is a small step towards agents that model other minds.

LOLA, a collaboration
