Paul Christiano

Learning Complex Goals with Iterated Amplification

We’re proposing an AI safety technique called iterated amplification that lets us specify complicated behaviors and goals that are beyond human scale, by demonstrating how to decompose a task into simpler sub-tasks, rather than by providing labeled data or a reward function. Although this idea is in its very…
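
To make the decomposition idea concrete, here is a minimal sketch on a toy task (summing a list of numbers). All function names below are hypothetical stand-ins, not code from the post or paper; in the full scheme the learned agent would also be trained (distilled) to imitate the amplified answers rather than just being queried.

```python
# A toy illustration of answering a task by decomposition: summing a list
# of numbers. A human plays the decompose/combine roles, and a weak "agent"
# answers only the leaf tasks directly.

def human_decompose(task):
    """Stand-in for a human splitting a task into simpler sub-tasks."""
    if len(task) <= 1:
        return None  # already simple enough to answer directly
    mid = len(task) // 2
    return [task[:mid], task[mid:]]

def human_combine(sub_answers):
    """Stand-in for a human combining sub-answers into one answer."""
    return sum(sub_answers)

def weak_agent(task):
    """Directly answer only the simplest tasks."""
    return task[0] if task else 0

def amplify(task, answerer):
    """Answer a hard task by recursing on human-provided decompositions,
    calling `answerer` on the simple pieces (the 'amplified' overseer)."""
    sub_tasks = human_decompose(task)
    if sub_tasks is None:
        return answerer(task)
    return human_combine(amplify(t, answerer) for t in sub_tasks)

print(amplify(list(range(10)), weak_agent))  # 45: beyond the weak agent alone
# In the full scheme, the agent would next be trained (distilled) to imitate
# these amplified answers, and the loop would repeat on harder tasks.
```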

Gathering Human Feedback

RL-Teacher is an open-source implementation of our interface to train AIs via occasional human feedback rather than hand-crafted reward functions. The underlying technique was developed as a step towards safe AI systems, but also applies to reinforcement learning problems with rewards that are hard to specify.
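
As an illustration of the underlying idea rather than RL-Teacher's actual API, the sketch below fits a small reward model to pairwise human comparisons of trajectory segments using a Bradley-Terry style loss; the RL agent would then maximize the learned reward in place of a hand-written one. All class and function names are hypothetical, and PyTorch is assumed.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Predicts a per-step reward from an (observation, action) pair."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def segment_return(self, obs, act):
        """Predicted reward summed over a segment of (obs, action) pairs."""
        return self.net(torch.cat([obs, act], dim=-1)).sum()

def preference_loss(model, seg_a, seg_b, human_prefers_a):
    """Bradley-Terry style loss: the segment the human preferred should
    receive the higher predicted return."""
    r_a = model.segment_return(*seg_a)
    r_b = model.segment_return(*seg_b)
    target = torch.tensor(1.0 if human_prefers_a else 0.0)
    return nn.functional.binary_cross_entropy_with_logits(r_a - r_b, target)

# Toy usage: random segments standing in for the clips shown to a human.
obs_dim, act_dim, seg_len = 4, 2, 25
model = RewardModel(obs_dim, act_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

seg_a = (torch.randn(seg_len, obs_dim), torch.randn(seg_len, act_dim))
seg_b = (torch.randn(seg_len, obs_dim), torch.randn(seg_len, act_dim))
loss = preference_loss(model, seg_a, seg_b, human_prefers_a=True)
opt.zero_grad(); loss.backward(); opt.step()
# The RL agent then maximizes model.segment_return(...) instead of a
# hand-crafted reward function.
```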

Learning from Human Preferences

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind’s safety team, we’ve…