OpenAI

Proximal Policy Optimization

We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune. PPO has become the default reinforcement learning algorithm at OpenAI because of its ease of use and good performance.
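At PPO's core is a clipped surrogate objective that discourages the updated policy from moving too far from the one that collected the data. A minimal sketch in plain Python (the function name is ours; `epsilon=0.2` is the paper's default clip range):

```python
def ppo_clip_objective(ratio, advantage, epsilon=0.2):
    """PPO's clipped surrogate objective for one sample.

    ratio:     pi_new(a|s) / pi_old(a|s), the probability ratio.
    advantage: estimated advantage A(s, a).
    epsilon:   clip range (0.2 is the paper's default).
    """
    unclipped = ratio * advantage
    # Clip the ratio to [1 - epsilon, 1 + epsilon] before weighting.
    clipped = max(min(ratio, 1 + epsilon), 1 - epsilon) * advantage
    # Taking the minimum removes the incentive to push the ratio
    # outside the clip range in the direction that inflates the objective.
    return min(unclipped, clipped)
```

For a positive advantage, increasing the ratio past `1 + epsilon` yields no extra objective value, so the gradient there is zero; the symmetric argument holds for negative advantages below `1 - epsilon`.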

Robust Adversarial Examples

We’ve created images that reliably fool neural network classifiers when viewed at varied scales and from varied perspectives. This challenges a claim from last week that it would be hard to maliciously trick self-driving cars, since they capture images at multiple scales, angles, and perspectives.


*This innocuous kitten photo, printed on
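One standard way to make a perturbation survive changes of scale and perspective is to optimize it in expectation over random transformations rather than for a single fixed view. A toy sketch of that loop, with a stand-in quadratic "score" in place of a real classifier (the names, the score, and the scale-only transformation are all illustrative, not the released attack):

```python
import random

random.seed(0)

def score(image):
    # Stand-in for a classifier's confidence in the true class;
    # a real attack would use the network's loss instead.
    return sum(px * px for px in image)

def transform(image, scale):
    # Toy stand-in for a scale/perspective change.
    return [scale * px for px in image]

def robust_perturbation(image, steps=50, lr=0.1, samples=8):
    """Optimize a perturbation that lowers the score across many scales."""
    delta = [0.0] * len(image)
    for _ in range(steps):
        grads = [0.0] * len(image)
        for _ in range(samples):
            s = random.uniform(0.5, 1.5)
            # Analytic gradient of score(transform(image + delta, s))
            # with respect to delta[i]: 2 * s^2 * (image[i] + delta[i]).
            for i in range(len(image)):
                grads[i] += 2 * s * s * (image[i] + delta[i])
        # Descending the gradient averaged over sampled transformations
        # lowers the score in expectation, not just at one fixed scale.
        delta = [d - lr * g / samples for d, g in zip(delta, grads)]
    return delta
```

Because each update averages gradients over sampled views, the resulting perturbation degrades the score at scales it was never explicitly optimized for.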


Faster Physics in Python

We’re open-sourcing a high-performance Python library for robotic simulation using the MuJoCo engine, developed over our past year of robotics research.

This library is one of the core tools in our deep learning robotics research; we’ve now released it as a major new version of mujoco-py, our Python interface to the MuJoCo engine.
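Models for the MuJoCo engine are described in its MJCF XML format; a minimal model sketch (the body name, geometry, and sizes here are illustrative):

```xml
<mujoco>
  <worldbody>
    <body name="ball" pos="0 0 1">
      <joint type="free"/>
      <geom type="sphere" size="0.1"/>
    </body>
  </worldbody>
</mujoco>
```

In mujoco-py, a model like this can be loaded with `load_model_from_xml` and stepped through time via an `MjSim` instance; both require a local MuJoCo installation.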
