Research

Efficient technique improves machine-learning models’ reliability

The method enables a model to determine its confidence in a prediction, while using no additional data and far fewer computing resources than other methods.
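The blurb does not detail the method itself. For reference only, the simplest baseline for a model's confidence in a prediction is its top softmax probability, sketched below; the classifier and inputs are hypothetical stand-ins, not the technique from the article:

```python
import torch
import torch.nn.functional as F

def predict_with_confidence(model: torch.nn.Module, x: torch.Tensor):
    """Return the predicted class and a simple confidence score.

    Confidence here is the top softmax probability -- a common baseline,
    not the method described in the article, whose details the blurb omits.
    """
    model.eval()
    with torch.no_grad():
        logits = model(x)                     # shape: (batch, num_classes)
        probs = F.softmax(logits, dim=-1)
        confidence, predicted = probs.max(dim=-1)
    return predicted, confidence
```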

Solving a machine-learning mystery

A new study shows how large language models like GPT-3 can learn a new task from just a few examples, without the need for any new training data.
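In-context learning of this kind works by placing worked examples directly in the model's prompt, with no gradient updates or new training data. A minimal sketch of a few-shot prompt for a sentiment task (the examples and task are illustrative; the resulting prompt would be sent to a large language model's text-completion endpoint):

```python
# Few-shot prompting: the model infers the task from examples placed in its
# context window, rather than from any parameter updates.
examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
    ("A masterpiece of quiet storytelling.", "positive"),
]

def build_few_shot_prompt(query: str) -> str:
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt("I would not watch it again.")
print(prompt)
# Completing this prompt with a model like GPT-3 would typically yield
# "negative" -- a task "learned" entirely in context.
```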

Automating the math for decision-making under uncertainty

A new tool brings the benefits of AI programming to a much broader class of problems.

ChatGPT: Optimizing Language Models for Dialogue

We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer follow-up questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.
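One way to picture the dialogue format is as a growing list of role-tagged messages that is re-serialized into the model's context on every turn. The sketch below is purely illustrative and does not use ChatGPT's actual interface:

```python
# Illustrative only: dialogue as a growing list of role-tagged messages.
conversation = [
    {"role": "user", "content": "Who wrote 'The Old Man and the Sea'?"},
    {"role": "assistant", "content": "Ernest Hemingway."},
    {"role": "user", "content": "What else did he write?"},  # follow-up turn
]

def serialize(history: list[dict]) -> str:
    """Flatten the whole conversation into one prompt. Conditioning on the
    full history is what lets a dialogue model resolve the "he" in the
    follow-up question above."""
    turns = [f"{m['role']}: {m['content']}" for m in history]
    return "\n".join(turns) + "\nassistant:"

print(serialize(conversation))
```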

Learning to Play Minecraft with Video PreTraining (VPT)

We trained a neural network to play Minecraft by Video PreTraining (VPT) on a massive unlabeled video dataset of human Minecraft play, while using only a small amount of labeled contractor data. With fine-tuning, our model can learn to craft diamond tools, a task that usually takes proficient humans over 20 minutes (24,000 actions).
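At its core, this kind of training reduces to behavioral cloning: supervised prediction of the human's action from video frames. A schematic sketch of that objective follows; the toy network, frame size, and action space are stand-ins, not the VPT architecture:

```python
import torch
import torch.nn as nn

# Behavioral cloning in schematic form: predict the recorded human action
# from the current video frame. The policy below is a toy stand-in.
policy = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 512),   # toy encoder for 64x64 RGB frames
    nn.ReLU(),
    nn.Linear(512, 128),           # 128 discrete actions, for illustration
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(frames: torch.Tensor, actions: torch.Tensor) -> float:
    """One supervised step: frames (B, 3, 64, 64), actions (B,) class ids."""
    optimizer.zero_grad()
    loss = loss_fn(policy(frames), actions)
    loss.backward()
    optimizer.step()
    return loss.item()
```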

Techniques for Training Large Neural Networks

Large neural networks are at the core of many recent advances in AI, but training them is a difficult engineering and research challenge which requires orchestrating a cluster of GPUs to perform a single synchronized calculation. As cluster and model sizes have grown, machine learning practitioners have developed an increasing variety of techniques to parallelize model training over many GPUs.
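The most common such technique is data parallelism: each GPU computes gradients on its own shard of the batch, and an all-reduce averages them so every replica takes the identical optimizer step. A minimal sketch using PyTorch's collective primitives (process-group initialization and launch details omitted):

```python
import torch
import torch.distributed as dist

def data_parallel_step(model, batch, loss_fn, optimizer) -> float:
    """One data-parallel training step. Each rank holds a full model replica
    and a different shard of the batch; an all-reduce sums the gradients so
    that every replica applies the same averaged update."""
    inputs, targets = batch
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad /= world_size   # average gradients across ranks
    optimizer.step()
    return loss.item()
```

The all-reduce is the "single synchronized calculation" mentioned above: every rank must reach it before any rank can proceed, which is why orchestration across the cluster matters.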