AI-Written Critiques Help Humans Notice Flaws
Showing model-generated critical comments to humans helps them find flaws in summaries.
Large neural networks are at the core of many recent advances in AI, but training them is a difficult engineering and research challenge which requires orchestrating a cluster of GPUs to perform a single synchronized calculation. As cluster and model sizes have grown, machine learning practitioners have developed an increasing variety of techniques to parallelize model training over many GPUs.
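As a rough illustration of the simplest of these techniques, data parallelism, here is a minimal sketch in which one global batch is split across several simulated workers, each worker computes gradients on its own shard, and the averaged gradients are applied as a single synchronized update. The model, data, and worker count are made-up stand-ins for illustration, not any particular training setup.

```python
# Minimal sketch of data-parallel training: each "worker" computes gradients
# on its own shard of a batch, the gradients are averaged (the role an
# all-reduce plays on a real GPU cluster), and one synchronized update is
# applied. Model, data, and sizes are illustrative stand-ins.
import torch

model = torch.nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

batch_x = torch.randn(32, 8)   # one global batch
batch_y = torch.randn(32, 1)
num_workers = 4
shards = zip(batch_x.chunk(num_workers), batch_y.chunk(num_workers))

# Accumulate the average of per-shard gradients, mimicking what each GPU
# would compute locally before the all-reduce.
avg_grads = [torch.zeros_like(p) for p in model.parameters()]
for x, y in shards:
    model.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    for g, p in zip(avg_grads, model.parameters()):
        g += p.grad / num_workers

# Apply the averaged gradient as a single synchronized step.
for g, p in zip(avg_grads, model.parameters()):
    p.grad = g
opt.step()
```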
We built a neural theorem prover for Lean that learned to solve a variety of challenging high-school olympiad problems, including problems from the AMC12 and AIME competitions, as well as two problems adapted from the IMO.[1] The prover uses a language model to find proofs of formal statements. Each time it finds a new proof, that proof is used as additional training data, which improves the model and lets it find proofs of progressively harder statements.
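For readers unfamiliar with Lean, here is a toy formal statement and proof, far simpler than the competition problems described above and not drawn from that benchmark, just to show the kind of goal a prover has to close.

```lean
-- Toy Lean statement and proof (illustrative only, not from the benchmark):
-- the prover's job is to produce the proof term to the right of `:=`.
theorem toy_add_comm (a b : ℕ) : a + b = b + a :=
nat.add_comm a b
```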
We’ve trained language models that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through our alignment research. These InstructGPT models, which are trained with humans in the loop, are now deployed as the default language models on our API.
We’ve fine-tuned GPT-3 to more accurately answer open-ended questions using a text-based web browser. Our prototype copies how humans research answers to questions online—it submits search queries, follows links, and scrolls up and down web pages. It is trained to cite its sources, which makes it easier to give feedback and check its factual accuracy.
We’ve trained a system that solves grade school math problems with nearly twice the accuracy of a fine-tuned GPT-3 model. It solves about 90% as many problems as real kids: a small sample of 9-12 year olds scored 60% on a test from our dataset, while our system scored about 55% on the same problems.
Scaling human oversight of AI systems for tasks that are difficult to evaluate.
We’re releasing Triton 1.0, an open-source Python-like programming language which enables researchers with no CUDA experience to write highly efficient GPU code—most of the time on par with what an expert would be able to produce. Triton makes it possible to reach peak hardware performance with relatively little effort.
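To make this concrete, here is a minimal sketch of a Triton kernel in this style: an element-wise vector addition written in Python-like code that Triton compiles to a GPU kernel. It uses Triton's public API but is a simple illustrative example, not one of the tuned kernels discussed in the release.

```python
# Minimal Triton kernel: element-wise vector addition written in
# Python-like code. Illustrative sketch of the API, not a tuned kernel.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                        # which block this program handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                        # guard the final partial block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```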
Our latest research finds we can improve language model behavior with respect to specific behavioral values by fine-tuning on a small, curated dataset.
We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually.