A research team from UC Berkeley has developed a retrieval-augmented language model (LM) pipeline designed specifically for forecasting. The system achieved an average Brier score of 0.179, close to the human aggregate score of 0.149 (lower is better). This indicates that the LM-based system approaches, and in some instances surpasses, the accuracy of aggregated human forecasts from competitive forecasting platforms.
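For readers unfamiliar with the metric, the Brier score for binary questions is the mean squared difference between the forecast probability and the outcome (1 if the event happened, 0 if not). A perfect forecaster scores 0, and always answering 50% scores 0.25. The sketch below is only an illustration of the metric, not the team's evaluation code; the function name `brier_score` and the sample forecasts are made up for the example.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("at least one forecast is required")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: four questions, with the probability assigned to "yes"
# and how each question actually resolved (1 = yes, 0 = no).
probs = [0.9, 0.2, 0.7, 0.5]
resolved = [1, 0, 1, 0]

score = brier_score(probs, resolved)
baseline = brier_score([0.5] * len(resolved), resolved)  # always guessing 50%

print(f"forecaster: {score:.4f}")
print(f"50% baseline: {baseline:.4f}")
```

On this scale, both reported scores (0.179 for the system and 0.149 for the human aggregate) sit below the 0.25 that a constant 50% guess would earn.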