/u/Successful-Western27

Brown University Paper: Low-Resource Languages (Zulu, Scots Gaelic, Hmong, Guarani) Can Easily Jailbreak LLMs

Researchers from Brown University presented a new study showing that translating unsafe prompts into low-resource languages makes it easy to bypass safety measures in LLMs. By converting English inputs like "how to steal without getting …

DeepMind, Univ. of Illinois: Is self-correction a viable method to improve LLM reasoning? Probably not.

Can LLMs actually improve their own reasoning by self-correcting mistakes? A new paper from DeepMind and the University of Illinois sets out to answer this quantitatively. The results show that, left unaided, LLMs struggle to self-correct on reasoning tasks…
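
A minimal sketch of the intrinsic self-correction loop being evaluated, assuming a hypothetical `query_llm` helper that wraps whatever chat API you use (the prompts and helper are illustrative, not the paper's code):

```python
# Sketch of unaided ("intrinsic") self-correction; query_llm is a
# hypothetical stand-in for any chat-completion call.

def query_llm(prompt: str) -> str:
    raise NotImplementedError("wrap your LLM API of choice here")

def self_correct(question: str, rounds: int = 2) -> str:
    answer = query_llm(f"Question: {question}\nAnswer step by step.")
    for _ in range(rounds):
        critique = query_llm(
            f"Question: {question}\nYour answer: {answer}\n"
            "Review your answer and describe any mistakes you find."
        )
        answer = query_llm(
            f"Question: {question}\nPrevious answer: {answer}\n"
            f"Critique: {critique}\nGive a corrected final answer."
        )
    return answer
```

The catch the paper quantifies: with no external signal saying which answers are actually wrong, the critique step can talk the model out of a correct answer as easily as into one.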

New Paper: Enabling Language Models to Implicitly Learn Self-Improvement From Data

LLMs keep getting more capable at generating natural language. But there's always room for improving the quality and alignment of their responses. Typically this requires lots of human effort to collect more training data. So researchers are explor…

Infinite context windows? Streaming LLMs can be extended to infinite sequence lengths without any fine-tuning.

LLMs like GPT-3 struggle in streaming applications like chatbots because their performance tanks on long texts exceeding their training length. I checked out a new paper investigating why windowed attention fails here. By visualizing the attention maps, th…
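
The paper's fix, StreamingLLM, keeps a handful of initial "attention sink" tokens alongside the sliding window. A minimal sketch of that eviction policy (class and parameter names are mine, and positional re-indexing is omitted for brevity):

```python
# StreamingLLM-style KV cache sketch: always retain the first n_sink
# tokens ("attention sinks") plus a sliding window of recent tokens.

class SinkCache:
    def __init__(self, n_sink: int = 4, window: int = 1024):
        self.n_sink, self.window = n_sink, window
        self.keys, self.values = [], []  # one (K, V) pair per cached token

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)
        if len(self.keys) > self.n_sink + self.window:
            # Evict the oldest non-sink token; never evict the sinks.
            del self.keys[self.n_sink]
            del self.values[self.n_sink]
```

Plain windowed attention evicts those earliest tokens too, and the attention maps show why that is fatal: a surprisingly large share of attention mass always lands on the first few positions, so dropping them perturbs every later token.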

Tool-Integrated Reasoning: A New Approach for Math-Savvy LLMs

When trying to get language models to solve complex math problems, researchers kept running into limits. Models like GPT-3 and ChatGPT still struggle with advanced algebra, calculus, and geometry questions. The math is just too abstract and symbol-heav…
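
The tool-integrated idea is to let the model interleave natural-language reasoning with program snippets it actually executes. A rough sketch under that reading, with a hypothetical `query_llm` stub and Python itself as the tool:

```python
import contextlib, io, re

def query_llm(prompt: str) -> str:
    raise NotImplementedError("wrap your LLM API of choice here")

def run_python(code: str) -> str:
    # Execute a model-written snippet and capture what it prints.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})  # sandbox this in any real deployment
    return buf.getvalue()

def solve(problem: str) -> str:
    draft = query_llm(
        f"Solve: {problem}\nReason in text, and put any computation in a "
        "```python``` block that prints its result."
    )
    results = [run_python(c) for c in re.findall(r"```python\n(.*?)```", draft, re.S)]
    return query_llm(
        f"Problem: {problem}\nDraft solution:\n{draft}\n"
        f"Program outputs: {results}\nState the final answer."
    )
```

The division of labor is the point: the model handles the semantic reasoning while the interpreter handles the exact symbolic and numeric work it is bad at.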

Meta, INRIA researchers discover that explicit registers eliminate ViT attention spikes

When visualizing the inner workings of vision transformers (ViTs), researchers noticed weird spikes of attention on random background patches. This didn't make sense since the models should focus on foreground objects. By analyzing the output embed…
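
The proposed fix is strikingly simple: append a few extra learnable "register" tokens to the input sequence so the model has somewhere to stash global computation, then discard them at the output. A minimal PyTorch sketch (the wrapper and its names are mine, not the paper's code):

```python
import torch
import torch.nn as nn

class ViTWithRegisters(nn.Module):
    """Wrap any token-sequence encoder with extra register tokens."""

    def __init__(self, encoder: nn.Module, dim: int, n_registers: int = 4):
        super().__init__()
        self.encoder = encoder
        self.n_registers = n_registers
        self.registers = nn.Parameter(torch.randn(1, n_registers, dim) * 0.02)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (batch, n_tokens, dim), CLS token included if used
        regs = self.registers.expand(patch_tokens.shape[0], -1, -1)
        x = self.encoder(torch.cat([patch_tokens, regs], dim=1))
        # The registers soak up the global "scratchpad" activity that
        # otherwise hijacks background patches; drop them at the output.
        return x[:, : -self.n_registers]
```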

Show-1: Marrying Pixel and Latent Diffusion Models for Efficient and High-Quality Text-to-Video Generation

A new paper proposes Show-1, a hybrid model that combines pixel and latent diffusion for efficient, high-quality text-to-video generation. Each approach has its tradeoffs, so researchers at the National University of Singapore tried a hybrid app…
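
Roughly, the hybrid plays to each side's strengths: pixel-space diffusion handles low-resolution keyframes and temporal interpolation, where text alignment matters most, while latent-space diffusion handles super-resolution, where it is far cheaper. A purely illustrative sketch; every function name and resolution below is a hypothetical placeholder:

```python
# Hypothetical stage functions sketching the hybrid pipeline; none of
# these names or numbers come from the paper's code.

def pixel_keyframes(prompt: str, res: int):
    raise NotImplementedError("pixel diffusion: strong text alignment, low res")

def pixel_interpolate(frames, target_fps: int):
    raise NotImplementedError("pixel diffusion: temporal infill between keyframes")

def latent_upscale(frames, prompt: str, res: int):
    raise NotImplementedError("latent diffusion: cheap high-res detail")

def generate_video(prompt: str):
    keyframes = pixel_keyframes(prompt, res=64)
    low_res = pixel_interpolate(keyframes, target_fps=24)
    return latent_upscale(low_res, prompt, res=576)
```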

UNC Researchers Present VideoDirectorGPT: Using AI to Generate Multi-Scene Videos from Text

Generating coherent videos spanning multiple scenes from text descriptions poses unique challenges for AI. While recent progress enables creating short clips, smoothly transitioning across diverse events and maintaining continuity remains difficult. A …
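
The design is two-stage: an LLM "director" expands the prompt into a structured multi-scene plan, and a grounded video generator renders each scene, reusing shared entity descriptions for continuity. A hedged sketch with hypothetical stubs for both stages:

```python
import json

def query_llm(prompt: str) -> str:
    raise NotImplementedError("the planner, e.g. a GPT-4-class model")

def render_scene(description: str, entities: list):
    raise NotImplementedError("the grounded text-to-video generator")

def direct_video(user_prompt: str):
    # Stage 1: the LLM writes a structured multi-scene plan.
    plan = json.loads(query_llm(
        f"Expand into a JSON list of scenes for: {user_prompt}\n"
        'Each scene: {"description": "...", "entities": ["..."]}'
    ))
    # Stage 2: render scene by scene; the shared entity list is what
    # keeps characters and objects consistent across scene boundaries.
    return [render_scene(s["description"], s["entities"]) for s in plan]
```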

Microsoft Researchers Propose AI Morality Test for LLMs in New Study

Researchers from Microsoft have just proposed using a psychological assessment tool called the Defining Issues Test (DIT) to evaluate the moral reasoning capabilities of large language models (LLMs) like GPT-3, ChatGPT, etc. The DIT presents moral dile…
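
In DIT terms, the model reads a dilemma plus a list of considerations, picks the four most important, and the P-score measures how much of that ranking weight lands on post-conventional (stage 5/6) items. A minimal scoring sketch, with a hypothetical `query_llm` stub and made-up item indices:

```python
def query_llm(prompt: str) -> str:
    raise NotImplementedError("wrap your LLM API of choice here")

# Indices of the considerations that tap post-conventional (stage 5/6)
# reasoning; these particular indices are made up for illustration.
POSTCONVENTIONAL = {2, 5, 8, 11}

def p_score(dilemma: str, items: list) -> float:
    numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(items))
    reply = query_llm(
        f"{dilemma}\nConsiderations:\n{numbered}\n"
        "List the numbers of the four most important, most important first."
    )
    ranks = [int(t) for t in reply.replace(",", " ").split() if t.isdigit()][:4]
    weights = [4, 3, 2, 1]  # standard DIT weighting of the top four picks
    raw = sum(w for r, w in zip(ranks, weights) if r in POSTCONVENTIONAL)
    return 100 * raw / 10  # percentage of the maximum possible weight
```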

DeepMind: Increasing the learning rate in small models lets you reproduce training instabilities seen in large ones

Training giant AI models like GPT-3 requires enormous resources: thousands of GPUs running for months. As a solo researcher without access to that kind of scale, I can't easily reproduce experiments and findings from papers on huge models. But a new …
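
The recipe, as I read it: sweep the learning rate upward on a small model and the same training instabilities that plague large models show up early, so you can study them cheaply. A toy sweep, assuming a hypothetical `train_steps` helper:

```python
def train_steps(params: str, lr: float, steps: int) -> list:
    raise NotImplementedError("train a small transformer, return its loss curve")

# Sweep learning rates upward on a tiny model and flag divergence, the
# same symptom large models hit at their (much lower) critical LRs.
for lr in [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]:
    losses = train_steps(params="10M", lr=lr, steps=2000)
    diverged = losses[-1] > 2 * min(losses)  # crude divergence check
    print(f"lr={lr:.0e}  final_loss={losses[-1]:.3f}  diverged={diverged}")
```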