<span class="vcard">/u/Successful-Western27</span>

Snapchat used AI agents to build a sound-aware video captioning system

Training AI to understand and describe video content requires datasets which are expensive for humans to annotate manually. Now researchers from Snap, UC Merced, and the University of Trento have put together a new dataset called Panda-70M that aims to…

Google DeepMind uses AI to discover 2.2 million new materials, equivalent to nearly 800 years’ worth of knowledge, and says 736 have already been validated in laboratories.

Materials discovery is critical but tough. New materials enable big innovations like batteries or LEDs. But there is a practically infinite number of combinations to try. Testing them experimentally is slow and expensive. So scientists and engineers want to simul…

Researchers present SuGaR: Surface-Aligned Gaussian Splatting for Speedy 3D Mesh Reconstruction

Computer vision researchers developed a way to create detailed 3D models from images in just minutes on a single GPU. Their method, called SuGaR, works by optimizing millions of tiny particles to match images of a scene. The key innovation is getting t…

You can predict disease progression by modeling health data in latent space

Many complex diseases like autoimmune disorders have highly variable progression between patients, making them difficult to understand and predict. A new paper shows that visualizing health data in the latent space helps find hidden patterns in clinica…

They found a new NeRF technique to turn videos into controllable 3D models

The key challenge is that NeRFs typically require images from multiple viewpoints to reconstruct a scene in 3D, whereas videos provide only a single view over time. That means we have to capture a lot of data to create a NeRF. What if there were a way to creat…

Telling GPT-4 you’re scared or under pressure improves performance

In a recent paper, researchers have discovered that LLMs show enhanced performance when provided with prompts infused with emotional context, which they call "EmotionPrompts." These prompts incorporate sentiments of urgency or importance, suc…

HyperFields: towards zero-shot NeRFs by mapping language to 3D geometry

Generating 3D objects based solely on text descriptions has proven extremely challenging for AI. Current state-of-the-art methods require optimizing a full 3D model from scratch for each new prompt, which is computationally demanding. A new technique c…

Using Multi-Agent Reinforcement Learning results in better urban planning outcomes

Urban planning is tricky – governments push top-down changes while locals want bottom-up ideas. It's hard to find compromises that make everyone happier. A new research paper proposes using Multi-Agent Reinforcement Learning (MARL) to vote on land …

Researchers propose 3D-GPT: combining LLMs and agents for procedural Text-to-3D model generation

Researchers propose a new AI system called 3D-GPT that creates 3D models by combining natural language instructions and agents specialized for working with existing 3D modeling tools. 3D-GPT has predefined functions that make 3D shapes, and it tweaks p…

Meta Announces New Method for Real-Time Decoding of Images from Brain Activity

Brain decoding tech has improved a lot recently thanks to AI/ML, enabling reading out visual perceptions from fMRI brain scans. But fMRI is too slow for real-time BCIs. A new study from Meta's AI research team pushes brain reading into real-time us…