computer-vision

From physics to generative AI: An AI model for advanced pattern generation

Inspired by physics, a new generative model, PFGM++, outperforms diffusion models in image generation.

Multi-AI collaboration helps reasoning and factual accuracy in large language models

Researchers use multiple AI models to collaborate, debate, and improve their reasoning abilities to advance the performance of LLMs while increasing accountability and factual accuracy.

A pose-mapping technique could remotely evaluate patients with cerebral palsy

The machine-learning method works on most mobile devices and could be expanded to assess other motor disorders outside of the doctor’s office.

Helping computer vision and language models understand what they see

Researchers use synthetic data to improve a model’s ability to grasp conceptual information, which could enhance automatic captioning and question-answering systems.

AI model speeds up high-resolution computer vision

The system could improve image quality in video streaming or help autonomous vehicles identify road hazards in real time.

MIT researchers combine deep learning and physics to fix motion-corrupted MRI scans

The challenge involves more than just a blurry JPEG. Fixing motion artifacts in medical imaging requires a more sophisticated approach.

Using AI to protect against AI image manipulation

“PhotoGuard,” developed by MIT CSAIL researchers, prevents unauthorized image manipulation, safeguarding authenticity in the era of advanced generative models.

A new dataset of Arctic images will spur artificial intelligence research

The dataset, being collected as part of a US Coast Guard science mission, will be released open source to help advance naval mission planning and climate change studies.

When computer vision works more like a brain, it sees more like people do

Training artificial neural networks with data from real brains can make computer vision more robust.

Computer vision system marries image recognition and generation

MAGE merges the two key tasks of image generation and recognition, typically trained separately, into a single system.