Adam Zewe | MIT News

Study: AI could lead to inconsistent outcomes in home surveillance

Researchers find large language models make inconsistent decisions about whether to call the police when analyzing surveillance videos.

Study: Transparency is often lacking in datasets used to train large language models

Researchers developed an easy-to-use tool that enables an AI practitioner to find data that suits the purpose of their model, which could improve accuracy and reduce bias.

3 Questions: How to prove humanity online

AI agents could soon become indistinguishable from humans online. Could “personhood credentials” protect people against digital imposters?

MIT researchers use large language models to flag problems in complex systems

The approach can detect anomalies in data recorded over time, without the need for any training.

Method prevents an AI model from being overconfident about wrong answers

More efficient than other approaches, the “Thermometer” technique could help someone know when they should trust a large language model.

Study: When allocating scarce resources with AI, randomization can improve fairness

Introducing structured randomization into decisions based on machine-learning model predictions can address inherent uncertainties while maintaining efficiency.

Large language models don’t behave like people, even though we may expect them to

A new study shows someone’s beliefs about an LLM play a significant role in the model’s performance and are important for how it is deployed.

AI model identifies certain breast tumor stages likely to progress to invasive cancer

The model could help clinicians assess breast cancer stage and ultimately help in reducing overtreatment.

AI method radically speeds predictions of materials’ thermal properties

The approach could help engineers design more efficient energy-conversion systems and faster microelectronic devices, reducing waste heat.

How to assess a general-purpose AI model’s reliability before it’s deployed

A new technique enables users to compare several large models and choose the one that works best for their task.