A sobering tale of AI governance

I think this study tells a very sobering tale with respect to AI governance. It points at issues that are fundamental, deeper than the kind of contingent problems that proper engineering can solve.

This post, along with the one I wrote here a few days ago regarding Turing completeness, captures my thoughts on the walls that AI governance has no hope of scaling. It's a delusion.

In our social realm, as subjective creatures, we have governance in the form of laws, yet even that is not enough: the State still has to prove that your particular scenario violates that particular law. We have laws, yet we still require judicial courts to establish, subjectively, that a law applies in a given situation. Where is the corresponding path for subjectivity within the AI realm?

This study talks of:

16.1 Failures of Social Coherence

- "Discrepancy between the agent’s reports and actual actions"

- "Failures in knowledge and authority attribution"

- "Susceptibility to social pressure without proportionality"

- "Failures of social coherence"

16.2 What LLM-Backed Agents Are Lacking

- "No stakeholder model"

- "No self-model"

- "No private deliberation surface"

16.3 Fundamental vs. Contingent Failures

16.4 Multi-Agent Amplification

- "Knowledge transfer propagates vulnerabilities alongside capabilities"

- "Mutual reinforcement creates false confidence"

- "Shared channels create identity confusion"

- "Responsibility becomes harder to trace"

And it is littered with statements such as these (the prompt-injection point is illustrated in a sketch after this list):

- "novel risk surfaces emerge that cannot be fully captured by static benchmarking"

- "it failed to realize that deleting the email server would also prevent the owner from using it. Like early rule-based AI systems, which required countless explicit rules to describe how actions change (or don’t change) the world, the agent lacks an understanding of structural dependencies and common-sense consequences"

- "The inability to distinguish instructions from data in a token-based context window makes prompt injection a structural feature, not a fixable bug"

- "Multi-agent communication creates situations that have no single-agent analog, and for which there is no common evaluations. This is a critical direction for future research."

- "A key finding in this line of work is that single-turn evaluations can substantially underestimate risk, because malicious intent, persuasion, and unsafe outcomes may only emerge through sequential and socially grounded exchanges"

- "but we argue that clarifying and operationalizing responsibility is a central unresolved challenge for the safe deployment of autonomous, socially embedded AI systems"

- "He argues that conventional governance tools face fundamental limitations when applied to systems making uninterpretable decisions at unprecedented speed and scale"

- "However, the failure modes we document differ importantly from those targeted by most technical adversarial ML work. Our case studies involve no gradient access, no poisoned training data, and no technically sophisticated attack infrastructure. Instead, the dominant attack surface across our findings is social"

- "Collectively, these findings suggest that in deployed agentic systems, low-cost social attack surfaces may pose a more immediate practical threat than the technical jailbreaks that dominate the adversarial ML literature."

Are these fundamental or contingent issues?

I'd be interested in the thoughts of others here on what the future of AI governance will be.
