Video games serve as the basis for judging, under the following criteria.
- Game Scope
A wide range of standard video games across common genres and formats (2D/3D, real-time/turn-based, single-player/multiplayer).
Examples: Chess, Clash of Clans, GTA V
- Learning Efficiency
The system must not rely on brute-force trial and error that requires millions of gameplay trials.
Training/playtime must be comparable to what an average human needs to reach competence in that game.
- Autonomous Rule Acquisition (No Pre-coded Rules)
The system must operate with the same sensory inputs as human players (the game screen).
No privileged engine access (e.g., hidden variables, API calls, or internal game state).
No hard-coded mechanics or rules may be provided in advance.
The system must infer game rules and mechanics solely from gameplay experience.
- Performance Benchmark
The system must reach or surpass the average human player level, measured by each game’s native scoring, ranking, or progression system.
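To make the criteria above concrete, here is a rough sketch of how a per-game pass/fail check could be written. Everything in it is an assumption for illustration: the `GameResult` fields, the `passes` and `passes_all` helpers, and especially `time_tolerance`, since the criteria only say playtime must be "comparable" to a human's and do not give a number. The values in the usage lines are made up, not measurements.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    """Hypothetical record of one system's run on one game."""
    game: str
    agent_hours: float      # playtime the system used to learn the game
    human_hours: float      # playtime an average human needs for competence
    agent_score: float      # measured by the game's native scoring/ranking
    human_avg_score: float  # average human player on the same measure
    screen_only: bool       # True if the only input was the game screen
    rules_precoded: bool    # True if any rules/mechanics were given in advance


def passes(result: GameResult, time_tolerance: float = 1.5) -> bool:
    """Check one game against the three measurable criteria.

    time_tolerance is an assumed reading of "comparable": the system may
    use at most this multiple of the human learning time.
    """
    efficient = result.agent_hours <= time_tolerance * result.human_hours
    autonomous = result.screen_only and not result.rules_precoded
    competent = result.agent_score >= result.human_avg_score
    return efficient and autonomous and competent


def passes_all(results: list[GameResult], time_tolerance: float = 1.5) -> bool:
    """Game Scope: the system has to pass on every game in the test set."""
    return len(results) > 0 and all(passes(r, time_tolerance) for r in results)


# Illustrative, made-up numbers.
chess = GameResult("Chess", 40, 50, 1500, 1400, True, False)
brute = GameResult("GTA V", 5000, 30, 90, 60, True, False)
print(passes(chess), passes(brute), passes_all([chess, brute]))
```

The second record fails only on learning efficiency, which is the point of that criterion: a high score reached through far more playtime than a human needs does not count.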