artificial
artificial

End-to-End GUI Agent for Automated Computer Interaction: Superior Performance Without Expert Prompts or Commercial Models

UI-TARS introduces a novel architecture for automated GUI interaction by combining vision-language models with native OS integration. The key innovation is using a three-stage pipeline (perception, reasoning, action) that operates directly through OS-l…