A new survey by Microsoft researchers and academic partners highlights that artificial intelligence (AI) agents driven by large language models (LLMs) are evolving to control graphical user interfaces (GUIs), potentially altering human-software interaction. These AI systems can now perform tasks like clicking buttons and navigating apps, interpreting natural language to execute commands. Described as a major paradigm shift, such "GUI agents" allow users to undertake complex tasks through simple conversation, transforming user experience across web navigation, mobile apps, and desktop automation. Major tech companies are integrating these capabilities. For instance, Microsoft’s Power Automate and Copilot AI assist in automating workflows and software control, while Anthropic's Claude enables web interfacing. Google is reportedly working on Project Jarvis, using Chrome for web tasks. The rise of LLMs, particularly multimodal ones, marks a new phase in GUI automation, with significant potential market growth from $8. 3 billion in 2022 to $68. 9 billion by 2028, as per BCC Research.
This growth reflects enterprises’ push to make software more accessible and reduce repetitive tasks. However, challenges such as privacy concerns, performance issues, and safety remain before widespread adoption. Earlier automation approaches lacked flexibility for real-world applications. Solutions include developing efficient local models, enhancing security, and standardizing evaluations. Experts foresee a shift toward multi-agent architectures and multimodal capabilities in GUI automation, which could significantly boost productivity but necessitate careful consideration of security and infrastructure implications. Industry experts predict widespread enterprise adoption of GUI automation agents by 2025, with potential efficiency gains and challenges regarding data privacy and job impact. The survey underscores a crucial moment for conversational AI interfaces to redefine software interaction, pending technological and enterprise deployment advancements. Researchers foresee AI assistants becoming integral to how we work with computers, handling complex and dynamic environments efficiently.
AI-Driven GUI Agents: Transforming Human-Software Interaction
Tesla, Inc.
Meta, the American tech giant behind Facebook, Instagram, and WhatsApp, has recently announced a strategic AI-focused partnership with a consortium of leading international media organizations.
Since the advent of online search, some marketers, webmasters, and SEOs have sought to cheat the system to gain unfair advantages—practices known as Black Hat SEO.
Key Takeaways Fire TV now lets you use Alexa+ to jump directly to a specific movie scene on Prime Video by simply describing the moment—no need to search or fast-forward
Key statistic: According to a June 2025 survey by Ascend2 and Unbounce, 31% of US SMB marketers and business owners utilize AI-driven design or layout recommendations to optimize landing pages.
Meta Platforms Inc.
According to recent TrendForce research, the rising demand for artificial intelligence (AI) servers is significantly shaping the strategies of leading North American cloud service providers (CSPs).
Launch your AI-powered team to automate Marketing, Sales & Growth
and get clients on autopilot — from social media and search engines. No ads needed
Begin getting your first leads today