A new survey by Microsoft researchers and academic partners reports that artificial intelligence (AI) agents driven by large language models (LLMs) are learning to control graphical user interfaces (GUIs), a development that could change how people interact with software. These agents interpret natural-language instructions and carry them out by clicking buttons and navigating apps. The survey describes this as a major paradigm shift: "GUI agents" let users complete complex tasks through simple conversation, across web navigation, mobile apps, and desktop automation. Major tech companies are already integrating these capabilities. Microsoft’s Power Automate and Copilot AI help automate workflows and control software, Anthropic’s Claude can interact with web interfaces, and Google is reportedly working on Project Jarvis, which uses Chrome to carry out web tasks. The rise of LLMs, particularly multimodal ones, marks a new phase in GUI automation, and BCC Research projects that the market will grow from $8.3 billion in 2022 to $68.9 billion by 2028.
This growth reflects enterprises’ push to make software more accessible and to reduce repetitive tasks. Earlier automation approaches lacked the flexibility needed for real-world applications, and challenges around privacy, performance, and safety still stand in the way of widespread adoption. Proposed solutions include developing efficient local models, strengthening security, and standardizing evaluations. Experts foresee a shift toward multi-agent architectures and richer multimodal capabilities in GUI automation, which could significantly boost productivity but will require careful attention to security and infrastructure. Industry experts predict widespread enterprise adoption of GUI automation agents by 2025, bringing efficiency gains along with questions about data privacy and job impact. The survey marks a pivotal moment for conversational AI interfaces, whose ability to redefine software interaction depends on further advances in the technology and in enterprise deployment. The researchers expect AI assistants to become integral to how we work with computers, handling complex and dynamic environments efficiently.
AI-Driven GUI Agents: Transforming Human-Software Interaction