Research / 2025

UXAgent: An LLM-Agent-Based Usability Testing Framework for Web Design

Yuxuan Lu, Bingsheng Yao, Hansu Gu, Jing Huang, Zheshen (Jessie) Wang, Yang Li, Jiri Gesi, Qi He, Toby Jia-Jun Li, Dakuo Wang

Agents
Usability Testing
Web Design
User Personas
Automated Testing
Benchmarking

In brief

UXAgent is an LLM-agent-based framework that automates usability testing for web designs, enabling researchers to simulate thousands of user personas and collect multimodal data to iterate on study designs before human-subject studies.

Executive Summary

This white paper introduces UXAgent, an LLM-agent-based framework for usability testing of web designs. It addresses two recurring challenges in traditional UX research: the difficulty of gathering early feedback on experiment designs, and the difficulty of recruiting enough qualified participants. UXAgent lets engineers, designers, and product managers automatically generate thousands of diverse user personas and simulate their interactions with a target website. The simulated pilot session yields both quantitative and qualitative usability data, so teams can iterate on and refine a study design before running costly and time-consuming human-subject studies.

Key Technical Advancements

LLM Agent Module: The core of UXAgent, this module is designed to simulate human-like behaviors and reasoning processes. It operates with a two-loop structure (Fast Loop for real-time interaction, Slow Loop for in-depth reasoning) and a Memory Stream to store observations, actions, reflections, and thoughts, mimicking human cognitive patterns. This allows agents to navigate and interact with web environments in a “believable” manner.
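The two-loop structure and Memory Stream can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `MemoryStream`, `run_agent`, `demo_fast_policy`, and `demo_slow_reasoner` are hypothetical, the two stub functions stand in for LLM calls, and the loops are interleaved sequentially (a reflection every few steps) rather than reproducing however UXAgent actually schedules them.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# The four memory kinds named in the text.
KINDS = {"observation", "action", "reflection", "thought"}


@dataclass
class Memory:
    kind: str
    text: str
    step: int


@dataclass
class MemoryStream:
    entries: List[Memory] = field(default_factory=list)

    def add(self, kind: str, text: str, step: int) -> None:
        if kind not in KINDS:
            raise ValueError(f"unknown memory kind: {kind}")
        self.entries.append(Memory(kind, text, step))

    def recent(self, n: int) -> List[Memory]:
        return self.entries[-n:]


def run_agent(
    observations: List[str],
    fast_policy: Callable[[str, List[Memory]], str],
    slow_reasoner: Callable[[List[Memory]], str],
    reflect_every: int = 2,
) -> MemoryStream:
    """Fast loop acts on every observation; slow loop reflects periodically."""
    stream = MemoryStream()
    for step, obs in enumerate(observations, start=1):
        stream.add("observation", obs, step)
        # Fast loop: pick the next action from the latest observation.
        stream.add("action", fast_policy(obs, stream.recent(5)), step)
        # Slow loop (simplified): reflect over a wider window of memories.
        if step % reflect_every == 0:
            stream.add("reflection", slow_reasoner(stream.recent(10)), step)
    return stream


# Hypothetical stand-ins for LLM calls, kept deterministic for the example.
def demo_fast_policy(obs: str, recent: List[Memory]) -> str:
    return "click search_button" if "search" in obs else "back"


def demo_slow_reasoner(recent: List[Memory]) -> str:
    n_actions = sum(1 for m in recent if m.kind == "action")
    return f"Reviewed {n_actions} recent actions."


if __name__ == "__main__":
    result = run_agent(
        ["page with search box", "results list"],
        demo_fast_policy,
        demo_slow_reasoner,
    )
    for m in result.entries:
        print(m.step, m.kind, m.text)
```

Keeping all memory kinds in one ordered stream means both loops read from the same history, which is the property the text attributes to the Memory Stream.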

Universal Browser Connector Module: This module acts as the interface between the LLM Agent and any web environment. It parses raw HTML into a simplified structure, serving as the agent’s observation space, and generates a task-agnostic action space (e.g., “click,” “type,” “back”) that generalizes across different websites. This enables the LLM Agents to seamlessly interact with web pages without extensive customization.
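As a rough illustration of this idea, the sketch below reduces raw HTML to a list of interactive elements and derives a site-independent action list from it. It uses only Python's standard `html.parser`. The element schema, the choice of tags, and the names `simplify` and `action_space` are assumptions for the example, not UXAgent's actual representation.

```python
from html.parser import HTMLParser
from typing import Dict, List


class InteractiveElementParser(HTMLParser):
    """Collects links, buttons, and inputs as a simplified observation."""

    INTERACTIVE = {"a", "button", "input"}

    def __init__(self) -> None:
        super().__init__()
        self.elements: List[Dict[str, str]] = []
        self._open = None  # element whose inner text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in self.INTERACTIVE:
            return
        attr_map = dict(attrs)
        element = {
            "id": f"e{len(self.elements)}",
            "tag": tag,
            "label": attr_map.get("placeholder") or attr_map.get("name") or "",
        }
        self.elements.append(element)
        if tag != "input":  # inputs are void elements with no inner text
            self._open = element

    def handle_data(self, data):
        text = data.strip()
        if self._open is not None and text:
            self._open["label"] = (self._open["label"] + " " + text).strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def simplify(html: str) -> List[Dict[str, str]]:
    """Observation space: one dict per interactive element."""
    parser = InteractiveElementParser()
    parser.feed(html)
    parser.close()
    return parser.elements


def action_space(elements: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Task-agnostic actions: 'back' plus one click/type per element."""
    actions = [{"type": "back"}]
    for el in elements:
        kind = "type" if el["tag"] == "input" else "click"
        actions.append({"type": kind, "target": el["id"]})
    return actions


if __name__ == "__main__":
    page = (
        '<div><input name="q" placeholder="Search">'
        '<button>Go</button><a href="/jackets">Jackets</a></div>'
    )
    observed = simplify(page)
    print(observed)
    print(action_space(observed))
```

Because the action list is built from element types rather than site-specific selectors, the same code applies to any page that uses standard links, buttons, and inputs.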

Persona Generator Module: To support large-scale simulations with diverse user backgrounds, this module generates thousands of unique agent personas. Users can specify demographic distributions, and the generator automatically creates personas with varied information, ensuring diversity in the simulated user base.
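The demographic-sampling step might look like the sketch below, which draws attribute values according to user-specified weights. The function name `sample_personas`, the attribute names, and the weights are hypothetical. The sketch covers only attribute sampling; expanding each sampled profile into a full persona description is left out.

```python
import random
from typing import Dict, List


def sample_personas(
    distributions: Dict[str, Dict[str, float]],
    n: int,
    seed: int = 0,
) -> List[Dict[str, object]]:
    """Draw n persona profiles, one value per attribute, by weight."""
    rng = random.Random(seed)  # seeded so a simulated cohort is reproducible
    personas = []
    for i in range(n):
        persona: Dict[str, object] = {"id": i}
        for attribute, weights in distributions.items():
            values = list(weights)
            persona[attribute] = rng.choices(
                values, weights=[weights[v] for v in values], k=1
            )[0]
        personas.append(persona)
    return personas


if __name__ == "__main__":
    # Hypothetical target distribution chosen for illustration only.
    demo_distribution = {
        "age_group": {"18-29": 0.3, "30-49": 0.5, "50+": 0.2},
        "income": {"low": 0.4, "middle": 0.4, "high": 0.2},
    }
    for p in sample_personas(demo_distribution, n=3, seed=42):
        print(p)
```

Sampling each attribute independently is the simplest choice; correlated attributes (for example, income depending on age) would need a joint distribution instead.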

Multimodal Data Generation: UXAgent generates UX study results in both qualitative formats (e.g., “interviewing” agents via a chat interface about their thoughts) and quantitative formats (e.g., number of actions, final outcomes), alongside supporting artifacts such as video recordings, action traces, and memory logs. UX researchers can analyze this dataset with methods they already use.
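To make the quantitative side concrete, the sketch below aggregates two of the measures mentioned above, number of actions and final outcome, across simulated sessions. The session record layout, the `"purchased"` outcome label, and the function name `summarize_sessions` are assumptions for illustration, not UXAgent's output schema.

```python
from statistics import mean
from typing import Dict, List


def summarize_sessions(sessions: List[Dict], success: str = "purchased") -> Dict:
    """Aggregate per-session action counts and final outcomes."""
    if not sessions:
        raise ValueError("no sessions to summarize")
    completed = sum(1 for s in sessions if s["outcome"] == success)
    return {
        "sessions": len(sessions),
        "completion_rate": completed / len(sessions),
        "mean_actions": mean(len(s["actions"]) for s in sessions),
    }


if __name__ == "__main__":
    # Hypothetical session records, written by hand for the example.
    demo_sessions = [
        {"persona_id": 0, "actions": ["type", "click", "click"], "outcome": "purchased"},
        {"persona_id": 1, "actions": ["click", "back", "click", "click", "back"], "outcome": "abandoned"},
    ]
    print(summarize_sessions(demo_sessions))
```

Summaries like this are the kind of high-level view a researcher would typically want before opening individual action traces or memory logs.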

Practical Implications & Use Cases

Technical impact: UXAgent provides a framework for integrating advanced LLM capabilities into automated testing pipelines. For engineers, it means developing systems that can interpret complex web environments and execute nuanced actions, potentially reducing the need for brittle, script-based UI automation. It opens avenues for more sophisticated AI-driven testing tools that can adapt to diverse web layouts and user intents.

UX/UI implications: For designers, UXAgent offers an unprecedented opportunity to conduct rapid, iterative usability testing during the design phase. It enables them to get early feedback on new features or web page designs from a diverse set of simulated users, identifying usability issues and validating design choices much faster than traditional methods. The chat interface allows for “qualitative interviews” with agents, providing insights into their thought processes during interaction.

Strategic implications: Product managers can use UXAgent to de-risk product launches by evaluating new web features before investing heavily in human-subject studies. It supports faster iteration on product designs, which can lead to improved user experience and potentially higher adoption rates. Simulating thousands of users supports A/B comparisons and helps teams understand how behavior varies across demographic groups, such as income levels, which can inform feature prioritization and roadmap decisions.

Challenges and Limitations

Data Realism and Trust: While participants found the generated data helpful, they expressed concerns that the simulated behavior might not be “like real humans,” citing “too much detail” or “not realistic” thought processes from the LLM Agents. Establishing trust in the generated data remains a challenge.

Model and Data Bias: Participants raised concerns about potential biases introduced by the LLM Agents, stemming from biases in their training data (Data Representation Bias) or the pre-training/fine-tuning stages (Algorithmic Decision Bias). These biases could potentially distort study designs or results.

Raw Data Analysis Complexity: Raw memory traces and observation data from LLM Agents were found to be hard to read and analyze by UX researchers, highlighting the need for automated high-level insight summaries.

Scope Limitations: The current system primarily processes semantic/textual HTML information and excludes visual elements. Its application is currently limited to web environments and basic actions, and testing so far has used one specific intent, “buy a jacket.”

Future Outlook & Considerations

Future research for UXAgent aims to enhance data realism and address biases by exploring Multimodal LLMs (MLLMs) to interpret both textual and visual information, improving contextual understanding. The system could expand beyond web applications to desktop, mobile, or mixed-reality environments, and support more complex user intents (e.g., window shopping) and actions (e.g., “read,” “scroll”). A critical next step is to conduct systematic analyses comparing LLM-generated simulations with real human participants to better understand the benefits and limitations. Teams should also consider the ethical implications of using AI-generated data, ensuring it complements rather than replaces human-subject studies, and implement robust privacy safeguards. The development of automated high-level insight summaries will be crucial to make the vast amounts of generated data more actionable for UX researchers.

Conclusion

UXAgent represents a significant step towards a future of “human-AI collaboration” in UX research. By enabling rapid, iterative usability testing through LLM-agent-based simulations, it empowers engineers, designers, and product managers to refine their study designs, reduce risks, and improve the quality of user studies. While challenges remain in ensuring data realism and addressing biases, UXAgent’s potential to accelerate the design process and provide diverse behavioral insights positions it as a valuable tool for shaping the future of web design and user experience.