Unmoderated Usability Studies Evolved: Can GPT Ask Useful Follow-up Questions?

TLDR

This case study implements GPT-4-generated follow-up questions in unmoderated usability tests and assesses their impact on the insights gathered. It also discusses the challenges encountered with GPT-4 follow-ups and proposes enhancements to improve future models for generating effective follow-up questions.

Summary

Follow-up questions during usability testing provide crucial insights into the user’s experience with a product or service. In unmoderated usability tests conducted online, artificial intelligence (AI) is emerging as a valuable tool. Large language model chatbots have the potential to ask intelligent follow-up questions automatically and in real time, and conversing with a chatbot during usability tests may uncover deeper qualitative insights. Our case study examines the implementation of GPT-4-generated follow-up questions to assess their impact on usability testing insights. Sixty participants took part in an experiment comparing the feedback they gave under four conditions: no follow-up questions, static questions prepared by researchers, real-time GPT-4 questions, and a blend of static and AI-generated questions. While GPT-4-generated questions effectively elaborated on details of existing findings, they revealed fewer new usability issues. We discuss challenges encountered with GPT-4 follow-ups and propose enhancements to improve future models for generating effective follow-up questions.