Explainable Moderation in Multiplayer Games: Player Responses to Explanations of an Automated Temporary Ban
Timothy Holland, Lucy A. Sparrow, Wally Smith, Sarah Webber
The opacity of moderation systems can leave players sanctioned for toxic behaviour disaffected and without guidance on how to change. We examined whether explainability, the means by which an automated system explains its decisions, can improve player responses to automated moderation decisions in the context of multiplayer games. Through a mixed-methods experiment, we evaluated players’ perceptions of six explanations of an automated temporary ban decision. Despite finding only minor benefits of explainable AI methods over the best explanation currently used in practice, we found that justification, i.e. providing evidence within an explanation, is fundamental to improving players’ perceived fairness of moderation and their emotional responses to it. We contextualise these results through a reflexive thematic analysis, identifying four themes that reflect players’ competing understandings of both explainability and moderation. We conclude by proposing four design implications for researchers and developers to consider when designing explainability for future community management systems.