Identifying Sock-Puppets on Wikipedia: A Semantic Clustering Approach

Authors: Nestor Prieto Chavana, Chris Inskip, Carl Miller and David Weir
Published: 29 April 2024

This paper investigates potential platform manipulation on Wikipedia using a technique called semantic clustering, with the aim of identifying covert and organised manipulation on a larger scale. It explores whether the method worked, and whether it might be useful if deployed more expansively. Developing the method was made possible only by the release of transformer-based pre-trained language models, which allow researchers to cluster editors on the basis of the meaning (the ‘semantic content’) of the edits that they make. The ambition was to see whether this could help identify clusters of editors on Wikipedia who had made suspicious edits.
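
To make the idea of clustering editors by the meaning of their edits concrete, the following is a minimal sketch and not the method used in the report. It assumes each edit has already been turned into an embedding vector by some pre-trained language model (the three-dimensional vectors below are toy stand-ins, not real model output), averages those vectors into one profile per editor, and groups editors whose profiles exceed a cosine-similarity threshold. The function names, the threshold of 0.9 and the single-linkage grouping are all illustrative assumptions.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster_editors(edit_embeddings, threshold=0.9):
    """Group editors whose mean edit embeddings are semantically close.

    edit_embeddings maps an editor name to a list of per-edit vectors.
    Editors are linked when their profiles' cosine similarity is at least
    `threshold` (an assumed value), and linked editors are merged with a
    small union-find structure, i.e. single-linkage grouping.
    """
    profiles = {e: mean_vector(v) for e, v in edit_embeddings.items()}
    editors = sorted(profiles)
    parent = {e: e for e in editors}

    def find(e):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for i, a in enumerate(editors):
        for b in editors[i + 1:]:
            if cosine(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for e in editors:
        groups[find(e)].append(e)
    return sorted(sorted(g) for g in groups.values())


# Toy stand-in embeddings (hypothetical editors, not real data).
toy = {
    "editor_a": [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1]],
    "editor_b": [[0.95, 0.05, 0.05]],
    "editor_c": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.1]],
}
clusters = cluster_editors(toy)
print(clusters)
```

In this toy input, editor_a and editor_b make edits with near-identical embeddings and so fall into one cluster, while editor_c stands alone. A cluster of this kind would only be a prompt for further human review, not evidence of sock-puppetry in itself.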
