Nuanced Text Analysis at Scale: Toxicity Detection and Digital Salafism

15th February 2022

Jakob Guhl

In this Digital Dispatch, ISD researchers outline the rationale and advantages of applying toxicity analysis approaches to the study of Salafi online content, and describe how it was applied during ISD’s recent research project mapping the rapidly evolving online Salafi ecosystem.


The concept of ‘toxic’ speech describes harmful, exclusionary and/or threatening language, which includes a spectrum of manifestations such as profanity, ridicule, verbal aggression and targeted hate speech. A growing number of researchers and technology companies rely on machine learning to identify toxic content, a notable example being Jigsaw’s Perspective API. Perspective API assigns a piece of text a probability score (between 0 and 1, often expressed as a percentage) based on how likely it is to make users leave a conversation online. Toxicity analysis approaches such as this can be seen as probabilistic attempts to estimate how toxic the language within large datasets is.

While these approaches have attracted significant interest from researchers and technology companies, concerns have been raised that users can easily evade toxicity detection by rephrasing posts while retaining their original meaning. Others have alleged that automated approaches may reproduce biases against minority groups if such biases exist in the data the model is trained on. Among other things, this highlights the importance of transparent, explainable approaches that grant subject matter experts control over the input into the model and the data it is trained on.

High toxicity scores can also be false positives. For example, an analysis may capture counter-narratives or efforts to refute toxic positions, which use the same vocabulary as the content they oppose. In such cases, additional context is required to establish how toxic high-scoring posts actually are. The composition of the dataset also matters: whether it consists of a random sample of online conversation or of posts from a specific forum affects the likelihood that a given term is being used in a toxic manner. For example, if a dataset consists of posts from a neo-Nazi forum, it is more likely that posts using the generic term “Jew” are doing so in a derogatory way. This means that toxicity analysis is more likely to be accurate when analysing datasets that are predisposed towards toxic discourse, such as those from hateful or extremist online communities.
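The effect of dataset composition on accuracy can be illustrated with a short calculation. The sketch below is not taken from ISD's research; the base rates, sensitivity and false positive rate are invented purely for illustration. It shows that the same detector produces a far higher share of correct flags when the underlying share of toxic posts is high.

```python
def precision(base_rate, sensitivity, false_positive_rate):
    """Share of flagged posts that are genuinely toxic (Bayes' rule)."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical figures, chosen only to illustrate the argument:
# the detector flags 90% of toxic posts and 10% of non-toxic posts.
random_sample = precision(base_rate=0.02, sensitivity=0.9, false_positive_rate=0.1)
extremist_forum = precision(base_rate=0.60, sensitivity=0.9, false_positive_rate=0.1)

print(f"Random sample of posts: {random_sample:.0%} of flags are correct")
print(f"Extremist forum:        {extremist_forum:.0%} of flags are correct")
```

Under these assumed figures, most flags in the random sample are false positives, while most flags in the extremist forum are correct, even though the detector itself is unchanged.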

Toxicity Analysis and Salafism

Salafism is a reformist branch of Sunni Islam that aspires to return to the faith practised by the prophet Mohammed and his earliest followers. As ISD’s recent research series on Generation Z and Salafism argued, this worldview, with its black-and-white value system, strong group identity and profound contrast with establishment Islamic orthodoxy, has proven highly attractive to young people as an emerging youth counterculture.

Policy responses in Western contexts have been based on the assumption that Salafism presents two main risks: that Salafism’s opposition to secularism and liberalism presents a challenge to integration and social cohesion, and that some Salafists present a security threat. Similarly, research has often focused narrowly on the violent tip of Salafism, such as proscribed terrorist groups like al-Qaeda and ISIS.

However, Salafism is not a homogenous movement. The vast majority of Salafi adherents across the globe are not affiliated to a political movement, nor do they advocate for violence. One of the key findings of ISD’s research into Salafi discourse online was that much online Salafi content is anodyne and geared towards identity formation and practical religious guidance. It is therefore crucial to distinguish between expressions of Salafism that advocate supremacist, sectarian and violent positions, and those that express conservative guidance on religious and social issues that should be permissible within liberal societies.

Over the past decade, we have witnessed a rise in anti-Muslim attitudes and hate crimes across Western countries, and the internment and mass killings of Muslims by authoritarian regimes in China, Myanmar and Syria, among others. Simultaneously, policies that single out Muslims for immigration restrictions have been enacted in countries as far apart as the US and India. 

Against this backdrop, understanding and reiterating the distinction between conservative expressions of religion and extremist interpretations of Islam has become even more important. Toxicity analysis allows researchers to distinguish between more and less toxic content, and to differentiate between types of toxic content.

Toxicity Detection

Over the course of a year-long research project looking at the intersections of Gen-Z identities and Salafism in the digital space, ISD researchers applied a toxicity-based approach to analyse a dataset of 3.5 million pieces of English, German and Arabic Salafi content, in an attempt to distinguish between more or less toxic Salafi content online.

In collaboration with Textgain, a start-up specialising in language technology and artificial intelligence (AI), ISD researchers supported the development of a ranking tool that is able to recognise expressions frequently used by Salafis in English, German, Arabic and Latin Arabic at scale. The approach is outlined in greater detail in ISD’s snapshot of the Digital Salafi Ecosystem and its accompanying Methodological Appendix.

Simply put, subject matter experts assign a value between 0 (neutral) and 4 (highly toxic) to words commonly used in Salafi discourse. Based on these scores, an algorithm scans the overall material and detects which, and how many, toxic terms are contained in a single post. This results in an overall toxicity score between 0 (not toxic) and 100 (highly toxic) for each post. Instead of treating toxicity as a binary concept, this approach takes into account that there is a spectrum between ‘not at all toxic’ and ‘highly toxic’ content. To increase the accuracy of the toxicity scores, every scored term was manually reviewed by domain and language experts. The English and German lists were each reviewed by two experts, and the Arabic list by three, with each expert reviewing the others’ work.
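The general idea of lexicon-based scoring can be sketched in a few lines of Python. This is a minimal illustration, not the tool developed with Textgain: the lexicon entries are placeholders, and the rule for combining term values into a 0 to 100 score (summing the values and capping the result) is an assumption, since the exact formula is set out in the Methodological Appendix rather than here.

```python
import re

# Placeholder lexicon: term -> expert-assigned value from 0 (neutral) to 4 (highly toxic).
# Real entries would come from the expert-reviewed word lists.
LEXICON = {"term_a": 1, "term_b": 3, "term_c": 4}

# Assumed saturation point: summed values of 8 or more (for example,
# two highly toxic terms) map to the maximum post score of 100.
SATURATION = 8


def matched_terms(text, lexicon=LEXICON):
    """Return every lexicon term found in the post, with repeats."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token in lexicon]


def toxicity_score(text, lexicon=LEXICON):
    """Combine the values of matched terms into a post score from 0 to 100."""
    total = sum(lexicon[term] for term in matched_terms(text, lexicon))
    return min(100.0, 100.0 * total / SATURATION)


print(toxicity_score("a post with no listed terms"))   # no matches
print(toxicity_score("a post containing term_c"))      # one highly toxic term
print(toxicity_score("term_c term_c term_b"))          # several toxic terms
```

Because the score depends only on the word list and a fixed rule, every result can be traced back to the specific terms that produced it, which is what makes the approach explainable.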

Researchers also determined what type of toxicity these words could indicate: whether, for example, use of the word could indicate a post calling for violence, a post using discriminatory or dehumanising language, a post containing a derogatory slur, or a post doing two or all three of these.
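Type labels of this kind can be attached to terms in the same way as toxicity values. The following sketch assumes a hypothetical mapping from placeholder terms to the three types named above; the label names and the matching rule are illustrative choices, not ISD's or Textgain's implementation.

```python
import re

# Placeholder mapping: term -> set of toxicity types it may indicate.
TERM_TYPES = {
    "term_b": {"discriminatory"},
    "term_c": {"violence", "slur"},
    "term_d": {"slur"},
}


def toxicity_types(text, term_types=TERM_TYPES):
    """Return the set of toxicity types indicated by the terms in a post."""
    tokens = set(re.findall(r"\w+", text.lower()))
    types = set()
    for term, labels in term_types.items():
        if term in tokens:
            types |= labels  # a post can carry several types at once
    return types


print(sorted(toxicity_types("a post with term_b and term_d")))
```

Returning a set rather than a single label reflects the point that one post may fall into two or all three categories.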

This probabilistic approach allowed researchers to assess how toxic content was and in which way it was toxic, enabling distinctions to be made between Salafi actors online voicing conservative expressions of religion, and those voicing supremacist, sectarian and violent positions. It also gave subject matter experts control over input into the model and the data it was trained on.

In addition to assigning toxicity scores to single pieces of content, such an approach makes it possible to calculate the average level of toxicity per platform, per language and over time. This allows researchers to answer important questions, such as which platforms are most (and least) affected by violent and discriminatory Salafi content; what events lead to spikes in dehumanising and supremacist rhetoric; and what differences can be identified between different linguistic Salafi communities.
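Once each post carries a score, this kind of aggregation is a simple grouping exercise. The sketch below uses invented records and placeholder platform names; the field names (`platform`, `language`, `month`, `score`) are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Invented example records; real data would hold one entry per scored post.
posts = [
    {"platform": "platform_x", "language": "en", "month": "2021-01", "score": 10.0},
    {"platform": "platform_x", "language": "de", "month": "2021-01", "score": 30.0},
    {"platform": "platform_y", "language": "en", "month": "2021-02", "score": 50.0},
    {"platform": "platform_y", "language": "en", "month": "2021-02", "score": 70.0},
]


def average_by(posts, *keys):
    """Average toxicity score for each combination of the given fields."""
    groups = defaultdict(list)
    for post in posts:
        groups[tuple(post[key] for key in keys)].append(post["score"])
    return {group: mean(scores) for group, scores in groups.items()}


print(average_by(posts, "platform"))           # which platforms are most affected
print(average_by(posts, "language", "month"))  # trends per language over time
```

Grouping by month (or a finer time unit) is what would allow spikes around particular events to be spotted.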

There are several further advantages to applying toxicity analysis to Salafi communication online. Foremost, it allows for the fast analysis of large datasets, and every step and decision taken during the analysis is explainable and replicable. By building on existing lexicons used in previous research, it can also enable increasingly fine-tuned results and granular insights. In a recent paper, our partners at Textgain describe the use of their AI system for the detection of antisemitism, evidencing how similar approaches could be used to analyse many different types of toxic content from extremist discourse across the ideological spectrum, as well as racist, antisemitic, anti-Muslim, misogynist or anti-LGBT content.

Based on these considerations, we propose that toxicity is a helpful concept to consider when thinking about potentially harmful expressions within online Salafi communities. In going beyond a narrow focus on violent radicalisation, it allows researchers to make crucial distinctions between Salafi content that advocates for different types of toxic positions, and content that merely expresses conservative guidance on religious and social issues. It is therefore a suitable approach for future research that aims to track problematic manifestations of Salafism online without reducing the entire movement to its violent fringe.


Jakob Guhl is a Research Manager at ISD.