“Hey, fellow humans!”: What can a ChatGPT campaign targeting pro-Ukraine Americans tell us about the future of generative AI and disinformation?

5 December 2023

By: Elise Thomas


Fig 1: Tweet from Oct 24th.

Summary

On September 22nd, an X account belonging to Alexey Navalny tweeted about the Russian opposition figure’s treatment as he was shuffled in and out of punishment cells in a maximum-security prison east of Moscow. Over a week later, an odd reply appeared.

“I cannot fulfill this request as it goes against OpenAI’s use case policy by promoting hate speech or targeted harassment,” an account with the name ‘Nikki’ and a profile photo of a young woman informed Navalny. “If you have any other non-discriminatory requests, please feel free to ask.”

Fig 2: Screenshot of @navalny tweet and @planmolimo1982 response.

‘Nikki’ is one of a network of at least 64 accounts which appear to be using content generated by OpenAI’s ChatGPT app to engage in a targeted harassment campaign directed at Navalny and his associate Maria Pevchikh, and at their organisation The Anti-Corruption Foundation (ACF). The goal of the campaign appears to be to undermine support for Navalny and the ACF among pro-Ukraine American and Western audiences.

ISD has not found conclusive evidence to establish who is likely to be behind this campaign. However, it is notable that most posting takes place Monday to Friday, in line with business hours in Moscow and St Petersburg.

ChatGPT is an AI chatbot that uses large language models (LLMs) to create humanlike conversational dialogue. The rapid adoption of ChatGPT and other generative AI tools such as DALL-E and Midjourney has prompted a great deal of speculation about what the rise of LLMs for content generation might mean for the future of disinformation, hate speech and targeted harassment. This network offers a useful early example of how this is actually playing out in the wild.

The bottom line is, it’s bad news.

The ChatGPT content, overall, is very good. It has some quirks and some oddities – weird metaphors, unwieldy hashtags, a predilection for melodrama, a peculiar fixation on food. Seen as a whole, the corpus of content does feel oddly robotic. When you already know to look for signs of AI use, there are reasons to be suspicious.

The problem is that the vast majority of social media users won’t know to be suspicious. For most people scrolling casually through a platform like X, the content could easily pass as authentic. Even if they do suspect, it may be more or less impossible for users to be certain that an account is using AI-generated content unless – as in this case – the operators are sloppy enough to post a refusal from the AI.

This is the situation less than a year after the public launch of ChatGPT. It is, in other words, just the beginning; the technology will only become more sophisticated, and more convincing, from here.

Campaign overview

Accounts and posting behaviour

Aside from the use of ChatGPT, this is not a technically sophisticated campaign. That lack of sophistication is what has made it detectable. ISD first identified the potential existence of a ChatGPT-generated campaign via the tweet above referencing OpenAI’s use case policy on targeted harassment. ISD then identified a further 64 accounts on X which appear highly likely to be part of the same network, based on coordinated posting patterns and shared vocabulary, topics and hashtags.
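
As a rough illustration of the kind of signal involved, overlap in the hashtag vocabulary of different accounts can be quantified to flag likely members of a single network. The sketch below is a minimal, hypothetical example in Python (the account names, data and threshold are invented; this is not ISD’s actual methodology):

```python
from itertools import combinations

# Hypothetical input: the set of hashtags each account has used
# (account names and tags are invented for illustration)
account_hashtags = {
    "account_a": {"EndRussianAggression", "RussiaOutofUkraine", "TastetheTruth"},
    "account_b": {"EndRussianAggression", "RussiaOutofUkraine"},
    "account_c": {"MondayMotivation", "TravelDiaries"},
}

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the overlap relative to the union."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

THRESHOLD = 0.5  # arbitrary illustrative cut-off

# Flag pairs of accounts whose hashtag vocabularies overlap suspiciously
for (acct1, tags1), (acct2, tags2) in combinations(account_hashtags.items(), 2):
    score = jaccard(tags1, tags2)
    if score >= THRESHOLD:
        print(f"{acct1} <-> {acct2}: hashtag similarity {score:.2f}")
```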

The campaign is pro-Ukraine and anti-Russia, has used vocabulary commonly linked to the ‘NAFO’ movement, and seeks to paint Navalny and his associates as a form of controlled opposition being run by the Russian security and intelligence services. Interestingly, however, a small number of tweets from the accounts also pose as supporters of Navalny and Pevchikh. This strategy will be discussed in more detail below.

The campaign is conducted primarily in English, with accounts posing as Americans from Texas, California, Louisiana and other areas of the US. Some accounts also have related activity in Lithuanian over a very specific time period. One account has also been used simultaneously in the campaign against Navalny and to promote cryptocurrency.

The campaign’s tactics are largely limited to spamming the replies of Pevchikh, Navalny, the ACF and related individuals, including replying to tweets from days or weeks earlier, and posting on the accounts’ own timelines.

The accounts themselves appear to be bulk-created and may have been purchased from a commercial supplier. There has been no effort to build convincing personas for the accounts or to tailor them for the campaign.

There are (at least) two ‘generations’ of accounts. The first generation are older accounts, mostly created around 2010 or 2011 but with no activity before June or July 2023, when the campaign began. Initially it appears likely that the operators were either writing content manually or using a more basic form of automatic content generation such as spintax. This early activity was much more repetitive, with identical or extremely similar content spammed multiple times from multiple accounts – a tactic known as ‘copypasta’.

Fig 3: Screenshot of tweets on July 24th from two ‘first generation’ accounts.
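
For readers unfamiliar with the term, spintax is a simple templating format in which alternative phrasings are listed inside braces and separated by pipes; each ‘spin’ picks one option per group, which is why the output is repetitive and easy to match across accounts. A minimal sketch, using an invented template:

```python
import random
import re

SPIN_GROUP = re.compile(r"\{([^{}]*)\}")  # innermost {option1|option2|...} group

def spin(template: str) -> str:
    """Expand a spintax template by repeatedly resolving the innermost
    brace group with a randomly chosen option."""
    while SPIN_GROUP.search(template):
        template = SPIN_GROUP.sub(
            lambda m: random.choice(m.group(1).split("|")), template, count=1
        )
    return template

# Invented example template, not taken from the campaign itself
template = "{Ask yourselves|Think about} why {Navalny|the ACF} {stays silent|says nothing}."
for _ in range(3):
    print(spin(template))
```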

The second generation of accounts became active on September 26th. These accounts were newly created in September 2023 and again appear likely to have been bulk-created and potentially purchased. Based on comparisons of the content, this also appears to be when the operators shifted to using ChatGPT content for all of their accounts, both first and second gen.

On or around November 6th it appears that some of the accounts may have been removed by X. They were almost immediately replaced with fresh accounts, also created in September 2023.

Fig 4: Screenshots showing side by side comparison of the same account posting potentially spintax generated content (left) on September 20th and suspected ChatGPT-generated content (right) on September 29th.

The posting patterns of the network reflect a high level of coordination. It is unclear whether there is any use of automation for posting. Changes to API access on X have made it harder for many third-party automated posting tools to access the platform, so it is possible that posting is being done manually. The consistency of the number of accounts active each day in the weeks since September 26th (56 on Mondays to Thursdays, 45 on Fridays and between 7 and 9 on weekend days) is striking.

Fig 5: Unique authors (i.e. number of accounts actively tweeting each day) from July 1st to November 6th. Note the sporadic activity in July, more organised pattern in Aug-early September and very stable pattern after September 26th.
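
A chart like the one in Fig 5 can be produced from a collected dataset with a few lines of analysis code. The sketch below is a hypothetical example (the file name and column names are assumptions, not ISD’s actual pipeline):

```python
import pandas as pd

# Hypothetical export of the network's tweets: one row per tweet, with an
# account handle ('author') and a UTC timestamp ('created_at')
tweets = pd.read_csv("network_tweets.csv", parse_dates=["created_at"])

# Number of distinct accounts active on each calendar day (cf. Fig 5)
daily_authors = (
    tweets.groupby(tweets["created_at"].dt.date)["author"]
    .nunique()
    .rename("unique_authors")
)

# Averaging by weekday exposes the Monday-Thursday / Friday / weekend split
by_weekday = daily_authors.groupby(
    pd.to_datetime(daily_authors.index).day_name()
).mean().round(1)
print(by_weekday)
```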

Interestingly, there is a very consistent pattern of posting predominantly on weekdays, with a notable drop-off on Fridays and low activity on weekends.

Fig 6: Posts and replies by day. The drop-off and then resurgence in replies (purple) in late September reflects a likely transition period followed by the introduction of the second generation accounts and likely shift to ChatGPT.

If the activity were truly organic and not coordinated, the posting pattern would be expected to be much more varied and irregular. Activity would likely be higher on weekends rather than much lower. This pattern suggests coordinated, organised posting happening during business days.

Another interesting point is the average time of day at which the accounts are active. Tweets from the network begin to pick up around 5am GMT, rising to a peak at 7am GMT and then dropping off to a low point at 1pm, with a very small resurgence around 3-7pm GMT. Posting completely ceases from 9pm until 4am GMT.

Fig 7: Average tweets for the period September 26th – November 6th 2023, shown by day of the week and time of day (GMT/UTC+00:00).

5am GMT is 8am in Moscow and Saint Petersburg. 1pm GMT is 4pm in those cities, i.e. the hour leading up to 5pm. On weekends (but not weekdays) a small number of posts are made around 6-9pm Moscow and Saint Petersburg time. There are no posts during the period from 10pm until at least 7am Moscow time. To state the obvious, this appears to reflect the pattern of a workday (with a small amount of after-hours work).

The time in Kyiv is one hour earlier, which would put the bulk of the activity between 7am and 4pm. This does not line up with office hours quite as neatly as the Moscow and St Petersburg timings do, but it is nonetheless worth noting.
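
The timezone arithmetic above is easy to verify with Python’s standard library: Moscow has been on UTC+3 year-round since 2014, while Kyiv observes daylight saving. The sample timestamps below are illustrative, chosen to bracket the observed activity window:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Illustrative UTC timestamps bracketing the bulk of the observed activity
utc_times = [
    datetime(2023, 11, 1, 5, 0, tzinfo=timezone.utc),   # activity picks up
    datetime(2023, 11, 1, 13, 0, tzinfo=timezone.utc),  # activity drops off
]

for ts in utc_times:
    moscow = ts.astimezone(ZoneInfo("Europe/Moscow"))  # UTC+3, no DST
    kyiv = ts.astimezone(ZoneInfo("Europe/Kyiv"))      # "Europe/Kiev" on older tz databases
    print(f"{ts:%H:%M} UTC = {moscow:%H:%M} Moscow = {kyiv:%H:%M} Kyiv")
```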

ISD has not found evidence to conclusively attribute the campaign to any particular actor. It is noteworthy that there are prominent examples of state actors, including those affiliated with the Kremlin, impersonating American social media users to covertly promote their own interests.

While this campaign is superficially anti-Russian, it is worth considering whose interests are served by driving a wedge between Navalny and his supporters and liberal Western audiences. It would not be the first time that the Kremlin has sought to infiltrate its opponents and sow division among them.

If the campaign successfully fooled the NAFO movement into focusing its energies on harassing Navalny and his supporters, that would hypothetically kill two birds with one stone for the Kremlin, both misdirecting NAFO’s attention and ramping up the pressure on Navalny and the ACF.

Narratives and tactics

The overarching narrative theme of the campaign is that Alexey Navalny, his organisation and his associates are enemies of Ukraine and likely to be ‘controlled opposition’ coordinating with the Russian government. Navalny and his associates, particularly Maria Pevchikh, are framed as supporters of Russia’s invasion of Ukraine and complicit in its brutality against Ukrainians.

For example, in late October and early November the network was largely focused on questioning how Navalny’s account was continuing to tweet while Navalny himself is in prison and three of his lawyers have been arrested. The line of questioning appears intended to imply that Navalny has been co-opted by, or is working with, the Russian government.

Fig 8: Tweets questioning how Navalny is continuing to tweet and “who’s really pulling the strings.”

There also appears to be a particular fixation on Pevchikh, including a concentrated effort to imply sinister connections between Pevchikh and the well-known Russian nationalist Aleksandr Dugin. Dugin was Pevchikh’s thesis supervisor at Moscow State University; Pevchikh herself has spoken publicly about this. The network appears to be seeking to use this fact to imply some deeper, ongoing covert relationship between Pevchikh and Dugin.

There is an effort to tie the campaign narratives into the news cycle. In the wake of the October 7th Hamas attack on Israeli civilians, for example, the accounts began tweeting critically about how the ACF was ‘prioritising Russian superiority’ and ignoring the actions of Hezbollah.

Fig 9: Tweets accusing the ACF of ignoring Hezbollah’s actions in favour of ‘Russian superiority’.

A very small proportion of the tweets are supportive of Navalny and the ACF. In some cases, the tactical logic behind this is fairly clear. One tweet, for example, reads “I support @pevchikh, a genuine victim in this conflict. These #NAFO are savage creatures. It’s prestigious to study under Dugin.”

While posing as supportive, this tweet is clearly aimed at bolstering the narrative that Pevchikh and Dugin are linked. Other examples are more difficult to understand, however, as they appear to be fairly straightforward expressions of support, usually bookended by other tweets from the same account taking the exact opposite stance.

Another tactic which the campaign appears to be employing is to attempt to build affinity with particular audiences by posing as part of those communities. In the earlier phase of the campaign, there appeared to be an effort to align the accounts with the ‘NAFO’ movement. ‘NAFO’ supporters have repeatedly clashed with ACF figures online, while there is significant suspicion and even hostility towards Navalny’s movement in Ukraine more broadly. The tweets use a number of terms commonly used by the NAFO movement, for example frequently referring to “Navatniks” (a portmanteau of Navalny and vatnik), a term used predominantly by NAFO supporters beginning in mid-July 2023.

Tweets during July and August frequently referred to NAFO in a supportive manner. On July 20th, for example, network accounts tweeted 240 times with supportive messages for NAFO.

Fig 10: Tweets from accounts in the network on July 20th. This was during the period prior to the likely introduction of ChatGPT content.

After the transition on September 26th, with the introduction of ChatGPT and the second generation of accounts, the focus appears to have shifted away from NAFO audiences specifically to broader American and Western English-speaking audiences. The new ChatGPT accounts started posing not as NAFO supporters, but as ordinary Americans.

Fig 11: Tweets from ChatGPT accounts claiming to be Americans.

This appears to represent a widening in strategy by the campaign. Instead of seeking to drive a wedge between NAFO supporters and the ACF, the campaign now appears to be aiming to divide the ACF from Americans specifically and perhaps Western Anglophone audiences more broadly.

The efforts to mobilise NAFO specifically made a brief return in late November, however. In an interesting tactical move, on November 18th the network tweeted at least 800 times from 53 accounts accusing Pevchikh and the ACF of silencing Ukraine supporters and funding troll farms. Many of the tweets called for “NAFO Article 5” to be invoked, which is essentially a way of trying to call up authentic NAFO followers to attack Pevchikh and the ACF. There does appear to be a clear effort to manipulate NAFO and weaponise the movement against the Russian liberal opposition.

Fig 12: Screenshots of ChatGPT accounts accusing Pevchikh and the ACF of funding trolls.

ChatGPT content

The ChatGPT content as a whole is alarmingly good. It seems likely that for a casual observer, most of it could pass as authentic comments written by real people, especially if viewed in isolation.

When tweets from the network are viewed together, for example when they spam the replies to a particular tweet, there is a perceptible ‘uncanny valley’ effect which might make some observers suspicious. It would likely be difficult for most social media users to be certain whether these replies were authentic or not, however.

Fig 13: A ChatGPT reply viewed in isolation (left) vs the same reply alongside multiple other ChatGPT replies, interspersed with one non-network reply redacted in red (right).

Fig 14: Screenshot of multiple ChatGPT bot accounts replying to a tweet from @pevchikh.

Although its output is grammatically correct and coherent, ChatGPT does at times use phrases and language which would likely strike a native English speaker as very odd. This includes strange metaphors, florid language, unusual hashtags and overly contrived efforts to shoehorn local stereotypes into sentences.

The ChatGPT used by this network, for no discernible reason, appears to have a fixation on food and cooking. Many of the tweets use metaphors relating to food, recipes and cooking which are grammatically correct but which would strike most native English speakers as out of place.

Fig 15: Tweets using food and cooking metaphors.

Some of the tweets also involve notably odd hashtags, like the #TastetheTruth hashtag in the above tweet. Some of the hashtags used by the network are unique and appear only in a single tweet, while others like #EndRussianAggression and #RussiaOutofUkraine are used repeatedly by multiple accounts in the campaign.

It does not appear that the campaign operators are employing hashtags in a strategic way. The hashtags used by the campaign accounts are largely exclusive to the campaign, and not widely used by authentic users. There is no clear strategy of piggybacking on existing popular hashtags in order to get more engagement and impressions for the campaign’s tweets.
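
One way to make ‘largely exclusive to the campaign’ concrete is to measure what share of each hashtag’s total uses comes from network accounts. A toy sketch with invented data:

```python
from collections import Counter

# Hypothetical (account, hashtag) records extracted from collected tweets
uses = [
    ("network_acct_1", "TastetheTruth"),
    ("network_acct_2", "EndRussianAggression"),
    ("network_acct_1", "EndRussianAggression"),
    ("organic_user_9", "EndRussianAggression"),
]
network_accounts = {"network_acct_1", "network_acct_2"}

total = Counter(tag for _, tag in uses)
from_network = Counter(tag for acct, tag in uses if acct in network_accounts)

# A hashtag used almost exclusively by network accounts is not piggybacking
# on organic conversation
for tag, n in total.items():
    share = from_network[tag] / n
    print(f"#{tag}: {share:.0%} of {n} uses from network accounts")
```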

The language used in some of the tweets reads as very melodramatic and florid to a native English speaker.

Fig 16: Tweets using melodramatic language.

An interesting facet of the content is how the AI draws on stereotypes about specific US states. It seems as though the AI has been instructed to write from the perspective of people from particular parts of the US, resulting in awkward efforts to shoehorn in references to some supposed feature or characteristic of that state.

In some cases these are references to the weather.

Fig 17: Tweets referring to state weather.

In others, the tweets draw on state stereotypes, for example that people from New York are tech- and finance-oriented while LA is creative and has bad traffic.

Fig 18: Tweets drawing on state stereotypes.

Sometimes these state stereotypes combine with the fixation on food, for example state-based gumbo metaphors.

Fig 19: Tweets referring to different kinds of gumbo.

A small but fascinating note is that the AI does not always write the ACF’s name in the same way. Sometimes the tweets refer to the ‘Anti Corruption Foundation’, sometimes to the ‘Anti-Corruption Foundation’ (with hyphen), sometimes to the ‘anti corruption foundation’ (no capitalisation).

Although this is likely an aspect of the AI rather than a strategy by the campaign operators, the possible effect of this inconsistency is to make the content seem more authentic in the same way that real social media users would be inconsistent with punctuation and capitalisation.

Fig 20: Tweets using different versions of the ACF’s name.
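
For anyone monitoring for this content, all three observed spellings can be matched with a single case-insensitive pattern. A minimal sketch, with invented sample strings:

```python
import re

# Matches 'Anti-Corruption Foundation', 'Anti Corruption Foundation' and
# 'anti corruption foundation' (hyphen or whitespace, any capitalisation)
ACF_PATTERN = re.compile(r"anti[\s-]+corruption\s+foundation", re.IGNORECASE)

samples = [  # invented strings for illustration
    "Why is the Anti-Corruption Foundation silent on this?",
    "the anti corruption foundation keeps tweeting...",
    "Anti Corruption Foundation statements",
]
for text in samples:
    print(bool(ACF_PATTERN.search(text)), "|", text)
```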

Future implications for generative AI in influence campaigns

This campaign presents an early example of how generative AI can be used to fuel targeted harassment and influence campaigns. It is bad news.

Despite the quirks and vagaries described above, the content overall is strikingly authentic-looking. In particular, it is impressively and somewhat surprisingly proficient at presenting a message through inference and implication – the “just asking questions” strategy so commonly used by conspiracy theorists, extremists and disinformation actors alike. This is a more subtle approach than more direct methods of spreading distrust, and one might have expected the AI to struggle with it, but this does not appear to be the case.

While ISD has not come to any firm attribution as to who might be behind the campaign, it seems plausible and perhaps even likely that they are not native English speakers. Native speakers, or even proficient non-native speakers, would presumably have noticed and adjusted some of the strange language choices made by the AI. If this is the case, it reflects the way in which generative AI makes cross-cultural and cross-linguistic influence campaigns much easier. Over time these models are only likely to become more sophisticated. LLMs trained on highly relevant datasets, for example corpora of real social media posts, would likely produce more appropriate content with fewer peccadilloes (such as this campaign’s odd food fixation).

For researchers, this sort of AI-generated campaign is likely to present real methodological challenges in the future. Shared content, like copy-pasted posts or phrases, has been a crucial way in which researchers have historically linked accounts to form a picture of coordinated networks. With AI-generated campaigns, that will likely no longer be possible. Significant effort will need to go into developing new methodologies to replace those that are likely to become obsolete.
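
To illustrate what is being lost: the traditional shared-content check amounts to normalising each post, hashing it, and looking for fingerprints that recur across accounts, as in the hypothetical sketch below (accounts and posts invented). ChatGPT output, which rarely repeats verbatim, defeats exactly this kind of matching.

```python
import hashlib
from collections import defaultdict

def fingerprint(text: str) -> str:
    """Normalise whitespace and case, then hash, so identical
    'copypasta' posts collide on the same fingerprint."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()

# Hypothetical (account, post) pairs from a collected dataset
posts = [
    ("acct_a", "Navalny is controlled opposition!"),
    ("acct_b", "navalny is   CONTROLLED opposition!"),  # same text, cosmetic differences
    ("acct_c", "An entirely unrelated post."),
]

groups = defaultdict(set)
for account, text in posts:
    groups[fingerprint(text)].add(account)

# Any fingerprint shared by several accounts is a coordination signal
for accounts in groups.values():
    if len(accounts) > 1:
        print("Shared content across:", sorted(accounts))
```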

This particular campaign was identifiable because it is crude and because the operators made obvious and easily avoidable mistakes. Even moderately more sophisticated campaigns could fly (and very possibly already are flying) entirely under the radar. It also seems very likely that there will be many instances in which both researchers and ordinary social media users strongly believe that they are seeing a coordinated AI-generated network – but will be unable to prove it without technical signals and data held only by social media companies.

This will likely leave not only general users, but also journalists, researchers and regulators even more heavily reliant on platforms to be transparent about the steps they are taking to identify influence campaigns. This should include transparency both about how they are attempting to identify and moderate AI-generated content, and about what steps they are taking to audit the accuracy of their own detection systems and guard against false positives.

The broader implications of this for the state of public discourse and the online information ecosystem may be profound, particularly in the context of social media companies backpedalling on their disinformation policies ahead of 2024. It is almost inevitable that there will be large-scale disinformation and influence campaigns using generative AI, but it is also likely that false accusations will be used to discredit and dismiss real people and movements expressing genuine opinions. In the long run, the effect may be to deepen polarisation and increase distrust and hostility on social media.