Though Musk later clarified that he remains committed to the deal, he continued to hammer on the issue of fake accounts. He wrote, on Twitter, that his team would do their own analysis and expressed doubt about the accuracy of numbers Twitter has reported in its most recent financial filings.
In its first-quarter earnings report this year, Twitter acknowledged there are a number of “false or spam accounts” on its platform, alongside legitimate monetizable daily active usage or users (mDAU). The company reported, “We have performed an internal review of a sample of accounts and estimate that the average of false or spam accounts during the first quarter of 2022 represented fewer than 5% of our mDAU during the quarter.”
Twitter also admitted to overstating its user numbers by 1.4 million to 1.9 million over the past three years. “In March of 2019, we launched a feature that allowed people to link multiple separate accounts together in order to conveniently switch between accounts,” Twitter disclosed. “An error was made at that time, such that actions taken via the primary account resulted in all linked accounts being counted as mDAU.”
While Musk may be justifiably curious, experts in social media, disinformation and statistical analysis say that his suggested approach to further analysis is woefully deficient.
Here’s what the SpaceX and Tesla CEO said he would do to determine how many spam, fake and duplicate accounts exist on Twitter:
“To find out, my team will do a random sample of 100 followers of @twitter. I invite others to repeat the same process and see what they discover.” He clarified his methodology in subsequent tweets, adding: “Pick any account with a lot of followers,” and “Ignore first 1000 followers, then pick every 10th. I’m open to better ideas.”
Musk also said, without providing evidence, that he picked 100 as the sample size because that is the number Twitter uses to calculate the figures in its earnings reports.
“Any sensible random sampling process is fine. If many people independently get similar results for % of fake/spam/duplicate accounts, that will be telling. I picked 100 as the sample size number, because that is what Twitter uses to calculate <5% fake/spam/duplicate.”
Twitter declined to comment when asked if his description of its methodology was accurate.
Facebook co-founder Dustin Moskovitz weighed in on the issue via his own Twitter account, pointing out that Musk’s approach is not actually random, uses too small a sample, and leaves room for massive errors.
He wrote, “Also I feel like ‘doesn’t trust the Twitter team to help pull the sample’ is it’s own kind of red flag.”
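To give a rough sense of how much error a sample of 100 leaves room for, here is a minimal sketch using the textbook normal-approximation margin of error for a proportion at 95% confidence. The function name `margin_of_error` and the comparison sample sizes of 1,000 and 10,000 are illustrative choices, not anything Musk or Twitter described, and the approximation itself is known to be crude when the proportion is small and the sample is this limited.

```python
import math

def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation (Wald) interval for a proportion.

    p_hat: observed share of fake/spam accounts in the sample
    n:     sample size
    z:     z-score for the confidence level (1.96 is roughly 95%)
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Assumption: the sample happens to show 5% fake/spam, matching Twitter's reported ceiling.
for n in (100, 1_000, 10_000):
    moe = margin_of_error(0.05, n)
    print(f"n={n:>6}: 5% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions, a sample of 100 that turned up 5% fake accounts would carry an uncertainty of roughly four percentage points in either direction, which is nearly as large as the estimate itself. That calculation also assumes the sample is truly random, which is a separate problem.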
BotSentinel founder and CEO Christopher Bouzy said in an interview with CNBC that analysis by his company indicates that 10% to 15% of accounts on Twitter are likely “inauthentic,” including fakes, spammers, scammers, nefarious bots, duplicates, and “single-purpose hate accounts” which typically target and harass individuals, along with others who spread disinformation on purpose.
BotSentinel, which is primarily supported through crowdfunding, independently analyzes and identifies inauthentic activity on Twitter using a mix of machine learning software and teams of human reviewers. The company monitors more than 2.5 million Twitter accounts today, primarily English-language users.
“I think Twitter is not realistically classifying ‘false and spam’ accounts,” Bouzy said.
He also warned that the share of inauthentic accounts can appear higher or lower in different corners of Twitter, depending on the topics being discussed. For example, more inauthentic accounts tweet about politics, cryptocurrency, climate change, and Covid than about non-controversial topics like kittens and origami, BotSentinel has found.
Carl T. Bergstrom, a University of Washington professor who co-wrote a book to help people understand data and avoid being taken in by false claims online, told CNBC that sampling one hundred followers of any single Twitter account should not serve as “due diligence” for making a $44 billion acquisition.
He said that a sample size of 100 is orders of magnitude smaller than the norm for social media researchers studying this sort of thing. The biggest issue Musk would face with this approach is known as selection bias.
Bergstrom wrote in a message to CNBC, “There’s no reason to believe that followers of the official Twitter account are a representative sample of accounts on the platform. Perhaps bots are less likely to follow this account to avoid detection. Perhaps they’re more likely to follow to seem legitimate. Who knows? But I just can’t fathom that Musk is doing anything other than trolling us with this silly sampling scheme.”
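The selection bias Bergstrom describes can be illustrated with a small simulation. Every number below is invented for illustration: a hypothetical platform where 10% of accounts are bots, and where bots are assumed to follow one particular account less often than humans do (the opposite assumption would skew the result the other way). None of these figures come from Twitter, BotSentinel or Musk.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical platform: all values are made up for illustration only.
N_ACCOUNTS = 200_000
TRUE_BOT_RATE = 0.10       # assumed platform-wide share of bots
FOLLOW_PROB_HUMAN = 0.20   # assumed chance a human follows the chosen account
FOLLOW_PROB_BOT = 0.05     # assumed chance a bot follows it (lower, per the scenario)

# True means the account is a bot.
is_bot = [random.random() < TRUE_BOT_RATE for _ in range(N_ACCOUNTS)]

# Follower list of the single chosen account.
followers = [
    bot for bot in is_bot
    if random.random() < (FOLLOW_PROB_BOT if bot else FOLLOW_PROB_HUMAN)
]

def bot_share(flags):
    """Fraction of accounts in `flags` that are bots."""
    return sum(flags) / len(flags)

# Sample 1: 100 accounts drawn at random from the whole platform.
platform_sample = random.sample(is_bot, 100)

# Sample 2: skip the first 1,000 followers, then take every 10th, up to 100.
follower_sample = followers[1000::10][:100]

print(f"true platform bot share:      {bot_share(is_bot):.3f}")
print(f"bot share among followers:    {bot_share(followers):.3f}")
print(f"random platform sample (100): {bot_share(platform_sample):.3f}")
print(f"follower sample (100):        {bot_share(follower_sample):.3f}")
```

In this made-up scenario, the follower list contains a much smaller share of bots than the platform as a whole, so no amount of careful counting within that list recovers the platform-wide rate. A larger follower sample would only produce a more precise estimate of the wrong quantity.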