X’s ‘open source’ algorithm is not transparent, researchers say

When X’s engineering team published the code that powers the platform’s recommendation algorithm last month, Elon Musk called the move a victory for transparency. “We know the algorithm is dumb and needs a lot of improvement, but at least you can see us striving to make it better in real time and transparently,” Musk wrote. “No other social media companies are doing this.”

While it’s true that X is the only major social network to open-source parts of its recommendation algorithm, researchers say the release doesn’t provide the kind of transparency that would help anyone trying to understand how X actually works in 2026.

The code, like an earlier version published in 2023, is a “modified” version of the X algorithm, according to John Thickstun, an assistant professor of computer science at Cornell University. “My concern with this release is that it gives the impression that it’s exposing the code, and the feeling that someone could use this release to do some kind of audit or oversight,” Thickstun told Engadget. “And the truth is that that is absolutely impossible.”

Predictably, as soon as the code was released, users on X began posting long threads about what it meant for creators hoping to increase their visibility on the platform. One post that has been viewed more than 350,000 times advises users that X will “reward people who chat” and “raise X’s vibration.” Another, with over 20,000 views, says posting video is the answer. Yet another says users should stick to their “niche” because “changing the topic hurts your reach.” But Thickstun cautioned against reading too much into these supposed recipes for going viral. “They will not be able to reach those conclusions from what has been released,” he said.

While a few details do shed light on how X recommends posts (it filters out content older than one day, for example), Thickstun says drawing firm conclusions from the rest is “impossible” for content creators.
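To give a sense of the kind of detail that is visible in the release, a recency filter like the one described above might look something like the minimal Python sketch below; the function and field names are hypothetical, not taken from X’s code.

    from datetime import datetime, timedelta, timezone

    # The release indicates content older than one day is filtered out.
    MAX_AGE = timedelta(days=1)

    def filter_recent(posts, now=None):
        """Keep only posts created within the last day (illustrative sketch)."""
        now = now or datetime.now(timezone.utc)
        return [p for p in posts if now - p["created_at"] <= MAX_AGE]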

Structurally, one of the biggest differences between the current algorithm and the version released in 2023 is that the new system relies on a large Grok-like language model for ranking. “In the previous version, this was hard-coded: you took how many times something was liked, how many times something was shared, how many times something was replied to … and based on that you calculate a score, and you rank based on the score,” explains Ruggero Lazzaroni, a PhD researcher at the University of Graz. “Now the score is not taken from the actual number of likes and shares, but from how much Grok thinks you will like and share the post.”
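To make Lazzaroni’s distinction concrete, the Python sketch below contrasts the two approaches. The weights, field names and the predict_engagement interface are illustrative assumptions, not values or code from X’s release.

    # Old approach: a hard-coded weighted sum of observed engagement counts.
    WEIGHTS = {"like": 1.0, "share": 2.0, "reply": 3.0}  # assumed values

    def score_hardcoded(post):
        """Score a post from the engagement it has actually received."""
        return (WEIGHTS["like"] * post["likes"]
                + WEIGHTS["share"] * post["shares"]
                + WEIGHTS["reply"] * post["replies"])

    def score_model_based(user, post, predict_engagement):
        """Score a post from the engagement a language model predicts
        this particular user would give it."""
        probs = predict_engagement(user, post)  # e.g. {"like": 0.12, "share": 0.03}
        return WEIGHTS["like"] * probs["like"] + WEIGHTS["share"] * probs["share"]

    def rank(posts, score_fn):
        """Return posts sorted by descending score."""
        return sorted(posts, key=score_fn, reverse=True)

In the old scheme the score depends only on counts that already exist; in the new one it depends on a model’s per-user predictions, which is part of why the published code alone, without the model behind it, reveals much less.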

That also makes the algorithm less transparent than before, Thickstun said. “So much of the decision-making … happens inside opaque neural networks that they train on their own data,” he says. “A lot of the decision-making power of these algorithms isn’t just hidden from public view; it isn’t visible to, or even fully understood by, the internal engineers working on these systems, because it has been handed off to these neural networks.”

The release also includes fewer details than were made public in 2023 about other aspects of the algorithm. At the time, the company shared information about how it weights various interactions to decide which posts should be ranked at the top: a reply, for example, was “worth” 27 retweets, and a reply that drew a response from the original author was worth 75. But X has now redacted the information about how it weights these factors, saying it was left out “for security reasons.”
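Read as arithmetic, that 2023-era weighting amounts to a simple weighted sum. In the Python sketch below, only the relative values of 27 and 75 come from the figures reported above; the retweet baseline of 1.0 and the field names are assumptions.

    # 2023-era weighting as described above; the current values are redacted.
    WEIGHTS_2023 = {
        "retweet": 1.0,
        "reply": 27.0,              # a reply was "worth" 27 retweets
        "reply_with_author": 75.0,  # a reply the author responded to: 75
    }

    def legacy_score(post):
        """Weighted sum of a post's observed interaction counts."""
        return sum(WEIGHTS_2023[k] * post.get(k, 0) for k in WEIGHTS_2023)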

The code also does not include any information about the data the algorithm was trained on, which researchers would need in order to study it properly. “One of the things I’d really like to see is what training data they are using for this model,” said Mohsen Foroughifar, an assistant professor of business technology at Carnegie Mellon University. “If the data used to train the model is inherently biased, then the model may end up being biased, regardless of what kind of factors you consider in the model.”

Being able to study X’s recommendation algorithm would be hugely valuable, says Lazzaroni, who works on an EU-funded project exploring alternative approaches to social media recommendation. Much of his work involves simulating real-world social networks to test different approaches. But he says the code released by X doesn’t contain enough information to reproduce its recommendation algorithm.

“We have the code to run the algorithm, but we don’t have the model you need to run the algorithm,” he says.

If researchers were able to study X’s algorithm, it could yield insights with impact well beyond social media. Many of the questions and concerns raised about how social media algorithms behave are likely to come up again in the context of AI chatbots. “Many of these challenges we see in social media and recommendation [systems] are emerging in these generative systems as well,” said Thickstun. “So you can extrapolate the kinds of challenges we’ve seen with social media to the kinds of challenges we’re going to see working with GenAI platforms.”

Lazzaroni, who spends much of his time modeling the most toxic behavior on social media, is even more direct. “AI companies, in order to increase profits, can build large language models that maximize user interaction rather than telling the truth or taking care of users’ mental health,” he says. “And it’s exactly the same problem: they make more profit, but users get a worse society, or worse mental health, out of it.”
