## Colophon
date::
title:: Understanding Social Media Recommendation Algorithms | Knight First Amendment Institute
type:: [[literature-note]]
tags::
url:: https://knightcolumbia.org/content/understanding-social-media-recommendation-algorithms
status:: [[bean]]

## Notes

> The research community is not close to being able to fully observe and explain the underlying feedback loops, both because the methods remain immature and because of lack of adequate access to platforms. — [view in context](https://hyp.is/ZSJzFMD6Ee27Wo-lRBfH1A/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: more research

> Understanding Social Media Recommendation Algorithms — [view in context](https://hyp.is/cMdDtsD6Ee2MfWfyRB_I1Q/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: [[Understanding Social Media Recommendation Algorithms Knight First Amendment Institute - 20230310]]

> Why focus on recommendation algorithms? Compared to search, recommendation drives a bigger (and increasing) fraction of engagement. More importantly, the platform has almost complete control over what to recommend a user, whereas search results are relatively tightly constrained by the search term. — [view in context](https://hyp.is/yS-1xMD6Ee2nBfOAWzWmkw/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: recommender vs search

> Over the past two decades, the progression has been from the subscription model to the network model to the algorithmic model. We appear to be in the middle of the latter shift (from network to algorithm), notably with Instagram and Facebook. Other platforms are facing similar pressure as well, because of the success of TikTok. Any such shift has major impacts on the business, on the type of content that’s amplified, and on the user experience. For example, Instagram’s changes led to a user outcry that forced it to roll back some changes. — [view in context](https://hyp.is/GPNVnsD_Ee28pbdkkfrZtQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Subscription -> Network -> Algorithmic

> Structural virality is the answer to the question: “How far from the poster did the post travel through the network?” It’s a simple question that reveals a lot, illustrated by the stylized trees (in computer science, “trees” are drawn upside down). The cascade pattern of a tweet like @JameelJaffer’s would look like the one on the left, retweeted by many people who aren’t following the original account, whereas @JoeBiden’s tweet would look like the one on the right. The structural virality of a post is the number of degrees of separation, on average, between users in the corresponding tree. The deeper the tree, with more branches, the greater the structural virality. — [view in context](https://hyp.is/YrCnkMGaEe26VO_ngiIzDw/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: The difference between virality and popularity.
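To pin the definition down for myself: a minimal sketch of structural virality as the average distance between all pairs of users in the cascade tree, computed with networkx. The two toy cascades are hypothetical stand-ins for the stylized trees in the article.

```python
# Minimal sketch: structural virality as the average distance between all
# pairs of users in a cascade tree. Toy cascades stand in for the article's
# stylized trees (broadcast vs. person-to-person spread).
import networkx as nx

def structural_virality(cascade: nx.Graph) -> float:
    """Average shortest-path distance over all pairs of nodes in the tree."""
    return nx.average_shortest_path_length(cascade)

# Broadcast cascade: node 0 (the original poster) is reshared directly by
# 8 followers. Shallow tree, low structural virality.
broadcast = nx.star_graph(8)

# Viral cascade: each share reaches one new person in a 9-user chain.
# Deep tree, high structural virality.
viral = nx.path_graph(9)

print(structural_virality(broadcast))  # ~1.78
print(structural_virality(viral))      # ~3.33
```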
> Virality is Unpredictable — [view in context](https://hyp.is/vvMy1MGaEe26VucOY2V6fA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: [[properties of virality]]

> Viral Content Dominates Our Attention — [view in context](https://hyp.is/xCE2osGaEe2aol8bWnCx7g/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: [[properties of virality]]

> The distribution of engagement is highly skewed. A 2022 paper quantified this for TikTok and YouTube: On TikTok, the top 20% of an account’s videos get 76% of the views, and an account’s most viewed video is on average 64 times more popular than its median video.[18] On YouTube, the top 20% of an account’s videos get 73% of the views, and an account’s most viewed video is on average 40 times more popular than its median video. In general, the more significant the role of the algorithm in propagating content, as opposed to subscriptions or the network, the greater this inequality seems to be. — [view in context](https://hyp.is/-tfwWsGaEe2if2ebgZOrMQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [18] Benjamin Guinaudeau et al., Fifteen Seconds of Fame: TikTok and the Supply Side of Social Video, 4 Computational Commc'n Rsch. 463 (2022).
- Annotation: "for most creators, the majority of engagement comes from a small fraction of viral content"
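The two statistics reported by Guinaudeau et al. are straightforward to compute from a list of per-video view counts; a sketch on made-up numbers:

```python
# Minimal sketch of the two skew statistics from the paper: the share of
# views captured by an account's top 20% of videos, and the ratio of the
# most viewed video to the median video. View counts are made up.
import statistics

def top_share(views: list[int], fraction: float = 0.2) -> float:
    """Fraction of all views that go to the top `fraction` of videos."""
    ranked = sorted(views, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

def max_to_median(views: list[int]) -> float:
    return max(views) / statistics.median(views)

views = [640_000, 40_000, 12_000, 9_000, 7_500, 5_000, 2_000, 1_200, 900, 400]
print(f"top-20% share: {top_share(views):.0%}")    # 95% for this toy account
print(f"max/median: {max_to_median(views):.0f}x")  # 102x
```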
> Viral Content is Highly Amenable to Demotion — [view in context](https://hyp.is/F7buxMGbEe2_1cv7I5GPsA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: [[properties of virality]]

> Demotion, downranking, reduction, or suppression, often colloquially called shadowbanning, is a "soft" content moderation technique in which content deemed problematic is shown to fewer users, but not removed from the platform. — [view in context](https://hyp.is/OhLsAsGbEe2doPfE9i0XlA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)

> Platform companies may have many high-level goals they care about: ad revenue, keeping users happy and getting them to come back, and perhaps also less mercenary, more civic goals. But there’s a problem: None of those goals are of much use when an algorithm is faced with a decision about what to feed a specific user at a specific moment in time. There isn’t a good way to measurably connect this kind of micro-level decision to its long-term impact. That’s where engagement comes in. — [view in context](https://hyp.is/ev6sgsHEEe2j85-Lx1-BLA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Engagement is how decisions are made.

> While there are many differences in the particulars, the similarities between different platforms’ recommendation algorithms overwhelm their differences. And the differences that do exist are generally specific to the design of the platforms. For example, YouTube optimizes for expected watch time, but Twitter doesn’t, because Twitter is not video based. Spotify has the somewhat unique challenge of generating playlists that are coherent as a whole, rather than merely compiling a list of individually appealing track recommendations, so its logic departs somewhat substantially from the above. Perhaps for this reason, it relies more on content analysis and less on behavior.[37] In other words, there is no competitive risk to platform companies from being more open about their algorithms. This might contradict one’s mental picture of the algorithm being closely guarded secret sauce. In a blog post analyzing TikTok, I argued that this view is a myth, but that argument applies to other platforms too. — [view in context](https://hyp.is/u3YZYMHLEe2Imp-etGML8w/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [37] Dmitry Pashtukov, Inside Spotify’s Recommender System: A Complete Guide to Spotify Recommendation Algorithms, Music Tomorrow Blog (Feb. 9, 2022), https://www.music-tomorrow.com/blog/how-spotify-recommendation-system-works-a-complete-guide-2022 [https://perma.cc/4BNE-F7RG].
- Annotation: Transparency about algorithms is not a competitive risk.

> That explains the current unsatisfactory and somewhat paradoxical state of algorithmic transparency. Besides, companies have shared precious little about the effects of algorithms. There have only ever been two published studies from major platform companies looking at the effects of their algorithms, as far as I’m aware. — [view in context](https://hyp.is/75KmeMHLEe2djxtYA2TcCw/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Platforms do not disclose to users/broader public.

> To break it down, let’s start with similarity between users. There are three main types of signals that are available: network, behavior, and demographics. — [view in context](https://hyp.is/uMH_gMHMEe2e6bdjU8Zk7A/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation:
  - "Network refers to the user’s interaction with others: following, subscription, commenting, and so on"
  - "Behavior is the most critical signal. Two users are similar if they have engaged with a similar set of posts. Its importance is a matter of sheer volume."
  - "Demographics refers to attributes such as age, gender, and, more importantly, language and geography. Demographic information is useful when a user first joins the platform since there is little else to rely on"
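A toy sketch of what "engaged with a similar set of posts" means operationally, assuming a simple cosine-similarity formulation; real systems work over enormous sparse matrices with learned embeddings rather than raw row comparisons like this:

```python
# Minimal sketch of behavior-based user similarity: each user is a vector
# over posts (1 = engaged), and similarity is the cosine between vectors.
# The users and posts are hypothetical.
import numpy as np

# Rows = users, columns = posts.
engagement = np.array([
    [1, 1, 0, 0, 1],  # user A
    [1, 1, 0, 0, 0],  # user B: engaged with a similar set of posts as A
    [0, 0, 1, 1, 0],  # user C: no overlap with A
])

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(engagement[0], engagement[1]))  # ~0.82: similar users
print(cosine(engagement[0], engagement[2]))  # 0.0: dissimilar users
```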
> In particular, keeping the computation tractable is a major challenge. The volume of information is vast: Based on the back-of-the-envelope calculations for TikTok above, the number of behavioral records may be of the order of a quadrillion (10^15). A naive algorithm—for instance, one that attempted to compute the affinity between each user and each post—would be millions of times slower than an optimized one, and no amount of hardware power can make up the difference. — [view in context](https://hyp.is/ndASrMHREe2Z6HP3y-Huug/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Translating algorithm descriptions to working versions is intensive.

> It’s worth pausing to ask how well recommendation algorithms work. It may seem obvious that they must work well, considering that they power tech platforms that are worth tens or hundreds of billions of dollars. But the numbers tell a different story. One way to quantify it is by engagement rate: the likelihood that a user engages with a post that is recommended to them. On most platforms, it is less than 1%. TikTok is an outlier, but even there, it is only a little over 5%.[51] This is not because the algorithms are bad, but because people are just not that predictable. As I’ve argued elsewhere, when the user interface is good enough, users don’t mind the low accuracy. — [view in context](https://hyp.is/ITIOTMHTEe2bm8NH83FuqA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [51] Elena Cucu, [STUDY] TikTok Benchmarks: Performance Data and Stats Based on the Analysis of 616,409 TikTok Videos, Socialinsider Blog (Sept. 21, 2022), https://www.socialinsider.io/blog/tiktok-benchmarks/ [https://perma.cc/86T4-29T2].
- Annotation: How well do recommender systems work? Users are unpredictable.

> If they are so hit-or-miss, how can recommendation algorithms possibly be causing all that is attributed to them? Well, even though they are imprecise at the level of individual users, they are accurate in the aggregate. Compared to network-based platforms, algorithmic platforms seem to be more effective at identifying viral content (that will resonate with a large number of people). They are also good at identifying niche content and matching it to the subset of users who may be receptive to it. I believe it is in the aggregate sense that recommendation algorithms are most powerful—and sometimes dangerous. — [view in context](https://hyp.is/Xz42UsHTEe27cffZDPJKJQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: "Recommender systems are accurate in the aggregate"

> there should be no need to manually adjust the formula by affinity. If users like to see content from friends over brands, the algorithm should be able to learn that—again, at a granular, per-user level that cannot be achieved by manual tweaking of weights. Why, then, does the formula use affinity scores? It appears to be an explicit attempt to fight the logic of engagement optimization, manually programming in a preference for friends-and-family content even at the expense of short-term engagement with the aim of increasing long-term satisfaction, which the algorithm can’t measure. — [view in context](https://hyp.is/pL1M5MIzEe2eLK_G_ZGmtQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Affinity scores in Facebook’s MSI calculation offset the engagement optimisation and manually set a preference for friends and family content.
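A sketch of the shape of the formula being described: predicted interaction probabilities, a weight per interaction type, and a manual affinity multiplier. Every number here is invented for illustration; these are not Facebook's actual MSI values.

```python
# Minimal sketch of an MSI-style ranking score as I understand the
# description: a weighted sum of predicted interaction probabilities,
# scaled by a manual affinity multiplier for friends-and-family content.
# All weights and probabilities are hypothetical.
WEIGHTS = {"like": 1.0, "reaction": 5.0, "reshare": 15.0, "comment": 30.0}

def msi_score(predicted: dict[str, float], affinity: float) -> float:
    """Affinity-scaled weighted sum of predicted interaction probabilities."""
    return affinity * sum(WEIGHTS[k] * p for k, p in predicted.items())

# A brand post predicted to draw slightly more engagement than a friend's.
friend_post = {"like": 0.10, "reaction": 0.02, "reshare": 0.01, "comment": 0.01}
brand_post = {"like": 0.12, "reaction": 0.03, "reshare": 0.02, "comment": 0.02}

# The affinity multiplier lets the friend's post outrank the brand's even
# though its raw weighted engagement (0.65 vs. 1.17) is lower.
print(f"{msi_score(friend_post, affinity=2.0):.2f}")  # 1.30
print(f"{msi_score(brand_post, affinity=1.0):.2f}")   # 1.17
```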
> Comments are overwhelmingly more important than any other type of interaction. Although it doesn’t seem to have been reported in the press, a likely consequence of these weights is that posts that implicitly or explicitly encouraged users to comment would have done even better after this change. And one reliable way to encourage people to comment is to post divisive content. — [view in context](https://hyp.is/E_-c4sI0Ee2HoTMCzq2fxg/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Comments appear to have the highest weight, incentivising divisive content.

> It shouldn’t be surprising, though, that attempting to steer a system of such extraordinary complexity using so few knobs would prove challenging. — [view in context](https://hyp.is/M8z6JMI0Ee2_Gu9Tw0-GkQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: When efforts fail, too few knobs for such complex systems and interactions.

> There are obvious and important arguments in favor of neutrality. After all, platforms are already under attack from all sides of the political aisle for supposedly being biased. But neutrality is hard to achieve in practice. Many biases are emergent effects of these systems. One is the rich-get-richer effect: Those who already have a high reach, whether earned or not, are rewarded with more reach.[67] For example, the top 1% of authors on Twitter receive 80% of tweet views.[68] Another is demographic bias: Users’ tendency to preferentially engage with some types of posters may be amplified by the algorithm.[69] Ultimately, designing for neutrality ends up rewarding those who are able to hack engagement or benefit from social biases. — [view in context](https://hyp.is/FNyY7sI1Ee2K-pfyfEtX1g/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [67] Linhong Zhu & Kristina Lerman, Attention Inequality in Social Media, Cornell Univ., ArXiv (Jan. 26, 2016), https://arxiv.org/abs/1601.07200 [https://perma.cc/72X9-M937].
> [68] Tomo Lazovich et al., Measuring Disparate Outcomes of Content Recommendation Algorithms with Distributional Inequality Metrics, 3 Patterns (2022), https://www.cell.com/action/doSearch?text1=Measuring+Disparate+Outcomes+of+Content+Recommendation+Algorithms+with+Distributional+Inequality+Metrics&field1=AllField&journalCode=patter&SeriesKey=patter [https://perma.cc/99S3-YHXU].
> [69] Christine Bauer & Andrés Ferraro, Music Recommendation Algorithms Are Unfair to Female Artists, but We Can Change That, The Conversation (Mar. 30, 2021, 8:58 AM), https://theconversation.com/music-recommendation-algorithms-are-unfair-to-female-artists-but-we-can-change-that-158016 [https://perma.cc/7R56-R3SP].
- Annotation: "Ultimately, designing for neutrality ends up rewarding those who are able to hack engagement or benefit from social biases." Neutrality is also a power-replication machine?

> I think there are two driving principles behind Facebook engineers’ thinking that explain why they’ve left themselves with so little control. — [view in context](https://hyp.is/Rc5dZsI1Ee20fSMKfkzIpA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation:
  1. "First, the system is intended to be neutral towards all content, except for policy violating or “borderline” content."
  2. "The second main driving principle is that the algorithm knows best. This principle and the neutrality principle reinforce each other."

> The algorithm-knows-best principle means that the same optimization is applied to all types of speech: entertainment, educational information, health information, news, political speech, commercial speech, art, and more.[70] If users want more or less of some types of content, the thinking goes, the algorithm will deliver that. — [view in context](https://hyp.is/WbbZNMI1Ee2_HZ-wc0uOzA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [70] There are a few small exceptions. For example, after the 2020 U.S. elections, Facebook began experimenting with decreasing the amount of political content in news feeds (Anna Stepanov, Reducing Political Content in News Feed, Meta Newsroom (Feb. 10, 2021), https://about.fb.com/news/2021/02/reducing-political-content-in-news-feed/ [https://perma.cc/AC3G-DP2U]). Later in 2021, the company made a more drastic attempt to demote all political content, but this had the unanticipated effect of suppressing high-quality news sources more than low-quality ones, and misinformation in fact rose (Jeff Horwitz et al., Facebook Wanted Out of Politics. It Was Messier Than Anyone Expected., Wall St. J. (Jan. 5, 2023, 9:51 AM), https://www.wsj.com/articles/facebook-politics-controls-zuckerberg-meta-11672929976 [https://perma.cc/R5MP-A2L2]).

> The issues I identify in this section will persist even if companies improve transparency around their algorithms, invest more resources into content moderation, and provide users more control over what they see. — [view in context](https://hyp.is/lIXIRMJAEe2PvKuA4Utzog/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Optimising for engagement. Note, this appears to be a different critique than the 'ad model'.
> The problem with implicit feedback is that it relies on our unconscious, automatic, emotional reactions: “System 1,” rather than our rational and deliberative mode of thought: “System 2.”[75] A rich literature in behavioral economics documents the biases that System 1 suffers from. A TikTok user might swipe past a video by a medical expert reminding people to get a flu shot because she doesn’t look like the stereotype of a medical expert, and dwell on an angry video that they enjoy in the moment but regret later. By default, implicit-feedback-based feeds cater to our basest impulses. — [view in context](https://hyp.is/zQJyvMJAEe2UaitBx6uLAw/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [75] Daniel Kahneman, Thinking, Fast and Slow (Macmillan, 2011).
- Annotation: Implicit feedback, i.e. inferring preferences from users’ actions, is imperfect.

> The increase in distribution of viral content comes at the expense of suppressing more boutique types of content. — [view in context](https://hyp.is/Z6jsksJBEe23aNvnSIwRYQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)

> I want to highlight one particular set of harms to society, pertaining to institutions and markets: — [view in context](https://hyp.is/tzEIisJBEe2PtafESTeYcw/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)

> Each institution has a set of values that make it what it is, such as fairness in journalism, accuracy in science, and aesthetic values in art. Markets have notions of quality, such as culinary excellence in restaurants and professional skill in a labor market. Over decades or centuries, they have built up internal processes that rank and sort what is produced, such as peer review. But social media algorithms are oblivious to these values and these signals of quality. They reward unrelated factors, based on a logic that makes sense for entertainment but not for any other domain. — [view in context](https://hyp.is/HPyKmsJCEe295BeXNVWPyA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)

> Why haven’t they done more? The most obvious explanation is that it hurts the bottom line. There’s certainly some truth to this. The reliance on subconscious, automatic decision making is entirely intentional; it’s called “frictionless design.” The fact that users might sometimes exercise judgment and resist their impulses is treated as a problem to be solved.[82] — [view in context](https://hyp.is/kEQuVMJCEe2FHbNYT_EcKQ/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)

> I don’t think this is the entire answer, though. The consistent negative press has genuinely hurt platforms’ reputation, and there have been internal efforts to do better. So it’s worth talking about another limitation. Most of the drawbacks of engagement optimization are not visible in the dominant framework of platform design, which places outsize importance on finding a quantitative, causal relationship between changes to the algorithm and their effects — [view in context](https://hyp.is/oESpKMJCEe2XkmstkVHmjA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> The user’s experience of the app as an individual is, on balance, positive at all time scales, but there has been a barrage of negative press about its harmful effects on other people and for democracy. The disconnect could be because individual users don’t necessarily internalize societal harms: Users who consume election misinformation may actually love it. Or it could be because some harms such as privacy are structural and cannot be understood as the aggregate of individual, transactional harms — [view in context](https://hyp.is/AwcrssJDEe2Yp6usEK1zhg/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Harms manifest in the aggregate, and may even be perceived as benefits at the level of the individual.

> Experimenting on users critically relies on the assumption that each user’s behavior is independent. Collective harms completely violate this assumption. Even if the platform were to run a yearslong A/B test, societal-scale harms such as undermining democracy affect all users (and nonusers), so the churn in the experimental group wouldn’t necessarily be any higher than in the control group. — [view in context](https://hyp.is/wcX3DsJDEe2vlTMV_RNVrA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Limitations of data science. How do you test for harms to democracy?
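To convince myself of the point: a toy simulation in which a societal-scale harm raises churn for treatment and control users alike, so the A/B comparison comes out flat no matter how long it runs. All numbers are invented.

```python
# Toy simulation: a collective harm affects everyone, treated or not, so
# the between-group churn comparison measures nothing even though the
# harm is real. All probabilities are invented.
import random

random.seed(0)
N = 100_000
BASE_CHURN = 0.10       # baseline probability a user leaves
COLLECTIVE_HARM = 0.05  # extra churn from a societal-scale harm that
                        # spills over to everyone, not just the treated

def churned() -> bool:
    # Because the harm is collective, group membership does not change
    # the probability: both arms see BASE_CHURN + COLLECTIVE_HARM.
    return random.random() < BASE_CHURN + COLLECTIVE_HARM

treatment = sum(churned() for _ in range(N)) / N
control = sum(churned() for _ in range(N)) / N
print(f"treatment churn: {treatment:.3f}")  # ~0.150
print(f"control churn:   {control:.3f}")    # ~0.150: the harm is invisible
```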
> At their core, recommendation algorithms are a response to information overload — [view in context](https://hyp.is/Ej9vEsJEEe2rpo9T3V4NCA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: This is key. We haven't found the balance between information scarcity and information overload.

> Finally, let’s keep in mind that "reverse chronological" is an algorithm, albeit a simple one. Chronological feeds are not normatively neutral: They are also subject to rich-get-richer effects, demographic biases, and the unpredictability of virality. There is, unfortunately, no neutral way to design social media. Algorithmic recommendations could in fact be an opportunity to actively counteract harmful patterns of information propagation. — [view in context](https://hyp.is/XPqiQsJEEe2tpqfC1_9t9w/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Reverse chronological is also an algorithm, subject to many of the same issues:
  - rich-get-richer effects
  - demographic biases
  - the unpredictability of virality

> Recommendation algorithms remain poorly understood by the public. This knowledge gap has consequences ranging from mythologizing algorithms to policy stumbles.[87] Of course, algorithms aren’t the whole picture: Just as important is the design of social media, platform processes, their incentive structures and, most critically, human-algorithm interactions. Demanding much more transparency from platform companies—and not being easily swayed by their arguments about competitive risks—will go a long way toward improving our understanding of all these aspects of social media. — [view in context](https://hyp.is/AFymksJFEe2XlT-KRI4DvA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
> [87] Kelley Cotter et al., In FYP We Trust: The Divine Force of Algorithmic Conspirituality, 16 Int'l J. of Commc'n 2911 (2022); Daphne Keller, Amplification and Its Discontents: Why Regulating the Reach of Online Content is Hard, 21-05 Knight First Amend. Inst. at Colum. Univ. (June 8, 2021), https://knightcolumbia.org/content/amplification-and-its-discontents [https://perma.cc/23KP-27GT].

> imagine a future where children learn how platform algorithms work, just as they learn about other types of civic infrastructure and grow up empowered to participate in a healthier way on algorithmic platforms, as well as to help govern them. — [view in context](https://hyp.is/CwArbsJFEe2H3lcrA6XulA/knightcolumbia.org/content/understanding-social-media-recommendation-algorithms)
- Annotation: Creators already do.