Yes, a government bureaucracy can regulate Facebook without eviscerating the first amendment.
As the title suggests, I’m going to try to make the argument that it is completely possible to regulate Facebook and other social media without reactively censoring objectionable content. I’m writing this up after some Twitter conversations in which it became clear to me that I can’t overcome the common objections in 280-character quips. The idea, in a nutshell, is this: clean up the algorithms themselves so that the kind of content we don’t want doesn’t take over the network.
Facebook would like you to believe that monitoring content is impossible. If I were more cynical, I would believe the ham-fisted efforts we’ve seen so far intentionally miss the mark. We’ve all heard many stories of friends getting put in “Facebook jail” over anodyne posts that somehow triggered the mechanisms. I’m not cynical enough to believe they’re doing this deliberately, but I think they’re putting only a small token of their R&D budget into solving these problems. There’s a basic conflict of interest, and nobody is forcing them to do anything about it. That’s why their content moderation algorithms suck. My thesis here is that this can be changed.
Now to be very clear, I’m not talking about a panel of all-powerful, presidentially appointed moderators who zap objectionable content as it comes out, scarring the Facebook landscape with lightning bolts while purging the world of all hateful thoughtcrime. This is absolutely not what I’m talking about. Not only would this probably fail, it would also be abused, especially if the moderators are appointed in bad faith.
I’m going to outline something crude and slightly wonky. I’m not a government bureaucrat, so I’m not going to do this very well. My goal is only to argue that it can be done.
What the Algorithms Do
Before outlining a solution to the problem, I will try to explain how I see the problem. The instant you open your Facebook app, there is a pool of hundreds of posts that Facebook could decide to put on your feed. Facebook has to decide which ones to put in your face and which ones not to. So how does it do this? My guess is that it gives each piece of content a “you-score”. Maybe that score is tailored to you specifically, maybe it’s tailored to users with some affinity to you; I don’t know the exact underlying algorithm, as I don’t work there, but somehow some sort of you-score is computed that predicts how well Facebook’s objectives will be satisfied by putting a particular post into your feed.

An easy first guess of what the score might measure is your probability of engaging with the post itself. Another possibility is that the score is the expected amount of time you spend on Facebook after seeing the post. For example, seeing a picture of a family doing happy family stuff might cause you to put down your phone and engage with your loved ones, while for someone else that same picture might cause a pang of emptiness and propel them on a scrolling journey, searching for something elusive. A political post that utterly infuriates you might inspire you to scroll longer, looking for blood, hoping to ask one of your politically opposite friends just exactly who the “sheeple” are now.

Again, I don’t know exactly what goes into this decision. I can only follow the incentives. From what I can tell, Facebook has two primary incentives as regards any user’s time on Facebook. First, they want you to stay on Facebook for as many hours of the day as possible. Second, they want as much information about you as possible.
The more information they have about you, the more they will be able to keep you engaged and online, and more importantly, the better they can deliver you (or people who behave like you) perfectly placed advertisements, delivering the best possible value to their real clients. Now, this brings up another likely possibility for how they compute the you-score: predict the probability that you will click through an advertisement that they’re going to show you in about 90 seconds. These sorts of predictions are completely achievable by modern AI, especially considering the trillions of interactions that Facebook is constantly mining. So it’s possible that if they show you an article about “the Fed printing money” now, they might follow it up with a post of your brother-in-law on his boat and then, 45 seconds later, hit you with an advertisement to join Coinbase to achieve financial freedom, knowing this has a high probability of a click-through. This might sound diabolical, but this is a giant corporation with a war chest of billions of dollars and zettabytes of data on human behavior, driven only by the motivation to deliver value to its shareholders. Again, I do not work at a social media company and am not privy to their exact algorithms, so the above is conjecture.
To summarize, Facebook, I believe, scores each post according to some metric and uses this score to decide which one to put in your face.
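To make the idea concrete, here is a toy sketch of that ranking step. The function names and the fields are invented for illustration; the real model, whatever it predicts, is unknown to me.

```python
# Hypothetical sketch of the feed-ranking step described above.
# predict_you_score() stands in for Facebook's real (unknown) model;
# here it just reads a precomputed engagement prediction.

def predict_you_score(post, user):
    """Toy stand-in: the you-score is a predicted engagement probability."""
    return post["predicted_engagement"]

def rank_feed(candidate_posts, user, k=10):
    """Return the k candidate posts with the highest you-score."""
    ranked = sorted(candidate_posts,
                    key=lambda p: predict_you_score(p, user),
                    reverse=True)
    return ranked[:k]

candidates = [
    {"id": 1, "predicted_engagement": 0.12},
    {"id": 2, "predicted_engagement": 0.87},
    {"id": 3, "predicted_engagement": 0.45},
]
top = rank_feed(candidates, user=None, k=2)
print([p["id"] for p in top])  # → [2, 3]
```

Whatever the score actually measures, the mechanics are the same: score every candidate post, sort, and show you the top of the list.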
So the first step is to ask them to show us what goes into this you-score. Not the entire algorithm, just the objective they’re trying to optimize. This is not as complicated as they would like you to believe. We wouldn’t dare ask them to reveal their entire AI stack and the IP therein: The architecture, the number of layers, the parameters, the hyperparameters, the state-of-the-art advances in NLP. We only need to know what is being predicted. Whatever is being predicted has to be something specific because this is how machine learning works; you create a model, and you evaluate the model based on how well it predicts something you actually can measure. Then you tweak the model so that it predicts this objective better. The power of modern-day machine learning is that you can tweak the model incrementally trillions of times so that it works really damn well.
To reiterate this point, in order to train any machine learning model, you have to have your model make a prediction of something specific so that you can evaluate if your model is garbage or not. You have to have that specific prediction be something real and observable, otherwise, you can’t train the model. If you know a few things about machine learning, you know that there are probably huge sets of other intermediate variables that don’t necessarily have real-world meaning, they might describe latent spaces or embeddings — we’re not asking for these variables which can be in-house proprietary IP, we’re asking for what the output is supposed to predict.
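A tiny worked example of that point, with an invented one-weight model: the architecture (the weight, the sigmoid) is the proprietary part, but the training target has to be something observable, here “did the user click?” That target is what we would ask Facebook to declare.

```python
import math

# Toy one-feature logistic model. The internals are arbitrary; the
# point is that training requires a concrete, measurable target.

def predict(w, x):
    """Predicted probability of a click, given feature x and weight w."""
    return 1 / (1 + math.exp(-w * x))

def train(data, steps=1000, lr=0.1):
    """Nudge w to better predict the observed outcome (clicked: 0 or 1)."""
    w = 0.0
    for _ in range(steps):
        for x, clicked in data:
            p = predict(w, x)
            w -= lr * (p - clicked) * x  # gradient of the log-loss
    return w

# (feature, did-they-click) pairs -- the observable ground truth
data = [(1.0, 1), (2.0, 1), (-1.0, 0), (-2.0, 0)]
w = train(data)
```

Without the `clicked` column there is nothing to compute the loss against, and the model cannot be trained at all. That is why “what is being predicted” is a well-defined thing Facebook could be made to disclose.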
Step 1 in regulating Facebook: they can keep 99.9% of their AI in a black box, but they have to tell us what hard predictions the score is based on. Sure, they will throw a fit and talk about how this cramps their style, but pretty much every industry has annoying regulations that we just deal with. This won’t end Facebook.
Now if you’re still with me, surely some of you are still waiting for the moment you get to scream out, “but the first amendment.” We’ll get to that, but so far we’re fine — I think. All we’ve done is what Congress can do according to the commerce clause, by my estimation. (I’m not a legal expert, but I do know a few words.) All I’m suggesting so far is that Facebook has to declare how they are using the content that others have created in order to sell advertisements. To my unassuming legal mind, we’re not talking about regulating speech, we’re talking about regulating commerce, which Congress has every right to do. They might complain about privacy, but corporations have no such right.
We need a government agency to get involved and regulate. It’s impossible to have a one-size-fits-all statute, because we don’t know exactly how this will play out. The regulators should have some flexibility.
To begin, we require that any company whose business model is to deliver content creators to advertisers at a certain scale, and which uses algorithms to tailor the content and advertisements (I won’t be too precise here, but it’s feasible to describe pretty well what a social media corporation does), is required to register as a social media company if it wants to operate in the US. They register with some agency created by Congress, which we’ll just call the Social Media Oversight Bureau, or SMOB. The social media company has to declare what the top decision layer of the algorithm is based on.
Now SMOB makes rules about what this function can and can’t be. I don’t know exactly what form these would take, but they could say, for example, that interactions with future advertisements can only account for 45% of the variation in this score, with appropriate specificity. But then we get to add a more controversial layer: we require that the deciding algorithm weighs objectionable qualities as well. This is where I know I’m going to get the most flak, so I’ll try to explain this carefully.
For any piece of content created on Facebook, one could judge whether it communicates something false, or whether it suggests violence; there are quite a few other categories as well: racism, harassment, etc. The science of judging what is and is not violent is imperfect, but it’s better than most people think. A picture of your dog curled up next to you watching Bridgerton is communicating something neither false nor violent. An ask for burger recommendations in Bend, Oregon is neither false nor violent. Most posts are not. A picture of the border wall in Texas with a caption claiming the wall is on the border between Mexico and Guatemala, on the other hand, is objectively false. A post picturing Mike Pence in a gallows with the comment “this is what we do to traitors” is suggesting violence.

So how do we de-emphasize such posts? Working for the SMOB is a large team of content regulators. They take thousands of random posts coming through Facebook and score them on various axes. Say, anodyne posts get 0, slightly iffy ones get 1 or 2, iffier ones get 3 or 4, and the horrible nasty ones get a score of 10. Or something. They send these scores back to Facebook. Facebook then has to come up with its own algorithms for predicting these scores, based on its own internal AI. My guess is that, if they actually tried to do this, the result would be fantastically accurate. Note that we are not yet zapping speech. We are asking Facebook to grade a piece of content based on the probability that it contains false or violent information. Then, here’s the BIG REGULATION: we require Facebook to use these violence, falsity, and other scoring factors in computing their you-score for a post. The SMOB can come up with specifics for a formula, or bounds for a formula.
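One possible shape for such a formula, purely as a sketch: the 0–10 scales match the rater scores described above, but the penalty weights are invented for illustration; in practice the SMOB would set the actual weights or bounds.

```python
# Hypothetical SMOB-regulated you-score. `engagement` is Facebook's
# own prediction (0-1); `falsity` and `violence` are the mandated
# predictions on the raters' 0-10 scale. The 0.1 weights are invented.

def regulated_you_score(engagement, falsity, violence):
    """Engagement prediction discounted by regulator-mandated factors."""
    penalty = 0.1 * falsity + 0.1 * violence
    return max(0.0, engagement - penalty)

# A highly engaging but likely-false-and-violent post loses
# to an anodyne dog picture:
nasty   = regulated_you_score(engagement=0.9, falsity=8, violence=9)
dog_pic = regulated_you_score(engagement=0.5, falsity=0, violence=0)
print(nasty, dog_pic)  # → 0.0 0.5
```

The point is not this particular formula; it’s that once the objectionability scores exist as numbers, forcing them into the ranking objective is a mechanical, enforceable requirement.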
This will also be verifiable and auditable: Facebook should be required to make available the scores for any specific post or set of posts, so that it’s verifiable that their algorithms are predicting these things with reasonable accuracy.
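The audit itself could be as simple as comparing Facebook’s predicted scores against the SMOB raters’ labels on a sample of posts. A minimal sketch, with an invented error tolerance:

```python
# Sketch of the audit described above: measure how far Facebook's
# predicted falsity/violence scores stray from the SMOB raters'
# labels on a random sample. The 1.5 tolerance is an invented number.

def audit(predicted, labeled, tolerance=1.5):
    """Return (mean absolute error, whether the model passes)."""
    errors = [abs(p - l) for p, l in zip(predicted, labeled)]
    mae = sum(errors) / len(errors)
    return mae, mae <= tolerance

facebook_scores = [0.2, 7.8, 1.1, 9.5]  # Facebook's predictions
smob_labels     = [0,   8,   1,   10]   # SMOB raters' scores
mae, passed = audit(facebook_scores, smob_labels)
print(round(mae, 3), passed)  # → 0.25 True
```

If the error drifts past the tolerance, the regulator knows the declared scoring model no longer reflects reality, which is exactly the kind of thing an agency can write rules about.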
But Free Speech
Now let’s play through the free speech issue. Fred posts a racist meme. Facebook runs it through their AI and, .003 seconds later, determines that showing Steve this post might make Steve more amenable to the LifeLock advertisement they’re about to show him. But because there’s a 99.7% probability that the post will be deemed racist and violent, their algorithm, regulated by SMOB, gives it a bad “Steve-score” and instead opts to show Steve a picture of his sister’s dog modeling the new doggie-scarf. Is this a violation of free speech? SMOB did not stop Fred from posting the nasty racist meme. It’s still there, on his page. If you go to Fred’s page, there’s the racist meme. Just like nobody stopped Steve’s sister from posting a picture of her dog. We simply ask Facebook’s algorithm to factor potential racism into the decision about which post to show Steve. We’re not regulating speech at this point; we’re regulating commerce, because this involves Facebook monetizing content that Fred made, and Congress has every right to regulate that. Again, Fred’s racist meme is still there. It’s not suppressed, it’s not hidden or flagged, it’s just not being used to sell LifeLock subscriptions. If Facebook wants to go the extra mile and remove it, that’s up to them.
Now for objection #1: “It’s all subjective! Truth is relative.” I don’t buy this. The objection holds only when interpreted in the absolute. Think about all the judges and juries our court systems are built upon. There’s subjectivity all over the place, but the system works. Not perfectly, but it works, because we believe in the notion of truth as something that can be determined. We believe that there is something called a “reasonable person”. We believe in concepts such as “reasonable doubt”. Reasonability is all over the legal system, and while it is extremely vague, it’s both flexible and rigid enough that our system works orders of magnitude better than the alternatives. This cynical bullshit that truth can’t actually be determined is at the heart of the Putin/Trump world order. Since we can’t trust truth, we can only trust the strongman.
Objection #2: “What if Trump Jr. wins in 2024, takes over the SMOB, and orders all of the truth-determiners to determine that 2+2=5 and racism is not racist?” Simple: you just set it up so they can’t. Create an independent SMOB run by an independent board. I don’t know, say 13 people on the board. They have lifetime positions, can resign at any time, and can only be replaced by a vote of the remaining people on the board. And these board members do all the hiring and firing of the people below them in the bureau. I’m not a legal expert here, but I can google: https://www.justia.com/administrative-law/independent-agencies/
“Independent agencies are not subject to direct control by the president or the executive branch, unlike executive agencies. The leaders of independent agencies do not serve as part of the president’s Cabinet.
To create an independent agency, Congress passes a statute granting an agency the authority to regulate and control a specific area or industry. The statute provides clear guidelines for the objectives that the agency must work toward and specifies the extent to which the independent agency may exercise rulemaking authority. The regulations enacted by an independent agency have the full force and power of federal law.”
So in order for this to continue to function without Josh Hawley taking over and reprogramming The Truth, you simply have to believe that a group of 13 reasonable board members can find other reasonable people to replace them as they retire or move on. The probability of this going off the rails any time in the near future seems quite remote. And if Josh Hawley can take over the board and declare that 2+2=5 is no longer false, we’re probably pretty much cooked at that point anyway. The goal is to keep people, the voters, sane enough that they don’t elect the crazy politicians. If we keep electing Trumpy politicians, we will eventually succumb to some attack on the system. Having a more sane populace is the first defense against that.
Objection #2a: “But the institutions failed us. We can’t trust them.” No, the institutions didn’t fail us. They were tested, some of them failed, but enough of them held. Yes, the Senate should’ve convicted Trump the first time, but we can be thankful that Republicans across the country didn’t pervert the results of the election, despite numerous opportunities to do so. The difference seems to be that Senators behave like politicians, who tend to be truth-agnostic. On the other hand, the rest of us, whether Republicans or Democrats, the people who count votes and register voters, etc., tend to be less corruptible. So apart from the fact that some political institutions (like the Senate) failed us at some points during Trump’s presidency, I don’t think this means every institution failed us. Trump tried. He tried to overturn the election and it didn’t work, because Truth still holds a lot of value in the US. What we experienced is a reason to build up more institutional protections, not fewer.
Objection # 3. “Do not use power to suppress opinions you think pernicious, for if you do the opinions will suppress you.”
Here’s where I’m going to have to disagree with the free-speech absolutists on the notion that people can always figure out the truth eventually. This has been proven demonstrably false over the last decade. The internet has completely failed to help people determine the truth; instead, it has given people the conviction to believe whatever they want. Facebook has created completely different reality universes. So the whole idea that the truth will triumph if we just get out of the way does not hold in the social media regime.
The consequences of living in a post-Truth country are obvious. We get Trump and QAnon and attacks on Congress, and this won’t stop in the foreseeable future.
Taking it one step further
If I were to run the SMOB, I would go even further. My feeling is that much of the content, especially political content, is basically junk, even though it might not be explicitly false or violent. This junk content does not contribute to the dialogue at all; it serves only one purpose, to keep people outraged. This is the real sugary-soda/nicotine of Facebook. And it too can be quantified. It’s slightly more delicate to do, but it can be scored and minimized. This is something that definitely plagues both sides of the political spectrum: so much of the content is only slightly misleading, or based on an appeal to hypocrisy, or something that leaves users feeling angry and outraged, and this is the emotional manipulation that keeps users engaged and on Facebook. The 24/7 bombardment of outrage also makes people less amenable to considering the other side of the story.
Also, Blockchain. What could possibly go wrong?
Facebook is also attempting to create a stablecoin cryptocurrency that it would love to see become a widely used international settlement currency. Noble aspirations to “bank the unbanked” aside, the ability to pair user data with everything you do with your money in the future (even when you put the app down) is crazy. Blockchains are open and traceable, provided you know which addresses belong to whom. So Facebook would be able to compare which content they show you with your spending habits after they show it to you. I’m not willing to just let them run away with this power, considering how untrustworthy their behavior has been in the past.