Deepfakes are near-perfect videos of things that never happened — of Presidents, and Prime Ministers, and professors, and scientists, and historians, and celebrities, and ordinary people saying things they never said, and doing things they never did. Deepfakes are near-perfect photos and morphing videos of people who never existed. Deepfakes are near-perfect “nude photos” of real people who were actually photographed with their clothes on. Deepfakes are near-perfect pornographic videos of real people — including celebrities — having sex with fake people, or with other celebrities, or people they’ve never met.
Deepfakes combined with the internet and social media are a reality-obliterating engine without parallel in the history of the world.
Deepfakes already exist to a limited extent, and at this point, they’re mostly just a novelty. As they get better, they will further decimate our collective ability to inhabit a shared reality. Deepfakes could also alter our understanding of history, because deepfakes can be artificially aged, and made to mirror (and distort) any historical moment. That is, any moment since photography or motion-picture film or video were invented. Which means that both the powerful, and mischief-makers, will gain yet another devastating tool of mass confusion, stylized for any moment of interest, past or present.
Photography and photomanipulation
The photographic era began in the early part of the 19th century. Photography promised truth beyond what a painter could render of a scene. The light that passed through a lens and onto a photographic plate was an accurate representation of what existed in front of the lens at the moment of exposure. Prior to photography, a painter had to decide to capture certain details and omit others. But a photographic plate captured it all. The physical connection of a silver nitrate photographic plate to the light bouncing off reality is what has given photography its power. Power to tell the truth — and to lie.
Photography has always required context. What is in the frame is often less important than what is outside of the frame. By deciding what to include, the photographer makes the far more important decision of what to exclude. A photo is therefore always an editorial. Framing is authorship. And this decision isn’t only spatial, it’s also temporal: when will you trip the camera shutter? We’ll come back to that later.
Photos, of course, could be altered. Alterations were crude at first. Double exposures led to trick photography. Lenses could be used to force perspective and alter apparent scale. Negatives and prints could be painted, or cut and pasted. Photorealistic alterations weren’t easy at first. Then came the airbrush, and finally Adobe Photoshop. Photoshop allows almost infinite layering, masking, cloning, colorizing, airbrushing, scaling, combining, to the point where it’s become its own art form.
For decades, we’ve expected Photoshop-surrealism in art and advertising. Yet a news photographer or news organization can have their reputation ruined overnight for even slightly altering a photo of an event. This is as it should be. We take for granted, even now, that photos we see from hard-news organizations are authentic, because reporting provides us with the necessary context. Beyond mere reputation, though, there is provenance in the form of metadata (location, time, exposure settings), along with corroborating photos from multiple cameras, and often video. Photo fakery can also be detected through forensics, by looking for disturbances in the organically randomized patterns of pixels that are present in authentic photos.
We take note of the frequent stories of fraudulent photomanipulation, even by governments. The most common fakery seems to be cloning: of people to make crowds look larger, or of weapons systems, such as was famously done by Iran’s government. People and objects are also commonly removed from photos.
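Cloning of the kind described above leaves a detectable fingerprint: identical pixel patches appearing at more than one location in the same image. Here is a minimal sketch of that idea in Python, using a toy grid of brightness values. Everything in it (the `find_cloned_blocks` helper, the sample grid) is hypothetical and illustrative; real copy-move forensics match robust block features such as DCT coefficients, not exact pixels.

```python
from collections import defaultdict

def find_cloned_blocks(image, block=2):
    """Group (row, col) offsets whose block x block pixel patches are
    identical -- repeated patches are a telltale sign of cloning."""
    seen = defaultdict(list)
    rows, cols = len(image), len(image[0])
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            patch = tuple(
                tuple(image[r + i][c + j] for j in range(block))
                for i in range(block)
            )
            seen[patch].append((r, c))
    return [locs for locs in seen.values() if len(locs) > 1]

# A toy 4x6 "image" where the top-left 2x2 corner was cloned to the
# top-right, as a forger might duplicate a missile or a crowd.
img = [
    [10, 20, 5, 7, 10, 20],
    [30, 40, 6, 8, 30, 40],
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 1, 2, 3],
]
clones = find_cloned_blocks(img)  # flags the duplicated corner blocks
```

Exact matching only catches naive copy-paste; recompression and retouching defeat it, which is why production tools compare quantized frequency-domain features instead.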
Sometimes obvious photo fakery is the point, especially in misleading memes. Putting heads on different bodies, altering facial expressions, combining people with animals, cutting and pasting people and objects into places they never were. Most of these memes don’t even try to hide their manipulation. Fakery is the point. People’s imagination does the rest. Like false, lurid, salacious prose, fake photos tend to go viral much more readily than real photos. The same is true for obvious video fakery. Haha, President Trump is whacking Hillary with a golf ball, and she falls down. Hahahahahahaha. No one thinks it’s real, but the partisan point is made. The rule of thumb is that fake stories that support hyper-partisan viewpoints circulate about six times faster than real stories. And this rule will no doubt hold for deepfake photos and videos, as well.
The power of photography to fool us goes right back to the truth of that original photographic plate. Seeing is believing. Same thing with video editing. A montage cut from a single train moving in one direction on a track, but mirror imaged, can provide the illusion that two trains are about to hit each other. Subconsciously, when we see an image, we assume it came directly from a camera. When we see an edited video, our brains interpret it as a single, connected event. Our brains haven’t yet caught up to the number-crunching tomfoolery that’s possible. Even when we know better, our brains still see an authentic image recorded as if in silver nitrate, or a video of two trains about to collide — where only one existed.
It doesn’t have to be visual or auditory trickery, though. Good old fashioned out-of-context quoting works also. In October 2020, the Trump campaign deceptively edited a video of Dr. Anthony Fauci making a statement, to make it seem as if he supported President Trump’s Covid-19 response. What he actually said was referring to the efforts of public health officials, not President Trump. When publicly called out on the lie, rather than expressing contrition, GOP campaign officials insisted Fauci’s words were accurate.
Fakery works, and it pays dividends.
Cash and reputational politics
Where clicks translate to money and political power, the incentive for fakery is strong. In democracies, fakery leads to terribly effective perceptual shifts that can, and do, swing elections.
Fake news, photos, and videos hurt good politicians more than they hurt bad ones. Bad politicians, who usher in corrupt governments, have already abandoned any pretense of a good reputation. Their authoritarian “strongman” image is often tied to their faux badassery — of refusing to follow rules and laws. Actual videos of them doing or saying antisocial things seem to help them, rather than hurt them.
Politicians promising clean government and popular reforms must keep their trustworthy reputations intact. They are therefore uniquely vulnerable to photo and video manipulation.
One of the most powerful tools of fascists has always been to brazenly accuse their political enemies of exactly what they’re guilty of. Fascists’ goal is to destroy the very possibility of integrity.
“There is no truth or goodness, everyone’s in the swamp, so you might as well vote for our guy, who’s not pretending.”
This propaganda gambit worked for 20th-century European fascists, and it’s been equally useful to usher in our 21st-century American fascism. Remember the overquoted words of Hannah Arendt, “The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction (i.e., the reality of experience) and the distinction between true and false (i.e., the standards of thought) no longer exist.”
Deepfakes are therefore a direct engine of fascism. Four lights or five lights? With deepfakes, we’ll have no idea.
The Trump administration has relied on this tactic from the very beginning. President Trump was exposed on video bragging about sexually assaulting women, and he got elected anyway in 2016. He has dozens of credible sexual misconduct allegations against him including a lawsuit alleging the rape of a child. He’s now using the Justice Department to defend himself from at least one of his rape accusers. His business interests and his presidency have been unparalleled galleries of open nepotism. His cronyism is legendary. He has multiple bankruptcies and is almost certainly guilty of tax fraud. He’s sold out his country in a thousand ways to foreign interests.
In spite of this, his supporters inexplicably want to “lock up” the Bidens, and spin wild conspiracy theories in which Democrats are the actual pedophiles, and the Bidens are a “corrupt crime family.” Even worse, Democrats are not only pedophiles, but they also eat children and drink their blood! These vicious and slanderous stories circulated against Hillary Clinton in 2016 as well, forming the basis for Pizzagate, which infamously morphed into QAnon. It’s all tabloid “alien baby” nonsense, but it’s been picked up by global news organizations, convincing millions that there might be something to it.
Belatedly, social media companies are now closing the door on QAnon, but the horse has already galloped far, far away from the barn.
Normally, seeing is believing. But with QAnon and other lunatic right-wing conspiracies, these stories have bamboozled about a third of the population without providing any evidence. And certainly nothing for anyone to see.
QAnon, bad as it is, is entirely the product of the pre-deepfake era. What horrors await when conspiracies come complete with fabricated visual “evidence?”
Welcome to the 2020s and beyond
Jordan Peele famously voiced an altered clip of President Obama in 2018, rendering an almost indistinguishable video of him saying things he didn’t actually say. If you looked closely, you could tell that it wasn’t perfect. The voice was also Jordan Peele’s, not Obama’s. So the clip wasn’t 100% fake, just misleading. The astounding fact is that Peele created a plausible video, one that never existed, of an American president speaking. The year before, researchers at the University of Washington had created several fake clips of Obama speaking, using facial modeling.
As always, software gets better, and better, and better. Computers get faster every year. Software gets cheaper, and requires less skill to operate.
In 2018, Baidu released an algorithm to create vocal mimicry. It’s now becoming possible to resynthesize anyone’s speech to make them convincingly say anything.
In 2019, thieves used an audio deepfake of a CEO’s voice to wire transfer a quarter of a million dollars into their account. They were never caught.
The visual portion of the deepfake will soon be undetectable to the naked eye. It won’t matter that forensics can still detect a faked video, or synthetic voice clip, or that a victim of this fakery might strenuously deny ever having uttered the words.
Increasingly, seeing and hearing will be believing.
Forget kompromat. That’s old school. Imagine near-perfect deepfake video of a candidate masturbating in a public bathroom, or having sex with someone who’s not their spouse. Imagine near-perfect deepfake video of a candidate shoplifting or committing arson, or consorting with known political rivals, or criminals, or taking bribes from business leaders. Imagine a near-perfect video of a straight, married candidate cavorting in a gay bar, or going to a motel with a rent boy. Imagine someone releasing a near-perfect rendition of a Trumpian “pee tape.”
Manipulating voters like this is becoming too damn easy. Fakes that reinforce voters’ confirmation bias are a slam-dunk. So we know they will increasingly become a fact of political life.
Flooding the justice zone with shit
Cell phone video of police brutality has become an incredibly powerful tool of social justice. American history is filled with lynchings of black people. But long after lynchings supposedly stopped in 1981, police were still murdering black people with impunity.
Before we saw the video of George Floyd, or Philando Castile, or Eric Garner, black men in America were slaughtered in large numbers, under color of authority. We don’t know most of their names or their stories, because no one was there with a camera. Seeing it happen frequently in high-definition video changed everything. Now cops are on notice that bystanders will be recording them. They also have body cameras. Turning off those cameras automatically places them under suspicion.
That’s huge progress!
And yet — increasingly — people who don’t want to admit what’s happening are questioning the significance of actual police murder videos!
“What happened before the tape” is a repetitive refrain, inexcusably used by those who oppose police reform. This brings me back to my point at the beginning of the article, about spatial and temporal framing, and editorializing: “What’s inside the frame,” vs. “what’s outside the frame,” becomes, “when was the camera started, and when did it stop?”
In the mind of a police apologist, a victim of a police shooting is always imagined to have done something horrible, deserving of death, before the camera was turned on. So even now, in the pre-deepfake era, even when a person is lying on the pavement under a cop’s knee, some Americans will still justify their murder. Even if a suspect is on video with their hands up or their back turned, running away, some Americans will still justify their death in a hail of taxpayer-funded bullets.
That’s today. Now imagine what those who want police accountability will be up against in 2025 or 2030: deepfake “bystander videos” of police assassinations that purport to show the event from a different angle, and in a very different light.
By the middle or end of this decade, we won’t be able to trust a single video we see on the internet. We will have experienced dozens of incidents where convincing video of everything from police shootings to presidential interviews, to wars, and genocides, and atrocities, and beheadings, will have been exposed as fake. The enemies of democracy and accountability will churn out an endless stream of nonsense, complete with alternative “video evidence.”
If nothing is real, then no one is accountable, and anyone can be accused, or exonerated.
When you can’t trust evidence, everyone’s either guilty — or everyone’s innocent, depending on your point of view. Cynicism will be off-the-charts, even by 2020’s abysmal standards. Eventually we may adapt, and discover ways to authenticate footage, and suppress deepfakes. But until then, it’s going to be the Wild West for an indefinite amount of time. The feudalist and the anarchist are both salivating equally at the broad opportunity deepfake videos will provide, to further subvert justice.
How do we maintain social or legal accountability, when even video evidence can’t be trusted? Can the rule of law even survive? Can society?
The mechanics of deepfakes
In order to understand the power of this fakery, you need to understand how it works. Deepfake video uses the same principles as the high-budget CGI we’ve seen in blockbuster productions like Game of Thrones, Star Wars, and the Marvel Cinematic Universe, but applied to everyday situations.
Previously, creating convincing CGI trickery took massive amounts of time and computing power. Studios employed dozens or hundreds of artists, and built or leased vast render farms from cloud-computing providers, at great expense. Today, advances in computing power and software have greatly reduced the computing requirements to produce convincing footage. And, new techniques and software are being developed constantly, to reduce the laborious aspects of the process.
Soon, you won’t need green screens and visual effects artists to extract a 3D model or motion-capture data from a piece of video. Computers are becoming capable of analyzing and recreating a 3D simulacrum of a scene or person quickly, with little to no skill required. It’s not quite there yet, thankfully. As mentioned, you can already feed a sample of someone’s voice into a voice generator, and that software will produce a credible reproduction of that person saying any combination of words you desire. As the software gets more refined, you’ll be able to edit their vocal inflection, and the video animation will provide facial expressions to match.
As these tools are perfected, it will mean that it will no longer take someone with the resources of a film studio or government to produce high-quality deepfakes. And turnaround time will be blindingly fast. Breaking news will soon be augmented with a zillion “alternative” deepfakes.
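The core face-swap trick behind much of this is surprisingly simple to state: train one shared encoder with a separate decoder per identity, then feed person A’s encoded frame through person B’s decoder. The sketch below is a toy illustration of that architecture only, assuming tiny 4-number feature vectors in place of real image frames and plain linear layers trained by gradient descent; every name and number in it is hypothetical, and a real deepfake pipeline uses deep convolutional networks and vastly more data.

```python
import random

random.seed(0)

DIM, LATENT = 4, 2  # toy sizes; real models work on image tensors

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# One shared encoder; one decoder per identity. The shared encoder is
# forced to learn identity-agnostic features (pose, expression).
E = rand_matrix(LATENT, DIM)
D = {"A": rand_matrix(DIM, LATENT), "B": rand_matrix(DIM, LATENT)}

# Hypothetical "faces": small feature vectors standing in for frames.
faces = {
    "A": [[1.0, 0.2, 0.1, 0.9], [0.9, 0.3, 0.2, 1.0]],
    "B": [[0.1, 1.0, 0.8, 0.2], [0.2, 0.9, 1.0, 0.1]],
}

def loss(person):
    total = 0.0
    for x in faces[person]:
        y = matvec(D[person], matvec(E, x))
        total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total

def train_step(person, lr=0.02):
    for x in faces[person]:
        z = matvec(E, x)            # shared latent code
        y = matvec(D[person], z)    # identity-specific reconstruction
        err = [yi - xi for yi, xi in zip(y, x)]
        D_old = [row[:] for row in D[person]]
        # Gradient of squared error w.r.t. the decoder weights...
        for i in range(DIM):
            for j in range(LATENT):
                D[person][i][j] -= lr * 2 * err[i] * z[j]
        # ...and w.r.t. the shared encoder weights.
        for j in range(LATENT):
            g = sum(2 * err[i] * D_old[i][j] for i in range(DIM))
            for k in range(DIM):
                E[j][k] -= lr * g * x[k]

before = loss("A") + loss("B")
for _ in range(300):
    train_step("A")
    train_step("B")
after = loss("A") + loss("B")  # reconstruction error drops with training

# The swap itself: encode a frame of person A, decode with B's decoder,
# producing "A's expression on B's face."
swapped = matvec(D["B"], matvec(E, faces["A"][0]))
```

The point of the shared-encoder design is that the swap requires no artist in the loop: once trained, rerouting latent codes between decoders is a single function call per frame.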
The example I used in the first paragraph, of “nude photos” of clothed people, exists today. Deepfake celebrity porn is already a huge category. No amount of hand-wringing, or even regulation, will stop this evolution. People are going to have to recognize that creating a simulated 3D body-model of someone in a photograph, then covering it with a texture of realistic-looking skin and genitalia, will soon become a technical triviality. And, even if social media bans them, these photos and videos will be everywhere.
Deepfake technology is going mainstream, no matter how much anyone frets about it.
And get ready, because it won’t just be nudes. Everyone from teenage pranksters and forgepreneurs, to governments, to terrorists and extortionists, will eventually be producing feature-film-quality fakes. Some for lulz, some for power, and some for murder. Their topics and targets will run the gamut from run-of-the-mill revenge porn, to destroying opposition candidates, to inciting genocides and terror attacks — by faking genocides and terror attacks by the other side. Real people will have their lives ruined in countless ways, the political order will be further upended, and real people will die.
I first recognized the coming deepfake crisis when I saw the films District 9 and Cloverfield. Both of those films came out more than 10 years ago. It’s one thing to see pristine CGI special effects in a space opera, or superhero film. We’ve all gotten used to that. But it’s another thing entirely when you see shaky, grainy footage that looks like news, but with aliens and monsters. I remember thinking at the time, “I’m glad this is still expensive and technically difficult.”
But right then and there, I knew we were eventually in for a world of hurt. And that hurt is coming soon, to a screen near you.