Professor Hany Farid of UC Berkeley spoke at Avast’s CyberSec&AI Connected virtual conference last week. The event showcased leading academics and tech professionals from around the world to examine critical issues around AI for privacy and cybersecurity.
Farid has spent much of his time researching the use and evolution of deepfake videos. His intriguing session demonstrated the lengths creators will go to in making deepfakes more realistic, and what security researchers will need to do to detect them.
His session started by tracing their evolution: what began as innocent and simple photo editing software has evolved into an entire industry designed to “pollute the online ecosystem of video information.” The past couple of years have seen advances in more sophisticated image alteration and in the use of AI tools to create these deepfakes. Farid illustrated his point by merging video footage of Hollywood stars Jennifer Lawrence and Steve Buscemi. The resulting clip retained Lawrence’s clothes, body, and hair, but replaced her face with that of Buscemi. Granted, this wasn’t designed to fool anyone, but it was nonetheless quite a creepy demonstration of how the technology works.
Farid categorizes deepfakes into four general types:
Non-consensual porn, which is the most frequently found example. A woman’s likeness is pasted into a porn video and distributed online.
Misinformation campaigns, designed to deceive and “throw gas on an already lit fire,” he said.
Legal evidence tampering, such as demonstrating police misconduct that never actually happened. His non-academic practice has frequent consultations in this area, where he is hired to ferret out these manipulations, and
So how do you detect these fakes? One way is to analyze facial mannerisms and expressions very carefully and identify the ones that are unique to each individual. He calls this a “soft biometric,” meaning it isn’t an exact science in the way that DNA or fingerprints can ID someone. Detection becomes more reliable for often-filmed celebrities, where there is a huge amount of existing video footage that can be used to compare these visual “tics.” As an example, try saying the words mother, brother and parent without closing your mouth. “You can’t do it, unless you are a ventriloquist,” he said. When Alec Baldwin does his Trump impressions, he doesn’t get these mannerisms exactly right, which can be a “tell” that the video could be a fake. In an earlier project, Farid mapped videos of various political candidates, and the resulting graph shows a cluster of fake Obama videos.
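To make the idea of a “soft biometric” more concrete, here is a minimal Python sketch of one way such a comparison could work. It is an illustration under stated assumptions, not a description of Farid’s actual tooling: the feature names (`brow_raise`, `head_tilt`, `lip_press`), the numbers, and the choice of pairwise correlations as a “signature” are all hypothetical. The intuition is that a person’s mannerisms tend to move together in characteristic ways, so a clip whose movements correlate very differently from known footage deserves a closer look.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mannerism_signature(tracks):
    """Turn per-frame feature tracks into a vector of pairwise correlations.

    `tracks` maps a (hypothetical) feature name to a list of per-frame values.
    """
    names = sorted(tracks)
    return [pearson(tracks[a], tracks[b]) for a, b in combinations(names, 2)]


def signature_distance(sig_a, sig_b):
    """Euclidean distance between two signatures; larger means less alike."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


# Made-up measurements, purely for illustration.
reference = {
    "brow_raise": [0.1, 0.4, 0.8, 0.5, 0.2],
    "head_tilt": [0.0, 0.3, 0.7, 0.4, 0.1],
    "lip_press": [0.9, 0.6, 0.2, 0.5, 0.8],
}
candidate = {
    "brow_raise": [0.1, 0.4, 0.8, 0.5, 0.2],
    "head_tilt": [0.6, 0.2, 0.1, 0.3, 0.7],  # moves against the brow, unlike the reference
    "lip_press": [0.9, 0.6, 0.2, 0.5, 0.8],
}

ref_sig = mannerism_signature(reference)
same_dist = signature_distance(ref_sig, mannerism_signature(reference))
other_dist = signature_distance(ref_sig, mannerism_signature(candidate))
print(same_dist, other_dist)
```

In this toy example the candidate clip’s head movement runs opposite to the reference pattern, so its signature lands far from the reference. A real system would need far more footage, automatically extracted features, and a properly trained classifier rather than a single distance.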
There are several challenges ahead. First, the technology is quickly evolving and getting better at creating more convincing deepfakes. The transmission velocity across social networks is also increasing: what used to take days or weeks now gets noticed within hours or even minutes. The public is now polarized, which means that people are willing to believe the worst of those who hold opposing viewpoints or whom they don’t particularly like. There is also the rise of what he calls the liar’s dividend, meaning that simply claiming something is fake is usually enough to discredit it, even when it is real. “That means nothing has to be real anymore,” said Farid.
Social media platforms need to be proactive
“There is no single magic answer to solving the misinformation apocalypse,” argues Farid. Instead, the social platforms must be more responsible, which means a combination of better labeling, a greater focus on regulating reach (rather than just deleting offensive or fake content), and the presentation of alternative views.
Professor Farid spoke at CyberSec&AI Connected, an annual conference on AI, machine learning and cybersecurity co-organized by Avast. To learn more about the event and find out how to access presentations from speakers such as Garry Kasparov (Chess Grandmaster and Avast Security Ambassador), visit the event website.