In the ever-growing world of social media, new terms and phrases can enter the public dialogue in an instant, and it can be hard to keep up. In this article we aim to break down the history, controversy and applications of one of the most talked-about areas in technology: deepfakes.
Deepfake is a term coined in 2017 by a Reddit user of the same name, who was demonstrating the use of face swap technology in relation to pornography. The ideas they built on, however, had been around since 1997, when Bregler et al. presented the Video Rewrite program.
The Centre for Data Ethics and Innovation (CDEI) had this to say on the matter: “The general public should exercise caution when viewing audio and visual content where the trustworthiness of the source is in doubt. Social media platforms could offer guidance on how to sense-check videos and images, for example by checking the reliability of the source and looking for different versions of the content elsewhere on the internet.”
“We therefore advise that researchers, journalists and policymakers continue to scan the horizon for technological breakthroughs and other developments that might change the overall threat level of audio and visual disinformation.” But what does this mean for consumers, and where did it all begin?
Palo Alto, 1997 – Christopher Bregler, Michele Covell and Malcolm Slaney were working for the technology think tank ‘Interval Research’, developing a program which would use computer vision to track points around a person’s mouth in order to rewrite what they had said. The abstract of the paper describes it as follows: “Video Rewrite uses existing footage to create automatically new video of a person mouthing words that she did not speak in the original footage. Video Rewrite uses computer-vision techniques to track points on the speaker’s mouth in the training footage, and morphing techniques to combine these mouth gestures into the final video sequence”. This was one of the earliest formalisations of the technology in an academic setting and helped define the trajectory of future advances.
“The new video combines the dynamics of the original actor’s articulations with the mannerisms and setting dictated by the background footage. Video Rewrite is the first facial-animation system to automate all the labeling and assembly tasks required to resync existing footage to a new soundtrack.” The research goals for the experiment said the system “…can be used for dubbing movies, teleconferencing, and special effects”, but we are still only starting to see the technology used in this way.
Below are the results from the experiment.
Following this experiment, the academic world occasionally worked on similar projects, and hobbyists around the world started to tinker. Major releases from academia were few and far between. There were small releases from research teams recreating the work and theorising about the technology, but it wasn’t until 2016 that we saw the next breakthrough: the release of the Face2Face research paper and its accompanying program.
This release is what brought these edited videos into the mainstream, as they focused on well-known figures like Donald Trump, George W. Bush and Arnold Schwarzenegger. It spurred discussion about whether the technology could be used for harm on the global stage, be it by adversarial governments attempting to spread disinformation or by anyone creating false and defamatory material to use against someone.
Soon after, in 2017, one of the most memorable Deepfake videos was released alongside a research paper from Suwajanakorn et al. This paper was behind the infamous Obama video and really helped bring the discussion into the public discourse.
While some of the more amateur deepfake videos can be easily identified as fake, more organised groups are releasing videos in which it is nearly impossible for the human eye to tell the fake from the real. For these, we turn to technology to help us detect the fakes.
When looking into the technology used to detect these fakes, three main categories exist. The first relies on existing techniques and methods for flagging faked items. A simple example is the use of watermarks on the faked media to make it clear that what you are watching isn’t real. This, however, puts the onus on the producer of the media to be honest and add the watermarks, which is unlikely to happen in the malicious uses described above.
The second category is social detection, where we learn to spot fakes through education and reasoning, usually using our own eyes. The blink test is one such detection method used in the world of deepfakes.
When the algorithms are trained, most of the videos and pictures they learn from show people with their eyes open, with little data on faces mid-blink. This has led to a method of detecting fakes by noticing how rarely the person blinks and how long each blink lasts. As a result, many social media users have become more practised at noticing when something isn’t quite right, and this exposure is contributing to a wider social education that makes us safer from malicious uses of the technology.
The final category follows on from what we discussed above. Going along the same line as checking the eyes for blinking, researchers thought, ‘Why not get a computer to do this for us?’ TackHyun Jung et al. of Konkuk University devised a method using DeepVision (a machine learning algorithm) to look at the eyes of subjects in videos and measured how well it could detect the fake ones. On the sample set used, the algorithm detected the Deepfaked videos with 87.5% accuracy. While this result is positive, it is only the beginning of using advances in machine learning to fight back against malicious deepfakes.
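To make the blink idea concrete, here is a minimal sketch in Python of how a computer might count blinks and flag a clip with unusually few of them. It is not the DeepVision method, only an illustration of the general idea. It assumes that some face-tracking step (not shown) has already produced one "eye openness" score per frame, and the names `blink_stats` and `looks_suspicious`, along with both threshold values, are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlinkStats:
    blink_count: int
    blinks_per_minute: float
    mean_blink_seconds: float


def blink_stats(eye_openness: List[float], fps: float,
                closed_below: float = 0.2) -> BlinkStats:
    """Count blinks in a per-frame eye-openness signal.

    eye_openness: one score per frame, higher means more open
                  (assumed to come from a separate face tracker).
    closed_below: hypothetical threshold under which the eye counts as closed.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    durations = []  # length of each blink, in seconds
    run = 0         # consecutive "closed" frames seen so far
    for value in eye_openness:
        if value < closed_below:
            run += 1
        elif run:
            durations.append(run / fps)
            run = 0
    if run:  # clip ended while the eye was still closed
        durations.append(run / fps)

    minutes = len(eye_openness) / fps / 60
    rate = len(durations) / minutes if minutes else 0.0
    mean = sum(durations) / len(durations) if durations else 0.0
    return BlinkStats(len(durations), rate, mean)


def looks_suspicious(stats: BlinkStats,
                     min_blinks_per_minute: float = 5.0) -> bool:
    """Flag clips whose blink rate falls below an assumed minimum."""
    return stats.blinks_per_minute < min_blinks_per_minute


# Example: a one-minute clip at 10 frames per second with only two blinks.
signal = [1.0] * 600
signal[100:103] = [0.05] * 3
signal[400:402] = [0.05] * 2
stats = blink_stats(signal, fps=10)
print(stats, looks_suspicious(stats))
```

A fixed cut-off like this is deliberately crude. Blink rates vary from person to person and with what they are doing, so a real detector would need to be far more careful than a single threshold.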
When looking through the history of deepfakes it’s easy to view it all as a negative, but many benefits do also exist. One of the main areas where this technology can be used for good is accessibility, for example for those living with ALS (Amyotrophic Lateral Sclerosis). In this setting the technology is more commonly known as AI-generated synthetic media. A patient’s voice can be recorded and saved, allowing them to continue speaking and sounding the same through a computer-generated voice designed around them. This type of technology is just a small step in bringing independence to their lives. Looking to the future, it could be used to create digital voice boxes that restore someone’s ability to speak with their own voice.
VOCALiD is an example of one of these technologies. Their software builds a voicebank from recordings of a human voice, which is then used to produce synthesized speech through a device. They have worked with many organisations specialising in various areas, such as the US Department of Veterans Affairs, ALS Foundation for Life and UCP (United Cerebral Palsy).
Hopefully this has piqued your interest in the world of Deepfakes and their history, and shown both the possible benefits and the ever-present danger of this emerging field of computer science. The fightback against harmful uses of technology has always raged on. For nearly every advancement, the ultimate question of how positive it can really be comes down to whether we can find uses that improve everybody’s lives and push the research towards more societal benefits.
Whilst you must remain vigilant with video clips you come across online, take a little solace in the fact that public awareness and the development of ways to stop disinformation are also growing. While these discoveries might not always make the mainstream news, they are an equal part of the discussion.
Nathaniel is a Web Design Executive who also writes content on technology and loves spending his days researching and building new projects, and generally complaining about new trends.