Voiceverse NFT plagiarism scandal
In January 2022, 15—the pseudonymous Massachusetts Institute of Technology artificial intelligence researcher and creator of the non-commercial generative artificial intelligence voice synthesis research project 15.ai—discovered that the blockchain-based technology company Voiceverse had plagiarized from their platform. Voiceverse marketed itself as a service that offered AI voice cloning technology that could be purchased and traded as non-fungible tokens.
Amid heightened controversy over NFTs in the gaming industry, voice actor Troy Baker announced his partnership with Voiceverse on January 14, 2022, triggering immediate backlash over concerns about the environmental impact of NFTs, potential for fraud, predatory monetization in video games, and the potential of AI displacing jobs for human voice actors. Later that same day, 15 revealed through server logs that Voiceverse had generated voice lines using 15's free text-to-speech platform, pitch-shifted the audio to make them unrecognizable, and falsely marketed the samples as their own technology before selling them as NFTs. Within an hour of being confronted with evidence, Voiceverse confessed and stated that their marketing team had used 15.ai without proper attribution while rushing to create a technology demo to coincide with Baker's partnership announcement, further exacerbating the already negative reception to the original announcement. In response, 15 replied "Go fuck yourself"; the interaction went viral and garnered a large amount of support for the developer. News publications universally characterized this incident as Voiceverse having "stolen" from 15.ai.
The next day, Baker appeared on a podcast and stated that his motivation had been to help independent creators who were unable to afford professional voice actors. Following continued backlash and the plagiarism revelation, Baker ended his partnership with Voiceverse on January 31, 2022. Subsequently, the incident was documented in multiple AI ethics databases, criticisms of predatory monetization in video games, and retrospectives as one of the earliest instances of plagiarism and theft stemming from artificial intelligence during the AI boom.
Background
Troy Baker
Troy Baker is a prominent voice actor in the video game industry best known for his performances as Joel Miller in The Last of Us franchise. Baker has been described as "ubiquitous" by Polygon, "one of the most high-profile and prolific voice actors in video games" by Eurogamer, and "arguably the most famous voice actor in the gaming industry" by GameGuru. His other prominent roles include voicing Agent John "Jonesy" Jones in Fortnite, Booker DeWitt in BioShock Infinite, and both Batman and Joker in multiple Batman video games. Baker holds the record for the most acting nominations at the BAFTA Games Awards, with five between 2013 and 2021.
Voiceverse
Voiceverse is a blockchain-based startup founded by the Bored Ape Yacht Club that marketed itself as offering AI voice cloning technology in the form of NFTs. Prior to the announcement of their partnership with Baker, Voiceverse had partnered with LOVO, Inc., an AI voice platform that, according to LOVO, could generate human-like voices. Voiceverse stated that any user who purchases a voice NFT would have unlimited and perpetual access to the voice model, which could be used to create content such as audiobooks, YouTube videos, podcasts, e-learning materials, in-game voice chat, and Zoom calls. Voiceverse promised that buyers would "OWN [sic] all of the IP" of content they created using these voices. Voiceverse's roadmap included plans to release 8,888 initial voice NFTs, a feature to add emotions to existing voices, and the ability for users to mint their own voices as NFTs. Prior to Baker's partnership, Voiceverse had also partnered with voice actors Charlet Chung, who voices D.Va in Overwatch, and Andy Milonakis of The Andy Milonakis Show.
15.ai
15.ai is a free web application launched in 2020 that uses artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by a pseudonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at MIT, it was an early example of an application of generative artificial intelligence during the initial stages of the AI boom. The platform showed that deep neural networks could generate emotionally expressive speech with only 15 seconds of speech; the name "15.ai" references the creator's statement that a voice can be convincingly cloned with just 15 seconds of audio, as opposed to the tens of hours of data previously required. 15.ai became an Internet phenomenon in early 2021 when content utilizing it went viral on social media and quickly gained widespread use among various Internet fandoms. 15 has emphasized that the service should remain free and non-commercial; it only requires users to give proper credit when using the service for content creation.
NFTs in the video game industry
By early 2022, NFTs had become highly controversial within the gaming industry. Critics raised concerns about their environmental impact due to the significant energy consumption of blockchain technology. In addition, the prevalence of scams, fraud, and potential money laundering associated with NFT sales, as well as fears that NFTs were a new form of predatory monetization following the increasing frequency of loot boxes, caused vocal pushback from the gaming community. Several major gaming companies had begun exploring NFT integration into their products, though fan backlash had already forced some projects to be cancelled. On December 16, 2021, the developers of S.T.A.L.K.E.R. 2: Heart of Chernobyl announced that they would be including NFTs in the game, but cancelled the plan within an hour of the announcement due to immediate universal backlash. Simultaneously, the rise of AI voice technology raised concerns among voice actors about potential job displacement and the devaluation of their work amidst the voice acting industry's ongoing struggles for better compensation and working conditions.
Partnership announcement and backlash
On January 14, 2022, 1:02 a.m. EST, Baker announced on Twitter that he was partnering with Voiceverse "to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create." The announcement concluded with the statement "You can hate. Or you can create." Baker's specific role with Voiceverse remained unclear at the time of the announcement.
A Chubbiverse promo video posted by Voiceverse. Server logs later showed that the voices were actually generated using 15.ai and had been distorted to sound unrecognizable.
Along with Baker's announcement, Voiceverse promoted their supposed voice AI technology on Twitter by posting animated videos that featured a cat character created by NFT firm Chubbiverse. The videos concluded with text that read "The Voice Powered By Voiceverse"; Voiceverse stated on Twitter that the voices in the animations had been generated using their own AI voice synthesis technology and presented the videos as a technology demonstration of their voice NFT capabilities.
The announcement provoked immediate and widespread backlash from the gaming community. Baker's tweet received thousands of replies and quote retweets, far more than the number of likes; Michael McWhertor of Polygon described it as a "textbook example of being ratioed" and commented that reactions had been amplified by the final part of Baker's announcement. Michael Beckwith of Metro called Baker's approach "bizarrely aggressive".
Later that day, Baker responded to the backlash by apologizing for his choice of words. He said he appreciated people's thoughts and acknowledged that the "hate/create part might have been a bit antagonistic," calling it a "bad attempt to bring levity". Despite the apology, Baker and his fellow voice actors did not distance themselves from Voiceverse at this point. At the same time, Voiceverse attempted to address the criticisms, stating that they were working to move to more environmentally friendly blockchain technology and that voice actors would receive royalties from NFT sales, with actors benefiting from any increase in NFT value.
Plagiarism revelation
An excerpt of the log files that were posted by 15 showing that Voiceverse had used 15.ai to generate voices of Twilight Sparkle and Rainbow Dash, with text matching those spoken in the Chubbiverse promotional video
On December 13, 2021, amidst the increasingly negative reactions toward NFTs among the general public, the creator of 15.ai announced that they had "no interest in incorporating NFTs into any aspect of [their] work."
On January 14, 2022, 11:17 a.m. EST, 15 commented on the Voiceverse venture, stating that it "sounds like a scam". Two hours later, at 1:20 p.m., 15 explicitly accused Voiceverse of "actively attempting to appropriate [my] work for [their] own benefit." 15 provided evidence through server log files that showed that the voices Voiceverse was claiming credit for had actually been generated by 15.ai. The log files, which showed the details of the server request–responses exactly matching up with those present in Voiceverse's video, proved that Voiceverse had used 15.ai to create the voice samples that they were marketing as their own technology. The Chubbiverse promotion videos featured distorted voices of characters from the animated television series My Little Pony: Friendship Is Magic. The voice lines had then been sold as NFTs, a violation of 15.ai's terms of service, which explicitly prohibited commercial use and required proper attribution.
Voiceverse initially responded sarcastically before deleting that response. At 2:09 p.m. EST, Voiceverse wrote in an apology to 15: "We are extremely sorry about this. The voice was indeed taken from your platform, which our marketing team used without giving proper credit. Chubbiverse team has no knowledge of this. We will make sure this never happens again." In their Discord server, Voiceverse further stated that their marketing team had been in such a rush to create a partnership technology demo that they resorted to using 15.ai without waiting for their own voice technology to be ready.
In response, at 3:34 p.m. EST, 15 tweeted "Go fuck yourself"; the interaction went viral, garnering widespread support for 15. In a subsequent statement, the creator expressed being "extremely depressed" by the incident and wrote: "Not only because my work was stolen and used for profit, but also because of this scandal, the entire field of vocal synthesis is now being misrepresented by charlatans who are only in it for the money." Voiceverse subsequently deleted the incriminating tweet, but Twitter users had already saved and reshared the video.