There is an ongoing beef between Google and OpenAI, and we’re not just talking about Gemini versus ChatGPT. We’re talking about Sora. Sora is OpenAI’s video generation platform, and we suspect that the company used video data from YouTube to train it. Google’s not happy about that, and this simple fact highlights Google’s own hypocrisy.
To catch you up, OpenAI unveiled Sora back in February this year, and it both wowed and frightened the community. Obviously, this is the sort of technology that could put many filmmakers out of business. In order to make videos that realistic, the model behind Sora needs to have been trained on a ton of video data. So, what is one of the biggest video reservoirs in the world? YouTube.
So, it’s tempting to believe that OpenAI trained Sora using YouTube videos. The issue is that, if this is true, it would technically be a violation of YouTube’s terms of service. Google and Alphabet CEO Sundar Pichai mentioned this during a recent interview with The Verge.
If OpenAI trained Sora on millions of YouTube videos, then that could get the company into some deep legal trouble with Google.
Google’s beef with Sora highlights the company’s hypocrisy
If Google doesn’t want OpenAI to scrape YouTube videos, then, according to the terms of service, the company is well within its rights to take action. However, we can’t overlook the fact that Google is being a bit hypocritical about this. The company does not like OpenAI scraping videos from its platform. Well, what is Google doing to basically the entire internet? It’s doing the same thing!
Google’s crawlers crawl through websites and extract information from them. This information is fed into Gemini, and Gemini sometimes regurgitates that data when asked. What makes the issue worse is Google’s AI Overviews feature, which basically lets users bypass visiting websites in order to save a few minutes of research.
It’s obvious that this feature could screw over millions of websites, especially news websites that rely on ad revenue. Google’s crawlers scrape data from these websites and feed it to Gemini, and the chatbot spits it out in a quick overview. At that point, what’s the point of reading the articles? There is none! Sure, some people will venture further and do more research. However, how much of the population will do that compared to those who will read the overview and go on with their day?
What makes this even worse is the fact that Google is currently working on adding advertisements to AI Overviews. So, rather than split ad revenue with the news publications that rely on it, Google will display ads in its AI Overviews and keep the money for itself. It’s like cutting off the water supply to an already parched town and redirecting it to yourself.
Sound familiar?
It’s a little hypocritical that Google is so up in arms about OpenAI taking its data for its own use when pretty much every person reading this article has had some bit of data scraped by Google. At this point, we are still in the dark as to whether OpenAI trained Sora on YouTube videos. When OpenAI’s Mira Murati was asked if Sora used YouTube videos, she responded with “I’m actually not sure about that.” So, it’s obvious that the company is being coy about it. It’s hard to believe that OpenAI would source videos from any other site.
YouTube is probably the biggest website with freely available videos. So, if OpenAI did source data from YouTube videos, then we expect there to be a messy legal battle. However, that will only distract from the bigger issue, which is Google’s blatant indifference and hypocrisy.