Google challenges ChatGPT creator for the AI video creation crown

The AI battle between Google and OpenAI (the team behind ChatGPT) is heating up, with both companies rolling out new products, features, and updates this month. Now, Google DeepMind, Google’s AI research lab, is setting its sights on beating OpenAI at the video-generation game – and it just might pull it off, at least for a while.

Google takes on OpenAI’s Sora with Veo 2

DeepMind has unveiled Veo 2, its next-generation video-generating AI and the successor to Veo, which powers various products in Google’s lineup. Veo 2 can produce clips longer than two minutes, with resolutions reaching up to 4K (4096 x 2160 pixels). That’s roughly four times the pixel count and over six times the duration of OpenAI’s Sora, which only recently became available to users.

 
However, this advantage is still theoretical. In Google’s experimental video tool, VideoFX, where Veo 2 is currently exclusive, videos are limited to 720p and only eight seconds long. (Sora, on the other hand, can generate 20-second videos at 1080p.)
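
The on-paper comparison is easy to check with quick arithmetic. The sketch below uses only the figures quoted in this article and assumes that 1080p means 1920 x 1080 pixels; the variable names are purely illustrative.

```python
# Figures quoted in the article; 1080p is assumed to mean 1920 x 1080.
veo2_pixels = 4096 * 2160   # Veo 2's stated 4K ceiling
sora_pixels = 1920 * 1080   # Sora's 1080p output
veo2_seconds = 2 * 60       # "longer than two minutes", so a lower bound
sora_seconds = 20           # Sora's 20-second clips

resolution_ratio = veo2_pixels / sora_pixels
duration_ratio = veo2_seconds / sora_seconds

print(f"{resolution_ratio:.2f}x the pixels, at least {duration_ratio:.0f}x the duration")
```

By the same math, the 720p, eight-second clips VideoFX produces today fall short of Sora on both counts.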

VideoFX is currently on a waitlist, but Google is increasing the number of users who can access it this week. The company plans to roll it out to more of its products, including YouTube Shorts, sometime next year. Much like the original Veo, Veo 2 can create videos from a simple text prompt or a combination of text and a reference image.

A short video generated with Veo 2. | Video credit – Google

So, what’s different with Veo 2? Well, DeepMind says this new model comes with a better “understanding” of physics and camera controls, which leads to “clearer” footage. By clearer, they mean sharper textures and images, especially in action-packed scenes.
 
When it comes to camera controls, Veo 2 can now position the virtual camera more precisely and move it around to capture people and objects from various angles. It can also simulate different lenses and cinematic effects, giving videos a more polished, movie-like feel. Plus, it’s said to capture more subtle human expressions. DeepMind shared a few carefully selected samples, and I think they look pretty impressive for AI-generated footage.

Video credit – Google 

That said, there’s still some work left to do. Take a look at the oddly slick road in the footage above or the pedestrians in the background merging together. So, for anyone worried that AI might take over: the technology has made huge strides, but it’s still a long way from replacing human knowledge and skills.

Veo 2 was trained on a ton of videos, which is pretty standard for AI models. By being fed countless examples of data, these models start recognizing patterns that enable them to generate new content. While DeepMind doesn’t reveal the exact sources of the videos used to train Veo 2, YouTube is a likely candidate, given that Google owns it.

Like other Google image and video models, Veo 2 embeds an invisible SynthID watermark in its outputs to mark them as AI-generated, which is meant to help prevent misinformation and misattribution. But let’s be real – most people probably aren’t checking for that watermark before sharing a video, which still leaves room for misinformation to spread.

Along with Veo 2, Google DeepMind also revealed upgrades to Imagen 3, its image-generation model. A new version of Imagen 3 is available to users of ImageFX, Google’s image creation tool, starting this Monday. The updated model promises to deliver “brighter, better-composed” images and photos in various styles, including photorealism, impressionism, and anime.

