Google has announced the launch of Google Veo 2, the second generation of its video generation model, which it first introduced earlier this year. The new model is claimed to deliver “incredibly high-quality videos in a wide range of subjects and styles.” According to Google, Veo 2 achieved state-of-the-art results against leading models in head-to-head comparisons judged by human raters.
Google Veo 2 brings an improved understanding of real-world physics and the nuances of human movement and expression, which helps improve its detail and realism overall. “Veo 2 understands the language of cinematography: ask it for a genre, specify a lens, suggest cinematic effects, and Veo 2 will deliver,” said Google.
Users can generate videos at up to 4K resolution and extend them to minutes in length. They can go into further detail when describing the shot, such as a low-angle tracking shot that glides through the middle of a scene, or a close-up shot on the face of a scientist looking through her microscope, and Veo 2 creates it.
A prompt can even specify an “18mm lens,” and Veo 2 knows to craft the wide-angle shot that this lens is known for. Putting “shallow depth of field” in the prompt blurs out the background and focuses on the subject. Google adds that video models often “hallucinate” unwanted details, such as extra fingers or unexpected objects, and that Veo 2 produces these less frequently.
Like Google’s other models, Veo 2 outputs include an invisible SynthID watermark that helps identify them as AI-generated and so reduce the chances of misinformation and misattribution.
Google is bringing Veo 2 capabilities to its Google Labs video generation tool, VideoFX, and expanding the number of users who can access it. Those interested can visit the Google Labs website to sign up for the waitlist. The company also plans to expand Veo 2 to YouTube Shorts and other products next year.
Google Imagen 3 Improvements
Aside from Veo 2, Google has announced improvements to the Imagen 3 model, which was rolled out in August. The improvements help Imagen 3 generate brighter, better-composed images. It can now render more diverse art styles with greater accuracy, from photorealism to impressionism and from abstract to anime. According to Google, the upgrade also follows prompts more faithfully and renders richer details and textures. The latest Imagen 3 model is rolling out globally in ImageFX, Google’s image generation tool from Google Labs, to more than 100 countries.
Google Whisk Experiment
Finally, Google announced a new experiment called Google Whisk. As the newest experiment from Google Labs, Whisk lets you input or create images that convey the subject, scene and style you have in mind. Then, you can bring them together and remix them to create something uniquely your own, from a digital plushie to an enamel pin or sticker, said Google.
Under the hood, Whisk combines Google’s Imagen 3 model with Gemini’s visual understanding and description capabilities. The Gemini model automatically writes detailed captions of your images and then feeds those descriptions into Imagen 3. This process allows you to easily remix your subjects, scenes and styles in new ways.
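
For readers who want to picture the caption-then-generate flow described above, here is a minimal Python sketch. It is not Google's implementation and calls no real Gemini or Imagen API: the names RemixInputs, build_prompt, remix, stub_caption and stub_generate, the prompt template and the canned captions are all assumptions made up for illustration. The captioning and image-generation steps are passed in as plain functions so that real model calls could be swapped in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RemixInputs:
    """Three reference images, one each for subject, scene and style."""
    subject_image: str
    scene_image: str
    style_image: str


def build_prompt(inputs: RemixInputs, caption: Callable[[str], str]) -> str:
    """Caption each image, then merge the captions into one text prompt.

    The wording of the merged prompt is a hypothetical template.
    """
    subject = caption(inputs.subject_image)
    scene = caption(inputs.scene_image)
    style = caption(inputs.style_image)
    return f"{subject}, placed in {scene}, rendered as {style}"


def remix(
    inputs: RemixInputs,
    caption: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Run the two stages in order: image captions first, then generation."""
    return generate(build_prompt(inputs, caption))


# Stand-ins for the two models (assumptions, not real API calls).
CANNED_CAPTIONS = {
    "cat.png": "a ginger cat",
    "beach.png": "a sunny beach",
    "plushie.png": "a soft felt plushie",
}


def stub_caption(image_path: str) -> str:
    """Pretend captioning model: looks up a canned description."""
    return CANNED_CAPTIONS[image_path]


def stub_generate(prompt: str) -> str:
    """Pretend image model: returns a placeholder instead of an image."""
    return f"<image generated from: {prompt}>"


if __name__ == "__main__":
    inputs = RemixInputs("cat.png", "beach.png", "plushie.png")
    print(remix(inputs, stub_caption, stub_generate))
```

Because the image model only ever sees text in this sketch, swapping one reference image changes just one caption and leaves the rest of the prompt alone, which is one way to read the remixing behavior Google describes.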