Saturday, June 15, 2024

Sora: OpenAI’s New AI Video Generator

Sora is the first AI video generator from OpenAI. It can pull off some genuinely impressive cinematographic feats, and the model is more capable than OpenAI initially made it out to be.

How Sora Works

Screenshot of an AI-generated video from OpenAI’s Sora
Image: OpenAI

“Sora has visual patches. Patches have previously been shown to be an effective representation for models of visual data. Patches are a highly scalable and effective representation for training generative models on diverse types of videos and images. At a high level, we turn videos into patches by first compressing videos into a lower-dimensional latent space and then decomposing the representation into spacetime patches.” – OpenAI research paper
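To make the two steps in that quote concrete, here is a minimal NumPy sketch of turning a video into spacetime patches. It is an illustration under stated assumptions, not OpenAI's implementation: the `compress` function is a hypothetical stand-in (plain average pooling) for Sora's learned video compression network, and the function names, patch sizes, and array shapes are invented for the example.

```python
import numpy as np


def compress(video, factor=4):
    """Hypothetical stand-in for a learned video encoder (assumption).

    Shrinks each frame by `factor` using average pooling, only so that
    the example has a "lower-dimensional latent" to work with.
    video: array of shape (frames, height, width, channels).
    """
    t, h, w, c = video.shape
    assert h % factor == 0 and w % factor == 0
    blocks = video.reshape(t, h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(2, 4))


def to_spacetime_patches(latent, pt=2, ph=2, pw=2):
    """Cut a latent video into non-overlapping spacetime patches.

    Each patch spans `pt` frames and a `ph` x `pw` spatial window, and is
    flattened into one row (one "token") of the returned 2D array.
    """
    t, h, w, c = latent.shape
    assert t % pt == 0 and h % ph == 0 and w % pw == 0
    x = latent.reshape(t // pt, pt, h // ph, ph, w // pw, pw, c)
    # Bring the three patch-grid axes to the front, patch contents to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, pt * ph * pw * c)


# Toy input: 8 frames of 32x32 RGB noise (illustrative values only).
rng = np.random.default_rng(0)
video = rng.random((8, 32, 32, 3))

latent = compress(video)                # shape (8, 8, 8, 3)
patches = to_spacetime_patches(latent)  # shape (64, 24)
```

Because the patch grid simply grows or shrinks with the input, the same routine handles clips of different lengths and frame sizes, as long as the dimensions divide evenly by the patch size.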

Several OpenAI researchers co-authored the paper, “Video generation models as world simulators,” which reveals important details about Sora’s architecture. For example, it shows that Sora can generate videos at any aspect ratio and at resolutions up to 1080p. According to the paper, Sora can also edit images and videos in a variety of ways, including creating looping videos, extending videos forward or backward in time, and changing the background of an existing video.

Particularly fascinating is Sora’s capacity to “simulate digital worlds,” as the OpenAI co-authors put it. In one experiment, OpenAI gave Sora prompts containing the word “Minecraft,” and the model simultaneously controlled the player character and rendered a HUD and game world that looked remarkably similar to Minecraft, complete with physics.

Sora is less of a creative tool and more of a “data-driven physics engine,” as senior Nvidia researcher Jim Fan observed. It does more than generate a single image or video; it also works out the physics of every object in a scene and renders an image, video, or interactive 3D world based on the results.

Sora’s usual limitations apply in the video game domain as well. The model cannot faithfully approximate the physics of basic interactions, such as glass shattering. Furthermore, even with interactions it can model, Sora is often inconsistent. For instance, it may depict a person eating a burger without leaving bite marks on it.

Sora could pave the way for more realistic, perhaps even photorealistic, procedurally generated games created from text descriptions alone. That is equally exciting and terrifying, which is why OpenAI has decided to gate Sora behind a very restricted access program for now. The public does not have access, and it is not clear when OpenAI will launch it publicly.


