
OpenAI has released a revised version of its Sora text-to-video model. Sora 2 is said to offer greater physical accuracy and more photorealistic output than its predecessor, Sora, which was launched in February 2024. The most significant innovation, however, is the system’s ability to generate videos with synchronized dialogue, sound effects, and ambient sound.
While previous models sometimes distorted reality to fulfill a prompt, Sora 2 promises more consistent results. OpenAI cites a basketball shot as an example: instead of a missed ball teleporting into the basket, it should now rebound off the backboard in a physically plausible way. Controllability and the technical possibilities have also been expanded. However, the resolution still appears to be capped at 1920 × 1080 pixels: all examples on the website are in 1080p at most, even where an example prompt specifies 4K. Other video generators, such as Veo 3, support 4K resolution.
OpenAI does not specify an official upper limit on video length for Sora 2; the published prompts call for clips of five to ten seconds. OpenAI uses the example of a dragon video to show how detailed the instructions can be. The associated prompt defines not only the scene but also the camera type (“large-format digital sensor emulation”), lens (50 mm spherical), filtration (“Black Pro-Mist 1/8”), color palette (“steel blue glacier,” “warm amber edge on the dragon”), and the exact soundscape (“thunder of the wing membrane at each downswing,” “distant calving boom of the glacier”).
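To make the structure of such a prompt concrete, here is a minimal Python sketch that assembles the elements named above into a single labeled text prompt. It is an illustration only: the `ShotPrompt` class, its field names, and the scene wording are hypothetical and do not come from OpenAI. The sketch calls no Sora interface, since the developer API is only planned.

```python
from dataclasses import dataclass, field


@dataclass
class ShotPrompt:
    """Hypothetical container for the prompt elements the article lists."""

    scene: str
    camera: str
    lens: str
    filtration: str
    palette: list[str] = field(default_factory=list)
    sound: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the non-empty fields into one labeled, plain-text prompt."""
        parts = [
            ("Scene", self.scene),
            ("Camera", self.camera),
            ("Lens", self.lens),
            ("Filtration", self.filtration),
            ("Palette", ", ".join(self.palette)),
            ("Sound", ", ".join(self.sound)),
        ]
        # Skip any element that was left empty.
        return "\n".join(f"{label}: {value}" for label, value in parts if value)


prompt = ShotPrompt(
    scene="a dragon gliding over a glacier",  # illustrative wording, not OpenAI's
    camera="large-format digital sensor emulation",
    lens="50 mm spherical",
    filtration="Black Pro-Mist 1/8",
    palette=["steel blue glacier"],
    sound=[
        "thunder of the wing membrane at each downswing",
        "distant calving boom of the glacier",
    ],
)
print(prompt.render())
```

Keeping picture and sound cues in separate fields mirrors how the example prompt treats the soundscape as its own element, which matters now that the model generates audio alongside the image.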
Social app for iOS
Alongside the model, OpenAI is launching a new iOS app called “Sora.” It lets users create videos, remix other people’s content, and share their work via a feed. A central feature of the app is “Cameos,” which allows users to insert their own likeness into scenes generated by Sora after a one-time video and audio recording.
OpenAI says it is addressing the risks associated with the technology, such as disinformation and the potential for addiction. The Sora app feed is meant to be optimized for creativity rather than time spent in the app, and users are supposed to be able to steer the recommendation algorithm with natural-language instructions.
For teenagers, daily usage limits and restricted permissions for the “Cameos” feature apply by default. Parents can also deactivate feed personalization via the ChatGPT parental controls. According to OpenAI, users retain control over the likeness they create with “Cameos” and can revoke access to it at any time. Filters and moderation are intended to prevent the generation of harmful, violent, or adult content.
The Sora app will initially launch in the USA and Canada via an invitation system. After receiving an invitation, users can also access it via the sora.com website. Sora 2 will initially be free to use. ChatGPT Pro subscribers will receive access to a higher-quality model called “Sora 2 Pro.” An API for developers is also planned. As a possible future business model, OpenAI is holding out the prospect of letting users pay for additional computing capacity. The predecessor model, Sora 1 Turbo, will remain available.
(vza)
This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.