OpenAI announced on Monday, Dec. 9, that it has released its text-to-video artificial intelligence model to ChatGPT Plus and Pro users, further expanding its efforts in multimodal AI.
The Microsoft-backed company, which sparked a generative AI wave with the launch of its ChatGPT chatbot in November 2022, now aims to compete with similar text-to-video tools from Meta, Alphabet’s Google, and Stability AI, whose tool is called Stable Video Diffusion.
The model, named Sora, was first introduced in February, but access was limited to safety testers during its research preview phase. It is now available to ChatGPT Plus and Pro users as Sora Turbo at no additional cost.
“We’re working on tailored pricing for different types of users, which we plan to make available early next year,” the company stated on its website.
Users will be able to generate videos up to 1080p resolution, with a duration of up to 20 seconds, and in widescreen, vertical, or square aspect ratios.
Describing the model, OpenAI said: “We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction. Sora, our text-to-video model, can generate videos up to a minute long while maintaining visual quality and adhering to the user’s prompt.”
OpenAI said that Sora is not yet available in EU countries, Switzerland, or the UK, but that it will be accessible in other regions where ChatGPT is available.
The company also noted that it will block the creation and upload of harmful content, such as child sexual abuse material and sexual deepfakes, to prevent misuse. “Uploads of people will be limited at launch, but we intend to roll the feature out to more users as we refine our deepfake mitigations,” OpenAI added.
Sora offers a number of features that provide users with greater control over the video generation process. Here’s an overview of each:
Remix: The remix feature allows users to reimagine existing videos by modifying elements like colors, backgrounds, or visual components, without altering the core essence of the original video. This is ideal for creators looking to update old content, tailor videos for specific themes, or experiment with different variations for branding purposes.
Re-cut: The re-cut feature lets creators identify and isolate the most impactful frames in a video. These key frames can be extended in either direction to construct a more complete scene. This tool is perfect for highlighting specific moments, emphasizing particular visuals, or ensuring smoother transitions between scenes. By focusing on the strongest frames, re-cut helps refine the storytelling process, giving creators more control over pacing and emphasis.
Loop: The loop feature creates seamless repetitions of video clips. It smooths the transition between the end of a clip and its start, so creators can extend a captivating moment or keep a consistent rhythm in videos meant to play continuously. This makes it well suited to background visuals, music videos, and hypnotic animations.
Storyboard: The storyboard feature lets creators generate specific shots at designated frame points along the timeline. This provides precise control over the visual narrative. For example, in OpenAI’s demo, the storyboard sequence is as follows:
Frames 0-114: “A vast red landscape with a docked spaceship in the distance.”
Frames 114-324: “Looking out from inside the spaceship, a space cowboy stands center frame.”
Frames 324-440: “Detailed close-up view of an astronaut’s eyes framed by a knitted fabric mask.”
Blend: The blend feature allows users to merge different video or style elements, creating new and unique compositions. By mixing footage, colors, or artistic approaches, this tool enables creators to experiment and craft visuals that feel distinct and fresh. It’s ideal for experimental projects, mashups, or creative storytelling that explores unconventional ideas.
Style Presets: The style preset feature provides a collection of predefined aesthetic templates that can be applied to videos. These presets make it easier for creators to achieve specific looks, whether cinematic, vibrant and playful, or professional.
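The storyboard demo described above can be read as an ordered list of frame ranges, each paired with a prompt. The following Python snippet is a minimal illustrative sketch of that idea, not OpenAI's API or file format. The `Shot` class, the helper functions, and the 30 frames-per-second rate are assumptions introduced here for illustration; the frame rate Sora actually uses is not stated in the announcement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    """One storyboard entry (hypothetical type): a frame range plus its prompt."""
    start_frame: int
    end_frame: int
    prompt: str


# Frame ranges and prompts as listed in the demo sequence above.
STORYBOARD = [
    Shot(0, 114, "A vast red landscape with a docked spaceship in the distance."),
    Shot(114, 324, "Looking out from inside the spaceship, a space cowboy stands center frame."),
    Shot(324, 440, "Detailed close-up view of an astronaut’s eyes framed by a knitted fabric mask."),
]

# Assumption: the announcement does not give a frame rate; 30 is only a placeholder.
ASSUMED_FPS = 30


def is_contiguous(shots):
    """Return True if each shot begins on the frame where the previous one ends."""
    return all(prev.end_frame == nxt.start_frame for prev, nxt in zip(shots, shots[1:]))


def total_frames(shots):
    """Number of frames spanned from the first shot's start to the last shot's end."""
    return shots[-1].end_frame - shots[0].start_frame


def duration_seconds(shots, fps):
    """Convert the storyboard's frame span to seconds at the given frame rate."""
    return total_frames(shots) / fps


if __name__ == "__main__":
    print("contiguous:", is_contiguous(STORYBOARD))
    print("total frames:", total_frames(STORYBOARD))
    print("duration (s) at assumed fps:", round(duration_seconds(STORYBOARD, ASSUMED_FPS), 1))
```

At the assumed 30 frames per second, the 440 frames in the demo come to roughly 14.7 seconds, which fits within the 20-second limit mentioned earlier. A different frame rate would change that figure.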