ByteDance, the Chinese company behind TikTok and viral video editor CapCut, has released its first AI text-to-video model designed to compete with the yet-to-be-released Sora from OpenAI — but for now, it’s only available in China.
Jimeng AI was built by Faceu Technology, a ByteDance-owned company that produces the CapCut video editing app. It is available for iPhone and Android as well as online.
To get access you have to log in with an account for Douyin, the Chinese version of TikTok, which suggests that if Jimeng does come to other regions it will be linked to TikTok or CapCut. It is possible, but purely speculative, that a version of Jimeng will be built into CapCut in the future.
ByteDance isn’t the only Chinese company building out AI video models. Kuaishou is one of China’s largest video apps, and last month it made its Kling AI video model available outside of China for the first time. Kling is one of my favorite AI tools, with impressive motion quality and video realism.
What is Jimeng AI?
(Image credit: Jimeng AI/ByteDance)
Jimeng AI is a text-to-video model trained and operated by Faceu Technology, the Chinese company behind the CapCut video editor. Like Kling, Sora, Runway and Luma Labs Dream Machine, it takes a text input and generates a few seconds of realistic video content.
Branding itself the “one-stop AI creation platform,” Jimeng lets you generate video from text or images and gives you control over camera movement and first and last frame input. The latter is something most modern AI video generators offer: you give the model two images and it fills in the moments between them.
The focus for Faceu has been on ensuring its model can understand and accurately follow Chinese text prompts and convert abstract ideas into visual works.
How does Jimeng AI compare?
From the video clips I’ve seen on social media and the Jimeng website, it appears to be closer to Runway Gen-2 or Pika Labs than Sora, Gen-3 or even Kling. Video motion appears slightly blurred or shaky and the output looks more like a comic than real footage.
What I haven’t been able to confirm, as it isn’t available outside of China, is how long each video clip is at initial generation or whether you can extend a clip.
Most tools, including Kling, start at 5 seconds, whereas Runway is 10 seconds and Sora is reportedly 15 seconds. Many of them also allow for multiple extensions to that initial clip.
I think Jimeng being mobile-first and tied to apps like Douyin and CapCut puts it in a different category to the likes of Kling and Dream Machine. It is better compared to the likes of the Captions App or Diffuse, in that its content is aimed more at social video than at production.