With chatbots and text-to-image generators taking the internet by storm, the next frontier in AI may be text-to-video generators.
Nvidia recently published a research paper titled “High-Resolution Video Synthesis with Latent Diffusion Models” describing experiments at its Toronto AI Lab. The paper details how the company builds on Stable Diffusion to create a tool that can generate moving art from a text prompt.
The tech company showed demos of Latent Diffusion Models (LDMs), which use text to generate video clips without requiring large amounts of computer processing, TechRadar noted.
The tool can generate GIF-style moving images that are approximately 4.7 seconds long at a resolution of 1,280 x 2,048. According to the research paper, it can also produce longer videos at a lower resolution of 512 x 1,024.
After seeing a demo of the technology, TechRadar said the tool is ideal as a text-to-GIF generator at this point. The publication noted that it can easily handle simple prompts such as “a storm trooper vacuuming the beach” or “a teddy bear playing electric guitar, high definition, 4K.” Even so, the results still contained random artifacts and blur, as is common with other widely used AI tools such as Midjourney.
The publication believes that longer videos still need a bit more development before they can hit prime time, but Nvidia seems to be working quickly to get the technology ready. The generated clips could work well for stock libraries and similar purposes.
Other companies are also experimenting with AI text-to-video generators. Google demonstrated its Phenaki generator, which creates 20-second clips from longer prompts. Runway, a startup, last month announced its second-generation video model, also based on Stable Diffusion. Its demo of the prompt “the late afternoon sun peeking through a New York City loft window” shows how you can add a slight motion effect to still images.
According to TechRadar, users can also benefit from the inclusion of AI in other programs such as Adobe Firefly and Adobe Premiere Rush.
Some other companies, such as Narakeet and Lumen5, market themselves as text-to-video generators. However, many of these tools work more like PowerPoint presentations, putting together text, audio, images and perhaps some pre-made video clips, as opposed to generating a unique work.