Wan2.1 Video
Wan2.1 is a cutting-edge open-source video generation model that outperforms both open-source and commercial solutions in benchmarks. The model excels at Text-to-Video and Image-to-Video generation. Notably, it's the first video model with robust Chinese/English visual text generation. Wan2.1 combines high performance, accessibility, and multi-task versatility, advancing open-source video AI.
Writing prompts for Wan Video differs from writing prompts for Stable Diffusion models. Instead of using a list of tags, you should write in natural language. Focus on describing the actions and motions in the scene to achieve the best results for image-to-video generation.
The basic formula for crafting an effective prompt is Subject + Motion + Scene. Here's how it works:
Subject: Clearly define the main focus of your video, such as a person, object, or animal.
Motion: Describe the action or movement you want to emphasize, as this is crucial for image-to-video generation.
Scene: Set the context by detailing the environment, background, or setting where the action takes place.
For example:
"A knight (Subject) riding a horse, cape flowing behind him (Motion), through a forest (Scene)."
"A butterfly (Subject) fluttering its wings (Motion) in a sunlit garden (Scene)."
This structure ensures clarity and helps the AI generate more accurate and visually compelling results.
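The Subject + Motion + Scene formula above can be sketched as a small helper. This is just an illustration of how the three parts compose into one natural-language sentence; the function name and structure are our own, not part of any Wan2.1 API.

```python
def build_prompt(subject: str, motion: str, scene: str) -> str:
    """Join the three parts into a single natural-language prompt.

    Order can vary (motion may come before or after scene), but each
    part should read as a fragment of one flowing sentence.
    """
    return f"{subject} {motion} {scene}."

# Example from the text above:
prompt = build_prompt(
    "A butterfly",
    "fluttering its wings",
    "in a sunlit garden",
)
print(prompt)  # A butterfly fluttering its wings in a sunlit garden.
```

Keeping the parts separate like this makes it easy to swap one element at a time (for example, trying several motions for the same subject and scene) while keeping the prompt in natural language rather than a tag list.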
A start frame is mandatory, while an end frame is optional. Including both a start frame and an end frame can significantly enhance the consistency and coherence of the video.
The size of the input image plays a crucial role. If the input image is too small, the quality of the generated video will suffer substantially.
High-quality start frames are essential for producing high-quality videos. The better the input, the more refined and visually appealing the output will be. Always ensure your start frame is clear, detailed, and appropriately sized for optimal results.
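Since small input images degrade output quality, it can help to validate the start frame's resolution before generating. The sketch below uses a 512-pixel minimum on the shorter side as an assumed threshold for illustration; it is not an official Wan2.1 requirement, so adjust it to whatever your tool documents.

```python
MIN_SIDE = 512  # assumed minimum for the shorter side; not an official Wan2.1 limit

def is_frame_large_enough(width: int, height: int) -> bool:
    """Check that the start frame's shorter side meets the assumed minimum.

    Frames below this size are likely to produce a noticeably
    lower-quality video, per the guidance above.
    """
    return min(width, height) >= MIN_SIDE

print(is_frame_large_enough(1280, 720))  # True
print(is_frame_large_enough(400, 300))   # False
```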
Below are examples of two frames and the final output video:
Tip: For optimal results, choose the style that best aligns with the motion you envision. This improves the success rate and yields smoother, more coherent video output.
Some styles are trained on lower resolutions, which may result in artifacts when used in Professional Mode. If the output appears blurry in Professional Mode, reducing the strength of the style can help improve the clarity and overall quality of the result.
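If your generation tool exposes the style strength as a numeric parameter, reducing it per the advice above might look like the following. This is a hypothetical configuration sketch: the field names and the 0.6 value are illustrative only, not an actual Wan2.1 API.

```python
# Hypothetical generation settings; field names are illustrative only.
generation_config = {
    "mode": "professional",
    "style": "live_wallpaper",
    # Reduced from an assumed default of 1.0 to limit blur/artifacts
    # when a low-resolution style is used in Professional Mode.
    "style_strength": 0.6,
}

print(generation_config["style_strength"])  # 0.6
```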
Below is an example using the Live Wallpaper style to animate an image: