
Getting Started Installing Stable Video Diffusion: Introduction to SVD in ComfyUI

ComfyUI is a user-friendly graphical interface that lets you use Stable Video Diffusion and other diffusion models without any coding. You simply drag and drop nodes to create an image generation workflow, then adjust the parameters and settings to customize your output. In this guide, I will show you how to get started with installing Stable Video Diffusion in ComfyUI and how to create impressive videos from images in just a few steps.


In this tutorial, you will learn how to:

  • Update ComfyUI and the Manager to the latest version

  • Load the Stable Video Diffusion workflow created by Enigmatic_E

  • Install the missing custom nodes required for the workflow

  • Generate a video from an image with motion and dynamics

Welcome to this tutorial on how to use Stable Video Diffusion in ComfyUI, a node-based GUI for diffusion models. Stable Video Diffusion is a cutting-edge technology that can generate realistic and dynamic videos from static images. It works by using a diffusion model, which is a type of artificial intelligence that can create diverse and high-quality outputs by adding and removing noise from the input.

To get started, you will need to download and install a few prerequisites, covered in the steps below.

Step 1: Update ComfyUI and the Manager

Before you can use Stable Video Diffusion, you need to make sure that you have the latest version of ComfyUI and the Manager installed on your device. The Manager is a tool that allows you to manage your ComfyUI installation, such as updating, installing, or uninstalling custom nodes.

To update ComfyUI and the Manager, open the Manager menu in ComfyUI and use its update options (for example, Update All), then restart ComfyUI.

Step 2: Load the Stable Video Diffusion workflow

The next step is to load the Stable Video Diffusion workflow created by Enigmatic_E, which is a JSON file named ‘SVD Workflow’. This workflow contains the nodes and settings that you need to generate videos from images with Stable Video Diffusion.

To load the workflow, follow these steps:

  • Go to the ComfyUI menu and click on Load.

    This will open a file explorer window where you can browse and select the JSON file that you want to load.

  • Navigate to the folder where you have saved the ‘SVD Workflow’ file and select it.

    Then click on Open. This will load the workflow into ComfyUI.

  • You will see a graphical representation of the workflow on the ComfyUI interface, with different nodes connected by wires.

Step 3: Install the missing custom nodes

After you have loaded the workflow, you might notice that some of the nodes have red boxes around them. This means that you are missing the custom nodes that are required for the workflow to function properly. Custom nodes are additional nodes that are not included in the default ComfyUI installation, but can be downloaded and installed separately.

To install the missing custom nodes, follow these steps:

  • Go back to the Manager menu and click on Install Missing Custom Nodes.

    This will open a new menu with a list of the custom nodes that you need to install for the workflow.

  • Click on each of the custom nodes that are listed.

    Then click on the Install button on the bottom right corner of the menu. This will download and install the custom nodes on your device.

  • Repeat this process for every custom node that is listed.

    You will see a message saying “Installation Complete” when it is done.

  • To apply the changes, you need to restart ComfyUI.

    Close ComfyUI and then launch it again. It will continue downloading some required items before you can use it.

In this section, I will explain how to adjust the settings for Stable Video Diffusion in ComfyUI. You will learn how to:

  • Download and Install SVD Models

  • Select the preferred SVD model

  • Upload your image

  • Change the width, height, video frames, motion bucket, fps, and augmentation level

  • Save and generate your video

After you’ve downloaded the models, you need to place them in your Stable Diffusion models folder (in a standard ComfyUI install, this is the models/checkpoints folder), which is where ComfyUI will look for them when you use the Stable Video Diffusion node. It is the same directory where your other Stable Diffusion models live.
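As a quick sanity check, a short Python sketch can list the SVD checkpoints ComfyUI will see. It assumes the default ComfyUI/models/checkpoints folder layout; adjust the path to match your own install:

```python
from pathlib import Path

# Sanity check: list the SVD checkpoints ComfyUI will see.
# Assumes the default folder layout (ComfyUI/models/checkpoints);
# adjust COMFYUI_DIR to match your own install.
COMFYUI_DIR = Path("ComfyUI")
CHECKPOINTS = COMFYUI_DIR / "models" / "checkpoints"

def find_svd_models(folder: Path) -> list:
    """Return SVD checkpoint filenames found in the given folder."""
    if not folder.is_dir():
        return []  # folder missing: nothing for ComfyUI to load
    return sorted(p.name for p in folder.glob("svd*.safetensors"))

print(find_svd_models(CHECKPOINTS))
```

If the printed list is empty, the models are in the wrong folder and the checkpoint loader dropdown will not show them.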

The first setting that you need to adjust is the Image Only Checkpoint Loader (img2vid model) node. This is where you select the SVD model to use for video generation. There are two SVD models to choose from: svd.safetensors and svd_image_decoder.safetensors. Both do essentially the same thing, but they differ in the number of frames they can generate. The svd.safetensors model generates 14 frames, while the svd_image_decoder.safetensors model generates 25 frames. More frames make for a smoother, longer video, but also take more time and resources to generate.

To select the preferred SVD model, follow these steps:

  • Click on the Image Only Checkpoint Loader (img2vid model)

    node and then click on the dropdown menu on the bottom right corner of the node. This will open a list of the available SVD models that you can choose from.

  • Select the SVD model that you want to use,

    either svd.safetensors or svd_image_decoder.safetensors. The selected model will be displayed on the node preview.

The next setting that you need to adjust is the Load Image node. This is where you can upload the image that you want to use as the input for the video generation. You can use any image that you want, as long as it is in PNG or JPG format.

To upload your image, follow these steps:

  • Click on the Load Image node

    Then click on the Browse button on the bottom right corner of the node. This will open a file explorer window where you can browse and select the image file that you want to upload.

  • Navigate to the folder where you have saved the image file and select it.

    Then click on Open. This will upload the image to the node and display it on the node preview.
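Stable Video Diffusion expects a 1024×576 (or 576×1024) input, so it helps to crop your image to that aspect ratio before uploading it. This is a minimal, library-free sketch of the center-crop arithmetic; the returned box can be passed to any image library's crop function before resizing:

```python
# Stable Video Diffusion expects 1024x576 or 576x1024 input. This
# library-free sketch computes a center-crop box matching that aspect
# ratio; pass the box to any image library's crop call, then resize.
def svd_crop_box(width: int, height: int,
                 target_w: int = 1024, target_h: int = 576) -> tuple:
    """Center-crop box (left, top, right, bottom) for the target aspect ratio."""
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Image is too wide: trim equal amounts from the sides.
        new_w = round(height * target_ratio)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Image is too tall (or already matches): trim top and bottom.
    new_h = round(width / target_ratio)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)

print(svd_crop_box(1920, 1080))  # 16:9 already matches: full frame
```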

Quick reference for the settings covered below:

  • Height/Width: 1024×576 (landscape) or 576×1024 (portrait)

  • svd.safetensors: 14 video frames

  • svd_image_decoder.safetensors: 25 video frames

The next setting that you need to adjust is the SVD_img2vid_Conditioning node. This is where you can change the width, height, video frames, motion bucket, fps, and augmentation level of the video generation process. These parameters affect the quality, duration, and style of the video that you will get.

To change these parameters, follow these steps:

  • Click on the SVD_img2vid_Conditioning node

    Then use the sliders and buttons on the bottom right corner of the node to adjust the parameters. You will see the values of the parameters on the node preview.

  • The width and height parameters determine the resolution of the video.

    Stable Video Diffusion requires a specific resolution, which is either 1024×576 or 576×1024. You can use the buttons to switch between these two resolutions.

  • The video frames parameter determines the number of frames that the video will have.

    This depends on the SVD model that you have selected. The svd.safetensors model can generate 14 frames, while the svd_image_decoder.safetensors model can generate 25 frames. You can use the slider to choose the number of frames you want, up to the model's maximum.

  • The motion bucket parameter determines how fast the motion in your video is. It defaults to 100.

    The lower the number, the slower the motion; the higher the number, the faster. You can use the slider to choose the value you want. The motion bucket affects the speed and intensity of the movement and animation of the objects and background in the image.

  • The fps parameter determines the frames per second of the video.

    It should be left at 6, as this is the optimal value for Stable Video Diffusion. You don’t need to touch this parameter for now.

  • The augmentation level parameter determines how strongly the background and details in the image are altered and animated.

    It ranges from 0 to 9, with 0 being the least and 9 being the most. You can use the slider to choose the augmentation level that you want. The augmentation level affects the amount and variety of the changes and effects that are applied to the image.

You can play around with the motion bucket and the augmentation level parameters to see what type of effects you get. Different combinations of these parameters can produce different styles and moods of the video.
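The video frames and fps settings above also determine how long the clip will be. A quick calculation (clip length in seconds is simply frames divided by playback rate):

```python
# Clip length follows directly from the settings above:
# duration (seconds) = video_frames / fps.
def video_duration_seconds(video_frames: int, fps: int) -> float:
    return video_frames / fps

# At the recommended 6 fps:
print(round(video_duration_seconds(14, 6), 2))  # 14-frame model -> 2.33
print(round(video_duration_seconds(25, 6), 2))  # 25-frame model -> 4.17
```

So even the larger model produces only a few seconds of video per run, which is worth keeping in mind when planning longer animations.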

The last setting that you need to adjust is the KSampler node. This is where you set the cfg value, a parameter that affects the quality and diversity of the video, and where you generate and save your output.

The CFG value ranges from 0 to 10, with 0 being the lowest and 10 being the highest. A low value like 3 to 4 works well for most people, producing clear and realistic videos; the default is 3.5. A higher value can give different results, such as more noise, distortion, or variation. Experiment with different values to see what suits your image.

To save and generate your video, follow these steps:

  • To generate your video, click on the Queue Prompt button on the bottom right corner of the node.

    This will start the video generation process. You will see a progress bar on the node indicating the status of the process.

  • To save your video, right-click on the video preview on the node and then click Save Preview.

    This will automatically save the video.
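For readers who prefer scripting, the same Queue Prompt action can be sent over ComfyUI's HTTP API. This sketch assumes the default server address (127.0.0.1:8188) and a workflow exported with "Save (API Format)" from the ComfyUI menu; the filename is just an example:

```python
import json
import urllib.request

# Sketch: queue the same generation from a script instead of the GUI,
# via ComfyUI's HTTP API. Assumes the default server address and a
# workflow exported with "Save (API Format)" from the ComfyUI menu.
SERVER = "http://127.0.0.1:8188"

def build_prompt_request(workflow: dict) -> urllib.request.Request:
    """Build the POST /prompt request that queues a workflow."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        SERVER + "/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# Usage (with a running ComfyUI server; filename is illustrative):
# with open("svd_workflow_api.json") as f:
#     urllib.request.urlopen(build_prompt_request(json.load(f)))
```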


Note:

At this moment, there is no text input; SVD is mainly an image-to-video generator. You can't mask or adjust it much, so most of your creative work happens during image generation, after which the image serves as the reference source for the video. The AI interprets the image on its own. For now, simple images without a lot of complex action work best.

