
Mastering the Automatic1111 User Interface & WebUI Forge: A Comprehensive Guide for Stable Diffusion

As intrepid explorers of cutting-edge technology, we find ourselves perpetually scaling new peaks. Today, our focus is the Automatic1111 User Interface and the WebUI Forge User Interface. If you’ve dabbled in Stable Diffusion models and have your fingers on the pulse of AI art creation, chances are you’ve encountered these two popular Web UIs. Their power, myriad options, and tantalizing dropdown menus promise exhilarating new ways to create content.

But what does it all mean? What do these tabs, options, and sliders do? I empathize; I’ve been there too. That’s precisely why we’re going to take a deep dive into the WebUI Forge and Automatic1111 user interfaces (the two are nearly identical). From the subtle intricacies of the txt2img tab to the img2img section, not to mention the ever-useful Upscaler options, we’re dissecting them all, understanding their functions, and unlocking their potential.

This guide is tailor-made for novices like you. Whether you’re dipping your toes into Automatic1111 or you’re an intermediate user hungry for every last detail, this series is your compass. Now, let’s get to work and learn what all these buttons and sliders do.

Automatic1111 Web UI is a browser-based interface for Stable Diffusion, a powerful open-source AI art generator. It provides a user-friendly way to interact with Stable Diffusion models and create AI images.

To install the Web UI, check out this guide below:

Stable Diffusion WebUI Forge is a platform built on top of Stable Diffusion WebUI, which is based on the Gradio framework. The name “Forge” draws inspiration from “Minecraft Forge.” This project aims to enhance the functionality and efficiency of the original Stable Diffusion WebUI by making development easier, optimizing resource management, and speeding up inference.

Key Features of Stable Diffusion WebUI Forge:

  • Speed Improvements: Forge optimizes resource management and speeds up inference compared to the original WebUI, with the biggest gains on low-VRAM GPUs.

How to Install Automatic1111 Web UI for Stable Diffusion

Installing the Automatic1111 Web UI for Stable Diffusion requires a solid groundwork. If you’ve been following our guide series, you’ve likely laid down this essential foundation. This tutorial builds upon the preparatory steps detailed in our previous blog.

How to Install WebUI Forge in 8 Steps: A Faster & Powerful Way to Use Stable Diffusion

Installing WebUI Forge for Stable Diffusion requires a solid groundwork. If you’ve been following our guide series, you’ve likely laid down this essential foundation. This tutorial builds upon the preparatory steps detailed in our previous blog.

In summary, Stable Diffusion WebUI Forge provides a faster and more efficient experience for image synthesis and manipulation, especially for low-VRAM GPUs. To install the Web UI, check out this guide below:

The Automatic1111 & WebUI Forge User Interface: An Overview

The Forge and Automatic1111 user interface serves as the foundation for your AI art creation. Within this interface, you’ll find an array of tabs and options. While they might appear daunting initially, take the time to explore them—each tab has a distinct purpose, all aimed at simplifying your workflow.

Let’s begin by taking a look at the main tabs within the interface and exploring their fundamental purposes. Here’s an overview of the General section in both Automatic1111 and WebUI Forge. Txt2img serves as the foundational area, encompassing essential sliders and parameters, and many of these settings are also present in the img2img tab. The Extras tab provides additional features and functionalities, while the PNG Info section offers information related to PNG files.

For handling checkpoint merging tasks, there’s the Checkpoint Merger tab. The Train section pertains to training processes, and the Settings tab allows customization and configuration. Finally, the Extensions tab expands capabilities through extensions. These basic tabs streamline your experience, whether you’re working with text-to-image or image-to-image transformations. 🚀🎨

​Prompt (press Ctrl+Enter or Alt+Enter to generate)
beautiful lady, freckles, dark makeup, hyperdetailed photography, soft light, head and shoulders portrait, cover, expired polaroid
Negative Prompt (press Ctrl+Enter or Alt+Enter to generate)
(worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art)++++, (watermark, signature, text font, username, error, logo, words, letters, digits, autograph, trademark, name)+, (blur, blurry, grainy), morbid, ugly, asymmetrical, mutated malformed, mutilated, poorly lit, bad shadow, draft, cropped, out of frame, cut off, censored, jpeg artifacts, out of focus, glitch, duplicate, (airbrushed, cartoon, anime, semi-realistic, cgi, render, blender, digital art, manga, amateur)++, (3D, 3D Game, 3D Game Scene, 3D Character), (bad hands, bad anatomy, bad body, bad face, bad teeth, bad arms, bad legs, deformities)++
Sampling Method: DPM++ 2M Karras
Seed: 2810691589
Sampling Steps: 10
CFG Scale: 7
Width: 768
Height: 768
Download Juggernaut Model (Final)
Main tabs: Generation | txt2img | img2img | Extras | PNG Info | Checkpoint Merger | Train | Settings | Extensions
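If you’d rather drive these same settings from code, both Automatic1111 and Forge expose a small HTTP API when launched with the --api flag. The snippet below is a minimal sketch, assuming a default local install on port 7860 and the standard /sdapi/v1/txt2img endpoint; field names (and combined sampler names like "DPM++ 2M Karras") can shift between versions, so verify against your own install’s API docs.

```python
import base64
import requests

# Minimal txt2img request mirroring the example settings above.
# Assumes the WebUI was started with the --api flag on the default port.
payload = {
    "prompt": "beautiful lady, freckles, dark makeup, hyperdetailed photography, "
              "soft light, head and shoulders portrait, cover, expired polaroid",
    "negative_prompt": "(worst quality, low quality, normal quality, lowres, low details)",
    "sampler_name": "DPM++ 2M Karras",
    "steps": 10,
    "cfg_scale": 7,
    "width": 768,
    "height": 768,
    "seed": 2810691589,
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
resp.raise_for_status()

# The API returns base64-encoded PNGs in the "images" list.
with open("portrait.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```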

Stable Diffusion Checkpoint

  • The Stable Diffusion Checkpoint dropdown allows you to select the specific checkpoint, or model, you want to use. It’s an important setting that determines the starting point for your experiments.
  • When working with Stable Diffusion, you can choose from various pre-trained checkpoints or even fine-tune your own models.
How to Install Stable Diffusion Models for Automatic1111 Web UI

Learn how to install Stable Diffusion Models for AUTOMATIC1111 Web UI. Access powerful art generators with convenient web interfaces.

LoRa (Add Network to Prompt)

  • Think of it as a smaller, more lightweight model compared to the full-sized ones.
  • It’s designed to be easier to train and more optimal for certain tasks. You can use LoRa on top of a trained checkpoint to enhance performance or explore different creative possibilities.
  • To add the LoRA dropdown to the top of the UI, enable it in the settings:
    • Go to Settings > User Interface > Quicksettings list and add sd_lora
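Once that’s set up, a LoRA is actually applied through a tag inside the prompt text itself. Here’s a small illustrative sketch; the file name is a made-up placeholder for whatever sits in your models/Lora folder.

```python
# Hypothetical LoRA file name -- replace with a real file from models/Lora (extension omitted).
# The trailing number is the strength multiplier; 1.0 applies the LoRA at full effect.
lora_tag = "<lora:myPortraitLora:0.8>"

prompt = f"portrait of a knight in ornate armor, dramatic lighting {lora_tag}"
print(prompt)  # paste this into the prompt box, or send it through the txt2img API
```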

Clip Skip

  • CLIP Skip is a feature in Stable Diffusion that allows users to skip layers of the CLIP model when generating images. Let me break it down for you:
  • CLIP Model: The CLIP model is a large language model trained on a massive dataset of text and images. It can be used to generate text descriptions of images and match images to text descriptions.
  • Embedding Process: When using Stable Diffusion, the CLIP embedding process is an important step. It takes your text prompt and converts it into a numerical representation that the model can understand.
  • CLIP Skip: This feature lets you skip some layers of the CLIP embedding process. Imagine the CLIP model as having multiple layers, each becoming more specific than the last. For example, if layer 1 represents “Person,” layer 2 could be “male” and “female.” Going further, layer 3 might include specific terms like “Man,” “boy,” “lad,” “father,” and “grandpa.” The CLIP Skip option allows you to stop at a specific layer, effectively controlling the depth of the embedding process.
    • Example: If you set CLIP Skip to 2, the embedding stops at the second-to-last layer, the 11th of the 12 layers in SD v1’s CLIP text encoder. Essentially, you’re stopping just before the deepest level of specificity. [Source 1] [Source 2]
  • To add the Clip Skip slider to the top of the UI, enable it in the settings:
    • Go to Settings > User Interface > Quicksettings list and add CLIP_stop_at_last_layers
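Clip Skip can also be set per request when scripting through the API, via the override_settings field. This is a hedged sketch: the key name matches the Quicksettings entry above, but treat the exact schema as version-dependent and check your install’s API docs.

```python
import requests

payload = {
    "prompt": "anime girl, detailed eyes, soft lighting",
    "steps": 20,
    # Same settings key as the Quicksettings entry; 1 = no skipping, 2 = skip the last layer.
    "override_settings": {"CLIP_stop_at_last_layers": 2},
    "override_settings_restore_afterwards": True,  # don't permanently change the global setting
}
requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
```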

SD VAE

Stable Diffusion (SD) incorporates a technique called Variational Autoencoder (VAE) to enhance the quality of AI-generated images. Let’s learn more:

  1. What is VAE?
    • VAE stands for Variational Autoencoder. It’s a neural network component that encodes and decodes images into and from a smaller latent space. This compression allows for faster computation.
    • In the context of Stable Diffusion, VAEs play a crucial role in improving image quality by recovering fine details.
  2. Do I Need a VAE?
    • By default, Stable Diffusion models (whether v1, v2, or custom) come with a built-in VAE.
    • However, when people refer to downloading and using a VAE, they mean using an improved version of it. These improved versions result from further fine-tuning the VAE part of the model with additional data.
    • Instead of releasing an entirely new model, only the updated VAE portion is provided as a smaller file.
  3. Effect of Using VAE:
    • The impact of using VAE is usually subtle but significant.
    • An improved VAE enhances image decoding from the latent space, leading to better recovery of fine details.
    • It particularly benefits rendering features like eyes and text, where fine details matter.
  4. Stability AI’s Fine-Tuned VAEs:
    • Stability AI has released two variants of fine-tuned VAE decoders:
      • EMA (Exponential Moving Average): Produces sharper images.
      • MSE (Mean Square Error): Yields somewhat smoother outputs.
    • Both versions improve image quality without any degradation.
  5. Example Comparisons (256×256 and 512×512 images):
    • Rendering eyes improves, especially for small faces.
    • Text rendering improvements are less pronounced.
    • Neither EMA nor MSE performs worse; they either do better or maintain the same quality.

Download VAE Here.

  1. Download the required file and place it in the stable-diffusion-webui/models/VAE/ directory.
  2. Navigate to the Settings Tab on the A1111 Webui, select Stable Diffusion from the left-side menu, click on SD VAE, and then choose ‘vae-ft-mse-840000-ema-pruned’.
  3. Hit the ‘Apply Settings‘ button and patiently wait for a successful application message.
  4. Proceed to generate your image in the usual manner, using any Stable Diffusion model in either the txt2img or img2img options.
  • To add the SD VAE dropdown to the top of the UI, enable it in the settings:
    • Go to Settings > User Interface > Quicksettings list and add sd_vae
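If you script your generations through the API, the VAE can likewise be chosen per request with override_settings instead of the Settings tab. A rough sketch, assuming the file from step 1 is sitting in models/VAE; the exact name shown in the dropdown (with or without the extension) may differ on your install.

```python
import requests

payload = {
    "prompt": "close-up portrait, detailed eyes, film grain",
    "steps": 20,
    # "sd_vae" is the same settings key used in the Quicksettings list above.
    "override_settings": {"sd_vae": "vae-ft-mse-840000-ema-pruned.safetensors"},
}
requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
```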
Parameters covered below: Prompt, Prompt Styles, Sampling Method, Sampling Steps, Hires.fix, Width/Height, Batch count, Batch size, CFG Scale, Seed, and Script.

Below is the WebUI Forge User Interface. The difference between Forge and Automatic1111 is that Forge comes pre-installed with SVD and Zero123 tabs. SVD stands for Stable Video Diffusion, which allows you to turn still images into animated videos. Stable Zero123 is an AI-powered model for generating novel views of 3D objects with improved quality. Released for non-commercial and research purposes, it uses an improved dataset and elevation conditioning for higher-quality predictions.

Automatic1111 and Forge General Tab and Parameters

The txt2img tab, short for Text-to-Image, stands as the cornerstone of the Automatic1111 user interface. It’s the favored tool of a new generation of creators called Prompt Engineers. The concept is straightforward yet powerful: you describe an image using words in the prompt box, and the underlying Stable Diffusion algorithm does its best to materialize your textual description into a tangible image. This feature has truly revolutionized digital art creation, bridging the gap between imagination and visualization.

Prompt Engineering is a comprehensive discipline within artificial intelligence (AI) that involves the systematic design, refinement, and optimization of prompts. It plays an important role in training AI models to produce specific outputs. Essentially, a prompt serves as the starting point for an AI model to generate a response. It can be as simple as a few words or as complex as an entire paragraph.

Here’s a breakdown of the key aspects related to prompt engineering:

Anatomy of a Good Prompt:

​Prompt (press Ctrl+Enter or Alt+Enter to generate)
The Prompt box is where you write the textual description of the image you wish to generate. The Stable Diffusion algorithm takes this prompt as input and, using its complex learning and pattern-recognition capabilities, attempts to create an image that corresponds to your description.
  • A good prompt needs to be detailed and specific. When creating a prompt, consider various keyword categories:

    • Subject: Clearly define what you want to see in the image or output.
    • Medium: Specify the material or style used for the artwork (e.g., digital art, oil painting).
    • Style: Describe the desired artistic style.
    • Resolution: Indicate the desired image resolution.
    • Additional Details: Include any relevant specifics, such as color, lighting, and context.

    Use these keywords as a checklist to guide your prompt creation. The more precise your prompt, the better the AI model’s understanding of your intent.

Prompt Analysis:

Subject: Lion man with muscular build.
Medium: Digital Concept Art
Style: Art by Jim Lee
Resolution: 8K
Additional Details: Ambient Lighting

You can see the difference when you use one of these resolution-type keywords. They don’t literally increase the resolution; the model interprets them as a cue to add more detail. In newer (and future) models these keywords are often unnecessary, but they’re still worth experimenting with.

Lion Man Prompt

Negative Prompts:

​Negative Prompt (press Ctrl+Enter or Alt+Enter to generate)
​​The Negative Prompt box is a complementary tool that refines your image generation process. Here, you can input what you specifically don’t want in your image. This allows for a higher degree of control and precision over the output, helping you to avoid unwanted elements in your generated image.
  • Negative prompts steer the AI model away from certain outcomes.

    • Instead of specifying what you want, you indicate what you don’t want. These are essential for newer AI models (v2) to improve image quality and alignment with expectations.
    • For example, if you want to avoid certain attributes (e.g., “ugly,” “deformed”), use negative prompts to guide the model away from those aspects.
Prompt (both images): beautiful lady, freckles, dark makeup, hyperdetailed photography, soft light, head and shoulders portrait, cover, expired polaroid
Negative Prompt: None (first image) vs. (worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overexposed, underexposed, grayscale, bw, bad photo, bad photography, bad art) (second image)

Prompt engineering doesn’t require coding experience; creativity and persistence are key. As AI models continue to evolve, exploring different prompt strategies will enhance your results. Many people use LLMs like ChatGPT to write their prompts.

Learn how to prompt in my guides below:

Prompt Engineer 01 – Stable Diffusion Prompt Weights & Punctuations – How to use it in Automatic1111

Learn the ins and outs of Stable Diffusion Prompt Weights for Automatic1111. I’ll be sharing my findings, breaking down complex concepts into easy-to-understand language, and providing practical examples along the way.

Words of Power: The Essential Cheat Sheet for Super Prompts Techniques

Mastering the art of prompting can be a game-changer. This comprehensive guide serves as a toolkit to enhance your image prompting skills. Leveraging power words, super prompts, and strategically curated techniques, this handbook provides a foundational blueprint for creating prompts that yield phenomenal results. Some of these prompts may already be embedded in your AI…

In the context of Stable Diffusion, a token serves as a fundamental unit of text input that you can feed to the model. When constructing prompts, you’ll notice a number like “0/75” displayed at the far end of your prompt bar. This represents the standard maximum limit of tokens you can use in a single prompt. If you exceed this limit, the model cleverly divides your prompt into smaller pieces, each consisting of 75 tokens or fewer. These chunks are then processed independently and later combined to form the complete output. 

For instance, if you have a 120-token prompt, it gets split into a 75-token chunk and a 45-token chunk, which are processed separately and then merged.

  • Tokens and the 0/75 Indicator

    In the context of Stable Diffusion, a ‘token’ is essentially a unit of text input you can feed to the model. The number “0/75” displayed at the far end of your prompt bar represents the standard maximum limit of tokens you can use in a single prompt. Anything beyond that may not be considered by the model.

  • Tokens Beyond 75 – Infinite Prompt Length

    But what if you have more to say? That’s where ‘Infinite Prompt Length’ comes in. If you type more than 75 tokens, the model cleverly divides your prompt into smaller pieces, each consisting of 75 tokens or fewer, and processes each chunk independently. So, if you have a 120-token prompt, it gets divided into a 75-token chunk and a 45-token chunk, which are processed separately and then combined. This allows you to provide more complex instructions to the model.

  • The ‘BREAK’ Keyword

    If you want to start a new chunk of tokens before reaching the 75-token limit, you can use the ‘BREAK’ keyword. This command will fill the remainder of the current chunk with ’empty’ tokens, and any text you type afterwards will start a new chunk. This gives you more control over how your input is processed. And there you have it! A simplified overview of tokens, infinite prompt length, and the BREAK keyword in Stable Diffusion.

Note: Tokens roughly correspond to words or word fragments rather than individual characters; punctuation marks usually count as their own tokens, while the spaces between words do not add tokens on their own.
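If you want to see how the UI arrives at that token count, you can run your prompt through the same CLIP tokenizer that SD v1 models use. A small sketch, assuming the openai/clip-vit-large-patch14 tokenizer from the transformers library (other model families use different tokenizers):

```python
from transformers import CLIPTokenizer

# SD v1.x uses the OpenAI CLIP ViT-L/14 tokenizer; SDXL and SD2 differ slightly.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = "beautiful lady, freckles, dark makeup, hyperdetailed photography, soft light"
ids = tokenizer(prompt)["input_ids"]

# Subtract the start/end markers the tokenizer adds; the rest is what the 0/75 counter tracks.
print(f"{len(ids) - 2} tokens used out of 75")
```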

The Prompt Style section in Automatic1111/Forge provides tools for managing and customizing prompt styles. Here’s a breakdown of its components:

  • Buttons and Dropdown Menu:

    • Four buttons and a dropdown menu constitute this section.
    • These elements allow you to create presets or upload CSV files containing attributes related to image styles.
  • Button Functions:
    Click on the paintbrush icon to the right and the menu below will appear. Here, you type in your information and save your presets. This becomes valuable as you gain experience and build up a collection of prompts you want to reuse, saving you time.

    • Blue Button (“Read generation parameters from prompt or last generation if prompt is empty”):
      • This button extracts generation parameters from the prompt or the last generated image (if the prompt is empty) and populates them in the user interface.
    • Trash Bin Icon (“Clears the prompts in the prompt and negative prompt box”):
      • Clicking this icon clears the content in the prompt and negative prompt boxes.
    • Paint Brush Icon (“Edit Style”):
      • Clicking this icon opens a menu where you can edit and save custom styles.
    • Dropdown Menu:
      • The dropdown menu allows you to select from available prompt styles that you can use for your creative process.
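Behind the scenes, saved styles typically land in a styles.csv file in the WebUI’s root folder, with name, prompt, and negative_prompt columns. If you’d rather manage presets in bulk, a sketch like the one below can append a style directly; the path and column layout are assumptions based on a typical install, so double-check yours before editing the file.

```python
import csv
from pathlib import Path

# Assumed location and column order (name, prompt, negative_prompt);
# appending like this presumes the file and its header row already exist.
styles_file = Path("stable-diffusion-webui/styles.csv")

with styles_file.open("a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow([
        "Polaroid Portrait",                                          # style name shown in the dropdown
        "head and shoulders portrait, soft light, expired polaroid",  # prompt text the style appends
        "(worst quality, low quality)",                               # negative prompt the style appends
    ])
```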

Sampling methods play an important role in how Stable Diffusion generates denoised images in AUTOMATIC1111. Let’s take a look at the details:

1. What is Sampling?

In the context of Stable Diffusion, sampling refers to the process of generating a sequence of cleaner and cleaner images from a noisy starting point.

  • Here’s how it works:

    1. Initially, a completely random image is generated in the latent space.
    2. A noise predictor estimates the noise present in the image.
    3. The predicted noise is subtracted from the image.
    4. This denoising process is repeated multiple times (usually a dozen times) until a clean image is obtained.
  • The method used for this denoising process is called the sampler or sampling method.

2. Noise Schedule:

  • The noise schedule controls the noise level at each sampling step.

  • Noise is highest at the first step and gradually reduces to zero at the last step.

  • Increasing the number of sampling steps results in smaller noise reduction between each step, which helps reduce truncation errors.

3. Samplers Overview:

While these samplers are essential for Stable Diffusion, they are just one part of the entire model. If you want to learn more, I recommend reading my guide on Sampling Methods for Stable Diffusion. Below is a list of sampling methods available in both Automatic1111 and WebUI Forge.

DPM++ 2M Karras, DPM++ SDE Karras, DPM++ 2M SDE Exponential, DPM++ 2M SDE Karras, Euler a,
Euler, LMS, Heun, DPM2, DPM2 a,
DPM++ 2S a, DPM++ 2M, DPM++ SDE, DPM++ 2M SDE, DPM++ 2M SDE Heun,
DPM++ 2M SDE Heun Karras, DPM++ 2M SDE Heun Exponential, DPM++ 3M SDE, DPM++ 3M SDE Karras, DPM++ 3M SDE Exponential,
DPM fast, DPM adaptive, LMS Karras, DPM2 Karras, DPM2 a Karras,
DPM++ 2S a Karras, Restart, DDIM, PLMS, UniPC

These sampling methods are available in both Automatic1111 and Forge. The images are generated using 10 steps.
DDPM, DDPM Karras, DPM++ 2M SDE SGMUniform, DPM++ 2M SDE Turbo, DPM++ 2M SGMUniform,
DPM++ 2M Turbo, Euler A SGMUniform, Euler A Turbo, Euler SGMUniform, LCM,
LCM Karras

These are currently only in WebUI Forge. The images are generated using 10 steps.

 To learn more about all the available sampling methods used for Stable Diffusion, please check out the guide below.


In Stable Diffusion, a Sampling Step refers to one iteration of the denoising process used to generate an image from random noise. The model starts with a noisy image and progressively refines it through a series of steps, each time reducing the noise and adding detail to the image. The number of sampling steps can affect the quality and clarity of the final image; more steps generally mean more detail at the cost of longer processing time.

Different samplers may have optimal ranges for the number of steps to produce the best results. For instance, some users find that most images stabilize around 30 steps, while others may increase the steps to 64 or even higher if they’re seeking more coherency or correcting issues like distorted limbs.

The choice of sampler and the number of steps used can significantly influence the outcome, and users often experiment to find the best combination for their specific needs. There are various samplers available, such as Euler, Heun, and DDIM, each with its own characteristics and settings. Each sampler and model performs differently at different step counts. This is something you will need to experiment with yourself, or check the suggested settings for the models you use.

I’m working on a guide on how to save Prompt Styles.


HiRes.fix in Automatic1111 is a powerful feature that allows you to upscale your generated images while they’re being created. Keep reading to learn the details:

  • Upscaling During Generation:

    • Normally, you’d generate an image first and then upscale it using img2img or other options in Automatic1111.
    • However, HiRes.fix streamlines this process by increasing the resolution during image generation itself.
    • This approach ensures that no artifacts or irregularities are introduced during upscaling.
    • Additionally, it enhances the overall image quality.
  • Setting the Target Resolution:

    • When you open HiRes.fix, you’ll notice that it’s initially set to ‘Upscale by 2’.
    • This means that the default width and height (512×512) will be doubled, resulting in a 1024×1024 image.
    • You can further upscale by a maximum factor of 4x.
    • Be aware that the more you upscale, the longer it will take to generate the image.
  • Choosing an Upscaler:

    • HiRes.fix allows you to select from various upscalers available in a dropdown menu.
    • Each upscaler has unique characteristics, so experimentation is key.
    • Here are some options:
      • Latent Upscaler: A good general-purpose choice.
      • R-ESRGAN 4x+: Ideal for photorealistic images.
      • R-ESRGAN 4x+ Anime6B: Works well for animated/cartoon-style images.
  • HiRes Steps: Post-Sampling Brilliance:

    • HiRes steps refine image quality after the initial sampling.
    • These steps occur after the sampling steps and contribute to the overall image quality.
    • You can set HiRes steps in the range of 0–150.
    • Keeping it at 0 makes HiRes steps equal to the sampling steps.
    • For example, if you have 20 sampling steps and 0 HiRes steps, the total steps would be 40.

In summary, HiRes.fix simplifies the process of creating high-resolution Stable Diffusion images directly during generation, resulting in better quality and smoother workflows.
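For reference, the same Hires.fix controls are exposed as fields on the txt2img API. A hedged sketch follows; field names such as enable_hr, hr_scale, hr_upscaler, and hr_second_pass_steps reflect recent Automatic1111 builds and may differ in other versions.

```python
import requests

payload = {
    "prompt": "castle on a cliff, golden hour, highly detailed",
    "steps": 20,
    "width": 512,
    "height": 512,
    "enable_hr": True,              # turn Hires.fix on
    "hr_scale": 2,                  # 512x512 first pass -> 1024x1024 output
    "hr_upscaler": "R-ESRGAN 4x+",  # any upscaler from the dropdown
    "hr_second_pass_steps": 10,     # 0 would reuse the sampling step count
    "denoising_strength": 0.5,      # how much the second pass may change the image
}
requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
```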

Learn more about upscaling below:

ControlNet Upscale: Learn How to Tile and Upscale with Ultimate SD Upscale

Learn how to upscale Stable Diffusion art with ControlNet Upscale. Ever tried to transform your favorite small image into a massive printable billboard, only to be met with unnatural smoothness, loss of details, or quirky artifacts?

It’s pretty obvious what the Width and Height parameters do in Automatic1111 and WebUI Forge, but let’s look at them in a bit more detail.

  • Width and Height Parameters:

    • In Stable Diffusion, the Width and Height parameters determine the size of the generated images.
    • By default, Stable Diffusion produces images with dimensions of 512×512 pixels.
    • You can customize these dimensions to create rectangular images with specific aspect ratios.
    • Keep in mind that both the width and height should be multiples of 8 to avoid distortions.
    • Aspect ratios can differ. Common dimensions are 512×768 or vice versa. This depends on the models you’re using.
  • Model Training and Image Size:

    • Versions like Stable Diffusion 1.5 were trained on 512×512 images. They work well with 768 pixel images as well.
    • However, newer models like SDXL 1.0, Stable Cascade, and Stable Diffusion 3.0 are trained on 1024×1024 images.
    • When generating images using these newer models, sticking close to the resolutions and aspect ratios they were trained on tends to yield the best results.
    • For example, with an SDXL model, a width and height of 1024×1024 matches the training data exactly.
  • Automatic1111 & WebUI Forge:

    • The Height/Width parameters in Automatic1111 allow you to specify the desired image size.
    • To create a portrait-oriented (tall) image, for example, you could set:
      • Width: 512 pixels
      • Height: 1024 pixels

Remember that adjusting the image size affects both computational requirements and output quality. Sticking to the aspect ratio used during training ensures consistent and visually pleasing results in Stable Diffusion.

In Stable Diffusion, both batch size and batch count play important roles in generating images. Let’s learn what these parameters mean:

  • Batch Size:

    • Batch size determines the number of images generated in a single batch.
    • When you set the batch size, you’re essentially specifying how many images will be processed together in parallel.
    • Larger batch sizes require more VRAM (graphics card memory) because each image in the batch needs to be processed simultaneously.
    • If your batch size is too high, your graphics card might run out of memory, resulting in an “out-of-memory” error during image generation.
    • Generally, 8GB of VRAM is sufficient for Stable Diffusion unless you’re generating very large batches of images.
    • Increasing the batch size can improve efficiency, but it also increases VRAM usage and processing time.
    • Remember that VRAM is separate from your main system RAM and is essential for handling complex AI models.
    • Setting a higher batch size means more parallel image processing, but it also requires more VRAM.
  • Batch Count:

    • Batch count specifies the number of batches of images you want to generate.
    • In the Stable Diffusion WebUI, the batch size can go up to 8 images, while the batch count can go much higher (up to 100).
    • The total number of images generated is determined by multiplying the batch size by the batch count.
    • For example, if you set the batch count to 10 and the batch size to 1, Stable Diffusion will generate 10 images from the same text prompt.
    • Adjusting the batch count allows you to control the overall number of image generations.

In summary, batch size affects parallel processing and VRAM usage, while batch count determines the total number of images produced. Balancing these settings ensures efficient and successful image generation in Stable Diffusion.
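In API terms, batch size maps to batch_size and batch count maps to n_iter (number of iterations); the total image count is simply their product. A quick sketch, with the usual caveat that field names may vary between versions:

```python
import requests

payload = {
    "prompt": "a watercolor fox in a misty forest",
    "steps": 20,
    "batch_size": 2,  # images generated in parallel per batch (VRAM-bound)
    "n_iter": 5,      # batch count: how many batches to run one after another
}
# Total images produced: batch_size * n_iter = 10
requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
```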

In Stable Diffusion, the CFG Scale, which stands for Classifier Free Guidance scale, is a setting that influences how closely the generated image adheres to the text prompt you provide. A higher CFG Scale value means the output will more strictly align with the input prompt or image, potentially at the cost of introducing distortions. Conversely, a lower CFG Scale value may result in higher quality images that are less faithful to the prompt.

  • Purpose:

    • The CFG Scale controls the guidance provided to Stable Diffusion during image generation.
    • It plays a crucial role in both text-to-image (txt2img) and image-to-image (img2img) processes.
  • Functionality:

    • When you adjust the CFG Scale, you influence how closely the generated image aligns with your text prompt.
    • Higher CFG values result in images that closely match the prompt, while lower values allow for more creative deviations.
    • Striking the right balance is essential: too high, and the image may become distorted or over-processed; too low, and it might stray from your intent.
  • Experimentation:

    • The ideal CFG Scale varies based on factors like the model and samplers you’re using and your desired outcome.
    • Some models prefer higher CFG for consistency, while others yield abstract results with lower CFG.
    • Experiment with different values to find what works best for your creative vision.
  • Remember, the CFG Scale acts as a balancing tool, shaping the interplay between prompt guidance and artistic exploration! 

Imagine CFG as a sliding scale that controls your guide’s attentiveness to your instructions. The default position, at a CFG value of 7, offers a balance, granting Stable Diffusion enough liberty to interpret your prompt creatively, while also ensuring it doesn’t stray too far off course. If you notch it down to 1, you’re essentially giving your guide free rein. Cranking it up above 15, however, is like you’re micromanaging the guide to stick strictly to your instructions.

While the Stable Diffusion Web UI limits the CFG scale between 1 and 30, there are no such limits if you’re working via a Terminal. You can push it all the way up to 999, or even enter negative territory!

CFG Scale Balancing

It’s a balancing act to find the right CFG Scale value that provides the best combination of fidelity to the prompt and image quality. Users often experiment with different values to find the sweet spot for their particular project.

  • Color Saturation and Contrast:

    • As you adjust the CFG value, you’ll notice changes in color saturation and contrast within the generated images.
    • Higher CFG values often result in more vivid colors and pronounced contrasts.
  • Detail and Artifacts:

    • However, there’s a delicate balance to strike.
    • Pushing the CFG value too high can lead to unintended consequences:
      • Loss of Detail: Images may lose fine details, becoming less intricate.
      • Blurrier Output: High CFG values might introduce blurriness.
      • Artifacts: Unwanted artifacts may appear, affecting image quality.
  • Scenario Example:

    • Consider using the DPM++ 2M Karras sampler with 20 sampling steps (the default Stable Diffusion Web UI settings).
    • Experiment with CFG values to find the sweet spot that aligns with your creative vision.

The CFG Scale, ranging from 1 to 30, demonstrates noticeable differences at each level. However, these variations are also influenced by other factors and settings. Does that mean you’re stuck with blurry images if you want to stick to higher CFG values? Not quite! You can counterbalance this by:

  • Increasing Sampling Steps:

    Adding more sampling steps typically adds more detail to the output image, but be mindful of processing times, which could increase as well.

  • Switching Sampling Methods:

    Some sampling methods perform better at specific CFG and sampling steps. For instance, UniPC tends to deliver good results at a CFG as low as 3, while DPM++ SDE Karras excels at providing detailed images at CFG values greater than 7.
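A quick way to develop intuition for CFG is to sweep a handful of values while holding the seed fixed, so the composition stays comparable. The sketch below does that through the API; it assumes the same local endpoint as the earlier examples.

```python
import base64
import requests

# Fixing the seed keeps the composition comparable while only the CFG scale changes.
for cfg in (3, 7, 12, 20):
    payload = {
        "prompt": "lion man with muscular build, digital concept art, ambient lighting",
        "seed": 12345,
        "steps": 20,
        "cfg_scale": cfg,
    }
    resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
    with open(f"cfg_{cfg}.png", "wb") as f:
        f.write(base64.b64decode(resp.json()["images"][0]))
```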

More on CFG Scales below:


To squeeze the best image quality from Stable Diffusion without blowing up memory and processing times, it’s essential to strike the right balance between CFG, sampling steps, and the sampling method. The XYZ Plots technique is a handy tool to neatly display all your results in an organized grid. Be sure to check out my dedicated blog post on mastering XYZ Plots, linked below.

Upscaler options available in the dropdown: None, Lanczos, Nearest, Latent, Latent (antialiased), Latent (bicubic), Latent (bicubic antialiased), Latent (nearest), Latent (nearest-exact), DAT x2, DAT x3, DAT x4, ESRGAN_4x, LDSR, R-ESRGAN 4x+, R-ESRGAN 4x+ Anime6B, ScuNET GAN, ScuNET PSNR

When you enable Hires.fix in Automatic1111, it reveals a set of sliders that allow you to fine-tune the resolution of your upscaled images. The HiRes steps occur after the initial sampling and contribute significantly to the overall image quality. By adjusting these sliders, you can precisely control how much the image is upscaled, ensuring optimal results for your specific requirements.

  1. What Are HiRes Steps?
    • HiRes steps refine image quality after the initial sampling process.
    • They occur after the sampling steps and are responsible for upscaling the image.
    • The total number of steps in the process includes both the sampling steps and the HiRes steps.
    • You can set the HiRes steps in the range of 0 to 150.
    • If you keep the HiRes steps at 0, it means that there are no additional upscaling steps beyond the initial sampling.
  2. Why Are HiRes Steps Important?
    • HiRes steps allow for further enhancement of image quality.
    • They help refine the denoised image obtained from the sampling process.
    • By adjusting the number of HiRes steps, you can control the trade-off between image quality and computational resources.
    • Note that the impact of HiRes steps on image generation diminishes beyond a certain point.

Denoising Strength

  • Denoising strength controls how much the second (upscaling) pass is allowed to change the image produced by the first pass.
  • A value of 0 keeps the upscaled image essentially identical to the original, while values close to 1 let the model repaint it heavily.
  • Moderate values (commonly somewhere around 0.4–0.6) tend to add detail while preserving the original composition; experiment to see what suits your image.

Upscale by

  • The “Upscale By” slider controls the factor by which the image is upscaled during generation.
  • By default, Automatic1111 sets the width and height at 512 x 512 pixels.
  • When you open HiRes.fix, it’s initially set to ‘Upscale by 2’, resulting in a 1024 x 1024 image.
  • You can upscale it by a maximum factor of 4x.
  • Keep in mind that the more you upscale, the longer it will take to generate your image.

Resize width/Height to

  • Instead of a relative scale factor, these fields let you specify an exact target width and height for the upscaled image.
  • When they are set to non-zero values, they take precedence over the ‘Upscale by’ slider.

HiRes Checkpoint/Sampling Method

In Automatic1111 version 1.8, two new tabs have been introduced in the Hires Fix section: Hires Checkpoint and Hires Sampling Method. These tabs provide additional flexibility when upscaling and fixing images.

  1. Hires Checkpoint: This feature allows you to select a different checkpoint for the upscaling process. It’s particularly useful if you want to introduce a distinct style to your image during upscaling. You can choose a checkpoint that aligns better with your desired outcome.
  2. Hires Sampling Method: The sampling method complements different models. Experimenting with various combinations of checkpoints and sampling methods will help you find the optimal approach for upscaling your images. Additionally, you have the option to prompt the model for specific checkpoints and samplers, providing fine-grained control over the upscaling process.

In summary, HiRes.fix provides a way to upscale your images during the generation process, improving quality and avoiding post-generation artifacts. Play around with these settings to create stellar images in Stable Diffusion using Automatic1111🌟🖼️.

Finding the right balance between denoising and upscaling is essential for achieving optimal results in the AUTOMATIC1111 Stable Diffusion model. Experiment with different settings to achieve the desired output! 🌟

Employing the Hires.fix function effectively can help avoid issues such as twinning and loss of composition in your upscaled images. “Twinning,” in this context, refers to the unwanted duplication or multiplication of features in your creations. For instance, this might result in characters with two faces or two heads, which can be visually disruptive unless intentionally desired.

Restore Faces Button – How does restore faces work in Stable Diffusion?

Restore Faces in Automatic1111

The Restore Faces feature in Automatic1111 is designed to enhance and restore facial features in images and videos. Specifically, it focuses on improving the appearance of faces by applying advanced restoration techniques. When enabled, it can significantly enhance the quality of facial details, reduce imperfections, and create more visually appealing results.

Face Restoration Models

  1. CodeFormer:
    • This model utilizes a deep learning architecture to restore faces. It is trained on a large dataset of facial images and can handle various lighting conditions, poses, and expressions.
    • The goal of CodeFormer is to produce realistic and high-quality results by learning from diverse face images.
  2. GFPGAN:
    • GFPGAN is another face restoration model that uses generative adversarial networks (GANs).
    • It enhances facial features by learning from a wide range of face images, resulting in improved visual quality.

How to Turn On Restore Faces

To enable the Restore Faces feature in Automatic1111, follow these steps:

  1. Open Automatic1111/WebUI Forge.
  2. Navigate to the Settings section.
  3. Look for the Postprocessing settings on the left sidebar.
  4. Under Face Restoration, you’ll find the Restore Faces option.
  5. Click on the Restore Faces button to enable it.
  6. Apply Settings > Reload UI

CodeFormer Weight Slider

  • The weight slider allows you to control the intensity of the CodeFormer face restoration effect. In Automatic1111 the slider is labeled with 0 as the maximum effect and 1 as the minimum effect:
    • Moving it towards 0 increases the restoration effect.
    • Moving it towards 1 reduces the effect, preserving more of the original facial features.
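When scripting through the API, face restoration is a simple boolean on the request, and the model choice plus CodeFormer weight can be supplied through override_settings. Treat the exact keys below as assumptions based on recent builds and confirm them against your install.

```python
import requests

payload = {
    "prompt": "head and shoulders portrait, studio lighting, 85mm photo",
    "steps": 20,
    "restore_faces": True,  # same as ticking Restore Faces in the UI
    "override_settings": {
        "face_restoration_model": "CodeFormer",  # or "GFPGAN"
        "code_former_weight": 0.5,               # the weight slider (0 = maximum effect, 1 = minimum)
    },
}
requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
```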

Move Face Restoration Model from VRAM After Processing

  1. After the face restoration process (e.g., applying enhancements to an image or video frame), some software applications choose to move the model from VRAM to RAM.
  2. Here’s how it works:
    • During Processing:
      • The model resides in VRAM for fast access by the GPU during active processing.
    • After Processing:
      • Once the processing is complete, the model is moved from VRAM to RAM.
      • By moving the model to RAM:
        • It frees up VRAM for other tasks.
        • It ensures that the model is available for future frames or subsequent processing steps.
        • Both the CPU and GPU can access the model, allowing for more flexibility.

Sometimes It’s Better Left Off

In some instances, leaving the ‘Restore Face’ function off could yield better results. On the flip side, if you notice an image where a face appears incomplete or unrefined, that might be a perfect scenario to engage the ‘Restore Face’ feature. In the end, it’s all about experimenting and finding the balance that works best for your creative process. Or you can use After Detailer below for better face detailing.

ADetailer 101: A Detection and Inpainting Tool for Image Quality Improvement in Stable Diffusion

Discover the power of ADetailer, a web-UI extension for image enhancement. Fix distorted faces, change backgrounds, add or remove objects, and more!

Seeds are numbers that define the starting point of your image generation in Automatic1111, a GUI for Stable Diffusion. They control the content of the image, so changing the seed will result in a different image, even if you keep all the other settings the same. Seeds can be set to any integer value, or -1 for a random seed.

You can think of seeds as coordinates in a very large and complex map of possible images. Each seed corresponds to a unique location on this map, and each location has a different image. When you use Automatic1111, you are exploring this map by moving from one location to another, using your text prompt and other settings as guidance.

The seed value is important because it determines what kind of images you will see when you use Automatic1111. For example, if you use the same prompt but different seeds, you will get different variations of the same concept. If you use the same seed but different prompts, you will get different interpretations of the same image. If you use the same seed and the same prompt, you will get the same image every time.

You can find the seed used to generate an image in Automatic1111 by looking at the output window, the filename, or the PNG info tab. You can also change the seed value manually by typing it in the seed box, or clicking the random button to generate a new random seed. By changing the seed value, you can explore different images that match your prompt and settings.

Additional Seed Tools: Dice, Recycle and ‘Extra’

How to use the Seed in Automatic1111/Forge

As covered above, the Seed is the number that defines the starting point of your image generation. It controls the content of the image, so changing it produces a different image even with identical settings; set it to any integer value, or -1 for a random seed.

Seed icons

Next to the Seed box, there are a few icons that you should know about:

  • The dice icon: This sets your Seed to -1, which means Automatic1111 will choose a random Seed for each image.
  • The recycle icon: This copies the Seed from your previous image.

Extra Seed menu

If you want to have more control over the Seed, you can click on the ‘Extra’ option. This will show you the Extra Seed menu with some advanced settings:

  • Variation seed: This is another Seed that you can use to modify your image.
  • Variation strength: This is a slider that lets you adjust how much of the Variation seed you want to apply to your image. A value of 0 means you only use the original Seed, while a value of 1 means you only use the Variation seed.
  • Resize seed from width/height: This is a checkbox that you should enable when you change the size of your image. Otherwise, changing the width or height will also change the content of your image, even if you keep the same Seed. By checking this option, you preserve the content of your image when resizing.
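These Extra Seed controls map to API fields too (subseed, subseed_strength, and seed_resize_from_w/h in recent builds). The sketch below, under the same local-API assumption as the earlier examples, blends a variation seed into a base seed:

```python
import requests

payload = {
    "prompt": "red-haired woman, freckles, soft window light",
    "steps": 20,
    "seed": 2810691589,         # base seed
    "subseed": 42,              # variation seed
    "subseed_strength": 0.3,    # 0 = only the base seed, 1 = only the variation seed
    "seed_resize_from_w": 768,  # preserve composition when changing resolution
    "seed_resize_from_h": 768,
}
requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=300)
```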

Finding and changing the Seed

As noted earlier, you can find the Seed used to generate an image by looking at the output window, the filename, or the PNG Info tab. You can also change the Seed value manually by typing it in the Seed box, or click the dice button to generate a new random Seed. By changing the Seed value, you can explore different images that match your prompt and settings.

The ‘Script’ dropdown is a feature that allows you to automate the image generation process in Automatic1111, a GUI for Stable Diffusion. It lets you run scripts that can modify the settings, prompts, seeds, and outputs of your images. Scripts are written in Python, and additional ones can be shared and installed from online sources.

Script options

Next to the Script box, there is a dropdown menu that shows you the available scripts that you can use. By default, you have four options:

  • None: This means you don’t use any script, and you generate images manually by clicking the buttons.
  • Prompt Matrix: This is a built-in script that generates a grid of images covering different combinations of prompt parts. You separate the variable parts of your prompt with the | (pipe) character, and the script generates one image for each combination, arranged in a grid so you can compare them. For example, a cozy cabin|snowy mountains|oil painting produces four images: the base prompt alone, with each optional part separately, and with both.
  • Prompts from file or textbox: This is a built-in script that generates images from a list of prompts you provide, either by uploading a text file or by typing them into the text box the script adds, with one prompt per line. For example, a file named animals.txt containing cat, dog, and horse on separate lines would produce one image for each line.
  • X/Y/Z plot: This allows you to generate variations of images based on settings for each parameter within Automatic1111, such as seed, scale, prompt weights, denoising strength, etc. You can also use expressions or functions to define the parameters. 
Maximize Your Workflow Efficiency: Pre-Planning with Stable Diffusion’s XYZ Plot Image Grid

Successful AI creations hinge not only on talent, but also on the effective use of sophisticated tools. Among them, the XYZ Plot of Stable Diffusion shines bright. This tool streamlines your creative process, offering a tangible roadmap for your AI visual generation. How do you use XYZ plot in Stable Diffusion? The XYZ plot, found…

The Img2Img tab in Automatic1111 is a feature that allows you to generate new images from an input image and a text prompt. The output image preserves the color and composition of the input image, but modifies it according to the text prompt. For example, you can use the Img2Img tab to turn a simple sketch into a realistic painting, or to add details to an existing image. The Img2Img tab uses a technique called Stable Diffusion, which is a state-of-the-art method for image synthesis.

You can learn more about img2img in the guide below. It is pretty deep and deserves a completely different guide for it. 👇

Inpainting 101: How to Inpaint Anything in Stable Diffusion using Automatic1111

Learn about Stable Diffusion Inpainting in Automatic1111! Explore the unique features, tools, and techniques for flawless image editing and content replacement.

PNG info is a feature in Automatic1111 that allows you to view and edit the text information stored in PNG files. This information can include the prompt, the negative prompt, and other generation parameters used to create the image with Stable Diffusion. You can also send the image and the parameters to other tabs in Automatic1111, such as img2img, inpainting, or upscaling, to modify or enhance the image. PNG info can help you understand how an image was generated, recreate it, or use it as a starting point for your own creations.

PNG info of Redhead woman in Automatic1111

Extras Tab for Upscaling Your Images

The Extras tab in Automatic1111 is a section with features that allows you to enhance and customize your images using various options. You can use this tab to upscale your images by choosing the scale factor or the target size. You can also use CodeFormer, a robust face restoration algorithm, to improve the quality and fidelity of faces in your images. Additionally, you can split oversized images into smaller pieces, crop them automatically based on focal points or size, create flipped copies for data augmentation, and generate captions using Deepbooru, a deep learning-based tag estimation system.

The Extras tab is useful for creating high-resolution, realistic, and diverse images for your projects. I will explore this further in the future, but for now, check out my guide on upscaling using the features in Extras.

ControlNet Upscale: Learn How to Tile and Upscale with Ultimate SD Upscale

Learn how to upscale Stable Diffusion art with ControlNet Upscale. Ever tried to transform your favorite small image into a massive printable billboard, only to be met with unnatural smoothness, loss of details, or quirky artifacts?


Checkpoint Merger: A Tool for Combining and Creating Models

Checkpoint Merger is a complex but intriguing aspect of this toolset, requiring its own dedicated tutorial for complete understanding. In essence, this feature allows you to amalgamate different fine-tuned models, generating novel ones. It provides the ability to create Checkpoints and Safetensors, and it even supports the integration of VAEs.


  • Be sure to check back for a comprehensive guide on how to effectively utilize the Checkpoint Merger in the near future.

Learn to Train Your Own AI Art Models

The Train tab, a gateway to a personalized creative AI toolkit, enables you to train your own model. This complex area, equipped with tools like Hypernetworks, Textual Inversion, and Dreambooth, demands its own focused tutorial, which we’ll be providing in due time.

Training your model paves the way for a personalized approach to AI art, replacing the hit-and-miss method with precision. By using machine learning to customize your style and coupling it with other tools like Runway ML Gen 2 and ControlNet, you can unlock a plethora of creation possibilities, ranging from comics to full-blown films. The beauty of Stable Diffusion is directly proportional to your dedication.

Despite the brilliant video generation capabilities of tools like RunwayML, Stable Diffusion Automatic1111, equipped with extensions and free of charge, brings similar functionality right to your fingertips. With the right hardware, you can let your creativity run wild.

For a guide on training your own diffusion models, check out this link.

  • Related: ‘A Guide to Training Your Personalized Stable Diffusion Models’ (coming soon)
Stable Diffusion: Captioning for Training Data Sets

This captions and data sets guide is intended for those who seek to deepen their knowledge of Captioning for Training Data Sets in Stable Diffusion. It will assist you in preparing and structuring your captions for training datasets.

Extensions – Uncovered Features in Automatic1111

The Extension tab in AUTOMATIC1111’s Stable Diffusion Web UI is where you can manage and install extensions created by the community. These extensions enhance the functionality of the web interface and provide additional features beyond the default capabilities. You can learn how to install extensions below:

How to Install Automatic1111 Extensions for Stable Diffusion

Enhance your Stable Diffusion experience with Automatic1111 Extensions. Follow our comprehensive guide to install Automatic1111 extensions.

GitHub Automatic1111 v1.8.0 update information

How to Install SDXL 1.0 for Automatic1111: A Step-by-Step Guide

Welcome to this step-by-step guide on How to install SDXL 1.0 for Automatic1111. This blog post aims to streamline the installation process for you, so you can quickly utilize the power of this cutting-edge image generation model released by Stability AI.

Official Automatic1111 Web UI guide found here.

Learning Stable Diffusion 101:

