ComfyUI Course: Ep04 - IMG2IMG and LoRA Basics
Move beyond simple prompts: learn how to blend existing images with custom styles using ComfyUI’s IMG2IMG and LoRA tools. Gain practical skills to control, remix, and iterate your AI art with intention, precision, and creative flair.
Related Certification: Certification in Applying IMG2IMG Techniques and Implementing LoRA Models with ComfyUI

What You Will Learn
- Build IMG2IMG workflows in ComfyUI and use VAE Encode
- Adjust denoising strength to control edits vs. remixes
- Install, load and calibrate LoRA models with trigger words
- Integrate LoRA into IMG2IMG for consistent style transfer
- Troubleshoot common issues like memory, VAE mismatches, and artifacts
Study Guide
Introduction: Unlocking ComfyUI’s Power with IMG2IMG and LoRA
In the world of AI image generation, there’s a point where you stop simply prompting and start truly directing the creative process. That’s what this guide is about: moving beyond text-to-image basics and learning how to take control using two of ComfyUI’s most transformative features, Image-to-Image (IMG2IMG) workflows and LoRA (Low-Rank Adaptation) models.
This course is designed to walk you through the practical, tactical, and conceptual sides of both IMG2IMG and LoRA basics. You’ll learn when and why to use each, how to structure your workflows, what pitfalls to avoid, and how to combine these tools for powerful results. By the end, you’ll be able to create images that blend the strengths of existing visuals with the nuanced adaptation that LoRA models offer, unlocking new creative possibilities that were previously out of reach for most users.
Understanding the Foundations: What Makes IMG2IMG and LoRA Different?
Before we dive into the how-to, let’s clarify what sets these tools apart from the default ComfyUI text-to-image workflow.
- Text-to-Image: You start with a blank “latent” canvas, an empty space waiting for your prompts to conjure something from nothing.
- Image-to-Image (IMG2IMG): You begin with an existing image. This isn’t just a background; it’s the foundation, the DNA, of what comes next. Your prompts and parameters now act as tools of transformation instead of pure creation.
- LoRA: Instead of retraining a massive model for every new style, subject, or niche, LoRA lets you “bolt on” specialized knowledge, fine-tuning the model’s behavior for specific tasks, styles, or subjects, with minimal computational cost.
By mastering these, you move from experimenting to engineering your vision. Let’s break down the mechanisms.
Section 1: Building an IMG2IMG Workflow in ComfyUI
What is an IMG2IMG Workflow?
An IMG2IMG workflow lets you use a pre-existing image as the start point for generation. Think of it like a sculptor starting with a rough block of marble (your input image), and using prompts and parameters as chisels to shape it into something new. Unlike text-to-image workflows that create from a void, IMG2IMG gives you an anchor: a way to blend the intent of your prompt with the content of an image.
Why Use IMG2IMG?
- You want to remix, retouch, or evolve an existing image with AI help.
- You’re refining generated art, making incremental changes, or keeping composition consistent across variations.
- You want to combine the structure of a photo with the imagination of a prompt (e.g., turning a sketch into a painting, or a photo into a fantasy scene).
1.1 The Core Nodes: From Empty Latent to Loaded Image
Default Text-to-Image Flow Recap:
- “Empty Latent Image” node: Starts with blank latent space.
- “Load Checkpoint” node: Loads the base diffusion model.
- “Prompt” nodes: Encode your positive and negative prompts.
- “K Sampler” node: Generates the new image in latent space.
- “VAE Decode” node: Turns the generated latent image back into a viewable image.
Transitioning to IMG2IMG:
The main change: swap out the “Empty Latent Image” node for a “Load Image” node.
But it’s not as simple as plugging your photo in. The diffusion model expects a “latent” format (a compressed, internal representation), not regular pixel data.
The Solution:
- Add a “Load Image” node to bring in your external image.
- Use a “VAE Encode” node to convert that image from pixel mode to latent mode.
- Make sure the “VAE Encode” node is connected to the same VAE as your base model (the checkpoint).
Example 1:
Suppose you have a photo of a cityscape. You want to turn it into a cyberpunk version. You’d:
1. Load the cityscape with “Load Image”.
2. Encode it with “VAE Encode”.
3. Feed this latent image into the “K Sampler”, using prompts like “cyberpunk, neon lights, futuristic”.
4. Set your denoising strength (more on this soon) to control how much the city changes.
Example 2:
You generated a fantasy portrait in a previous session but want to slightly alter the character’s expression. Copy and paste the previous result, load it into the workflow, and use a prompt like “smiling, gentle expression”, adjusting the denoising strength to keep most of the image intact while changing just the emotion.
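For readers who like to see the pipeline in code, here is a minimal sketch of the same load → encode → sample flow using the Hugging Face diffusers library, which exposes the same concepts as ComfyUI’s nodes. The model ID is the public SDXL base checkpoint; file names are placeholders:

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

# Load the base model (the "Load Checkpoint" step).
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load and resize the input image (the "Load Image" / "Upscale Image" steps).
city = load_image("cityscape.png").resize((1024, 1024))

# The pipeline encodes the image with the model's own VAE internally,
# then samples at the given denoising strength (the "K Sampler" step).
result = pipe(
    prompt="cyberpunk, neon lights, futuristic",
    negative_prompt="blurry, low resolution",
    image=city,
    strength=0.6,  # denoising strength: 0 = keep input, 1 = ignore input
).images[0]
result.save("cyberpunk_city.png")
```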
1.2 The Role of VAE Encode: Why It Matters
What is VAE Encode?
The “VAE Encode” node is critical because stable diffusion models process images in a latent space (a kind of compressed, learned representation), not as direct pixel grids. Encoding your image with VAE bridges the gap between the world of regular images and the world of diffusion models.
Best Practice: Always ensure the VAE you use for encoding matches your model’s VAE for decoding. Mismatched VAEs can lead to color shifts or artifacts.
Practical Application:
- If you want to use a hand-drawn sketch as a starting point, VAE Encode will convert it to the right format.
- When iterating on a generated image, VAE Encode lets you “re-enter” the diffusion process with your result as the new base.
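To make “entering latent mode” concrete, here is a sketch of the encode step on its own, using the AutoencoderKL class from diffusers; the checkpoint ID is the public SDXL base and the image path is a placeholder:

```python
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from torchvision import transforms

# Load the same VAE that ships with the base checkpoint (avoids VAE mismatches).
vae = AutoencoderKL.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", subfolder="vae"
).to("cuda")

# Pixels in [0, 255] -> tensor in [-1, 1], the range the VAE expects.
to_tensor = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),
])
pixels = to_tensor(load_image("sketch.png").resize((1024, 1024))).unsqueeze(0).to("cuda")

with torch.no_grad():
    latents = vae.encode(pixels).latent_dist.sample() * vae.config.scaling_factor

# 1024x1024 pixels become a 4-channel 128x128 latent (8x spatial compression).
print(latents.shape)  # torch.Size([1, 4, 128, 128])
```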
1.3 Denoising Strength: The Creative Lever
Definition: Denoising strength (set in the K Sampler node) controls how much the output image will deviate from the input image.
The Range:
- Low denoising strength (e.g., 0.1): Minor modifications. The input image is almost unchanged, only slight details may shift.
- High denoising strength (e.g., 1): The input image is nearly obliterated, turned into noise. Output is driven almost entirely by your prompt.
- Mid-range (e.g., 0.65): A balance. The input image is still recognizable, but the prompt has significant influence.
Analogy: Imagine tracing paper over your image.
- Low denoising = very transparent tracing paper. You see the original clearly, new details are subtle.
- High denoising = nearly opaque tracing paper. You’re drawing something new, barely referencing the original.
Example 1:
You have a photo of a dog and want to “cartoonize” it. Setting denoising to 0.2 will keep most facial features, just smoothing the style. At 0.8, the dog might become a different breed or even a new animal, guided more by your prompt than the original image.
Example 2:
You generated an image of a castle in a forest, but want to see it at night. A denoising strength of 0.5 with a prompt like “castle at night, illuminated windows” will keep the castle structure but shift lighting and mood.
Tips:
- To preserve the overall layout and composition, start with low to mid denoising strengths.
- For radical changes, push towards higher values, but don’t expect the output to closely resemble the input.
- Experiment! The sweet spot for “creative remixing” is typically around 0.6 to 0.7.
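A quick way to find your sweet spot is to sweep the strength while holding everything else fixed. A sketch, reusing the `pipe` and `city` image from the earlier diffusers example and a fixed seed so that only the strength changes between runs:

```python
import torch

for strength in (0.2, 0.5, 0.8):
    generator = torch.Generator("cuda").manual_seed(42)  # same seed for a fair comparison
    image = pipe(
        prompt="castle at night, illuminated windows",
        image=city,
        strength=strength,
        generator=generator,
    ).images[0]
    image.save(f"castle_strength_{strength}.png")
```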
1.4 Image Sizing: Avoiding Performance Pitfalls
Why Image Size Matters:
Stable diffusion models aren’t like Photoshop: they have strict requirements on image dimensions. Large images can cause memory errors, slow processing, or even crash your workflow.
Guidelines:
- For SDXL and similar models, keep images around 1024x1024 pixels.
- Image dimensions should be divisible by 64 (e.g., 512, 768, 1024). This fits the internal structure of most models.
Example 1:
You try loading a 4000x3000 pixel photo. The workflow grinds to a halt, or you get an out-of-memory error.
Example 2:
A 1024x768 image works smoothly, but a 1050x800 image gives odd results or warning messages.
Tip: Resize large images before loading, or use ComfyUI’s internal tools for scaling.
1.5 The Upscale Image Node: Resizing with Precision
What It Does: The “Upscale Image” node lets you adjust the size of your input image before it’s encoded by the VAE. This prevents memory errors and lets you match your base model’s preferred dimensions.
How to Use:
1. Place “Upscale Image” between “Load Image” and “VAE Encode”.
2. Set output width and height to match the model (e.g., 1024x1024 for SDXL).
3. Optional: Use cropping to adjust aspect ratio if needed.
Example 1:
You have a 1600x1200 pixel sketch. Scale it down to 1024x1024 (despite its name, the node resizes in both directions) so the model can process it efficiently.
Example 2:
You want to create a portrait. Crop a 1920x1080 image to 1024x1024, focusing on the subject’s face.
Tip: Upscale or downscale images inside the workflow to streamline your process and avoid back-and-forth with external editors.
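If you prefer to prepare images outside ComfyUI, the same crop-and-resize step can be scripted with Pillow. A minimal sketch, assuming a simple center crop (off-center subjects would need manual offsets):

```python
from PIL import Image

def prepare(path: str, target: int = 1024) -> Image.Image:
    """Center-crop to a square, then resize to a side length divisible by 64."""
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    target = (target // 64) * 64  # snap to a multiple of 64
    return img.resize((target, target), Image.LANCZOS)

ready = prepare("photo_1920x1080.png")  # -> 1024x1024, subject-centered
```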
1.6 Efficient Iteration: Copy, Paste, and Workflow Shortcuts
Speeding Up Your Process:
ComfyUI makes iteration fast. You can copy a generated image (right-click, “copy image”), and then paste it directly into your workflow (Ctrl+V). This auto-creates a “Load Image” node with that image loaded.
Why This Matters:
- Use a generated image as the next starting point without saving and reloading.
- Build chains of modifications, refining your image step by step.
Example 1:
You generate an artwork you like, but want to try a new color scheme. Copy the image, paste it, and use a new prompt (“vivid colors, sunset tones”).
Example 2:
You’re designing a product and want to show small variations. Copy and paste each result, changing the prompt slightly each time (“blue version”, “sleeker design”, etc.).
Tip: Use this shortcut for rapid prototyping. It’s much faster than exporting and re-importing images for each iteration.
Section 2: LoRA Models, Fine-Tuning Without the Overhead
What is LoRA?
LoRA stands for Low-Rank Adaptation. It’s a method to fine-tune large pre-trained models for specific tasks or styles without retraining the entire model. Instead, it updates only small, targeted parameters, making adaptation fast and resource-efficient.
When to Use LoRA:
- The base model doesn’t know about your specific subject (e.g., a new gadget, a celebrity, a rare art style).
- You want consistency in style or subject across multiple generations.
- You need to adapt to a specific workflow without spending hours (or days) retraining a giant model.
2.1 LoRA in Practice: Typical Use Cases
Example 1:
You want to generate images of a vintage camera model that the base diffusion model has never seen. Download a LoRA trained on that camera, and you can now prompt for it directly.
Example 2:
You’re producing illustrations in the style of a famous painter. A LoRA trained on that artist’s works lets you generate consistent, on-style images using the right trigger words.
Example 3:
You want to generate a series of images of the same fictional character with recurring features. A character-specific LoRA ensures the model “remembers” those features every time.
Example 4:
You need to create product images for a yet-to-be-released tech device. A LoRA trained on concept renders allows you to visualize it in any scenario.
2.2 Downloading and Installing LoRA Models
Where to Find LoRAs:
- CivitAI is the go-to resource. You’ll find thousands of LoRAs for people, styles, objects, and more.
How to Install:
1. Download the LoRA file (usually a .safetensors or .ckpt file).
2. Place it in your ComfyUI directory under models/loras.
3. Restart ComfyUI if it was running.
Tip: Always choose a LoRA compatible with your base model. SDXL LoRAs for SDXL models, v1.5 LoRAs for v1.5, etc. Mismatches can lead to errors or poor output.
Example 1:
You want an SDXL-compatible “steampunk” art style. Download an SDXL LoRA from CivitAI, and drop it into models/loras.
Example 2:
You found a character LoRA for SD 1.5 and are using an SDXL base model. It doesn’t work: models must match!
2.3 Integrating LoRA into a ComfyUI Workflow
Node Placement Is Key:
- Add a “Load LoRA” node.
- Place it between the “Load Checkpoint” (your base model) and the prompt encoding nodes (positive/negative prompts).
Why This Order?
The model is first adapted with the LoRA’s knowledge, then your prompts are interpreted. This ensures the adjustments from the LoRA are active when reading your text instructions.
Example 1:
- Your flow: Load Checkpoint → Load LoRA → Prompt Encoding → K Sampler...
Example 2:
If you put the LoRA after the prompts, the LoRA’s effect is not considered when interpreting your prompt. This reduces the impact or breaks the workflow.
Best Practice: Always connect the LoRA node before your prompt nodes.
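The diffusers equivalent of this node order is a single call that patches the model before any prompt is processed. A sketch using the steampunk example from earlier; the LoRA file name is a placeholder:

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Load the base model (the "Load Checkpoint" step).
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Patch the model with the LoRA BEFORE any prompt is processed
# (the "Load LoRA" step, placed right after "Load Checkpoint").
pipe.load_lora_weights("models/loras", weight_name="steampunk_sdxl.safetensors")  # placeholder file

image = pipe(prompt="a steampunk airship over a victorian city").images[0]
```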
2.4 Trigger Words: Activating LoRA Effects
What Are Trigger Words?
Most LoRA models require you to include specific words or phrases in your prompt. These trigger words “activate” the specialized knowledge or style embedded in the LoRA.
Why Are They Important?
If you forget the trigger word, the LoRA’s effect may not show up at all, or only partially.
Example 1:
You downloaded a LoRA for “cyber samurai” art style. The trigger word is “cybersamurai”. Use a prompt like “A cybersamurai in a neon-lit city” to see the effect.
Example 2:
You’re using a LoRA for a specific celebrity. If the trigger word is “JaneDoe”, your prompt must include “JaneDoe” for the model to generate that likeness.
Tip: Check the LoRA’s description on CivitAI or its download page for trigger words and recommended usage.
2.5 LoRA Strength: Calibrating the Influence
Strength Model Parameter:
This controls how much the LoRA modifies your base model. It’s a slider: set too low, and the effect is faint. Too high, and the image may become distorted or lose quality.
Best Practices:
- Start between 0.3 and 1.
- Going above 1.5 or 2 can degrade image quality or cause artifacts.
- Adjust until you find a balance between the base model and the LoRA’s effect.
Example 1:
You want a subtle watercolor style. Set LoRA strength to 0.4.
Output: Realistic image with gentle watercolor wash.
Example 2:
You want maximum effect from a cartoon LoRA. Set strength to 1.
Output: Strong stylization, but still recognizable subject.
Example 3:
You use 2.5 on a LoRA for “oil painting” style. Output is distorted, colors bleed, and quality drops.
Tip: Less is often more. Small increments yield big changes.
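In diffusers the same dial is exposed as a LoRA scale at inference time. A sketch, reusing the LoRA-patched pipe from the Section 2.3 example:

```python
# A scale of 0.4 blends the LoRA in subtly; 1.0 applies it at full strength.
subtle = pipe(
    prompt="portrait of a woman, soft watercolor wash",
    cross_attention_kwargs={"scale": 0.4},  # analogous to ComfyUI's strength_model
).images[0]

strong = pipe(
    prompt="portrait of a woman, soft watercolor wash",
    cross_attention_kwargs={"scale": 1.0},
).images[0]
```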
2.6 Practical LoRA Workflow Example
1. Download a LoRA for “vintage car” from CivitAI.
2. Place in models/loras.
3. In ComfyUI, connect:
- Load Checkpoint (SDXL) → Load LoRA (vintage car, strength 0.7) → Prompt Encoding (“A detailed painting of a [trigger word] in a city street”) → K Sampler...
4. Generate image. The car in the image now matches the specific model/style from the LoRA.
Advanced Tip: You can chain multiple LoRAs in sequence, but start simple. Too many can create unpredictable results.
Section 3: Combining IMG2IMG with LoRA, a Creative Powerhouse
The Synergy: By blending IMG2IMG workflows with LoRA, you can start with an existing image and fine-tune the creative transformation with both your prompt and a specialist LoRA. This lets you, for example, take a photo of a friend and “paint” them in the style of a manga artist, or update a product shot with the aesthetics of a new design trend.
Key Steps:
1. Load your image (“Load Image” node).
2. Resize if needed (“Upscale Image” node).
3. Encode to latent (“VAE Encode” node).
4. Load the base model (“Load Checkpoint” node).
5. Insert “Load LoRA” node.
6. Encode prompts (with trigger words).
7. Set denoising strength below 1 (e.g., 0.5–0.7) in the K Sampler.
Why Lower Denoising?
If denoising strength is 1, your input image is ignored; only the prompt (and LoRA) matter. Lower values blend the input image with the LoRA’s influence, producing a result that is both recognizable and stylized.
Example 1:
You have a photo of a mountain landscape. You want it in the style of a Japanese woodblock print (with a matching LoRA).
- Set denoising to 0.6.
- Prompt: “Japanese woodblock print, [trigger word], misty mountains”.
Result: The landscape’s composition is preserved, but the colors and lines reflect the woodblock style.
Example 2:
You want to update a product photo with a futuristic “techwear” vibe using a LoRA for that style.
- Set denoising to 0.4.
- Prompt: “Techwear [trigger word], sleek, high-contrast”.
Result: The product remains recognizable, but the style is transformed.
Tip: The lower the denoising, the more the original image shines through; the higher, the more the LoRA and prompt take over. Find the balance that matches your intent.
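Put together, the whole recipe fits in a short script. A diffusers sketch of Example 1; the LoRA file name and trigger word are placeholders you would replace with values from the model’s download page:

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("models/loras", weight_name="woodblock_style.safetensors")  # placeholder file

landscape = load_image("mountains.png").resize((1024, 1024))
image = pipe(
    prompt="Japanese woodblock print, woodblockstyle, misty mountains",  # placeholder trigger word
    image=landscape,
    strength=0.6,                           # keep the composition, restyle the surface
    cross_attention_kwargs={"scale": 0.8},  # LoRA influence
).images[0]
image.save("woodblock_mountains.png")
```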
Section 4: Troubleshooting and Best Practices
Common Issues and Solutions:
1. Memory Errors or Slow Generation
Cause: Input image is too large for your GPU’s memory.
Solution: Use the “Upscale Image” node to resize to 1024x1024 or less, and make sure both width and height are divisible by 64. (A scripted memory-saving sketch follows this list.)
2. Unexpected Colors or Artifacts
Cause: VAE mismatch or too high LoRA strength.
Solution: Ensure VAE for encode/decode matches the base model. Reduce LoRA strength if needed.
3. LoRA Not Working
Cause: Missing trigger word in prompt, model incompatibility, or incorrect node order.
Solution: Double-check trigger word(s), confirm LoRA is for your model version, place LoRA node between checkpoint and prompt nodes.
4. Image Output Doesn’t Match Input Enough (or Too Much)
Cause: Denoising strength set too high (or too low).
Solution: Adjust denoising strength in the K Sampler. Higher = more change, lower = less.
5. Workflow Not Updating as Expected
Cause: Node bypassed, or not re-executed after change.
Solution: Check if any node is bypassed, and re-run the workflow after making edits.
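The memory-saving sketch referenced in issue 1: ComfyUI manages VRAM automatically, but if you script with diffusers, these toggles trade speed for headroom:

```python
# Compute attention in slices to lower peak VRAM (slower, but avoids out-of-memory errors).
pipe.enable_attention_slicing()

# Keep idle submodules in system RAM and move them to the GPU only when needed.
# Call this INSTEAD of pipe.to("cuda"); it requires the accelerate package.
pipe.enable_model_cpu_offload()
```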
Pro Tips:
- Document your favorite denoising and LoRA strength settings for each use case.
- Name your nodes for clarity, especially in complex workflows.
- Use the copy/paste image shortcut for fast iteration and creative exploration.
Section 5: Practical Applications and Creative Experiments
1. Character Design:
Start with a sketch or previous generation, use a character LoRA with appropriate trigger word, and guide the output to a new pose or costume via prompt and denoising strength.
2. Style Transfer:
Take a landscape photo, apply a LoRA for an impressionist or abstract art style, and adjust denoising to blend structure with style.
3. Product Visualization:
Load a prototype image, resize it, and use a LoRA for a specific material or lighting effect (“matte finish”, “studio lighting”). Test with different strengths for subtle or bold effects.
4. Art Restoration or Alteration:
Use low denoising to gently “repair” damaged art scans, or higher settings to reimagine them in new styles.
5. Iterative Exploration:
Generate a base image, copy and paste it, try different LoRAs or prompts, and build a series. This is powerful for storyboarding or concept art.
Section 6: Deepening Your Mastery, Glossary and Key Concepts
Basic Text-to-Image Workflow: Start with empty latent space, generate image from prompt.
Image-to-Image Workflow: Use a user-provided image as the start point, modify with prompt and settings.
SDXL: Stable Diffusion XL, a powerful diffusion model.
Latent Image: Compressed, internal image format used by diffusion models.
Pixel Mode: Standard image format (pixels).
VAE Encode/Decode: Convert between pixel and latent formats.
K Sampler: Generates images, applies denoising.
Denoising Strength: Controls how much the image is changed.
Upscale Image Node: Resizes input images within workflow.
Load Image Node: Loads external image.
Load Checkpoint Node: Loads base model.
LoRA (Low-Rank Adaptation): Efficiently fine-tunes models for specific tasks/styles.
CivitAI: Resource for LoRA downloads.
Trigger Words: Required words to activate LoRA effect.
Strength Model (LoRA Strength): Controls LoRA’s influence.
Positive/Negative Prompts: Describe what to include or avoid in the image.
Seed: Controls randomness.
Bypass Option: Temporarily disables a node.
Section 7: Frequently Asked Questions and Quick Reference
Q: What’s the main difference between text-to-image and image-to-image in ComfyUI?
A: Text-to-image starts from scratch (empty latent), while image-to-image uses your chosen image as the foundation, letting you blend its content with new prompts.
Q: Why do I need to use “VAE Encode” when loading an image?
A: Because stable diffusion models work in latent space, not pixel space. VAE Encode converts your image to the right format.
Q: What happens if I load an image that’s too large?
A: You risk running out of memory, slow generation, or distorted outputs. Always resize to model-appropriate dimensions.
Q: How do I quickly use a generated image as the next input?
A: Right-click to copy the image, then Ctrl+V to paste it; ComfyUI will create a new “Load Image” node for you.
Q: Where do I put downloaded LoRA models?
A: Place them in comfyui/models/loras.
Q: Do I always need to include trigger words?
A: Usually. Most LoRAs define trigger words; without them, the effect may not show up or may only appear partially.
Q: What’s a good starting value for LoRA strength?
A: Between 0.3 and 1. Adjust as needed for more or less effect.
Q: Can I combine LoRA with IMG2IMG?
A: Absolutely. Just remember to lower denoising strength below 1 to keep the input image’s influence.
Conclusion: Take Control, From Experimentation to Creative Direction
By now you’ve learned how to move beyond basic prompting in ComfyUI. You understand the architecture of both IMG2IMG and LoRA workflows, how to merge them, and how to fine-tune every lever, from denoising strength to LoRA parameters, to get the results you want.
The magic of this approach is in the balance: you can remix existing images, imbue them with new styles or subjects, and iterate rapidly, all without losing control or falling into randomness. This isn’t just about making pretty pictures; it’s about creative ownership.
Apply what you’ve learned. Experiment with subtle and bold changes. Try workflows for character design, product visualization, or art style transfer. Remember to document your favorite settings and combinations; they’ll become your creative toolkit.
Master these skills, and you move from asking AI for images to collaborating with it, directing the outcome with intention and precision.
Keep pushing, keep creating, and let ComfyUI become the paintbrush in your digital studio.
Frequently Asked Questions
This FAQ section is designed to provide clear, practical answers for anyone learning to use ComfyUI’s image-to-image workflow and LoRA models. Whether you’re just starting out or looking to optimize your creative process, you’ll find guidance on everything from essential nodes and settings to real-world applications and troubleshooting tips.
What is image-to-image workflow in ComfyUI?
The image-to-image workflow in ComfyUI lets you use an existing image as a foundation for generating a new image.
Rather than starting from scratch, you upload an image and convert it into a format stable diffusion models can work with (latent mode) using a VAE Encode node. This opens up creative options like making subtle edits, style transfers, or targeted changes, while still leveraging the strengths of generative AI.
How does denoising strength affect the output in image-to-image?
Denoising strength controls how much the output image changes compared to the input.
A low value (e.g., 0.1) means only minor tweaks, so the output closely matches the original. A high value (e.g., 1) means the input image is heavily altered, allowing the prompt to drive the result much more. Think of it like tracing paper: low denoising is transparent (original shows through), high denoising is opaque (prompt takes over).
What is a common starting value for denoising strength and how do you adjust it?
A typical starting value for denoising strength is 0.6.
This strikes a balance: your output will be noticeably different but still influenced by the original image. Adjust downward (toward 0) to keep closer to the input, or upward (toward 1) to give your prompt more creative control. Experiment to find the sweet spot for your use case.
Why is image size important in ComfyUI and how can you manage it?
Stable Diffusion models, especially SDXL, work best with images close to certain sizes (e.g., 1024 pixels, divisible by 64).
Large images can cause out-of-memory errors, slow down processing, or produce poor results. You can manage image size by resizing before uploading or using the “upscale image” node within ComfyUI to set precise dimensions. Keeping images at optimal sizes helps workflows run efficiently and delivers better outputs.
What is a LoRA in the context of Stable Diffusion?
LoRA (Low Rank Adaptation) is a technique for fine-tuning large AI models with minimal resources.
It updates only targeted parts of the model instead of retraining everything. In practice, LoRAs let you quickly adapt Stable Diffusion for custom tasks (capturing a specific person’s likeness, a trademark style, or a unique object) without the overhead of full model training.
Where can you download LoRA models and how do you install them in ComfyUI?
LoRA models are available on websites such as CivitAI.
Download your chosen LoRA and place the file in the “loras” folder inside your ComfyUI models directory (typically comfyui/models/loras). If ComfyUI is open, refresh it to detect the new LoRA. Now it will appear as a selectable option in your workflow.
How do you add and use a LoRA node in a ComfyUI workflow?
Add a “Load LoRA” node between your base model (Load Checkpoint) and the prompt encoding nodes.
Connect the model and CLIP outputs from the base model to the LoRA node, then route the LoRA’s outputs to the K Sampler and prompt encoders. In the “Load LoRA” node, select your downloaded LoRA and include any trigger words in your positive prompt. This setup ensures your LoRA’s influence is applied before prompts are processed.
What does "strength model" mean in the context of using a LoRA?
“Strength model” controls how strongly the LoRA affects your base model.
Higher values make the LoRA’s influence more visible, while lower values blend it more subtly. Most LoRAs work best between 0.3 and 1. Too high (e.g., 2) can distort the image. Adjust this setting to fine-tune the look and behavior for your project.
What is the main difference between text-to-image and image-to-image workflows in ComfyUI?
Text-to-image workflows start with a blank latent canvas, while image-to-image workflows start with your own image.
In image-to-image, you have more control over the starting point, making it ideal for edits, retouching, or style changes based on an existing visual foundation.
What is the purpose of the VAE Encode node in an image-to-image workflow?
The VAE Encode node transforms your uploaded image from standard pixel format to latent space.
Stable diffusion models generate images in latent space, so this node is essential for making your image usable for further processing. Without it, the model can’t interpret the image correctly.
What can happen if you load an image that is too large into ComfyUI?
Loading oversized images can cause out-of-memory errors, slow down your workflow, and lead to poor-quality outputs.
If you run into these issues, reduce the image size before uploading or use the Upscale Image node to set dimensions that are manageable for your hardware.
How does the Upscale Image node help when working with large images in ComfyUI?
The Upscale Image node lets you resize images within your workflow to fit the model’s preferred dimensions.
This is useful not just for making large images smaller, but also for standardizing output size or preparing images for further edits. For example, you might upscale a 512x512 image to 1024x1024 for higher detail, or downscale a photo to prevent memory errors.
What is the keyboard shortcut for pasting an image into ComfyUI and what does it do?
Use Control+V to paste an image directly into ComfyUI.
This automatically creates a Load Image node with your pasted image, speeding up your workflow. It’s especially useful when you’re working with assets copied from other applications or the web.
Where are downloaded LoRA models typically saved in the ComfyUI folder structure?
LoRA models should be placed in comfyui/models/loras.
This is the standard location for ComfyUI to detect and display LoRA options in the workflow interface.
Why is a LoRA node placed between the Load Checkpoint and prompt encoding nodes?
Placing the LoRA node here ensures the base model is adapted before prompts are interpreted.
This means the LoRA’s changes are factored into how text prompts are processed, giving you more predictable and coherent results when combining custom LoRAs with your own prompts.
What are "trigger words" in the context of LoRA models, and why are they important?
Trigger words are specific keywords required in your prompt to activate a LoRA’s intended effect.
If a LoRA is trained to produce a certain style or character, including its trigger word (as specified by the creator) signals the model to apply those changes. For example, a LoRA trained on “cyberpunk” art might need you to include “cyberpunk” in your positive prompt.
How do you convert a text-to-image workflow into an image-to-image workflow in ComfyUI?
Replace the empty latent image node with a Load Image node followed by a VAE Encode node.
This allows you to input your own image, convert it to latent space, and feed it into the rest of your workflow. The rest of the nodes (prompt encoders, samplers, etc.) stay the same.
How does denoising strength balance the input image and the prompt in image generation?
Denoising strength acts as a slider between preserving the original image and letting the prompt reshape the output.
Low values keep the result close to your input, great for subtle edits. High values give creative control to your prompt, useful for more radical transformations or style transfers.
What benefits do LoRA models offer compared to base stable diffusion models?
LoRA models allow for quick, efficient customization of large models for specific tasks.
They’re lightweight, easy to share, and don’t require full retraining. Use cases include creating branded visuals, emulating a favorite artist’s style, or generating consistent product imagery: tasks that would be cumbersome or impossible with a base model alone.
What are the steps for downloading, installing, and using a LoRA model in ComfyUI?
1. Download a LoRA from a site like CivitAI and save it in comfyui/models/loras.
2. Refresh ComfyUI if open.
3. Add a Load LoRA node to your workflow, connect it as described above, and select the new LoRA from the dropdown.
4. Add trigger words to your prompt as needed.
This process allows you to quickly experiment with new creative directions in your workflow.
What issues can arise from processing images of incorrect sizes in ComfyUI?
Using images that are too large or with odd dimensions can cause memory errors, slowdowns, or even visual artifacts.
To avoid this, stick to recommended sizes (divisible by 64, usually around 1024px for SDXL), and resize images before upload or during the workflow using the Upscale Image node.
How do you troubleshoot if your image-to-image output looks distorted or unexpected?
Check image dimensions, denoising strength, and ensure all nodes are properly connected.
Distortion often comes from using an input image with an unsupported size or an excessively high denoising value. Try resizing your input, lowering denoising, and confirming your workflow setup matches best practices.
How can business professionals leverage image-to-image workflows in ComfyUI?
Use image-to-image workflows for creative content generation, rapid prototyping, and brand consistency.
For example, marketing teams can refresh product photos, apply new visual styles, or create campaign variations without starting from scratch, saving time and maintaining visual coherence.
Can you combine multiple LoRA models in a single ComfyUI workflow?
Yes, you can chain multiple Load LoRA nodes to blend different LoRA effects.
Each LoRA node can adjust the base model further. Use the strength setting to balance their influence. For example, combine a style LoRA and a character LoRA to generate images in a specific style featuring a unique person.
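In diffusers, chaining LoRAs looks like the following sketch (it requires the PEFT backend; file and adapter names are placeholders):

```python
pipe.load_lora_weights("models/loras", weight_name="style.safetensors", adapter_name="style")
pipe.load_lora_weights("models/loras", weight_name="character.safetensors", adapter_name="character")

# Balance the two influences, just like two strength_model sliders in ComfyUI.
pipe.set_adapters(["style", "character"], adapter_weights=[0.7, 0.8])
```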
What is the difference between positive and negative prompts in ComfyUI?
Positive prompts describe what you want in the generated image; negative prompts describe what you want to exclude.
For instance, you might use “professional office, smiling team” as a positive prompt, and “blurry, low resolution” as a negative prompt to improve quality.
How does seed value affect image generation in ComfyUI?
The seed sets the starting point for randomization, making results reproducible.
Using the same seed and settings will generate the same output each time. This is helpful for comparing changes or sharing results with collaborators.
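For scripted generation, reproducibility works the same way: fix the seed and every run produces an identical image. A diffusers sketch, reusing a loaded `pipe`:

```python
import torch

gen = torch.Generator(device="cuda").manual_seed(42)
image_a = pipe(prompt="red vintage car").images[0] if False else pipe(
    prompt="red vintage car", generator=gen
).images[0]

gen = torch.Generator(device="cuda").manual_seed(42)  # same seed, same settings
image_b = pipe(prompt="red vintage car", generator=gen).images[0]
# image_a and image_b are pixel-identical
```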
What should you do if your generated image is too similar or too different from the input?
Adjust the denoising strength in the K Sampler node.
Increase it if you want more change, decrease it for subtler edits. You can also tweak your prompt for more targeted control over the output.
How do you know which LoRA models are compatible with your base model?
Check the LoRA model’s documentation or listing for its base model (e.g., SDXL, v1.5).
Using an incompatible LoRA can cause errors or unexpected results. Always match your base model to the LoRA’s intended version.
What is the purpose of the VAE Decode node in ComfyUI?
The VAE Decode node converts images from latent space back to pixel format for viewing or saving.
It’s the final step in most workflows, ensuring your generated image is accessible and ready for export.
Can you use image-to-image workflows for photo restoration or enhancement?
Yes, image-to-image workflows are useful for tasks like upscaling, colorizing, or removing noise from old photos.
By carefully setting the denoising strength and prompts, you can restore or enhance images while preserving key details.
What are common mistakes when using LoRA models in ComfyUI?
Common mistakes include forgetting trigger words, using the wrong base model, or setting the strength too high.
These can lead to the LoRA having no effect, errors, or poor-quality images. Always read the LoRA documentation and start with moderate strength values.
How do you temporarily disable a node in ComfyUI?
Use the “bypass” option on a node to deactivate it without deleting it from your workflow.
This is helpful for quickly testing different workflow configurations or isolating issues.
Can you use ComfyUI on lower-end hardware for image-to-image generation?
Yes, but you may need to reduce image sizes and batch counts to avoid memory errors.
Stick to smaller resolutions and optimize your workflow to get the most out of available hardware. The Upscale Image node can help you adjust image size as needed.
How do you find the optimal denoising strength for your project?
Start at 0.6 and adjust incrementally based on the results you see.
Preview a few outputs at different strengths, and choose the value that best balances input fidelity with the creative changes you want from your prompt.
How do you save and share your workflows in ComfyUI?
Export your workflow as a JSON file from the interface.
This allows you to share your setup with others or keep versioned backups. Recipients can load your workflow and reproduce your results, provided they have the same models and assets.
Can you use image-to-image and LoRA workflows for commercial projects?
Yes, provided you have the appropriate rights for your input images and LoRA models.
Always check licensing on LoRA models and ensure your final outputs comply with copyright, trademark, and model use policies.
How can you improve the quality of outputs when using low-resolution input images?
Use the Upscale Image node to increase resolution before processing, or apply denoising judiciously to enhance details.
For best results, combine upscaling with prompt engineering to guide the model toward cleaner, sharper outputs.
What should you do if a LoRA doesn’t seem to affect your output?
Check if you’ve included the correct trigger words and matched the LoRA to the right base model.
Also, verify the strength value isn’t set too low and that the LoRA node is properly connected in your workflow.
What are some practical examples of using image-to-image in ComfyUI?
Practical examples include updating product shots, changing backgrounds, converting photos to illustrations, and applying brand-specific color grading.
For instance, a retailer might use image-to-image to generate seasonal ad variations from a single photoshoot.
How can you ensure consistent outputs when generating multiple images?
Use fixed prompts, seed values, and image sizes across your workflow.
For batch generation, keep your setup as consistent as possible. This is especially useful for business scenarios like catalog generation or campaign asset creation.
Certification
About the Certification
Move beyond simple prompts: learn how to blend existing images with custom styles using ComfyUI’s IMG2IMG and LoRA tools. Gain practical skills to control, remix, and iterate your AI art with intention, precision, and creative flair.
Official Certification
Upon successful completion of the "ComfyUI Course: Ep04 - IMG2IMG and LoRA Basics", you will receive a verifiable digital certificate. This certificate demonstrates your expertise in the subject matter covered in this course.
Benefits of Certification
- Enhance your professional credibility and stand out in the job market.
- Validate your skills and knowledge in a high-demand area of AI.
- Unlock new career opportunities in AI.
- Share your achievement on your resume, LinkedIn, and other professional platforms.
How to complete your certification successfully?
To earn your certification, you’ll need to complete all video lessons, study the guide carefully, and review the FAQ. After that, you’ll be prepared to pass the certification requirements.
Join 20,000+ Professionals Using AI to Transform Their Careers
Join professionals who didn’t just adapt but thrived. You can too, with AI training designed for your job.