Ep 9: Upscaling

9 min

The problem

Most open-source models produce their best results at around 1024x1024. That's fine for iteration, but far from the sharpness and detail you get at 4K. For professional work, that level of quality matters.

There are many ways to upscale in ComfyUI. We'll cover two. The first is fast and lightweight. The second uses a diffusion model to add genuine detail during the upscale, producing significantly better results.

Method 1: Upscale model (fast)

This uses a small, specialized model to enlarge your image. It's not a diffusion model: it doesn't generate new content or interpret prompts. Its only job is to take an existing image and make it larger while keeping it as sharp as possible, using a mathematical algorithm. Fast, minimal system resources, and good for quick exports.

Build it:

1. Load Image node. Drop in the image you want to upscale.
2. Load Upscale Model node. You'll see a list of models. Their names hint at what they're best at: some are tuned for anime, others for faces or photographs. For general use, go with 4x Nomos or 4x UltraSharp. The 4x/8x prefix is how much the model multiplies the resolution.
3. Upscale Image by Model node. Connect Load Image to the image input and Load Upscale Model to the model input.
4. Save Image node. Connect the output.

Click Run. Done in seconds. The results won't match what a diffusion model can do, but for a fast, lightweight upscale, it's very effective. You can also combine this with Method 2.

Method 2: UltimateSDUpscale (diffusion-based)

This is the best upscale method in ComfyUI for quality. It combines a traditional upscale model with a diffusion model. The upscale model enlarges the image, and then the diffusion model refines it, adding detail and clarity that the simple method can't achieve.

The key: UltimateSDUpscale works in tiles. Generating a full 4K image in one pass would require enormous VRAM, so instead of processing the entire image at once, it divides it into a grid of smaller
squares and processes each tile individually. Each tile gets refined by the diffusion model, then they're stitched back together. This means you can produce very large images even on a GPU with limited VRAM.

Build it step by step:

1. UltimateSDUpscale node (double-click the canvas and search for it). This is the core. It works like a KSampler with additional controls for tiling and upscaling.
2. Load Image node. Connect its output to UltimateSDUpscale's image input.
3. Load Diffusion Model node. Pick any model: Z-Image Turbo, Stable Diffusion, Flux, Qwen. Connect the model output to UltimateSDUpscale's model input.
4. Load CLIP + CLIP Text Encode for the positive prompt. Connect the CLIP model to the text encode node, then connect text encode to the positive input. For the prompt, leave it empty or describe quality/style only ("high quality photograph, sharp detail").
5. Conditioning Zero Out node for the negative. Connect from your positive conditioning through this node to the negative input. This gives a blank negative.
6. Load VAE node. Connect to UltimateSDUpscale's VAE input.
7. Load Upscale Model node (same as Method 1). Connect to the upscale model input.
8. Save Image node. Connect UltimateSDUpscale's output to it.

Don't describe objects in the prompt. Since it processes in tiles, writing "horse" makes it try to generate a horse in every tile. Describe quality and style instead, or leave the prompt empty entirely.

Key settings on the UltimateSDUpscale node

| Setting | What to use | Why |
| --- | --- | --- |
| Upscale by | 1.5 to 3x | Higher takes much longer, even with tiling. A value of 2 doubles dimensions (1024 becomes 2048). |
| Steps, CFG, sampler | Match your diffusion model | Same as KSampler. For Z-Image Turbo: CFG 1, Euler. |
| Denoise | 0.15-0.25 | The most critical setting. See below. |
| Tile size | 1024x1024 | Match your model. Reduce tile size if you're hitting memory limits. |
| Tile padding/overlap | Leave at defaults | Only adjust if you're experimenting with higher denoise. |

Why denoise matters most

Denoise controls how much the diffusion model is allowed to change
each tile. At low values (0.15-0.25), the model makes subtle refinements: it sharpens textures, adds some detail, and keeps the overall image consistent. Tiles blend together well because no single tile changes drastically.

At higher values (0.4+), the model makes significant changes. Individual tiles can look impressive, but neighboring tiles may not match. That's when seams appear: visible lines or inconsistencies at tile borders.

There's a seam fix option in the node settings, but results are inconsistent. Better to keep denoise low enough that seams don't appear in the first place.

Start at 0.2 for your first run. Compare the result with Method 1. You'll see the difference immediately: textures are crisper, edges are cleaner, and the image has detail that the basic upscaler can't produce.

Other approaches

The two methods above cover most upscaling needs, but they're not the only options.

Wan and SeedVR are AI models that upscale using different architectures. LTX also has an upscaler model. These use video generation architectures, which means they understand motion and temporal coherence. That makes them especially effective at upscaling video without introducing flickering or frame-to-frame inconsistencies. They work for image upscaling too.

Both can be integrated into ComfyUI workflows. They're more resource-intensive than the methods above. For most day-to-day work, UltimateSDUpscale will serve you well.

FAQ

What is the best upscale method in ComfyUI?
For quality, UltimateSDUpscale (Method 2). It uses a diffusion model to add genuine detail during the upscale. For speed, the simple upscale model method (Method 1) gets the job done in seconds. Most people use Method 2 for final output and Method 1 for quick previews.

What is the best upscale model for ComfyUI?
For general-purpose upscaling, 4x Nomos and 4x UltraSharp are the most reliable. The number (4x, 8x) tells you the default resolution multiplier. Some models are specialized for anime, faces, or photographs. Experiment with a few on your specific content to see which fits best.

Why do I see lines or seams in my upscaled image?
Denoise too high. Each tile is processed independently by the diffusion model. When the AI changes too much, neighboring tiles don't match at the edges. Keep denoise at 0.15-0.25.

How big can I go?
With tiling, very large. A 1024px image at 4x becomes 4096px (roughly 4K). Going beyond 3x takes a long time even with tiling. Stay at 1.5-3x for practical use.

What's the difference between an upscale model and a diffusion model for upscaling?
An upscale model uses math to make an image bigger and sharper. A diffusion model generates new detail. UltimateSDUpscale uses both: the upscale model enlarges, then the diffusion model refines each tile. That combination is what produces results that hold up at full zoom.
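
Sizing it up

To make the sizing arithmetic from this episode concrete (upscale factor, output resolution, tile size), here is a minimal Python sketch. It is an illustration only: the function name `upscale_plan` is made up, and the tile grid is assumed to be the output size divided by the tile size, rounded up. The node's real tiling also involves padding and overlap, which this ignores.

```python
import math


def upscale_plan(width, height, upscale_by, tile_size=1024):
    """Estimate output size and tile count for a tiled upscale.

    Assumption (not taken from the node's source): the grid is
    ceil(output_size / tile_size) in each direction, no padding.
    """
    out_w = round(width * upscale_by)
    out_h = round(height * upscale_by)
    cols = math.ceil(out_w / tile_size)
    rows = math.ceil(out_h / tile_size)
    return {
        "width": out_w,
        "height": out_h,
        "cols": cols,
        "rows": rows,
        "tiles": cols * rows,
    }


if __name__ == "__main__":
    # Compare the factors discussed above for a 1024x1024 source image.
    for factor in (1.5, 2, 3, 4):
        plan = upscale_plan(1024, 1024, factor)
        print(f"{factor}x: {plan['width']}x{plan['height']}, {plan['tiles']} tiles")
```

Under these assumptions, a 1024px image at 2x becomes 2048px in a 2x2 grid (4 tiles), while 4x becomes 4096px in a 4x4 grid (16 tiles). The tile count grows with the square of the factor, which is one way to see why going beyond 3x takes so much longer even with tiling.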