Hidden in Plain Sight: Using QR Code Monster ControlNet to Embed Subliminal Text in AI Art
ComfyUI: Hiding words in your images + fixing faces. QR Code ControlNet Pareidolia Hybrid Image
The tool says QR codes. You're going to ignore that completely... and make something way more interesting.
Scott Detweiler walks us through a ComfyUI workflow that takes the QR Code Monster ControlNet... and repurposes it for something it wasn't designed for. Instead of generating scannable codes, he's driving the word "SHOP" into a street photography scene. The letters show up in the folds of clothing, the angles of shopping bags, the geometry of the background. Subtle. Compositional. A little "They Live" if we're being honest.
And that's where this gets fun.
The Core Trick: ControlNet as Compositional Driver
The setup is deceptively simple. Load a Stable Diffusion 2.0 checkpoint. Build your standard text-to-image pipeline... positive prompt, negative prompt, VAE, KSampler. Nothing fancy yet.
Then you introduce the ControlNet. Specifically, the QR Code Monster ControlNet model. But instead of feeding it a QR code, you hand it a 512x512 PNG with black text on a white background. The word "SHOP." Or "CATS." Or whatever you want buried in the composition.
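That conditioning image is trivial to generate yourself. Here's a minimal sketch using Pillow (the font file name is an assumption; any bold TrueType font will do):

```python
from PIL import Image, ImageDraw, ImageFont

def make_text_conditioning_image(word="SHOP", size=512, path="shop.png"):
    # White canvas, black text: the ControlNet treats dark regions
    # as the structure it should push into the composition.
    img = Image.new("RGB", (size, size), "white")
    draw = ImageDraw.Draw(img)
    try:
        font = ImageFont.truetype("DejaVuSans-Bold.ttf", size // 4)
    except OSError:
        font = ImageFont.load_default()  # fallback if that font isn't installed
    # Center the word using its rendered bounding box
    left, top, right, bottom = draw.textbbox((0, 0), word, font=font)
    x = (size - (right - left)) / 2 - left
    y = (size - (bottom - top)) / 2 - top
    draw.text((x, y), word, fill="black", font=font)
    img.save(path)
    return img
```

Drop the saved PNG into ComfyUI's input folder and feed it to a Load Image node.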
Here's the part worth understanding: ControlNet operates at the conditioning level. It plugs into the pipeline between your text encoding and the sampler... before pixels even exist. It's not post-processing. It's not a filter. It's shaping the mathematical space the model navigates to generate the image. That's why the text doesn't look stamped on. It looks woven in.
Scott bumps the ControlNet strength to about 1.4 to make the word more visible. Fair warning... push that number too high and you'll start destroying the image. There's a sweet spot where the text is present but not screaming. Finding it takes experimentation.
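To make the wiring concrete, here's a fragment of a ComfyUI API-format graph showing where the ControlNet sits. The node class names (`ControlNetLoader`, `LoadImage`, `ControlNetApply`) are ComfyUI built-ins; the node ids, the checkpoint filename, and the assumption that node "6" is the positive `CLIPTextEncode` are placeholders for illustration:

```python
# Fragment of a ComfyUI API-format graph: dicts keyed by node id,
# with inputs wired as [source_node_id, output_slot].
graph = {
    "10": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "qrCodeMonster_v10.safetensors"}},
    "11": {"class_type": "LoadImage",
           "inputs": {"image": "shop.png"}},
    "12": {"class_type": "ControlNetApply",
           "inputs": {
               # Takes the POSITIVE conditioning (node "6" here, assumed to be
               # CLIPTextEncode), not pixels: the control signal is injected
               # into conditioning before the sampler ever runs.
               "conditioning": ["6", 0],
               "control_net": ["10", 0],
               "image": ["11", 0],
               "strength": 1.4,  # ~1.0 is subtle; much past 1.4 starts wrecking the scene
           }},
    # The KSampler's "positive" input then reads from node "12" instead of "6".
}
```

The key detail is that last comment: the sampler consumes the ControlNet-modified conditioning, which is why the text emerges from the composition instead of sitting on top of it.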
The Face Problem (and the Elegant Fix)
Here's the trade-off. The QR Code Monster ControlNet only works with Stable Diffusion 2.0 models. And when you're generating a scene with multiple people at 768 resolution, faces become a tiny fraction of the total pixel count. They get wonky. Distorted. The kind of uncanny valley stuff that pulls you right out of the image.
Scott's solution is practical and worth stealing for other workflows: chain a second model.
Once the initial image is decoded from latent space to pixels, it's model-agnostic. Those pixels don't care what generated them. So Scott loads an SDXL checkpoint... specifically the refiner, which handles faces better than the base model... and routes the image into a completely separate face-fixing pipeline.
The tool that makes this work is the FaceDetailer node from the Impact Pack. It uses Ultralytics bounding-box detection to find faces in the scene and SAM models to segment them. Then it inpaints each face using the SDXL refiner with about 35 steps and 50% denoise. Automated. No manual masking. No Photoshop.
Scott also enables forced inpainting so even tiny faces don't get skipped. Smart move. The whole point is that these are small faces in a complex scene... exactly the ones that need the most help.
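In graph form, the settings described above boil down to a handful of FaceDetailer inputs. This is a trimmed sketch, not the node's full input list, and the detector/SAM filenames are assumed examples of the models the Impact Pack typically ships with:

```python
# Key FaceDetailer (Impact Pack) inputs for this workflow -- trimmed sketch.
facedetailer_inputs = {
    "bbox_detector": "bbox/face_yolov8m.pt",   # Ultralytics face detector (assumed filename)
    "sam_model_opt": "sam_vit_b_01ec64.pth",   # SAM tightens each bbox into a segment mask
    "steps": 35,                               # refiner sampling steps per face
    "denoise": 0.5,                            # 50%: enough to rebuild features,
                                               # low enough to keep pose and lighting
    "force_inpaint": True,                     # process even tiny detections
                                               # instead of skipping them
}
```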
Stacking Passes for Iterative Quality
One detail worth highlighting: you can chain multiple FaceDetailer passes together. First pass roughs it in. Second pass refines. There's no rule that says one pass is enough. This is ComfyUI... the entire philosophy is "build the workflow that serves the output."
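The chaining itself is just image-out to image-in. A sketch of the wiring (node ids are placeholders, the second pass's lower denoise is an assumed value, and most required FaceDetailer inputs are omitted for clarity):

```python
# Two FaceDetailer nodes chained in a ComfyUI API-format graph.
# Pass 2's "image" input reads pass 1's output slot 0, not the original render.
chained = {
    "20": {"class_type": "FaceDetailer",
           "inputs": {"image": ["8", 0],    # "8" = VAEDecode of the initial render (assumed id)
                      "denoise": 0.5}},     # pass 1: rough the faces in
    "21": {"class_type": "FaceDetailer",
           "inputs": {"image": ["20", 0],   # chained from pass 1
                      "denoise": 0.35}},    # pass 2: gentler refinement (assumed value)
}
```

Dropping the denoise on later passes is the usual pattern: each pass has less work to do, so it gets less license to repaint.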
That principle extends beyond face fixing. You could swap the empty latent for an image-to-image input. You could try different 2.0 checkpoints for different aesthetic results. You could experiment with different text images... logos, symbols, abstract shapes. The QR Code Monster doesn't know it's supposed to be making QR codes. It just knows it has a conditioning image and a job to do.
Why This Matters Beyond the Trick
The subliminal text thing is cool. Genuinely. But the deeper lesson here is about combining specialized tools for their strengths.
Stable Diffusion 2.0 gives you access to the QR Code Monster ControlNet. SDXL gives you better faces. The Impact Pack gives you automated detection and inpainting. None of these tools alone solves the whole problem. Together, they build something none of them could produce independently.
That's the real workflow wisdom. Not "find the one perfect model." Instead... know what each tool does well, and build a pipeline that lets every tool do its best work.
BAM... that's how you get AI-generated street scenes with hidden messages AND decent faces.
Save this workflow once. Tweak it forever. The QR Code Monster ControlNet is one of those tools that rewards creative misuse... and chaining it with SDXL face detailing turns a clever trick into a genuinely useful pipeline. Try different text images. Push the strength slider around. Break it a few times. That's where the real learning lives. 💙
--- Source: https://www.youtube.com/watch?v=6vc_a4aS19A