Download Pack
This pack contains 136 VJ loops (122 GB)
https://www.patreon.com/posts/82333625
Behind the Scenes
Graffiti is alive. I've been dreaming up some techniques to create an enhanced version of my graffiti animations using my more mature understanding of StyleGAN2. But to do that I needed a larger dataset of graffiti images so that I wouldn't keep running into overfitting issues. At first I tried just using the NMKD Stable Diffusion app by itself, but crafting the perfect text prompt to always output a perpendicular, cropped perspective of graffiti on a wall, without grass or concrete edges visible... that proved too difficult even with the 50% consistency I was aiming for as a baseline. And seeing as how I manually cropped the images in my graffiti dataset, it's unlikely that SD v1.5 was fed tons of this specific imagery when it was originally trained.
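For reference, here's a minimal sketch of what that initial text-to-image attempt looks like, using the base SD v1.5 checkpoint via the diffusers library rather than the NMKD GUI. The prompt wording, step count, and guidance scale are my assumptions for illustration; the point is that prompting alone rarely locks in the flat, perpendicular, tightly cropped framing the dataset needs.

```python
# Sketch of a txt2img attempt with base SD v1.5 (assumed prompt and settings).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="colorful graffiti covering a flat concrete wall, straight-on view, "
           "tightly cropped, no grass, no sky, no wall edges",   # hypothetical prompt
    negative_prompt="grass, sky, ground, pavement, perspective angle",
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("txt2img_attempt.png")
```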
So I decided to fine-tune my own SD model using DreamBooth via Google Colab, since I don't have 24GB of VRAM on my local GPU. For that I used the dataset of 716 graffiti images that I curated in the past, and fine-tuned SD v1.5 for 4000 UNet_Training_Steps and 350 Text_Encoder_Training_Steps onto the 'isoscelesgraffiti' keyword. The results were stunning and better than I had hoped for. I trained all the way to 8000 steps, but the 4000-step checkpoint was the sweet spot.
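A rough equivalent of that fine-tune, expressed with the Hugging Face diffusers DreamBooth example script instead of the Colab notebook actually used. Paths, batch size, and learning rate are assumptions, and the diffusers script handles text-encoder training with a single flag rather than a separate step count.

```python
# Sketch of launching a DreamBooth fine-tune of SD v1.5 on the graffiti dataset.
import subprocess

subprocess.run([
    "accelerate", "launch", "train_dreambooth.py",        # diffusers example script
    "--pretrained_model_name_or_path", "runwayml/stable-diffusion-v1-5",
    "--instance_data_dir", "./graffiti_dataset",          # the 716 curated images (assumed path)
    "--instance_prompt", "isoscelesgraffiti",             # rare token to bind the style to
    "--resolution", "512",
    "--train_batch_size", "1",                            # assumption
    "--learning_rate", "2e-6",                            # assumption
    "--train_text_encoder",                               # Colab used 350 separate text-encoder steps
    "--max_train_steps", "4000",                          # the sweet-spot checkpoint
    "--output_dir", "./sd15-isoscelesgraffiti",
], check=True)
```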
Now I had my own custom SD model fine-tuned to graffiti that I could plug into the NMKD app. But when I rendered out images using the 'isoscelesgraffiti' keyword, all of the images had a very similar feel to them, even after experimenting with adding various keywords. So to help guide it, I used the IMG2IMG function and input each of the 716 graffiti images at 25% strength. This allowed me to create 100 complex variations of each graffiti image. After setting up a batch queue, I ended up with a dataset of 71,600 images output from my SD graffiti model.
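Here's a minimal sketch of that img2img augmentation pass, using diffusers instead of the NMKD batch queue. The checkpoint path, guidance scale, and file naming are assumptions; strength=0.25 mirrors the 25% setting, and 100 variations per source image yields the 71,600 total.

```python
# Sketch of generating 100 img2img variations of each of the 716 graffiti images.
from pathlib import Path
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "./sd15-isoscelesgraffiti",          # fine-tuned checkpoint (assumed path)
    torch_dtype=torch.float16,
).to("cuda")

src_dir = Path("./graffiti_dataset")      # the 716 curated images
out_dir = Path("./graffiti_71600")
out_dir.mkdir(exist_ok=True)

for img_path in sorted(src_dir.glob("*.jpg")):
    init = Image.open(img_path).convert("RGB").resize((512, 512))
    for i in range(100):                  # 100 variations per source image
        result = pipe(
            prompt="isoscelesgraffiti",
            image=init,
            strength=0.25,                # keep the composition, vary the details
            guidance_scale=7.5,           # assumption
        ).images[0]
        result.save(out_dir / f"{img_path.stem}_{i:03d}.png")
```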
Using this dataset of 71,600 images, I started training StyleGAN2, doing transfer learning with FFHQ-512 as a starting point, and trained for 2340 kimg. What an amazing jump in quality when training with a dataset this large! Strangely I'm still seeing some overfitting behavior, which confuses me considering I had X-mirror enabled, so the effective size of the dataset was 143,200 images. My best guess is that maybe there are some repeating themes in the dataset that I can't recognize due to the grand scale, or perhaps I should have lowered the gamma=10 further and trained for longer. I'm not sure which is true, maybe both. But the overfitting was a minor quibble, so I moved forward with curating and ordering the seeds and then rendering out the latent walk videos.
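For completeness, a sketch of what that transfer-learning run looks like, assuming NVIDIA's stylegan2-ada-pytorch train.py. The output/data paths and GPU count are assumptions; the mirror, gamma, kimg, and FFHQ-512 starting point follow what's described above.

```python
# Sketch of the StyleGAN2 transfer-learning run from FFHQ-512.
import subprocess

subprocess.run([
    "python", "train.py",
    "--outdir", "./training-runs",
    "--data", "./graffiti_71600.zip",   # dataset prepared with dataset_tool.py (assumed)
    "--gpus", "1",                      # assumption
    "--resume", "ffhq512",              # transfer learning from FFHQ at 512x512
    "--mirror", "1",                    # x-flip doubles the effective dataset size
    "--gamma", "10",                    # R1 regularization weight mentioned above
    "--kimg", "2340",                   # total training length in thousands of images
], check=True)
```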
The recent updates to Topaz Video AI have brought some excellent improvements to upscaling quality. I have found the Proteus model to perform amazingly well when upscaling from 512x512 to 2048x2048. Since a 512x512 video only contains so much detail at that small size, I've found 'Revert Compression: 50' or 'Recover Details: 50' to be the ideal settings. In the past I had used the 'Sharpen: 50' attribute, but in hindsight I feel like I can see the sharpening effect being applied.
As always, playing around in After Effects with these graffiti videos is where I feel like I really get to experiment and have some fun after the tedious steps mentioned prior. Often the raw renders are just on the cusp of being interesting, and so AE is where I can make them shine and layer on some extra ideas that make the piece feel finalized. I live for the happy accidents when implementing a complex technique and stumbling across something interesting. I did various Time Displacement slitscan experiments which gave the graffiti a wonderful vitality. It was interesting to apply the Linear Color Key to select only specific colors, use that selection as an alpha track matte to cut out the original video, and then apply FX to only specific parts of the video (Deep Glow, Pixel Encoder, Pixel Sorter). That was a satisfying evolution of similar techniques I've been trying to refine so as to simplify the execution and let me experiment with ideas quicker. I also did some displacement map experiments in Maya/Redshift, which was surprising since the JPG encoding ended up making the Maya displacement look like a concrete texture. Then I brought these Maya renders back into AE for further compositing, and it was bizarre to apply the Difference blend mode between the displacement render and the original video. And then I rendered that out, injected it into NestDrop, recorded the fire results, and brought that video back into AE for yet another application of the Difference blend mode technique. It's a vicious little circle of compositing. Long live graffiti!
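Not the After Effects workflow itself, but here's a small NumPy sketch of the core idea behind a Time Displacement slit-scan: each output pixel samples a different frame of the clip based on a displacement (gradient) map. The frame count, resolution, and vertical-gradient map below are assumptions for illustration.

```python
# Sketch of a time-displacement / slit-scan effect on a stack of frames.
import numpy as np

def time_displacement(frames: np.ndarray, disp_map: np.ndarray) -> np.ndarray:
    """frames: (T, H, W, 3) clip; disp_map: (H, W) values in [0, 1]."""
    T, H, W, _ = frames.shape
    offsets = (disp_map * (T - 1)).astype(int)   # map displacement to frame indices
    ys, xs = np.mgrid[0:H, 0:W]
    # Each output pixel is pulled from a different point in time.
    return frames[offsets, ys, xs]

# Example: a vertical gradient gives the classic slit-scan smear.
clip = np.random.rand(60, 512, 512, 3).astype(np.float32)   # stand-in for a rendered loop
gradient = np.tile(np.linspace(0, 1, 512)[:, None], (1, 512))
frame = time_displacement(clip, gradient)
```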