Download Pack
This pack contains 136 VJ loops (122 GB)
https://www.patreon.com/posts/80911779
Behind the Scenes
Let's have an AI fantasize about the most popular parts of the human body... Feeling uncanny glitchy? Probably NSFW-ish.
Looking back, it's clear just how much I've learned about training StyleGAN2/3, since I trained 10 different models in one month for this pack. It's been so useful being able to locally render out a ~10,000 image dataset from Stable Diffusion in 1-2 days. But the real game changer has been doing the training locally, being able to run it for however long I need, and also trying out multiple experimental ideas. It turns out that while Google Colab has top notch GPUs, the credits end up costing too much, and so I didn't run experiments there when I was unsure of what would happen.
I first started with the idea of a long tongue whipping wildly from a mouth. Interestingly, it proved too difficult to make Stable Diffusion output images of a tongue sticking out. Eventually I got it to work, but it looked horrific, with teeth in odd places and the lips merging into the tongue. So I put the idea on the back burner for a few months until I realized that I could pivot and instead focus on just lips wearing lipstick. Once I nailed down a text prompt, I rendered out 6,299 images from SD. Since the lips were quite self-similar, I knew it would converge quickly with StyleGAN3. I love how there are sometimes 3 lips that animate into each other. I also did some transfer learning on SG2, but the well-known texture sticking aspect did not look good on the lip wrinkle areas, and so I didn't use this model.
I thought it would be interesting to have some vape or smoke to layer on top of the lips. I experimented with a few SD text prompts and then rendered out 2,599 images. Then I did some transfer learning on SG2 but had issues with the gamma, and it took a few tweaks to fix it. Even still, I wasn't thrilled with the results until I did some compositing experiments in AE and tried for the first time doing a slitscan effect with a radial map. I love experimenting with the Time Displacement effect in AE. This made it look like smoke rings moving outwards, which was perfect.
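The radial slitscan idea can be sketched outside of AE as a per-pixel time offset. This is only a minimal sketch, not how AE's Time Displacement effect is implemented: the function names, the nearest-frame sampling, and the `max_delay` parameter are all assumptions for illustration. Frames are plain nested lists so the example has no dependencies.

```python
import math

def radial_map(width, height):
    """Grid of values in [0, 1]: 0 at the image center, 1 at the corners."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    max_dist = math.hypot(cx, cy) or 1.0  # avoid dividing by zero on a 1x1 image
    return [
        [math.hypot(x - cx, y - cy) / max_dist for x in range(width)]
        for y in range(height)
    ]

def time_displace(frames, t, disp_map, max_delay):
    """Build output frame t by sampling each pixel from an earlier frame.

    The larger the map value at a pixel, the further back in time it samples
    (hypothetical behavior, chosen for this sketch). Frames before the start
    of the clip are clamped to frame 0.
    """
    height, width = len(disp_map), len(disp_map[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            delay = round(disp_map[y][x] * max_delay)
            src = max(0, t - delay)
            row.append(frames[src][y][x])
        out.append(row)
    return out
```

With a radial map, the center always shows the newest frame and the edges show older ones, so anything that changes over time appears to travel outwards in rings.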
At this point I realized that it would be fun to make some more SG2/3 models on different parts of the human body. I was a little nervous about it since I wanted to represent all skin tones and body types, which can be difficult since I'm dealing with two different AI systems that each react unpredictably in their own ways. I had to carefully guide Stable Diffusion to use specific skin tones and body types, and I also added extra priority to those keywords. SD really wanted to output hourglass figures and gym bodies, even when I tried to push it hard in other directions. Through the use of wildcards in the NMKD Stable Diffusion app I was able to achieve some decent results, yet even still it was difficult to maintain a balance of images so that the dataset wasn't too heavy in one category and wouldn't therefore affect my SG2/3 training downstream. Another tricky aspect is that GANs can be unpredictable in which patterns they decide to focus on, even with a large dataset. So most of the models have a good spread of skin tones represented, but getting all body types represented was very difficult since for some reason SG2/3 kept converging towards the skinny and mid-weight body types, with large curvy bodies getting less representation. I suspect this is an artifact of the gamma attribute averaging the bodies together. Having AI train another AI has its pitfalls.
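To illustrate the wildcard idea, here is a minimal sketch of expanding a prompt template so that every combination of keywords gets the same number of prompts. It does not use the app's own wildcard syntax or reproduce its behavior; the `balanced_prompts` helper, the `{skin}`/`{body}` placeholders, and the keyword lists are all hypothetical.

```python
import itertools

def balanced_prompts(template, wildcards, count):
    """Cycle through every wildcard combination in turn.

    Each combination is used equally often whenever `count` is a multiple
    of the number of combinations.
    """
    keys = list(wildcards)
    combos = list(itertools.product(*(wildcards[k] for k in keys)))
    prompts = []
    for i in range(count):
        combo = combos[i % len(combos)]
        prompts.append(template.format(**dict(zip(keys, combo))))
    return prompts

# Hypothetical keyword lists, for illustration only.
example = balanced_prompts(
    "extreme closeup photo, {skin} skin, {body} body",
    {"skin": ["dark", "light"], "body": ["slim", "curvy", "muscular"]},
    12,
)
```

Balanced prompts only balance the requests. As described above, SD can still ignore a keyword, so the rendered images would need to be checked separately before training.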
Several times in the past I've tried to create an SD text prompt that output a bodybuilder, but I was never happy with the results. Finally I experimented with a text prompt telling it to do an extreme closeup of a bicep, and that was the ticket. So I rendered out 9,830 images from SD and then trained SG3 on them. I suspect that everyone looks tan and shiny since bodybuilders often go for the oiled and fake tan look, which makes the skin tones a bit vague, but black skin tones are visualized too. The SG3 model converged nicely but exhibits an interesting aspect that I've seen a few other times, where it's not a smooth easing between some of the seeds and instead snaps very quickly. This actually looks really interesting in this case since it looks like muscles twitching with unreal speed. I think that perhaps SG3 learns some different patterns but has trouble interpolating between them, and so there are steep transitions between those zones. I also trained a model using SG2, and that has a very different feeling to it, which made it interesting to directly compare the two frameworks. When doing some compositing experiments in AE I again experimented with a new type of slitscan, which I call split horizontal or split vertical, and which is basically two ramps that meet in the middle. So it looks almost like a mirror effect, except that the content is seemingly being extruded from the center.
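The "two ramps that meet in the middle" map can be sketched as a small function that produces a displacement map for a time-offset effect. This is a rough illustration under assumptions: the `split_ramp` name is made up, and which direction counts as "horizontal" versus "vertical" is my guess rather than anything defined above.

```python
def split_ramp(width, height, axis="horizontal"):
    """Grid of values in [0, 1]: 0 along the center line, 1 at the two far edges.

    axis="horizontal" ramps left and right away from the vertical center line;
    axis="vertical" ramps up and down away from the horizontal center line.
    (This naming is an assumption made for the sketch.)
    """
    if axis not in ("horizontal", "vertical"):
        raise ValueError("axis must be 'horizontal' or 'vertical'")
    size = width if axis == "horizontal" else height
    center = (size - 1) / 2
    # One mirrored ramp, reused across every row or column.
    ramp = [abs(i - center) / center if center else 0.0 for i in range(size)]
    if axis == "horizontal":
        return [list(ramp) for _ in range(height)]
    return [[ramp[y]] * width for y in range(height)]
```

Used as a time-offset map, the center line shows the newest frame and both edges show older ones, so the two halves mirror each other while the content appears to push out from the middle.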
Next up on my list was to create some boobs. I suspect that SD was trained on lots of skinny model photos or bra photo shoots because it was difficult to make it output images of large curvy bodies. Once again we're seeing that the effect of cultural bias for skinny bodies has made its way into SD. But I input a bunch of different body type wildcards into the SD text prompt and then output 30,276 images. I rendered more images than usual to try and get a larger spread of body types, but it didn't do as well as I had hoped. It seems unlikely, but maybe since I was telling it to make a glowing neon LED bra, it limited itself to a narrow range of body types where the training datasets didn't cross over as they should have. Yet the neon LED lighting just looks so dramatic and obscures the skin tones at times. Due to the smooth nature of skin, I realized that I could lower the Steps attribute in SD without much of a change in image quality, so as to halve the render time per image and therefore get the dataset rendered out in half the time. From there I trained both SG2 and SG3 separately, and they converged without any problems. After that I took the muscles dataset and combined it with the boobs dataset and then did some transfer learning on the boobs SG2 model. I didn't know if it would be able to interpolate between the datasets, but it did surprisingly well. Strangely, it did exhibit the same steep transition between seeds just like the SG3 muscles model, which I had never seen SG2 do before.
Lastly I wanted to create some male/female asses and crotches. It's interesting to note that I didn't have any trouble creating an SD text prompt and rendering out images for male ass (9,043 images) and male crotch (9,748 images) and seeing a diversity of skin tones and also various body types. The woman crotch images (10,091 images) had a good diversity of skin tones, but SD would only output an hourglass body type no matter how much I tweaked the text prompt. And then the woman ass images (42,895 images) were the trickiest since SD tended to output tan and brown skin tones, and strictly output an hourglass figure. So as a very rough guess, I'd say the huge dataset that Stable Diffusion was originally trained on has a disproportionate number of tan women in bikinis. This is clearly the early days of AI... I then took all of these images and combined them into one dataset in hopes that I could get SG2 to interpolate the skin tones onto all body types. It helped a little bit and also allowed me to animate between the asses and crotches of males/females, which looks so bizarre! In an effort to give the black skin tones some attention, I did some AE compositing to key out the skin color and only see the skin highlights. I also applied some dramatic lighting and harsh shadows to help make the skin tones more difficult to define. It's not a perfect solution but I'm working with what I got. In AE I did all sorts of slitscan compositing experiments to make those hips distort in the wildest ways. Legs for days!