1/2 dev and a fair warning: do not go into the comments.
You're going anyway? Good.

I began trying to figure out how to use stable diffusion out of boredom. Couldn't do shit at first, but after messing around for a few days I'm starting to get the hang of it.

Writing long prompts gets tiresome, though. Think I can build myself a tool to help with this. Nothing fancy. A local database to hold trees of tokens, associate each tree with an ID, like say <class 'path'> or some such. Essentially, you use this to save a description of any size.
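
A minimal sketch of what I mean; names are made up and the store is in memory, a real version would serialize %DB to disk:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical token-tree store, keyed by "class/path".
my %DB;

# Save a list of tokens under a <class 'path'> ID.
sub db_store {
    my ($class, $path, @tokens) = @_;
    $DB{"$class/$path"} = [@tokens];
};

# Get the tree back as a comma-joined token string.
sub db_fetch {
    my ($class, $path) = @_;
    my $tree = $DB{"$class/$path"}
        or die "no such tree: <$class '$path'>\n";
    return join ',', @$tree;
};

db_store('char', 'nazgul',
    'black robes', 'crowned shadow', 'pale blade');

print db_fetch('char', 'nazgul'), "\n";
```

From there, `<char 'nazgul'>` in a prompt is just a stand-in for the whole list.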

The rest is textual substitution, which is trivial in devil-speak. Off the top of my head:
my $RE=qr{\< (?<class> \S+) \s+ ' (?<path> [^']+) ' \>}x;

And then? match |> fetch(validate) |> replace, recurse. Say:
while ($in =~ $RE) {
    my $tree = db->fetch($+{class}, $+{path});
    $in =~ s[$RE][$tree];
};

Is that it? As far as the substitution goes, then yeah, more or less. We do have to check that a tree's definition does not recurse for this to work, but I would do that __before__ dumping the tree to disk, not after.
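
A rough cut of that check, assuming the same <class 'path'> syntax and a flat hash as the db: walk every reference inside a definition and bail if an ID shows up twice along the same path.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $RE = qr{\< (?<class> \S+) \s+ ' (?<path> [^']+) ' \>}x;

# Toy db with a deliberate cycle between the two entries.
my %DB = (
    'char/witchking' => "crowned <char 'nazgul'>",
    'char/nazgul'    => "black robes, <char 'witchking'>",
);

# True if expanding $id would loop forever.
sub has_cycle {
    my ($id, $seen) = @_;
    $seen //= {};

    # Already visited on this branch: that's the loop.
    return 1 if $seen->{$id}++;

    my $body = $DB{$id} // return 0;

    # Recurse into every reference, copying the seen-set
    # so sibling branches don't taint each other.
    while ($body =~ /$RE/g) {
        return 1 if has_cycle("$+{class}/$+{path}", {%$seen});
    };

    return 0;
};

print has_cycle('char/nazgul') ? "recursive!\n" : "ok\n";
```

Run this on a tree right before writing it out and reject it on a hit.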

There is most likely an upper limit to how much abstraction can be achieved this way; one can only get so specific before the algorithm starts tripping balls, I reckon. The point here is just reaching that limit sooner.

So pasting lists of tokens, in a nutshell. Not a novel idea. I'd just be making it easier for myself. I'd rather reference things by name, and I'd rather not define what a name means more than once. So if I've already detailed what a Nazgul is, for instance, then I'd like to reuse it. Copy, paste, good times.

Do promise to slay me in combat should you ever catch me using the term "prompt engineering" unironically, what a stupid fucking joke.

Anyway, the other half, so !dev, and I repeat the warning, just out of courtesy. I don't think it needs to be here, as this is all fairly mild imagery, but just in case.

I felt disappointed that a cursed image would scare me when I've seen far worse shit. So I began experimenting, seeing if I could replicate the result. No luck yet, but I think we're getting somewhere.

Our mission is clearly the browning of pants. But how do we come to understand fear? I don't know. "Scaring" seems fairly subjective.

But I fear what I know to be real,
And I believe my own two eyes.

  • 1
    Oh, sweet mother...
  • 2
    Would fit great with a black metal album
  • 0
    dear sister,
  • 0
    let us hunt together.
  • 0
    blood for blood.
  • 0
    I have hundreds of these and I can't stop generating them please help
  • 1
    @jestdotty I'm trying to replicate my previous scare, but I can't do it. In part because I put the veil on the monster and force a long shot. That way I don't see demon eyes in the darkness and get startled again ;>

    The way the hypnosis part works, from what I can tell, is you repeat an element to fill in detail...

    Say, "a man driving a car", then "(a man list,list), (driving a car list,list)". It's not perfect but it's what seems to work more consistently.
  • 2
    generating photos is easy, generating single fucking things on white background is fucking hard as hell

    training loras, vae, fucking control nets it’s like fucking unpredictable fucking api

    sometimes I think it would be easier to just learn how to draw and draw those things
  • 2
    @vane You can get something more or less coherent with very simple wording. I only understand how neural nets are implemented at a surface level, though. I can write a short program to answer basic mathematical problems without explicitly telling the computer how to solve them, via randomized weights and backpropagation, which is a fairly outdated model by now, I think.

    The jump to hallucinations is very much beyond my comprehension. "Same operation in reverse" is what I've heard people say, but I never dug in.

    But I don't think knowing shit would make it any less unpredictable. Moving subjects around the frame, or making them perform some basic interaction, I don't want to say that it's impossible, but I very much gave up trying to be in control. You're not commanding the algorithm as much as you're just nudging it in different directions, but you never know how it's going to respond. Pretty mindless iterative process overall.
  • 1
    @Liebranca wording is just a wrapper extension of one transformer on top of another transformer model; what I want is predictable results, so I dropped the llm completely from my stable diffusion and try to rely on image-to-image models.

    Like, neural networks are a no-brainer. Forcing them to do something productive and predictable so people can use them for more than silly pictures is very hard.
  • 2
    @vane My gut feeling is that natural language processing is self-defeating to begin with, as there's still a very specific way in which you have to structure the input to get good results.

    So the correct direction would be a formally defined grammar, a DSL basically, which is already the case in practice: wrapping lists of tokens in parentheses to modify their weight, being mindful of where and whether to put a comma, or spaces and hyphens, BREAKs, etcetera.

    The most effective and consistent prompts are not natural at all, hence the "engineering" joke. So 'just cut the shit' would be my advice to researchers in the field.
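
    To illustrate: the (tokens:weight) notation some UIs support is already a tiny grammar you can parse with one regex, same trick as the tree references above.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # '(tokens:weight)' scales a token list by a factor.
    my $W = qr{\( (?<tok> [^:()]+) : (?<w> [\d.]+) \)}x;

    my $prompt = '(dark forest:1.2), lone rider, (fog:0.8)';

    # Pull out every weighted group.
    while ($prompt =~ /$W/g) {
        printf "%-12s %.1f\n", $+{tok}, $+{w};
    };
    ```

    Once the syntax is pinned down like this, there's nothing "natural" left to guess at.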