ml - The next step for improving large language models (if not diffusion) is hot-encoding. The idea is pretty straightforwar

Ranter

Wisecrack

9419

Comments

4

typosaurus

10598

2y

With such brains you're probably never bored
2

Wisecrack

9419

2y

@retoor lol. Dont know about brains. Like I said this is an obvious next step. If researchers are doing searches for sub graphs that perform as well as the whole graph (or using a network as a teacher to train miniture version of itself, also called "distillation"), the next logical step is simply to do a lot of testing to find a partial activation (inference) of the network that performs nearly as well as any full activation.

What we are after is useful partial network activations that generalize as well as full activations of the graph. We dump the network state for future use, and when a prompt is entered reload this partial state, and insert a lora-like layer for our prompt, calculating the transform of this new layer with the precomputed prior inference state, therefore cutting computation time down significantly.
2

typosaurus

10598

2y

What you said about recognizing dogs and stuff. We can just ask ChatGPT what makes a dog a dog. I'll check. I'm interested
3

typosaurus

10598

2y

Ah, he great answer if you ask him how to distinguish a dog from other animals. Laptop doesn't want to post the answer for some reason
2

Wisecrack

9419

2y

Deep dream > stablediffusion.

If you ask for nightmare fuel it gives you nightmare fuel.

Jokes aside, I was thinking more along the lines of YOLOv3, where it attends to many patches and tries to detect candidate samples for a certain number of categories of object. It can be reasonably expected that any sufficiently advanced object detection algorithm will approximate this same process, either directly or indirectly through some sub graphs that emerge while training (incidentally, see emergent inference heads for a similar phenomenon in transformers).

Knowing this, an implementation of semantic centering might offer better priors for earlier probable detection of any given instances of a subset of categories a system is trained to detect, thus cutting down the amount of area and categories that need to be considered all together for any given run.

Related Rants

Add Comment

random

ml

chatgpt

stable diffusion

llm