Search - "digital hoarder"
-
The next step for improving large language models (if not diffusion) is hot-encoding.
The idea is pretty straightforward:
Generate many prompts, or take many prompts as a training and validation set. Do partial inference, and find the intersection of best overall performance with least computation.
Then save the state of the network during partial inference, and use that for all subsequent inferences. Sort of like LoRA, but for inference instead of fine-tuning.
Inference, after all, is what matters. And there has to be some subset of prompt-based initializations of a network that perform, regardless of the prompt, (generally) as well as a full inference step.
Likewise with diffusion, there likely exist some priors (based on the training data) that speed up reconstruction or lower the network loss, allowing us to substitute a 'snapshot' that has the correct distribution, without necessarily performing a full generation.
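To make the "best overall performance with least computation" step a bit more concrete, here is a minimal sketch of just the selection part. Everything in it is an assumption for illustration: the `pick_snapshot_depth` helper, the idea of measuring partial inference as a "depth", the score table, and the tolerance value are all made up, and actually saving and reusing network state is not shown.

```python
from statistics import mean


def pick_snapshot_depth(scores_by_depth, tolerance=0.02):
    """Return the smallest depth whose mean validation score is within
    `tolerance` of the mean score at the largest (full) depth.

    scores_by_depth maps a hypothetical partial-inference depth to a list
    of per-prompt validation scores (higher is better).
    """
    depths = sorted(scores_by_depth)
    full_score = mean(scores_by_depth[depths[-1]])
    for depth in depths:
        if mean(scores_by_depth[depth]) >= full_score - tolerance:
            return depth
    return depths[-1]  # defensive fallback: use the full depth


# Hypothetical per-prompt scores at each depth (not real measurements).
scores = {
    4: [0.52, 0.48, 0.50],
    8: [0.71, 0.69, 0.70],
    12: [0.80, 0.78, 0.79],
    16: [0.80, 0.80, 0.80],
}

chosen = pick_snapshot_depth(scores, tolerance=0.02)
print(chosen)
```

With these made-up numbers the cheapest depth that stays within the tolerance is picked, which is the point where a snapshot would be saved under this reading of the idea.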
Another idea I had was 'semantic centering' instead of regional image labelling. The idea is to find some patch of an object within an image, and ask, for all such patches that belong to an object, what best describes the object? If it were a dog, which patch of the image is "most dog-like", etc. I could see it as being much closer to how the human brain quickly identifies objects by shortcuts. The size of such patches could be adjusted to minimize the cross-entropy of classification relative to the tested size of each patch (pixel-sized patches, for example, might lead to too high a training loss). Of course, it might also allow us to do a scattershot 'at a glance' type lookup of potential image contents: even if you get multiple categories for a single pixel, it greatly narrows the total span of categories you need to do subsequent searches for.
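A toy sketch of the "most dog-like patch" lookup, assuming some classifier has already produced a per-pixel score map for one category. The `most_representative_patch` function, the score values, and the choice of averaging scores inside a patch are all hypothetical; the patch-size tuning against cross-entropy is not covered here.

```python
def most_representative_patch(score_map, patch_size):
    """Slide a square patch over a 2D grid of per-pixel class scores and
    return ((top, left), mean_score) for the highest-scoring patch."""
    rows, cols = len(score_map), len(score_map[0])
    best_pos, best_score = None, float("-inf")
    for top in range(rows - patch_size + 1):
        for left in range(cols - patch_size + 1):
            total = sum(
                score_map[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            )
            avg = total / (patch_size * patch_size)
            if avg > best_score:
                best_pos, best_score = (top, left), avg
    return best_pos, best_score


# Hypothetical "dog-likeness" scores for a 4x4 image (made-up values).
dog_scores = [
    [0.1, 0.1, 0.2, 0.1],
    [0.1, 0.6, 0.7, 0.2],
    [0.1, 0.8, 0.9, 0.3],
    [0.0, 0.2, 0.3, 0.1],
]

position, score = most_representative_patch(dog_scores, patch_size=2)
print(position, score)
```

Running the same search with several values of `patch_size` would be one way to compare patch sizes, though how to score that comparison is left open.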
In other news, I'm starting a new ML blackbook for various ideas. The old one is mostly outdated now, and I think I scanned it (and have since buried it somewhere amongst my ten thousand other files like a digital hoarder) and lost it.
I have some other 'low-hanging fruit' type ideas for improving existing and emerging models but I'll save those for another time.
-
I regret downloading too many Photoshop brushes, gradients, actions, patterns, etc.
Purging files at the moment.
-
I finally have some motivation to write some personal code... on an existing project.
(Work has been too hectic the last few months, so I don't want to do any more at home...)
Anyway... I noticed that my Prime Video Tracker app doesn't pick up some of the new Movies now available on Prime, so I did some fixing.
Good News (GN): The search URL is actually static, so I can go to the same URL for the same search results
GN: The program can filter the movies by a Minimum # of Ratings they have (currently set to 100... used to be 10)
Bad News (BN): The number of movies in the search results is over 5000 (used to be 100-200) so even with this filter, a lot get returned.
GN: The traversal is fully automated
BN: Need to manually look at the descriptions of each and add them to the Watchlist
BN: I now have 200 movies on my Watchlist and still going...
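For what it's worth, the Minimum # of Ratings filter mentioned above could look something like the sketch below. This is not the actual tracker code: the `Movie` class, its fields, the `filter_by_min_ratings` name, and the sample entries are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Movie:
    title: str
    rating_count: int  # number of ratings the movie has received


def filter_by_min_ratings(movies, min_ratings=100):
    """Keep only movies with at least `min_ratings` ratings."""
    return [m for m in movies if m.rating_count >= min_ratings]


# Made-up sample data, not real search results.
sample = [
    Movie("Example A", 5000),
    Movie("Example B", 42),
    Movie("Example C", 100),
]

kept = filter_by_min_ratings(sample)
print([m.title for m in kept])
```

Raising `min_ratings` further is the obvious knob when the search returns thousands of results, at the cost of dropping newer titles that have not collected many ratings yet.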
So now I have another "Infinite list". Existing ones:
-TED Talks
-NLegs
-Blinkist Read List
-Comics (sort of, I have a huge backlog for Cyanide and Happiness)
-Photos that need "post-processing"
I'm pretty sure I'm forgetting some others...