Search - "data hoarder"
-
About to delete ~2TB of data.
I need a fresh start. I'm a data hoarder.
About to wipe my drives and start over. I've had this particular instance since I was 17 (I'm 21).
I'm sticking with Windows though, because my games won't run on Linux.
-
Probably the most rage inducing data loss story...
When it comes to my cellphone I'm a data hoarder, I store each relevant meme, conversation, video, contact, nudes, etc. Had to replace my phone? Easy, change the SD.
I did this for about 4 years and had over 11GB of almost everything and anything in a 36GB SD. One afternoon my buddies and I went to a small tech convention, and on our way to my car we got mugged by 5 armed men.
They took my brand new phone along with my wallet and all my cash. Luckily I had GPS tracking enabled and we were able to pinpoint the exact location of my phone within 30min.
So far so good...
We called the cops and went with them. We found the car, with illegal plates and weapons inside (knives, a bat, a gun), so I told the cop the robbers were right there inside a closed cyber cafe and showed him the point on the map confirming this.
Cop: oh we can't do that we don't have an order...
Me: are you kidding me, here's the GPS, there's the car, there's the weapons, doesn't that count as at least probable cause or some shit?
Cop: we don't have that in this country, you can file a report and after 3 business days we can come here to inquire.
Me: (fucking lost it) do you fucking think they'll be here in 3 days?! I'll give you 500 bucks if you go bust their ass now.
Cop: (thinks about it) but what if they are armed? [4 patrols, 8 cops, 4 rifles and at least 6 guns plus vests] Maybe if you had contacts within the bureau we could have an order now...
(┛✧Д✧))┛彡┻━┻
I lost a lot that day, including respect for this fucked up system.
t(ಠ益ಠt) FUCK THE POLICE go eat a dick.
-
So... the US Govt. just released a shit ton of files on the JFK assassination, and being the data hoarder that I am, I promptly requested a bulk download link...
Apparently I underestimated the "shit ton" part, coz each of these files is around 2.4GB... and I don't have the data to download them :-D :-D
FML
-
I deleted over a petabyte worth of snapshots from AWS today.
As a data hoarder this feels like genocide.
-
The next step for improving large language models (if not diffusion) is hot-encoding.
The idea is pretty straightforward:
Generate many prompts, or take many prompts as a training and validation set. Do partial inference, and find the intersection of best overall performance with least computation.
Then save the state of the network during partial inference, and use that for all subsequent inferences. Sort of like LoRA, but for inference instead of fine-tuning.
Inference, after all, is what matters. And there has to be some subset of prompt-based initializations of a network that perform, regardless of the prompt, (generally) as well as a full inference step.
Likewise with diffusion, there likely exist some priors (based on the training data) that speed up reconstruction or lower the network loss, allowing us to substitute a 'snapshot' that has the correct distribution, without necessarily performing a full generation.
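A rough sketch of the selection step described above, and only that step: it assumes you have already run partial inference to several depths on a validation set of prompts and recorded a quality score per prompt plus a relative compute cost per depth. The function name, the idea of indexing snapshots by "depth", the tolerance parameter, and all the numbers are hypothetical and only for illustration.

```python
from statistics import mean

def pick_snapshot_depth(scores_by_depth, cost_by_depth, tolerance=0.02):
    """Return the cheapest depth whose mean score is within
    `tolerance` of the best mean score across all depths."""
    mean_scores = {d: mean(s) for d, s in scores_by_depth.items()}
    best = max(mean_scores.values())
    # Depths that perform (nearly) as well as the best one.
    good_enough = [d for d, m in mean_scores.items() if best - m <= tolerance]
    # Among those, take the one with the least computation.
    return min(good_enough, key=lambda d: cost_by_depth[d])

# Hypothetical validation scores (one per prompt) at three partial-inference depths.
scores_by_depth = {
    4: [0.61, 0.58, 0.64],
    8: [0.80, 0.79, 0.83],
    12: [0.81, 0.80, 0.84],
}
# Hypothetical relative compute cost of reaching each depth.
cost_by_depth = {4: 1.0, 8: 2.0, 12: 3.0}

chosen = pick_snapshot_depth(scores_by_depth, cost_by_depth)
```

The network state saved at `chosen` would then be the candidate snapshot to reuse. Whether such a prompt-independent state exists for a real model is the open question in the post; the code only shows the bookkeeping.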
Another idea I had was 'semantic centering' instead of regional image labelling. The idea is to find some patch of an object within an image, and ask, for all such patches that belong to an object, what best describes the object? If it were a dog, what patch of the image is "most dog-like", etc. I could see it as being much closer to how the human brain quickly identifies objects by short-cuts. The size of such patches could be adjusted to minimize the cross-entropy of classification relative to the tested size of each patch (pixel-sized patches, for example, might lead to too high a training loss). Of course it might allow us to do a scattershot 'at a glance' type lookup of potential image contents. Even if you get multiple categories for a single pixel, it greatly narrows the total span of categories you need to do subsequent searches for.
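To make the 'semantic centering' idea a bit more concrete, here is a toy sketch. It assumes some classifier has already produced a score per label for every patch; the data layout (a dict keyed by patch position), the function names, the threshold, and the scores are all made up for illustration.

```python
def most_representative_patch(patch_scores, label):
    """Return the position of the patch that scores highest for `label`.

    patch_scores maps (row, col) -> {label: score}; a label missing
    from a patch is treated as score 0.0.
    """
    return max(patch_scores, key=lambda pos: patch_scores[pos].get(label, 0.0))

def candidate_labels(patch_scores, threshold=0.5):
    """'At a glance' lookup: every label that clears `threshold` in at least one patch."""
    return {
        label
        for scores in patch_scores.values()
        for label, score in scores.items()
        if score >= threshold
    }

# Hypothetical per-patch classifier scores for a tiny 2x2 grid (one patch omitted).
patch_scores = {
    (0, 0): {"dog": 0.2, "grass": 0.7},
    (0, 1): {"dog": 0.9, "grass": 0.1},
    (1, 0): {"dog": 0.4, "cat": 0.3},
}

dog_center = most_representative_patch(patch_scores, "dog")
glance = candidate_labels(patch_scores)
```

Here `dog_center` is the "most dog-like" patch and `glance` is the narrowed set of categories worth searching further. Choosing the patch size, which the post suggests tuning against cross-entropy, is left out.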
In other news I'm starting a new ML blackbook for various ideas. Old one is mostly outdated now, and I think I scanned it (and since buried it somewhere amongst my ten thousand other files like a digital hoarder) and lost it.
I have some other 'low-hanging fruit' type ideas for improving existing and emerging models but I'll save those for another time.
-
I can't delete stuff!
I am currently sorting through my hard drive(s) and realized I have over 800 gigabytes of raw audio and video from four of our theatre productions lying around. The films have long been edited and there is no use for the source material anymore.
But just in case, I'm keeping it. You never know...