Comments
-
The original 0.15% entropy improvement over Shannon entropy shouldn't even be possible, and it's what got me thinking.
Because I ran out of room in the original post:
Each output of [H0, H1, H2, H3, salt, depth, starting value] becomes a block.
We append all these blocks together into a layer.
This layer can then be converted into *another* block and fed through the entire process.
Blocks can be processed in parallel, whether encoding or decoding. During decoding, layers have to be decoded serially back into blocks before those blocks can be decoded in parallel, in a process of unpacking.
We repeat this process, blocks into a layer, layers into blocks.
If it works, it should improve on the lowest demonstrated SOTA limit of compression.
To make block encoding/decoding at least break even, the depth of numbers a block can reconstruct must be at least 22, because each block has 8 numbers to reconstruct the hashes, along with the final salt value, the depth, and its starting value.
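A minimal sketch of the block layout as I understand it (field names, types, and the breaks_even helper are my own assumptions, not the actual code):

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical layout of one block: four hash functions at two
# coefficients each (8 numbers), plus salt, depth, and starting value.
@dataclass
class Block:
    h0: Tuple[int, int]   # (a, b) pairs defining each modular hash
    h1: Tuple[int, int]
    h2: Tuple[int, int]
    h3: Tuple[int, int]
    salt: int             # constant offset recovered during decoding
    depth: int            # how many original values this block reconstructs
    start: int            # first value of the input slice it encodes

def breaks_even(block: Block, min_depth: int = 22) -> bool:
    # Per the comment above, a block only pays for itself once its
    # depth reaches roughly 22 reconstructed values.
    return block.depth >= min_depth
```
-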
retoor: I'm happy to be mentioned (typosaurus). Love this article but didn't understand all of it. If you want, publish your code. If you need an anonymous repo, use molodetz.nl with a false email address; there's no verification needed! If you sign in, you'll see a GitHub-like environment. Works exactly the same.
Again, thanks for the mention. Very happii. -
@retoor Hypothetically it lets you store absolutely absurd amounts of data in 13 measly values. 2 values for each hash, and one value per other variable per block.
Layer stack size and block count are of course unbounded values, commensurate with the number of blocks and layers necessary to fully compress a given set of data. Block count will probably be a fixed count, found by parameter sweep, coefficient estimation, or curve fitting to some polynomial, so that leaves layer count as a potentially big number that grows at some unknown rate with the size of the data.
Hypothetically you could compress any amount of data into those 13 values, as long as you're comfortable with absurd amounts of decompression time.
I jailbroke Shannon entropy (if it works, big if), to the lowest limit that will likely ever be achieved, by anyone, ever.
If not, I still got that measly 0.15% entropy improvement over the known theoretical limit. -
@Wisecrack I'm still reading through it, and it reminds me of a friend telling me about people storing data using pi (the end result was just some random offset and length), but it looks fascinating :D
-
@BordedDev Clever use of pi btw. I've seen mathematicians ask whether pi contains all sequences. If it does, then it's sufficient to find the offset into pi that contains your data.
If pi doesn't, then the question becomes: is there an irrational number that *does* contain all sequences, and how would we go about finding, defining, and calculating it? Then the next question becomes: is there a way to calculate its digits at any given arbitrary offset?
And is there a way to find an offset matching any arbitrary data, without calculating all the prior digits?
That's a fun one to think about.
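A toy illustration of the offset idea, assuming mpmath is installed (find_in_pi and the digit count are mine, purely for demonstration):

```python
from mpmath import mp

def find_in_pi(target: str, digits: int = 100_000) -> int:
    """Search the first `digits` decimal digits of pi for a digit string
    and return its offset into the fractional part, or -1 if absent."""
    mp.dps = digits + 10                  # working precision
    pi_str = mp.nstr(mp.pi, digits)       # "3.14159..."
    fractional = pi_str.split(".", 1)[1]
    return fractional.find(target)

# "Storing" a payload as (offset, length): recompute pi to decompress.
payload = "265358"
print(find_in_pi(payload), len(payload))
```
-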
@BordedDev also the image here is dense, but the bottom right is how individual values are processed in the normal case.
The bottom-left corner has 3 notes that really contain the meat of the work.
The heavy lifting is done by finding a hash function h2 that, given z(n), outputs x(n-1),
AND also such that h2(y(n)) outputs z(n-1), because we know that h2(z) gives us y for a given z in the single-input case, and h3(y) gives x.
So having a hash function h2(), and a sequence-compressed z, means automatically getting z(n-1), and thus y(n-1), which with h3 gives us the next x value in the sequence during decompression.
The result is we go from z(n-1), to y(n-1), to x(n-1),
then z(n-2), to y(n-2), to x(n-2), and so on, rebuilding the original data in reverse, one element at a time.
People can get confused b/c of hashes like md5, where the output is letters and numbers. Here our hashes are numbers using a modular algorithm. Numbers go in, numbers come out. That means you have to convert inputs to integers first.
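A rough sketch of that backward walk, with h2 and h3 as stand-in affine modular maps (toy constants of my own; the real functions come out of the block search, so this only shows the control flow):

```python
P = 10007  # toy prime modulus, assumed larger than any encoded value

def h2(v: int) -> int:          # stand-in hash: numbers in, numbers out
    return (3511 * v + 17) % P

def h3(v: int) -> int:
    return (4099 * v + 523) % P

def decode_backwards(z_n: int, depth: int) -> list:
    """Walk z -> y -> x, then use y to get the previous z, and repeat,
    rebuilding the original values in reverse (as described above)."""
    xs, z = [], z_n
    for _ in range(depth):
        y = h2(z)               # h2(z) gives the matching y
        xs.append(h3(y))        # h3(y) gives the original x at this position
        z = h2(y)               # h2(y) gives the previous z in the sequence
    return xs[::-1]             # decoded in reverse, so flip back

print(decode_backwards(z_n=42, depth=5))  # structure only; toy hashes, toy output
```
-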
@Lensflare > I only know bloom from computer graphics. Is it related?
I'm not familiar with computer graphics (I say, as I look at a computer monitor while typing this comment). -
I know it's not related. But I wonder if we will ever have a VM that simulates DNA in a cell. So anyone could take a DNA sequence and watch it function in a cell VM.
-
@Lensflare Took a quote from there "One physical basis of bloom is that, in the real world, lenses can never focus perfectly. Even a perfect lens will convolve the incoming image with an Airy disk (the diffraction pattern produced by passing a point light source through a circular aperture).[2] Under normal circumstances, these imperfections are not noticeable, but an intensely bright light source will cause the imperfections to become visible. As a result, the image of the bright light appears to bleed beyond its natural borders.
The Airy disc function falls off very quickly but has very wide tails (actually, infinitely wide tails)."
So not related. But it makes me think: if LLMs sometimes generate wrong answers because of issues with the tails of their distributions, what would a statistical distribution and attention mechanism that mimics the Airy disk look like?
Tails are after all infinite.
Probably need some non-image interpretation of the phenomenon first. It's fascinating though.
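For reference, the Airy intensity profile the quote describes is I(x) = (2*J1(x)/x)^2; a tiny sketch of it (scipy assumed available), just to make the "infinite tails" point concrete:

```python
import numpy as np
from scipy.special import j1   # first-order Bessel function of the first kind

def airy_intensity(x: np.ndarray) -> np.ndarray:
    """Normalized Airy pattern I(x) = (2*J1(x)/x)^2, with I(0) = 1."""
    out = np.ones_like(x, dtype=float)
    nz = x != 0
    out[nz] = (2 * j1(x[nz]) / x[nz]) ** 2
    return out

x = np.linspace(1, 200, 2000)
tail = airy_intensity(x)
print(tail[-5:])   # the envelope falls off roughly like 1/x**3 but never reaches zero
```
-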
@Demolishun "I know its not related. But I wonder if we will ever have a VM that simulates dna in a cell. "
If we ever achieve workable and efficient quantum processors, you can bet thats exactly something that will be developed to run on them. -
@Lensflare
As primitive as quantum is now, it'll probably be a pong clone first, but I wouldn't be surprised at all.
Optical quantum seems more practicable in all cases.
I'm surprised they haven't tried to use whether an output has an interference pattern or not (the quantum eraser, and two-slit experiment) to build optical quantum gates. -
@retoor It still needs to be cooled to sub-zero temperatures, but I thought it was an interesting project all in all, from Microsoft of all places.
Optical quantum computing is a pretty neglected field compared to some of the other approaches. -
retoor: @Wisecrack I watched a video about it, and I think the path they've chosen will always include sub-zero temperatures. But that thing looked like a normal processor tho.
-
@Wisecrack I heard about an electrical engineer who designed optical computing stuff that he tried to patent. They rejected his patents and the government started gang stalking him. He went from computer designer genius to disabled because of the measures they used against him. So I think there is weirdness around optical computing. This also isn't the first time I have heard of the patent office suppressing tech.
I have heard there are people going after optical computing recently though. -
You're a madwoman.
I didn't even bother to read further than the first three paragraphs, as I simply wouldn't understand it.
Good to see you back here. -
@Ranchonyx I'll post some code sometime this weekend so people can play around with the system, see it in action, and work out first principles from demonstration.
The simplest version was generating two hash functions, h0, and h1.
h0 takes a list of numbers x, and outputs a seemingly random looking list of numbers y.
The magic happens with h1. In the earliest versions, h1 took y and output x again.
The trick was to write a function that, given h0, x, and y, found a hash function (h1) that output x again.
It's kinda beautiful actually and way simpler than it appears on the surface.
The first time you run gen_hash, and then plug in a list of numbers, or a converted string, only to get a seemingly random output, and then plug that into *another* hash which returns your original output 'unhashed', you'll understand the entire principle of the thing immediately.
Everything follows from that.
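A minimal sketch of that round trip, assuming simple affine modular hashes of the form (a*x + b) % p; gen_hash_pair and find_decoder are my own names, not the code from the post:

```python
import random

P = 10007  # prime modulus, assumed larger than any value we encode

def make_hash(a: int, b: int, p: int = P):
    """Affine modular hash: numbers go in, numbers come out."""
    return lambda x: (a * x + b) % p

def find_decoder(xs, ys, p: int = P):
    """Search for (a1, b1) such that (a1*y + b1) % p == x for every pair.
    Once a1 is fixed, b1 is forced by the first (x, y) pair."""
    x0, y0 = xs[0], ys[0]
    for a1 in range(1, p):
        b1 = (x0 - a1 * y0) % p
        if all((a1 * y + b1) % p == x for x, y in zip(xs, ys)):
            return a1, b1
    return None

def gen_hash_pair(xs, p: int = P):
    """Pick a random encoder h0, then find an h1 that undoes it."""
    a0, b0 = random.randrange(1, p), random.randrange(p)
    ys = [make_hash(a0, b0)(x) for x in xs]
    return (a0, b0), find_decoder(xs, ys, p), ys

data = [ord(c) for c in "hello devrant"]       # convert input to integers first
h0, h1, encoded = gen_hash_pair(data)
decoded = [make_hash(*h1)(y) for y in encoded]
assert decoded == data                          # seemingly random out, original back
```
-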
@iiii Not as weird as all the prior math.
The math for hash functions is pretty normal; it's the way it's being used that is novel.
Related Rants
Hey, been gone a hot minute from devrant, so I thought I'd say hi to Demolishun, atheist, Lensflare, Root, kobenz, score, jestdotty, figoore, cafecortado, typosaurus, and the raft of other people I've met along the way and got to know somewhat.
All of you have been really good.
And while I'm here it's time for maaaaaaaaath.
So I decided to horribly mutilate the concept of bloom filters.
If you don't know what that is: you take two random numbers, m and p, both prime, where m < p, and generate two numbers a and b that define a function. That function is a hash.
Normally you'd have say five to ten different hashes.
A bloom filter lets you probabilistically say whether you've seen something before, with no false negatives.
It lets you do this very space efficiently, with some caveats.
Each hash function should be uniformly distributed (any value input to it is likely to be mapped to any other value).
Then you interpret these output values as bit indexes.
So Hi might output [0, 1, 0, 0, 0]
while Hj outputs [0, 0, 0, 1, 0]
and Hk outputs [1, 0, 0, 0, 0]
producing [1, 1, 0, 1, 0]
And if your bloom filter has bits set in all those places, congratulations, you've seen that number before.
It's used by big companies like Google to prevent re-indexing pages they've already seen, among other things.
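A bare-bones sketch of that kind of filter, using the (a, b, p, m) hash family just described (sizes are arbitrary):

```python
import random

class BloomFilter:
    def __init__(self, num_bits: int = 64, num_hashes: int = 5, p: int = 10007):
        self.m, self.p = num_bits, p
        self.bits = [0] * num_bits
        # Each hash is defined by two numbers a and b: h(x) = ((a*x + b) % p) % m
        self.params = [(random.randrange(1, p), random.randrange(p))
                       for _ in range(num_hashes)]

    def _indexes(self, x: int):
        return [((a * x + b) % self.p) % self.m for a, b in self.params]

    def add(self, x: int):
        for i in self._indexes(x):
            self.bits[i] = 1

    def maybe_contains(self, x: int) -> bool:
        # No false negatives; false positives are possible.
        return all(self.bits[i] for i in self._indexes(x))

bf = BloomFilter()
bf.add(731)
print(bf.maybe_contains(731))   # True
print(bf.maybe_contains(999))   # almost certainly False
```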
Well I thought, what if instead of using it as a has-been-seen-before filter, we mangled its purpose until a square peg fit in a round hole?
Not long after, I went and wrote a script that 1. generates data, 2. generates a hash function to encode it, and 3. finds a hash function that reverses the encoding.
And it just works. Reversible hashes.
Of course you can't use it for compression strictly, not under normal circumstances, but these aren't normal circumstances.
The first thing I tried was finding a hash function h0, that predicts each subsequent value in a list given the previous value. This doesn't work because of hash collisions by default. A value like 731 might map to 64 in one place, and a later value might map to 453, so trying to invert the output to get the original sequence out would lead to branching. It occurs to me just now we might use a checkpointing system, with lookahead to see if a branch is the correct one, but I digress, I tried some other things first.
The next problem was 1. long sequences are slow to generate. I solved this by tuning the number of iterations of the outer and inner loop. We find h0 first, then h1, put all the inputs through h0 to generate an intermediate list, then put that through h1 and see if the output of h1 matches the original input. If it does, we return h0 and h1. It turns out it can take inordinate amounts of time if h0 lands on a hash function that doesn't play well with h1, so the next step was 2. adding an error margin. Something fun happens here: if you allow a sequence generated by h1 (the decoder) to match *within* an error margin, under a certain error value it'll find potential hash functions hn such that the outputs of h1 are *always* the same distance from their parent values in the original input to h0. This becomes our salt value k.
So our hash-function generator, called encoder_decoder() or 'ed' (lol, two-letter functions), also calculates the k value and outputs it along with the hash functions for our data.
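A sketch of how that salt could fall out of the search, again with toy affine hashes; encoder_decoder here is my reconstruction of the described behaviour, not the real 'ed':

```python
def encoder_decoder(xs, ys, p: int = 10007, margin: int = 8):
    """Search for a decoder (a1, b1) and a constant salt k (|k| <= margin)
    such that (a1*y + b1) % p == x + k at every position. Every decoded
    value then sits the same distance k from its original, so decoding
    is simply h1(y) - k."""
    for a1 in range(1, p):
        for k in range(-margin, margin + 1):
            b1 = (xs[0] + k - a1 * ys[0]) % p   # force the offset k at position 0
            if all((a1 * y + b1) % p == x + k for x, y in zip(xs, ys)):
                return (a1, b1), k
    return None

# Usage (xs is the original data, ys the encoder output):
#   (a1, b1), k = encoder_decoder(xs, ys)
#   decoded = [((a1 * y + b1) % 10007) - k for y in ys]
```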
This is all well and good, but what if we want to go further? With a few tweaks, along with taking the output values, converting them to binary, and left-padding each value with 0s, we can then calculate Shannon entropy in its most essential form.
Turns out with tens of thousands of values (and tens of thousands of bits), the output of h1 with the salt has a higher entropy than the original input. Meaning finding an h1 and h0 hash function for your data is equivalent to compression below the known Shannon limit.
By how much?
Approximately 0.15%
Of course this doesn't factor in the five numbers you need: a0 and b0 to define h0, a1 and b1 to define h1, and the salt value, so it probably works out to about the same. I'd like to see what the savings are with even larger sets though.
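A sketch of the entropy measurement as I read it: fixed-width binary, then per-bit Shannon entropy (the 16-bit width is my assumption):

```python
import math
from collections import Counter

def bit_entropy(values, width: int = 16) -> float:
    """Concatenate the values as zero-padded fixed-width binary strings and
    compute the Shannon entropy of the bit stream, in bits per bit
    (1.0 is the maximum for a binary alphabet)."""
    bits = "".join(format(v, f"0{width}b") for v in values)
    n = len(bits)
    return -sum(c / n * math.log2(c / n) for c in Counter(bits).values())

# The 0.15% figure above comes from comparing this measure on the
# original input against the salted decoder's output.
```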
Next I said, well what if we COULD compress our data further?
What if all we needed were the numbers to define our hash functions, a starting value, a salt, and a number to represent 'depth'?
What if we could rearrange this system so we *could* use the starting value to represent n subsequent elements of our input x?
And thats what I did.
We break the input into blocks of 15-25 items, because that's the fastest to work with and find hashes for.
We then follow the math, to get a block which is
H0, H1, H2, H3, depth (how many items our first item will reproduce), and a starting value, i.e. the first item in this slice of our input.
x goes into h0, giving us y. y goes into h1 -> z, z into h2 -> y, y into h3, giving us back x.
The rest is in the image.
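A toy version of that four-hash chain, with stand-in affine maps just to show the direction of each step (in the real scheme h2 and h3 are found by search, not derived analytically like here):

```python
P = 10007  # toy prime modulus

def affine(a: int, b: int):
    return lambda v: (a * v + b) % P

def affine_inverse(a: int, b: int):
    """Analytic inverse of an affine map mod a prime, standing in for the
    searched-for decoders."""
    a_inv = pow(a, -1, P)
    return affine(a_inv, (-a_inv * b) % P)

h0, h1 = affine(3, 7), affine(11, 5)                  # stand-in encoders
h2, h3 = affine_inverse(11, 5), affine_inverse(3, 7)

x = 1234                 # the block's starting value
y = h0(x)                # x goes into h0, giving y
z = h1(y)                # y goes into h1, giving z
assert h2(z) == y        # z into h2 gives y back
assert h3(h2(z)) == x    # y into h3 gives back x
```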
Anyway good to see you all again.
rant
im back
math