
Hmm. So have you ever argued in a job interview? Like really standing your ground? In a technical interview?

Today I had a live coding session with a company I'm interested in. The developer kept giving me tasks to evolve the feature further and further.

Everything was TDD. Splendid!

However, at one point I had to test whether the outcome of a method call is random. What I did was basically:
```
Provider<String> provider = new SomeProvider("aaa", "bbb", "ccc", "ddd", "eee", "fff");

Map<String, Integer> counts = new HashMap<>();
for (int i = 0; i < 100; i++) {
    String str = provider.get();
    counts.merge(str, 1, Integer::sum);   // count how often each value comes back
}

// if every value came back exactly the same number of times, nothing is left after this
Set<Integer> occurrences = new HashSet<>(counts.values());
Integer first = occurrences.iterator().next();
occurrences.removeIf(o -> o.equals(first));
assertFalse(occurrences.isEmpty());
```
and I called it good enough, since I cannot verify true randomness.
But the dev argued that this is not enough and that I must verify whether the output is truly random or not, and that, considering the provider only has a finite set of values to return, the occurrences of each value are almost equal (i.e. the deviation from the median is the median itself).

I argued this is not possible and that it defeats the core principle of randomness -- non-determinism. If you could reliably test whether the sequence is truly random, you would need an algorithm that determines what value can or cannot come next in the sequence. Which means determinism. Which means the (P)RNG is flawed. The best you can do is test whether the randomness is "good enough" for your use case.
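
For illustration, a "good enough" check could look roughly like this -- a sketch only, not what I wrote in the interview: the 6,000 draws and the 10% tolerance are arbitrary picks, and it assumes the same Provider/SomeProvider types from the task plus JUnit asserts:

```
// Sketch of a tolerance-based uniformity check (inside a JUnit test method).
// The draw count and the 10% bound are arbitrary; any such test still has a
// small chance of failing even with a perfectly healthy RNG.
Provider<String> provider = new SomeProvider("aaa", "bbb", "ccc", "ddd", "eee", "fff");

int draws = 6_000;
double expected = draws / 6.0;        // ~1000 per value for 6 values
double tolerance = 0.10;              // allow +/-10% deviation per value

Map<String, Integer> counts = new HashMap<>();
for (int i = 0; i < draws; i++) {
    counts.merge(provider.get(), 1, Integer::sum);
}

assertEquals(6, counts.size());       // every value showed up at least once
for (int count : counts.values()) {
    assertTrue(Math.abs(count - expected) <= expected * tolerance);
}
```

It still can't prove randomness, but it puts a number on "almost equal" instead of arguing about it.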

We were arguing and he eventually said "alright, let's call it a good enough solution, since we're short on time".

I wonder whether this will have an adverse effect on my evaluation. So have you ever argued with your interviewer? Did it turn out for the better or for the worse?

But more importantly, was I right? :D

Comments
  • 6
  • 3
    What was his solution then?
  • 2
    @alexbrooklyn He didn't suggest one to me
  • 3
  • 8
    No. If you toss a coin a million times, heads should be around 500k. If it's way off, you know your coin is likely not really random. Even still, if you happened to have tossed 999k heads with a truly balanced coin, the next toss will still be 50%.
  • 2
    It's ugly because no matter what you do your test will always have a fail rate, but that's just how statistics works.
  • 8
    Well, the dev was right, there are test suites that check for various statistical properties that have to be satisfied. Uniformity is only one point out of many, and checking only 100 rolls is far too few anyway.

    Thinking of test suites like Diehard(er).
  • 0
    Are the random factors at least saved?
  • 0
    @Lor-inc Around 50% - yes, I agree. But can you define that "around"? :) And I find it very unlikely to get an exactly equal distribution.

    That is what I meant. The problem with probabilities is that you can never be certain. And if you write a unit test that relies on probabilities, you will have randomly ( :) ) failing CI pipelines.

    IMO it would be an option to define an SLA for the deviation from 50% that is "good enough", and anything breaching that level would make the test fail (ruling the RNG/PRNG degraded, i.e. not suitable). Please do correct me if you feel I'm wrong.

    But I doubt this is a job for a 20-minute coding session :)
  • 2
    @netikras Of course you can define "around". The easiest test with N coin tosses is that the expected value is N/2, and a 2-sigma confidence interval (95%) means a +/-sqrt(N) interval around N/2. Totally doable in a 20-minute test.
  • 1
    @netikras In the case of the coin, the number of heads in a million tosses will give you something close to a [normal distribution](https://en.wikipedia.org/wiki/...). You are free to choose how "strict" you want your tests to be. The stricter the test, the more false positives and the fewer false negatives. Ideally you'd probably want the tolerance to equal the [deviation](https://en.wikipedia.org/wiki/...) of your [distribution](https://en.wikipedia.org/wiki/...), as this gives the best ratio of false positives to actual faulty RNGs.
  • 0
    Our industry is built around the concept of determinism. We are grossly unprepared even for basic cases of randomness, yet quantum computing is right around the corner.
  • 2
    @Lor-inc I guess anything like throwing in questions about the required confidence interval and false positives vs. false negatives would already have passed that question.

    The answer from @netikras just failed. The dev only kinda agreed in order to wrap it up, because it was obvious that the applicant didn't have even basic knowledge of statistics - and because it was probably not part of the job requirements anyway.
  • 0
    @Fast-Nop alright, what would be that "almost" for 100 items in the data set then?
  • 0
    @Fast-Nop I guess you're right :)
  • 1
    @netikras SQRT(100)=10, N/2=50, so anything from 40 to 60 for a binary coin toss would count as random with 95% confidence. Pretty loose, that is.

    Note that in order to have any significance, N must be big in comparison to SQRT(N). An order of magnitude is the absolute minimum, so some degree of significance barely starts at 100, but with a pretty big interval (20%).

    Same calculation for N=10000 would be anything from 4900 to 5100, i.e. only 2% relative to N.
  • 1
    @netikras The easy take-away for coin tosses is +/-SQRT(N) as a rule of thumb (sketched in code at the end of this thread).
  • 2
    I think I see how to test for randomness. Why check for randomness? Is this for some kind of business logic to determine if something is based upon market forces versus nature?
  • 1
    @Demolishun If you run simulations that involve noise, and the RNG is crap, then this can fuck up your simulations to the point where the results are more or less just RNG artifacts.

    Even worse for crypto stuff. A bad RNG can make shit so easy to crack that someone like Bruce Schneier could do it between two cups of coffee for breakfast.
  • 1
    @Demolishun It was TDD and the feature had to have 2 versions: 1 -- sequential and 2 -- random :) So I had to somehow show the interviewer with test assertions that both versions were working as expected.
  • 0
    📌
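
To make the +/-sqrt(N) rule of thumb from the thread concrete, here is a rough sketch of the coin-toss version as a test -- using java.util.Random as a stand-in coin, with the N=10000 numbers from the comments above:

```
// 2-sigma sanity check from the thread: for N fair coin tosses the number of
// heads should land within N/2 +/- sqrt(N) about 95% of the time, so a
// perfectly fine RNG will still fail this roughly 1 time in 20.
int n = 10_000;
int heads = 0;
Random rng = new Random();

for (int i = 0; i < n; i++) {
    if (rng.nextBoolean()) {
        heads++;
    }
}

double delta = Math.sqrt(n);                      // 2 sigma for a fair coin
assertTrue(Math.abs(heads - n / 2.0) <= delta);   // 4900..5100 for N=10000
```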