Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "prime factorization"
-
So I cracked prime factorization. For real.
I can factor a 1024 bit product in 11hours on an i3.
No GPU acceleration, no massive memory overhead. Probably a lot faster with parallel computation on a better cpu, or even on a gpu.
4096 bits in 97-98 hours.
Verifiable. Not shitting you. My hearts beating out of my fucking chest. Maybe it was an act of god, I don't know, but it works.
What should I do with it?241 -
I messaged a professor at MIT and surprisingly got a response back.
He told me that "generating primes deterministically is a solved problem" and he would be very surprised if what I wrote beat wheel factorization, but that he would be interested if it did.
It didnt when he messaged me.
It does now.
Tested on primes up to 26 digits.
Current time tends to be 1-100th to 2-100th of a second.
Seems to be steady.
First n=1million digits *always* returns false for composites, while for primes the rate is 56% true vs false, and now that I've made it faster, I'm fairly certain I can get it to 100% accuracy.
In fact what I'm thinking I'll do is generate a random semiprime using the suspected prime, map it over to some other factor tree using the variation on modular expotentiation several of us on devrant stumbled on, and then see if it still factors. If it does then we know the number in question is prime. And because we know the factor in question, the semiprime mapping function doesnt require any additional searching or iterations.
The false negative rate, I think goes to zero the larger the prime from what I can see. But it wont be an issue if I'm right about the accuracy being correctable.
I'd like to thank the professor for the challenge. He also shared a bunch of useful links.
That ones a rare bird.21 -
POSTMORTEM
"4096 bit ~ 96 hours is what he said.
IDK why, but when he took the challenge, he posted that it'd take 36 hours"
As @cbsa wrote, and nitwhiz wrote "but the statement was that op's i3 did it in 11 hours. So there must be a result already, which can be verified?"
I added time because I was in the middle of a port involving ArbFloat so I could get arbitrary precision. I had a crude desmos graph doing projections on what I'd already factored in order to get an idea of how long it'd take to do larger
bit lengths
@p100sch speculated on the walked back time, and overstating the rig capabilities. Instead I spent a lot of time trying to get it 'just-so'.
Worse, because I had to resort to "Decimal" in python (and am currently experimenting with the same in Julia), both of which are immutable types, the GC was taking > 25% of the cpu time.
Performancewise, the numbers I cited in the actual thread, as of this time:
largest product factored was 32bit, 1855526741 * 2163967087, took 1116.111s in python.
Julia build used a slightly different method, & managed to factor a 27 bit number, 103147223 * 88789957 in 20.9s,
but this wasn't typical.
What surprised me was the variability. One bit length could take 100s or a couple thousand seconds even, and a product that was 1-2 bits longer could return a result in under a minute, sometimes in seconds.
This started cropping up, ironically, right after I posted the thread, whats a man to do?
So I started trying a bunch of things, some of which worked. Shameless as I am, I accepted the challenge. Things weren't perfect but it was going well enough. At that point I hadn't slept in 30~ hours so when I thought I had it I let it run and went to bed. 5 AM comes, I check the program. Still calculating, and way overshot. Fuuuuuuccc...
So here we are now and it's say to safe the worlds not gonna burn if I explain it seeing as it doesn't work, or at least only some of the time.
Others people, much smarter than me, mentioned it may be a means of finding more secure pairs, and maybe so, I'm not familiar enough to know.
For everyone that followed, commented, those who contributed, even the doubters who kept a sanity check on this without whom this would have been an even bigger embarassement, and the people with their pins and tactical dots, thanks.
So here it is.
A few assumptions first.
Assuming p = the product,
a = some prime,
b = another prime,
and r = a/b (where a is smaller than b)
w = 1/sqrt(p)
(also experimented with w = 1/sqrt(p)*2 but I kept overshooting my a very small margin)
x = a/p
y = b/p
1. for every two numbers, there is a ratio (r) that you can search for among the decimals, starting at 1.0, counting down. You can use this to find the original factors e.x. p*r=n, p/n=m (assuming the product has only two factors), instead of having to do a sieve.
2. You don't need the first number you find to be the precise value of a factor (we're doing floating point math), a large subset of decimal values for the value of a or b will naturally 'fall' into the value of a (or b) + some fractional number, which is lost. Some of you will object, "But if thats wrong, your result will be wrong!" but hear me out.
3. You round for the first factor 'found', and from there, you take the result and do p/a to get b. If 'a' is actually a factor of p, then mod(b, 1) == 0, and then naturally, a*b SHOULD equal p.
If not, you throw out both numbers, rinse and repeat.
Now I knew this this could be faster. Realized the finer the representation, the less important the fractional digits further right in the number were, it was just a matter of how much precision I could AFFORD to lose and still get an accurate result for r*p=a.
Fast forward, lot of experimentation, was hitting a lot of worst case time complexities, where the most significant digits had a bunch of zeroes in front of them so starting at 1.0 was a no go in many situations. Started looking and realized
I didn't NEED the ratio of a/b, I just needed the ratio of a to p.
Intuitively it made sense, but starting at 1.0 was blowing up the calculation time, and this made it so much worse.
I realized if I could start at r=1/sqrt(p) instead, and that because of certain properties, the fractional result of this, r, would ALWAYS be 1. close to one of the factors fractional value of n/p, and 2. it looked like it was guaranteed that r=1/sqrt(p) would ALWAYS be less than at least one of the primes, putting a bound on worst case.
The final result in executable pseudo code (python lol) looks something like the above variables plus
while w >= 0.0:
if (p / round(w*p)) % 1 == 0:
x = round(w*p)
y = p / round(w*p)
if x*y == p:
print("factors found!")
print(x)
print(y)
break
w = w + i
Still working but if anyone sees obvious problems I'd LOVE to hear about it.38 -
CONTEST - Win big $$$ straight from Wisecrack!
For all those who participated in my original "cracking prime factorization" thread (and several I decided to add just because), I'm offering a whopping $5 to anyone who posts a 64 bit *product* of two primes, which I cant factor. Partly this is a thank you for putting up with me.
FIVE WHOLE DOLLARS! In 1909 money thats $124 dollars! Imagine how many horse and buggy rides you could buy with that back then! Or blowjobs!
Probably not a lot!
But still.
So the contest rules are simple:
Go to
https://asecuritysite.com/encryptio...
Enter 32 for the number of bits per prime, and generate a 64 bit product.
Post it here to enter the contest.
Products must be 64 bits, and the result of just *two* prime numbers. Smaller or larger bit lengths for products won't be accepted at this time.
I'm expecting a few entries on this. Entries will generally be processed in the order of submission, but I reserve the right to wave this rule.
After an entry is accepted, I'll post "challenge accepted. Factoring now."
And from that point on I have no more than 5 hours to factor the number, (but results usually arrive in 30-60 minutes).
If I fail to factor your product in the specified time (from the moment I indicate I've begun factoring), congratulations, you just won $5.
Payment will be made via venmo or other method at my discretion.
One entry per user. Participants from the original thread only, as well as those explicitly mentioned.
Limitations: Factoring shall be be done
1. without *any* table lookup of primes or equivalent measures, 2. using anything greater than an i3, 3. without the aid of a gpu, 4. without multithreading. 5. without the use of more than one machine.
FINALLY:
To claim your prize, post the original factors of your product here, after the deadline has passed.
And then I'll arrange payment of the prize.
You MUST post the factors of your product after the deadline, to confirm your product and claim your prize.99 -
Still on the primenumbers bender.
Had this idea that if there were subtle correlations between a sufficiently large set of identities and the digits of a prime number, the best way to find it would be to automate the search.
And thats just what I did.
I started with trace matrices.
I actually didn't expect much of it. I was hoping I'd at least get lucky with a few chance coincidences.
My first tests failed miserably. Eight percent here, 10% there. "I might as well just pick a number out of a hat!" I thought.
I scaled it way back and asked if it was possible to predict *just* the first digit of either of the prime factors.
That also failed. Prediction rates were low still. Like 0.08-0.15.
So I automated *that*.
After a couple days of on-and-off again semi-automated searching I stumbled on it.
[1144, 827, 326, 1184, -1, -1, -1, -1]
That little sequence is a series of identities representing different values derived from a randomly generated product.
Each slots into a trace matrice. The results of which predict the first digit of one of our factors, with a 83.2% accuracy even after 10k runs, and rising higher with the number of trials.
It's not much, but I was kind of proud of it.
I'm pushing for finding 90%+ now.
Some improvements include using a different sort of operation to generate results. Or logging all results and finding the digit within each result thats *most* likely to predict our targets, across all results. (right now I just take the digit in the ones column, which works but is an arbitrary decision on my part).
Theres also the fact that it's trivial to correctly guess the digit 25% of the time, simply by guessing 1, 3, 7, or 9, because all primes, except for 2, end in one of these four.
I have also yet to find a trace with a specific bias for predicting either the smaller of two unique factors *or* the larger. But I haven't really looked for one either.
I still need to write a generate that takes specific traces, and lets me mutate some of the values, to push them towards certain 'fitness' levels.
This would be useful not just for very high predictions, but to find traces with very *low* predictions.
Why? Because it would actually allow for the *elimination* of possible digits, much like sudoku, from a given place value in a predicted factor.
I don't know if any of this will even end up working past the first digit. But splitting the odds, between the two unique factors of a prime product, and getting 40+% chance of guessing correctly, isn't too bad I think for a total amateur.
Far cry from a couple years ago claiming I broke prime factorization. People still haven't forgiven me for that, lol.6 -
Officially faster bruteforcing:
https://pastebin.com/uBFwkwTj
Provided toy values for others to try. Haven't tested if it works with cryptographic secure prime pairs (gcf(p, q) == 1)
It's a 50% reduction in time to bruteforce a semiprime. But I also have some inroads to a/30.
It's not "broke prime factorization for good!" levels of fast, but its still pretty nifty.
Could use decimal support with higher precision so I don't cause massive overflows on larger numbers, but this is just a demonstration after all.13 -
Managed to derive an inverse to karatsuba's multiplication method, converting it into a factorization technique.
Offers a really elegant reason for why non-trivial semiprimes (square free products) are square free.
For a demonstration of karatsubas method, check out:
https://getpocket.com/explore/item/...
Now for the reverse, like I said something elegant emerges.
So we can start by taking the largest digit in our product. Lets say our product is 697.
We find all the digits that produce 6 when summed, along with their order.
thats (1,5), (5,1), (2,4), (4,2), and (3,3)
That means for one of our factors, its largest digit can ONLY be 1, 5, 2, 4, or 3.
Lets take karatsubas method at step f (in the link) and reverse it. Instead of subtracting, we're adding.
If we assume (3,3)
Then we take our middle digit of our product p, in this case the middle digit of 697. is 9, and we munge it with 3.
Then we add our remaining 3, and our remaining unit digit, to get 3+39+7 = 49.
Now, because karatsuba's method ONLY deals with multiplication in single digits, we only need to consider *at most* two digit products.
And interestingly, the only factors of 49 are 7.
49 is a square!
And the only sums that produce 7, are (2,5), (5,2), (3,4), and (4,3)
These would be the possible digits of the factors of 697 if we initially chose (3,3) as our starting point for calculating karatsubas inverse f step.
But you see, 25 can't be a factor of p=697, because 25 is a square, and ends in a 5, so its clearly not prime. 52 can't be either because it ends in 2, likewise 34 ending in 4.
Only 43 could be our possible factor of p.
And we *only* get one factor because our starting point has two of the same digit. Which would mean p would have to equal 43 (a prime) or 1. And because p DOESNT (it equals 697), we can therefore say (3,3) is the wrong starting point, as are ALL starting points that share only one digit, or end in a square.
Ergo we can say the products of non-squares, are specifically non-prime precisely because if they *were* prime, their only factors would HAVE to be themselves, and 1.
For an even BETTER explanation go try karatsuba's method with any prime as the first factor, and 1 as the second factor (just multiply the tens column by zero). And you can see why the inverse, where you might try a starting point that has two matching digits (like 3,3), would obviously fail, because the values it produces could only have two factors; some prime thats not our product, or the value one, which is also not our product.
It's elegant almost to the level of a tautology. -
Is there any exact way to get the product of all primes under n multiplied together, without explicitly knowing what those primes are?
Lets call this number V.
Because hypothethetically, if we calculate from the *base* of V, then we can derive easy divisibility rules for V-1 and V+1, as laid out
here:
https://notaboutapples.wordpress.com/...
And then, unless I've misunderstood something, the problem of factorization has been changed from division into an addition and subtraction problem.12 -
I may have accidentally found a legit factorization method that converts factoring to a combinatorics problem over a graph, with a time complexity that is the factorial of the logarithm of the semi prime being factored.
I don't know if this is supposed to be good or not, and I don't want to post it prematurely like I almost always do. Not at least until I study its properties better, but it's still a pretty interesting find I think.11 -
So I promised a post after work last night, discussing the new factorization technique.
As before, I use a method called decon() that takes any number, like 697 for example, and first breaks it down into the respective digits and magnitudes.
697 becomes -> 600, 90, and 7.
It then factors *those* to give a decomposition matrix that looks something like the following when printed out:
offset: 3, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('7'), Decimal('1')]]
Each entry is a pair of numbers representing a prime base and an exponent.
Now the idea was that, in theory, at each magnitude of a product, we could actually search through the *range* of the product of these exponents.
So for offset three (600) here, we're looking at
2^3 * 3 ^ 1 * 5 ^ 2.
But actually we're searching
2^3 * 3 ^ 1 * 5 ^ 2.
2^3 * 3 ^ 1 * 5 ^ 1
2^3 * 3 ^ 1 * 5 ^ 0
2^3 * 3 ^ 0 * 5 ^ 2.
2^3 * 3 ^ 1 * 5 ^ 1
etc..
On the basis that whatever it generates may be the digits of another magnitude in one of our target product's factors.
And the first optimization or filter we can apply is to notice that assuming our factors pq=n,
and where p <= q, it will always be more efficient to search for the digits of p (because its under n^0.5 or the square root), than the larger factor q.
So by implication we can filter out any product of this exponent search that is greater than the square root of n.
Writing this code was a bit of a headache because I had to deal with potentially very large lists of bases and exponents, so I couldn't just use loops within loops.
Instead I resorted to writing a three state state machine that 'counted down' across these exponents, and it just works.
And now, in practice this doesn't immediately give us anything useful. And I had hoped this would at least give us *upperbounds* to start our search from, for any particular digit of a product's factors at a given magnitude. So the 12 digit (or pick a magnitude out of a hat) of an example product might give us an upperbound on the 2's exponent for that same digit in our lowest factor q of n.
It didn't work out that way. Sometimes there would be 'inversions', where the exponent of a factor on a magnitude of n, would be *lower* than the exponent of that factor on the same digit of q.
But when I started tearing into examples and generating test data I started to see certain patterns emerge, and immediately I found a way to not just pin down these inversions, but get *tight* bounds on the 2's exponents in the corresponding digit for our product's factor itself. It was like the complications I initially saw actually became a means to *tighten* the bounds.
For example, for one particular semiprime n=pq, this was some of the data:
n - offset: 6, exp: [[Decimal('2'), Decimal('5')], [Decimal('5'), Decimal('5')]]
q - offset: 6, exp: [[Decimal('2'), Decimal('6')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('5')]]
It's almost like the base 3 exponent in [n:7] gives away the presence of 3^1 in [q:6], even
though theres no subsequent presence of 3^n in [n:6] itself.
And I found this rule held each time I tested it.
Other rules, not so much, and other rules still would fail in the presence of yet other rules, almost like a giant switchboard.
I immediately realized the implications: rules had precedence, acted predictable when in isolated instances, and changed in specific instances in combination with other rules.
This was ripe for a decision tree generated through random search.
Another product n=pq, with mroe data
q(4)
offset: 4, exp: [[Decimal('2'), Decimal('4')], [Decimal('5'), Decimal('3')]]
n(4)
offset: 4, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('3')]]
Suggesting that a nontrivial base 3 exponent (**2 rather than **1) suggests the exponent on the 2 in the relevant
digit of [n], is one less than the same base 2 digital exponent at the same digit on [q]
And so it was clear from the get go that this approach held promise.
From there I discovered a bunch more rules and made some observations.
The bulk of the patterns, regardless of how large the product grows, should be present in the smaller bases (some bound of primes, say the first dozen), because the bulk of exponents for the factorization of any magnitude of a number, overwhelming lean heavily in the lower prime bases.
It was if the entire vulnerability was hiding in plain sight for four+ years, and we'd been approaching factorization all wrong from the beginning, by trying to factor a number, and all its digits at all its magnitudes, all at once, when like addition or multiplication, factorization could be done piecemeal if we knew the patterns to look for.7 -
Math question time!
Okay so I had this idea and I'm looking for anyone who has a better grasp of math than me.
What if instead of searching for prime factors we searched for a number above p?
One with a certain special property. BEAR WITH ME. I know I make these posts a lot and I'm a bit of a shitposter, but I'm being genuine here.
Take this cherry picked number, 697 for example.
It's factors are 17, and 41. It's trivial but just for demonstration.
If we represented it's factors as a bit string, where each bit represents the index that factor occurs at in a list of primes, it looks like this
1000001000000
When converted back to an integer that number becomes 4160, which we will call f.
And if we do 4160/(2**n) until the result returns
a fractional component, then N in this case will be 7.
And 7 is the index of our lowest factor 17 (lets call it A, and our highest factor we'll call B) in our primes list.
So the problem is changed from finding a factorization of p, to finding an algorithm that allows you to convert p into f. Once you have f it's a matter of converting it to binary, looking up the indexes of all bits set to 1, and finding the values of those indexes in the list of primes.
I'm working on doing that and if anyone has any insights I'm all ears.9 -
Hang with me! This is *not* a math shitpost, I repeat, it is NOT a math shitpost, not entirely anyway.
It appears there is for products of two non-trivial factors, a real number n (well a rational number anyway) such that p/n = i (some number in the set of integers), whos factor chain is apparently no greater than floor(log(log(p))**2)-2, and whos largest factor is never greater than p^(1/4).
And that this number is at least derivable, laboriously with the following:
where p=a*b
https://pastebin.com/Z4thebha
And assuming you have the factors of p/z = jkl..
then instead of doing
p/(jkl..) = z
you can do
p-(jkl) to get the value of [result] whos index is a-1
Getting the actual factor tree of p/z is another matter, but its a start.
Edit: you have to provide your own product.
Preferably import Decimal first.3 -
I'm teaching a couple of classes where students (~18 years old) work on their own projects. I just deleted two of those from my machine: one Angular and one Spring Boot, but just boilerplate. Together, they were about 500 MB. I spent 2-3 hours working on a little Go tool to make concurrent HTTP requests and to report statistics on the response time. The entire repository is roughly 500 kB in size, but solves a genuine problem. My students have a bloat ratio of 1000 compared to me as a baseline, but my stuff actually does things. Today, I programmed prime factorization in PHP for some load tests (mod_php vs. PHP-FPM). The PHP script is 1148 bytes long (but the file system reports 4 kB). My students could learn more from such a script than from their overblown "projects", but "PHP sucks" I hearsay, so let's bloat on.11