Introduction
For Defcon Quals 2019, I wrote a challenge called ASRybaB.
You can find the challenge, its solution, and other scripts here.
challenge.py
contains what given during the competition. Some of the code is obfuscated.secret.py
contains the secret used to compute the HMAC, and it was, obviously, not released.compile.py
was used to generatechallenge.py
starting fromoriginalchallenge.py
.originalchallenge.py
contains the non-obfuscated challenge code.solve_d.sage
contains my solution, which usessage
.
The challenge sends you 3 RSA “challenges” in the form of n, e, v
.
The goal is to provide a “signed” value of v
, which requires finding the secret exponent d
, corresponding to the provided public key (n, e)
.
To get the flag, it is necessary to provide the 3 signed messages in about 15 minutes.
To ensure this time constraint, I used an HMAC, “authenticating” the sent n, e, v
values and the timestamp of when they were generated.
I chose this solution to avoid requiring a long-term connection or saving some state on the challenge’s server.
Obfuscation
The public key (n, e)
is computed so that the corresponding private exponent has special properties.
This function computing the key (create_key
) is obfuscated (see the compile.py
file) by:
- serializing it using Python
marshal
, - modifying its bytecode to make decompilers fail.
Specifically, I inserted a STOP_CODE
bytecode instruction in some dead code.
It was still possible to reverse engineer the create_key
function by using the dis
module, however that required reading Python bytecode instead of Python code.
Another possibility was to add a .pyc
header to the marshaled code, manually remove the STOP_CODE
bytecode, and then use a Python decompiler (e.g., uncompyle6).
Properties of the private exponent
The non-obfuscated code of the create_key
function is the following:
def create_key():
if False: #used by obfuscation
x = 7/0
x = 7/0
Nsize = NSIZE
pqsize = Nsize/2
N = 0
while(N.bit_length()!=Nsize):
while True:
p = number.getStrongPrime(pqsize)
q = number.getStrongPrime(pqsize)
if abs(p-q).bit_length() > (Nsize*0.496):
break
N = p*q
phi = (p-1)*(q-1)
limit1 = 0.261
limit2 = 0.293
while True:
d = number.getRandomRange(pow(2,int(Nsize*limit1)),pow(2,int(Nsize*limit1)+1))
while d.bit_length()<Nsize*limit2:
ppp = 0
while not number.isPrime(ppp):
ppp = number.getRandomRange(pow(2,45),pow(2,45)+pow(2,12))
d *= ppp
if number.GCD(d, phi)!=1:
continue
e = number.inverse(d, phi)
if number.GCD(e, phi)!=1:
continue
break
zzz = 3
return (N, e)
The key is generated so that the private exponent d
is slightly larger of the 0.292
limit, so that the Boneh-Durfee attack cannot be applied directly.
However, given how the key is computed:
- it is guaranteed to contain a prime
ppp
(within a set of 125 values), d/ppp < N^0.292
My solution
My solution reuses the Boneh-Durfee attack implemented by David Wong and available on GitHub, with 2 major modifications.
Exploiting the known prime
Let’s assume for now that we know the value ppp
of the prime composing d
.
We can then rewrite the equations at page 14 of the paper mentioned above in the following way:
e*d = 1 (mod phi(N))
e*d = k * phi(N) + 1
(e*ppp)*(d/ppp) = k * phi(N) + 1
k*phi(N) + 1 = 0 (mod e*ppp)
Consequently, if we know ppp
we can just apply the “standard” Boneh-Durfee attack using ppp * e
as the modulo of our equation, instead of e
.
In my code this is implemented by the line:
modulus *= ppp
Bruteforcing the known prime
As mentioned above ppp
can only have 125 values. Since we need to solve 3 challenges, at most, we need to run the Boneh-Durfee 125*3 = 375
times.
My code uses a Python multiprocessing Queue to parallelize this process.
On a modern 64 cores machine, this takes, at most, about 15 minutes.
Some considerations
Bruteforcing
I wanted to write a challenge requiring writing parallel code.
At the same time, I wanted to avoid a solution that could only be implemented by spending a lot of money or using special hardware.
For this reason, while implementing the challenge I intentionally tuned things so that it was solvable using less than $1
of cloud computing.
On Amazon EC2 a 64 vCPU machine (m4.16xlarge
) costs $3.20 per hour, which is only about $0.80 for the required 15 minutes of computation.
The 0.292
limit
I did quite a lot of testing and, at least with the code I used, it was basically impossible to find a solution for d
above N ^ 0.28
.
For this reason, I think that the 0.292
is more a theoretical limit than a real one.
p
and q
This paper explains that if p-q
is “small” a variation of the Boneh-Durfee could be possible even if d > N ^ 0.292
.
To avoid this issue, my code rejects p
and q
values if their difference is not large enough.
However, this check increased the computation time needed to generate the keys and made me concerned about the service using too much CPU, slowing down our infrastructure.
For this reason, to access this challenge, it was first needed to generate a Proof-of-Work.