Today: Applications of crypto math, Monday, March 30th, 2015

Today:

Pedersen commitment

Commit(x) -> commitment c
Reveal(c) -> opens the commitment, revealing x

Properties:

One scheme:

c = hash(x || r), r = random nonce

Pedersen commitment has information-theoretic hiding properties, but computationally-hard binding

Setup:

Receiver sets up h = g^a mod p, a is random from Z_q, he knows a and it is secret.

Commit(x \in Z_q):
   choose r \in Z_q at random
   c = g^x h^r mod p

Reveal(c):
   reveal x and r
   receiver verifies that c = g^x h^r mod p

Properties:

Prove that for any x', there exists an r' such that g^x' h^r' = c

g^x g^ar = g^x' g^ar' (mod p) <=> g^(x+ar) = g^(x'+ar') (mod p) <=>
x + ar = x' + ar' (mod q) <=> r' = (x+ar - x') / a = (x-x')/a + r

See discrete log identity and discrete log properties.

Therefore there exists an r' for any x' such that the commitment c for x is also a commitment for x' => c hides x.

Thus, this commitment is perfectly hiding (information theoretic hiding) because I haven't really committed to any value. I could open it in lots of different ways (however, we will show it is very hard to open it in different ways)

Then, I can compute a discrete log on h and reveal a:

g^(x+ar) = g^(x'+ar') <=> x+ar = x' +ar' <=> a = (x-x')/(r-r')

This is (sort of) a contradiction, because I assumed discrete log is hard and I can't know a. If I had an algorithm for efficiently computing the x' and r' then I would have an algorithm for computing discrete logs efficiently.

Thus, it's computationally binding.

Yes:

c = g^x h^r, commits x
=> c' = c * g = g^(x+1) h^r, commits x+1

In some applications that's a bad thing.

You can hash x first, to fix this!

Public key encryption

What do we need:

Keygen(1^l) -> (pk, sk)

Enc(pk, m) -> c
 - typically this will be randomized to avoid guessing attacks
   where attackers try different m's and hope to get the attacked c

Dec(sk, c) -> m

We need Dec(sk, Enc(pk, m)) = m

RSA doesn't meet this definition because it is not randomized. ElGamal is.

ElGamal

G = <g> (cyclic group generated by g)
discrete log needs to be hard in G

Keygen(): 
    pick x at random from {0, 1, ... |G|-1}
    sk = x
    pk = g^x 
    return (pk, sk)

Note:
 - security parameter is the group size |G|
 - discrete logarithm is hard => given pk = g^x it is hard to obtain sk = x

Enc(g^x, m):
    pick k at random from {0, 1, ... |G|-1}
    return (g^k, m*((g^x)^k))

Dec(x, c = (a = g^k, b = m*g^(xk))):
   ( you have x, so you can compute g^(xk) )
   g^(xk) = (g^k)^x = a^x
   m = b/(a^x)


Note:
 - assuming we can represent messages as members of the group G

If your m's are too big, you would need a huge group G. Typically, you keep a reasonable group size of 2048 bits and you encrypt a symmetric key using ElGamal. Then you encrypt your big message m with the symmetric key.

ElGamal as a Diffie-Hellman key exchange

Alice                                   Bob
                    g^x
        ------------------------------>

                a = g^k
        <------------------------------

DH key                                  DH key
= (g^k)^x = g^(xk)                      = (g^x)^k = g^(xk)

                b = m * g^(xk)            
        <------------------------------

Why does ElGamal work?

What are our security specifications?

Semantic security

  1. Phase I
  2. Phase II

Getting the right answer half the time is easy: just guess. However, doing any better than that should be hard!

Scheme is semantically secure if the probability that the adversary wins is 1/2 + negl(l), where l is the security parameter

From this definition, it should be clear that the scheme should be non-deterministic or randomized, because the adversary gets oracle access. This would allow him to check if Enc(m_0) = c* or Enc(m_1) = c* if the scheme were deterministic.

Semantic security requires either:

Decisional Diffie-Hellman (DDH)

Not computing discrete logs (CDH assumption).

Given group G and a generator g, it is infeasible to decide whether a given triple of elements is of form:

or,

It turns out it is very hard to tell these apart, given that triple.

Theorem: If we assume DDH is hard, then CDH is hard. (because if you could compute it, then you can recognize it).

"DDH is considered a stronger assumption than discrete log, because there are groups for which detecting DDH tuples is easy, but computing discrete logs is believed to be hard. Thus, requiring that the DDH assumption holds in a group is a more restricting requirement." -- Wikipedia

Theorem: ElGamal is semantically secure <=> DDH holds in G

Proof <=: Suppose you have a way to break the IND-CPA game for ElGamal => then you can solve a DDH instance. Let's see how:

Being able to break the IND-CPA means that given a ciphertext c*, either of m1 or of m2 (where m1 and m2 are chosen by you).

c1 = Enc(g^x, m1) = (g^k1, m1*(g^x)^k1)
c2 = Dec(g^x, m2) = (g^k2, m2*(g^x)^k2)

c* = c1 or c2

...you have an algorithm A that is able to tell which one of m1 or m2 is encrypted in c*

We'll see how you can build another algorithm D that can solve the DDH problem given an algorithm for A

Back to ElGamal

ElGamal has nice properties, like homomorphism.

Suppose I have:

sk = x
pk = y = g^x

E(m1) = c1 = (g^r, m1 * y^r)
E(m2) = c2 = (g^s, m2 * y^s)

Then,

c1*c2 = (g^(r+s), m1*m2 * y^(r+s)) = E(m1*m2)

In particular if m2 = 1, then c1 * c2 will rerandomize the encryption of m1

This is useful for mix-nets.

It can also be annoying if you want to meet a stronger definition of security.

IND-CCA2: Indistinguishability under chosen-ciphertext attack

We can change the semantic security game, by allowing the adversary to decrypt as well (except decrypting the challenge ciphertext).

You can see that because ElGamal ciphertexts can be rerandomized => it is not secure under IND-CCA2 (adversary can generate different ciphertext for the same message m_b and ask decryption oracle to decrypt, revealing m_b and thus b)