Today: Applications of crypto math, Monday, March 30th, 2015

Today:

Pedersen commitments
Public key encryption
ElGamal PK encryption
Semantic security
DDH (Decisional Diffie-Hellman)
IND-CCA2 (chosen ciphertext indistinguishability)
Cramer-Shoup PK encryption

Pedersen commitment

Commit(x) -> commitment c
Reveal(c) -> opens the commitment, revealing x

Properties:

Hiding: commitment c does not reveal anything about x
- information theoretic (no way it reveals anything)
- computational (no feasible way to discover what is revealed)
Binding: can only open c in one way (you can't change what you committed to)
Non-malleability: can't produce another commitment to x' != x given c

One scheme:

c = hash(x || r), r = random nonce

Pedersen commitment has information-theoretic hiding properties, but computationally-hard binding

Setup:

large prime p and q, such that q | p-1 (safe prime p = 2q + 1)
g generator of order q subgroup Q_p

Receiver sets up h = g^a mod p, a is random from Z_q, he knows a and it is secret.

Commit(x \in Z_q):
   choose r \in Z_q at random
   c = g^x h^r mod p

Reveal(c):
   reveal x and r
   receiver verifies that c = g^x h^r mod p

Properties:

Hiding: Does it hide x? Given g^x h^r mod p, and you don't know a.
- Claim is that c = g^x h^r = g^x' h^r' can be a commitment for any x

Prove that for any x', there exists an r' such that g^x' h^r' = c

g^x g^ar = g^x' g^ar' (mod p) <=> g^(x+ar) = g^(x'+ar') (mod p) <=>
x + ar = x' + ar' (mod q) <=> r' = (x+ar - x') / a = (x-x')/a + r

See discrete log identity and discrete log properties.

Therefore there exists an r' for any x' such that the commitment c for x is also a commitment for x' => c hides x.

Thus, this commitment is perfectly hiding (information theoretic hiding) because I haven't really committed to any value. I could open it in lots of different ways (however, we will show it is very hard to open it in different ways)

Binding: If it's not binding, then I know two ways of opening it c = g^x h^r and c = g^x' h^r', and I know x, r, x', r' for the same commitment c

Then, I can compute a discrete log on h and reveal a:

g^(x+ar) = g^(x'+ar') <=> x+ar = x' +ar' <=> a = (x-x')/(r-r')

This is (sort of) a contradiction, because I assumed discrete log is hard and I can't know a. If I had an algorithm for efficiently computing the x' and r' then I would have an algorithm for computing discrete logs efficiently.

Thus, it's computationally binding.

Non-malleability: If I bid with commitment c for value x can you compute a c' for a value x+1 that will let you bid higher than me?

Yes:

c = g^x h^r, commits x
=> c' = c * g = g^(x+1) h^r, commits x+1

In some applications that's a bad thing.

You can hash x first, to fix this!

Public key encryption

What do we need:

Keygen(1^l) -> (pk, sk)

Enc(pk, m) -> c
 - typically this will be randomized to avoid guessing attacks
   where attackers try different m's and hope to get the attacked c

Dec(sk, c) -> m

We need Dec(sk, Enc(pk, m)) = m

RSA doesn't meet this definition because it is not randomized. ElGamal is.

ElGamal

G = <g> (cyclic group generated by g)
discrete log needs to be hard in G

Keygen(): 
    pick x at random from {0, 1, ... |G|-1}
    sk = x
    pk = g^x 
    return (pk, sk)

Note:
 - security parameter is the group size |G|
 - discrete logarithm is hard => given pk = g^x it is hard to obtain sk = x

Enc(g^x, m):
    pick k at random from {0, 1, ... |G|-1}
    return (g^k, m*((g^x)^k))

Dec(x, c = (a = g^k, b = m*g^(xk))):
   ( you have x, so you can compute g^(xk) )
   g^(xk) = (g^k)^x = a^x
   m = b/(a^x)


Note:
 - assuming we can represent messages as members of the group G

If your m's are too big, you would need a huge group G. Typically, you keep a reasonable group size of 2048 bits and you encrypt a symmetric key using ElGamal. Then you encrypt your big message m with the symmetric key.

ElGamal as a Diffie-Hellman key exchange

Alice                                   Bob
                    g^x
        ------------------------------>

                a = g^k
        <------------------------------

DH key                                  DH key
= (g^k)^x = g^(xk)                      = (g^x)^k = g^(xk)

                b = m * g^(xk)            
        <------------------------------

Why does ElGamal work?

What are our security specifications?

Semantic security

Phase I
- Adversary sees pk
- Adversary outputs m_0, m_1 different than each other, |m_0| = |m_1|
- Adversary maybe keeps some state s
Phase II
- Examiner picks a random bit b
- Examiner encrypts one of the messages: c* = Enc(pk, m_b)
- Examiner sends c* to adversary, he has to figure out if m_0 or m_1 were encrypted
  - Adversary can look at his state s
- Adversary gets encryption oracle access, but no decryption oracle access
- Adversary outputs b*, which is his guess for b (ie: his guess for which message was encrypted)
- Adversary wins the game if b* = b

Getting the right answer half the time is easy: just guess. However, doing any better than that should be hard!

Scheme is semantically secure if the probability that the adversary wins is 1/2 + negl(l), where l is the security parameter

From this definition, it should be clear that the scheme should be non-deterministic or randomized, because the adversary gets oracle access. This would allow him to check if Enc(m_0) = c* or Enc(m_1) = c* if the scheme were deterministic.

Semantic security requires either:

randomness in E
have some state in E that results in different ciphertexts for the same message
- maybe a simple counter passed through a hash function can work

Decisional Diffie-Hellman (DDH)

Not computing discrete logs (CDH assumption).

Given group G and a generator g, it is infeasible to decide whether a given triple of elements is of form:

(g^a, g^b, g^c) where a,b,c are random

or,

(g^a, g^b, g^ab) where a,b are random

It turns out it is very hard to tell these apart, given that triple.

Theorem: If we assume DDH is hard, then CDH is hard. (because if you could compute it, then you can recognize it).

"DDH is considered a stronger assumption than discrete log, because there are groups for which detecting DDH tuples is easy, but computing discrete logs is believed to be hard. Thus, requiring that the DDH assumption holds in a group is a more restricting requirement." -- Wikipedia

Theorem: ElGamal is semantically secure <=> DDH holds in G

Proof <=: Suppose you have a way to break the IND-CPA game for ElGamal => then you can solve a DDH instance. Let's see how:

Being able to break the IND-CPA means that given a ciphertext c*, either of m1 or of m2 (where m1 and m2 are chosen by you).

c1 = Enc(g^x, m1) = (g^k1, m1*(g^x)^k1)
c2 = Dec(g^x, m2) = (g^k2, m2*(g^x)^k2)

c* = c1 or c2

...you have an algorithm A that is able to tell which one of m1 or m2 is encrypted in c*

We'll see how you can build another algorithm D that can solve the DDH problem given an algorithm for A

Back to ElGamal

ElGamal has nice properties, like homomorphism.

Suppose I have:

sk = x
pk = y = g^x

E(m1) = c1 = (g^r, m1 * y^r)
E(m2) = c2 = (g^s, m2 * y^s)

Then,

c1*c2 = (g^(r+s), m1*m2 * y^(r+s)) = E(m1*m2)

In particular if m2 = 1, then c1 * c2 will rerandomize the encryption of m1

This is useful for mix-nets.

It can also be annoying if you want to meet a stronger definition of security.

IND-CCA2: Indistinguishability under chosen-ciphertext attack

We can change the semantic security game, by allowing the adversary to decrypt as well (except decrypting the challenge ciphertext).

You can see that because ElGamal ciphertexts can be rerandomized => it is not secure under IND-CCA2 (adversary can generate different ciphertext for the same message m_b and ask decryption oracle to decrypt, revealing m_b and thus b)