Today:
Commit(x) -> commitment c
Reveal(c) -> opens the commitment, revealing x
Properties:
c does not reveal anything about x
c in one way (you can't change what you committed to)x' != x given cOne scheme:
c = hash(x || r), r = random nonce
Pedersen commitment has information-theoretic hiding properties, but computationally-hard binding
Setup:
p and q, such that q | p-1 (safe prime p = 2q + 1)g generator of order q subgroup Q_pReceiver sets up h = g^a mod p, a is random from Z_q, he knows a and it is secret.
Commit(x \in Z_q):
   choose r \in Z_q at random
   c = g^x h^r mod p
Reveal(c):
   reveal x and r
   receiver verifies that c = g^x h^r mod p
Properties:
x? Given g^x h^r mod p, and you don't know a.
c = g^x h^r = g^x' h^r' can be a commitment for any xProve that for any x', there exists an r' such that g^x' h^r' = c
g^x g^ar = g^x' g^ar' (mod p) <=> g^(x+ar) = g^(x'+ar') (mod p) <=>
x + ar = x' + ar' (mod q) <=> r' = (x+ar - x') / a = (x-x')/a + r
See discrete log identity and discrete log properties.
Therefore there exists an r' for any x' such that the commitment c for x
is also a commitment for x' => c hides x.
Thus, this commitment is perfectly hiding (information theoretic hiding) because I haven't really committed to any value. I could open it in lots of different ways (however, we will show it is very hard to open it in different ways)
c = g^x h^r
and c = g^x' h^r', and I know x, r, x', r' for the same commitment cThen, I can compute a discrete log on h and reveal a:
g^(x+ar) = g^(x'+ar') <=> x+ar = x' +ar' <=> a = (x-x')/(r-r')
This is (sort of) a contradiction, because I assumed discrete log is hard and I 
can't know a. If I had an algorithm for efficiently computing the x' and r'
then I would have an algorithm for computing discrete logs efficiently.
Thus, it's computationally binding.
c for value x can you compute
a c' for a value x+1 that will let you bid higher than me?Yes:
c = g^x h^r, commits x
=> c' = c * g = g^(x+1) h^r, commits x+1
In some applications that's a bad thing.
You can hash x first, to fix this!
What do we need:
Keygen(1^l) -> (pk, sk)
Enc(pk, m) -> c
 - typically this will be randomized to avoid guessing attacks
   where attackers try different m's and hope to get the attacked c
Dec(sk, c) -> m
We need Dec(sk, Enc(pk, m)) = m
RSA doesn't meet this definition because it is not randomized. ElGamal is.
G = <g> (cyclic group generated by g)
discrete log needs to be hard in G
Keygen(): 
    pick x at random from {0, 1, ... |G|-1}
    sk = x
    pk = g^x 
    return (pk, sk)
Note:
 - security parameter is the group size |G|
 - discrete logarithm is hard => given pk = g^x it is hard to obtain sk = x
Enc(g^x, m):
    pick k at random from {0, 1, ... |G|-1}
    return (g^k, m*((g^x)^k))
Dec(x, c = (a = g^k, b = m*g^(xk))):
   ( you have x, so you can compute g^(xk) )
   g^(xk) = (g^k)^x = a^x
   m = b/(a^x)
Note:
 - assuming we can represent messages as members of the group G
If your m's are too big, you would need a huge group G. Typically, you keep
a reasonable group size of 2048 bits and you encrypt a symmetric key using ElGamal.
Then you encrypt your big message m with the symmetric key.
Alice                                   Bob
                    g^x
        ------------------------------>
                a = g^k
        <------------------------------
DH key                                  DH key
= (g^k)^x = g^(xk)                      = (g^x)^k = g^(xk)
                b = m * g^(xk)            
        <------------------------------
What are our security specifications?
pkm_0, m_1 different than each other, |m_0| = |m_1|sbc* = Enc(pk, m_b)c* to adversary, he has to figure out if m_0 or m_1
were encrypted
sb*, which is his guess for b (ie: his guess for
which message was encrypted)b* = bGetting the right answer half the time is easy: just guess. However, doing any better than that should be hard!
Scheme is semantically secure if the probability that the adversary wins is
1/2 + negl(l), where l is the security parameter
From this definition, it should be clear that the scheme should be non-deterministic or 
randomized, because the adversary gets oracle access. This would allow him to check
if Enc(m_0) = c* or Enc(m_1) = c* if the scheme were deterministic.
Semantic security requires either:
Not computing discrete logs (CDH assumption).
Given group G and a generator g, it is infeasible to decide whether a given
triple of elements is of form:
(g^a, g^b, g^c) where a,b,c are randomor,
(g^a, g^b, g^ab) where a,b are randomIt turns out it is very hard to tell these apart, given that triple.
Theorem: If we assume DDH is hard, then CDH is hard. (because if you could compute it, then you can recognize it).
"DDH is considered a stronger assumption than discrete log, because there are groups for which detecting DDH tuples is easy, but computing discrete logs is believed to be hard. Thus, requiring that the DDH assumption holds in a group is a more restricting requirement." -- Wikipedia
Theorem: ElGamal is semantically secure <=> DDH holds in G
Proof <=: Suppose you have a way to break the IND-CPA game for ElGamal =>
then you can solve a DDH instance. Let's see how:
Being able to break the IND-CPA means that given a ciphertext c*, either of 
m1 or of m2 (where m1 and m2 are chosen by you).
c1 = Enc(g^x, m1) = (g^k1, m1*(g^x)^k1)
c2 = Dec(g^x, m2) = (g^k2, m2*(g^x)^k2)
c* = c1 or c2
...you have an algorithm A that is able to tell which one of m1 or m2 is
encrypted in c*
We'll see how you can build another algorithm D that can solve the DDH problem
given an algorithm for A
ElGamal has nice properties, like homomorphism.
Suppose I have:
sk = x
pk = y = g^x
E(m1) = c1 = (g^r, m1 * y^r)
E(m2) = c2 = (g^s, m2 * y^s)
Then,
c1*c2 = (g^(r+s), m1*m2 * y^(r+s)) = E(m1*m2)
In particular if m2 = 1, then c1 * c2 will rerandomize the encryption of m1
This is useful for mix-nets.
It can also be annoying if you want to meet a stronger definition of security.
We can change the semantic security game, by allowing the adversary to decrypt as well (except decrypting the challenge ciphertext).
You can see that because ElGamal ciphertexts can be rerandomized => it is not
secure under IND-CCA2 (adversary can generate different ciphertext for the same
message m_b and ask decryption oracle to decrypt, revealing m_b and thus b)